
What Is Pickling in Python? A Complete Guide 

By Vishalini Devarajan

Many Python developers reach a point where they need to save a complex object, like a trained machine learning model or a parsed data structure, and retrieve it later without rebuilding it from scratch. Pickling in Python solves exactly this problem by converting most Python objects into a storable byte format. Understanding how pickling works, when to use it, and its security limitations is essential for any backend or data science developer working with Python in 2026.

Table of contents


  1. TL;DR Summary
  2. What Is Pickling in Python?
  3. How to Use the pickle Module
  4. Pickle Protocols
  5. Pickling Custom Classes
  6. Pickling in Machine Learning: Saving Trained Models
  7. Conclusion
  8. FAQs
    • What is pickling in Python? 
    • What is the difference between pickling and unpickling? 
    • Is it safe to unpickle data from unknown sources? 
    • What objects cannot be pickled in Python? 
    • What is the difference between pickle and JSON in Python? 
    • How do I save a machine learning model using pickle? 
    • What is the dill library and how does it relate to pickle? 
    • When should I use joblib instead of pickle in Python? 

TL;DR Summary

Pickling in Python is the process of serialising a Python object into a byte stream so it can be saved to a file, sent over a network, or stored in a database. The reverse process, converting the byte stream back into a Python object, is called unpickling. Python’s built-in pickle module handles both operations. Pickling is widely used in machine learning for saving trained models, in distributed systems for passing objects between processes, and in caching for storing computed results.


What Is Pickling in Python?

Pickling is the process of converting a Python object into a binary byte stream using the pickle module. This byte stream can be written to a file, stored in a database, or transmitted over a network.

Unpickling is the reverse: reading the byte stream and reconstructing the original Python object from it.

Objects that can be pickled include:

  • Integers, floats, strings, booleans
  • Lists, tuples, dictionaries, sets
  • Functions and classes defined at the module level
  • Instances of most user-defined classes
  • Trained machine learning models like scikit-learn estimators
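
As a quick check of the list above, the following sketch round-trips a few built-in values and shows that a module-level function is pickled by reference (its module and name) rather than by copying its code. The sample values are illustrative only.

```python
import math
import pickle

# A handful of built-in values covering the basic types listed above.
samples = [42, 3.14, "hello", True, [1, 2], (3, 4), {"k": "v"}, {5, 6}]

# Each value should come back equal to what went in.
round_tripped = [pickle.loads(pickle.dumps(value)) for value in samples]
print(round_tripped == samples)

# A function is stored as "module + name", so loading it returns
# the very same function object instead of a copy.
restored_sqrt = pickle.loads(pickle.dumps(math.sqrt))
print(restored_sqrt is math.sqrt)
```

Because functions and classes are stored by name, the module that defines them must be importable wherever the data is unpickled.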


How to Use the pickle Module

Python’s pickle module comes built into the standard library. No installation is needed.

  1. Pickling an Object to a File
import pickle

data = {
    "name": "Priya",
    "scores": [95, 88, 91],
    "active": True
}

with open("data.pkl", "wb") as file:
    pickle.dump(data, file)

pickle.dump writes the serialised byte stream directly to a file. The file must be opened in binary write mode using "wb".

  2. Unpickling an Object from a File
import pickle

with open("data.pkl", "rb") as file:
    loaded_data = pickle.load(file)

print(loaded_data)
# Output: {'name': 'Priya', 'scores': [95, 88, 91], 'active': True}

pickle.load reads the byte stream and reconstructs a dictionary equal to the original. The file must be opened in binary read mode using "rb".

  3. Pickling to a Byte String in Memory

Use pickle.dumps and pickle.loads when you want to work with the byte stream in memory rather than writing to a file:

import pickle

model_data = {"weights": [0.5, 0.3, 0.8], "bias": 0.1}

byte_stream = pickle.dumps(model_data)
print(type(byte_stream))  # Output: <class 'bytes'>

restored = pickle.loads(byte_stream)
print(restored)  # Output: {'weights': [0.5, 0.3, 0.8], 'bias': 0.1}

This is useful when passing objects between processes or storing them in Redis or a database without intermediate files.
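
To make the database case concrete, here is a minimal sketch that stores the pickled bytes in an in-memory SQLite database from the standard library. The table name cache and the key model_data are hypothetical choices for illustration; a Redis client would follow the same pattern of writing and reading raw bytes.

```python
import pickle
import sqlite3

model_data = {"weights": [0.5, 0.3, 0.8], "bias": 0.1}

# An in-memory database stands in for a real cache or data store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (key TEXT PRIMARY KEY, value BLOB)")

# pickle.dumps returns bytes, which SQLite stores as a BLOB.
conn.execute(
    "INSERT INTO cache (key, value) VALUES (?, ?)",
    ("model_data", pickle.dumps(model_data)),
)

# Read the bytes back and rebuild the object without touching a file.
row = conn.execute(
    "SELECT value FROM cache WHERE key = ?", ("model_data",)
).fetchone()
restored = pickle.loads(row[0])
print(restored == model_data)

conn.close()
```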


Pickle Protocols

Python’s pickle module supports multiple protocol versions that affect compatibility and performance.

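| Protocol | Python Version | Notes |
|----------|----------------|-------|
| 0 | All versions | Human-readable ASCII format |
| 1 | All versions | Old binary format, compatible with early Python versions |
| 2 | Python 2.3+ | More efficient pickling of new-style classes |
| 3 | Python 3.0+ | Adds bytes support; default in Python 3.0 to 3.7; cannot be read by Python 2 |
| 4 | Python 3.4+ | Supports very large objects; became the default in Python 3.8 |
| 5 | Python 3.8+ | Out-of-band data buffers for efficiency |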

Specify a protocol explicitly when compatibility matters:

pickle.dump(data, file, protocol=4)

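Use the highest protocol supported by every Python version that will read the data (pickle.HIGHEST_PROTOCOL gives the highest one available in the current interpreter) for best performance. Use protocol 2 if you need Python 2 and Python 3 compatibility.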
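
The effect of the protocol choice can be inspected directly. The sketch below serialises the same object with every protocol the running interpreter supports and records the payload size; the sample dictionary is illustrative, and exact byte counts vary between Python versions.

```python
import pickle

data = {"name": "Priya", "scores": list(range(1000))}

# Both constants depend on the Python version running the code.
print("default:", pickle.DEFAULT_PROTOCOL, "highest:", pickle.HIGHEST_PROTOCOL)

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    payload = pickle.dumps(data, protocol=protocol)
    # Every protocol must reproduce an equal object.
    assert pickle.loads(payload) == data
    sizes[protocol] = len(payload)

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")
```

Note that pickle.load and pickle.loads take no protocol argument for reading: the protocol is detected from the data itself.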

💡 Did You Know?

Joblib, a popular library in the Python machine learning ecosystem, builds on pickle and adds features such as optional compression and efficient handling of large NumPy arrays, alongside its helpers for parallel processing. It is commonly used for scikit-learn model persistence and can save and load array-heavy objects more efficiently than raw pickle, which makes it a frequent choice for production machine learning applications.

Pickling Custom Classes

Pickling works with user-defined classes as long as they are importable at unpickling time.

import pickle

class Student:
    def __init__(self, name, grade):
        self.name = name
        self.grade = grade

    def __repr__(self):
        return f"Student({self.name}, {self.grade})"

student = Student("Arun", "A")

byte_stream = pickle.dumps(student)
restored = pickle.loads(byte_stream)

print(restored)        # Output: Student(Arun, A)
print(restored.name)   # Output: Arun

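The class definition must be available when you unpickle. If the Student class can no longer be found under the same module and name, unpickling fails with an AttributeError (the class is missing from its module) or a ModuleNotFoundError (the module itself is missing). If the class still exists but its attributes have changed, unpickling may succeed yet return an object that no longer matches the new definition.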
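
Both failure modes can be reproduced in a single script. This is a sketch under one loud assumption: instead of a real file on disk, it registers a throwaway module named demo_models (a hypothetical name) in sys.modules and pretends Student was defined there, so the class and then the module can be removed on demand.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a real module file such as demo_models.py.
demo_models = types.ModuleType("demo_models")
sys.modules["demo_models"] = demo_models


class Student:
    # Pretend the class was defined inside demo_models.
    __module__ = "demo_models"
    __qualname__ = "Student"

    def __init__(self, name, grade):
        self.name = name
        self.grade = grade


demo_models.Student = Student
payload = pickle.dumps(Student("Arun", "A"))

# Case 1: the module still exists, but the class was removed or renamed.
del demo_models.Student
try:
    pickle.loads(payload)
except AttributeError as exc:
    missing_class_error = exc
    print("AttributeError:", exc)

# Case 2: the module itself can no longer be imported.
del sys.modules["demo_models"]
try:
    pickle.loads(payload)
except ModuleNotFoundError as exc:
    missing_module_error = exc
    print("ModuleNotFoundError:", exc)
```

The pickled bytes contain only the module name, the class name, and the instance attributes, which is why the lookup happens at load time.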

Pickling in Machine Learning: Saving Trained Models

One of the most common uses of pickling in Python is saving trained scikit-learn models so they can be reused without retraining.

import pickle
from sklearn.linear_model import LogisticRegression
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6]])
y = np.array([0, 1, 0])

model = LogisticRegression()
model.fit(X, y)

with open("model.pkl", "wb") as file:
    pickle.dump(model, file)

with open("model.pkl", "rb") as file:
    loaded_model = pickle.load(file)

print(loaded_model.predict([[3, 4]]))  # Same prediction as model.predict([[3, 4]])

The trained model with all its learned weights is saved to disk and restored later. This is a core pattern in ML pipelines and API serving workflows where model training and inference run at different times.


Conclusion

Understanding how to use the pickle module correctly, choosing the right protocol, and recognising its security limitations ensures you apply it safely and effectively.

As you work on more advanced Python projects involving ML pipelines, distributed task queues like Celery, or high-performance data workflows, you will encounter pickling in many of these layers.

FAQs

What is pickling in Python? 

Pickling is the process of serialising a Python object into a binary byte stream using the pickle module so it can be saved, transmitted, or stored and later restored.

What is the difference between pickling and unpickling? 

Pickling converts a Python object into a byte stream. Unpickling converts the byte stream back into the original Python object.

Is it safe to unpickle data from unknown sources? 

No. Unpickling untrusted data is a serious security risk because a malicious pickle file can execute arbitrary code. Only unpickle data from sources you fully control.
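
The risk is easy to demonstrate with a harmless payload. In the sketch below, the Payload class stands in for attacker-controlled data and only evaluates "1 + 1"; a real attack could call any importable function. The RestrictedUnpickler follows the "restricting globals" pattern from the Python documentation, and the SAFE_BUILTINS allow-list is an illustrative assumption. It reduces the attack surface but does not make untrusted pickles safe in general.

```python
import builtins
import io
import pickle


class Payload:
    """Stand-in for attacker-controlled data; the expression is harmless."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call eval('1 + 1')".
        return (eval, ("1 + 1",))


malicious = pickle.dumps(Payload())
result = pickle.loads(malicious)  # eval runs during loading
print(result)

# Illustrative allow-list; adjust to what your data actually needs.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only allow-listed builtins may be looked up while loading.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Like pickle.loads, but refuses globals outside the allow-list."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


safe_result = restricted_loads(pickle.dumps([1, 2, range(3)]))
print(safe_result)

blocked = False
try:
    restricted_loads(malicious)
except pickle.UnpicklingError as exc:
    blocked = True
    print("blocked:", exc)
```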

What objects cannot be pickled in Python? 

Lambda functions, locally defined functions, file handles, database connections, and some built-in types like generators cannot be pickled with the standard pickle module.
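
A short sketch can confirm a few of these cases. The exception type differs by object and by Python version, so the code catches the three types pickle is known to raise here and records whichever one occurs; the labels are illustrative.

```python
import pickle
import tempfile

failures = {}

with tempfile.TemporaryFile() as handle:
    candidates = {
        "lambda": lambda x: x + 1,
        "generator": (n for n in range(3)),
        "open file": handle,
    }
    for label, obj in candidates.items():
        try:
            pickle.dumps(obj)
        except (pickle.PicklingError, AttributeError, TypeError) as exc:
            # Record which exception type stopped the pickling attempt.
            failures[label] = type(exc).__name__

print(failures)
```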

What is the difference between pickle and JSON in Python? 

Pickle supports almost all Python objects but is binary and Python-specific. JSON supports only basic types but is human-readable, cross-language, and much safer for untrusted data.
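
The difference in type support shows up even with small built-in containers. In this sketch (the record dictionary is illustrative), pickle preserves a tuple and a set, while JSON rejects the set outright and turns the tuple into a list.

```python
import json
import pickle

record = {"point": (1, 2), "tags": {"a", "b"}}

# pickle keeps the tuple and the set as they are.
restored = pickle.loads(pickle.dumps(record))
print(restored == record, type(restored["point"]).__name__)

# JSON has no set type, so encoding the same record fails.
try:
    json.dumps(record)
except TypeError as exc:
    json_error = str(exc)
    print("json.dumps failed:", json_error)

# After converting the set to a list, JSON works, but the tuple
# comes back as a list because JSON only has arrays.
json_safe = {"point": (1, 2), "tags": sorted(record["tags"])}
decoded = json.loads(json.dumps(json_safe))
print(decoded)
```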

How do I save a machine learning model using pickle? 

Open a file in binary write mode and use pickle.dump(model, file) to save. Use pickle.load(file) in binary read mode to restore the model later with all its trained parameters intact.

What is the dill library and how does it relate to pickle? 

dill extends Python’s pickle module to handle objects that standard pickle cannot serialise, including lambda functions, closures, and locally defined classes.


When should I use joblib instead of pickle in Python? 

Use joblib when serialising objects that hold large NumPy arrays, such as scikit-learn models. Joblib supports optional compression and memory-mapped loading, which can make it faster and more memory-efficient than raw pickle for numerical data.
