pickle
import pickle
The pickle module implements binary protocols for serializing and deserializing a Python object hierarchy. “Pickling” converts a Python object hierarchy into a byte stream; “unpickling” converts that byte stream back into an object hierarchy. This process is also known as serialization or marshalling.
Pickle preserves Python object types and structures, which makes it convenient for saving complex Python objects to disk or passing them between trusted Python processes. Unlike JSON, pickle can handle most Python objects, including custom class instances, shared and circular references, and functions and classes defined at module level (these are stored by reference to their qualified name, not by value).
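As a small illustration of the difference from JSON, the sketch below round-trips a dictionary that refers to itself. The variable names are illustrative only.

```python
import json
import pickle

# A dictionary that contains a reference to itself.
node = {"name": "root"}
node["self"] = node

# pickle remembers objects it has already written, so the cycle survives.
restored = pickle.loads(pickle.dumps(node))
print(restored["self"] is restored)
# True

# json cannot represent the cycle and refuses to encode it.
try:
    json.dumps(node)
    json_ok = True
except ValueError:
    json_ok = False
print(json_ok)
# False
```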
Functions
pickle.dump()
Writes a pickled representation of a Python object to a file.
Signature:
pickle.dump(obj, file, protocol=None, *, fix_imports=True, buffer_callback=None)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| obj | object | required | The Python object to serialize |
| file | file-like | required | A file object opened in binary write mode ('wb'), or any object with a write() method that accepts bytes |
| protocol | int | None | Pickle protocol version (0-5); None uses DEFAULT_PROTOCOL, a negative value uses HIGHEST_PROTOCOL |
| fix_imports | bool | True | Map Python 3 names to Python 2 names if protocol < 3 |
| buffer_callback | callable | None | Called with each out-of-band buffer (protocol 5+); None keeps buffers in-band |
Returns: None — output is written directly to the file.
Example:
import pickle

data = {"name": "Alice", "scores": [95, 87, 92]}
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)
print("Data saved to data.pkl")
# Data saved to data.pkl
pickle.dumps()
Returns the pickled representation of a Python object as a bytes object.
Signature:
pickle.dumps(obj, protocol=None, *, fix_imports=True, buffer_callback=None)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| obj | object | required | The Python object to serialize |
| protocol | int | None | Pickle protocol version (0-5); None uses DEFAULT_PROTOCOL, a negative value uses HIGHEST_PROTOCOL |
| fix_imports | bool | True | Map Python 3 names to Python 2 names if protocol < 3 |
| buffer_callback | callable | None | Called with each out-of-band buffer (protocol 5+); None keeps buffers in-band |
Returns: bytes — the pickled representation of the object.
Example:
import pickle

data = [1, 2, 3, "hello", {"key": "value"}]
pickled = pickle.dumps(data)
print(f"Pickled length: {len(pickled)} bytes")
print(f"Pickled data: {pickled[:50]}...")
# The exact length and bytes depend on the protocol in use.
# For protocol 2 and later, the stream starts with b'\x80' followed by the protocol number.
pickle.load()
Reads a pickled object from a file.
Signature:
pickle.load(file, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=None)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| file | file-like | required | A file object opened in binary read mode ('rb') |
| fix_imports | bool | True | Map Python 2 names to Python 3 names when reading a Python 2 pickle |
| encoding | str | 'ASCII' | How to decode 8-bit string instances pickled by Python 2 ('ASCII', 'latin1', 'bytes') |
| errors | str | 'strict' | Error handling scheme used when decoding those strings |
| buffers | iterable | None | Out-of-band buffers referenced by the pickle (protocol 5+) |
Returns: the deserialized Python object.
Example:
import pickle

# First, create the file
data = {"name": "Bob", "age": 30}
with open("data.pkl", "wb") as f:
    pickle.dump(data, f)

# Now load it
with open("data.pkl", "rb") as f:
    loaded_data = pickle.load(f)
print(loaded_data)
# {'name': 'Bob', 'age': 30}
pickle.loads()
Reads a pickled object from a bytes object.
Signature:
pickle.loads(data, *, fix_imports=True, encoding='ASCII', errors='strict', buffers=None)
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| data | bytes-like | required | The pickled byte data |
| fix_imports | bool | True | Map Python 2 names to Python 3 names when reading a Python 2 pickle |
| encoding | str | 'ASCII' | How to decode 8-bit string instances pickled by Python 2 |
| errors | str | 'strict' | Error handling scheme used when decoding those strings |
| buffers | iterable | None | Out-of-band buffers referenced by the pickle (protocol 5+) |
Returns: the deserialized Python object.
Example:
import pickle
original = {"tasks": ["write", "review", "publish"]}
pickled = pickle.dumps(original)
loaded = pickle.loads(pickled)
print(loaded == original)
# True
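The out-of-band buffer support of protocol 5 can be sketched as follows. In CPython 3.8 and later, pickle.dumps() accepts a buffer_callback argument on the writing side, and pickle.loads() accepts the collected buffers on the reading side. This is a minimal sketch under that assumption; the payload and variable names are illustrative.

```python
import pickle

# Some large binary payload that should not be copied into the pickle stream.
payload = bytearray(b"\x00" * 1024)

# Wrapping the payload in PickleBuffer marks it as eligible for out-of-band
# transfer; the callback receives each such buffer. Because list.append
# returns None (a false value), the buffer is kept out of the stream.
buffers = []
data = pickle.dumps(
    pickle.PickleBuffer(payload),
    protocol=5,
    buffer_callback=buffers.append,
)

# The same buffers must be supplied, in the same order, when loading.
restored = pickle.loads(data, buffers=buffers)

print(len(buffers))
print(memoryview(restored).tobytes() == bytes(payload))
print(len(data) < len(payload))
```

The pickle stream itself stays small because the payload travels separately; how the buffers are transported alongside it is left to the application.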
Common Patterns
Saving and loading with different protocols
import os
import pickle

data = {"version": "1.0", "config": {"timeout": 30}}

# Protocol 0 (ASCII, most compatible)
with open("data0.pkl", "wb") as f:
    pickle.dump(data, f, protocol=0)

# Protocol 4 (Python 3.4+, faster)
with open("data4.pkl", "wb") as f:
    pickle.dump(data, f, protocol=4)

# Protocol 5 (Python 3.8+, supports out-of-band data)
with open("data5.pkl", "wb") as f:
    pickle.dump(data, f, protocol=5)

print(f"Protocol 0: {os.path.getsize('data0.pkl')} bytes")
print(f"Protocol 4: {os.path.getsize('data4.pkl')} bytes")
print(f"Protocol 5: {os.path.getsize('data5.pkl')} bytes")
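To check which protocols the running interpreter supports without writing files, the comparison above can also be done in memory. This sketch relies on the module constants pickle.DEFAULT_PROTOCOL and pickle.HIGHEST_PROTOCOL; the sizes dictionary is an illustrative name, and the printed values depend on the Python version.

```python
import pickle

data = {"version": "1.0", "config": {"timeout": 30}}

# DEFAULT_PROTOCOL is what protocol=None selects; HIGHEST_PROTOCOL is the
# newest protocol this interpreter can write.
print(f"Default: {pickle.DEFAULT_PROTOCOL}, highest: {pickle.HIGHEST_PROTOCOL}")

# Serialize the same object with every available protocol and record the size.
sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    blob = pickle.dumps(data, protocol=protocol)
    # Every protocol must reproduce the original object on load.
    assert pickle.loads(blob) == data
    sizes[protocol] = len(blob)

for protocol, size in sizes.items():
    print(f"Protocol {protocol}: {size} bytes")
```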
Pickling custom classes
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __repr__(self):
        return f"Person(name='{self.name}', age={self.age})"

alice = Person("Alice", 30)

# Pickle and unpickle; the Person class must be importable when loading
pickled = pickle.dumps(alice)
restored = pickle.loads(pickled)
print(restored)
print(restored.name, restored.age)
# Person(name='Alice', age=30)
# Alice 30
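Instances that hold a resource such as an open file cannot be pickled as they are. One common workaround is to define __getstate__() and __setstate__() so that the resource is dropped when pickling and recreated when unpickling. The LogWriter class below is hypothetical and only sketches the idea.

```python
import os
import pickle
import tempfile

class LogWriter:
    """Hypothetical class that owns an open file handle."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "a")

    def __getstate__(self):
        # Copy the instance state and leave out the unpicklable handle.
        state = self.__dict__.copy()
        del state["handle"]
        return state

    def __setstate__(self, state):
        # Restore the plain attributes, then reopen the file.
        self.__dict__.update(state)
        self.handle = open(self.path, "a")

    def close(self):
        self.handle.close()

with tempfile.TemporaryDirectory() as tmp:
    writer = LogWriter(os.path.join(tmp, "app.log"))
    clone = pickle.loads(pickle.dumps(writer))
    same_path = clone.path == writer.path
    reopened = clone.handle is not writer.handle and not clone.handle.closed
    writer.close()
    clone.close()

print(same_path, reopened)
```

Whether reopening the resource on load is appropriate depends on the application; for sockets or database connections it may be better to restore the object in a disconnected state.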
Security
Never unpickle data from untrusted sources. The pickle module is not secure: a malicious pickle can execute arbitrary code during unpickling, because the pickle stream can direct the unpickler to import and call any callable that is reachable by name.
# DANGEROUS - never do this!
# pickle.loads(untrusted_data) # Could execute arbitrary code
For untrusted data, use safer alternatives:
- JSON: for human-readable, safe data exchange
- MessagePack (msgpack): for efficient binary serialization
- Avro: for schema-defined data
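Where pickle cannot be avoided, the standard library allows narrowing what an unpickler may import by overriding Unpickler.find_class(). The sketch below follows that approach; the allow-list and the restricted_loads helper are illustrative assumptions, and this technique reduces the attack surface without making untrusted pickles safe.

```python
import builtins
import io
import pickle

# Illustrative allow-list: only these builtins may be looked up by a pickle.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream asks for a global by name.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else is refused.
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")

def restricted_loads(data):
    """Like pickle.loads(), but using the restricted unpickler (hypothetical helper)."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Plain containers and allow-listed builtins load normally.
print(restricted_loads(pickle.dumps([1, 2, range(3)])))

# A reference to any other global, here the builtin len, is rejected.
try:
    restricted_loads(pickle.dumps(len))
    rejected = False
except pickle.UnpicklingError as exc:
    rejected = True
    print(exc)
```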
What Can Be Pickled
Pickle can serialize most Python objects:
- Built-in types: None, bool, int, float, complex, str, bytes, bytearray
- Collections: tuple, list, set, frozenset, dict
- Functions and classes defined at module level (by name reference)
- Class instances (their __dict__ is saved)
Cannot be pickled:
- Open file handles or network connections
- Database connections
- Lambda functions (anonymous functions)
- Generator objects
- Modules (pickling a module object raises TypeError)
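The lists above can be checked directly against the running interpreter. The can_pickle helper below is an illustrative assumption, not part of the pickle module; it catches the exception types that CPython is known to raise for unpicklable objects.

```python
import pickle

def can_pickle(obj):
    """Return True if pickle.dumps() accepts obj (illustrative helper)."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True

def counter():
    # Calling this function produces a generator object.
    yield 1

print(can_pickle([1, 2, 3]))      # built-in collection
print(can_pickle(len))            # built-in function, pickled by name
print(can_pickle(lambda x: x))    # lambda has no importable name
print(can_pickle(counter()))      # generator object
print(can_pickle(pickle))         # module object
```

On CPython the first two calls are expected to report True and the last three False.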