Getting Started with Dataclasses
Dataclasses simplify the process of writing classes that primarily store data. They automatically generate __init__, __repr__, __eq__, and other methods, reducing boilerplate code. If you have written classes that mainly hold values, dataclasses eliminate the repetitive parts.
Basic Syntax
The @dataclass decorator turns a regular class into a dataclass:
from dataclasses import dataclass
@dataclass
class Point:
    x: int
    y: int
# Create an instance
p = Point(10, 20)
print(p)
# Point(x=10, y=20)
Compare this to a regular class, where you would write __init__, __repr__, and __eq__ by hand. Dataclasses generate these methods from your field definitions.
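For comparison, here is a rough sketch of the hand-written equivalent. The name ManualPoint is hypothetical, and the methods the decorator actually generates differ in detail; this only approximates their behavior.

```python
class ManualPoint:
    """Hand-written approximation of what @dataclass generates for Point."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        # Only instances of the same class are compared, field by field
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = ManualPoint(10, 20)
print(p)                         # ManualPoint(x=10, y=20)
print(p == ManualPoint(10, 20))  # True
```

Every field added to such a class has to be repeated in all three methods, which is the boilerplate the decorator removes.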
Default Values
Give a field a default value by assigning to it in the class body. Fields with defaults must come after fields without defaults:
from dataclasses import dataclass
@dataclass
class Person:
    name: str
    age: int = 0  # default value
# Using default
person1 = Person("Alice")
print(person1.name, person1.age)
# Alice 0
# Overriding default
person2 = Person("Bob", 30)
print(person2.name, person2.age)
# Bob 30
If a field without a default follows a field with one, Python raises a TypeError when the class is defined, before any instance is created.
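The ordering rule can be observed directly by wrapping a class definition in try/except. This is a minimal sketch; the class name BadOrder is hypothetical, and the exact wording of the error message varies between Python versions, so it is printed rather than quoted here.

```python
from dataclasses import dataclass

error_message = None

try:
    @dataclass
    class BadOrder:
        name: str = "unknown"  # field with a default...
        age: int               # ...followed by one without: rejected
except TypeError as exc:
    # Raised while the decorator builds __init__, not at instantiation
    error_message = str(exc)
    print(error_message)
```

The rule exists because the fields become parameters of the generated __init__, and Python does not allow a required parameter after an optional one.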
Mutable Defaults Gotcha
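Never use a mutable object such as a list, dict, or set as a default value. In a regular class, a class-level list is shared by every instance, which causes hard-to-find bugs. Dataclasses guard against this mistake by raising ValueError when the class is defined:
# The underlying problem, shown with a regular class
class Shared:
    tags = []  # one list, stored on the class
s1 = Shared()
s1.tags.append("bug")
print(Shared().tags)  # Surprise! Also has 'bug'
# ['bug']
# WRONG - a dataclass rejects the same default at class definition time
from dataclasses import dataclass
@dataclass
class Broken:
    tags: list = []  # Mutable default!
# ValueError: mutable default <class 'list'> for field tags is not allowed: use default_factory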
# RIGHT - use field() for mutable defaults
from dataclasses import dataclass, field
@dataclass
class Fixed:
    tags: list = field(default_factory=list)
# Each instance gets its own list
f1 = Fixed()
f1.tags.append("correct")
print(f1.tags)
# ['correct']
print(Fixed().tags) # Fresh empty list
# []
The field() function with default_factory creates a new list for each instance.
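default_factory accepts any zero-argument callable, not only list. The sketch below uses a hypothetical Settings class to show a dict factory and a lambda that supplies a non-empty starting list.

```python
from dataclasses import dataclass, field


@dataclass
class Settings:
    # A lambda is one way to supply a non-empty mutable default
    plugins: list = field(default_factory=lambda: ["core"])
    options: dict = field(default_factory=dict)


a = Settings()
b = Settings()
a.plugins.append("extra")
a.options["debug"] = True

print(a.plugins)  # ['core', 'extra']
print(b.plugins)  # ['core']
print(b.options)  # {}
```

The factory runs once per instance, so changes made through a never show up in b.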
Immutability
Add frozen=True to make instances immutable after creation:
from dataclasses import dataclass
@dataclass(frozen=True)
class Config:
    host: str
    port: int
config = Config("localhost", 8080)
config.port = 9000 # Raises FrozenInstanceError
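This is useful for constants, configuration objects, and settings that should never change. A frozen dataclass gets generated __setattr__ and __delattr__ methods that raise FrozenInstanceError, so fields cannot be reassigned or deleted after initialization. The protection is shallow: a mutable object stored in a field, such as a list, can still be modified in place.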
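Since a frozen instance cannot be changed, the usual way to "modify" one is to build a new instance. A minimal sketch using dataclasses.replace, which copies an instance with selected fields changed:

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class Config:
    host: str
    port: int


config = Config("localhost", 8080)

try:
    config.port = 9000
except FrozenInstanceError:
    print("config is read-only")

# replace() builds a new instance with selected fields changed
updated = replace(config, port=9000)
print(config)   # Config(host='localhost', port=8080)
print(updated)  # Config(host='localhost', port=9000)

# frozen=True with the default eq=True also makes instances hashable
ports = {config: "primary", updated: "secondary"}
print(ports[Config("localhost", 9000)])  # secondary
```

Hashability is a side benefit: frozen instances can serve as dictionary keys or set members, which regular dataclass instances cannot by default.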
Field Validation
Use __post_init__ to validate and transform fields after initialization:
from dataclasses import dataclass
@dataclass
class User:
    name: str
    age: int

    def __post_init__(self):
        if self.age < 0:
            raise ValueError("Age cannot be negative")
# This raises an error
try:
    user = User("Alice", -5)
except ValueError as e:
    print(e)
# Age cannot be negative
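The example above only validates. Transforming a field works the same way: reassign it inside __post_init__. The Account class and its deliberately simplistic email check below are hypothetical, meant only to illustrate the pattern.

```python
from dataclasses import dataclass


@dataclass
class Account:
    email: str

    def __post_init__(self):
        # Transform first: normalize whitespace and case
        self.email = self.email.strip().lower()
        # Then validate the normalized value (simplistic check)
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email!r}")


account = Account("  Alice@Example.COM ")
print(account.email)  # alice@example.com
```

This approach does not work as written on a frozen dataclass, because plain assignment to self.email is blocked there.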
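You can also use __post_init__ to compute derived fields. When the value does not depend on other fields, such as a creation timestamp, a default_factory is simpler: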
from dataclasses import dataclass, field
from datetime import datetime
@dataclass
class Post:
    title: str
    created_at: datetime = field(default_factory=datetime.now)
post = Post("My Post")
print(post.created_at)
# 2026-03-07 13:42:15.123456
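A field derived from other fields is where __post_init__ comes in. The sketch below uses a hypothetical Rectangle class; field(init=False) keeps the derived field out of the generated __init__ so callers cannot pass it.

```python
from dataclasses import dataclass, field


@dataclass
class Rectangle:
    width: float
    height: float
    # init=False keeps area out of the generated __init__
    area: float = field(init=False)

    def __post_init__(self):
        # Computed once, right after __init__ has set width and height
        self.area = self.width * self.height


rect = Rectangle(3.0, 4.0)
print(rect)  # Rectangle(width=3.0, height=4.0, area=12.0)
```

The stored area is not recomputed if width or height is changed later; a property may be a better fit when the value has to stay in sync.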
Comparison Methods
By default, dataclasses generate __eq__ and __repr__ automatically. To enable comparison operators like <, <=, >, >=, add order=True to the decorator:
from dataclasses import dataclass
@dataclass(order=True)
class Point:
    x: int
    y: int
p1 = Point(10, 20)
p2 = Point(10, 20)
p3 = Point(20, 10)
print(p1 == p2) # True - same values
# True
print(p1 < p3) # True - compares x first, 10 < 20
# True
Without order=True, the ordering operators (<, <=, >, >=) raise TypeError, while == and != still work. Instances are compared as tuples of their fields in definition order, so x is compared before y.
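Because ordering is defined, instances also work with sorted(), min(), and max(). A short sketch with a hypothetical Version class, which additionally uses field(compare=False) to leave one field out of the comparison:

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Version:
    major: int
    minor: int
    # compare=False excludes label from both == and ordering
    label: str = field(default="", compare=False)


versions = [Version(2, 0, "new"), Version(1, 5), Version(1, 2, "old")]
for v in sorted(versions):
    print(v.major, v.minor)
# 1 2
# 1 5
# 2 0

print(Version(1, 0, "a") == Version(1, 0, "b"))  # True
print(max(versions).label)  # new
```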
Slots for Memory Efficiency
Dataclasses support slots=True to reduce memory usage. This feature requires Python 3.10 or later:
from dataclasses import dataclass
@dataclass(slots=True) # Python 3.10+
class Tiny:
    x: int
    y: int
Using slots=True works like adding __slots__ to a regular class. Each instance only stores the attributes you define, rather than a full __dict__. This significantly reduces memory footprint when you create many instances.
Field Options
The field() function provides fine-grained control over each field:
from dataclasses import dataclass, field
@dataclass
class Product:
    id: int
    name: str
    price: float = field(default=0.0, compare=False)
    tags: list = field(default_factory=list)
Key options include:
- default: sets a default value
- default_factory: calls a function to produce the default (for mutable types)
- compare: whether to include this field in comparisons
- repr: whether to include this field in the repr
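The repr and compare options are easiest to understand by watching their effect. A minimal sketch with a hypothetical Credentials class:

```python
from dataclasses import dataclass, field


@dataclass
class Credentials:
    username: str
    # repr=False keeps the secret out of printed output and logs
    password: str = field(repr=False)
    # compare=False: this field is ignored by ==
    last_login: str = field(default="never", compare=False)


c1 = Credentials("alice", "s3cret", last_login="2024-01-01")
c2 = Credentials("alice", "s3cret")
print(c1)        # Credentials(username='alice', last_login='2024-01-01')
print(c1 == c2)  # True
```

The password is still stored on the instance and still takes part in equality; repr=False only hides it from the string representation.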
When to Use Dataclasses
Dataclasses shine in specific scenarios. Use them when your class mainly holds data without complex behavior. They excel at representing database records, API responses, configuration objects, and value objects.
The automatic generation of __repr__ and __eq__ alone makes dataclasses worthwhile. You get meaningful string representations and proper equality checking without writing boilerplate.
Consider dataclasses when:
- Your class primarily stores values (no complex behavior)
- You want automatic comparison and representation methods
- You need clear, readable class definitions with type hints
- You prefer immutable data structures (combine with frozen=True)
- You want to reduce boilerplate code
Stick with regular classes when:
- You need custom initialization logic beyond basic validation
- The class has significant behavior methods
- You require complex property calculations
- You need to inherit from a specific base class that conflicts with dataclass