Getting Started with Dataclasses

4 min read · Updated March 7, 2026 · beginner
dataclasses classes python data-structures

Dataclasses simplify the process of writing classes that primarily store data. They automatically generate __init__, __repr__, __eq__, and other methods, reducing boilerplate code. If you have written classes that mainly hold values, dataclasses eliminate the repetitive parts.

Basic Syntax

The @dataclass decorator turns a regular class into a dataclass:

from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

# Create an instance
p = Point(10, 20)
print(p)
# Point(x=10, y=20)

Compare this to a regular class, where you would write __init__, __repr__, and __eq__ by hand. Dataclasses generate these methods automatically from your field definitions.
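To make the comparison concrete, here is a rough hand-written counterpart of the Point class. The name ManualPoint is hypothetical, and the methods are a simplified sketch of what the decorator produces, not the exact generated code.

```python
class ManualPoint:
    """Hand-written approximation of the Point dataclass (illustrative only)."""

    def __init__(self, x: int, y: int):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other):
        # Only compare against the same class, field by field
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p = ManualPoint(10, 20)
print(p)
# ManualPoint(x=10, y=20)
print(p == ManualPoint(10, 20))
# True
```

Every new field would have to be added in three places here, which is the repetition the decorator removes.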

Default Values

Set a default by assigning a value to the field in the class body. Fields with defaults must come after fields without defaults:

from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int = 0  # default value

# Using default
person1 = Person("Alice")
print(person1.name, person1.age)
# Alice 0

# Overriding default
person2 = Person("Bob", 30)
print(person2.name, person2.age)
# Bob 30

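If you put a field with a default before a field without one, Python raises a TypeError when the class is defined, because the generated __init__ would have a required parameter after an optional one.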
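The ordering rule can be seen with a small sketch. The class name BadOrder is hypothetical; the class definition is wrapped in try/except only so the script keeps running after the error. The exact wording of the message varies between Python versions, so it is printed rather than quoted here.

```python
from dataclasses import dataclass

try:
    @dataclass
    class BadOrder:
        age: int = 0   # field with a default comes first...
        name: str      # ...followed by a field without one
except TypeError as e:
    # Raised while the class is being created, before any instance exists
    error_message = str(e)
    print(error_message)
```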

Mutable Defaults Gotcha

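Never use mutable objects as default values. In a regular class, a mutable class attribute is shared by every instance, which causes hard-to-find bugs. Dataclasses guard against the most common cases: a list, dict, or set default (any unhashable default on Python 3.11+) raises ValueError when the class is defined:

# WRONG - rejected at class definition time
from dataclasses import dataclass

try:
    @dataclass
    class Broken:
        tags: list = []  # Mutable default!
except ValueError as e:
    print(e)
# mutable default <class 'list'> for field tags is not allowed: use default_factory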

# RIGHT - use field() for mutable defaults
from dataclasses import dataclass, field

@dataclass
class Fixed:
    tags: list = field(default_factory=list)

# Each instance gets its own list
f1 = Fixed()
f1.tags.append("correct")
print(f1.tags)
# ['correct']
print(Fixed().tags)  # Fresh empty list
# []

The default_factory argument takes a zero-argument callable. It is called once for each new instance, so every instance gets its own list.
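The same pattern applies to other mutable types. The sketch below uses a hypothetical Settings class: dict serves as a factory for an empty dictionary, and a lambda returns a pre-populated list.

```python
from dataclasses import dataclass, field

@dataclass
class Settings:
    options: dict = field(default_factory=dict)
    # A lambda lets the factory return a pre-populated value
    retries: list = field(default_factory=lambda: [1, 2, 4])

a = Settings()
b = Settings()
a.options["debug"] = True
a.retries.append(8)

print(a.options, a.retries)
# {'debug': True} [1, 2, 4, 8]
print(b.options, b.retries)  # b is unaffected by changes to a
# {} [1, 2, 4]
```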

Immutability

Add frozen=True to make instances immutable after creation:

from dataclasses import dataclass

@dataclass(frozen=True)
class Config:
    host: str
    port: int

config = Config("localhost", 8080)
config.port = 9000  # Raises FrozenInstanceError

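This is useful for constants, configuration objects, and settings that should never change. With frozen=True, the dataclass generates __setattr__ and __delattr__ methods that raise FrozenInstanceError. The protection is shallow: a mutable object stored in a field, such as a list, can still be modified in place.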
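A short sketch of working with a frozen instance: catching the error, creating a modified copy with dataclasses.replace, and using the instance as a dictionary key. The lookup dictionary is only for illustration.

```python
from dataclasses import dataclass, replace, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int

config = Config("localhost", 8080)

try:
    config.port = 9000
except FrozenInstanceError:
    print("cannot modify a frozen instance")

# replace() builds a new instance with selected fields changed
updated = replace(config, port=9000)
print(config)
# Config(host='localhost', port=8080)
print(updated)
# Config(host='localhost', port=9000)

# With the default eq=True, frozen instances are also hashable
lookup = {config: "primary"}
print(lookup[Config("localhost", 8080)])
# primary
```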

Field Validation

Use __post_init__ to validate and transform fields after initialization:

from dataclasses import dataclass

@dataclass
class User:
    name: str
    age: int

    def __post_init__(self):
        if self.age < 0:
            raise ValueError("Age cannot be negative")

# This raises an error
try:
    user = User("Alice", -5)
except ValueError as e:
    print(e)
# Age cannot be negative
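Validation and transformation can be combined in the same method. The sketch below extends the User example under two assumptions that are not in the original: names should have surrounding whitespace stripped, and age should be checked to be an int, since dataclasses do not enforce type hints at runtime.

```python
from dataclasses import dataclass

@dataclass
class User:
    name: str
    age: int

    def __post_init__(self):
        # Transform: normalize surrounding whitespace
        self.name = self.name.strip()
        # Validate: type hints are not enforced, so check explicitly
        if not isinstance(self.age, int):
            raise TypeError("age must be an int")
        if self.age < 0:
            raise ValueError("Age cannot be negative")

user = User("  Alice  ", 30)
print(user)
# User(name='Alice', age=30)

try:
    User("Bob", "thirty")
except TypeError as e:
    type_error = str(e)
    print(type_error)
# age must be an int
```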

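You can also use __post_init__ to compute derived fields. For a value that does not depend on other fields, such as a creation timestamp, a default_factory is enough: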

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Post:
    title: str
    created_at: datetime = field(default_factory=datetime.now)

post = Post("My Post")
print(post.created_at)
# e.g. 2026-03-07 13:42:15.123456 (the time the instance was created)
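When a field does depend on other fields, __post_init__ is the place to compute it. A minimal sketch with a hypothetical Rectangle class, where field(init=False) keeps the derived field out of the generated __init__:

```python
from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    # init=False: area is not a constructor parameter
    area: float = field(init=False)

    def __post_init__(self):
        # Runs right after the generated __init__ has set width and height
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r)
# Rectangle(width=3.0, height=4.0, area=12.0)
```

Note that area is computed once at creation. If width is changed later, area is not updated, so a property may be a better fit for values that must stay in sync.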

Comparison Methods

By default, dataclasses generate __eq__ and __repr__ automatically. To enable comparison operators like <, <=, >, >=, add order=True to the decorator:

from dataclasses import dataclass

@dataclass(order=True)
class Point:
    x: int
    y: int

p1 = Point(10, 20)
p2 = Point(10, 20)
p3 = Point(20, 10)

print(p1 == p2)  # True - same values
# True
print(p1 < p3)   # True - compares x first, 10 < 20
# True

Without order=True, the ordering operators (<, <=, >, >=) raise TypeError, while == still works. Instances are compared as tuples of their fields in definition order, so x is compared first and y only breaks ties.
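Because ordered dataclasses compare like tuples, they work directly with sorted(). The sketch below also shows the TypeError for a class without order=True; the Unordered class is hypothetical.

```python
from dataclasses import dataclass

@dataclass(order=True)
class Point:
    x: int
    y: int

points = [Point(2, 1), Point(1, 5), Point(1, 2)]
ordered = sorted(points)  # sorts by x, then by y
print(ordered)
# [Point(x=1, y=2), Point(x=1, y=5), Point(x=2, y=1)]

@dataclass
class Unordered:
    x: int

try:
    Unordered(1) < Unordered(2)
except TypeError:
    print("ordering is not supported without order=True")
```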

Slots for Memory Efficiency

Dataclasses support slots=True to reduce memory usage. This feature requires Python 3.10 or later:

from dataclasses import dataclass

@dataclass(slots=True)  # Python 3.10+
class Tiny:
    x: int
    y: int

Using slots=True works like adding __slots__ to a regular class. Each instance stores its fields in fixed slots instead of a per-instance __dict__. This lowers memory use when you create many instances, and it also prevents assigning attributes that are not declared as fields.

Field Options

The field() function provides fine-grained control over each field:

from dataclasses import dataclass, field

@dataclass
class Product:
    id: int
    name: str
    price: float = field(default=0.0, compare=False)
    tags: list = field(default_factory=list)

Key options include:

  • default — sets a default value
  • default_factory — calls a function to produce the default (for mutable types)
  • compare — whether to include this field in comparisons
  • repr — whether to include this field in the repr
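The effect of compare and repr can be checked with a variation of the Product class. Setting repr=False on tags is an addition for illustration and is not part of the definition above.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    id: int
    name: str
    price: float = field(default=0.0, compare=False)
    tags: list = field(default_factory=list, repr=False)

a = Product(1, "Widget", price=9.99, tags=["sale"])
b = Product(1, "Widget", price=12.50, tags=["sale"])

# price is excluded from __eq__, so the differing prices are ignored
print(a == b)
# True

# tags is excluded from __repr__
print(a)
# Product(id=1, name='Widget', price=9.99)
```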

When to Use Dataclasses

Dataclasses shine in specific scenarios. Use them when your class mainly holds data without complex behavior. They excel at representing database records, API responses, configuration objects, and value objects.

The automatic generation of __repr__ and __eq__ alone makes dataclasses worthwhile. You get meaningful string representations and proper equality checking without writing boilerplate.

Consider dataclasses when:

  • Your class primarily stores values (no complex behavior)
  • You want automatic comparison and representation methods
  • You need clear, readable class definitions with type hints
  • You prefer immutable data structures (combine with frozen=True)
  • You want to reduce boilerplate code

Stick with regular classes when:

  • You need custom initialization logic beyond what __post_init__ can reasonably handle
  • The class has significant behavior methods
  • You require complex property calculations
  • You need to inherit from a specific base class that conflicts with dataclass