attrs vs dataclasses: A Practical Comparison
If you have written Python classes that are mostly containers for data, you have probably felt the pain of writing repetitive __init__, __repr__, and __eq__ methods. Two solutions have emerged: the built-in dataclasses module (Python 3.7+) and the third-party attrs library.
This guide compares them head-to-head so you can pick the right tool for your project.
The Core Similarity
Both attrs and dataclasses exist to solve the same problem: reducing boilerplate when creating data-holding classes. They both automatically generate __init__, __repr__, __eq__, and other special methods based on your field definitions.
# dataclasses (stdlib)
from dataclasses import dataclass
@dataclass
class Point:
    x: float
    y: float
# attrs (third-party)
import attr
@attr.s(auto_attribs=True)  # auto_attribs=True turns annotations into fields
class Point:
    x: float
    y: float
Both produce a class that works essentially the same way:
p1 = Point(1.0, 2.0)
p2 = Point(1.0, 2.0)
print(p1 == p2) # True
print(p1)
# Point(x=1.0, y=2.0)
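For comparison, here is a rough sketch of the boilerplate that either decorator replaces. The class names ManualPoint and AutoPoint are made up for this illustration, and the hand-written methods only approximate what the generated ones do.

```python
from dataclasses import dataclass


class ManualPoint:
    """Hand-written version: every special method is spelled out."""

    def __init__(self, x: float, y: float) -> None:
        self.x = x
        self.y = y

    def __repr__(self) -> str:
        return f"ManualPoint(x={self.x!r}, y={self.y!r})"

    def __eq__(self, other: object) -> bool:
        # Only compare against instances of exactly the same class
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


@dataclass
class AutoPoint:
    """Same behavior, generated from the two field declarations."""

    x: float
    y: float


manual = ManualPoint(1.0, 2.0)
auto = AutoPoint(1.0, 2.0)
print(manual)  # ManualPoint(x=1.0, y=2.0)
print(auto)    # AutoPoint(x=1.0, y=2.0)
print(manual == ManualPoint(1.0, 2.0), auto == AutoPoint(1.0, 2.0))  # True True
```

Adding a third field to ManualPoint means touching three methods, while AutoPoint needs one extra line.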
Key Differences at a Glance
| Feature | dataclasses | attrs |
|---|---|---|
| Stdlib | Yes (3.7+) | No (pip install) |
| Auto-generate methods | Basic set | Extended set |
| Validators | No | Yes |
| Converters | No | Yes |
| Immutability | frozen=True | frozen=True |
| Slots | slots=True (3.10+) | slots=True |
Basic Usage
dataclasses
from dataclasses import dataclass
@dataclass
class User:
    name: str
    email: str
    age: int = 0  # default value
user = User("Alice", "alice@example.com")
print(user)
# User(name='Alice', email='alice@example.com', age=0)
attrs
import attr
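@attr.s(auto_attribs=True)
class User:
    name: str
    email: str
    age: int = attr.ib(default=0)
user = User("Alice", "alice@example.com")
print(user)
# User(name='Alice', email='alice@example.com', age=0)
Notice the auto_attribs=True argument: without it, @attr.s ignores bare annotations and only collects fields declared with attr.ib(). With it, a default can be written either as attr.ib(default=0) or as a plain = 0, just like in a dataclass.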
Immutability
Both libraries support creating frozen (immutable) instances.
dataclasses
from dataclasses import dataclass
@dataclass(frozen=True)
class RGB:
    red: int
    green: int
    blue: int
color = RGB(255, 128, 0)
# color.red = 0 # Raises dataclasses.FrozenInstanceError
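Since a frozen instance cannot be changed in place, an update has to produce a new object. The sketch below shows one way to do that with the stdlib helper dataclasses.replace; the variable names darker and palette are illustrative only.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class RGB:
    red: int
    green: int
    blue: int


color = RGB(255, 128, 0)

try:
    color.red = 0  # assignment is blocked on a frozen instance
except FrozenInstanceError as exc:
    print(f"blocked: {type(exc).__name__}")

# replace() builds a new instance instead of mutating the old one
darker = replace(color, red=128)
print(darker)  # RGB(red=128, green=128, blue=0)
print(color)   # RGB(red=255, green=128, blue=0)

# frozen=True together with the default eq=True also makes instances hashable
palette = {color: "orange"}
```

Hashability is a useful side effect: frozen instances can serve as dict keys or set members.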
attrs
import attr
@attr.s(auto_attribs=True, frozen=True)
class RGB:
    red: int
    green: int
    blue: int
color = RGB(255, 128, 0)
# color.red = 0 # Raises attr.exceptions.FrozenInstanceError
Validators
This is where attrs pulls ahead. Dataclasses have no built-in validation—you need to use __post_init__ or external libraries. attrs has validators built in.
attrs Validators
import attr
@attr.s
class Person:
    name: str = attr.ib()
    age: int = attr.ib()
    email: str = attr.ib()
    @age.validator
    def check_age(self, attribute, value):
        if value < 0:
            raise ValueError(f"Age cannot be negative: {value}")
    @email.validator
    def check_email(self, attribute, value):
        if "@" not in value:
            raise ValueError(f"Invalid email: {value}")
# Person("Bob", -5, "bob") # Raises ValueError
You can also use validators from attr.validators:
import attr
from attr import validators
@attr.s
class Config:
    # a list of validators must all pass: ge(1) and le(65535) bound the port
    port: int = attr.ib(validator=[validators.ge(1), validators.le(65535)])
    debug: bool = attr.ib(validator=validators.instance_of(bool))
    hosts: list = attr.ib(validator=validators.min_len(1))
dataclasses Validation
With dataclasses, you need to manually implement validation:
from dataclasses import dataclass
@dataclass
class Person:
    name: str
    age: int
    email: str
    def __post_init__(self):
        if self.age < 0:
            raise ValueError(f"Age cannot be negative: {self.age}")
        if "@" not in self.email:
            raise ValueError(f"Invalid email: {self.email}")
This works, but it is more verbose and less reusable than attrs validators.
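If reuse matters and you want to stay in the stdlib, one possible workaround is to attach checks to fields through the metadata argument and run them from a shared __post_init__. This is only a sketch: the "validators" metadata key, the ValidatedMixin class, and the two check functions are conventions invented for this example, not part of the dataclasses module.

```python
from dataclasses import dataclass, field, fields


def non_negative(name, value):
    """Hypothetical reusable check: reject negative numbers."""
    if value < 0:
        raise ValueError(f"{name} cannot be negative: {value}")


def contains_at(name, value):
    """Hypothetical reusable check: require an '@' character."""
    if "@" not in value:
        raise ValueError(f"{name} must contain '@': {value}")


class ValidatedMixin:
    """Runs every callable stored under the 'validators' metadata key."""

    def __post_init__(self):
        for f in fields(self):
            for check in f.metadata.get("validators", ()):
                check(f.name, getattr(self, f.name))


@dataclass
class Person(ValidatedMixin):
    name: str
    age: int = field(metadata={"validators": [non_negative]})
    email: str = field(metadata={"validators": [contains_at]})


person = Person("Bob", 30, "bob@example.com")  # passes both checks

try:
    Person("Bob", -5, "bob@example.com")
except ValueError as exc:
    print(exc)  # age cannot be negative: -5
```

This gets close to the attrs ergonomics, but you have to maintain the mixin yourself, which is part of the trade-off this comparison is about.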
Converters
attrs supports converters that transform values on the way in:
import attr
@attr.s
class User:
    name: str = attr.ib(converter=str.strip)
    active: bool = attr.ib(converter=bool)
user = User(" Alice ", "yes")
print(user.name) # "Alice" (stripped)
print(user.active) # True (any non-empty string is truthy, including "no")
Dataclasses have no equivalent—you would need to handle conversion in __post_init__ or elsewhere.
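A minimal sketch of the __post_init__ approach, mirroring the attrs example with the same two fields:

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    active: bool

    def __post_init__(self):
        # Normalize the raw constructor arguments after __init__ has stored them
        self.name = self.name.strip()
        self.active = bool(self.active)


user = User("  Alice  ", "yes")
print(user)  # User(name='Alice', active=True)
```

Note that this pattern does not work unchanged on a frozen dataclass, because the assignments inside __post_init__ would be blocked; there you would have to go through object.__setattr__.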
Field Options
dataclasses field()
from dataclasses import dataclass, field
@dataclass
class User:
    name: str
    password_hash: str = field(repr=False)  # exclude from repr
    tags: list = field(default_factory=list)  # mutable default
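The reason default_factory exists is easiest to see by trying the alternative. In the sketch below, Broken and Tagged are illustrative class names; the first shows that dataclasses rejects a bare list default, the second shows that each instance gets its own list.

```python
from dataclasses import dataclass, field

# A bare mutable default is rejected when the class is created
try:
    @dataclass
    class Broken:
        tags: list = []
except ValueError as exc:
    print(f"rejected: {exc}")


@dataclass
class Tagged:
    name: str
    tags: list = field(default_factory=list)  # list() is called per instance


a = Tagged("a")
b = Tagged("b")
a.tags.append("admin")
print(a.tags)  # ['admin']
print(b.tags)  # []
```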
attrs attr.ib()
import attr
@attr.s(auto_attribs=True)
class User:
    name: str
    password_hash: str = attr.ib(repr=False) # exclude from repr
    tags: list = attr.ib(factory=list) # mutable default
The syntax differs slightly, but the capabilities are similar.
Slots
Both support __slots__ for memory efficiency. In dataclasses, the slots parameter requires Python 3.10 or newer.
dataclasses
from dataclasses import dataclass
@dataclass(slots=True)  # requires Python 3.10+
class Point:
    x: float
    y: float
attrs
import attr
@attr.s(auto_attribs=True, slots=True)
class Point:
    x: float
    y: float
Serialization
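Both libraries can dump an instance to a plain dict (dataclasses.asdict, attr.asdict), but neither rebuilds typed objects from raw data on its own. For that, both work well with companion libraries.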
For attrs, use cattrs:
import attr
import cattrs
@attr.s(auto_attribs=True)
class User:
    name: str
    age: int
raw = {"name": "Alice", "age": 30}
user = cattrs.structure(raw, User)  # dict -> User
output = cattrs.unstructure(user)  # User -> dict
For dataclasses, use the built-in asdict or external libraries:
from dataclasses import dataclass, asdict
@dataclass
class User:
    name: str
    age: int
user = User("Alice", 30)
output = asdict(user) # {'name': 'Alice', 'age': 30}
For more complex serialization, cattrs also supports dataclasses.
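As a sketch of how far the stdlib alone goes, the example below runs asdict on a nested structure and passes the result to json.dumps. The Address class and the addresses field are invented for this illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Address:
    city: str
    country: str


@dataclass
class User:
    name: str
    age: int
    addresses: List[Address] = field(default_factory=list)


user = User("Alice", 30, [Address("Paris", "FR")])

# asdict() recurses into nested dataclasses and into lists, tuples, and dicts
data = asdict(user)
print(data)
# {'name': 'Alice', 'age': 30, 'addresses': [{'city': 'Paris', 'country': 'FR'}]}

# The result is plain dicts and lists, so json.dumps accepts it directly
print(json.dumps(data))
```

The conversion is one-way: User(**data) would leave addresses as a list of dicts rather than Address objects, which is the gap a structuring library such as cattrs fills.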
When to Choose Which
Choose dataclasses when:
- You want zero dependencies
- Your data classes are simple and do not need validation
- You are working with Python 3.7+ and prefer the stdlib
- You need slots for large datasets and already target Python 3.10+, where dataclasses supports slots=True
Choose attrs when:
- You need built-in validators
- You want converters for automatic type conversion
- You need more control over generated methods
- You are building a library or framework that benefits from attrs patterns
- You do not mind adding a dependency
Performance
Both libraries add minimal overhead. In most applications, the difference is negligible. If you are processing millions of instances, slots=True (both libraries) matters more than which library you choose.
Migrating from Dataclasses to attrs
If you start with dataclasses and later need validators or converters, the migration is straightforward:
# Before: dataclass
from dataclasses import dataclass
@dataclass
class User:
    name: str
    email: str
    def __post_init__(self):
        if "@" not in self.email:
            raise ValueError("Invalid email")
# After: attrs
import attr
@attr.s(auto_attribs=True)
class User:
    name: str
    email: str = attr.ib(validator=attr.validators.matches_re(r".*@.*"))
The attrs version is more concise and the validator is reusable.
Conclusion
Dataclasses are the right choice for simple data containers where you just want to reduce boilerplate. They are built-in, require no dependencies, and integrate well with the standard library.
attrs is the better choice when you need more power—validators, converters, and more control over how your classes behave. The extra dependency is worth it for projects that need these features.
Both are mature, well-maintained libraries. Your choice depends on your specific needs, not on which is “better” in absolute terms.
See Also
- dataclasses module: Python built-in dataclasses documentation
- attrs documentation: Official attrs library documentation
- __slots__: Using slots for memory efficiency in Python classes