The Problem Dataclasses Solve
Writing a basic data-holding class in Python is repetitive:
class User:
    def __init__(self, name: str, age: int, email: str):
        self.name = name
        self.age = age
        self.email = email

    def __repr__(self):
        return f"User(name={self.name!r}, age={self.age!r}, email={self.email!r})"

    def __eq__(self, other):
        return (self.name, self.age, self.email) == (other.name, other.age, other.email)
With @dataclass, this becomes:
from dataclasses import dataclass

@dataclass
class User:
    name: str
    age: int
    email: str
The @dataclass decorator generates __init__, __repr__, and __eq__ from the annotated fields, in the order they are declared.
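To make the generated methods concrete, here is a minimal sketch that exercises them. The sample values (Alice, the example.com address) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int
    email: str


alice = User("Alice", 30, "alice@example.com")
same_alice = User("Alice", 30, "alice@example.com")

# The generated __repr__ lists every field in declaration order.
print(alice)  # User(name='Alice', age=30, email='alice@example.com')

# The generated __eq__ compares field values, not object identity.
print(alice == same_alice)  # True
print(alice is same_alice)  # False

# Instances of other types never compare equal, even with matching values.
print(alice == ("Alice", 30, "alice@example.com"))  # False
```

Unlike the hand-written __eq__ shown earlier, the generated one does not raise when the other object lacks these attributes; the comparison is simply False.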
Default Values
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Config:
    host: str = 'localhost'
    port: int = 3000
    debug: bool = False
    tags: list[str] = field(default_factory=list)  # mutable defaults need field()
    metadata: dict = field(default_factory=dict)
    name: Optional[str] = None
Important: Never use tags: list = [] as a default. In an ordinary class, a mutable default is shared across instances; @dataclass refuses list, dict, and set defaults outright and raises ValueError when the class is defined. Use field(default_factory=list) instead.
# Wrong: a bare mutable default
@dataclass
class Bad:
    items: list = []  # Raises ValueError at class definition; dataclass rejects it

# Correct: each instance gets its own list
@dataclass
class Good:
    items: list = field(default_factory=list)
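The following sketch shows both halves of that warning: the sharing bug in a hand-written class, and the error @dataclass raises to prevent it. PlainBad and the `rejected` flag are hypothetical names used only for this demonstration.

```python
from dataclasses import dataclass, field


class PlainBad:
    # Hand-written class: the default list is created once, at definition time.
    def __init__(self, items=[]):
        self.items = items


a, b = PlainBad(), PlainBad()
a.items.append("x")
print(b.items)  # ['x'] because both instances share one list

# @dataclass refuses a list default when the class body is processed.
rejected = False
try:
    @dataclass
    class Bad:
        items: list = []
except ValueError as exc:
    rejected = True
    print(f"ValueError: {exc}")


@dataclass
class Good:
    items: list = field(default_factory=list)


g1, g2 = Good(), Good()
g1.items.append("x")
print(g2.items)  # [] because default_factory ran once per instance
```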
Frozen Dataclasses (Immutable)
@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
p.x = 5.0  # Raises dataclasses.FrozenInstanceError: cannot assign to field 'x'

# Frozen dataclasses are hashable (can be used as dict keys or in sets)
points = {Point(0, 0), Point(1, 1)}
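As a runnable illustration of the frozen behavior, the sketch below catches the error instead of letting it stop the script, then uses a point as a dictionary key. The `labels` dictionary is an invented example.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:
    x: float
    y: float


p = Point(1.0, 2.0)

try:
    p.x = 5.0
except FrozenInstanceError as exc:
    print(f"FrozenInstanceError: {exc}")

# FrozenInstanceError is a subclass of AttributeError.
print(issubclass(FrozenInstanceError, AttributeError))  # True

# Equal points hash equally, so they work as dict keys and collapse in sets.
labels = {Point(0.0, 0.0): "origin"}
print(labels[Point(0.0, 0.0)])  # origin
print(len({Point(1.0, 1.0), Point(1.0, 1.0)}))  # 1
```

Hashing relies on the field values, so a frozen dataclass that holds an unhashable field (a list, for example) will still fail when hashed.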
Post-Init Processing
import math

@dataclass
class Circle:
    radius: float
    diameter: float = field(init=False)  # Not in __init__ params
    area: float = field(init=False)

    def __post_init__(self):
        if self.radius < 0:
            raise ValueError("Radius cannot be negative")
        self.diameter = self.radius * 2
        self.area = math.pi * self.radius ** 2

c = Circle(radius=5.0)
print(c.diameter)  # 10.0
print(c.area)      # approximately 78.54
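A few edge cases around __post_init__ are easy to miss, so here is a small sketch of them, using a trimmed-down Circle with only an area field. It assumes nothing beyond the standard library.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Circle:
    radius: float
    area: float = field(init=False)

    def __post_init__(self):
        if self.radius < 0:
            raise ValueError("Radius cannot be negative")
        self.area = math.pi * self.radius ** 2


# Validation runs as part of construction.
try:
    Circle(radius=-1.0)
except ValueError as exc:
    print(exc)  # Radius cannot be negative

# init=False fields are not accepted as constructor arguments.
try:
    Circle(radius=1.0, area=99.0)
except TypeError as exc:
    print(f"TypeError: {exc}")

# __post_init__ runs only at construction, so derived fields can go stale.
c = Circle(radius=1.0)
c.radius = 2.0
print(math.isclose(c.area, math.pi))  # True: area still reflects radius 1.0
```

If derived values must track later changes, a property or a frozen dataclass is usually a safer fit than a stored field.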
Field Customization
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Product:
    name: str
    price: float
    _internal_id: str = field(repr=False, compare=False)  # Hidden from repr/eq
    created_at: str = field(default_factory=lambda: datetime.now().isoformat())
    tags: list[str] = field(default_factory=list)
    # Computed field, not part of __init__
    display_price: str = field(init=False, repr=True)

    def __post_init__(self):
        self.display_price = f"${self.price:.2f}"
field() options:
- default: default value
- default_factory: callable for mutable defaults
- repr=False: exclude from __repr__
- compare=False: exclude from __eq__
- init=False: exclude from __init__ (set in __post_init__)
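The effect of these options is easiest to see side by side. The sketch below uses a reduced Product with a hypothetical internal_id field and made-up SKU values; it is an illustration, not the full class above.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    name: str
    price: float
    # Hidden from repr and ignored by ==.
    internal_id: str = field(default="", repr=False, compare=False)
    # Not a constructor parameter; filled in by __post_init__.
    display_price: str = field(init=False)

    def __post_init__(self):
        self.display_price = f"${self.price:.2f}"


a = Product("Mug", 9.5, internal_id="sku-001")
b = Product("Mug", 9.5, internal_id="sku-002")

print(a)       # Product(name='Mug', price=9.5, display_price='$9.50')
print(a == b)  # True: internal_id is excluded from the comparison
```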
Ordering
@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int

v1 = Version(1, 2, 3)
v2 = Version(1, 3, 0)
print(v1 < v2)  # True (compares major, then minor, then patch)

versions = [Version(2, 0, 0), Version(1, 9, 5), Version(1, 10, 0)]
print(sorted(versions))
# [Version(major=1, minor=9, patch=5), Version(major=1, minor=10, patch=0), ...]
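One way to reason about order=True is that instances compare like tuples of their fields, in declaration order. The sketch below checks that equivalence and shows what happens when the other operand is not a Version.

```python
from dataclasses import dataclass, astuple


@dataclass(order=True)
class Version:
    major: int
    minor: int
    patch: int


versions = [Version(2, 0, 0), Version(1, 9, 5), Version(1, 10, 0)]

# Generated comparisons match comparing tuples of the fields.
print(Version(1, 9, 5) < Version(1, 10, 0))                    # True
print(astuple(Version(1, 9, 5)) < astuple(Version(1, 10, 0)))  # True

# Anything that relies on ordering works, not only sorted().
print(max(versions))  # Version(major=2, minor=0, patch=0)

# Ordering against a different type raises TypeError.
try:
    Version(1, 0, 0) < (1, 0, 0)
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Because the fields are integers, 1.10.0 sorts after 1.9.5, which is the behavior string comparison of version numbers gets wrong.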
Dataclass Utilities
from dataclasses import dataclass, asdict, astuple, replace, fields

@dataclass
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)

# Convert to dict (useful for serialization)
asdict(p)  # {'x': 1.0, 'y': 2.0}

# Convert to tuple
astuple(p)  # (1.0, 2.0)

# Create a copy with some fields changed (frozen-safe)
p2 = replace(p, x=5.0)  # Point(x=5.0, y=2.0)

# Inspect fields programmatically
for f in fields(p):
    print(f.name, f.type)  # x <class 'float'>, then y <class 'float'>
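Since asdict() is often used for serialization, here is a hedged sketch of a JSON round trip with a nested dataclass. Segment is a hypothetical class added for the example, and the manual reconstruction step is one simple approach among several.

```python
import json
from dataclasses import dataclass, asdict, replace


@dataclass(frozen=True)
class Point:
    x: float
    y: float


@dataclass(frozen=True)
class Segment:
    start: Point
    end: Point


seg = Segment(Point(0.0, 0.0), Point(3.0, 4.0))

# asdict() recurses into nested dataclasses, so the result is JSON-ready.
payload = json.dumps(asdict(seg))
print(payload)  # {"start": {"x": 0.0, "y": 0.0}, "end": {"x": 3.0, "y": 4.0}}

# Loading is manual: nested dicts must be turned back into dataclasses.
data = json.loads(payload)
restored = Segment(Point(**data["start"]), Point(**data["end"]))
print(restored == seg)  # True

# replace() builds a new instance, which is how frozen objects get "updated".
moved = replace(seg, end=Point(6.0, 8.0))
print(moved.end)  # Point(x=6.0, y=8.0)
```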
Dataclass vs NamedTuple vs TypedDict
| | dataclass | NamedTuple | TypedDict |
|---|---|---|---|
| Mutable | Yes (unless frozen) | No | Yes (dict) |
| Methods | Yes | Yes | No |
| Inheritance | Yes | Limited (subclasses cannot add fields) | Yes |
| JSON serializable | Via asdict() | Via ._asdict() | Native |
| Use when | Data + behavior | Simple immutable data | Dict structure |
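The rows above can be checked with a short sketch that defines the same two-field point three ways. The class names PointDC, PointNT, and PointTD are invented for the comparison.

```python
from dataclasses import dataclass
from typing import NamedTuple, TypedDict


@dataclass
class PointDC:
    x: float
    y: float


class PointNT(NamedTuple):
    x: float
    y: float


class PointTD(TypedDict):
    x: float
    y: float


dc = PointDC(1.0, 2.0)
nt = PointNT(1.0, 2.0)
td: PointTD = {"x": 1.0, "y": 2.0}

dc.x = 5.0  # fine: dataclasses are mutable unless frozen=True

try:
    nt.x = 5.0  # NamedTuple instances are immutable
except AttributeError as exc:
    print(f"AttributeError: {exc}")

x, y = nt                # a NamedTuple unpacks like a tuple
print(nt == (1.0, 2.0))  # True: it also compares equal to a plain tuple

print(type(td) is dict)  # True: a TypedDict is a plain dict at runtime
```

TypedDict annotations are enforced only by static type checkers; at runtime nothing stops a missing or extra key.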
Key Takeaways
- @dataclass generates __init__, __repr__, __eq__ automatically
- Use field(default_factory=list) for mutable defaults, never bare []
- frozen=True makes instances immutable and hashable
- __post_init__ runs after __init__: validate and compute derived fields there
- asdict(), replace(), fields() are utility functions for working with dataclasses
- order=True generates comparison methods for sorting