Python dataclass
Introduction
One of the new features in Python 3.7 is the @dataclass decorator, which simplifies the creation of data classes by auto-generating special methods such as __init__() and __repr__().
A data class is a class whose main purpose is to store data, with little or no behavior of its own. This kind of class, also known as a data structure, is very common. For example, a class used to store the coordinates of a point is simply a class with 3 fields (x, y, z).
However, we often need to add a constructor, a representation method, a comparison function, etc. Writing these methods by hand is tedious, and this is precisely the kind of boilerplate that the language should handle transparently.
As a matter of fact, some languages, such as Kotlin, already offer an easy way to create data classes. In Java this can be done using the Lombok library and its @Data annotation.
Example
Here's an example of using @dataclass:
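A minimal sketch of such a class, assuming the same Point with fields x, y and z (z defaulting to 0.0) as the namedtuple version further down. The float annotations are an assumption made for illustration:

```python
from dataclasses import dataclass


@dataclass
class Point:
    # Illustrative field types; the annotations are what make these dataclass fields.
    x: float
    y: float
    z: float = 0.0  # fields with a default must come after fields without one


p = Point(1.5, 2.5)
print(p)                           # Point(x=1.5, y=2.5, z=0.0)
print(p == Point(1.5, 2.5, 0.0))   # True
```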
By default, this will auto-generate the functions needed to instantiate, compare and print the data class instances.
In other words, this is equivalent to:
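A hand-written class that behaves roughly like a generated Point data class might look as follows. This is a simplified sketch under the same assumed fields (x, y, z with z defaulting to 0.0), not the exact code the decorator produces:

```python
class Point:
    def __init__(self, x: float, y: float, z: float = 0.0):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point(x={self.x!r}, y={self.y!r}, z={self.z!r})"

    def __eq__(self, other):
        # Only instances of the very same class are compared, field by field.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.x, self.y, self.z) == (other.x, other.y, other.z)

    # Because __eq__ is defined and __hash__ is not, Python sets __hash__ to None,
    # so instances are unhashable, which matches the @dataclass defaults.


p = Point(1, 2)
print(p)                     # Point(x=1, y=2, z=0.0)
print(p == Point(1, 2, 0.0)) # True
```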
Note that this particular example could also be written with namedtuple; the syntax is shorter, but harder to read:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0.0,))
dataclass Parameters
The @dataclass decorator accepts keyword-only parameters that control which methods are generated:
@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
init: if True, generates the __init__ method.
repr: if True, generates the __repr__ method.
eq: if True, generates the __eq__ method, which compares the fields as if they were tuples.
order: if True, generates the __lt__, __le__, __gt__, and __ge__ methods.
unsafe_hash: if False (the default), a __hash__ method is generated depending on the values of eq and frozen. If True, the __hash__ method is always generated.
frozen: if True, the instances are immutable (read-only).
See the documentation for more information.
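As an illustration of these parameters, the sketch below combines order=True and frozen=True. The Version class and its fields are hypothetical names chosen for the example:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:  # hypothetical example class
    major: int
    minor: int = 0


v1 = Version(1, 2)
v2 = Version(1, 10)

# order=True: instances compare like the tuples (1, 2) and (1, 10)
print(v1 < v2)            # True
print(sorted([v2, v1]))   # [Version(major=1, minor=2), Version(major=1, minor=10)]

# frozen=True: assigning to a field raises FrozenInstanceError
try:
    v1.minor = 3
except FrozenInstanceError as exc:
    print("read-only:", exc)

# eq=True together with frozen=True also generates __hash__,
# so instances can be used in sets and as dict keys
print(len({v1, Version(1, 2)}))  # 1
```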
Field-specific configuration
The dataclasses module also provides a field function for field-specific configuration.
It controls the default value, whether the field appears in the __repr__ output, whether it is used by the comparison methods, whether it is included in __hash__, and so on:
def field(*, default=MISSING, default_factory=MISSING, repr=True,
          hash=None, init=True, compare=True, metadata=None)
See the documentation for more information.
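The following sketch shows a few of these options in use. The Player class and its field names are hypothetical and exist only for illustration:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Player:  # hypothetical example class
    name: str
    # default_factory builds a fresh list for every instance;
    # a plain mutable default such as [] is rejected by @dataclass
    scores: List[int] = field(default_factory=list)
    # hidden from __repr__ and ignored by __eq__
    session_id: str = field(default="", repr=False, compare=False)


a = Player("Ada", session_id="abc")
b = Player("Ada", session_id="xyz")
print(a == b)     # True: session_id is excluded from the comparison

a.scores.append(10)
print(a)          # Player(name='Ada', scores=[10])
print(b.scores)   # []: each instance has its own list
```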
Post-init processing
The generated __init__() code will call a method named __post_init__() if the class defines one. This is useful for initializing a field whose value depends on other fields. Note that if no __init__ method is generated, then __post_init__ will not be called automatically.
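A small sketch of this pattern, using a hypothetical Vector class whose length field is derived from x and y:

```python
import math
from dataclasses import dataclass, field


@dataclass
class Vector:  # hypothetical example class
    x: float
    y: float
    # init=False: not a parameter of __init__, filled in by __post_init__
    length: float = field(init=False)

    def __post_init__(self):
        # Runs at the end of the generated __init__, once x and y are set.
        self.length = math.hypot(self.x, self.y)


v = Vector(3.0, 4.0)
print(v)  # Vector(x=3.0, y=4.0, length=5.0)
```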
Other Dataclasses Functions
The dataclasses module also provides several useful functions:
fields: returns a tuple of Field objects. A Field object contains the configuration of a field.
asdict: converts a data class instance to a dict of its fields.
astuple: converts a data class instance to a tuple of its fields.
make_dataclass: creates a new data class dynamically.
replace: creates a copy of the given data class instance with some fields changed.
is_dataclass: tells whether the given object is a data class or an instance of one.
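The helpers listed above can be tried on a small class. In this sketch, Point reuses the assumed x, y, z fields from the earlier example and Pixel is a hypothetical class built at runtime:

```python
from dataclasses import (
    asdict, astuple, dataclass, field, fields,
    is_dataclass, make_dataclass, replace,
)


@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0


p = Point(1.0, 2.0)

print([f.name for f in fields(p)])  # ['x', 'y', 'z']
print(asdict(p))                    # {'x': 1.0, 'y': 2.0, 'z': 0.0}
print(astuple(p))                   # (1.0, 2.0, 0.0)
print(replace(p, z=3.0))            # Point(x=1.0, y=2.0, z=3.0); p itself is unchanged
print(is_dataclass(p), is_dataclass(Point), is_dataclass(42))  # True True False

# make_dataclass builds a class at runtime from (name, type[, Field]) entries
Pixel = make_dataclass("Pixel", [("row", int), ("col", int, field(default=0))])
print(Pixel(5))                     # Pixel(row=5, col=0)
```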