Python dataclass
Introduction
One of the new features in Python 3.7 is the @dataclass decorator, which simplifies the creation of data classes by auto-generating special methods such as __init__() and __repr__().
A data class is a class whose main purpose is to store data, with little or no functionality of its own. This kind of class, also known as a data structure, is very common. For example, a class used to store the coordinates of a point is simply a class with 3 fields (x, y, z).
However, we often need to add a constructor, a representation method, a comparison function, etc. Writing these functions by hand is cumbersome, and this is precisely what should be handled transparently by the language.
As a matter of fact, some languages, such as Kotlin, already offer an easy way to create data classes. In Java, this can be done using the Lombok library and its @Data annotation.
Example
Here's an example of how to use @dataclass:
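A minimal sketch of such a data class, assuming a Point with fields x, y and z, where z defaults to 0.0 (the field names and the default are inferred from the namedtuple variant given later in this section):

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0  # assumed default, mirroring the namedtuple variant


p = Point(1.5, 2.5)
print(p)                          # Point(x=1.5, y=2.5, z=0.0)
print(p == Point(1.5, 2.5, 0.0))  # True
```

The type annotations are what mark x, y and z as fields; an attribute without an annotation is not picked up by the decorator.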
By default, this will auto-generate the functions needed to instantiate, compare and print the data class instances.
In other words, this is equivalent to:
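A rough hand-written counterpart might look like the sketch below. It assumes the same hypothetical Point fields (x, y, and z defaulting to 0.0) and only spells out __init__, __repr__ and __eq__; the code actually generated by the decorator differs in its details.

```python
class Point:
    def __init__(self, x, y, z=0.0):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        return f"Point(x={self.x!r}, y={self.y!r}, z={self.z!r})"

    def __eq__(self, other):
        # Only instances of the exact same class are compared, field by field.
        if other.__class__ is self.__class__:
            return (self.x, self.y, self.z) == (other.x, other.y, other.z)
        return NotImplemented


print(Point(1.5, 2.5))  # Point(x=1.5, y=2.5, z=0.0)
```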
Note that this particular example could also be written using namedtuple. The syntax is shorter, but harder to read:
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y', 'z'], defaults=(0.0,))
dataclass Parameters
The @dataclass decorator accepts a list of parameters to control which methods should be generated:
@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
init: if True, generates the __init__ method.
repr: if True, generates the __repr__ method.
eq: if True, generates the __eq__ method, which compares the fields as if they were tuples.
order: if True, generates the __lt__, __le__, __gt__, and __ge__ methods. This requires eq to be True.
unsafe_hash: if False, a __hash__ method is generated (or not) depending on the values of eq and frozen. If True, the generation of a __hash__ method is forced.
frozen: if True, the instances are immutable (read-only).
See the documentation for more information.
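As an illustration of the order and frozen parameters, here is a small sketch. The Version class and its fields are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(order=True, frozen=True)
class Version:
    major: int
    minor: int = 0


a = Version(1, 2)
b = Version(1, 10)

# order=True: instances compare like the tuples (1, 2) and (1, 10).
print(a < b)  # True

# eq=True together with frozen=True makes the instances hashable.
print(hash(a) == hash(Version(1, 2)))  # True

# frozen=True: assigning to a field raises FrozenInstanceError.
try:
    a.major = 2
except FrozenInstanceError:
    print("Version instances are read-only")
```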
Field-specific configuration
The dataclasses module provides a field function for field-specific configuration. It controls the default value of a field, whether the field is displayed by the __repr__ method, ignored by the comparison functions, included in the __hash__ method, etc. Its signature is:
def field(*, default=MISSING, default_factory=MISSING, repr=True,
hash=None, init=True, compare=True, metadata=None)
See the documentation for more information.
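A short sketch of field in use. The Student class and its attributes are hypothetical and only serve to show default_factory, repr=False and compare=False.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Student:
    name: str
    grades: List[int] = field(default_factory=list)   # a fresh list per instance
    password: str = field(default="", repr=False)     # hidden from __repr__
    nickname: str = field(default="", compare=False)  # ignored by __eq__


s1 = Student("Ann", nickname="A")
s2 = Student("Ann", nickname="Annie")
print(s1 == s2)   # True: nickname is not compared

s1.grades.append(18)
print(s1)         # Student(name='Ann', grades=[18], nickname='A')
print(s2.grades)  # []: the list is not shared between instances
```

A mutable default such as a list has to go through default_factory; writing grades: List[int] = [] directly is rejected by the decorator with a ValueError.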
Post-init processing
The generated __init__() code will call a method named __post_init__(), if one is defined on the class. This is useful to initialize a variable based on the values of other variables. Note that if no __init__ method is generated, then __post_init__ will not be called automatically.
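For instance, a derived value can be computed once the regular fields are set. In this sketch, the norm field is a hypothetical addition to the Point example; it is excluded from the constructor with field(init=False) and filled in by __post_init__.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0
    norm: float = field(init=False)  # computed, not a constructor argument

    def __post_init__(self):
        # Called by the generated __init__ after x, y and z are assigned.
        self.norm = math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)


p = Point(3.0, 4.0)
print(p.norm)  # 5.0
```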
Other Dataclasses Functions
The dataclasses module also provides a number of useful functions:
fields: returns a tuple of Field objects. A Field object contains the configuration of a field.
asdict: converts a data class instance to a dict of its fields.
astuple: converts a data class instance to a tuple of its fields.
make_dataclass: creates a new data class dynamically.
replace: creates a copy of the given data class instance with some fields modified.
is_dataclass: tells whether the given object is a data class or an instance of one.
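The sketch below exercises these helpers. The Point class repeats the hypothetical example used earlier, and Pair is an illustrative name for a class built with make_dataclass.

```python
from dataclasses import (
    asdict, astuple, dataclass, field, fields,
    is_dataclass, make_dataclass, replace,
)


@dataclass
class Point:
    x: float
    y: float
    z: float = 0.0


p = Point(1.5, 2.5)

print([f.name for f in fields(p)])  # ['x', 'y', 'z']
print(asdict(p))                    # {'x': 1.5, 'y': 2.5, 'z': 0.0}
print(astuple(p))                   # (1.5, 2.5, 0.0)

q = replace(p, z=3.0)               # new instance; p is left unchanged
print(q)                            # Point(x=1.5, y=2.5, z=3.0)

print(is_dataclass(p), is_dataclass(Point), is_dataclass(42))  # True True False

# Each entry is a name, a (name, type) pair, or a (name, type, Field) triple.
Pair = make_dataclass("Pair", [("left", int), ("right", int, field(default=0))])
print(Pair(1))                      # Pair(left=1, right=0)
```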