Integration with dataclasses and attrs

SQLAlchemy as of version 2.0 features “native dataclass” integration where an Annotated Declarative Table mapping may be turned into a Python dataclass by adding a single mixin or decorator to mapped classes.

New in version 2.0: Integrated dataclass creation with ORM Declarative classes

There are also patterns available that allow existing dataclasses to be mapped, as well as to map classes instrumented by the attrs third party integration library.

Declarative Dataclass Mapping

SQLAlchemy Annotated Declarative Table mappings may be augmented with an additional mixin class or decorator directive. This adds a step to the Declarative process, after the mapping has been set up, that converts the mapped class in-place into a Python dataclass; the mapping process then completes by applying ORM-specific instrumentation to the class. The most prominent behavioral addition this provides is generation of an __init__() method with fine-grained control over positional and keyword arguments with or without defaults, as well as generation of methods like __repr__() and __eq__().

From a PEP 484 typing perspective, the class is recognized as having Dataclass-specific behaviors, most notably by taking advantage of PEP 681 “Dataclass Transforms”, which allows typing tools to consider the class as though it were explicitly decorated using the @dataclasses.dataclass decorator.

Note

Support for PEP 681 in typing tools as of July 3, 2022 is limited and is currently known to be supported by Pyright, but not yet Mypy. When PEP 681 is not supported, typing tools will see the __init__() constructor provided by the DeclarativeBase superclass, if used, else will see the constructor as untyped.

Dataclass conversion may be added to any Declarative class either by adding the MappedAsDataclass mixin to a DeclarativeBase class hierarchy, or for decorator mapping by using the registry.mapped_as_dataclass() class decorator.

The MappedAsDataclass mixin may be applied either to the Declarative Base class or any superclass, as in the example below:

from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import MappedAsDataclass


class Base(MappedAsDataclass, DeclarativeBase):
    """subclasses will be converted to dataclasses"""


class User(Base):
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]

Or may be applied directly to classes that extend from the Declarative base:

from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import MappedAsDataclass


class Base(DeclarativeBase):
    pass


class User(MappedAsDataclass, Base):
    """User class will be converted to a dataclass"""

    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]

When using the decorator form, only the registry.mapped_as_dataclass() decorator is supported:

from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry


reg = registry()


@reg.mapped_as_dataclass
class User:
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]

Class level feature configuration

Support for dataclasses features is partial. Currently supported are the init, repr, eq, order and unsafe_hash features; match_args and kw_only are supported on Python 3.10+. Currently not supported are the frozen and slots features.

When using the mixin class form with MappedAsDataclass, class configuration arguments are passed as class-level parameters:

from sqlalchemy.orm import DeclarativeBase
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import MappedAsDataclass


class Base(DeclarativeBase):
    pass


class User(MappedAsDataclass, Base, repr=False, unsafe_hash=True):
    """User class will be converted to a dataclass"""

    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]

When using the decorator form with registry.mapped_as_dataclass(), class configuration arguments are passed to the decorator directly:

from sqlalchemy.orm import registry
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column


reg = registry()


@reg.mapped_as_dataclass(unsafe_hash=True)
class User:
    """User class will be converted to a dataclass"""

    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]

For background on dataclass class options, see the dataclasses documentation at @dataclasses.dataclass.

Attribute Configuration

SQLAlchemy native dataclasses differ from normal dataclasses in that attributes to be mapped are described using the Mapped generic annotation container in all cases. Mappings follow the same forms as those documented at Declarative Table with mapped_column(), and all features of mapped_column() and Mapped are supported.

Additionally, ORM attribute configuration constructs including mapped_column(), relationship() and composite() support per-attribute field options, including init, default, default_factory and repr. The names of these arguments are fixed as specified in PEP 681. Functionality is equivalent to dataclasses.

Another key difference from dataclasses is that default values for attributes must be configured using the default parameter of the ORM construct, such as mapped_column(default=None). A syntax that resembles dataclass syntax which accepts simple Python values as defaults without using dataclasses.field() is not supported.

As an example using mapped_column(), the mapping below will produce an __init__() method that accepts only the fields name and fullname, where name is required and may be passed positionally, and fullname is optional. The id field, which we expect to be database-generated, is not part of the constructor at all:

from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry

reg = registry()


@reg.mapped_as_dataclass
class User:
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    name: Mapped[str]
    fullname: Mapped[str] = mapped_column(default=None)


# 'fullname' is an optional keyword argument
u1 = User("name")

Column Defaults

In order to accommodate the name overlap of the default argument with the existing Column.default parameter of the Column construct, the mapped_column() construct disambiguates the two names by adding a new parameter mapped_column.insert_default, which will be populated directly into the Column.default parameter of Column, independently of what may be set on mapped_column.default, which is always used for the dataclasses configuration. For example, to configure a datetime column with a Column.default set to the func.utc_timestamp() SQL function, but where the parameter is optional in the constructor:

from datetime import datetime

from sqlalchemy import func
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry

reg = registry()


@reg.mapped_as_dataclass
class User:
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(init=False, primary_key=True)
    created_at: Mapped[datetime] = mapped_column(
        insert_default=func.utc_timestamp(), default=None
    )

With the above mapping, an INSERT for a new User object where no parameter for created_at was passed proceeds as:

>>> with Session(e) as session:
...     session.add(User())
...     session.commit()
BEGIN (implicit)
INSERT INTO user_account (created_at) VALUES (utc_timestamp())
[generated in 0.00010s] ()
COMMIT

Integration with Annotated

The approach introduced at Mapping Whole Column Declarations to Python Types illustrates how to use PEP 593 Annotated objects to package whole mapped_column() constructs for re-use. This feature is supported with the dataclasses feature. One aspect of the feature however requires a workaround when working with typing tools, which is that the PEP 681-specific arguments init, default, repr, and default_factory must be on the right hand side, packaged into an explicit mapped_column() construct, in order for the typing tool to interpret the attribute correctly. As an example, the approach below will work perfectly fine at runtime, however typing tools will consider the User() construction to be invalid, as they do not see the init=False parameter present:

from typing import Annotated

from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry

# typing tools will ignore init=False here
intpk = Annotated[int, mapped_column(init=False, primary_key=True)]

reg = registry()


@reg.mapped_as_dataclass
class User:
    __tablename__ = "user_account"
    id: Mapped[intpk]


# typing error: Argument missing for parameter "id"
u1 = User()

Instead, mapped_column() must be present on the right side as well with an explicit setting for mapped_column.init; the other arguments can remain within the Annotated construct:

from typing import Annotated

from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry

intpk = Annotated[int, mapped_column(primary_key=True)]

reg = registry()


@reg.mapped_as_dataclass
class User:
    __tablename__ = "user_account"

    # init=False and other pep-681 arguments must be inline
    id: Mapped[intpk] = mapped_column(init=False)


u1 = User()

Relationship Configuration

The Mapped annotation in combination with relationship() is used in the same way as described at Basic Relationship Patterns. When specifying a collection-based relationship() as an optional keyword argument, the relationship.default_factory parameter must be passed and it must refer to the collection class that’s to be used. Many-to-one and scalar object references may make use of relationship.default if the default value is to be None:

from typing import List

from sqlalchemy import ForeignKey
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship

reg = registry()


@reg.mapped_as_dataclass
class Parent:
    __tablename__ = "parent"
    id: Mapped[int] = mapped_column(primary_key=True)
    children: Mapped[List["Child"]] = relationship(
        default_factory=list, back_populates="parent"
    )


@reg.mapped_as_dataclass
class Child:
    __tablename__ = "child"
    id: Mapped[int] = mapped_column(primary_key=True)
    parent_id: Mapped[int] = mapped_column(ForeignKey("parent.id"))
    parent: Mapped["Parent"] = relationship(default=None)

The above mapping will generate an empty list for Parent.children when a new Parent() object is constructed without passing children, and similarly a None value for Child.parent when a new Child() object is constructed without passing parent.

While the relationship.default_factory can be automatically derived from the given collection class of the relationship() itself, this would break compatibility with dataclasses, as the presence of relationship.default_factory or relationship.default is what determines if the parameter is to be required or optional when rendered into the __init__() method.

Applying ORM Mappings to an existing dataclass

SQLAlchemy’s native dataclass support builds upon the previous version of the feature first introduced in SQLAlchemy 1.4, which supports the application of ORM mappings to a class after it has been processed with the @dataclass decorator, by using either the registry.mapped() class decorator, or the registry.map_imperatively() method to apply ORM mappings to the class using Imperative Mapping. This approach is still viable for applications that are using partially or fully imperative mapping forms with dataclasses.

For fully Declarative mapping combined with dataclasses, the Declarative Dataclass Mapping approach should be preferred.

New in version 1.4: Added support for direct mapping of Python dataclasses

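To map an existing dataclass, SQLAlchemy’s “inline” declarative directives cannot be used directly; ORM directives are assigned using one of three techniques, each covered in its own section below: Declarative with Imperative Table, Declarative Mapping using dataclass field metadata, and Imperative Mapping.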

The general process by which SQLAlchemy applies mappings to a dataclass is the same as that of an ordinary class, with the addition that SQLAlchemy will detect class-level attributes that were part of the dataclasses declaration process and replace them at runtime with the usual SQLAlchemy ORM mapped attributes. The __init__ method that would have been generated by dataclasses is left intact, as are all the other methods that dataclasses generates such as __eq__(), __repr__(), etc.

Mapping dataclasses using Declarative With Imperative Table

An example of mapping a @dataclass class using Declarative with Imperative Table (a.k.a. Hybrid Declarative) is below. A complete Table object is constructed explicitly and assigned to the __table__ attribute. Instance fields are defined using normal dataclass syntax. Additional MapperProperty definitions, such as relationship(), are placed in the __mapper_args__ class-level dictionary underneath the properties key, corresponding to the Mapper.properties parameter:

from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional

from sqlalchemy import Column, ForeignKey, Integer, String, Table
from sqlalchemy.orm import registry, relationship

mapper_registry = registry()


@mapper_registry.mapped
@dataclass
class User:
    __table__ = Table(
        "user",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(50)),
        Column("fullname", String(50)),
        Column("nickname", String(12)),
    )
    id: int = field(init=False)
    name: Optional[str] = None
    fullname: Optional[str] = None
    nickname: Optional[str] = None
    addresses: List[Address] = field(default_factory=list)

    __mapper_args__ = {  # type: ignore
        "properties": {
            "addresses": relationship("Address"),
        }
    }


@mapper_registry.mapped
@dataclass
class Address:
    __table__ = Table(
        "address",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("user_id", Integer, ForeignKey("user.id")),
        Column("email_address", String(50)),
    )
    id: int = field(init=False)
    user_id: int = field(init=False)
    email_address: Optional[str] = None

In the above example, the User.id, Address.id, and Address.user_id attributes are defined as field(init=False). This means that parameters for these won’t be added to __init__() methods, but Session will still be able to set them after getting their values during flush from autoincrement or other default value generator. To allow them to be specified in the constructor explicitly, they would instead be given a default value of None.

For a relationship() to be declared separately, it needs to be specified directly within the Mapper.properties dictionary which itself is specified within the __mapper_args__ dictionary, so that it is passed to the constructor for Mapper. An alternative to this approach is in the next example.
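
To illustrate how a field(init=False) attribute is populated during flush, the following sketch maps a reduced, single-table version of the User class and persists one object. The in-memory SQLite engine and the variable names are assumptions of this example:

```python
from dataclasses import dataclass
from dataclasses import field
from typing import Optional

from sqlalchemy import Column
from sqlalchemy import create_engine
from sqlalchemy import Integer
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.orm import registry
from sqlalchemy.orm import Session

mapper_registry = registry()


@mapper_registry.mapped
@dataclass
class User:
    __table__ = Table(
        "user",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(50)),
    )
    id: int = field(init=False)
    name: Optional[str] = None


engine = create_engine("sqlite://")  # in-memory database, for illustration
mapper_registry.metadata.create_all(engine)

with Session(engine) as session:
    user = User(name="sandy")  # "id" is not a constructor argument
    id_before_flush = user.id  # nothing assigned yet
    session.add(user)
    session.flush()
    id_after_flush = user.id  # set by the Session from the new row
```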

Mapping dataclasses using Declarative Mapping

Deprecated since version 2.0: This approach to Declarative mapping with dataclasses should be considered as legacy. It will remain supported, but is unlikely to offer any advantages over the new approach detailed at Declarative Dataclass Mapping.

The fully declarative approach requires that Column objects are declared as class attributes, which when using dataclasses would conflict with the dataclass-level attributes. An approach to combine these together is to make use of the metadata attribute on the dataclasses.field object, where SQLAlchemy-specific mapping information may be supplied. Declarative supports extraction of these parameters when the class specifies the attribute __sa_dataclass_metadata_key__. This also provides a more succinct method of indicating the relationship() association:

from __future__ import annotations

from dataclasses import dataclass, field
from typing import List

from sqlalchemy import ForeignKey, Integer, String
from sqlalchemy.orm import mapped_column, registry, relationship

mapper_registry = registry()


@mapper_registry.mapped
@dataclass
class User:
    __tablename__ = "user"

    __sa_dataclass_metadata_key__ = "sa"
    id: int = field(
        init=False, metadata={"sa": mapped_column(Integer, primary_key=True)}
    )
    name: str = field(default=None, metadata={"sa": mapped_column(String(50))})
    fullname: str = field(default=None, metadata={"sa": mapped_column(String(50))})
    nickname: str = field(default=None, metadata={"sa": mapped_column(String(12))})
    addresses: List[Address] = field(
        default_factory=list, metadata={"sa": relationship("Address")}
    )


@mapper_registry.mapped
@dataclass
class Address:
    __tablename__ = "address"
    __sa_dataclass_metadata_key__ = "sa"
    id: int = field(
        init=False, metadata={"sa": mapped_column(Integer, primary_key=True)}
    )
    user_id: int = field(
        init=False, metadata={"sa": mapped_column(ForeignKey("user.id"))}
    )
    email_address: str = field(default=None, metadata={"sa": mapped_column(String(50))})
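
The mechanism this relies on is the per-field metadata mapping that the standard library attaches to each dataclass field. The sketch below is not SQLAlchemy's implementation; it is a standard-library-only illustration of how values stored under a chosen key can be read back. The collect_directives() helper is hypothetical and the strings stand in for constructs such as mapped_column():

```python
from dataclasses import dataclass
from dataclasses import field
from dataclasses import fields
from typing import Optional

METADATA_KEY = "sa"  # plays the role of __sa_dataclass_metadata_key__


@dataclass
class User:
    # strings are placeholders for ORM constructs such as mapped_column()
    id: int = field(init=False, metadata={METADATA_KEY: "integer primary key"})
    name: Optional[str] = field(
        default=None, metadata={METADATA_KEY: "string column, length 50"}
    )
    # a field with no entry under the key is simply skipped
    note: Optional[str] = None


def collect_directives(cls, key):
    """Hypothetical helper: gather per-field values stored under ``key``."""
    return {f.name: f.metadata[key] for f in fields(cls) if key in f.metadata}


directives = collect_directives(User, METADATA_KEY)
```

Because the metadata mapping is ignored by dataclasses itself, the dataclass behaves normally while a library can look up its own entries by key.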

Using Declarative Mixins with Dataclasses

In the section Composing Mapped Hierarchies with Mixins, Declarative Mixin classes are introduced. One requirement of declarative mixins is that certain constructs that can’t be easily duplicated must be given as callables, using the declared_attr decorator, such as in the example at Mixing in Relationships:

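from sqlalchemy import ForeignKey
from sqlalchemy.orm import declared_attr
from sqlalchemy.orm import mapped_column
from sqlalchemy.orm import relationship


class RefTargetMixin:
    @declared_attr
    def target_id(cls):
        return mapped_column("target_id", ForeignKey("target.id"))

    @declared_attr
    def target(cls):
        return relationship("Target")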

This form is supported within the Dataclasses field() object by using a lambda to indicate the SQLAlchemy construct inside the field(). Using declared_attr() to surround the lambda is optional. If we wanted to produce our User class above where the ORM fields came from a mixin that is itself a dataclass, the form would be:

@dataclass
class UserMixin:
    __tablename__ = "user"

    __sa_dataclass_metadata_key__ = "sa"

    id: int = field(
        init=False, metadata={"sa": mapped_column(Integer, primary_key=True)}
    )

    addresses: List[Address] = field(
        default_factory=list, metadata={"sa": lambda: relationship("Address")}
    )


@dataclass
class AddressMixin:
    __tablename__ = "address"
    __sa_dataclass_metadata_key__ = "sa"
    id: int = field(
        init=False, metadata={"sa": mapped_column(Integer, primary_key=True)}
    )
    user_id: int = field(
        init=False, metadata={"sa": lambda: mapped_column(ForeignKey("user.id"))}
    )
    email_address: str = field(default=None, metadata={"sa": mapped_column(String(50))})


@mapper_registry.mapped
class User(UserMixin):
    pass


@mapper_registry.mapped
class Address(AddressMixin):
    pass

New in version 1.4.2: Added support for “declared attr” style mixin attributes, namely relationship() constructs as well as Column objects with foreign key declarations, to be used within “Dataclasses with Declarative Table” style mappings.

Mapping dataclasses using Imperative Mapping

As described previously, a class which is set up as a dataclass using the @dataclass decorator can then be further decorated using the registry.mapped() decorator in order to apply declarative-style mapping to the class. As an alternative to using the registry.mapped() decorator, we may also pass the class through the registry.map_imperatively() method instead, so that we may pass all Table and Mapper configuration imperatively to the function rather than having them defined on the class itself as class variables:

from __future__ import annotations

from dataclasses import dataclass
from dataclasses import field
from typing import List

from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import MetaData
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship

mapper_registry = registry()


@dataclass
class User:
    id: int = field(init=False)
    name: str = None
    fullname: str = None
    nickname: str = None
    addresses: List[Address] = field(default_factory=list)


@dataclass
class Address:
    id: int = field(init=False)
    user_id: int = field(init=False)
    email_address: str = None


metadata_obj = MetaData()

user = Table(
    "user",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
    Column("fullname", String(50)),
    Column("nickname", String(12)),
)

address = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", Integer, ForeignKey("user.id")),
    Column("email_address", String(50)),
)

mapper_registry.map_imperatively(
    User,
    user,
    properties={
        "addresses": relationship(Address, backref="user", order_by=address.c.id),
    },
)

mapper_registry.map_imperatively(Address, address)
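
A condensed sketch of the same imperative mapping in use follows. It trims the tables to a few columns, uses Optional annotations and a string forward reference for brevity, and shows that the dataclass-generated constructors still work while the relationship and its backref are active; the variable names are illustrative:

```python
from dataclasses import dataclass
from dataclasses import field
from typing import List
from typing import Optional

from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import MetaData
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship

mapper_registry = registry()


@dataclass
class User:
    id: int = field(init=False)
    name: Optional[str] = None
    addresses: List["Address"] = field(default_factory=list)


@dataclass
class Address:
    id: int = field(init=False)
    user_id: int = field(init=False)
    email_address: Optional[str] = None


metadata_obj = MetaData()

user = Table(
    "user",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)

address = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", Integer, ForeignKey("user.id")),
    Column("email_address", String(50)),
)

mapper_registry.map_imperatively(
    User,
    user,
    properties={
        "addresses": relationship(Address, backref="user", order_by=address.c.id),
    },
)
mapper_registry.map_imperatively(Address, address)

# the dataclass constructors are unchanged by the mapping
u = User(name="sandy")
a = Address(email_address="sandy@example.com")

# the relationship's backref adds an Address.user attribute at mapping time
u.addresses.append(a)
backref_set = a.user is u
```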

Applying ORM mappings to an existing attrs class

The attrs library is a popular third party library that provides features similar to those of dataclasses, along with many additional features not found in ordinary dataclasses.

A class augmented with attrs uses the @define decorator. This decorator initiates a process to scan the class for attributes that define the class’ behavior, which are then used to generate methods, documentation, and annotations.

The SQLAlchemy ORM supports mapping an attrs class using Declarative with Imperative Table or Imperative mapping. The general form of these two styles is fully equivalent to the Mapping dataclasses using Declarative With Imperative Table and Mapping dataclasses using Imperative Mapping forms used with dataclasses, where the inline attribute directives used by dataclasses or attrs are unchanged, and SQLAlchemy’s table-oriented instrumentation is applied at runtime.

The @define decorator of attrs by default replaces the annotated class with a new __slots__ based class, which is not supported. When using the old-style decorator @attr.s or using define(slots=False), the class does not get replaced. Furthermore, attrs removes its own class-bound attributes after the decorator runs, so that SQLAlchemy’s mapping process takes over these attributes without any issue. Both decorators, @attr.s and @define(slots=False), work with SQLAlchemy.

Mapping attrs with Declarative “Imperative Table”

In the “Declarative with Imperative Table” style, a Table object is declared inline with the declarative class. The @define decorator is applied to the class first, then the registry.mapped() decorator second:

from __future__ import annotations

from typing import List
from typing import Optional

from attrs import define
from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import MetaData
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.orm import Mapped
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship

mapper_registry = registry()


@mapper_registry.mapped
@define(slots=False)
class User:
    __table__ = Table(
        "user",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(50)),
        Column("FullName", String(50), key="fullname"),
        Column("nickname", String(12)),
    )
    id: Mapped[int]
    name: Mapped[str]
    fullname: Mapped[str]
    nickname: Mapped[str]
    addresses: Mapped[List[Address]]

    __mapper_args__ = {  # type: ignore
        "properties": {
            "addresses": relationship("Address"),
        }
    }


@mapper_registry.mapped
@define(slots=False)
class Address:
    __table__ = Table(
        "address",
        mapper_registry.metadata,
        Column("id", Integer, primary_key=True),
        Column("user_id", Integer, ForeignKey("user.id")),
        Column("email_address", String(50)),
    )
    id: Mapped[int]
    user_id: Mapped[int]
    email_address: Mapped[Optional[str]]

Note

The attrs slots=True option, which enables __slots__ on a mapped class, cannot be used with SQLAlchemy mappings without fully implementing alternative attribute instrumentation, as mapped classes normally rely upon direct access to __dict__ for state storage. Behavior is undefined when this option is present.
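
The reason __slots__ is a problem can be seen without attrs or SQLAlchemy at all. The sketch below uses two hypothetical plain classes to show that instances of a __slots__ based class have no per-instance __dict__, which is the storage the note above says mapped classes normally rely on:

```python
class WithDict:
    """ordinary class: attributes live in the instance __dict__"""

    def __init__(self, name):
        self.name = name


class WithSlots:
    """__slots__ based class: no per-instance __dict__ is created"""

    __slots__ = ("name",)

    def __init__(self, name):
        self.name = name


plain = WithDict("sandy")
slotted = WithSlots("sandy")

plain_has_dict = hasattr(plain, "__dict__")
slotted_has_dict = hasattr(slotted, "__dict__")

# attributes not listed in __slots__ cannot be added to the instance
try:
    slotted.extra_state = object()
    extra_allowed = True
except AttributeError:
    extra_allowed = False
```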

Mapping attrs with Imperative Mapping

Just as is the case with dataclasses, we can make use of registry.map_imperatively() to map an existing attrs class as well:

from __future__ import annotations

from typing import List
from typing import Optional

from attrs import define
from sqlalchemy import Column
from sqlalchemy import ForeignKey
from sqlalchemy import Integer
from sqlalchemy import MetaData
from sqlalchemy import String
from sqlalchemy import Table
from sqlalchemy.orm import registry
from sqlalchemy.orm import relationship

mapper_registry = registry()


@define(slots=False)
class User:
    id: int
    name: str
    fullname: str
    nickname: str
    addresses: List[Address]


@define(slots=False)
class Address:
    id: int
    user_id: int
    email_address: Optional[str]


metadata_obj = MetaData()

user = Table(
    "user",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
    Column("fullname", String(50)),
    Column("nickname", String(12)),
)

address = Table(
    "address",
    metadata_obj,
    Column("id", Integer, primary_key=True),
    Column("user_id", Integer, ForeignKey("user.id")),
    Column("email_address", String(50)),
)

mapper_registry.map_imperatively(
    User,
    user,
    properties={
        "addresses": relationship(Address, backref="user", order_by=address.c.id),
    },
)

mapper_registry.map_imperatively(Address, address)

The above form is equivalent to the previous example using Declarative with Imperative Table.