SQLAlchemy 2.1 Documentation
What’s New in SQLAlchemy 2.1?
About this Document
This document describes changes between SQLAlchemy version 2.0 and version 2.1.
Row now represents individual column types directly without Tuple
SQLAlchemy 2.0 implemented a broad array of PEP 484 typing throughout all components, including a new ability for row-returning statements such as select() to keep track of individual column types, which were then passed through the execution phase onto the Result object and then to the individual Row objects. Described at SQL Expression / Statement / Result Set Typing, this approach solved several issues with statement / row typing, but some remained unsolvable. In 2.1, one of those issues, that the individual column types needed to be packaged into a typing.Tuple, is now resolved using new PEP 646 integration, which allows for tuple-like types that are not actually typed as Tuple.
In SQLAlchemy 2.0, a statement such as:

stmt = select(column("x", Integer), column("y", String))

would be typed as:

Select[Tuple[int, str]]

In 2.1, it’s now typed as:

Select[int, str]

When executing stmt, the Result and Row objects will be typed as Result[int, str] and Row[int, str], respectively. The prior workaround of using Row._t to type the row as a real Tuple is no longer needed, and projects can migrate off this pattern.

Mypy users will need Mypy 1.7 or greater for PEP 646 integration to be available.
Limitations
Not yet solved by PEP 646 or any other PEP is the ability for an arbitrary number of expressions within Select and others to be mapped to row objects without stating each argument position explicitly within typing annotations. To work around this issue, SQLAlchemy makes use of automated “stub generation” tools to generate hardcoded mappings of different numbers of positional arguments to constructs like select(), resolving to individual Unpack[] expressions (in SQLAlchemy 2.0, this generation produced Tuple[] annotations instead). This means that there are arbitrary limits on how many specific column expressions will be typed within the Row object without resorting to Any for remaining expressions; for select(), it’s currently ten expressions, and for DML expressions like insert() that use Insert.returning(), it’s eight. If and when a new PEP that provides a Map operator for PEP 646 is proposed, this limitation can be lifted. [1] Originally, it was mistakenly assumed that this limitation prevented PEP 646 from being usable at all; however, the Unpack construct does in fact replace everything that was done using Tuple in 2.0.
An additional limitation, for which there is no proposed solution, is that there’s no way for the name-based attributes on Row to be automatically typed, so these continue to be typed as Any (e.g. row.x and row.y for the above example). With current language features, this could only be fixed by having an explicit class-based construct that allows one to compose an explicit Row with explicit fields up front, which would be verbose and not automatic.
Asyncio “greenlet” dependency no longer installs by default
SQLAlchemy 1.4 and 2.0 used a complex setup expression to determine whether the greenlet dependency, needed by the asyncio extension, could be installed from PyPI using a pre-built wheel instead of having to build from source. This is because the source build of greenlet is not always trivial on some platforms.

Disadvantages to this approach included that SQLAlchemy needed to track exactly which versions of greenlet were published as wheels on PyPI; the setup expression led to problems with some package management tools such as poetry; and it was not possible to install SQLAlchemy without greenlet being installed, even though this is completely feasible if the asyncio extension is not used.
These problems are all solved by keeping greenlet entirely within the [asyncio] install target. The only downside is that users of the asyncio extension need to be aware of this extra installation dependency.
ORM Relationship allows callable for back_populates
To help produce code that is more amenable to IDE-level linting and type checking, the relationship.back_populates parameter now accepts both direct references to a class-bound attribute as well as lambdas which do the same:
class A(Base):
    __tablename__ = "a"

    id: Mapped[int] = mapped_column(primary_key=True)

    # use a lambda: to link to B.a directly when it exists
    bs: Mapped[list[B]] = relationship(back_populates=lambda: B.a)


class B(Base):
    __tablename__ = "b"

    id: Mapped[int] = mapped_column(primary_key=True)
    a_id: Mapped[int] = mapped_column(ForeignKey("a.id"))

    # A.bs already exists, so can link directly
    a: Mapped[A] = relationship(back_populates=A.bs)
ORM Mapped Dataclasses no longer populate implicit default in __dict__
This behavioral change addresses a widely reported issue with SQLAlchemy’s Declarative Dataclass Mapping feature that was introduced in 2.0.

SQLAlchemy ORM has always featured a behavior where a particular attribute on an ORM mapped class will behave differently depending on whether it has an actively set value, including if that value is None, versus if the attribute is not set at all. When Declarative Dataclass Mapping was introduced, the mapped_column.default parameter introduced a new capability, which is to set up a dataclass-level default to be present in the generated __init__ method. This had the unfortunate side effect of breaking various popular workflows, the most prominent of which is creating an ORM object with the foreign key value in lieu of a many-to-one reference:
class Base(MappedAsDataclass, DeclarativeBase):
    pass


class Parent(Base):
    __tablename__ = "parent"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    related_id: Mapped[int | None] = mapped_column(ForeignKey("child.id"), default=None)
    related: Mapped[Child | None] = relationship(default=None)


class Child(Base):
    __tablename__ = "child"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
In the above mapping, the __init__ method generated for Parent would in Python code look like this:

def __init__(self, related_id=None, related=None): ...

This means that creating a new Parent with related_id only would populate both related_id and related in __dict__:
# 2.0 behavior; will INSERT NULL for related_id due to the presence
# of related=None
>>> p1 = Parent(related_id=5)
>>> p1.__dict__
{'related_id': 5, 'related': None, '_sa_instance_state': ...}
The None value for 'related' means that SQLAlchemy favors the non-present related Child over the present value for 'related_id', which would be discarded; NULL would be inserted for 'related_id' instead.
In the new behavior, the __init__ method instead looks like the example below, using a special constant DONT_SET to indicate that a non-present value for 'related' should be ignored. This allows the class to behave more closely to how SQLAlchemy ORM mapped classes traditionally operate:

def __init__(self, related_id=DONT_SET, related=DONT_SET): ...

We then get a __dict__ setup that follows the expected behavior of omitting related from __dict__ and later running an INSERT with related_id=5:
# 2.1 behavior; will INSERT 5 for related_id
>>> p1 = Parent(related_id=5)
>>> p1.__dict__
{'related_id': 5, '_sa_instance_state': ...}
Dataclass defaults are delivered via descriptor instead of __dict__
The above behavior goes a step further: in order to honor default values that are something other than None, the value of the dataclass-level default (i.e. set using any of the mapped_column.default, column_property.default, or deferred.default parameters) is directed to be delivered at the Python descriptor level, using mechanisms in SQLAlchemy’s attribute system that normally return None for unpopulated columns, so that even though the default is not populated into __dict__, it’s still delivered when the attribute is accessed. This behavior is based on what Python dataclasses itself does when a default is indicated for a field that also includes init=False.
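The plain-dataclass behavior being referenced can be seen with the standard library alone; the sketch below uses only stdlib dataclasses and is not SQLAlchemy-specific:

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    # a plain default combined with init=False: dataclasses leave the
    # value as a class attribute rather than writing it into each
    # instance's __dict__
    status: str = field(default="default_status", init=False)


e = Example()
print("status" in e.__dict__)  # False: nothing stored on the instance
print(e.status)  # default_status, found via class-level lookup
```

SQLAlchemy's 2.1 behavior uses its own descriptor machinery rather than a bare class attribute, but the observable effect, a default that is accessible on the object without occupying __dict__, is the same.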
In the example below, an immutable default "default_status" is applied to a column called status:
class Base(MappedAsDataclass, DeclarativeBase):
    pass


class SomeObject(Base):
    __tablename__ = "parent"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    status: Mapped[str] = mapped_column(default="default_status")
In the above mapping, constructing SomeObject with no parameters will deliver no values inside of __dict__, but will deliver the default value via descriptor:
# object is constructed with no value for ``status``
>>> s1 = SomeObject()
# the default value is not placed in ``__dict__``
>>> s1.__dict__
{'_sa_instance_state': ...}
# but the default value is delivered at the object level via descriptor
>>> s1.status
'default_status'
# the value still remains unpopulated in ``__dict__``
>>> s1.__dict__
{'_sa_instance_state': ...}
The value passed as mapped_column.default is also assigned, as was the case before, to the Column.default parameter of the underlying Column, where it takes place as a Python-level default for INSERT statements. So while __dict__ is never populated with the default value on the object, the INSERT still includes the value in the parameter set. This essentially modifies the Declarative Dataclass Mapping system to work more like traditional ORM mapped classes, where a “default” means just that: a column-level default.
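The Core-level behavior referenced above, where Column.default supplies a Python-level value at INSERT time, is long-standing and can be sketched against an in-memory SQLite database (the table and column names here are illustrative, not taken from the mapping above):

```python
from sqlalchemy import (
    Column,
    Integer,
    MetaData,
    String,
    Table,
    create_engine,
    insert,
    select,
)

metadata = MetaData()
some_table = Table(
    "some_table",
    metadata,
    Column("id", Integer, primary_key=True),
    # Python-level default, applied in the INSERT parameter set
    Column("status", String, default="default_status"),
)

engine = create_engine("sqlite://")
metadata.create_all(engine)

with engine.begin() as conn:
    # no value supplied for "status"; the Column default fills it in
    conn.execute(insert(some_table).values(id=1))
    status = conn.execute(select(some_table.c.status)).scalar_one()

print(status)  # default_status
```

The 2.1 change means the dataclass-level default and this INSERT-level default now agree without the value ever being written into the object's __dict__.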
Dataclass defaults are accessible on objects even without init
As the new behavior makes use of descriptors in a similar way as Python dataclasses do themselves when init=False, the new feature implements this behavior as well. This is an all-new behavior whereby an ORM mapped class can deliver a default value for fields even if they are not part of the __init__() method at all. In the mapping below, the status field is configured with init=False, meaning it’s not part of the constructor at all:
class Base(MappedAsDataclass, DeclarativeBase):
    pass


class SomeObject(Base):
    __tablename__ = "parent"

    id: Mapped[int] = mapped_column(primary_key=True, init=False)
    status: Mapped[str] = mapped_column(default="default_status", init=False)
When we construct SomeObject() with no arguments, the default is accessible on the instance, delivered via descriptor:

>>> so = SomeObject()
>>> so.status
'default_status'
Related Changes
This change includes the following API changes:
- The relationship.default parameter, when present, only accepts a value of None, and is only accepted when the relationship is ultimately a many-to-one relationship or one that establishes relationship.uselist as False.
- The mapped_column.default and mapped_column.insert_default parameters are mutually exclusive, and only one may be passed at a time. The behavior of the two parameters is equivalent at the Column level; however, at the Declarative Dataclass Mapping level, only mapped_column.default actually sets the dataclass-level default with descriptor access. Using mapped_column.insert_default will have the effect of the object attribute defaulting to None on the instance until the INSERT takes place, in the same way it works on traditional ORM mapped classes.
URL stringify and parse now supports URL escaping for the “database” portion
A URL that includes URL-escaped characters in the database portion will now parse with conversion of those escaped characters:
>>> from sqlalchemy import make_url
>>> u = make_url("driver://user:pass@host/database%3Fname")
>>> u.database
'database?name'
Previously, such characters would not be unescaped:
>>> # pre-2.1 behavior
>>> from sqlalchemy import make_url
>>> u = make_url("driver://user:pass@host/database%3Fname")
>>> u.database
'database%3Fname'
This change also applies to the stringify side; most special characters in the database name will be URL escaped, omitting a few such as plus signs and slashes:
>>> from sqlalchemy import URL
>>> u = URL.create("driver", database="a?b=c")
>>> str(u)
'driver:///a%3Fb%3Dc'
Where the above URL correctly round-trips to itself:
>>> make_url(str(u))
driver:///a%3Fb%3Dc
>>> make_url(str(u)).database == u.database
True
Whereas previously, special characters applied programmatically would not be escaped in the result, leading to a URL that does not represent the original database portion. Below, b=c is part of the query string and not the database portion:
>>> from sqlalchemy import URL
>>> u = URL.create("driver", database="a?b=c")
>>> str(u)
'driver:///a?b=c'
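The escaping principle, though not SQLAlchemy's exact internal rules, can be illustrated with the standard library; passing safe="+/" below approximates the note above about leaving plus signs and slashes unescaped:

```python
from urllib.parse import quote, unquote

database = "a?b=c"

# escape everything except plus signs and slashes (an approximation of
# the rules described above; SQLAlchemy's exact safe set may differ)
escaped = quote(database, safe="+/")
print(escaped)  # a%3Fb%3Dc

# parsing reverses the escaping, restoring the original database name
print(unquote(escaped))  # a?b=c
```

The round-trip property shown for make_url(str(u)) above relies on exactly this symmetry between escaping on stringify and unescaping on parse.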
Potential breaking change to odbc_connect= handling for mssql+pyodbc

Fixed a mssql+pyodbc issue where valid plus signs in an already-unquoted odbc_connect= (raw DBAPI) connection string were replaced with spaces.

Previously, the pyodbc connector would always pass the odbc_connect value to unquote_plus(), even if that was not required. So, if the (unquoted) odbc_connect value contained PWD=pass+word, that would get changed to PWD=pass word, and the login would fail. One workaround was to quote just the plus sign, as PWD=pass%2Bword, which would then get unquoted back to PWD=pass+word.
Implementations using the above workaround with URL.create() to specify a plus sign in the PWD= argument of an odbc_connect string will have to remove the workaround and just pass the PWD= value as it would appear in a valid ODBC connection string (i.e., the same as would be required if using the connection string directly with pyodbc.connect()).
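The unquote_plus() behavior at the root of this issue comes from the standard library and can be demonstrated directly (the connection-string fragments below are illustrative):

```python
from urllib.parse import unquote_plus

# the pre-2.1 connector applied unquote_plus() unconditionally:
# a literal plus sign in an already-unquoted value becomes a space
print(unquote_plus("PWD=pass+word"))  # PWD=pass word

# the old workaround: percent-encode just the plus sign so that the
# unconditional unquoting round-trips back to the intended value
print(unquote_plus("PWD=pass%2Bword"))  # PWD=pass+word
```

Under the 2.1 behavior the unquoting step is skipped when it is not required, so the literal plus sign survives and the workaround must be removed.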
flambé! the dragon and The Alchemist image designs created and generously donated by Rotem Yaari.
Created using Sphinx 7.2.6. Documentation last generated: Sun 27 Apr 2025 04:02:25 PM EDT