Horizontal Sharding

Horizontal sharding support.

Defines a rudimentary “horizontal sharding” system which allows a Session to distribute queries and persistence operations across multiple databases.

For a usage example, see the Horizontal Sharding example included in the source distribution.

Deep Alchemy

The horizontal sharding extension is an advanced feature, involving a complex statement -> database interaction as well as use of semi-public APIs for non-trivial cases. Simpler approaches to referring to multiple database “shards”, most commonly using a distinct Session per “shard”, should always be considered before resorting to this more complex and less-production-tested system.

API Documentation

Object Name – Description

  • set_shard_id – a loader option for statements to apply a specific shard id to the primary query as well as for additional relationship and column loaders.

  • ShardedQuery – Query class used with ShardedSession.

  • ShardedSession

class sqlalchemy.ext.horizontal_shard.ShardedSession
method sqlalchemy.ext.horizontal_shard.ShardedSession.__init__(shard_chooser: ShardChooser, identity_chooser: Optional[IdentityChooser] = None, execute_chooser: Optional[Callable[[ORMExecuteState], Iterable[Any]]] = None, shards: Optional[Dict[str, Any]] = None, query_cls: Type[Query[_T]] = <class 'sqlalchemy.ext.horizontal_shard.ShardedQuery'>, *, id_chooser: Optional[Callable[[Query[_T], Iterable[_T]], Iterable[Any]]] = None, query_chooser: Optional[Callable[[Executable], Iterable[Any]]] = None, **kwargs: Any) -> None

Construct a ShardedSession.

Parameters:
  • shard_chooser – A callable which, passed a Mapper, a mapped instance, and possibly a SQL clause, returns a shard ID. This ID may be based on the attributes present within the object, or on some round-robin scheme. If the scheme is based on a selection, it should set whatever state is needed on the instance so that the instance is recognized in the future as participating in that shard.

  • identity_chooser

    A callable, passed a Mapper and primary key argument, which should return a list of shard ids where this primary key might reside.

    Changed in version 2.0: The identity_chooser parameter supersedes the id_chooser parameter.

  • execute_chooser

    For a given ORMExecuteState, returns the list of shard ids where the query should be issued. Results from all of the returned shards are combined into a single listing.

    Changed in version 1.4: The execute_chooser parameter supersedes the query_chooser parameter.

  • shards – A dictionary of string shard names to Engine objects.

method sqlalchemy.ext.horizontal_shard.ShardedSession.connection_callable(mapper: Mapper[_T] | None = None, instance: Any | None = None, shard_id: ShardIdentifier | None = None, **kw: Any) -> Connection

Provide a Connection to use in the unit of work flush process.

method sqlalchemy.ext.horizontal_shard.ShardedSession.get_bind(mapper: _EntityBindKey[_O] | None = None, *, shard_id: ShardIdentifier | None = None, instance: Any | None = None, clause: ClauseElement | None = None, **kw: Any) -> _SessionBind

Return a “bind” to which this Session is bound.

The “bind” is usually an instance of Engine, except in the case where the Session has been explicitly bound directly to a Connection.

For a multiply-bound or unbound Session, the mapper or clause arguments are used to determine the appropriate bind to return.

Note that the “mapper” argument is usually present when Session.get_bind() is called via an ORM operation such as a Session.query(), each individual INSERT/UPDATE/DELETE operation within a Session.flush() call, etc.

The order of resolution is:

  1. if mapper given and Session.binds is present, locate a bind based first on the mapper in use, then on the mapped class in use, then on any base classes that are present in the __mro__ of the mapped class, from more specific superclasses to more general.

  2. if clause given and Session.binds is present, locate a bind based on Table objects found in the given clause present in Session.binds.

  3. if Session.bind is present, return that.

  4. if clause given, attempt to return a bind linked to the MetaData ultimately associated with the clause.

  5. if mapper given, attempt to return a bind linked to the MetaData ultimately associated with the Table or other selectable to which the mapper is mapped.

  6. if no bind can be found, UnboundExecutionError is raised.

Note that the Session.get_bind() method can be overridden on a user-defined subclass of Session to provide any kind of bind resolution scheme. See the example at Custom Vertical Partitioning.

Parameters:
  • mapper – Optional mapped class or corresponding Mapper instance. The bind can be derived from a Mapper first by consulting the “binds” map associated with this Session, and secondly by consulting the MetaData associated with the Table to which the Mapper is mapped for a bind.

  • clause – A ClauseElement (i.e. select(), text(), etc.). If the mapper argument is not present or could not produce a bind, the given expression construct will be searched for a bound element, typically a Table associated with bound MetaData.

class sqlalchemy.ext.horizontal_shard.set_shard_id

a loader option for statements to apply a specific shard id to the primary query as well as for additional relationship and column loaders.

The set_shard_id option may be applied using the Executable.options() method of any executable statement:

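from sqlalchemy import select
from sqlalchemy.ext.horizontal_shard import set_shard_id

# MyObject is assumed to be a mapped class defined elsewhere
stmt = (
    select(MyObject)
    .where(MyObject.name == "some name")
    .options(set_shard_id("shard1"))
)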

Above, the statement, when invoked, is limited to the “shard1” shard identifier for the primary query as well as for all relationship and column loading strategies, including eager loaders such as selectinload(), deferred column loaders like defer(), and the lazy relationship loader lazyload().

In this way, the set_shard_id option has much wider scope than using the “shard_id” argument within the Session.execute.bind_arguments dictionary.

New in version 2.0.0.

Class signature

class sqlalchemy.ext.horizontal_shard.set_shard_id (sqlalchemy.orm.ORMOption)

method sqlalchemy.ext.horizontal_shard.set_shard_id.__init__(shard_id: str, propagate_to_loaders: bool = True)

Construct a set_shard_id option.

Parameters:
  • shard_id – shard identifier

  • propagate_to_loaders – if left at its default of True, the shard option will take effect for lazy loaders such as lazyload() and defer(); if False, the option will not be propagated to loaded objects. Note that defer() always limits to the shard_id of the parent row in any case, so the parameter only has a net effect on the behavior of the lazyload() strategy.

attribute sqlalchemy.ext.horizontal_shard.set_shard_id.propagate_to_loaders

if True, indicate this option should be carried along to “secondary” SELECT statements that occur for relationship lazy loaders as well as attribute load / refresh operations.

class sqlalchemy.ext.horizontal_shard.ShardedQuery

Query class used with ShardedSession.

Legacy Feature

The ShardedQuery is a subclass of the legacy Query class. The ShardedSession now supports 2.0 style execution via the ShardedSession.execute() method.

Members

set_shard()

method sqlalchemy.ext.horizontal_shard.ShardedQuery.set_shard(shard_id: str) -> Self

Return a new query, limited to a single shard ID.

All subsequent operations with the returned query will be against the single shard regardless of other state.

For 2.0 style execution, the shard_id can instead be passed in the bind_arguments dictionary of Session.execute():

results = session.execute(
    stmt,
    bind_arguments={"shard_id": "my_shard"}
)