Sqlalchemy
SQLAlchemy ORM and Core patterns for models, sessions, queries, and async database access
You are an expert in SQLAlchemy for database modeling, querying, and session management in Python applications. ## Key Points - Always use the 2.0-style `select()` API. The legacy `session.query()` API is in maintenance mode. - Use `Mapped` type annotations for all columns to get full IDE support and type checking. - Set `expire_on_commit=False` on sessions used in web frameworks to avoid lazy-load surprises after commit. - Use `joinedload` for single-valued relationships and `selectinload` for collections to avoid N+1 queries. - Run Alembic migrations through CI/CD and never use `Base.metadata.create_all()` in production. - Use `pool_pre_ping=True` to gracefully handle stale database connections. - Use `session.execute(insert(...), [...])` for bulk inserts rather than looping `session.add()`. - Accessing lazy-loaded relationships after the session is closed raises `DetachedInstanceError`. Always eager-load or access within the session scope. - Forgetting `await` on async session methods. Every `session.execute()`, `session.commit()`, and `session.refresh()` must be awaited. - Using `expire_on_commit=True` (the default) with async code, causing implicit I/O when accessing attributes after commit. - Defining relationships without `back_populates`, leading to inconsistent in-memory state when both sides are modified. - Not setting `compare_type=True` in Alembic, causing column type changes to be silently ignored during autogenerate. ## Quick Example ```bash pip install "sqlalchemy[asyncio]" asyncpg alembic ``` ```bash alembic init alembic alembic revision --autogenerate -m "add articles table" alembic upgrade head alembic downgrade -1 ```
skilldb get python-web-skills/SqlalchemyFull skill: 320 linesSQLAlchemy — Python Web Development
You are an expert in SQLAlchemy for database modeling, querying, and session management in Python applications.
Core Philosophy
Overview
SQLAlchemy provides both a high-level ORM and a lower-level Core expression language for working with relational databases. Version 2.0 introduced a unified query interface using select(), native async support, and improved type annotations. It works with PostgreSQL, MySQL, SQLite, and other databases through dialects.
Setup & Configuration
pip install "sqlalchemy[asyncio]" asyncpg alembic
Sync Engine
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, DeclarativeBase
engine = create_engine(
"postgresql+psycopg://user:pass@localhost/mydb",
pool_size=10,
max_overflow=20,
pool_pre_ping=True,
echo=False,
)
SessionLocal = sessionmaker(bind=engine, expire_on_commit=False)
Async Engine
from sqlalchemy.ext.asyncio import create_async_engine, async_sessionmaker
async_engine = create_async_engine(
"postgresql+asyncpg://user:pass@localhost/mydb",
pool_size=10,
max_overflow=20,
)
AsyncSessionLocal = async_sessionmaker(
bind=async_engine,
expire_on_commit=False,
)
Core Patterns
Declarative Models (2.0 Style)
from datetime import datetime, timezone
from sqlalchemy import String, Text, ForeignKey, Index
from sqlalchemy.orm import (
DeclarativeBase, Mapped, mapped_column, relationship,
)
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(primary_key=True)
username: Mapped[str] = mapped_column(String(80), unique=True)
email: Mapped[str] = mapped_column(String(120), unique=True)
articles: Mapped[list["Article"]] = relationship(
back_populates="author", cascade="all, delete-orphan"
)
def __repr__(self) -> str:
return f"<User {self.username}>"
class Article(Base):
__tablename__ = "articles"
__table_args__ = (
Index("ix_articles_author_published", "author_id", "published_at"),
)
id: Mapped[int] = mapped_column(primary_key=True)
title: Mapped[str] = mapped_column(String(200))
slug: Mapped[str] = mapped_column(String(200), unique=True)
body: Mapped[str] = mapped_column(Text)
author_id: Mapped[int] = mapped_column(ForeignKey("users.id"))
published_at: Mapped[datetime | None] = mapped_column(default=None)
created_at: Mapped[datetime] = mapped_column(
default=lambda: datetime.now(timezone.utc)
)
author: Mapped["User"] = relationship(back_populates="articles")
tags: Mapped[list["Tag"]] = relationship(
secondary="article_tags", back_populates="articles"
)
Session Management
# Sync context manager
from contextlib import contextmanager
@contextmanager
def get_session():
session = SessionLocal()
try:
yield session
session.commit()
except Exception:
session.rollback()
raise
finally:
session.close()
# FastAPI async dependency
async def get_db():
async with AsyncSessionLocal() as session:
try:
yield session
await session.commit()
except Exception:
await session.rollback()
raise
Querying (2.0 Style)
from sqlalchemy import select, func, and_, or_
from sqlalchemy.orm import selectinload, joinedload
# Basic select
stmt = select(Article).where(Article.published_at.isnot(None))
result = session.execute(stmt)
articles = result.scalars().all()
# Eager loading
stmt = (
select(Article)
.options(
joinedload(Article.author), # JOIN for single relation
selectinload(Article.tags), # separate IN query for collection
)
.where(Article.published_at.isnot(None))
.order_by(Article.published_at.desc())
.limit(20)
)
# Aggregation
stmt = (
select(User.username, func.count(Article.id).label("article_count"))
.join(Article, User.id == Article.author_id)
.group_by(User.username)
.having(func.count(Article.id) > 5)
.order_by(func.count(Article.id).desc())
)
# Subqueries
subq = (
select(func.count(Article.id))
.where(Article.author_id == User.id)
.correlate(User)
.scalar_subquery()
)
stmt = select(User, subq.label("article_count"))
# Async query
async with AsyncSessionLocal() as session:
result = await session.execute(stmt)
rows = result.all()
Bulk Operations
from sqlalchemy import insert, update, delete
# Bulk insert
session.execute(
insert(Article),
[
{"title": "First", "slug": "first", "body": "...", "author_id": 1},
{"title": "Second", "slug": "second", "body": "...", "author_id": 1},
],
)
session.commit()
# Bulk update
session.execute(
update(Article)
.where(Article.published_at.is_(None))
.values(published_at=datetime.now(timezone.utc))
)
# Bulk delete
session.execute(
delete(Article).where(Article.created_at < cutoff_date)
)
Migrations with Alembic
alembic init alembic
alembic revision --autogenerate -m "add articles table"
alembic upgrade head
alembic downgrade -1
# alembic/env.py (key section)
from app.models import Base
target_metadata = Base.metadata
def run_migrations_online():
connectable = engine_from_config(config.get_section("alembic"))
with connectable.connect() as connection:
context.configure(
connection=connection,
target_metadata=target_metadata,
compare_type=True,
)
with context.begin_transaction():
context.run_migrations()
Events and Hooks
from sqlalchemy import event
@event.listens_for(Article, "before_insert")
def generate_slug(mapper, connection, target):
if not target.slug:
target.slug = slugify(target.title)
@event.listens_for(engine, "before_cursor_execute")
def log_slow_queries(conn, cursor, statement, parameters, context, executemany):
conn.info.setdefault("query_start_time", []).append(time.perf_counter())
@event.listens_for(engine, "after_cursor_execute")
def check_slow_queries(conn, cursor, statement, parameters, context, executemany):
total = time.perf_counter() - conn.info["query_start_time"].pop()
if total > 0.5:
logger.warning("Slow query (%.3fs): %s", total, statement[:200])
Hybrid Properties
from sqlalchemy.ext.hybrid import hybrid_property
class Article(Base):
__tablename__ = "articles"
# ... columns ...
@hybrid_property
def is_published(self) -> bool:
return self.published_at is not None
@is_published.expression
def is_published(cls):
return cls.published_at.isnot(None)
# Works both in Python and in SQL
article.is_published # True/False in Python
select(Article).where(Article.is_published) # SQL WHERE clause
Best Practices
- Always use the 2.0-style
select()API. The legacysession.query()API is in maintenance mode. - Use
Mappedtype annotations for all columns to get full IDE support and type checking. - Set
expire_on_commit=Falseon sessions used in web frameworks to avoid lazy-load surprises after commit. - Use
joinedloadfor single-valued relationships andselectinloadfor collections to avoid N+1 queries. - Run Alembic migrations through CI/CD and never use
Base.metadata.create_all()in production. - Use
pool_pre_ping=Trueto gracefully handle stale database connections. - Use
session.execute(insert(...), [...])for bulk inserts rather than loopingsession.add().
Common Pitfalls
- Accessing lazy-loaded relationships after the session is closed raises
DetachedInstanceError. Always eager-load or access within the session scope. - Forgetting
awaiton async session methods. Everysession.execute(),session.commit(), andsession.refresh()must be awaited. - Using
expire_on_commit=True(the default) with async code, causing implicit I/O when accessing attributes after commit. - Defining relationships without
back_populates, leading to inconsistent in-memory state when both sides are modified. - Not setting
compare_type=Truein Alembic, causing column type changes to be silently ignored during autogenerate. - Misunderstanding
cascade="all, delete-orphan". This controls ORM-level cascades, not database-levelON DELETE. Set both for safety.
Anti-Patterns
Over-engineering for hypothetical scale. Building for millions of users when you have hundreds adds complexity without value. Solve today's problems first.
Ignoring the existing ecosystem. Reinventing functionality that mature libraries already provide well wastes time and introduces unnecessary risk.
Premature abstraction. Creating elaborate frameworks and utilities before you have enough concrete cases to know what the abstraction should look like produces the wrong abstraction.
Neglecting error handling at boundaries. Internal code can trust its inputs, but system boundaries (user input, APIs, file I/O) require defensive validation.
Skipping documentation for obvious code. What is obvious to you today will not be obvious to your colleague next month or to you next year.
Install this skill directly: skilldb add python-web-skills
Related Skills
Celery
Celery patterns for distributed task queues, scheduling, retries, and worker management
Django Admin
Django admin customization patterns for list views, forms, inlines, actions, and permissions
Django ORM
Django ORM patterns for models, querysets, migrations, and database optimization
Django REST Framework
Django REST Framework patterns for building, serializing, and securing RESTful APIs
Fastapi
FastAPI patterns for async APIs, dependency injection, Pydantic models, and OpenAPI integration
Flask
Flask application patterns for routing, blueprints, extensions, and application factories