In this post we will discuss some commonly used methods to make the design of our databases more readable and less error prone. For those that are new to Sqlalchemy is may be useful to read the previous post on the very basics of how things work.
Contents
- Mappings
These are essentially the type hints we create to indicate the Python type that the column is expecting
- Mapping Column Method
This is a method that actually does the mapping
- Annotations
A way to create reusable column definitions in our database design.
Defintion of Map/Mapping
We will use the term mapping quite often in this post, so it may be useful to clearly define what we mean by this. Essentially this is some transformation of an object , perhaps it may be useful to think of it as a 'function applied on an object' , although this definition isn't correct for the way we use mapping in this post, but it is a good place to start. Let's take a Python example to show what we mean using the map keyword. Below we give an example of 'mapping' a list of floats to integers.
input_ = [1.1, 1.05, 1.2, 1.3]
output = map(int, input_)
print(list(output))
'''
[1, 1, 1, 1]
'''
Why isn't this correct in context of databases? Well, if we have a list of floats, and we are mapping to some SQL datatype e.g. BIGINT , we have probably chosen the wrong type (on Python end), and FLOAT would have been a better choice. So quite generally, we want to map the Python type to an SQL type and we probably don't want to be transforming the data apart from its mapping from Python to SQL.
Below we give a better definition that will at first seem confusing, but once explained will make sense hopefully.
Say we want to save some dates in a database, so we want to apply a mapping from a datetime to a TIMESTAMP let's see what a mapping like that might look like.
from datetime import datetime
dates = [datetime(2024, 1, 1),
datetime(2023, 1, 1),
datetime(2022, 1, 1)]
map_to_timestamp = lambda date: date.timestamp()
mapped_timestamps = map(map_to_timestamp, dates)
print(list(mapped_timestamps))
'''
[1704067200.0, 1672531200.0, 1640995200.0]
'''
So this looks like more of a change than the float to integer example right? Well actually no, let's show why by doing the mapping and then reversing the mapping.
from datetime import datetime
dates = [datetime(2024, 1, 1),
datetime(2023, 1, 1),
datetime(2022, 1, 1)]
## DO THE MAPPING
map_to_timestamp = lambda date: date.timestamp()
mapped_timestamps = list(map(map_to_timestamp, dates))
## UNDO THE MAPPING
undo_mapping = lambda timestamp: datetime.fromtimestamp(timestamp)
original = list(map(undo_mapping, mapped_timestamps))
print(f'We started with {dates} ')
'''
We started with [datetime.datetime(2024, 1, 1, 0, 0),
datetime.datetime(2023, 1, 1, 0, 0),
datetime.datetime(2022, 1, 1, 0, 0)]
'''
print(f'We mapped to timestamp to get {mapped_timestamps}')
'''
We mapped to timestamp to get [1704067200.0,
1672531200.0,
1640995200.0]
'''
print(f'We unmapped back to {original}')
'''
We unmapped back to [datetime.datetime(2024, 1, 1, 0, 0),
datetime.datetime(2023, 1, 1, 0, 0),
datetime.datetime(2022, 1, 1, 0, 0)]
'''
So we see that the object hasn't actually changed, here despite the mapping, the objects are equivalent. This should make sense because all computer time is measured like this anyway. Whereas, if we recall the [1.1, 1.05, 1.2, 1.3] to [1,1,1] mapping, there is no way to get back the original.
For me, this encapsulates the very essence of what an ORM does! OK we getting dangerously close to rambling, so let's do some examples.
Let's rewrite our column definitions we created last post in the new style, this way is more expressive regarding how sqlalchemy creates these columns.
from sqlalchemy import Integer, String, TIMESTAMP,Boolean, create_engine, Column
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Mapped, mapped_column
from typing import Optional
from datetime import datetime
class Base(DeclarativeBase):
pass
class User(Base):
__tablename__ = "users"
id: Mapped[int] = mapped_column(Integer, primary_key=True)
fullname: Mapped[str] = mapped_column(String, nullable=False)
email: Mapped[str] = mapped_column(String, nullable=False, unique=True)
age: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
# here we pass in TIMESTAMP, this is similar to the mapping example we gave above
joined_date: Mapped[datetime] = mapped_column(TIMESTAMP, default=datetime.utcnow)
def __repr__(self):
return (
f"User(id={self.id}, "
f"fullname={self.fullname}, "
f"email={self.email}, "
f"age={self.age}, "
f"joined_date={self.joined_date})"
)
if __name__ == "__main__":
engine = create_engine("sqlite:///mappings.db", echo=True, future=True)
Base.metadata.create_all(engine)
Using a Mixin
A mixin is essentially a class that our table definitions will inherit from. Let's use a mixin for the timestamp column of our table. Let's make a simple example sticking to the timestamps and datetime we added from the last section.
class TimestampInit:
joined_date: Mapped[datetime] = mapped_column(TIMESTAMP, default=datetime.utcnow)
class User(Base, TimestampInit):
__tablename__ = "users"
id: Mapped[int] = mapped_column(Integer, primary_key=True)
fullname: Mapped[str] = mapped_column(String, nullable=False)
email: Mapped[str] = mapped_column(String, nullable=False, unique=True)
age: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
def __repr__(self):
return (
f"User(id={self.id}, "
f"fullname={self.fullname}, "
f"email={self.email}, "
f"age={self.age}, "
f"joined_date={self.joined_date})"
)
Now all our User objects when created have a timestamp associated with when they joined up. I guess it might be ok to just have had this in the main User class, but let's see another example. Say for instance, we only want users that have verified there email to be allowed to use some functionality.
from sqlalchemy import Boolean, Integer, String, TIMESTAMP, create_engine
from sqlalchemy.orm import DeclarativeBase, sessionmaker, Mapped, mapped_column
from typing import Optional
from datetime import datetime
class Base(DeclarativeBase):
pass
# class User(Base):
# __tablename__ = "users"
# id: Mapped[int] = mapped_column(Integer, primary_key=True)
# fullname: Mapped[str] = mapped_column(String, nullable=False)
# email: Mapped[str] = mapped_column(String, nullable=False, unique=True)
# age: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
# # here we pass in TIMESTAMP, this is similar to the mapping example we gave above
# joined_date: Mapped[datetime] = mapped_column(TIMESTAMP, default=datetime.utcnow)
# def __repr__(self):
# return (
# f"User(id={self.id}, "
# f"fullname={self.fullname}, "
# f"email={self.email}, "
# f"age={self.age}, "
# f"joined_date={self.joined_date})"
# )
class TimestampInit:
joined_date: Mapped[datetime] = mapped_column(TIMESTAMP, default=datetime.utcnow)
class ContactVerificationMixin:
email_verified: Mapped[bool] = mapped_column(Boolean, default=False, nullable=False)
def verify_email(self):
self.email_verified = True
class User(Base, TimestampInit, ContactVerificationMixin):
__tablename__ = "users"
id: Mapped[int] = mapped_column(Integer, primary_key=True)
fullname: Mapped[str] = mapped_column(String, nullable=False)
email: Mapped[str] = mapped_column(String, nullable=False, unique=True)
age: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
def __repr__(self):
return (
f"User(id={self.id}, "
f"fullname={self.fullname}, "
f"email={self.email}, "
f"age={self.age}, "
f"joined_date={self.joined_date})"
)
if __name__ == "__main__":
engine = create_engine("sqlite:///mappings.db", echo=True, future=True)
#Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
Session = sessionmaker(engine, future=True)
with Session() as session:
new_user = User(fullname="John Doe", email="johnd3oaaaae@example.com", age=30)
session.add(new_user)
session.commit()
I think this is a more practical example of the benefits of using a mixin, imagine the email verification mixin having many complex methods, that we want to separate from the User , this has many benefits, say for instance, we wanted to created a new store or reuse the verification class in some way, this makes it much easier to see what is going on.
with Session() as session:
user = session.all_users = session.query(User).all()[0]
print(user)
print(user.email_verified)
### verifiy the user email
user.verify_email()
session.commit()
print(user.email_verified)
'''
User(id=1, fullname=John Doe, email=johnd3oaaaae@example.com, age=30, joined_date=2024-01-16 15:40:03.360917)
False
True
'''
Hopefully this is a useful example of using mixin classes in your code!
Using Annoatations
Let's rewrite the code to use annotated columns, this is another way to make the code look a lot cleaner, and also to allow us to reuse the column definitions.
from typing_extensions import Annotated
int_primary_key = Annotated[int, mapped_column(Integer, primary_key=True)]
We could also do this for our fullname column
name_annotate = Annotated[str, mapped_column(String(length=100))]
Then we just slot these annotated columns in to our user table.
class User(Base, TimestampInit, ContactVerificationMixin):
__tablename__ = "users"
id: Mapped[int_primary_key]
fullname: Mapped[name_annotate]
email: Mapped[str] = mapped_column(String, nullable=False, unique=True)
age: Mapped[Optional[int]] = mapped_column(Integer, nullable=True)
def __repr__(self):
return (
f"User(id={self.id}, "
f"fullname={self.fullname}, "
f"email={self.email}, "
f"age={self.age}, "
f"joined_date={self.joined_date})"
)
Join us in the next post to discuss foreign keys and relationships.