'SQLAlchemy secondary relationship not using JOIN
I created a Many to Many relationship on SQLAlchemy, but for some reason when I access it it's not emitting a JOIN clause, but rather selecting from a cross product of the three tables and filtering with WHERE
clauses. I can't understand why it's doing this, and how to fix it.
Here's the python code for the association table and the primary model. The secondary model is overly complex and I think that the fact that it has sku
and account_id
columns which have indices (and together are part of the PK) is enough info to go on
product_info_to_product_profile_association_table = Table(
"tpl_products_to_product_profiles_r1",
metadata,
sa.Column(
"sku",
sa.String(128),
# This FK is failing at the MySQL level
# sa.ForeignKey(ProductInfo.sku),
nullable=False,
),
sa.Column(
"account_id", mysql.INTEGER(11), sa.ForeignKey(ProductInfo.account_id), nullable=False
),
sa.Column(
"product_profile_id",
mysql.INTEGER(11, unsigned=True),
sa.ForeignKey("tpl_product_profiles.id"),
nullable=False,
),
sa.UniqueConstraint("sku", "account_id", "product_profile_id", name="product_unique"),
)
class ProductProfile(BillingBaseModel):
__tablename__ = "tpl_product_profiles"
id = sa.Column(mysql.INTEGER(11, unsigned=True), primary_key=True)
name = sa.Column(sa.String(90), nullable=False)
account_id = sa.Column(mysql.INTEGER(11), sa.ForeignKey(Account.id), nullable=False)
product_infos = orm.relationship(
ProductInfo,
secondary=product_info_to_product_profile_association_table,
secondaryjoin=sa.and_(
ProductInfo.sku
== orm.foreign(product_info_to_product_profile_association_table.c.sku),
ProductInfo.account_id
== orm.foreign(product_info_to_product_profile_association_table.c.account_id),
),
secondaryjoin=(
id
== orm.foreign(product_info_to_product_profile_association_table.c.product_profile_id)
),
backref=orm.backref("product_profile", uselist=False),
)
Here's the generated SQL when I try to access the relationship through an instance
>>> pp.product_infos
[SELECT product_info.sku..........
FROM product_info, tpl_products_to_product_profiles_r1
WHERE tpl_products_to_product_profiles_r1.product_profile_id = %s AND product_info.sku = tpl_products_to_product_profiles_r1.sku AND product_info.account_id = tpl_products_to_product_profiles_r1.account_id]
[parameters: (6L,)]
Solution 1:[1]
This might not be the best answer, but I've dug into some of sqlalchemy's inner working and it is explicitly creating the cross join when doing lazy loading
if self.parent_property.secondary is not None:
q.add_criteria(
lambda q: q.select_from(
self.mapper, self.parent_property.secondary
)
)
The best solution I have found is to use eager loading of some kind.
Whether that's
lazy='selectin'
inside of the relationship definition- adding
selectinload(ProductProfile.product_infos)
when querying for a ProductProfile object and know you will need to access that attribute - or doing the same as above but explicitly defining the joins in the query and using
contains_eager(ProductProfile.product_infos)
For the first 2 cases sqlalchemy suggests you use selectin
or subquery
eager loading when handling x-to-many
tables (especially when the many table is large): https://docs.sqlalchemy.org/en/14/orm/loading_relationships.html#what-kind-of-loading-to-use
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | JP Hanna |