'SQLAlchemy secondary relationship not using JOIN

I created a Many to Many relationship on SQLAlchemy, but for some reason when I access it it's not emitting a JOIN clause, but rather selecting from a cross product of the three tables and filtering with WHERE clauses. I can't understand why it's doing this, and how to fix it.

Here's the python code for the association table and the primary model. The secondary model is overly complex and I think that the fact that it has sku and account_id columns which have indices (and together are part of the PK) is enough info to go on

product_info_to_product_profile_association_table = Table(
    "tpl_products_to_product_profiles_r1",
    metadata,
    sa.Column(
        "sku",
        sa.String(128),
# This FK is failing at the MySQL level
        # sa.ForeignKey(ProductInfo.sku),
        nullable=False,
    ),
    sa.Column(
        "account_id", mysql.INTEGER(11), sa.ForeignKey(ProductInfo.account_id), nullable=False
    ),
    sa.Column(
        "product_profile_id",
        mysql.INTEGER(11, unsigned=True),
        sa.ForeignKey("tpl_product_profiles.id"),
        nullable=False,
    ),
    sa.UniqueConstraint("sku", "account_id", "product_profile_id", name="product_unique"),
)
​
​
class ProductProfile(BillingBaseModel):
​
    __tablename__ = "tpl_product_profiles"
​
    id = sa.Column(mysql.INTEGER(11, unsigned=True), primary_key=True)
    name = sa.Column(sa.String(90), nullable=False)
    account_id = sa.Column(mysql.INTEGER(11), sa.ForeignKey(Account.id), nullable=False)
​
    product_infos = orm.relationship(
        ProductInfo,
        secondary=product_info_to_product_profile_association_table,
        secondaryjoin=sa.and_(
            ProductInfo.sku
            == orm.foreign(product_info_to_product_profile_association_table.c.sku),
            ProductInfo.account_id
            == orm.foreign(product_info_to_product_profile_association_table.c.account_id),
        ),
        secondaryjoin=(
            id
            == orm.foreign(product_info_to_product_profile_association_table.c.product_profile_id)
        ),
        backref=orm.backref("product_profile", uselist=False),
    )

Here's the generated SQL when I try to access the relationship through an instance

>>> pp.product_infos

[SELECT product_info.sku..........
FROM product_info, tpl_products_to_product_profiles_r1 
WHERE tpl_products_to_product_profiles_r1.product_profile_id = %s AND product_info.sku = tpl_products_to_product_profiles_r1.sku AND product_info.account_id = tpl_products_to_product_profiles_r1.account_id]
[parameters: (6L,)]


Solution 1:[1]

This might not be the best answer, but I've dug into some of sqlalchemy's inner working and it is explicitly creating the cross join when doing lazy loading

        if self.parent_property.secondary is not None:
            q.add_criteria(
                lambda q: q.select_from(
                    self.mapper, self.parent_property.secondary
                )
            )

The best solution I have found is to use eager loading of some kind.

Whether that's

  • lazy='selectin' inside of the relationship definition
  • adding selectinload(ProductProfile.product_infos) when querying for a ProductProfile object and know you will need to access that attribute
  • or doing the same as above but explicitly defining the joins in the query and using contains_eager(ProductProfile.product_infos)

For the first 2 cases sqlalchemy suggests you use selectin or subquery eager loading when handling x-to-many tables (especially when the many table is large): https://docs.sqlalchemy.org/en/14/orm/loading_relationships.html#what-kind-of-loading-to-use

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 JP Hanna