'Configuring PostgreSQL schema for default Django DB working with PgBouncer connection pool

I need to set the default DB schema for a Django project, so that all tables of all apps (including 3rd party apps) store their tables in the configured PostgreSQL schema.

One solution is to use a DB connection option, like this:

# in Django settings module add "OPTIONS" to default DB, specifying "search_path" for the connection

DB_DEFAULT_SCHEMA = os.environ.get('DB_DEFAULT_SCHEMA', 'public')  # use postgresql default "public" if not overwritten

DATABASES['default']['OPTIONS'] = {'options': f'-c search_path={DB_DEFAULT_SCHEMA}'}

This works for a direct connection to PostgreSQL, but not when connecting to PgBouncer (to use connection pools), failing with OperatonalError: unsupported startup parameter: options". It appears PgBouncer doesn't recognize options as a startup parameter (at this point of time).

Another solution to set the schema without using startup parameters, is to prefix all tables with the schema . To make sure this works for built-in and 3rd party apps too (not just my own app), a solution is to inject the schema name to db_table attribute of all models when they're being loaded by Django, using class_prepared signal, and an AppConfig. This approach is close to what projects like django-db-prefix use, only need to make sure the schema name is well quoted:

from django.conf import settings
from django.db.models.signals import class_prepared


def set_model_schema(sender, **kwargs):
    schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
    db_table = sender._meta.db_table
    if schema and not db_table[1:].startswith(schema):
        sender._meta.db_table = '"{}"."{}"'.format(schema, db_table)


class_prepared.connect(set_model_schema)

This works for connection pools too, however it doesn't play well with Django migrations. Using this solution, python manage.py migrate fails to work, because migrate command ensures django_migrations table exists, by introspecting existing tables, which the db_table prefix of models has no effect on.

I'm curious what a proper way could be to solve this problem.



Solution 1:[1]

This is the solution I came up with. Mixing both solutions above, using 2 separate DB connections.

  • Using the connection startup parameters (which works well for apps and migrations), but only to run migrations, not the app server. This means Django migrations has to connect to PostgreSQL directly, and not via PgBouncer, which for my case is fine.

  • Prefixing DB tables with the schema using a class_prepared signal handler, but excluding django_migrations table. The handler is registered with a Django app (say django_dbschema) using the AppConfig.__init__() method, which is the 1st stage of project initialization process, so all other apps are affected. An environment variable is used to flag bypassing this registration, which is set when running migrations. This way when the app runs to serve requests, it can connect to PgBouncer just as good, but Django migrations is unaware of schema prefixes.

Two environment variables (used by Django settings module) will be used to configure this behavior: DB_DEFAULT_SCHEMA is the name of the schema, and DB_SCHEMA_NO_PREFIX flags disabling registration for signal handler. It'll be like this:

The django_dbschema app structure (in the root of the project)

django_dbschema/
??? apps.py
??? __init__.py

where apps.py defines the signal handler and AppConfig to register it:

from django.apps import AppConfig
from django.conf import settings
from django.db.models.signals import class_prepared


def set_model_schema(sender, **kwargs):
    """Prefix the DB table name for the model with the configured DB schema.
    Excluding Django migrations table itself (django_migrations).
    Because django migartion command directly introspects tables in the DB, looking
    for eixsting "django_migrations" table, prefixing the table with schemas won't work
    so Django migrations thinks, the table doesn't exist, and tries to create it.

    So django migrations can/should not use this app to target a schema.
    """

    schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
    if schema == "":
        return
    db_table = sender._meta.db_table
    if db_table != "django_migrations" and not db_table[1:].startswith(schema):
        # double quotes are important to target a schema
        sender._meta.db_table = '"{}"."{}"'.format(schema, db_table)


class DjangoDbschemaConfig(AppConfig):
    """Django app to register a signal handler for model class preparation
    signal, to prefix all models' DB tables with the schema name from "DB_DEFAULT_SCHEMA"
    in settings.

    This is better than specifying "search_path" as "options" of the connection,
    because this approach works both for direct connections AND connection pools (where
    the "options" connection parameter is not accepted by PGBouncer)

    NOTE: This app defines __init__(), to register class_prepared signal.
    Make sure no models are imported in __init__. see
    https://docs.djangoproject.com/en/3.2/ref/signals/#class-prepared

    NOTE: The signal handler for this app excludes django migrations,
    So django migrations can/should not use this app to target a schema.
    This means with this enabled, when starting the app server, Django thinks
    migrations are missing and always warns with:
    You have ... unapplied migration(s). Your project may not work properly until you apply the migrations for ...
    To actually run migrations (python manage.py migrate) use another way to set the schema
    """

    name = "django_dbschema"
    verbose_name = "Configure DB schema for Django models"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        schema = getattr(settings, "DB_DEFAULT_SCHEMA", "")
        if schema and not getattr(
            settings, "DB_SCHEMA_NO_PREFIX", False
        ):  # don't register signal handler if no schema or disabled
            class_prepared.connect(set_model_schema)

This app is registered to the list of INSTALLED_APPS (I had to use full class path to the app config, otherwise Django wouldn't load my AppConfig definition).

Also the Django settings module (say settings.py), would define 1 extra DB connection (a copy of default) but with connection options:

# ...

INSTALLED_APPS = [
    'django_dbschema.apps.DjangoDbschemaConfig',  # has to be full path to class otherwise django won't load local app
    'django.contrib.admin',
    # ...
]

# 2 new settings to control the schema and prefix
DB_DEFAULT_SCHEMA = os.environ.get('DB_DEFAULT_SCHEMA', '')
DB_SCHEMA_NO_PREFIX = os.environ.get('DB_SCHEMA_NO_PREFIX', False)  # if should force disable prefixing DB tables with schema

DATABASES = {
    'default': {  # default DB connection definition, used by app not migrations, to work with PgBouncer no connection options
        # ...
    }
}


# default_direct: the default DB connection, but a direct connection NOT A CONNECTION POOL, so can have connection options

DATABASES['default_direct'] = deepcopy(DATABASES['default'])

# explicit test db info, prevents django test from confusing multi DB aliases to the same actual DB with circular dependencies
DATABASES['default_direct']['TEST'] = {'DEPENDENCIES': [], 'NAME': 'test_default_direct'}

# allow overriding connection parameters if necessary
if os.environ.get('DIRECT_DB_HOST'):
    DATABASES['default_direct']['HOST'] = os.environ.get('DIRECT_DB_HOST')
if os.environ.get('DIRECT_DB_PORT'):
    DATABASES['default_direct']['PORT'] = os.environ.get('DIRECT_DB_PORT')
if os.environ.get('DIRECT_DB_NAME'):
    DATABASES['default_direct']['NAME'] = os.environ.get('DIRECT_DB_NAME')

if DB_DEFAULT_SCHEMA:
    DATABASES['default_direct']['OPTIONS'] = {'options': f'-c search_path={DB_DEFAULT_SCHEMA}'}

# ...

Now setting the environment variable DB_DEFAULT_SCHEMA=myschema configures the schema. To run migrations, we'll set the proper environment variable, and explicitly use the direct DB connection:

env DB_SCHEMA_NO_PREFIX=True python manage.py migrate --database default_direct

And when the app server runs, it'll use the default DB connection, which works with PgBouncer.

The down side is that since Django migrations is excluded from the signal handler, it'll think no migrations were run, so it always warns about this:

"You have ... unapplied migration(s). Your project may not work properly until you apply the migrations for ..."

Which is not true if we make sure we actually run migrations always before running the app server.

A side note about this solution is that, now the Django project has multiple DB connection settings (if it didn't have before). So for example DB migrations should have been written to work with an explicit connection, and not relying on default connection. For example if RunPython is used in a migration, it should pass the connection (schema_editor.connection.alias) to the object manager when querying. For example:

my_model.save(using=schema_editor.connection.alias)
# or
my_model.objects.using(schema_editor.connection.alias).all()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1