Backend docker image does not wait until db becomes available

I am trying to bring up my containers with docker-compose up: one for the backend and another for the database (PostGIS). If I run docker-compose up db, I see db_1 | 2021-11-23 10:36:02.123 UTC [1] LOG: database system is ready to accept connections, so the database itself works.

But if I run docker-compose up for the whole project, I get:

django.db.utils.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.23.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?

As far as I know, this means that my backend container does not wait until the db becomes available, and so it throws an error. If this idea is correct (is it?), one solution could be:

  • to add some code that forces the backend container to wait for the db, as described here: Docker-compose up do not start backend after database. I tried to implement the solutions that use a while loop (see the commented lines in docker-compose.yaml), but in my case they don't work and, to be honest, I do not quite understand the "anatomy" of these commands.
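
The wait scripts mentioned above generally follow one pattern: try to open a TCP connection to the database host and port, sleep briefly if it is refused, and give up after a timeout. A minimal Python sketch of that pattern is shown below. The function name wait_for_port and its parameters are illustrative assumptions, not part of wait-for-it.sh or of the project in question, and an open port is only an approximation of "the database is ready".

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 1.0) -> bool:
    """Return True once host:port accepts TCP connections, False on timeout.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, name not resolvable yet, or connect timed out.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In a setup like the one here, a startup script could call wait_for_port("db", 5432) and exit with a non-zero status when it returns False, so that the server command only runs after the port has opened.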

I have two subquestions now:

  1. Do I understand my problem correctly?
  2. How to solve it?

Many thanks in advance to anybody who tries to help me!

My files are below:

docker-compose.yaml

version: "3.9"

services:
  db:
    image: postgis/postgis
    volumes:
      - ./data/db:/var/lib/postgresql/data
    environment:
      - POSTGRES_DB=postgis
      - POSTGRES_USER=postgis
      - POSTGRES_PASSWORD=postgis
    ports:
      - 5432:5432
      #postgres: 5432

  web:
    build: .

    #command: /wait-for-it.sh db:5432
    #something like command: ["./wait-for-it.sh", "db:5432", "--", "./start.sh"]
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - ./:/usr/src/[projectname-backend]/
    ports:
      - "8000:8000"
    env_file:
      - ./.env.dev
    depends_on:
      - db

volumes:
  db:

Dockerfile

FROM python:3.8.3-alpine

WORKDIR /usr/src/[projectname-backend]

RUN apk update && apk upgrade \
  && apk add postgresql-dev \
    gcc \
    python3-dev \
    musl-dev \
    libffi-dev \
  && apk add --repository http://dl-cdn.alpinelinux.org/alpine/edge/testing \
    gdal-dev \
    geos-dev \
    proj-dev \
  && pip install pipenv

ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

RUN pip install --upgrade pip
COPY ./requirements.txt .
RUN pip install -r requirements.txt

COPY . .

Logs

% docker-compose down && docker-compose build && docker-compose up
WARNING: Found orphan containers (lista-backend_nginx_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Removing lista-backend_web_1 ... done
Removing lista-backend_db_1  ... done
Removing network lista-backend_default
db uses an image, skipping
Building web
[+] Building 7.8s (18/18) FINISHED                                                                                         
 => [internal] load build definition from Dockerfile                                                                  0.0s
 => => transferring dockerfile: 37B                                                                                   0.0s
 => [internal] load .dockerignore                                                                                     0.0s
 => => transferring context: 2B                                                                                       0.0s
 => resolve image config for docker.io/docker/dockerfile:1                                                            2.5s
 => [auth] docker/dockerfile:pull token for registry-1.docker.io                                                      0.0s
 => CACHED docker-image://docker.io/docker/dockerfile:1@sha256:42399d4635eddd7a9b8a24be879d2f9a930d0ed040a61324cfdf5  0.0s
 => [internal] load .dockerignore                                                                                     0.0s
 => [internal] load build definition from Dockerfile                                                                  0.0s
 => [internal] load metadata for docker.io/library/python:3.8.3-alpine                                                1.3s
 => [auth] library/python:pull token for registry-1.docker.io                                                         0.0s
 => [1/7] FROM docker.io/library/python:3.8.3-alpine@sha256:c5623df482648cacece4f9652a0ae04b51576c93773ccd43ad459e2a  0.0s
 => [internal] load build context                                                                                     1.0s
 => => transferring context: 18.30MB                                                                                  1.0s
 => CACHED [2/7] WORKDIR /usr/src/LISTA_backend                                                                       0.0s
 => CACHED [3/7] RUN apk update && apk upgrade   && apk add postgresql-dev     gcc     python3-dev     musl-dev       0.0s
 => CACHED [4/7] RUN pip install --upgrade pip                                                                        0.0s
 => CACHED [5/7] COPY ./requirements.txt .                                                                            0.0s
 => CACHED [6/7] RUN pip install -r requirements.txt                                                                  0.0s
 => [7/7] COPY . .                                                                                                    1.6s
 => exporting to image                                                                                                0.9s
 => => exporting layers                                                                                               0.9s
 => => writing image sha256:8a5e13ac74a6184b2be21da4269554fc98c677c9a0ee4c11a8989e9027903fec                          0.0s
 => => naming to docker.io/library/lista-backend_web                                                                  0.0s
Creating network "lista-backend_default" with the default driver
WARNING: Found orphan containers (lista-backend_nginx_1) for this project. If you removed or renamed this service in your compose file, you can run this command with the --remove-orphans flag to clean it up.
Creating lista-backend_db_1 ... done
Creating lista-backend_web_1 ... done
Attaching to lista-backend_db_1, lista-backend_web_1
web_1  | Watching for file changes with StatReloader
web_1  | Performing system checks...
web_1  | 
web_1  | System check identified some issues:
web_1  | 
web_1  | WARNINGS:
web_1  | api.CustomUser: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
web_1  |        HINT: Configure the DEFAULT_AUTO_FIELD setting or the ApiConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
web_1  | listings.Realty: (models.W042) Auto-created primary key used when not defining a primary key type, by default 'django.db.models.AutoField'.
web_1  |        HINT: Configure the DEFAULT_AUTO_FIELD setting or the ListingsConfig.default_auto_field attribute to point to a subclass of AutoField, e.g. 'django.db.models.BigAutoField'.
web_1  | 
web_1  | System check identified 2 issues (0 silenced).
web_1  | Exception in thread django-main-thread:
web_1  | Traceback (most recent call last):
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
web_1  |     self.connection = self.get_new_connection(conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
web_1  |     connection = Database.connect(**conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
web_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
web_1  | psycopg2.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.27.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?
web_1  | 
web_1  | 
web_1  | The above exception was the direct cause of the following exception:
web_1  | 
web_1  | Traceback (most recent call last):
web_1  |   File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
web_1  |     self.run()
web_1  |   File "/usr/local/lib/python3.8/threading.py", line 870, in run
web_1  |     self._target(*self._args, **self._kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/autoreload.py", line 64, in wrapper
web_1  |     fn(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/runserver.py", line 121, in inner_run
web_1  |     self.check_migrations()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 486, in check_migrations
web_1  |     executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 18, in __init__
web_1  |     self.loader = MigrationLoader(self.connection)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 53, in __init__
web_1  |     self.build_graph()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 220, in build_graph
web_1  |     self.applied_migrations = recorder.applied_migrations()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 77, in applied_migrations
web_1  |     if self.has_table():
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 55, in has_table
web_1  |     with self.connection.cursor() as cursor:
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 259, in cursor
web_1  |     return self._cursor()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 235, in _cursor
web_1  |     self.ensure_connection()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 90, in __exit__
web_1  |     raise dj_exc_value.with_traceback(traceback) from exc_value
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 219, in ensure_connection
web_1  |     self.connect()
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 200, in connect
web_1  |     self.connection = self.get_new_connection(conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/utils/asyncio.py", line 33, in inner
web_1  |     return func(*args, **kwargs)
web_1  |   File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 187, in get_new_connection
web_1  |     connection = Database.connect(**conn_params)
web_1  |   File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
web_1  |     conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
web_1  | django.db.utils.OperationalError: could not connect to server: Connection refused
web_1  |        Is the server running on host "db" (172.27.0.2) and accepting
web_1  |        TCP/IP connections on port 5432?
web_1  | 
db_1   | 
db_1   | PostgreSQL Database directory appears to contain a database; Skipping initialization
db_1   | 
db_1   | 2021-11-24 23:19:28.324 UTC [1] LOG:  starting PostgreSQL 13.3 (Debian 13.3-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
db_1   | 2021-11-24 23:19:28.328 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
db_1   | 2021-11-24 23:19:28.329 UTC [1] LOG:  listening on IPv6 address "::", port 5432
db_1   | 2021-11-24 23:19:28.336 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
db_1   | 2021-11-24 23:19:28.364 UTC [66] LOG:  database system was shut down at 2021-11-24 13:55:35 UTC
db_1   | 2021-11-24 23:19:28.400 UTC [1] LOG:  database system is ready to accept connections



Solution 1:[1]

My problem was that, although I had added a healthcheck to my db container, I had forgotten to add a matching condition to depends_on in the web service, like this:

    depends_on:
      db:
        condition: service_healthy

This was pointed out in a linked discussion. Now it works.

Solution 2:[2]

I faced the same issue and resolved it with a custom management command.

Add this management command to your app:

<django_project>/<app_name>/management/commands/wait_for_db.py

"""
Django management command wait_for_db
"""
import sys
from time import sleep, time

from django.core.management.base import BaseCommand, CommandError
from django.db import DEFAULT_DB_ALIAS, connections
from django.db.utils import OperationalError


def wait_for_database(**opts):
    """
    The main loop waiting for the database connection to come up.
    """
    wait_for_db_seconds = opts['wait_when_down']
    alive_check_delay = opts['wait_when_alive']
    stable_for_seconds = opts['stable']
    timeout_seconds = opts['timeout']
    db_alias = opts['database']

    conn_alive_start = None
    connection = connections[db_alias]
    start = time()

    while True:
        # loop until we have a database connection or we run into a timeout
        while True:
            try:
                # run a trivial query; the context manager closes the cursor
                with connection.cursor() as cursor:
                    cursor.execute('SELECT 1')
                if not conn_alive_start:
                    conn_alive_start = time()
                break
            except OperationalError as err:
                conn_alive_start = None

                elapsed_time = int(time() - start)
                if elapsed_time >= timeout_seconds:
                    raise TimeoutError(
                        'Could not establish database connection.'
                    ) from err

                err_message = str(err).strip()
                print(f'Waiting for database (cause: {err_message}) ... '
                      f'{elapsed_time}s',
                      file=sys.stderr, flush=True)
                sleep(wait_for_db_seconds)

        uptime = int(time() - conn_alive_start)
        print(f'Connection alive for > {uptime}s', flush=True)

        if uptime >= stable_for_seconds:
            break

        sleep(alive_check_delay)


class Command(BaseCommand):
    """
    A readiness probe you can use for Kubernetes.
    If the database is ready, i.e. willing to accept connections
    and handling requests, then this call will exit successfully. Otherwise
    the command exits with an error status after reaching a timeout.
    """
    help = 'Probes for database availability'
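    # An empty list skips system checks; the boolean form was deprecated
    # in Django 3.2 and removed in Django 4.1.
    requires_system_checks = []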

    def add_arguments(self, parser):
        parser.add_argument('--timeout', '-t', type=int, default=180,
                            metavar='SECONDS', action='store',
                            help='how long to wait for the database before '
                                 'timing out (seconds), default: 180')
        parser.add_argument('--stable', '-s', type=int, default=5,
                            metavar='SECONDS', action='store',
                            help='how long to observe whether connection '
                                 'is stable (seconds), default: 5')
        parser.add_argument('--wait-when-down', '-d', type=int, default=2,
                            metavar='SECONDS', action='store',
                            help='delay between checks when database is '
                                 'down (seconds), default: 2')
        parser.add_argument('--wait-when-alive', '-a', type=int, default=1,
                            metavar='SECONDS', action='store',
                            help='delay between checks when database is '
                                 'up (seconds), default: 1')
        parser.add_argument('--database', default=DEFAULT_DB_ALIAS,
                            action='store', dest='database',
                            help='which database of `settings.DATABASES` '
                                 'to wait for. Defaults to the "default" '
                                 'database.')

    def handle(self, *args, **options):
        """
        Wait for a database connection to come up. Exit with error
        status when a timeout threshold is surpassed.
        """
        try:
            wait_for_database(**options)
        except TimeoutError as err:
            raise CommandError(err) from err

In your docker-compose.yml, run wait_for_db before migrate. Chaining the commands with && stops the chain if the wait times out:

command: bash -c "python manage.py wait_for_db && python manage.py migrate"

Sample docker-compose.yml:

version: "3.8"
   
services:
  db:
    image: postgres

  django:
    build: django
    command: bash -c "python manage.py wait_for_db && python manage.py migrate && daphne -b 0.0.0.0 -p 8001 demo.asgi:application"
    volumes:
      - ./django:/workdir
    expose:
      - 29000 # UWSGI application
      - 8001  # ASGI application
    depends_on:
      - db
    stdin_open: true
    tty: true

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   halfer
Solution 2   Salman Khuwaja