Clean AND practical way to handle node_modules in a Dockerized Node.js dev environment?

I have recently been trying to dockerize a MERN stack application (both for development and, later, for production), and the interaction between Node.js (specifically its node_modules) and Docker puts me off a bit. Throughout this question, I will refer to the computer used for development as the "host machine".

TL;DR

Is there a reasonably practical way to dockerize a Node.js app for development without mounting your host machine's node_modules folder into the container?

My attempted approaches (and why none of them satisfies me)

I can think of 3 big advantages of using Docker (and its docker-compose utility) for development (I will refer to these as points 1, 2 and 3):

  1. It makes it easy to set up the dev environment on new machines, or to onboard new project members, in that you do not have to manually install anything besides Docker itself.
  2. The app is easy to run for debugging while developing (just a quick docker-compose up and the db, the backend and the frontend are up and running).
  3. The runtime environment is consistent across all machines and the app's final production environment, since a Docker container behaves somewhat like its own small Linux machine.

The first 2 points pose no problem when dockerizing a Node.js application; however, I feel like the third one is harder to achieve in a dev environment because of how dependencies work in Node and how node_modules functions. Let me explain:

This is my simplified project folder structure:

project
│   docker-compose.yml
│
└───node-backend
│   │   Dockerfile
│   │   package.json
│   │   server.js
│   │
│   └───src
│   │   │   ...
│   │
│   └───node_modules
│       │   ...
│
└───react-frontend
    │   ...

From what I have tried and what I have seen in articles and tutorials on the internet, there are basically 3 approaches to developing Node.js with Docker. In all 3 approaches, I assume I am using the following Dockerfile to describe the image of my app:

# node-backend/Dockerfile
FROM node:lts-alpine
WORKDIR /usr/src/app

COPY package*.json ./
RUN npm install

COPY . ./

EXPOSE 8000
CMD [ "npm", "start" ]
  • Approach 1: Quick and dirty

    When developing, mount your host machine's whole code folder (including node_modules) into your container. The docker-compose.yml file typically looks like this (the database and the React app config are left out for clarity):

    # ./docker-compose.yml
    version: "3"
    services:
        backend:
            build: ./node-backend/
            ports:
                - 8000:8000
            volumes:
                - ./node-backend:/usr/src/app
    

    This approach is the easiest to set up and use for development: code is synchronized between the host and the container, hot-reloading (with nodemon, for example) works, and dependencies are synchronized between host and container (no need to rebuild the container after each npm install some-module on the host machine).

    However, it doesn't respect point 3: since the host machine's node_modules are also mounted into the container, they may contain platform-specific parts (for example node-gyp addons) that were compiled for your host machine's OS (in my case, macOS or Windows) and not for the container's OS (Alpine Linux). A quick way to check for this is sketched right below.
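
    To verify whether a natively compiled dependency actually loads inside the container, something like the following could work (a hedged sketch: bcrypt is just a hypothetical example of a node-gyp-built package, and backend is the service name from the compose file above):

    # Hypothetical check: try loading a native addon inside the running container.
    # Replace 'bcrypt' with whatever node-gyp-built dependency your project uses.
    docker-compose exec backend node -e "require('bcrypt'); console.log('native addon loads fine')"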

  • Approach 2: Cleaner, but annoying when installing new dependencies

    Mount the host machine's source code folder, but this time create a volume (named or anonymous) that will hold the container's node_modules, preventing them from being hidden by the host machine's node_modules.

    # ./docker-compose.yml
    version: "3"
    services:
        backend:
            build: ./node-backend/
            ports:
                - 8000:8000
            volumes:
                - ./node-backend:/usr/src/app
                - /usr/src/app/node_modules
    

    With this approach, point 3 is now respected : we ensure that the node_modules folder used by the container in development was created specifically for the container, and thus contains the appropriate platform-specific code.

    However, installing new dependencies is a pain:

    • Either you run your npm installs directly in the container (using docker exec), but then the dependencies are not installed on your host machine unless you repeat the install manually each time. It is important to also have them installed locally so that the IDE (VSCode in my case) can provide auto-completion and linting and avoid "missing module" warnings (see the sketch right after this list).
    • Or you run your npm installs on your host machine, but then you have to rebuild your Docker image each time so that the container's node_modules stay up to date, which is time-consuming.
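
    In practice, the first option boils down to installing every new package twice. A minimal sketch of that workflow, assuming the compose service is called backend as above (some-module is just a placeholder):

    # Hypothetical dual-install workflow for Approach 2:
    # 1) install inside the container, so the anonymous volume gets Linux-built deps
    #    without rebuilding the image;
    # 2) install on the host, so the IDE can resolve the same module.
    docker-compose exec backend npm install some-module
    (cd node-backend && npm install some-module)
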
  • Approach 3: Probably the cleanest, but harder to set up

    The final approach would be to develop directly inside the container, which means that there is no host mount at all: you just have to create a volume (let's make it a named one this time, though I think an anonymous one might work too?) so that changes to your code and node_modules are persistent. I haven't tried this approach yet, so I am not sure what the docker-compose.yml file would look like, but probably something along these lines:

    # ./docker-compose.yml
    version: "3"
    services:
        backend:
            build: ./node-backend/
            ports:
                - 8000:8000
            volumes:
                - backend-data:/usr/src/app
    volumes:
        backend-data:
    

    This approach also respects point 3, but remote development inside a container is harder to set up than regular development on your host machine (although VSCode apparently simplifies the process). Also, source-code version control (i.e. using Git) seems a bit annoying to do, since you would have to pass your host machine's SSH identity into the container for it to be allowed to access your remote repo (one way of doing that is sketched below).
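
    For the SSH part, forwarding the host's ssh-agent into the container is one possibility. A hedged sketch, assuming an ssh-agent is running on the host and my-backend-image is a placeholder image name (Docker Desktop for Mac exposes the agent at /run/host-services/ssh-auth.sock instead of $SSH_AUTH_SOCK):

    # Forward the host's ssh-agent socket into the container so git inside it
    # can authenticate against the remote repo without copying any private keys.
    docker run --rm -it \
        -v "$SSH_AUTH_SOCK:/ssh-agent" \
        -e SSH_AUTH_SOCK=/ssh-agent \
        my-backend-image sh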

Conclusion

As you can see, I have yet to find an approach that combines all of the advantages I am looking for. I need an approach that is easy to set up and use in development; that respects point 3, because it is an important aspect of the philosophy and purpose of containerization; and that doesn't make synchronizing node_modules between the container and the host machine a headache, while preserving all of the IDE's functionality (IntelliSense, linting, etc.).

Is there something I am completely missing? What are your solutions to this problem?



Solution 1:[1]

I would suggest looking at this project: https://github.com/BretFisher/node-docker-good-defaults. It supports both local and container node_modules by using a trick, but it is not compatible with some frameworks (e.g. Strapi):

  • The container's node_modules are placed one folder level up from the app (Node's dependency-resolution algorithm recursively looks in parent folders for node_modules), and they are removed from the app folder itself.
  • The host's node_modules stay inside the app folder (no change there).
  • You additionally bind-mount the package.json and lock files between the container (remember, one folder up from the app) and the host (the app folder) so they are always kept in sync. So when you docker exec an npm install dep, you only need to npm install dep on the host as well (and eventually rebuild your Docker image). A rough sketch of the resulting container-side layout follows.
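
A hedged sketch of what that layout looks like inside the container (the /opt/node_app path is my assumption, not taken from the repo):

# Node resolves require() by walking up parent directories, so dependencies
# installed in /opt/node_app/node_modules are found by code in /opt/node_app/app.
cd /opt/node_app    # package.json + lock file are bind-mounted here
npm install         # node_modules is created one level above the app code
cd app              # the app code (without its own node_modules) is bind-mounted here
node server.js      # require('some-dep') is resolved from ../node_modules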

Solution 2:[2]

Approach 2 + 3

Your docker-compose will look exactly like the one you showed for Approach 2, but you attach VSCode to the container as you mentioned in Approach 3.

This is easy to do with a devcontainer configuration file.

My current setup is below. I have one command aliased to run start.sh, and it builds the containers (if necessary) and spins them up with VSCode attached and all the dev tools available.

├── backstage            # app code
├── db                   # init.sql configuration for the database
├── dev.sh               # script to switch between services to attach to and enable debug mode
├── docker-compose.yml
├── Dockerfile           # backend
├── Dockerfile.frontend  # frontend
├── Dockerfile.other     # other service
├── postgres.conf
├── README.md
└── start.sh             # script that replaces the backend container's initial command, adding important developer aliases and enabling debug mode

docker-compose.yml

version: "3.9"
services:
  db:
    image: ...
    environment:
      - ...
    volumes:
      - ./db:/docker-entrypoint-initdb.d/
      - pgdata:/var/lib/postgresql/data:rw
  web:
    image: ...
    build: .
    environment:
      - ...
    command: bash -c "chmod +x /start.sh && /start.sh"
    volumes:
      - ./backstage/:/backstage      # app code, hot reloading
      - ./.git/:/.git                # git with submodules
      - ~/.config/:/root/.config     # my git creds
      - ./start.sh:/start.sh         # see below
      - ./.vscode/launch.json:/backstage/.vscode/launch.json
        # my own container-specific debugging config, mounted so it doesn't
        # overwrite my teammates' configs (they don't develop in a container)
        # and never gets checked into git
    ports:
      - ...
    depends_on:
      - db
  frontend:
    image: ...
    build:
      context: .
      dockerfile: Dockerfile.frontend
    command: bash -c "cd path/to/app && yarn start:dev"
    ports:
      - ...
    volumes:
      - ./backstage/:/backstage/     
      - ./.git/:/.git                
      - ~/.config/:/root/.config
      - /backstage/apps/prime/frontend/node_modules
      - /backstage/node_modules # prevent host from overwriting container packages
    depends_on:
      - web
  other:
    ...
  adminer:
    ... # cause why not
volumes:
  pgdata:

start.sh

echo "Container Start Script"

DEBUG_MODE=false

function runserver() {
    python path/to/manage.py runserver 0.0.0.0:...
}

# Add useful aliases to the container
echo "alias django='python -W i /backstage/.../manage.py'" >> /root/.bashrc
echo "alias pylint='\
black ... && \
flake8 ... && \
...etc'" >> /root/.bashrc
source /root/.bashrc

# keep the container alive even if Django crashes
while true; do
    if [ $(ps auxw | grep runserver | wc -l) -ge 2 ] || [ "$DEBUG_MODE" = true ]; then
        sleep 1   # yes, this is hacky
    else
        runserver # but hey, it works
    fi
done
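
A hypothetical way to use those aliases once the stack is up (web is the backend service from the compose file; an interactive bash session loads /root/.bashrc, which is where start.sh appended them):

# Open an interactive shell in the backend container...
docker-compose exec web bash
# ...then, inside the container, the alias defined by start.sh is available:
django migrate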

dev.sh

#!/bin/bash
# Shortcut to attach to multiple services
# Requires the devcontainer CLI

SCRIPTPATH="$( cd -- "$(dirname "$0")" >/dev/null 2>&1 ; pwd -P )"
DEBUG=0
SERVICE=0

for arg in "$@"; do
    case $arg in
        -d|--debug)
            DEBUG=1
            ;;
        web|frontend)
            SERVICE="$arg"
            ;;
        *)
            echo "Unknown option $arg"
            exit 1
            ;;
    esac
done

if [ "$SERVICE" != 0 ]; then
    echo "This will rebuild the dev container and it might take a minute or two."
    echo "To prevent this just use this script without arguments and attach via VSCode."
    echo "Are you sure? y/n"
    read user_input
    if [ "$user_input" != "y" ]; then
        echo "Aborted"
        exit 0
    fi
    sed -i -E 's/"service":.*/"service": "'$SERVICE'",/' "$SCRIPTPATH/.devcontainer/devcontainer.json"
fi

if [ $DEBUG = 1 ]; then
    echo "[ DEBUG MODE ]"
    echo "Please run web server with debugger attached."
    sed -i -E "s,DEBUG_MODE=.*,DEBUG_MODE=true," "$SCRIPTPATH/start.sh"
else
    sed -i -E "s,DEBUG_MODE=.*,DEBUG_MODE=false," "$SCRIPTPATH/start.sh"
    # more sed hackery, yes, but I don't care
fi


devcontainer open "$SCRIPTPATH"
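
For completeness, a hypothetical invocation of the script (assuming it is executable and the service names match the compose file):

# Attach VSCode to the frontend service with debug mode enabled
./dev.sh frontend --debug

# Reopen the dev container for whichever service is currently configured
./dev.sh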

devcontainer.json

{
    "name": "Dev Container",
    "features": {
        "git": "latest" // it's vscode-suggested solution
// but I might change it in the future,
// it adds a ton of time and wasted space
// when building the dev container image for the first time
    },
    // extensions --> customize to taste
    "extensions": [
        ...
        // they'll be installed inside the container
    ],
    "dockerComposeFile": "../docker-compose.yml",
    "service": "web",
    "workspaceFolder": "/backstage",
    "postStartCommand": "git config --global --add safe.directory /backstage"
    // I think this command became necessary after the recent security vulnerability found in git
}

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

[1] Solution 1
[2] Solution 2: Marcos