ECS Fargate task failing to start: "standard_init_linux.go:228: exec user process caused: exec format error" even when built for amd64

My CDK stack deploys a Fargate service, but the tasks won't start; they stop immediately with the error:

"standard_init_linux.go:228: exec user process caused: exec format error"

I am aware that this error is usually caused by a Docker image built for the wrong CPU architecture, or by a malformed shell script entrypoint, so I have taken steps against both.
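
As a quick sanity check, the architecture a local image was built for can be read straight off the image metadata (the image tag below is a placeholder):

docker inspect --format '{{.Os}}/{{.Architecture}}' my-backend-image:latest
# Fargate's default platform is Linux/X86_64, so this should print linux/amd64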

My Docker image is a Node.js Express server, built on my macOS M1 (arm64) machine. I have built the image both multi-arch (amd64 and arm64) and as amd64 only.

In another AWS account I have a Fargate service behind an ALB, created manually in the AWS console, which mimics what I am trying to do with CDK. The same Docker image is in that account's ECR repository, and there my tasks start fine.

I have therefore concluded that the problem is not the Docker image itself but something in the task definition.

What could be wrong?

Here is my CDK stack:

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as ecr from 'aws-cdk-lib/aws-ecr';
import * as ecs from 'aws-cdk-lib/aws-ecs';
import * as ecsp from 'aws-cdk-lib/aws-ecs-patterns';

export class HelloEcsStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const serviceName = 'my-backend';
    const servicePort = 5432;

    const vpc = new ec2.Vpc(this, "MyVpc", {
      maxAzs: 2
    });

    const cluster = new ecs.Cluster(this, "BackendCluster", {
      vpc: vpc
    });    

    const load_balanced_fargate_service = new ecsp.ApplicationLoadBalancedFargateService(this, 'Backend', {
      cluster: cluster,
      taskImageOptions: {
        image: ecs.ContainerImage.fromEcrRepository(
          ecr.Repository.fromRepositoryName(this, id, 'backend-container-repo')
        ),
        containerPort: servicePort,
        environment: {
          // some key values our backend needs
        }
      },
      listenerPort: servicePort,
      publicLoadBalancer: true
    });

    load_balanced_fargate_service.targetGroup.configureHealthCheck({
      port: `${servicePort}`,
      path: '/api/health'
    });
  }
}

The only differences between the task definition JSON of the manual variant and of the CDK variant are:

Manual: task memory and CPU are larger (2048 MiB / 1024 CPU units)

CDK:

"runtimePlatform": null,

Manual:

"runtimePlatform": {
    "operatingSystemFamily": "LINUX",
    "cpuArchitecture": null
},

There are also some values where the CDK variant has an empty array while the manual task definition has null. Not sure whether that matters at all. Example: manual: "entryPoint": null, CDK: "entryPoint": [].
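
For reference, recent versions of aws-cdk-lib let you pin the runtime platform directly on the load-balanced service pattern, which would make the synthesized task definition explicit instead of null. A minimal sketch of what that could look like in the stack above (the 'BackendPinned' names are hypothetical, and I have not tried this yet):

const pinned_service = new ecsp.ApplicationLoadBalancedFargateService(this, 'BackendPinned', {
  cluster: cluster,
  // When runtimePlatform/cpuArchitecture is null, Fargate assumes LINUX/X86_64;
  // pinning it makes the architecture requirement explicit.
  runtimePlatform: {
    operatingSystemFamily: ecs.OperatingSystemFamily.LINUX,
    cpuArchitecture: ecs.CpuArchitecture.X86_64,
  },
  taskImageOptions: {
    image: ecs.ContainerImage.fromEcrRepository(
      ecr.Repository.fromRepositoryName(this, 'BackendPinnedRepo', 'backend-container-repo')
    ),
    containerPort: servicePort,
  },
  listenerPort: servicePort,
  publicLoadBalancer: true
});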

And just to be complete, here is my Dockerfile for the image:

# --------------> The build image
FROM node:latest AS build
WORKDIR /usr/src/app
COPY package*.json /usr/src/app/
RUN npm ci --only=production
 
# --------------> The production image
FROM node:lts-alpine
RUN apk add dumb-init
ENV NODE_ENV production
USER node
WORKDIR /usr/src/app
COPY --chown=node:node --from=build /usr/src/app/node_modules /usr/src/app/node_modules
COPY --chown=node:node . /usr/src/app

# Set up backend port (default can be overridden with --build-arg PORT=<port value>)
ARG PORT=5432
ENV PORT=$PORT
EXPOSE $PORT


CMD ["dumb-init", "node", "server.js"]


Solution 1 [1]:

It actually was the image after all.

My problem was that I built an image locally first, but when deploying I used a shell script. The script didn't tag the image properly, so the image that was actually pushed was an old one (with the wrong architecture). I found this out by purging my local Docker images and containers, after which running the script produced errors even after building.

The corrected build and deploy script is now as follows:

#!/bin/bash
if [ -z "$1" ]
then
  echo "Please provide all parameters"
  echo "usage: ./docker_deploy.sh <AWS account profile> <AWS region>"
  exit 1
fi

if [ -z "$2" ]
then
  echo "Please provide all parameters"
  echo "usage: ./docker_deploy.sh <AWS account profile> <AWS region>"
  exit 1
fi

PROFILE="$1"
REGION="$2"

export AWS_PROFILE=$PROFILE
export AWS_REGION=$REGION

REPO_NAME=your-repo
if aws ecr describe-repositories --repository-names ${REPO_NAME} >/dev/null 2>&1 ; then
    echo "ECR repository exists"
else
    echo "ECR repository does not exist, creating..."
    aws ecr create-repository --repository-name ${REPO_NAME} >/dev/null
fi

# Extract the registry host (everything before the first "/") from the repository URI
REPO_URI=$(aws ecr describe-repositories --repository-names ${REPO_NAME} | jq -r '.repositories[].repositoryUri | match( "([^/]*)" ).string')

set -e # fail script on any individual command failing

aws ecr get-login-password --profile ${PROFILE} --region ${REGION} | docker login --username AWS --password-stdin ${REPO_URI}
docker build --platform=linux/amd64 -t ${REPO_NAME} .
docker tag ${REPO_NAME}:latest ${REPO_URI}/${REPO_NAME}:latest
docker push ${REPO_URI}/${REPO_NAME}:latest
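
Usage, plus a quick way to confirm what actually landed in ECR (profile and region values below are examples; the variables come from the script above):

./docker_deploy.sh my-aws-profile eu-west-1

# After a push, inspect the remote manifest; -v resolves the platform
# (prints one architecture entry per platform for a multi-arch image):
docker manifest inspect -v ${REPO_URI}/${REPO_NAME}:latest | grep architecture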

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

[1] Source: Stack Overflow