Providing access to EFS from ECS task

I am struggling to get an ECS task to be able to see an EFS volume. The terraform config is:

EFS DEFINITION

resource "aws_efs_file_system" "persistent" {
  encrypted = true
}

resource "aws_efs_access_point" "access" {
  file_system_id = aws_efs_file_system.persistent.id
}

resource "aws_efs_mount_target" "mount" {
  for_each = {for net in aws_subnet.private : net.id => {id = net.id}}
  file_system_id = aws_efs_file_system.persistent.id
  subnet_id      = each.value.id
  security_groups = [aws_security_group.efs.id]
}

TASK DEFINITION

resource "aws_ecs_task_definition" "app" {
  family                   = "backend-app-task"
  execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
  task_role_arn            = aws_iam_role.ecs_task_role.arn
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
  cpu                      = var.fargate_cpu
  memory                   = var.fargate_memory
  container_definitions    = data.template_file.backendapp.rendered
  volume {
    name = "persistent"

    efs_volume_configuration {
      file_system_id          = aws_efs_file_system.persistent.id
      root_directory          = "/opt/data"
      transit_encryption      = "ENABLED"
      transit_encryption_port = 2999
      authorization_config {
        access_point_id = aws_efs_access_point.access.id
        iam             = "ENABLED"
      }
    }
  }
}

SECURITY GROUP

resource "aws_security_group" "efs" {
  name        = "efs-security-group"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = 2999
    to_port         = 2999
    security_groups = [aws_security_group.ecs_tasks.id]
    cidr_blocks     = [for net in aws_subnet.private : net.cidr_block]
  }
}

TASK ROLE

resource "aws_iam_role" "ecs_task_role" {
  name = "ecsTaskRole"

  assume_role_policy  = data.aws_iam_policy_document.ecs_task_execution_role_base.json
  managed_policy_arns = [
    "arn:aws:iam::aws:policy/AmazonElasticFileSystemFullAccess",
    "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy",
    aws_iam_policy.ecs_exec_policy.arn,
  ]
}

As I understand the AWS docs, the IAM role should grant access and the security group should pass the traffic, but the error suggests that the task cannot even resolve the EFS endpoint.

The error message is:

ResourceInitializationError: failed to invoke EFS utils commands to set up EFS volumes: stderr: Failed to resolve "fs-0000000000000.efs.eu-west-2.amazonaws.com" - check that your file system ID is correct. 

I've manually confirmed in the console that the EFS ID is correct, so I can only conclude that resolution is failing due to a network or permissions issue.

-- EDIT -- ECS SERVICE DEFINITION

resource "aws_ecs_service" "main" {
  name            = "backendservice"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.app_count
  launch_type     = "FARGATE"
  enable_execute_command = true

  network_configuration {
    security_groups  = [aws_security_group.ecs_tasks.id]
    subnets          = aws_subnet.private.*.id
    assign_public_ip = true
  }

  load_balancer {
    target_group_arn = aws_alb_target_group.app.id
    container_name   = "server"
    container_port   = var.app_port
  }

  depends_on = [aws_alb_listener.backend]
}

ECS TASK SECURITY GROUP

resource "aws_security_group" "ecs_tasks" {
  name        = "backend-ecs-tasks-security-group"
  description = "allow inbound access from the ALB only"
  vpc_id      = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = var.app_port
    to_port         = var.app_port
    security_groups = [aws_security_group.lb.id]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

VPC DEFINITION (minus internet gateway)

data "aws_availability_zones" "available" {
}

resource "aws_vpc" "main" {
  cidr_block = "172.17.0.0/16"
}

# Create var.az_count private subnets, each in a different AZ
resource "aws_subnet" "private" {
  count             = var.az_count
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]
  vpc_id            = aws_vpc.main.id
}

resource "aws_subnet" "public" {
  count                   = var.az_count
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, var.az_count + count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  vpc_id                  = aws_vpc.main.id
  map_public_ip_on_launch = true
}

EDIT

It turned out the mountPoints block was missing from the container template. I have now added it, but the outcome is the same.
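For reference, a minimal sketch of where the mountPoints block sits in the container definition JSON. The container name "server" comes from the service's load_balancer block above; the image name and port are placeholders, and sourceVolume must match the volume name ("persistent") declared in the task definition:

```json
[
  {
    "name": "server",
    "image": "my-backend-image",
    "portMappings": [{ "containerPort": 8080 }],
    "mountPoints": [
      {
        "sourceVolume": "persistent",
        "containerPath": "/opt/data",
        "readOnly": false
      }
    ]
  }
]
```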



Solution 1:[1]

I ran into this problem; this is what I did to resolve it:

  1. Set platform_version = "1.4.0" on the aws_ecs_service. It's possible this is no longer relevant, but it was in the blog post I used as a starting point: https://medium.com/@ilia.lazebnik/attaching-an-efs-file-system-to-an-ecs-task-7bd15b76a6ef
  2. Made sure aws_efs_mount_target has subnet_id and security_groups set to the same ones used by the service. Since I use multiple subnets across multiple availability zones, I created a mount target for each of them.
  3. Added the standard EFS/NFS port 2049 to the ingress rules; without it the mount operation failed with a timeout error.
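Applied to the configuration in the question, the changes sketch out roughly as follows. This is a partial sketch reusing the question's resource names, not a drop-in replacement; unchanged arguments are elided:

```hcl
# 1. Pin the Fargate platform version on the service.
#    EFS volumes on Fargate require platform version 1.4.0 or later.
resource "aws_ecs_service" "main" {
  # ... existing arguments ...
  launch_type      = "FARGATE"
  platform_version = "1.4.0"
}

# 3. Allow NFS (port 2049) into the EFS security group from the tasks.
#    The mount itself uses 2049; transit_encryption_port only sets the
#    local stunnel port inside the task.
resource "aws_security_group" "efs" {
  name   = "efs-security-group"
  vpc_id = aws_vpc.main.id

  ingress {
    protocol        = "tcp"
    from_port       = 2049
    to_port         = 2049
    security_groups = [aws_security_group.ecs_tasks.id]
  }
}
```

Point 2 is already satisfied by the question's aws_efs_mount_target, which creates one mount target per private subnet with the EFS security group attached.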

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution source: Solution 1 by Arunas