DAX TimeoutError: Connection timeout after 10000ms

Tech stack:

Lambda
DynamoDB
DAX
amazon-dax-client

DAX Query:

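const AWS = require("aws-sdk");
const AmazonDaxClient = require("amazon-dax-client");

const parameters = {
  TableName: USER_TABLE,
  ....
};
const endpoint = DAX_CLUSTER_ENDPOINT;
// Route DocumentClient calls through the DAX cluster instead of DynamoDB directly.
const daxService = new AmazonDaxClient({ endpoints: [endpoint], region });
const daxClient = new AWS.DynamoDB.DocumentClient({ service: daxService });
response = await daxClient.query(parameters).promise();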

The API works fine most of the time, but it sometimes throws this error:

ERROR   Failed to pull from my-dax-cluster.dax-clusters.xxxx.amazonaws.com (ip address.): TimeoutError: Connection timeout after 10000ms
    at SocketTubePool.alloc (/var/task/node_modules/amazon-dax-client/src/Tube.js:244:64)
    at /var/task/node_modules/amazon-dax-client/generated-src/Operations.js:215:30

It is followed by this error:

{
    "errorType": "Error",
    "errorMessage": "Endpoint is unreachable: my-ip:9111. connect EMFILE my-ip:9111 - Local (undefined:undefined)",
    "time": 1635838117288,
    "retryable": true,
    "requestId": null,
    "statusCode": -1,
    "_tubeInvalid": false,
    "waitForRecoveryBeforeRetrying": false,
    "stack": [
        "Error: Endpoint is unreachable: my-ip:9111. connect EMFILE my-ip:9111 - Local (undefined:undefined)",
        "    at SocketTubePool.socketError (/var/task/node_modules/amazon-dax-client/src/Tube.js:290:11)",
        "    at TLSSocket.<anonymous> (/var/task/node_modules/amazon-dax-client/src/Tube.js:277:103)",
        "    at TLSSocket.emit (events.js:400:28)",
        "    at TLSSocket.emit (domain.js:470:12)",
        "    at emitErrorNT (internal/streams/destroy.js:106:8)",
        "    at emitErrorCloseNT (internal/streams/destroy.js:74:3)",
        "    at processTicksAndRejections (internal/process/task_queues.js:82:21)",
        "    at runNextTicks (internal/process/task_queues.js:64:3)",
        "    at listOnTimeout (internal/timers.js:526:9)",
        "    at processTimers (internal/timers.js:500:7)"
    ]
}
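
The `connect EMFILE` part of the second error means the Node.js process ran out of file descriptors, which usually points at sockets being opened faster than they are released. One possible cause, assuming the query snippet above runs inside the handler, is that a new `AmazonDaxClient` (with its own connection pool) is created on every invocation. A minimal sketch of building the client once per Lambda container and reusing it follows; `memoizeClient` and `getDaxClient` are hypothetical names, not part of amazon-dax-client, and the factory in the usage comment is an assumption about how the client is constructed.

```javascript
// Hypothetical helper (not part of amazon-dax-client): builds a client on
// first use and returns the same instance on every later call.
function memoizeClient(factory) {
  let client; // lives in module scope, so it survives warm invocations
  let created = false;
  return function getClient() {
    if (!created) {
      client = factory();
      created = true;
    }
    return client;
  };
}

// Assumed usage in the Lambda module (requires aws-sdk and amazon-dax-client):
//
//   const getDaxClient = memoizeClient(() => {
//     const daxService = new AmazonDaxClient({ endpoints: [endpoint], region });
//     return new AWS.DynamoDB.DocumentClient({ service: daxService });
//   });
//
//   exports.handler = async () => getDaxClient().query(parameters).promise();

// Stand-in factory so the sketch runs without AWS dependencies.
let constructed = 0;
const getFakeClient = memoizeClient(() => ({ id: ++constructed }));
getFakeClient();
getFakeClient();
console.log(constructed); // prints 1: the factory ran only once
```

If the client is already created outside the handler, this is unlikely to be the cause and the network configuration is the more probable suspect.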

Other similar questions:

https://stackoverflow.com/questions/63587352/still-receiving-timeout-connects-using-js-dax-client-classified-as-an-error-but

Aws dax stability issues



Solution 1:[1]

I encountered exactly the same issue in a Lambda function. I was missing the security group ingress rules that allow incoming traffic to DAX on ports 8111 (unencrypted) and 9111 (encrypted).

Make sure DAX and the Lambda function are in the same VPC and the same subnets.
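
To check whether the ingress rules are the problem, it can help to test raw TCP reachability from inside the Lambda function before involving the DAX client. The sketch below uses only Node's built-in `net` module; `canConnect` is a hypothetical helper, and `DAX_HOST` in the usage comment is a placeholder for your cluster endpoint host.

```javascript
const net = require("net");

// Hypothetical helper: resolves true if a TCP connection to host:port
// opens within timeoutMs, and false on a socket error or timeout.
function canConnect(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    let settled = false;

    function finish(ok) {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      socket.destroy(); // always release the file descriptor
      resolve(ok);
    }

    const timer = setTimeout(() => finish(false), timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}

// Assumed usage inside the handler (DAX_HOST is a placeholder):
//
//   for (const port of [8111, 9111]) {
//     console.log(port, await canConnect(DAX_HOST, port));
//   }
```

A `false` result for the port your cluster uses suggests a security group, subnet, or routing problem rather than an issue in the DAX client itself.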

Creating the VPC in aws-cdk:

    new core.aws_ec2.Vpc(this, "VPC", {
      cidr: "10.0.0.0/16",
      natGateways: 1,
      maxAzs: 3,
      subnetConfiguration: [
        {
          name: "private-subnet-1",
          subnetType: core.aws_ec2.SubnetType.PRIVATE_WITH_NAT,
          cidrMask: 24,
        },
        {
          name: "public-subnet-1",
          subnetType: core.aws_ec2.SubnetType.PUBLIC,
          cidrMask: 24,
        },
      ],
    });

Creating the DAX cluster in aws-cdk:

    const daxRole = new core.aws_iam.Role(this, "DAXRole", {
      assumedBy: new core.aws_iam.ServicePrincipal("dax.amazonaws.com"),
    });

    this.grantFullAccess(daxRole);

    const parameters = new core.aws_dax.CfnParameterGroup(
      this,
      "DaxParameters",
      {
        parameterGroupName: "ParameterGroup",
        parameterNameValues: {
          "query-ttl-millis": core.Duration.minutes(60)
            .toMilliseconds()
            .toString(),
          "record-ttl-millis": core.Duration.minutes(60)
            .toMilliseconds()
            .toString(),
        },
      }
    );
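
The parameter group above passes both TTLs as strings holding a millisecond count. As a quick check of the arithmetic, the plain-JavaScript sketch below mirrors what `core.Duration.minutes(60).toMilliseconds().toString()` evaluates to; `minutesToMillisString` is a hypothetical helper used only for illustration.

```javascript
// Hypothetical helper mirroring Duration.minutes(n).toMilliseconds().toString().
function minutesToMillisString(minutes) {
  return String(minutes * 60 * 1000);
}

console.log(minutesToMillisString(60)); // prints 3600000 (one hour in ms)
```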

    const securityGroup = new core.aws_ec2.SecurityGroup(
      this,
      "DaxSecurityGroupV2",
      {
        vpc: props.vpc,
        description: "DAX Security Group",
      }
    );

    securityGroup.addIngressRule(
      core.aws_ec2.Peer.anyIpv4(),
      core.aws_ec2.Port.tcp(8111),
      "Allow unencrypted connection"
    );

    securityGroup.addIngressRule(
      core.aws_ec2.Peer.anyIpv4(),
      core.aws_ec2.Port.tcp(9111),
      "Allow encrypted connection"
    );

    const subnetGroups = new core.aws_dax.CfnSubnetGroup(this, `SubnetGroups`, {
      subnetIds: props.vpc.privateSubnets.map((subnet) => subnet.subnetId),
    });

    const daxCluster = new core.aws_dax.CfnCluster(this, "DaxV2", {
      iamRoleArn: daxRole.roleArn,
      nodeType: "dax.t3.small",
      replicationFactor: 1,
      clusterEndpointEncryptionType: "NONE",
      parameterGroupName: parameters.parameterGroupName,
      securityGroupIds: [securityGroup.securityGroupId],
      sseSpecification: {
        sseEnabled: true,
      },
      subnetGroupName: subnetGroups.ref,
    });

Creating the Lambda function in the same VPC and the same subnets:

    this.role = new core.aws_iam.Role(scope, "DemoApiLambdaRole", {
      assumedBy: new core.aws_iam.ServicePrincipal("lambda.amazonaws.com"),
    });

    const lambdaCodeAsset = createAsset(this, "../");

    this.role.addManagedPolicy(
      core.aws_iam.ManagedPolicy.fromAwsManagedPolicyName(
        "service-role/AWSLambdaBasicExecutionRole"
      )
    );

    this.role.addToPolicy(
      new core.aws_iam.PolicyStatement({
        effect: core.aws_iam.Effect.ALLOW,
        actions: [
          // VPC
          "ec2:DescribeNetworkInterfaces",
          "ec2:CreateNetworkInterface",
          "ec2:DeleteNetworkInterface",
          "ec2:DescribeInstances",
          "ec2:AttachNetworkInterface",
          // DAX
          "dax:*",
        ],
        resources: ["*"],
      })
    );

    const securityGroup = new core.aws_ec2.SecurityGroup(
      this,
      "LambdaSecurityGroup",
      {
        vpc: props.vpc,
        description: "Demo API Security Group",
        allowAllOutbound: true,
      }
    );

    const func = new core.aws_lambda.Function(this, "Function", {
      code: lambdaCodeAsset,
      handler: "index.handler",
      runtime: core.aws_lambda.Runtime.NODEJS_14_X,
      vpc: props.vpc,
      vpcSubnets: {
        subnetType: core.aws_ec2.SubnetType.PRIVATE_WITH_NAT,
      },
      securityGroups: [securityGroup],
      role: this.role,
    });

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Townsheriff