How to execute a Lambda function which copies objects from one S3 bucket to another via Step Functions?

I was able to copy data from a source bucket to a destination bucket using a Lambda function. However, I got an error when executing the same Lambda function from Step Functions. Below are the steps I followed from scratch.

  1. Region chosen is ap-south-1
  2. Created 2 buckets. Source bucket: start.bucket & Destination bucket: final.bucket
  3. Created a Lambda function with the following information:
    • Author from scratch
    • Function name: CopyCopy
    • Runtime: Python 3.8
    • Created a Lambda IAM role, LambdaCopy, gave it the necessary policies (S3 full access and Step Functions full access), and attached it to the function.
    • Added a trigger and chose:
      • S3
      • Bucket: start.bucket
      • Event type: All object create events
    • Found some Python code on GeeksforGeeks and pasted it into the code section:
import json
from urllib.parse import unquote_plus

import boto3

s3_client = boto3.client('s3')


# Lambda function to copy a file from one S3 bucket to another
def lambda_handler(event, context):
    # Source bucket, taken from the S3 event notification
    source_bucket_name = event['Records'][0]['s3']['bucket']['name']
    # Key of the uploaded object (keys arrive URL-encoded in S3 events)
    file_name = unquote_plus(event['Records'][0]['s3']['object']['key'])
    # Destination bucket
    destination_bucket_name = 'final.bucket'
    # Where the file is copied from
    copy_object = {'Bucket': source_bucket_name, 'Key': file_name}
    # Copy the object to the destination bucket under the same key
    s3_client.copy_object(CopySource=copy_object, Bucket=destination_bucket_name, Key=file_name)

    return {
        'statusCode': 200,
        'body': json.dumps('File has been successfully copied')
    }
I deployed the code and it worked: I uploaded a CSV file to start.bucket and it was copied to final.bucket.

Then I created a state machine in Step Functions with the following settings:

  1. Design your workflow visually
  2. Type: Standard
  3. Dragged the AWS Lambda action between the Start and End states.
    • Changed its name to LambdaCopy
    • Integration type: Optimized
    • Under API Parameters, Function name (I chose the Lambda function that I had created): CopyCopy:$LATEST
    • Next State: End
  4. Clicked Next, then Next again
  5. State machine name: StepLambdaCopy
    • IAM Role: Create a new role (I later gave it S3 full access, Lambda full access, and Step Functions full access too).

It showed an error when I tried to execute it. I know I am missing something. I would really appreciate the help.



Solution 1:[1]

Step Functions now lets you call the S3 CopyObject API directly through its AWS SDK integration, completely bypassing the need for Lambda and boto3. Take a look here for more information.

So in your case you would need a simple task that looks like this:

{
  "Comment": "A description of my state machine",
  "StartAt": "CopyObject",
  "States": {
    "CopyObject": {
      "Type": "Task",
      "End": true,
      "Parameters": {
        "ServerSideEncryption": "AES256",
        "Bucket.$": "$.destination_bucket",
        "CopySource.$": "$.source_path",
        "Key.$": "$.key"
      },
      "Resource": "arn:aws:states:::aws-sdk:s3:copyObject"
    }
  }
}

Your execution input then needs to supply the parameters you would normally pass to the copy command: the source path (`source_path`), the destination bucket (`destination_bucket`), and the object key (`key`), exactly as in the boto3 call.
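
As a minimal sketch of what that input could look like, the Python below builds the JSON for the state machine above using the bucket names from the question. The helper name `build_copy_input` and the file name `data.csv` are made up for illustration; the sketch assumes the copy source is passed as a single URL-encoded `bucket/key` string, which is the form the S3 CopyObject API accepts.

```python
import json
from urllib.parse import quote


def build_copy_input(source_bucket, destination_bucket, key):
    """Build the execution input read by the CopyObject state.

    build_copy_input is a hypothetical helper; only the three output
    field names come from the state machine definition.
    """
    return {
        "destination_bucket": destination_bucket,
        # CopySource is one "bucket/key" string; quote() URL-encodes it
        # while leaving the "/" separators intact.
        "source_path": quote(f"{source_bucket}/{key}"),
        "key": key,
    }


# Example values: bucket names from the question, file name assumed.
execution_input = json.dumps(
    build_copy_input("start.bucket", "final.bucket", "data.csv")
)
print(execution_input)
```

The resulting string can be pasted into the execution input box in the console, or passed as the `input` argument of `start_execution` on the boto3 Step Functions client.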

Note: The state machine's IAM role will need direct S3 permissions, and the state machine will need to be in the same region as the buckets.
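
If you would rather keep the Lambda task, here is a hedged sketch of a handler that tolerates both kinds of input. The error message is not shown in the question, but one likely cause is that the function reads `event['Records']`, which exists only in S3 event notifications; when Step Functions invokes the function, the event is whatever payload the task state sends, typically the execution input. The input field names `source_bucket`, `key`, and `destination_bucket` and the helper `resolve_copy_request` are assumptions for illustration.

```python
import json
from urllib.parse import unquote_plus

# Default destination, taken from the question.
DESTINATION_BUCKET = "final.bucket"


def resolve_copy_request(event):
    """Return (source_bucket, key, destination_bucket) for either event shape."""
    if "Records" in event:
        # S3 event notification: object keys arrive URL-encoded.
        s3_info = event["Records"][0]["s3"]
        key = unquote_plus(s3_info["object"]["key"])
        return s3_info["bucket"]["name"], key, DESTINATION_BUCKET
    # Direct invocation (e.g. from Step Functions) with assumed field names.
    destination = event.get("destination_bucket", DESTINATION_BUCKET)
    return event["source_bucket"], event["key"], destination


def lambda_handler(event, context):
    # Imported here so the parsing helper can be exercised without boto3.
    import boto3

    source_bucket, key, destination_bucket = resolve_copy_request(event)
    boto3.client("s3").copy_object(
        CopySource={"Bucket": source_bucket, "Key": key},
        Bucket=destination_bucket,
        Key=key,
    )
    return {"statusCode": 200, "body": json.dumps(f"Copied {key}")}
```

With this shape, an execution input such as `{"source_bucket": "start.bucket", "key": "data.csv"}` would be enough for the Step Functions path, while the existing S3 trigger keeps working unchanged.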

Solution 2:[2]

It's always confusing what exactly you have to pass as parameters. Here is a template I use to copy the output of an Athena query. You can adapt it to your needs:

"athena_score": {
  "Type": "Task",
  "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
  "Parameters": {
    "QueryExecutionContext": {
      "Catalog": "${AthenaCatalog}",
      "Database": "${AthenaDatabase}"
    },
    "QueryString": "SELECT ...",
    "WorkGroup": "${AthenaWorkGroup}",
    "ResultConfiguration": {
      "OutputLocation": "s3://${BucketName}/${OutputPath}"
    }
  },
  "TimeoutSeconds": 300,
  "ResultPath": "$.responseBody",
  "Next": "copy_csv"
},
"copy_csv": {
  "Type": "Task",
  "Resource": "arn:aws:states:::aws-sdk:s3:copyObject",
  "Parameters": {
    "Bucket": "${BucketName}",
    "CopySource.$": "States.Format('/${BucketName}/${OutputPath}/{}.csv', $.responseBody.QueryExecution.QueryExecutionId)",
    "Key": "${OutputPath}/latest.csv"
  },
  "ResultPath": "$.responseBody.CopyObject",
  "End": true
}
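
To make the two substitution stages in `CopySource.$` easier to follow, here is a rough Python sketch of how that string is presumably resolved. It assumes the `${...}` placeholders are filled in by a deployment tool before the definition reaches Step Functions, and that `States.Format` then replaces `{}` with the query execution ID at run time. The function name and the sample values are made up for illustration.

```python
from string import Template


def render_copy_source(bucket_name, output_path, query_execution_id):
    """Emulate how the CopySource value in copy_csv is put together."""
    # Stage 1 (assumed to happen at deploy time): fill the ${...} placeholders.
    template = Template("/${BucketName}/${OutputPath}/{}.csv")
    format_string = template.substitute(
        BucketName=bucket_name, OutputPath=output_path
    )
    # Stage 2 (run time): States.Format swaps {} for its argument, here
    # $.responseBody.QueryExecution.QueryExecutionId.
    return format_string.replace("{}", query_execution_id, 1)


# Hypothetical values, for illustration only.
print(render_copy_source("my-bucket", "results", "abc-123"))
```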

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Coin Graham
Solution 2    Leopoldo Varela