Not able to populate AWS Glue ETL Job metrics

I am trying to populate as many Glue job metrics as possible for some testing. Below is the setup I have created:

  • A crawler reads data (dummy customer data of 500 rows) from a CSV file placed in an S3 bucket.
  • Another crawler crawls the tables created in a Redshift cluster.
  • An ETL job then reads the data from the CSV file in S3 and loads it into a Redshift table.

The job runs without any issue and I can see the final data being loaded into the Redshift table. However, in the end only the following 5 CloudWatch metrics are populated:

  • glue.jvm.heap.usage
  • glue.jvm.heap.used
  • glue.s3.filesystem.read_bytes
  • glue.s3.filesystem.write_bytes
  • glue.system.cpuSystemLoad

There are approximately 20 more metrics that are not being populated.

Any suggestions on how to populate those remaining metrics as well?



Solution 1:[1]

I ran into the same issue. Do your glue.s3.filesystem.read_bytes and glue.s3.filesystem.write_bytes metrics contain any data?

One possible reason is that AWS Glue job metrics are not emitted if the job completes in less than 30 seconds.
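
To see which metrics actually exist for a job, one option is to list them from CloudWatch. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the function names are hypothetical, and the "Glue" namespace and "JobName" dimension are what Glue is documented to publish under, so verify them against your account.

```python
def metric_names(pages):
    """Collect the distinct metric names from CloudWatch list_metrics response pages."""
    names = set()
    for page in pages:
        for metric in page.get("Metrics", []):
            names.add(metric["MetricName"])
    return sorted(names)


def list_glue_job_metrics(job_name):
    """Return the names of the CloudWatch metrics published for one Glue job."""
    # Imported here so metric_names() can be used without boto3 installed.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    paginator = cloudwatch.get_paginator("list_metrics")
    pages = paginator.paginate(
        Namespace="Glue",  # assumption: the namespace Glue job metrics are published to
        Dimensions=[{"Name": "JobName", "Value": job_name}],
    )
    return metric_names(pages)
```

Calling list_glue_job_metrics with your own job name should return only the metrics that received data, which makes it easy to compare a short run against a longer one.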

Solution 2:[2]

When running the job, enable the metrics option under the Monitoring tab.
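
If the job is started from code rather than from the console, the same setting can probably be passed as a job argument. This is a sketch only, assuming boto3 and the --enable-metrics special job parameter; the helper names are hypothetical and not part of any AWS API.

```python
def metrics_arguments(extra=None):
    """Build job arguments that switch on Glue job metrics for a run."""
    # The presence of the key is what enables metrics; "true" is used for clarity.
    args = {"--enable-metrics": "true"}
    args.update(extra or {})
    return args


def start_job_with_metrics(job_name, extra=None):
    """Start a Glue job run with metrics enabled and return its run id."""
    # Imported here so metrics_arguments() can be used without boto3 installed.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName=job_name,
        Arguments=metrics_arguments(extra),
    )
    return response["JobRunId"]
```

Any other job arguments can be passed through the extra dictionary, so the metrics flag does not have to be repeated at every call site.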

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Shirui Xu
Solution 2 Shubham Jain