I am trying to connect to an Oracle database from AWS Glue using cx_Oracle, but I am getting this error message: DatabaseError: DPI-1047: Cannot locate a 64-bit Oracle Client library.
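DPI-1047 means cx_Oracle could not find the 64-bit Oracle Client libraries at runtime. A minimal sketch of one common workaround, assuming an Instant Client is shipped with the job and unpacked somewhere readable (the lib_dir path, host, and credentials below are placeholders, not from the question):

```python
import cx_Oracle

# cx_Oracle 8+ lets the script point the driver at the Instant Client
# explicitly instead of relying on LD_LIBRARY_PATH. The path is
# hypothetical: wherever the job unpacks the 64-bit client.
cx_Oracle.init_oracle_client(lib_dir="/tmp/instantclient_19_8")

dsn = cx_Oracle.makedsn("db-host.example.com", 1521, service_name="ORCLPDB1")
conn = cx_Oracle.connect(user="scott", password="tiger", dsn=dsn)

with conn.cursor() as cur:
    cur.execute("SELECT sysdate FROM dual")
    print(cur.fetchone())
```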
I'm new to Glue jobs and I'm trying to use Glue 2.0 to run PySpark (Python 3) jobs that require the Python libraries defined in my requirements file.
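Glue 2.0 can pip-install extra packages at job start via the --additional-python-modules job parameter. A hedged sketch of declaring such a job with boto3; the job name, role, script location, and package pins are all placeholders:

```python
import boto3

glue = boto3.client("glue")

# Hypothetical job definition: Glue 2.0 pip-installs the listed packages
# when they are passed via --additional-python-modules.
glue.create_job(
    Name="my-pyspark-job",
    Role="arn:aws:iam::123456789012:role/GlueJobRole",
    Command={
        "Name": "glueetl",
        "ScriptLocation": "s3://my-bucket/scripts/job.py",
        "PythonVersion": "3",
    },
    GlueVersion="2.0",
    DefaultArguments={
        "--additional-python-modules": "pyarrow==2,awswrangler==2.4.0",
    },
)
```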
I've created a crawler that pulls messages from SQS when new objects are added to S3, but when it runs, it reports "The number of unique events received is 0 for …".
I was using SQLAlchemy to create a connection and query a MySQL DB; however, Glue doesn't seem to support "sqlalchemy" or even "pymysql". Is there a way to do this?
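One commonly cited route is to have Glue install the driver itself (Glue 2.0+ accepts --additional-python-modules pymysql) and then use a plain DB-API connection inside the script. A minimal sketch under that assumption; the host and credentials are placeholders:

```python
# Assumes PyMySQL was installed via the job parameter
# --additional-python-modules pymysql
import pymysql

conn = pymysql.connect(
    host="mydb.cluster-xyz.us-east-1.rds.amazonaws.com",  # placeholder
    user="admin",
    password="secret",
    database="sales",
)
with conn.cursor() as cur:
    cur.execute("SELECT COUNT(*) FROM orders")
    print(cur.fetchone())
conn.close()
```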
I have a folder containing files in Parquet format. I used a crawler to create a table in the Glue Data Catalog, which came to 2500+ columns. I want to create …
I'm using AWS Glue 3.0 and am trying to connect to Redshift using psycopg2. At first I was uploading a whl file version of it, and it would give me an error about …
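A workaround that is often suggested for this (an assumption here, not an official fix) is to skip the uploaded whl entirely and let Glue pull the pre-built wheel with --additional-python-modules psycopg2-binary, then connect normally. A sketch with placeholder cluster details:

```python
# Assumes the job was created with
#   --additional-python-modules psycopg2-binary
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
    port=5439,
    dbname="dev",
    user="awsuser",
    password="secret",
)
with conn.cursor() as cur:
    cur.execute("SELECT current_date")
    print(cur.fetchone())
conn.close()
```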
I'm testing some PySpark code in an EMR notebook before I deploy it and keep running into a strange error with Spark SQL. I have all my tables and metadata in the Glue Data Catalog.
I am fairly new to AWS Glue. I have tried creating some jobs and it works fine; now I want to take it a step further. Say we have other developers working and …
Thanks for taking the time to read this! I have multiple tables within an AWS Glue catalog database and want to create an ER diagram from that database. It should …
I am trying to create a table in Spark SQL by providing the schema and the location. However, when I run a SELECT on the table, I see only half the columns.
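Missing columns on SELECT often come down to partition columns that were never registered. A sketch of one way the table definition might look, assuming the data is partitioned; the database, columns, and S3 location are placeholders:

```python
# Partition columns must be declared in the schema and listed in
# PARTITIONED BY, and existing partitions must be registered afterwards,
# or their values will not show up in query results.
spark.sql("""
    CREATE TABLE IF NOT EXISTS sales_db.orders (
        order_id   BIGINT,
        amount     DOUBLE,
        order_date STRING
    )
    USING PARQUET
    PARTITIONED BY (order_date)
    LOCATION 's3://my-bucket/orders/'
""")
spark.sql("MSCK REPAIR TABLE sales_db.orders")  # register existing partitions
```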
How can I capture a Glue job's arguments by position rather than using the getResolvedOptions function and passing the arguments as key-value pairs?
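Glue ultimately invokes the script with its arguments on sys.argv, so positional capture is just a matter of indexing into it. A minimal sketch; note that Glue also injects its own --key value pairs (job bookmarks and the like), so assuming a fixed positional layout is fragile:

```python
import sys

# Full argument vector exactly as Glue delivered it, Glue-injected
# flags included.
print(sys.argv)

# Naive positional capture: assumes the value of interest is the first
# element after the script name, which may not hold once Glue adds its
# own arguments.
first_arg = sys.argv[1]
```

The trade-off is that getResolvedOptions validates and documents the expected keys, while raw positional access silently breaks if the injected arguments change.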
I have a source bucket where small 5 KB JSON files will be inserted every second. I want to use AWS Athena to query the files using an AWS Glue data source and …
I am following the AWS documentation on how to transfer a DDB table from one account to another. There are two steps: (1) export the DDB table into Amazon S3, and (2) use a Glue job to …
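A rough sketch of what the second step can look like: read the exported data from S3, then write it straight into the target table. The bucket path and table name are placeholders, and reading the DynamoDB-JSON export as plain JSON is an assumption; the exported attribute encoding may still need unmarshalling before the write:

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

# Read the DDB export dropped in S3 by step (1).
dyf = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://export-bucket/AWSDynamoDB/data/"]},
    format="json",
)

# Write into the target-account table; the write-percent option throttles
# consumed write capacity.
glue_context.write_dynamic_frame_from_options(
    frame=dyf,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": "target-table",
        "dynamodb.throughput.write.percent": "0.5",
    },
)
```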
We have an ETL job that uses the below code snippet to update the catalog table: sink = glueContext.getSink(connection_type='s3', path=config['glue_s3_path_bc'], …)
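For context, the catalog-updating pattern around a getSink call like that usually looks something like the sketch below. The database, table, and partition key names are placeholders; the enableUpdateCatalog and updateBehavior options are what actually drive the catalog update:

```python
# glueContext, config, and dyf come from the surrounding job script.
sink = glueContext.getSink(
    connection_type="s3",
    path=config["glue_s3_path_bc"],
    enableUpdateCatalog=True,
    updateBehavior="UPDATE_IN_DATABASE",
    partitionKeys=["ingest_date"],  # hypothetical partition key
)
sink.setCatalogInfo(catalogDatabase="my_db", catalogTableName="my_table")
sink.setFormat("glueparquet")
sink.writeFrame(dyf)
```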
So, I've used Glue before, but only with a single-file-to-single-folder relationship. What I'm trying to do now is to have a structure like this: …
I've created an EMR cluster with the Glue Data Catalog. When I invoke the spark-shell, I am able to successfully list tables stored within a Glue database via Spark SQL.
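The same check expressed in PySpark, assuming the cluster is configured to use the Glue Data Catalog as its Hive metastore (the database name is a placeholder):

```python
from pyspark.sql import SparkSession

# With the Glue Catalog wired in as the metastore, both calls below
# should return the Glue tables.
spark = SparkSession.builder.enableHiveSupport().getOrCreate()

spark.sql("SHOW TABLES IN my_glue_db").show()
print(spark.catalog.listTables("my_glue_db"))
```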
I am trying to populate the maximum possible Glue job metrics for some testing. Below is the setup I have created: a crawler reads data (dummy customer data of 500…
According to the AWS Glue documentation, we can use exclusions to exclude files when the connection type is S3: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-…
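A sketch of how that documented option is passed in a job script; the bucket path and glob patterns are placeholders. Note that "exclusions" takes a JSON-encoded list of patterns as a string, not a Python list:

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())

dyf = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={
        "paths": ["s3://my-bucket/input/"],
        # JSON string listing glob patterns to skip while reading.
        "exclusions": '["**/_temporary/**", "**.metadata"]',
    },
    format="parquet",
)
```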
I'm running Trino on EMR version 6.5, I have added the Iceberg connector for Trino, and I want it to use a Glue catalog. These are the configurations under …
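For reference, the Iceberg-on-Glue wiring in Trino normally lives in the connector's catalog properties file. A minimal sketch, assuming the Trino build shipped with EMR 6.5 already supports the glue catalog type for the Iceberg connector (an assumption; that support arrived in a specific Trino release):

```properties
# /etc/trino/conf/catalog/iceberg.properties (path may differ on EMR)
connector.name=iceberg
iceberg.catalog.type=glue
```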
When I started a job with the IAM role AWSGlueServiceNotebookRoleDefault, I got this error: Failed to authenticate user due to missing information in request. No info…