'How to connect and query MySQL DB from python shell job in AWS Glue

I was using sqlalchemy to create connection and query mySQL DB, however, glue doesn't seem to support "sqlalchemy" or even "pymysql". Is there a way to do this on Glue python shell jobs?



Solution 1:[1]

I think you'll need to install sqlalchemy and pymysql. Glue makes it fairly easy to install additional py libs if you're using the Spark runtime, but the py shell runtime seems to be a little different.

The only way I've gotten it to work is to download(or create) whl files. Luckily, you can download sqlalchemy and pymysql from pypi. Note: there are many options for the sqlachemy whl file if you need a specific version.

Get those two whl files in a s3 bucket that your glue job will have access to. Then add both paths (separated by a comma) to the Python library path in your Job

example.

s3://my-bucket/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl,s3://my-bucket/PyMySQL-1.0.2-py3-none-any.whl

Then you should be able to import them like so

import sqlalchemy
import pymysql


print('sqlalchemy', sqlalchemy.__version__)
print('pymysql', pymysql.__version__)

May 7, 2022, 9:38:03 AM Pending execution
Processing ./glue-python-libs-ox4yhv_1/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting greenlet!=0.4.17; python_version >= "3" and (platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32"))))))
Downloading greenlet-1.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (147 kB)
Collecting importlib-metadata; python_version < "3.8"
Downloading importlib_metadata-4.8.3-py3-none-any.whl (17 kB)
Collecting zipp>=0.5
Downloading zipp-3.6.0-py3-none-any.whl (5.3 kB)
Collecting typing-extensions>=3.6.4; python_version < "3.8"
Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Installing collected packages: greenlet, zipp, typing-extensions, importlib-metadata, SQLAlchemy
Successfully installed SQLAlchemy-1.4.36 greenlet-1.1.2 importlib-metadata-4.8.3 typing-extensions-4.1.1 zipp-3.6.0
Processing ./glue-python-libs-ox4yhv_1/PyMySQL-1.0.2-py3-none-any.whl
Installing collected packages: PyMySQL
Successfully installed PyMySQL-1.0.2
sqlalchemy 1.4.36 pymysql 1.0.2

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Bob Haffner