'How to connect and query MySQL DB from python shell job in AWS Glue
I was using sqlalchemy to create connection and query mySQL DB, however, glue doesn't seem to support "sqlalchemy" or even "pymysql". Is there a way to do this on Glue python shell jobs?
Solution 1:[1]
I think you'll need to install sqlalchemy and pymysql. Glue makes it fairly easy to install additional py libs if you're using the Spark runtime, but the py shell runtime seems to be a little different.
The only way I've gotten it to work is to download(or create) whl files. Luckily, you can download sqlalchemy and pymysql from pypi. Note: there are many options for the sqlachemy whl file if you need a specific version.
Get those two whl files in a s3 bucket that your glue job will have access to. Then add both paths (separated by a comma) to the Python library path
in your Job
example.
s3://my-bucket/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl,s3://my-bucket/PyMySQL-1.0.2-py3-none-any.whl
Then you should be able to import them like so
import sqlalchemy
import pymysql
print('sqlalchemy', sqlalchemy.__version__)
print('pymysql', pymysql.__version__)
May 7, 2022, 9:38:03 AM Pending execution
Processing ./glue-python-libs-ox4yhv_1/SQLAlchemy-1.4.36-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Collecting greenlet!=0.4.17; python_version >= "3" and (platform_machine == "aarch64" or (platform_machine == "ppc64le" or (platform_machine == "x86_64" or (platform_machine == "amd64" or (platform_machine == "AMD64" or (platform_machine == "win32" or platform_machine == "WIN32"))))))
Downloading greenlet-1.1.2-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (147 kB)
Collecting importlib-metadata; python_version < "3.8"
Downloading importlib_metadata-4.8.3-py3-none-any.whl (17 kB)
Collecting zipp>=0.5
Downloading zipp-3.6.0-py3-none-any.whl (5.3 kB)
Collecting typing-extensions>=3.6.4; python_version < "3.8"
Downloading typing_extensions-4.1.1-py3-none-any.whl (26 kB)
Installing collected packages: greenlet, zipp, typing-extensions, importlib-metadata, SQLAlchemy
Successfully installed SQLAlchemy-1.4.36 greenlet-1.1.2 importlib-metadata-4.8.3 typing-extensions-4.1.1 zipp-3.6.0
Processing ./glue-python-libs-ox4yhv_1/PyMySQL-1.0.2-py3-none-any.whl
Installing collected packages: PyMySQL
Successfully installed PyMySQL-1.0.2
sqlalchemy 1.4.36 pymysql 1.0.2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Bob Haffner |