'Table metadata transfer from sqlserver to postgresql using SQLAlchemy

I am trying to migrate my database from ms sql server to PostgreSQL using python script . Before migrating the data, script needs to create required tables.

I intend to use sqlalchemy to create required tables and then migrate the actual data. Below is the sample code. While creating table in pgsql , script is failing as there are no datatype like tinyint in pgsql. I though sqlalchemy abstracts these data types.

Any suggestions and best practices for this kind of usecase will be of great help

from sqlalchemy import create_engine, MetaData, select, func, Table
import pandas as pd
engine_pg = create_engine('postgresql://XXXX:[email protected]:5432/pgschema')
engine_ms = create_engine('mssql+pyodbc://XX:[email protected]/msqlschema?driver=SQL+Server')
ms_metadata = MetaData(bind=engine_ms)
pg_metadata = MetaData(bind=engine_pg)

#extract Node table object from mssql using ms_metadat and engine_ms
Node = Table('Node', ms_metadata, autoload_with=engine_ms)

#create Node table in pgsql using the Node table object
Node.create(bind=engine_pg)


Solution 1:[1]

While I have not done the ms sql to postgreSQL path I have done some other (small to tiny) migrations and have some minor experience with both databases you are looking at. The solution to your specific problem is probably best done through a mapping functionality. There is a library that I have looked at but never gotten around to using which contain such mappings: https://pgloader.readthedocs.io/en/latest/ref/mssql.html?highlight=tinyint%20#default-ms-sql-casting-rules

Since a data migration is usually done just once, I would recommend making use of an existing tool. SQLAlchemy is not really such a tool from my understanding but could potentially be turned into one with some effort.

Regarding your question about SQLAlchemy abstracting the data I would not hold this situation against SQLAlchemy. Tinyint is a 1 byte data type. There is no such data type available in postgreSQL which makes a direct mapping impossible. Hence the mapping ound in pgloader (linked above). https://docs.microsoft.com/en-us/sql/t-sql/data-types/int-bigint-smallint-and-tinyint-transact-sql?view=sql-server-ver15

https://www.postgresql.org/docs/9.1/datatype-numeric.html

Finally some thoughts on meta information available here. It seems like you are offering a bounty to this 6 months after you posted the original question which is interesting as it is either a huge project or one you don't allocate a lot of time to. Either way I urge you to use an existing tool rather than trying to make something work beyond its intended usage. Another thing is the inclusion of the pandas import. If you are thinking of using pandas for the data transfer I want to caution you on the fact that pandas is very forgiving with data formats. This might not be a problem for you, but a more controlled data pipeline would probably be less error prone.

Given the previous paragraph I'd like some more info on the overall situation before pointing you in the right direction. Database migration can have other unforeseen consequences as well so I don't want to give the impression that the solution to your overall problem is a quick fix as simple as a tinyint to smallint mapping.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 NacMacFeegle