'How do I convert numpy NaN objects to SQL nulls?
I have a Pandas dataframe that I'm inserting into an SQL database. I'm using Psycopg2 directly to talk to the database, not SQLAlchemy, so I can't use Pandas built in to_sql functions. Almost everything works as expected except for the fact that numpy np.NaN values get converted to text as NaN and inserted into the database. They really should be treated as SQL null values.
So, I'm trying to make a custom adapter to convert np.NaN to SQL null but everything I've tried results in the same NaN strings being inserted in the database.
The code I'm currently trying is:
def adapt_nans(null):
a = adapt(None).getquoted()
return AsIs(a)
register_adapter(np.NaN, adapt_nans)
I've tried a number of variations along this theme but haven't had any luck.
Solution 1:[1]
The code I was trying previously fails because it assumes that np.Nan is its own type when it is actually a float. The following code, courtesy of Daniele Varrazzo on the psycopg2 mailing list, does the job correctly.
def nan_to_null(f,
_NULL=psycopg2.extensions.AsIs('NULL'),
_Float=psycopg2.extensions.Float):
if not np.isnan(f):
return _Float(f)
return _NULL
psycopg2.extensions.register_adapter(float, nan_to_null)
Solution 2:[2]
This answer is an alternate version of Gregory Arenius's Answer. I have replaced the conditional statement to work on any Nan value by simply checking if the value is equal to itself.
def nan_to_null(f,
_NULL=psycopg2.extensions.AsIs('NULL')
_Float=psycopg2.extensions.Float)):
if f != f:
return _NULL
else:
return _Float(f)
psycopg2.extensions.register_adapter(float, nan_to_null)
If you check if a nan value is equal to itself you will get False. The rational behind why this works is explained in detail in Stephen Canon's answer.
Solution 3:[3]
I believe the easiest way is:
df.where(pd.notnull(df), None)
Then None
is "translated": to NULL
when imported to Postgres.
Solution 4:[4]
If you are trying to insert Pandas dataframe data into PostgreSQL and getting the error for NaN
, all you have to do is:
import psycopg2
output_df = output_df.fillna(psycopg2.extensions.AsIs('NULL'))
#Now insert output_df data in the table
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | A Merii |
Solution 2 | A Merii |
Solution 3 | Yannis |
Solution 4 | Jeremy Caney |