Convert a float column with nan to int pandas
I am trying to convert a float pandas column with NaNs to int format, using apply.
I would like to use something like this:
df.col = df.col.apply(to_integer)
where the function to_integer is given by
def to_integer(x):
    if np.isnan(x):
        return np.nan
    else:
        return int(x)
However, when I attempt to apply it, the column remains the same.
How could I achieve this without having to use the standard technique of dtypes?
Solution 1:[1]
You can't have NaN in an int column: NaN is a float (unless you use an object dtype, which is not a good idea since you'll lose most vectorized operations).
You can however use the new nullable integer type, which represents missing values as NA.
Conversion can be done with convert_dtypes:
import pandas as pd

df = pd.DataFrame({'col': [1, 2, None]})
df = df.convert_dtypes()
# type(df.at[0, 'col'])
# numpy.int64
# type(df.at[2, 'col'])
# pandas._libs.missing.NAType
output:
    col
0     1
1     2
2  <NA>
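To see why the apply approach from the question leaves the column unchanged, here is a minimal sketch. The sample values are made up for illustration. The function does return Python ints, but because the result also contains NaN, pandas stores the whole result as float64 again.

```python
import numpy as np
import pandas as pd

def to_integer(x):
    # Same logic as in the question, using np.nan
    return np.nan if np.isnan(x) else int(x)

# Hypothetical sample data: a float column with one missing value
s = pd.Series([1.0, 2.0, np.nan])

result = s.apply(to_integer)

# The ints are upcast back to float so they can sit next to NaN
print(result.dtype)
print(result.equals(s))
```

The printed dtype is float64, and the result compares equal to the input, which matches the "column remains the same" behavior described in the question.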
Solution 2:[2]
Not sure how you would achieve this without using dtypes. Sometimes when loading in data, you may have a column that contains mixed dtypes. Loading in a column with one dtype and attempting to turn it into mixed dtypes is not possible though (at least, not that I know of).
So I will echo what @mozway said and suggest you use nullable integer data types, e.g.
df['col'] = df['col'].astype('Int64')
(note the capital I)
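As a rough end-to-end sketch of this suggestion, assuming a float column that holds only whole numbers and NaN (the sample data is invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical float column with a missing value
df = pd.DataFrame({'col': [1.0, 2.0, np.nan]})

# 'Int64' (capital I) is the nullable integer dtype;
# lowercase 'int64' cannot hold missing values
df['col'] = df['col'].astype('Int64')

print(df['col'].dtype)
print(df['col'].isna().tolist())
```

The NaN becomes NA and the other values become integers. If the column contained fractional values such as 1.5, the cast would not be safe, so you would likely need to round or truncate first.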
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | skeuomorph |