How to prevent the string NA from being interpreted as NaN when using load_dataset
I have checked for this at several points in my pipeline. I create a CSV file with pandas beforehand, and some cells contain the string NA, which is meant to be exactly that: a string with no numerical meaning. I think the right approach is to leave the CSV file as it is.
So if I have a CSV file containing NA, how can I get load_dataset to recognize it as the string NA rather than the Python None I am currently seeing? My code downstream breaks because of a few unexpected None values.
datasets.load_dataset('csv', split=['train'], data_files='data/my_datafile.csv')
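One approach that may work is sketched below. It assumes that the csv builder forwards extra keyword arguments to pandas.read_csv, and that keep_default_na is among them, as appears to be the case in recent versions of datasets. The helper name load_csv_keep_na and the constant NA_SAFE_CSV_KWARGS are hypothetical and used only for illustration.

```python
# Assumption: the "csv" builder of `datasets` passes these options through to
# pandas.read_csv. keep_default_na=False turns off pandas' built-in list of
# missing-value markers ("NA", "NaN", "null", ...), so "NA" stays a string.
NA_SAFE_CSV_KWARGS = {"keep_default_na": False}


def load_csv_keep_na(path, split="train"):
    """Hypothetical helper: load a CSV file while keeping "NA" as a plain string."""
    # Imported lazily so that this module can be imported without `datasets`.
    from datasets import load_dataset

    return load_dataset("csv", data_files=path, split=split, **NA_SAFE_CSV_KWARGS)
```

Calling load_csv_keep_na('data/my_datafile.csv') would then return a single Dataset, since split is the string 'train' rather than the list ['train'] used above. Note that with keep_default_na=False, empty cells are no longer treated as missing either; if that matters, additionally passing na_values=[''] is one option to try, again assuming the builder accepts it.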
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow