"Input contains NaN, infinity or a value too large for dtype('float64')"

I am trying to train a model, but I am getting this error:

Input contains NaN, infinity or a value too large for dtype('float64').

Here's part of my code. How can I fix this?

from sklearn.model_selection import train_test_split

a = clean_df.drop('AQI_calculated', axis = 1).values
b = clean_df.loc[:, 'AQI_calculated'].values


a_train, a_test, b_train, b_test = train_test_split(a, b, test_size = 0.3, random_state = 42)

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(a_train, b_train)

Solution 1:^[1]

You need to check whether your data contains NaN values. A model can't be trained if the input contains NaN, infinity, or a value too large for the dtype (as the error says).

To check, I recommend using this code:

df.isnull().any().any()  # True if the dataframe contains at least one NaN value

If you want to know in which column these NaN values are, you can do it this way:

df.isnull().any()
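Note that isnull() only flags NaN, while the error also covers infinity. A minimal sketch of counting both per column, assuming the columns are numeric; the small dataframe and its column names are made up here to stand in for the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real dataframe (column names are assumptions)
df = pd.DataFrame({
    "PM2.5": [12.0, np.nan, 30.5, 18.2],
    "NO2": [0.4, 0.7, np.inf, 0.2],
    "AQI_calculated": [50.0, 61.0, 80.0, 55.0],
})

# Only numeric columns can hold inf, so restrict the check to them
numeric = df.select_dtypes(include="number")

# Number of NaN values in each column (inf is not counted as NaN)
nan_per_column = numeric.isna().sum()

# Number of +inf / -inf values in each column
inf_per_column = pd.Series(
    np.isinf(numeric.to_numpy(dtype=float)).sum(axis=0),
    index=numeric.columns,
)

print(nan_per_column)
print(inf_per_column)
```

Any column with a non-zero count in either result is a candidate for cleaning before calling fit.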

Once you know where the NaN values are, you have to deal with them. You can simply remove, fill, or replace them, as @kelvt suggests in the comment.
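The remove and fill options could look roughly like this. This is only a sketch under assumptions: the dataframe below is invented to stand in for clean_df from the question, and filling with the column median is just one possible choice, not necessarily the right one for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for clean_df from the question
clean_df = pd.DataFrame({
    "PM2.5": [12.0, np.nan, 30.5, 18.2, 25.0, 9.0],
    "NO2": [0.4, 0.7, np.inf, 0.2, 0.5, 0.3],
    "AQI_calculated": [50.0, 61.0, 80.0, 55.0, 70.0, 45.0],
})

# Treat +inf / -inf as missing so that a single strategy handles both cases
clean_df = clean_df.replace([np.inf, -np.inf], np.nan)

# Option A: remove every row that has at least one missing value
dropped = clean_df.dropna()

# Option B: keep all rows and fill the gaps with each column's median
filled = clean_df.fillna(clean_df.median(numeric_only=True))

# Build the feature matrix and target the same way as in the question
a = filled.drop("AQI_calculated", axis=1).values
b = filled["AQI_calculated"].values

# Sanity check: everything passed to the model is now finite
print(np.isfinite(a).all() and np.isfinite(b).all())
```

After either option, the train_test_split and model.fit calls from the question can be run unchanged on the cleaned data. Dropping rows is the safer default when only a few rows are affected; filling keeps more data but adds values that were never measured.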

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	Alex Serra Marrugat
