Polynomial regression isn't accepting my data because it treats dates as strings

I made this program for a school project. It works fine, but my data should be in the form of dates, and every time I use dates as the variable it prompts me with an error saying (can't float string "2022-05-16").

Thanks in advance.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import datetime 

dataset = pd.read_csv('/content/Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values

dataset

"""from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)"""

"""from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)
X_test = sc_X.transform(X_test)"""

from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(X, y)



Solution 1:[1]

To get the datetime column as a datetime-dtype rather than a string, you could use the parse_dates argument in pandas.read_csv:

dataset = pd.read_csv('/content/Position_Salaries.csv', parse_dates=...)
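As a minimal sketch of the parse_dates approach: the in-memory CSV and the column names "Date" and "Salary" below are assumptions made for illustration, not the contents of the asker's file.

```python
import io
import pandas as pd

# Hypothetical stand-in for the CSV file; column names are assumed.
csv_text = "Date,Salary\n2022-05-16,45000\n2022-05-17,50000\n2022-05-18,60000\n"

# parse_dates takes a list of column names (or positions) to parse as dates.
dataset = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# The column now has a datetime dtype instead of object (string).
print(dataset["Date"].dtype)
```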

Or you could convert the datetime column to a datetime data type later using pandas.to_datetime:

dataset[date_col] = pd.to_datetime(dataset[date_col])
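A small sketch of the same conversion on made-up data, where the column name "Date" is an assumption. Passing an explicit format and errors="coerce" is optional; with it, values that don't match become NaT instead of raising.

```python
import pandas as pd

# Hypothetical data; the last entry is deliberately not a valid date.
dataset = pd.DataFrame({"Date": ["2022-05-16", "2022-05-17", "not a date"]})

# Convert the string column to datetime; unparseable values become NaT.
dataset["Date"] = pd.to_datetime(dataset["Date"], format="%Y-%m-%d", errors="coerce")

print(dataset["Date"])
```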

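Afterwards, since the regression needs numeric features, you might want to extract numeric date components (for example the month, the day of the year, or the number of days since a reference date) using the .dt accessor methods.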
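One possible way to turn the dates into a numeric feature is sketched below. The sample dates are made up, and using "days since the earliest date" as the feature is an assumption about what suits the data, not something the answer prescribes.

```python
import pandas as pd

# Hypothetical dates, already converted to a datetime dtype.
dates = pd.to_datetime(pd.Series(["2022-05-16", "2022-05-17", "2022-05-20"]))

# Days elapsed since the earliest date in the column.
days_since_start = (dates - dates.min()).dt.days

# Individual components are also available through the .dt accessor.
month = dates.dt.month
day_of_year = dates.dt.dayofyear

# scikit-learn estimators expect a 2-D numeric array of shape (n_samples, n_features).
X = days_since_start.to_numpy(dtype=float).reshape(-1, 1)
print(X)
```

An array like X can then be passed to lin_reg.fit(X, y), optionally after expanding it with sklearn.preprocessing.PolynomialFeatures for a polynomial fit.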

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Tim