How to convert the values of an attribute having categorical values to integer type?
I have a dataset in which one of the columns is Ex-Showroom_Price, and I'm trying to convert its values to integers, but I'm getting an error.
import pandas as pd
#reading the dataset
cars = pd.read_csv('cars_engage_2022.csv')
cars["Ex-Showroom_Price"] = int(cars["Ex-Showroom_Price"] .split()[-1].replace(',',''))
Error:
TypeError Traceback (most recent call last)
<ipython-input-40-d65bfedf76a4> in <module>
----> 1 cars["Ex-Showroom_Price"] = int(cars["Ex-Showroom_Price"] .split()[-1].replace(',',''))
TypeError: 'int' object is not subscriptable
Values of Ex-Showroom_Price:
Rs. 2,92,667
Rs. 2,36,447
Rs. 2,96,661
Rs. 3,34,768
Rs. 2,72,223
:
Solution 1:[1]
First, split each string into a list.
df["cars-list"] = df['Ex-Showroom_Price'].str.split()
Then take the last element of each list and remove the commas (',').
df["cars-int"] = df["cars-list"].apply(lambda x: x[-1].replace(',','') )
Then convert to int.
df["cars-int"] = df["cars-int"].astype(int)
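The three steps above can also be chained into a single expression with the .str accessor, without the intermediate columns. This is a minimal sketch, assuming every value looks like the samples in the question; the small cars frame is built by hand here purely for illustration.

```python
import pandas as pd

# Sample values copied from the question; a real dataset would come from read_csv.
cars = pd.DataFrame(
    {"Ex-Showroom_Price": ["Rs. 2,92,667", "Rs. 2,36,447", "Rs. 2,96,661"]}
)

cars["Ex-Showroom_Price"] = (
    cars["Ex-Showroom_Price"]
    .str.split()                        # "Rs. 2,92,667" -> ["Rs.", "2,92,667"]
    .str[-1]                            # keep the last token, "2,92,667"
    .str.replace(",", "", regex=False)  # drop the separators -> "292667"
    .astype(int)
)

print(cars["Ex-Showroom_Price"].tolist())  # [292667, 236447, 296661]
```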
Solution 2:[2]
You are trying to use str methods on a whole column of data rather than on a single string. Assuming your cars is a DataFrame, you need methods that operate on each cell. The .str accessor comes in handy for DataFrame columns:
import pandas as pd

data = ["Rs. 2,92,667", "Rs. 2,36,447", "Rs. 2,96,661", "Rs. 3,34,768", "Rs. 2,72,223"]
cars = pd.DataFrame(data, columns=['Ex-Showroom_Price'])
# regex=True is required on recent pandas versions, where str.replace defaults to literal matching
cars["Ex-Showroom_Price"] = cars["Ex-Showroom_Price"].str.replace(r'.* ([\d,]+)$', r'\1', regex=True).str.replace(',', '').astype('int32')
I've used a regular expression here and kept your ',' substitution for simplicity, but you could merge the two into one.
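One way to merge the two substitutions, sketched here under the assumption that the prices are always non-negative whole numbers: strip every non-digit character in a single pass. Note that this would also remove a decimal point or a minus sign, so it only suits data shaped like the samples above.

```python
import pandas as pd

data = ["Rs. 2,92,667", "Rs. 2,36,447", "Rs. 2,96,661", "Rs. 3,34,768", "Rs. 2,72,223"]
cars = pd.DataFrame(data, columns=["Ex-Showroom_Price"])

# \D matches any non-digit, so one pass removes "Rs.", the space and the commas.
cars["Ex-Showroom_Price"] = (
    cars["Ex-Showroom_Price"].str.replace(r"\D", "", regex=True).astype("int32")
)

print(cars["Ex-Showroom_Price"].tolist())  # [292667, 236447, 296661, 334768, 272223]
```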
Note: the above code runs well, but as @martineau pointed out, the error you are getting seems to be related to the format of your data. Please ensure your data conforms to the format I'm assuming here, or expand your question with further details.
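A quick way to check that assumption before converting is to list the rows that do not match the expected pattern. The sketch below uses a hypothetical series in which the last two entries are deliberately malformed; they are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical sample: the last two entries deliberately break the assumed format.
prices = pd.Series(["Rs. 2,92,667", "Rs. 2,36,447", None, "Price on request"])

# True where the whole value looks like "Rs. <digits and commas>";
# na=False makes missing values count as non-conforming.
conforms = prices.str.fullmatch(r"Rs\. [\d,]+", na=False)

# Rows that would break the integer conversion.
print(prices[~conforms])
```

If this prints any rows, those values need to be cleaned or dropped before calling astype.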
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | ALai |