How to create a dummy variable for specific values in a column?
I want to create a dummy variable for a specific value in a column. Let's say my DataFrame looks like this:
I want a dummy variable just for the museums.
pd.get_dummies(df, columns=['Buildings'])
gives me dummies for "cinema", "school" and "university" as well. In practice, I could drop the new columns for "cinema", "school" and "university" and keep only the one for "museum", but what if the "Buildings" column takes a lot of values (more than 100)? What is the correct syntax to select only one specific value and create a dummy variable for it?
Solution 1:[1]
If you need only one column, the simplest way is to create it manually by comparing against the value and casting the resulting booleans to integers:
df['museum'] = df['Buildings'].eq('museum').astype(int)
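As a minimal, self-contained sketch of the line above (the sample values, including the repeated "museum", are made up for illustration and are not from the question):

```python
import pandas as pd

# Hypothetical sample data; the real column may hold 100+ distinct values.
df = pd.DataFrame(
    {'Buildings': ['museum', 'cinema', 'school', 'university', 'museum']}
)

# eq() returns a boolean Series; astype(int) converts True/False to 1/0.
df['museum'] = df['Buildings'].eq('museum').astype(int)

print(df)
```

Rows whose value is anything other than "museum", including missing values, get 0, so no other dummy columns are ever created.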
With your approach, it is possible to replace the non-museum values with missing values; pd.get_dummies then omits the missing values:
import pandas as pd

df = pd.DataFrame({'Buildings': ['museum', 'cinema', 'school']})
print (df)
Buildings
0 museum
1 cinema
2 school
df1 = df.assign(Buildings = df['Buildings'].where(df['Buildings'].eq('museum')))
# dtype=int keeps the 1/0 output; pandas 2.0+ returns True/False by default
out = pd.get_dummies(df1, columns=['Buildings'], dtype=int)
print (out)
Buildings_museum
0 1
1 0
2 0
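The same masking idea might be extended to a handful of values rather than one. The sketch below is only an illustration under assumptions: the `wanted` list and the "library" value are hypothetical, and `isin` plus `reindex` are swapped in to handle several values and to guarantee a column for each of them.

```python
import pandas as pd

# Hypothetical sample data and selection; neither comes from the question.
df = pd.DataFrame({'Buildings': ['museum', 'cinema', 'school', 'library']})
wanted = ['museum', 'library']

# Keep only the wanted values; everything else becomes NaN,
# and get_dummies creates no column for NaN by default.
kept = df['Buildings'].where(df['Buildings'].isin(wanted))
out = pd.get_dummies(kept, prefix='Buildings', dtype=int)

# Ensure one column per wanted value, even if it never occurs in the data.
out = out.reindex(columns=[f'Buildings_{v}' for v in wanted], fill_value=0)

print(out)
```

The `reindex` step is optional; it only matters when a wanted value might be absent from the column, in which case it yields an all-zero dummy instead of a missing column.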
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |