How to turn the items in a column into multiple columns?
This is what I am trying to do. Currently, I load my df like this, and the data looks like this:
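import pandas as pd
col_names = ['movie_id', 'name', 'genre']
# A multi-character separator such as '::' is handled by the Python parsing engine; setting it explicitly avoids a ParserWarning.
df = pd.read_csv('/content/drive/MyDrive/testing/ml-1m/movies.csv', sep='::', names=col_names, encoding='latin-1', engine='python')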
1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
8::Tom and Huck (1995)::Adventure|Children's
9::Sudden Death (1995)::Action
10::GoldenEye (1995)::Action|Adventure|Thriller
The columns are movie_id, name, and genre.
I want it to look like this.
The columns would be movie_id, name, Action, Adventure, Animation, Children's, Comedy, Crime, Documentary, Drama, Fantasy, Film-Noir, Horror, Musical, Mystery, Romance, Sci-Fi, Thriller, War, Western
1::Toy Story (1995)::0::0::1::1::1::0::0::0::0::0::0::0::0::0::0::0::0::0::0
2::Jumanji (1995)::0::1::0::1::0::0::0::0::1::0::0::0::0::0::0::0::0::0::0
...
Basically, I want to turn the genre column into multiple columns, one per genre, with a 1 wherever the movie belongs to that genre and a 0 otherwise.
Is there any way to do this with pandas?
Solution 1:[1]
If I understand correctly, you can do it like this:
res = (
    df
    .join(df['genre'].str.get_dummies())  # splits on '|' by default; one 0/1 column per genre
    .drop('genre', axis=1)                # remove the original combined column
)
print(res)
   movie_id                                name  Action  Adventure  Animation  Children's  Comedy  Crime  Drama  Fantasy  Romance  Thriller
0         1                    Toy Story (1995)       0          0          1           1       1      0      0        0        0         0
1         2                      Jumanji (1995)       0          1          0           1       0      0      0        1        0         0
2         3             Grumpier Old Men (1995)       0          0          0           0       1      0      0        0        1         0
3         4            Waiting to Exhale (1995)       0          0          0           0       1      0      1        0        0         0
4         5  Father of the Bride Part II (1995)       0          0          0           0       1      0      0        0        0         0
5         6                         Heat (1995)       1          0          0           0       0      1      0        0        0         1
6         7                      Sabrina (1995)       0          0          0           0       1      0      0        0        1         0
7         8                 Tom and Huck (1995)       0          1          0           1       0      0      0        0        0         0
8         9                 Sudden Death (1995)       1          0          0           0       0      0      0        0        0         0
9        10                    GoldenEye (1995)       1          1          0           0       0      0      0        0        0         1
If you want to keep the original genre column, just leave off the final .drop('genre', axis=1) call.
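One thing to watch for: str.get_dummies only creates columns for genres that actually occur in the data, which is why the output above has no Documentary or Western column. If you need every genre from the question's list to appear, one possible approach is to reindex the dummy columns against a fixed list. The following is a minimal sketch under a few assumptions: the three sample rows are copied from the question, the column names follow the ones used in this answer, and ALL_GENRES is a hypothetical name for the genre list given in the question.

```python
import pandas as pd

# Three sample rows copied from the question (an in-memory stand-in for movies.csv).
df = pd.DataFrame({
    "movie_id": [1, 2, 3],
    "name": ["Toy Story (1995)", "Jumanji (1995)", "Grumpier Old Men (1995)"],
    "genre": [
        "Animation|Children's|Comedy",
        "Adventure|Children's|Fantasy",
        "Comedy|Romance",
    ],
})

# Hypothetical constant: the full genre list as written in the question.
ALL_GENRES = [
    "Action", "Adventure", "Animation", "Children's", "Comedy", "Crime",
    "Documentary", "Drama", "Fantasy", "Film-Noir", "Horror", "Musical",
    "Mystery", "Romance", "Sci-Fi", "Thriller", "War", "Western",
]

dummies = (
    df["genre"]
    .str.get_dummies(sep="|")                    # one 0/1 column per genre present
    .reindex(columns=ALL_GENRES, fill_value=0)   # add absent genres as all-zero columns
)

# Replace the combined genre column with the indicator columns.
res = df.drop(columns="genre").join(dummies)
print(res)
```

Reindexing also fixes the column order, so the result has the same layout no matter which genres happen to appear in a given file.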
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow