Python, Pandas and intersection - not PIVOT
This isn't a straightforward pivot question. I don't want to create new named columns (or numbered ones).
What I am looking for is a way to find the actors that satisfy a given condition, such as:
Any ACTOR that has UP and DOWN
in the table below.
Actors | Events |
---|---|
A | Up |
B | Up |
C | Left |
A | Down |
D | Left |
C | Down |
C | Up |
The expected answer should be
[A,C]
So I want to search for more than one event using just the Events column, and return the actor(s) that have all of them.
I have a set like this:
Actors | Events |
---|---|
A | Up |
B | Up |
C | Left |
A | Down |
D | Left |
C | Down |
C | Up |
I want to find all actors whose events include a given combination (an intersection of events). Like:
Any ACTOR that has UP and DOWN
So this should return A and C,
and any other actor that has both.
One solution was to "explode" the data into new data frames by event and merge them all by actor, creating a new data frame like:
Actors | Event01 | Event02 | Event03 |
---|---|---|---|
A | Up | Down | |
B | Up | | |
C | Up | Down | Left |
D | Left | | |
or even a dictionary with lists, like:
{'A':['Up', 'Down'], 'B': ... }
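To make the two workarounds concrete, here is a minimal sketch of how they might look in pandas. The variable names (`up`, `down`, `events_by_actor` and so on) are only illustrative, and the merge version assumes just two events of interest; more events would need one frame and one merge per event.

```python
import pandas as pd

# Sample data from the table above.
df = pd.DataFrame({
    "Actors": ["A", "B", "C", "A", "D", "C", "C"],
    "Events": ["Up", "Up", "Left", "Down", "Left", "Down", "Up"],
})

# Workaround 1: one frame per event, then merge the frames on Actors.
up = df.loc[df["Events"] == "Up", ["Actors"]].drop_duplicates()
down = df.loc[df["Events"] == "Down", ["Actors"]].drop_duplicates()
merged = up.merge(down, on="Actors")  # default inner join keeps actors found in both
actors_via_merge = sorted(merged["Actors"])

# Workaround 2: a dictionary mapping each actor to the list of its events.
events_by_actor = df.groupby("Actors")["Events"].agg(list).to_dict()
actors_via_dict = sorted(
    actor for actor, events in events_by_actor.items()
    if {"Up", "Down"} <= set(events)  # subset test: actor has both events
)

print(actors_via_merge)  # ['A', 'C']
print(actors_via_dict)   # ['A', 'C']
```

Both give the expected answer on the sample data, but each one builds an intermediate structure that holds a reshaped copy of the data.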
But these solutions don't look very smart, since exploding and rebuilding the data costs time, memory and processing.
Plus, the big new DataFrame would need new columns with new names for the events (not just a single Events column), which could be a little stressful to manage.
Does anyone have a better solution?
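One possible direction, offered as a minimal sketch rather than a definitive answer: filter the rows down to the events of interest, then count the distinct events per actor with `groupby` and `nunique`. No new columns are created. The names `wanted`, `hits` and `result` are hypothetical, and the sketch assumes the event labels match exactly (same case, no stray whitespace).

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Actors": ["A", "B", "C", "A", "D", "C", "C"],
    "Events": ["Up", "Up", "Left", "Down", "Left", "Down", "Up"],
})

wanted = {"Up", "Down"}  # events every returned actor must have

# Keep only rows with a wanted event, then count distinct wanted events per actor.
hits = df[df["Events"].isin(wanted)].groupby("Actors")["Events"].nunique()

# An actor qualifies when it has every wanted event.
result = hits[hits == len(wanted)].index.tolist()

print(result)  # ['A', 'C']
```

Using `nunique` rather than a plain count means a repeated event for the same actor is not counted twice, and changing `wanted` is enough to ask for a different combination of events.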
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow