'pandas Groupby matrix of one condition based on the other condition bin by time
I have a Dataset like below that divided to two desired group by below condition
Employee No | Event date | Event Description | Quarter | Year |
---|---|---|---|---|
102 | 2021-10-12 | First Hire | Q4 | 21 |
103 | 2021-11-02 | First Hire | Q4 | 21 |
102 | 2022-01-01 | Terminated | Q1 | 22 |
102 | 2021-12-12 | Shift Change | Q4 | 21 |
101 | 2021-12-03 | First Hire | Q4 | 21 |
103 | 2021-11-05 | Terminated | Q4 | 21 |
101 | 2021-12-04 | Terminated | Q4 | 21 |
105 | 2022-02-26 | First Hire | Q1 | 22 |
106 | 2022-02-26 | First Hire | Q1 | 22 |
102 | 2022-03-29 | Second Hire | Q1 | 22 |
107 | 2021-05-04 | First Hire | Q2 | 21 |
108 | 2022-04-04 | First Hire | Q2 | 22 |
109 | 2022-03-03 | Terminated | Q1 | 22 |
109 | 2021-12-29 | First Hire | Q4 | 21 |
109 | 2022-04-01 | Second Hire | Q2 | 22 |
109 | 2022-01-10 | Shift Change | Q1 | 22 |
df = pd.DataFrame.from_dict(
{
'Employee No': [102,103,102,102,101,103,101,105,106,102,107,108,109,109,109,109],
'Event date': ['2021-10-12', '2021-11-02', '2022-01-01', '2021-12-12','2021-12-03','2021-11-05','2021-12-04','2022-02-26','2022-02-26','2022-03-29','2021-05-04','2022-04-04','2022-03-03','2021-12-29','2022-04-01','2022-01-10'],
'Event Description': ['First Hire', 'First Hire', 'Terminated', 'Shift Change','First Hire','Terminated ','Terminated','First Hire','First Hire',' Second Hire','First Hire','First Hire','Terminated','First Hire','Second Hire','Shift Change'],
'Quarter': ['Q4', 'Q4', 'Q1', 'Q4','Q4','Q4','Q4','Q1','Q1','Q1','Q2','Q2','Q1','Q4','Q2','Q1'],
'Year': ['21', '21', '22', '21','21','21','21','22','22','22','21','22','22','21','22','22']
}
)
# First hired
cond1 = df["EventDescription"].eq("FirstHire")
# Terminated later after first hired
cond2 = (
df["Event Description"].eq("Terminated")
& df["employee No"].isin(df.loc[cond1, "employee No"])
)
df[cond2]
I need to visualize as a table (Heatmap or Pivot Table ) distribution for Terminated by quarter and the year at vertical and first Hire at Horizontal by Quarter and the year. the value at each cell will be unique counts of Cond 2 based on cond 1 date Quarter and year . How we can use groupby or crosstab in pandas or any other solution ? Thank you in advance.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|