How to add new column to pandas group? Pandas forgets the column
I have a pandas DataFrame indexed by createdAt and grouped by pid (participant identifiers). The createdAt values are Unix timestamps. Now I would like to add a new column to each group with a string representation of the day and month, so that I can group again by day.
But pandas seems to forget about the new column I have added.
So I have this:
from datetime import datetime

def to_daymonth(timestamp: int):
    # Convert a Unix timestamp to a string like '01 May'
    datetime_obj = datetime.fromtimestamp(timestamp)
    return datetime_obj.strftime('%d %b')
for pid, group in bypid:
    group['date'] = group.index.map(to_daymonth)
    print(group.date)  # Inside the for loop this prints the new column like 01 May etc.

# But outside of the for loop
print(bypid.get_group('12')['date'])  # KeyError: 'date'
print(bypid.get_group('12').date)  # AttributeError: 'DataFrame' object has no attribute 'date'
It seems to me that pandas is forgetting I added the date column? What am I missing here?
This is what I would like to do after it remembers the date column.
import matplotlib.pyplot as plt

for pid, group in bypid:
    plt.figure()
    plt.title(pid)
    plt.plot((0, 1), (0.5, 0.5))  # Lines for x and y in the middle
    plt.plot((0.5, 0.5), (0, 1))
    for date, dategroup in group.groupby('date'):
        plt.scatter(dategroup.euro, dategroup.dollar, label=date)
    plt.legend(loc='best')
    plt.show()
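A likely explanation, offered as a sketch rather than a definitive answer: each `group` produced by iterating over a groupby is a separate DataFrame built for that iteration, so a column assigned to it never reaches the data the groupby object was created from, and `get_group` builds its result from that original data again. One way around this is to add the column to the ungrouped frame first and group afterwards. In the example below, the name `df`, the sample values, and the use of UTC are assumptions made for illustration; they do not come from the question.

```python
from datetime import datetime, timezone

import pandas as pd


def to_daymonth(timestamp: int) -> str:
    # UTC is an assumption so the result does not depend on the local timezone;
    # the snippet in the question uses local time instead.
    dt = datetime.fromtimestamp(int(timestamp), tz=timezone.utc)
    return dt.strftime('%d %b')


# Hypothetical stand-in for the real data, indexed by createdAt (Unix seconds).
df = pd.DataFrame(
    {
        'pid': ['12', '12', '34'],
        'euro': [0.2, 0.7, 0.5],
        'dollar': [0.3, 0.6, 0.4],
    },
    index=pd.Index([1525132800, 1525219200, 1525132800], name='createdAt'),
)

# Add the column to the original frame first, then group.
df['date'] = df.index.map(to_daymonth)
bypid = df.groupby('pid')

# The column is now visible in every group, inside or outside a loop.
print(bypid.get_group('12')['date'].tolist())
```

With the column on the original frame, the plotting loop can call `group.groupby('date')` unchanged.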
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow