How to name the column when using the value_counts function in pandas?
I was counting the number of occurrences of each angle and distance combination with the code below:
g = new_df.value_counts(subset=['Current_Angle','Current_dist'] ,sort = False)
the output:
current_angle current_dist 0
-50 30 1
-50 40 2
-50 41 6
-50 45 4
Try 1: g.columns = ['angle','Distance','count','Percentage Missed']
- Result: the column names did not change.
Try 2:
Printing the columns with print(g.columns) ended with the error AttributeError: 'Series' object has no attribute 'columns'
I want to rename column 0 to count and add a new column to g called percent missed,
calculated as 100 minus the value in column 0.
Expected output
current_angle current_dist count percent missed
-50 30 1 99
-50 40 2 98
-50 41 6 94
-50 45 4 96
1. How should I modify the code? Is there a function other than value_counts that gives the expected output?
2. How can I get the expected output with the current method?
EDIT 1 (exceptional case)
data:
angle | distance | velocity |
---|---|---|
0 | 124 | -3 |
50 | 24 | -25 |
50 | 34 | 25 |
expected output:
count is calculated based on distance
angle | distance | velocity | count | percent missed |
---|---|---|---|---|
0 | 124 | -3 | 1 | 99 |
50 | 24 | -25 | 1 | 99 |
50 | 34 | 25 | 1 | 99 |
Solution 1:[1]
First add Series.reset_index, because DataFrame.value_counts returns a Series. You can then use its name parameter to rename column 0 to count, and create the new column with Series.rsub, which subtracts from the right side, i.e. 100 - df['count']:
df = (new_df.value_counts(subset=['Current_Angle','Current_dist'] ,sort = False)
.reset_index(name='count')
.assign(**{'percent missed': lambda x: x['count'].rsub(100)}))
Or, if you also need to set new column names, use DataFrame.set_axis:
df = (new_df.value_counts(subset=['Current_Angle','Current_dist'] ,sort = False)
.reset_index(name='count')
.set_axis(['angle','Distance','count'], axis=1)
.assign(**{'percent missed': lambda x: x['count'].rsub(100)}))
If you need to assign new column names, here is an alternative solution:
df = (new_df.value_counts(subset=['Current_Angle','Current_dist'] ,sort = False)
.reset_index())
df.columns = ['angle','Distance','count']
df['percent missed'] = df['count'].rsub(100)
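To try Solution 1 end to end, here is a minimal self-contained sketch. The sample rows are hypothetical, chosen only so that the counts match the output shown in the question; the method chain itself is the one from the first snippet above.

```python
import pandas as pd

# Hypothetical sample data: each (angle, dist) pair is repeated so that the
# counts come out as 1, 2, 6 and 4, like the output in the question.
rows = [(-50, 30)] * 1 + [(-50, 40)] * 2 + [(-50, 41)] * 6 + [(-50, 45)] * 4
new_df = pd.DataFrame(rows, columns=["Current_Angle", "Current_dist"])

# DataFrame.value_counts returns a Series whose index holds the two key
# columns, which is why g.columns raises AttributeError.
g = new_df.value_counts(subset=["Current_Angle", "Current_dist"], sort=False)

df = (
    g.reset_index(name="count")  # index levels become columns, values become 'count'
     .assign(**{"percent missed": lambda x: x["count"].rsub(100)})  # 100 - count
)
print(df)
```

The `**{...}` form of assign is only needed because the new column name contains a space; with a name such as percent_missed a plain keyword argument would do.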
Solution 2:[2]
Assuming a DataFrame as input (if not, call reset_index first), simply use rename and a subtraction:
df = df.rename(columns={'0': 'count'}) # assuming string '0' here, else use 0
df['percent missed'] = 100 - df['count']
output:
current_angle current_dist count percent missed
0 -50 30 1 99
1 -50 40 2 98
2 -50 41 6 94
3 -50 45 4 96
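One caveat with the rename approach: the label of the counts column after reset_index depends on the pandas version. Older releases label it with the integer 0, while pandas 2.0 and later already name it count. A hedged sketch that avoids hard-coding the label, using hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
new_df = pd.DataFrame(
    {
        "Current_Angle": [-50, -50, -50],
        "Current_dist": [30, 40, 40],
    }
)

g = new_df.value_counts(subset=["Current_Angle", "Current_dist"], sort=False)

# Turn the Series into a DataFrame; the counts end up in the last column.
df = g.reset_index()

# The last column is labelled 0 on older pandas and 'count' on pandas >= 2.0,
# so rename whatever label is there instead of assuming one.
counts_label = df.columns[-1]
df = df.rename(columns={counts_label: "count"})

df["percent missed"] = 100 - df["count"]
print(df)
```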
Alternative: using groupby.size:
(new_df
.groupby(['current_angle','current_dist']).size()
.reset_index(name='count')
.assign(**{'percent missed': lambda d: 100-d['count']})
)
output:
current_angle current_dist count percent missed
0 -50 30 1 99
1 -50 40 2 98
2 -50 41 6 94
3 -50 45 4 96
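Neither approach above keeps extra columns such as velocity from the EDIT 1 case, because both collapse the data to one row per group. One possible way to handle it, offered as a sketch rather than as part of the original answers, is groupby(...).transform('size'), which returns one count per original row. It assumes the count is taken per distance value, as the question states, and uses the three rows from EDIT 1:

```python
import pandas as pd

# The three rows from EDIT 1 in the question.
df = pd.DataFrame(
    {
        "angle": [0, 50, 50],
        "distance": [124, 24, 34],
        "velocity": [-3, -25, 25],
    }
)

# transform('size') broadcasts each group's size back to its rows,
# so every original row (and the velocity column) is preserved.
df["count"] = df.groupby("distance")["distance"].transform("size")
df["percent missed"] = 100 - df["count"]
print(df)
```

If the count should instead be taken per angle and distance pair, passing both column names to groupby should give that variant.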
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow