'Transform a dataframe using pivot
I am trying to transform a dataframe using pivot
. Since the column contains duplicate entries, i tried to add a count
column following what's suggested here (Question 10 posted in this answer).
import pandas as pd
from pprint import pprint
if __name__ == '__main__':
d = {
't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
}
df = pd.DataFrame(d)
df = df.drop('t', axis=1)
df.insert(0, 'count', df.groupby('input').cumcount())
pd.pivot(df, index='count', columns='type', values='value')
But I still get the same error raise ValueError("Index contains duplicate entries, cannot reshape") ValueError: Index contains duplicate entries, cannot reshape
.
Could someone please suggest how to resolve this error?
Solution 1:[1]
As far as you have more then one value associated with 'A' and 'B' you have to aggregate values somehow.
So if I've understood your issue right possible solution is the following:
#pip install pandas
import pandas as pd
d = {
't': [0, 1, 2, 0, 1, 2, 0, 2, 0, 1],
'input': [2, 2, 2, 2, 2, 2, 4, 4, 4, 4],
'type': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'B', 'B'],
'value': [0.1, 0.2, 0.3, 1, 2, 3, 1, 2, 1, 1],
}
df = pd.DataFrame(d)
df
# I've used aggfunc='sum' argument for example, the default value is 'mean'
pd.pivot_table(df, index='t', columns='type', values='value', aggfunc='sum')
Returns
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |