I'm getting a different output than expected when using df.loc to change some values of the DataFrame
I have a DataFrame, and I want to assign a quartile number to each row based on the quartile variable, which holds the cut points that I later use in the for loop. The problem is that instead of just changing the quartile number of the existing rows, it is creating n new rows (n being the length of the DataFrame) and using the loop counter as the row label.
quartile = numpy.quantile(pivot['AHT'], [0.25,0.5,0.75])
pivot['Quartile'] = 0
for i in range(0,len(pivot)-1):
    if i <= quartile[0]:
        pivot.loc[i,'Quartile'] = 1
    elif i <= quartile[1]:
        pivot.loc[i,'Quartile'] = 2
    elif i <= quartile[2]:
        pivot.loc[i,'Quartile'] = 3
    else:
        pivot.loc[i,'Quartile'] = 4
Solution 1:[1]
Use qcut with labels=False and add 1, or specify the label values in a list:
pivot['Quartile'] = pd.qcut(pivot['AHT'], 4, labels=False) + 1
pivot['Quartile'] = pd.qcut(pivot['AHT'], 4, labels=[1,2,3,4])
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |