Stacked bars are unexpectedly annotated with the sum of bar heights

My data:

import matplotlib.pyplot as plt
import pandas as pd

names_col = ['Count', 'Freq']
dat = [['Matching', 56935], ['Mismatching', 100587]]
plot_df = pd.DataFrame(data=dat, columns=names_col)

I am trying to plot a stacked bar plot that shows the values. Here is my code:

plt.figure(figsize=(16,9))
p=plot_df.set_index('Count').T.plot(kind='bar', stacked=True)
p.bar_label(p.containers[0])
p.bar_label(p.containers[1])
plt.show();

First, the figure is not rendered at size (16, 9). What is wrong? Second, the bar labels are not what I expect.

The value for Matching, 56935, is correct, but for Mismatching the plot shows the total (157522) instead of 100587. How can I access and show the Mismatching value as well?



Solution 1:[1]

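  • Use matplotlib.pyplot.bar_label twice, once for each label_type
    • The annotation value depends on whether the label is placed at the center or at the edge of the bar. With label_type='edge' (the default), the label shows the position of the bar's end point, which for a stacked bar is the running total. With label_type='center', the label shows the length of that bar segment.
    • The other answer uses x[0] because there is only one group of stacked bars, but that won't work if there's more than one group on the x-axis.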
    • See this answer for more details and examples with .bar_label.
  • Reshaping the dataframe should be a separate step from plotting
  • pandas.DataFrame.plot uses matplotlib as the default plotting backend, and has a number of parameters like rot, xlabel, ylabel, and figsize, for customizing the plot.
  • Tested in python 3.10, pandas 1.3.4, matplotlib 3.5.0
df = pd.DataFrame(data=dat, columns=names_col)
dft = df.set_index('Count').T

axe = dft.plot(kind='bar', stacked=True, figsize=(16,9), rot=0)

for x in axe.containers:
    axe.bar_label(x, label_type='edge', weight='bold')
    axe.bar_label(x, label_type='center', weight='bold', color='white')
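To check which value each label_type produces without looking at the image, the label text can be read back from the objects that bar_label returns. This is a minimal sketch on the question's data; the Agg backend and the names edge_texts and center_texts are assumptions made for the example, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

dat = [["Matching", 56935], ["Mismatching", 100587]]
df = pd.DataFrame(data=dat, columns=["Count", "Freq"])
dft = df.set_index("Count").T

ax = dft.plot(kind="bar", stacked=True, rot=0)

edge_texts, center_texts = [], []
for container in ax.containers:
    # bar_label returns the list of annotations it created
    edge = ax.bar_label(container, label_type="edge")
    center = ax.bar_label(container, label_type="center")
    edge_texts.append([t.get_text() for t in edge])
    center_texts.append([t.get_text() for t in center])

print(edge_texts)    # end-point positions, i.e. running totals
print(center_texts)  # segment lengths, i.e. the individual values
```

For the Mismatching container, the edge label should read 157522 (56935 + 100587) and the center label 100587, which matches what the question observed.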


  • Here's a more thorough example with multiple groups
    • The other answer does not place the middle annotations for the second group of bars.
# test data 
data = {'Matching': [56935, 17610], 'Mismatching': [100587, 13794], 'Test': [33139, 23567]}
df = pd.DataFrame(data=data, index=['Freq', 'Freq2'])

axe = df.plot(kind='bar', stacked=True, figsize=(16,9), rot=0)

for x in axe.containers:
    axe.bar_label(x, label_type='edge', weight='bold')
    axe.bar_label(x, label_type='center', weight='bold', color='white')
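The reason indexing a single bar such as x[0] misses the second group is how the containers are laid out: as far as I can tell, a stacked pandas bar plot produces one container per column, and each container holds one bar per row (group on the x-axis). A small sketch to inspect that layout, with the Agg backend and the variable names chosen only for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

data = {"Matching": [56935, 17610], "Mismatching": [100587, 13794], "Test": [33139, 23567]}
df = pd.DataFrame(data=data, index=["Freq", "Freq2"])

ax = df.plot(kind="bar", stacked=True, rot=0)

n_containers = len(ax.containers)                      # one container per column
bars_per_container = [len(c) for c in ax.containers]   # one bar per row / x-axis group
first_values = [float(v) for v in ax.containers[0].datavalues]  # 'Matching' in both groups

print(n_containers, bars_per_container, first_values)
```

Passing the whole container to bar_label, as in the loop above, therefore annotates every group, while x[0] only ever refers to the bar in the first group.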


Add only the total to the top of the bars

  • Add a new column for the sum of the rows, to use for annotations
df['tot'] = df.sum(axis=1)

display(df)
       Matching  Mismatching   Test     tot
Freq      56935       100587  33139  190661
Freq2     17610        13794  23567   54971

# plot 
axe = df.iloc[:, :3].plot(kind='bar', stacked=True, figsize=(16,9), rot=0)

# annotate
for x in axe.containers:
    axe.bar_label(x, label_type='center', weight='bold', color='white')

# reuse x from the for loop, the last x is the top set of bar patches
axe.bar_label(x, labels=df['tot'], label_type='edge', weight='bold')
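A possible variant, sketched here under my own assumptions, keeps the totals in a separate Series instead of adding a column to df, and picks the top container explicitly with containers[-1] rather than relying on the loop variable. The names totals and top_texts and the Agg backend are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

data = {"Matching": [56935, 17610], "Mismatching": [100587, 13794], "Test": [33139, 23567]}
df = pd.DataFrame(data=data, index=["Freq", "Freq2"])

totals = df.sum(axis=1)  # row sums, kept outside df so no column slicing is needed

ax = df.plot(kind="bar", stacked=True, figsize=(16, 9), rot=0)

# per-segment values in the middle of each segment
for container in ax.containers:
    ax.bar_label(container, label_type="center", weight="bold", color="white")

# totals on top: the last container is the top set of bar patches
top = ax.bar_label(
    ax.containers[-1],
    labels=[str(int(t)) for t in totals],
    label_type="edge",
    weight="bold",
)
top_texts = [t.get_text() for t in top]
print(top_texts)
```

Since df is left unchanged, the plot call does not need df.iloc[:, :3] to exclude a helper column.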


Solution 2:[2]

You can set figsize as a parameter of plot. Then, for each of your containers, add the bar label and your own text:

p=plot_df.set_index('Count').T.plot(kind='bar', stacked=True, figsize=(16,9)) 
for x in p.containers:
    p.bar_label(x)
    p.text(0, x[0].get_y() + x[0].get_height()*0.5, x.datavalues[0], ha='center', color='w', weight='bold')

plt.show()
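On the figure-size part of the question: when no ax is passed, DataFrame.plot creates its own figure, so the one opened earlier with plt.figure(figsize=(16,9)) is not the one being drawn on. Besides the figsize parameter, another option is to create the figure and axes first and hand the axes to pandas. A minimal sketch, assuming the question's data and a headless Agg backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

dat = [["Matching", 56935], ["Mismatching", 100587]]
plot_df = pd.DataFrame(data=dat, columns=["Count", "Freq"])
dft = plot_df.set_index("Count").T

# create the figure at the desired size, then let pandas draw on its axes
fig, ax = plt.subplots(figsize=(16, 9))
dft.plot(kind="bar", stacked=True, rot=0, ax=ax)

for container in ax.containers:
    ax.bar_label(container, label_type="center", color="w", weight="bold")

width, height = fig.get_size_inches()
same_figure = ax.figure is fig
print(width, height, same_figure)
```

This keeps a handle on fig, which is convenient if the figure needs to be saved or adjusted afterwards.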


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
Solution 1
Solution 2  Tranbi