How to show the y-axis of seaborn displot as percentage
I'm using seaborn.displot to display a distribution of scores for a group of participants.
Is it possible to have the y-axis show an actual percentage (example below)?
This is required by the audience for the data. Currently it is done in Excel, but it would be more useful in Python.
import seaborn as sns
data = sns.load_dataset('titanic')
p = sns.displot(data=data, x='age', hue='sex', height=4, kind='kde')
Desired Format
Solution 1:[1]
As mentioned by @JohanC, the y axis for a KDE is a density, not a proportion, so it does not make sense to convert it to a percentage.
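To make the distinction concrete, here is a minimal pure-Python sketch of how histogram heights differ under the two scalings. The sample values and the bin width are made up for illustration, and the code does not use seaborn; it only follows the usual definitions (percent = 100 * count / n, density = count / (n * bin width)).

```python
# Made-up sample in [0, 1) and a bin width of 0.25 (both are assumptions).
values = [0.1, 0.2, 0.3, 0.3, 0.4, 0.4, 0.45, 0.6, 0.7, 0.9]
bin_width = 0.25
n_bins = 4

# Count how many values fall into each bin.
counts = [0] * n_bins
for v in values:
    counts[int(v / bin_width)] += 1

n = len(values)
percent = [100 * c / n for c in counts]        # heights sum to 100
density = [c / (n * bin_width) for c in counts]  # total *area* is 1

print(counts)   # number of values per bin
print(percent)  # share of all values per bin, in percent
print(density)  # can exceed 1 when bins are narrower than 1 unit
```

The percent heights always sum to 100, while a density height is only meaningful once multiplied by the bin width, which is why a single density value cannot be read off as a percentage.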
You'd have two options. One would be to plot a KDE curve over a histogram with histogram counts expressed as percentages:
import seaborn as sns
tips = sns.load_dataset("tips")
sns.displot(
    data=tips, x="total_bill", hue="sex",
    kind="hist", stat="percent", kde=True,
)
But your "desired plot" doesn't actually look like a density at all; it looks like a histogram plotted with a line instead of bars. You can get that with element="poly":
sns.displot(
    data=tips, x="total_bill", hue="sex",
    kind="hist", stat="percent", element="poly", fill=False,
)
Solution 2:[2]
- seaborn.displot is a figure-level plot providing access to several approaches for visualizing the univariate or bivariate distribution of data (histplot, kdeplot, ecdfplot).
- See "How to plot percentage with seaborn distplot / histplot / displot".
- "seaborn histplot and displot output doesn't match" is relevant for the settings of common_bins and common_norm.
- Tested in python 3.8.12, pandas 1.3.4, matplotlib 3.4.3, seaborn 0.11.2.
- data is a pandas dataframe, and seaborn is an API for matplotlib.
kind='hist': seaborn.histplot
- Use stat='percent', available from seaborn 0.11.2.
import seaborn as sns
from matplotlib.ticker import PercentFormatter
data = sns.load_dataset('titanic')
p = sns.displot(data=data, x='age', stat='percent', hue='sex', height=4, kde=True, kind='hist')
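One thing to be aware of when combining stat='percent' with hue: as far as I understand the seaborn documentation, common_norm=True (the default) normalizes over the full dataset, so the bars of all hue levels together sum to 100%, whereas common_norm=False normalizes each hue level on its own. The sketch below mimics the two choices in plain Python with made-up rows. It only illustrates the arithmetic and is not seaborn's implementation.

```python
from collections import Counter

# Hypothetical (hue level, age bin) pairs; the values are made up.
rows = (
    [("male", "0-20")] * 1 + [("male", "20-40")] * 3 + [("male", "40-60")] * 2
    + [("female", "0-20")] * 1 + [("female", "20-40")] * 2 + [("female", "40-60")] * 1
)

bin_counts = Counter(rows)                          # count per (group, bin)
group_sizes = Counter(group for group, _ in rows)   # count per group
total = len(rows)

# Like common_norm=True: divide by the size of the whole dataset.
common = {key: 100 * c / total for key, c in bin_counts.items()}

# Like common_norm=False: divide by the size of each group.
independent = {key: 100 * c / group_sizes[key[0]] for key, c in bin_counts.items()}

print(common[("male", "20-40")])       # share of all rows
print(independent[("male", "20-40")])  # share of male rows only
```

If each group should sum to 100% in the plot, passing common_norm=False to displot should give that behaviour.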
Don't do the following, as explained
kind='kde': seaborn.kdeplot
- As per mwaskom, the creator of seaborn: you can wrap a percent formatter around a density value (as shown in the following code), but it's incorrect because the density is not a proportion (you may end up with values > 100%).
- As per JohanC, you can't view the y-axis as a percentage; it is a density. The density can go arbitrarily high or low, depending on the x-axis. Formatting it as a percentage is a mistake.
- I will leave this as part of the answer as an explanation, otherwise it will just be posted by someone else.
- Using matplotlib.ticker.PercentFormatter to convert the axis values.
import seaborn as sns
from matplotlib.ticker import PercentFormatter
data = sns.load_dataset('titanic')
p = sns.displot(data=data, x='age', hue='sex', height=3, kind='kde')
p.axes.flat[0].yaxis.set_major_formatter(PercentFormatter(1))
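To illustrate why the formatted values from the code above are not real percentages, here is a small sketch using only the standard library. It uses a normal distribution as a stand-in for a KDE curve; the mean and standard deviation are made-up numbers, not values estimated from the titanic data.

```python
from statistics import NormalDist

# The same hypothetical age distribution, expressed in two units.
ages_years = NormalDist(mu=30, sigma=14)       # x-axis in years
ages_decades = NormalDist(mu=3.0, sigma=1.4)   # x-axis in decades

peak_years = ages_years.pdf(30)
peak_decades = ages_decades.pdf(3.0)

# Changing the x-axis unit rescales the density by the same factor,
# so the "percentage" read off the y-axis changes with it.
print(f"{peak_years:.1%} vs {peak_decades:.1%}")

# A sufficiently narrow distribution has a peak density above 1,
# which a percent formatter would display as more than 100%.
narrow_peak = NormalDist(mu=0, sigma=0.1).pdf(0)
print(narrow_peak > 1)
```

The peak height differs by a factor of ten between the two units even though the underlying data is the same, which is the point made by JohanC above.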
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |