Convert hued displot of X to plot of hue vs mode(X given hue)?

I have a Seaborn displot with a hue variable.

For each hue value, I want to extract the mode of the density estimate and then plot each hue value against its mode.

How do I do this?



Solution 1:[1]

You can use scipy.stats.gaussian_kde to build the density estimate for each hue value, then evaluate it on a grid of x-values and take the x at which the density is largest. The resulting mode is accurate to the spacing of that grid.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

df = pd.DataFrame({'x': np.random.normal(0.001, 1, 1300).cumsum() + 30,
                   'hue': np.repeat(np.arange(0.08, 0.20001, 0.01), 100).round(2)})
g = sns.displot(df, x='x', hue='hue', palette='turbo', kind='kde', fill=True, height=6, aspect=1.5)
plt.show()

from scipy.stats import gaussian_kde
from matplotlib.cm import ScalarMappable

fig, ax = plt.subplots(figsize=(10, 6))
hues = df['hue'].unique()
num_hues = len(hues)
colors = sns.color_palette('turbo', num_hues)
# common grid of x-values on which every density is evaluated
xmin, xmax = df['x'].min(), df['x'].max()
xs = np.linspace(xmin, xmax, 500)
for hue, color in zip(hues, colors):
    data = df[df['hue'] == hue]['x'].values
    kde = gaussian_kde(data)
    # the mode is the grid point where the estimated density is largest
    mode_index = np.argmax(kde(xs))
    mode_x = xs[mode_index]
    sns.scatterplot(x=[hue], y=[mode_x], color=color, s=50, ax=ax)
ax.set_xlabel('hue')
ax.set_ylabel('mode of x')
# colorbar mapping the colors back to the hue values
cmap = sns.color_palette('turbo', as_cmap=True)
norm = plt.Normalize(hues.min(), hues.max())
plt.colorbar(ScalarMappable(cmap=cmap, norm=norm), ax=ax, ticks=hues)
plt.show()

Figures: sns.displot with hue; scatterplot with the modes.
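If only the mode values are needed, without the plot, the per-hue loop can be written as a pandas groupby. This is a minimal sketch, not part of the original answer: the helper name kde_mode and the synthetic data are assumptions for illustration, and each group is evaluated on a grid spanning its own min and max rather than the global range.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde


def kde_mode(values, grid_size=500):
    """Return the grid point where a Gaussian KDE of `values` is largest.

    Hypothetical helper; the grid spans the min and max of `values`.
    """
    values = np.asarray(values, dtype=float)
    xs = np.linspace(values.min(), values.max(), grid_size)
    density = gaussian_kde(values)(xs)
    return xs[np.argmax(density)]


# Synthetic stand-in data (an assumption): three hue levels centred at 0, 5 and 10
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    'x': np.concatenate([rng.normal(loc, 1.0, 300) for loc in (0, 5, 10)]),
    'hue': np.repeat([0.08, 0.09, 0.10], 300),
})

# One mode per hue value, returned as a Series indexed by hue
modes = demo.groupby('hue')['x'].apply(kde_mode)
print(modes)
```

The resulting Series can be passed straight to a plotting call, for example sns.scatterplot(x=modes.index, y=modes.values).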

Here is another approach, which extracts the kde curves from the plot itself. It uses the legend of the kde plot to get the correspondence between the curves and the hue values. sns.kdeplot is the axes-level function used by sns.displot(kind='kde'). fill=False creates lines instead of filled polygons, and the x and y values of a line are easier to extract. (ax1.fill_between can fill the curves in a second pass.) The code iterates over ax1.lines in reverse because, in the seaborn version used here, the curves are drawn in the reverse order of the legend entries. The x and y axes of the second plot are swapped so that both plots share the same x-axis.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np

df = pd.DataFrame({'x': np.random.normal(0.007, 0.1, 1300).cumsum() + 30,
                   'hue': np.repeat(np.arange(0.08, 0.20001, 0.01), 100).round(2)})
fig, (ax1, ax2) = plt.subplots(nrows=2, figsize=(12, 10), sharex=True)

sns.kdeplot(data=df, x='x', hue='hue', palette='turbo', fill=False, ax=ax1)

hues = [float(txt.get_text()) for txt in ax1.legend_.get_texts()]
ax2.set_yticks(hues)
ax2.set_ylabel('hue')
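# ax1.lines is reversed so that the curves line up with the legend entries
for hue, line in zip(hues, ax1.lines[::-1]):
    color = line.get_color()
    x = line.get_xdata()
    y = line.get_ydata()
    ax1.fill_between(x, y, color=color, alpha=0.3)
    # the mode is the x-value where the extracted curve is highest
    mode_ind = np.argmax(y)
    mode_x = x[mode_ind]
    sns.scatterplot(x=[mode_x], y=[hue], color=color, s=50, ax=ax2)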

sns.despine()
plt.tight_layout()
plt.show()

Figure: calculating the mode from the kde curves.
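Both approaches read the mode off a finite grid, so the result can be off by up to half the grid spacing. If that matters, one option is to fit a parabola through the highest grid point and its two neighbours. The sketch below is an assumption on top of the answer, not part of it: refine_peak is a hypothetical helper, it assumes evenly spaced x-values, and the check uses a synthetic Gaussian bump instead of a real kde curve.

```python
import numpy as np


def refine_peak(x, y):
    """Estimate the peak location of a sampled curve by fitting a parabola
    through the highest grid point and its two neighbours.

    Assumes `x` is evenly spaced; falls back to the grid point at the edges.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i = int(np.argmax(y))
    if i == 0 or i == len(x) - 1:
        return x[i]  # no neighbour on one side
    y0, y1, y2 = y[i - 1], y[i], y[i + 1]
    denom = y0 - 2 * y1 + y2
    if denom == 0:
        return x[i]  # flat top, nothing to refine
    step = x[i + 1] - x[i]
    return x[i] + 0.5 * (y0 - y2) / denom * step


# Synthetic check: a Gaussian bump centred at 0.25, sampled on a coarse grid
grid = np.linspace(-3, 3, 50)
curve = np.exp(-0.5 * (grid - 0.25) ** 2)
grid_mode = grid[np.argmax(curve)]
refined_mode = refine_peak(grid, curve)
print(grid_mode, refined_mode)
```

In the second approach, refine_peak(line.get_xdata(), line.get_ydata()) could stand in for the np.argmax lookup. For the gaussian_kde approach, simply using a finer xs grid is the more direct fix.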

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1