Obtaining the exact data coordinates of seaborn boxplot boxes
I have a seaborn boxplot (sns.boxplot) to which I would like to add some points. For example, say I have this pandas DataFrame:
[In] import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'Property 1':['a']*100+['b']*100,
'Property 2': ['w', 'x', 'y', 'z']*50,
'Value': np.random.normal(size=200)})
df.head(3)
[Out] Property 1 Property 2 Value
0 a w 1.421380
1 a x -1.034465
2 a y 0.212911
[In] df.shape
[Out] (200, 3)
I can easily generate a boxplot with seaborn:
[In] sns.boxplot(x='Property 2', hue='Property 1', y='Value', data=df)
[Out]
Now say I want to add markers for a specific case in my sample. I can get close with this:
[In] specific_case = pd.DataFrame([['a', 'w', 0.5],
                              ['a', 'x', 0.2],
                              ['a', 'y', 0.1],
                              ['a', 'z', 0.3],
                              ['b', 'w', -0.5],
                              ['b', 'x', -0.2],
                              ['b', 'y', 0.3],
                              ['b', 'z', 0.5]],
                             columns=df.columns)
[In] sns.boxplot(x='Property 2', hue='Property 1', y='Value', data=df)
plt.plot(np.arange(-0.25, 3.75, 0.5),
specific_case['Value'].values, 'ro')
[Out]
That is unsatisfactory, of course.
I then used this answer, which talks about getting the Bbox, and this tutorial about converting display coordinates into data coordinates, to write this function:
[In] import matplotlib as mpl

def get_x_coordinates_of_seaborn_boxplot(ax, x_or_y):
    display_coordinates = []
    inv = ax.transData.inverted()
    for c in ax.get_children():
        # the boxes are the PathPatch children of the axes
        if type(c) == mpl.patches.PathPatch:
            if x_or_y == 'x':
                display_coordinates.append(
                    (c.get_extents().xmin+c.get_extents().xmax)/2)
            if x_or_y == 'y':
                display_coordinates.append(
                    (c.get_extents().ymin+c.get_extents().ymax)/2)
    return inv.transform(tuple(display_coordinates))
That works great for my first hue, but not at all for my second:
[In] ax = sns.boxplot(x='Property 2', hue='Property 1', y='Value', data=df)
coords = get_x_coordinates_of_seaborn_boxplot(ax, 'x')
plt.plot(coords, specific_case['Value'].values, 'ro')
[Out]
How can I get the data coordinates of all my boxes?
Solution 1:[1]
I'm unsure about the purpose of those transformations. But it seems the real problem is just to plot the points from specific_case at the correct positions. The x-coordinate of every box is shifted by 0.2 from the whole number. (That is because each group of boxes is 0.8 wide by default; you have 2 boxes per group, which makes each 0.4 wide, and half of that is 0.2.)
You then need to arrange the x values so that their order matches the rows of the specific_case dataframe.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'Property 1':['a']*100+['b']*100,
'Property 2': ['w', 'x', 'y', 'z']*50,
'Value': np.random.normal(size=200)})
specific_case = pd.DataFrame([['a', 'w', 0.5],
                              ['a', 'x', 0.2],
                              ['a', 'y', 0.1],
                              ['a', 'z', 0.3],
                              ['b', 'w', -0.5],
                              ['b', 'x', -0.2],
                              ['b', 'y', 0.3],
                              ['b', 'z', 0.5]],
                             columns=df.columns)
ax = sns.boxplot(x='Property 2', hue='Property 1', y='Value', data=df)
X = np.repeat(np.atleast_2d(np.arange(4)),2, axis=0)+ np.array([[-.2],[.2]])
ax.plot(X.flatten(), specific_case['Value'].values, 'ro', zorder=4)
plt.show()
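The 0.2 arithmetic above can be generalised to any number of hue levels. The following is only a sketch: the helper names hue_offsets and box_positions are made up for illustration, and it assumes the 0.8 default group width mentioned above and evenly spaced boxes inside each group.

```python
import numpy as np


def hue_offsets(n_hue, width=0.8):
    """Offsets of each hue's box center from the integer category position.

    Assumes the group of boxes spans `width` and is split evenly
    between the hue levels (hypothetical helper, not a seaborn function).
    """
    box_width = width / n_hue
    # Center the offsets around zero: for 2 hues this gives -0.2 and +0.2.
    return (np.arange(n_hue) - (n_hue - 1) / 2) * box_width


def box_positions(n_categories, n_hue, width=0.8):
    """x positions with one row per hue level and one column per category."""
    categories = np.arange(n_categories)
    return categories[None, :] + hue_offsets(n_hue, width)[:, None]


offsets = hue_offsets(2)
positions = box_positions(4, 2)
print(offsets)
print(positions.flatten())
```

For 4 categories and 2 hues, positions.flatten() lists all boxes of the first hue and then all boxes of the second, which is the same order as the rows of specific_case, so it can stand in for X.flatten() in the code above.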
Solution 2:[2]
I got it figured out:
To extract the x-coordinates per hue, change your code as follows. I did not do this for y, but the logic should be the same.
First create two lists to hold the x-coordinates:
display_coordinates_1=[]
display_coordinates_2=[]
Inside your for loop that starts with:
for c in ax.get_children():
Use the following:
display_coordinates_1.append(c.get_extents().x0)
x0 gives the x-coordinate for the boxes of the first hue.
For the boxes of the second hue, use x1 instead:
display_coordinates_2.append(c.get_extents().x1)
Lastly, after you pass the two lists through inv.transform(), make sure you select every other value: for the x-coordinates each list has 6 outputs and you want the ones at indices 0, 2, 4, i.e. [::2].
Hope this helps.
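A possible reason for the "every other value" pattern: Matplotlib transforms work on (x, y) pairs, so a flat list of x values is most likely read as alternating x and y. Below is a minimal sketch of passing pairs instead. The function name box_centers_in_data_coords is made up, the boxes are hand-built PathPatch stand-ins rather than real seaborn output, and whether seaborn draws its boxes as PathPatch children may depend on the version.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import PathPatch
from matplotlib.path import Path


def box_centers_in_data_coords(ax):
    """Return an (n, 2) array with the center of every PathPatch, in data coordinates."""
    centers = []
    for child in ax.get_children():
        if isinstance(child, PathPatch):
            bbox = child.get_extents()  # bounding box in display (pixel) coordinates
            centers.append(((bbox.x0 + bbox.x1) / 2, (bbox.y0 + bbox.y1) / 2))
    if not centers:
        return np.empty((0, 2))
    # Pass (x, y) pairs as an (n, 2) array, not a flat list of x values.
    return ax.transData.inverted().transform(np.array(centers))


# Stand-in boxes at known positions (hypothetical data for illustration only).
fig, ax = plt.subplots()
expected_x = [-0.2, 0.2, 0.8, 1.2]
for x in expected_x:
    verts = [(x - 0.2, 0), (x + 0.2, 0), (x + 0.2, 1), (x - 0.2, 1), (x - 0.2, 0)]
    ax.add_patch(PathPatch(Path(verts)))
ax.set_xlim(-0.5, 1.5)
ax.set_ylim(-1, 2)

centers = box_centers_in_data_coords(ax)
print(centers[:, 0])  # x centers, one per box, in the order the boxes were added
```

With a real boxplot you would call the function on the axes returned by sns.boxplot and then sort or regroup centers[:, 0] to match the row order of specific_case.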
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ImportanceOfBeingErnest |
| Solution 2 | |