Plotly: How to change length of whiskers (min/max) in a boxplot?
I know that 1.5 * IQR is a common rule, but I would like to plot other min/max values if possible. I am using Plotly (Python). Basically, I would like to define a function that shows the boxplot given three parameters: a data frame, a column, and a self-defined multiplier.
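import numpy as np
import pandas as pd
import plotly.graph_objs as go
import plotly.offline as pyo

df_test = pd.DataFrame(np.array([26124.0, 8124.0, 27324.0, 13188.0, 21156.0]))  # DataFrame, so df[0] is a column
def get_boxplot(df, column, multiplier):
    data = [go.Box(y=df[column], boxpoints="outliers")]
    return pyo.plot(data)
get_boxplot(df_test, 0, 3)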
My goal is to replace 1.5 * IQR by the multiplier parameter, in this example by 3 or any other number.
Do you have an idea of how to change my function?
Thank you!
Solution 1:[1]
Getting the exact result you are looking for does not seem to be possible from Python alone; at best, the relevant properties are only available in the JavaScript context.
You still have some options regarding the placement of the whiskers, though. And you are right, by the way, about the 1.5 * IQR part. From help(fig) you can find:
By default, the whiskers correspond to the box' edges +/- 1.5 times the interquartile range (IQR: Q3-Q1), see "boxpoints" for other options.
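To make the default rule concrete, here is a minimal NumPy sketch of where the whiskers end for a given multiplier. It assumes the usual Tukey convention that each whisker stops at the most extreme data point still inside the box edge +/- multiplier * IQR. The helper name whisker_ends and the extra value 50000.0 are illustrative only, and np.percentile's default interpolation may not match Plotly's own quartile computation exactly.

```python
import numpy as np

def whisker_ends(values, multiplier=1.5):
    """Return (lower, upper) whisker ends for the given IQR multiplier."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    # Limits beyond which a point counts as an outlier
    low_limit = q1 - multiplier * iqr
    high_limit = q3 + multiplier * iqr
    # Whiskers end at the most extreme points that are still inside the limits
    inside = values[(values >= low_limit) & (values <= high_limit)]
    return float(inside.min()), float(inside.max())

# Data from the question plus one hypothetical extra value (50000.0)
sample = [26124.0, 8124.0, 27324.0, 13188.0, 21156.0, 50000.0]
ends_default = whisker_ends(sample, multiplier=1.5)
ends_wide = whisker_ends(sample, multiplier=3)
print(ends_default, ends_wide)
```

With the default multiplier the extra value falls outside the upper limit, so the upper whisker stays at the largest remaining point; with a multiplier of 3 the whisker extends to include it.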
And under boxpoints you'll find:
If "outliers", only the sample points lying outside the whiskers are shown. If "suspectedoutliers", the outlier points are shown and points either less than 4*Q1-3*Q3 or greater than 4*Q3-3*Q1 are highlighted (see outliercolor). If "all", all sample points are shown. If False, only the box(es) are shown with no sample points.
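The "suspectedoutliers" thresholds quoted above are the same as the box edges +/- 3 * IQR, since 4*Q1-3*Q3 = Q1 - 3*(Q3-Q1) and 4*Q3-3*Q1 = Q3 + 3*(Q3-Q1). A small check with the data from the question, assuming NumPy's default percentile interpolation (variable names are illustrative):

```python
import numpy as np

data = np.array([26124.0, 8124.0, 27324.0, 13188.0, 21156.0])
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Thresholds as written in the Plotly help text
lower_doc = 4 * q1 - 3 * q3
upper_doc = 4 * q3 - 3 * q1

# The same thresholds written as box edges +/- 3 * IQR
lower_iqr = q1 - 3 * iqr
upper_iqr = q3 + 3 * iqr

print(np.isclose(lower_doc, lower_iqr), np.isclose(upper_doc, upper_iqr))
```

So highlighting suspected outliers already corresponds to a multiplier of 3, although it only changes how points are coloured, not where the whiskers end.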
So for the different values of boxpoints (False, 'all', 'outliers') you'll get:
[Figure: box plots for boxpoints = False, 'all' and 'outliers']
And as you'll see below, whether or not boxpoints are shown will also determine the placement of the whiskers. So you could use False, 'all' or 'outliers' as arguments in a custom function to at least be able to change between those options. And judging by your question, boxpoints=False shouldn't be too far off target.
Here's a way to do it:
Code with boxpoints set to False:
# imports
from plotly.subplots import make_subplots
import plotly.graph_objs as go
import pandas as pd
import numpy as np

# data
np.random.seed(123)
y0 = np.random.randn(50) - 1
x0 = [0 for _ in y0]
y0[-1] = 4  # include an outlier

# custom plotly function
def get_boxplot(boxpoints):
    fig = go.Figure(go.Box(y=y0, boxpoints=boxpoints, pointpos=0))
    fig.show()

get_boxplot(boxpoints=False)  # or boxpoints='outliers'
Plot 1 - Boxpoints = False:
Plot 2 - Boxpoints = 'outliers':
This will raise another issue though, since the markers by default are not shown in the first case. But you can handle that by including another trace like this:
Complete plot:
Complete code:
# imports
from plotly.subplots import make_subplots
import plotly.graph_objs as go
import pandas as pd
import numpy as np

# data
np.random.seed(123)
y0 = np.random.randn(50) - 1
x0 = [0 for _ in y0]
y0[-1] = 4  # include an outlier

# custom plotly function
def get_boxplot(boxpoints):
    fig = go.Figure(go.Box(y=y0, boxpoints=boxpoints, pointpos=0))
    if boxpoints is False:
        # add a second box with transparent lines and fill, used only to show all points
        fig.add_trace(go.Box(x=x0,
                             y=y0, boxpoints='all', pointpos=0,
                             marker=dict(color='rgb(66, 66, 244)'),
                             line=dict(color='rgba(0,0,0,0)'),
                             fillcolor='rgba(0,0,0,0)'))
    fig.show()

get_boxplot(boxpoints=False)
Solution 2:[2]
The current Plotly allows setting upper and lower fences, as stated in a comment on the original question. It took a fair while to work out how, so I thought I would spare others the pain.
You need to specify q1, median and q3 as well as the upper and lower fences. I've given an example below where a is an array.
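import numpy as np
import plotly.graph_objs as go

# a is assumed to be an existing 1-D array of sample values
fig = go.Figure()
fig.add_trace(go.Box(y=[a]))
fig.update_traces(q1=[np.percentile(a, 25)],
                  median=[np.percentile(a, 50)],
                  q3=[np.percentile(a, 75)],
                  lowerfence=[np.min(a)],
                  upperfence=[np.max(a)])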
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Hoarie |