Remove outlier from five-number summary statistics

How can I stop the fivenum function from returning outliers as my maximum/minimum values?

I want to be able to show the upper and lower whisker values on my boxplot.

My code:

boxplot(data$`Weight(g)`)
text(y=fivenum(data$`Weight(g)`),labels=fivenum(data$`Weight(g)`),x=1.25, title(main = "Weight(g)"))



Solution 1:[1]

boxplot (invisibly) returns a named list that includes elements you can use to exclude outliers from your call to fivenum:

  • $out holds the literal outliers. It can be tempting to use setdiff(data$`Weight(g)`, bp$out), where bp is the value returned by boxplot, but that may be prone to problems due to R FAQ 7.31 (floating-point equality), so I recommend against it; instead,

  • $stats holds the five numbers the boxplot itself draws (lower whisker, lower hinge, median, upper hinge, upper whisker), which exclude the outliers. I suggest we work with this.
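As a small illustration of why set operations on doubles can be fragile, here is a sketch with made-up values (x and out are hypothetical, not the asker's data). setdiff matches values exactly and also returns only unique values, so it can miss a near-equal outlier and silently collapse legitimate duplicates:

```r
# Hypothetical data: 0.1 + 0.2 is stored as 0.30000000000000004, not 0.3
x   <- c(0.1 + 0.2, 0.5, 0.5, 1.5)
out <- 0.3                         # an "outlier" value that was re-entered or recomputed

kept <- setdiff(x, out)

# Exact matching: the near-equal value is NOT removed
near_equal  <- isTRUE(all.equal(x[1], out))  # equal within tolerance ...
exact_equal <- x[1] == out                   # ... but not exactly equal

# setdiff() also de-duplicates, so the repeated 0.5 is collapsed to a single value
n_kept <- length(kept)
```

When the outliers come straight from bp$out they are normally bit-identical to the data, so the exact-match problem tends to appear only after rounding or recomputation; the de-duplication applies either way and can change the result of fivenum.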

(BTW, title(.) does its work via side effect, and its result is not used by text(.), so I suggest you move that call out of text(.).)

Reproducible data/code:

vec <- c(1, 10:20, 30)
bp <- boxplot(vec)
str(bp)
# List of 6
#  $ stats: num [1:5, 1] 10 12 15 18 20
#  $ n    : num 13
#  $ conf : num [1:2, 1] 12.4 17.6
#  $ out  : num [1:2] 1 30
#  $ group: num [1:2] 1 1
#  $ names: chr "1"

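# Keep only the values inside the whisker range, then recompute the summary
five <- fivenum(vec[vec >= min(bp$stats) & vec <= max(bp$stats)])
# Label the five values just to the right of the box
text(x = 1.25, y = five, labels = five)
title("Weight(g)")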

(Figure: basic boxplot with corrected fivenum labels)
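Because $stats already holds the values the plot draws, another option is to label the plot with it directly instead of recomputing fivenum on the trimmed vector. The following is a sketch using the same vec as above; the names drawn, trimmed, and same are only illustrative. Note that the two approaches can disagree on the hinges, since fivenum on the trimmed data recomputes them from fewer points.

```r
vec <- c(1, 10:20, 30)
bp  <- boxplot(vec)

# Column 1 of $stats: lower whisker, lower hinge, median, upper hinge, upper whisker
drawn <- bp$stats[, 1]

# Recomputing on the trimmed data can move the hinges away from the drawn box
trimmed <- fivenum(vec[vec >= min(bp$stats) & vec <= max(bp$stats)])

# Labels taken from $stats line up with the box and whiskers as drawn
text(x = 1.25, y = drawn, labels = drawn)
title("Weight(g)")

# The same five numbers are available without drawing anything
same <- boxplot.stats(vec)$stats
```

With this data, drawn is 10 12 15 18 20 while trimmed is 10 12.5 15 17.5 20, so which one to print depends on whether the labels should match the plotted box or summarise the outlier-free data.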

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1