Label outliers in boxplot


Harish Krishnan

Sep 6, 2015, 4:12:11 AM
to ggplot2
Hello

Is there a simple and elegant way to label just the outliers in a boxplot?

Thanks
Harish

Brian

Sep 6, 2015, 9:37:03 AM
to Harish Krishnan, ggplot2
Hello Harish,

Why not make a subset of your data and plot those points with different
characteristics?

box = ggplot(data) + ...

box + geom_point(data=subset(data, variable < -10 | variable > 10))


You may also consider playing with stat_summary.

I mean:

stat_summary(fun.y=function(x) x[which(x>10)], geom="point", shape=5, size=4)

See: http://www.cookbook-r.com/Graphs/Plotting_distributions_%28ggplot2%29/

It's just an idea, but your message is unclear and you provide no code
to specify exactly what you mean/want.

Cheers,
Brian

Harish Krishnan

Sep 6, 2015, 5:39:22 PM
to Brian, ggplot2
Hi Brian

Thanks for taking the time!

What I meant is this. For the following code:

ggplot(iris,aes(Species,Sepal.Length)) + geom_boxplot() + geom_text(aes(label=Sepal.Length,hjust=-0.5))

We get a series of box plots with labels for all the points. I need a solution where we only label the outliers.

Thanks,
Harish


Vivek Patil

Sep 6, 2015, 5:51:06 PM
to Harish Krishnan, Brian, ggplot2

At a generic level, you could create a new variable that holds the label (Sepal.Length) when a point is an outlier (by whatever criterion you choose) and is blank otherwise, then use that variable in geom_text. For the criterion the boxplot itself uses to identify outliers, see http://docs.ggplot2.org/0.9.3.1/geom_boxplot.html
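
A minimal sketch of this label-column idea, assuming the usual boxplot rule (points more than 1.5 * IQR beyond the quartiles) as the outlier criterion. The helper `is_outlier` and the column name `outlier_label` are hypothetical names chosen for illustration; they are not part of ggplot2 or of the thread. The flag is computed within each Species, so every box gets its own fences.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper (not from the thread): TRUE where a value lies more than
# 1.5 * IQR below the first quartile or above the third quartile.
is_outlier <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.75)))
  fence <- 1.5 * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Flag within each Species, then keep the value as a label only for the
# flagged points (NA everywhere else).
labelled <- iris %>%
  group_by(Species) %>%
  mutate(outlier_label = ifelse(is_outlier(Sepal.Length), Sepal.Length, NA)) %>%
  ungroup()

# na.rm = TRUE drops the unlabelled rows without a warning.
ggplot(labelled, aes(Species, Sepal.Length)) +
  geom_boxplot() +
  geom_text(aes(label = outlier_label), hjust = -0.5, na.rm = TRUE)
```

Using NA rather than an empty string for the non-outliers lets geom_text skip those rows entirely.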

 

Vivek


Harish Krishnan

Sep 6, 2015, 7:08:12 PM
to Vivek Patil, Brian, ggplot2
Hi Vivek

If I understood it correctly, it would work if we are not grouping the points by Species. Since the box plots are drawn for each of the Species group, the outliers would vary from Species to Species.

Can you help me with the code if that is not what you are suggesting?

Thanks Vivek,
Harish 

Ben Bond-Lamberty

Sep 7, 2015, 7:20:08 AM
to Harish Krishnan, Vivek Patil, Brian, ggplot2
Harish,

There are two steps: identify the outliers, and plot. From reading the `geom_boxplot` documentation, it sounds like outlier points are based on the interquartile range, so using your iris example:

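library(dplyr)
library(ggplot2)

# Use a `dplyr` pipeline to identify the outliers. geom_boxplot draws a point
# when it lies more than 1.5 * IQR below the first or above the third quartile.
outliers <- iris %>% 
   group_by(Species) %>% 
   mutate(outlier = Sepal.Length < quantile(Sepal.Length, 0.25) - 1.5 * IQR(Sepal.Length) |
                    Sepal.Length > quantile(Sepal.Length, 0.75) + 1.5 * IQR(Sepal.Length)) %>% 
   filter(outlier)
# Now plot
ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot() + geom_text(data = outliers, aes(label = Sepal.Length))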
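
One way to check any hand-written outlier rule against what the boxplot actually draws is to read the statistics ggplot2 computed for the layer. The sketch below assumes a ggplot2 version that provides `layer_data()` and stores the drawn points in an `outliers` list column (one row per box). It also assumes the boxes come back in the same order as the factor levels of Species, and that at least one outlier exists. The names `box_stats` and `drawn` are illustrative.

```r
library(ggplot2)

p <- ggplot(iris, aes(Species, Sepal.Length)) + geom_boxplot()

# Statistics computed for layer 1 (the boxplot): one row per box, with
# `outliers` holding the values drawn as individual points.
box_stats <- layer_data(p, 1)

# Rebuild a data frame of those points with their Species labels.
# Assumption: rows of box_stats follow the order of levels(iris$Species).
drawn <- data.frame(
  Species = rep(levels(iris$Species), times = lengths(box_stats$outliers)),
  Sepal.Length = unlist(box_stats$outliers)
)

# Label exactly the points that geom_boxplot treats as outliers.
p + geom_text(data = drawn, aes(label = Sepal.Length), hjust = -0.5)
```

Comparing `drawn` with a manually filtered data frame shows whether the manual rule flags the same points as the plot.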

Ben


Harish Krishnan

Sep 7, 2015, 9:07:15 PM
to Ben Bond-Lamberty, Vivek Patil, Brian, ggplot2
Thanks for the code Ben,
Much appreciated! 

Harish Krishnan

Sep 11, 2015, 1:28:20 AM
to Ben Bond-Lamberty, Vivek Patil, Brian, ggplot2
Hello Ben, Vivek and Brian

Using the "data" argument of geom_text got me what I needed. Just FYI, each of your inputs helped in getting there! :)

data <- iris %>%
  select(Species, Sepal.Width) %>%
  group_by(Species) %>%
  summarize(IQR = IQR(Sepal.Width),
            Q1 = quantile(Sepal.Width, 0.25),
            Q3 = quantile(Sepal.Width, 0.75)) %>%
  inner_join(select(iris, Species, Sepal.Width)) %>%
  mutate(Flag = (Sepal.Width < (Q1 - 1.5 * IQR)) | (Sepal.Width > (Q3 + 1.5 * IQR))) %>%
  filter(Flag)
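
The message stops at building the data frame. A sketch of how the flagged rows might then be passed to geom_text follows. It condenses the same per-species fence calculation and is self-contained. The names `fences` and `flagged` and the `hjust` value are illustrative choices, not taken from the thread.

```r
library(dplyr)
library(ggplot2)

# Per-species quartiles and interquartile range.
fences <- iris %>%
  group_by(Species) %>%
  summarize(Q1 = quantile(Sepal.Width, 0.25),
            Q3 = quantile(Sepal.Width, 0.75),
            IQR = IQR(Sepal.Width))

# Keep only the rows that fall outside their own species' fences.
flagged <- iris %>%
  inner_join(fences, by = "Species") %>%
  filter(Sepal.Width < Q1 - 1.5 * IQR | Sepal.Width > Q3 + 1.5 * IQR)

# Boxes use the full data; labels use only the flagged rows.
ggplot(iris, aes(Species, Sepal.Width)) +
  geom_boxplot() +
  geom_text(data = flagged, aes(label = Sepal.Width), hjust = -0.5)
```

Passing a separate data frame to geom_text keeps the boxplot layer on the full data while restricting the labels to the outliers.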

Thanks again,
Harish