Suppress stat_bin Message About Default Width

Dario Strbenac

Jul 3, 2014, 5:00:14 AM
to ggp...@googlegroups.com
I am writing an R Markdown document and I can't suppress the message from stat_bin: "stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this." How can I make my report neater? Also, is there any convenience syntax to specify something like range/50? I've only seen examples where people give a constant number to binwidth. It would be desirable if there were some nice syntax.

Richard Zijdeman

Jul 3, 2014, 5:29:09 AM
to ggp...@googlegroups.com
Take the hint very literally: for example, add 'binwidth = .2' to your code. To illustrate, just run the code below:

library(ggplot2)
df <- data.frame(x = rnorm(n = 1000, mean = 5, sd = 1.5)) # create data
ggplot(df, aes(x=x)) + geom_bar(stat = "bin") # replicate the message
ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth = .2) # example of no message
range.value <- (range(df$x)[2] - range(df$x)[1])/50 # calculate range value
# remember: the 'range' function provides the lower and upper limit, so we
# subtract the first, lower value [1] from the second, higher value [2]
ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth = range.value) # or as one-liner:
ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth = (range(df$x)[2] - range(df$x)[1])/50)

Best wishes,

Richard

Brian Diggs

Jul 3, 2014, 3:50:59 PM
to Dario Strbenac, ggplot2
On 7/3/2014 2:00 AM, Dario Strbenac wrote:
> I am writing a R markdown document and I can't suppress the message from
> stat_bin "stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to
> adjust this." How can I make my report neater ?

You cannot stop ggplot from emitting the message, but you can keep
it from appearing in the markdown document (I assume you are using
knitr). Add message=FALSE to the chunk options and the message will be
kept out of the markdown.
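
A minimal sketch of the knitr side, assuming the report is rendered with knitr. The option can go in a single chunk header, or be set once for the whole document as below. It only hides the message in the rendered output; ggplot still emits it.

```r
# Per chunk: put the option in the chunk header of the R Markdown file, i.e.
#   {r, message=FALSE}
#
# For the whole document: run this once in an early (setup) chunk so that
# every later chunk drops message() output from the rendered report.
knitr::opts_chunk$set(message = FALSE)
```

Warnings are controlled separately (warning=FALSE), so real warnings still show up unless they are switched off as well.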

> Also, is there any convenience syntax to specify something like
> range/50. I've only seen examples were people give a constant number
> to binwidth. It would be desirable if there was some nice syntax.

You can specify the width of bins (with binwidth) or the specific breaks
(with breaks), but there is no syntax for specifying just the number of
breaks (in a manner analogous to the default). You would have to
pre-compute one of the other two values.
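
A rough sketch of that pre-computation in base R. The two helper names are made up for illustration and are not part of ggplot2. (As far as I know, later ggplot2 releases added a bins argument to geom_histogram that covers this case directly.)

```r
# Illustrative helper (hypothetical name): the bin width that splits the
# observed range of x into n equal-width bins.
binwidth_for_bins <- function(x, n) {
  diff(range(x, na.rm = TRUE)) / n
}

# Illustrative helper (hypothetical name): the n + 1 break points that
# delimit those n bins.
breaks_for_bins <- function(x, n) {
  r <- range(x, na.rm = TRUE)
  seq(r[1], r[2], length.out = n + 1)
}

set.seed(5)
x <- rnorm(1000, mean = 5, sd = 1.5)
bw <- binwidth_for_bins(x, 50)
br <- breaks_for_bins(x, 50)

# With ggplot2 loaded, either value could then be handed to the plot, e.g.
# ggplot(data.frame(x = x), aes(x = x)) + geom_histogram(binwidth = bw)
# ggplot(data.frame(x = x), aes(x = x)) + geom_histogram(breaks = br)
```

Passing breaks pins the bin edges exactly, whereas binwidth leaves the placement of the first edge to ggplot.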

I can see this as a useful feature, though. It could be implemented as,
if the length of breaks is 1, then that is the number of breaks there
should be. I don't think that would break anything (since a single break
can't produce any bins anyway), but I'm also leery of that because some
functions that have similar behavior (like sample) can give headaches
when the behavior is triggered unintentionally. Alternatively, a
separate argument could be created (with defined precedence with regard
to binwidth and breaks).
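
For anyone who has not been bitten by the sample() behaviour referred to above, a small base R illustration of how a length-1 argument silently changes meaning:

```r
set.seed(1)

pool <- c(7, 9)
sample(pool)   # shuffles the two values 7 and 9

pool <- 9      # the pool shrinks to a single number...
sample(pool)   # ...and sample() now shuffles 1:9 instead of returning 9
```

A length-1 breaks value meaning "number of breaks" would carry the same kind of risk whenever breaks is computed rather than typed in by hand.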

--
Brian S. Diggs, PhD
Senior Research Associate, Department of Surgery
Oregon Health & Science University

Sean O'Riordain

Jul 4, 2014, 1:24:42 AM
to Brian Diggs, Dario Strbenac, ggplot2
Hi Brian,

I agree with Dario. I normally plot for EDA (and rarely for journal publication), and I am quite happy with 30 bins for this purpose. What I don't like is the "we know better than you so you'd better fix those binwidths" message, which is what it feels like. I would appreciate it if it were possible to turn off the message without having to go to the trouble of tediously calculating what ggplot2 is already calculating! Sometimes I use sink() to capture the results of an analysis, and obviously I don't have the "message=FALSE" option there.

Many thanks,
Sean






Brian Diggs

Jul 4, 2014, 2:34:11 AM
to Sean O'Riordain, Dario Strbenac, ggplot2
On 7/3/14 10:24 PM, Sean O'Riordain wrote:
> Hi Brian,
>
> I agree with Dario, I normally want to do EDA plotting (and rarely for
> journal publication) and I am quite happy with 30 bins for this purpose,
> but I don't like having the "we know better than you so you'd better fix
> those binwidths" message... which is what it feels like... so I would
> appreciate it if it was possible to turn off the message without having
> to go to the trouble of tediously calculating what ggplot2 is already
> calculating! Sometimes I use sink() to capture the results of analysis
> and obviously I don't have the "message=FALSE" option here.

There is not an option within ggplot to suppress that, but you could
always just wrap the print call (you may have to make this explicit) in
a suppressMessages call:

suppressMessages(print(ggplot(mtcars, aes(mpg)) + geom_bar()))

Whether that is uglier (or more of a nuisance) than the message is your
call.
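
If the wrapping gets repetitive, it could be tucked into a small helper. This is only a sketch: quiet_print is a made-up name, and the "noisy_plot" class below is a base R stand-in for a plot whose print method emits a message, so the example runs without ggplot2 installed.

```r
# Hypothetical helper (not part of ggplot2): print an object while
# discarding any message() output produced along the way.
quiet_print <- function(p) {
  suppressMessages(print(p))
  invisible(p)
}

# Stand-in object whose print method emits a message, mimicking what
# happens when a histogram is printed without an explicit binwidth.
noisy <- structure(list(), class = "noisy_plot")
print.noisy_plot <- function(x, ...) {
  message("stat_bin: binwidth defaulted to range/30.")
  invisible(x)
}

print(noisy)        # the message is shown
quiet_print(noisy)  # nothing is shown

# With ggplot2 loaded, the helper would be used the same way, e.g.
# quiet_print(ggplot(mtcars, aes(mpg)) + geom_histogram())
```

Since this does not depend on knitr, it also works in a plain script or together with sink(). Note that it silences every message raised while printing, not only the binwidth one.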


Richard Zijdeman

Jul 4, 2014, 6:09:05 AM
to ggp...@googlegroups.com, sea...@acm.org, dario.garvan-Re5J...@public.gmane.org
I don't want to be more Catholic than the pope (if that's an international expression, and please I am not making any statement about this or any other religion), but:

The amount of discussion that goes into how to suppress the warning is so much more than the energy needed to get things done properly and avoid the warning altogether.

The degree of binning is crucial for how your distribution is displayed: you're basically aggregating your data. If you get it wrong, you will misinterpret the shape of your distribution. Therefore, before you plot your distribution, think about what a reasonable degree of binning is.

Below is some code to illustrate what happens to the shape of the distribution if you change the binning. Note that these ought to be normally distributed data, and the distortion would be worse if you had skewed data.

library(ggplot2)
set.seed(5)
df <- data.frame(x = rnorm(n = 1000, mean = 5, sd=1.5)) # create data

# create histograms with different bins: p1, p2, p3
# the second line in each plot adds title and fixes x-axis between plots
p1 <- ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth =  .1) +
  ggtitle("binwidth = .1") + scale_x_continuous(limits = c(0,10)) 

p2 <- ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth = 2) +
  ggtitle("binwidth = 2") + scale_x_continuous(limits = c(0,10))

p3 <- ggplot(df, aes(x=x)) + geom_bar(stat = "bin", binwidth = 3) +
  ggtitle("binwidth = 3") + scale_x_continuous(limits = c(0,10))

library(gridExtra) # needed to combine the plots
grid.arrange(p1,p2,p3)

Best,

Richard