geom_jitter mystery- Please help


J H

Mar 6, 2014, 6:15:00 PM
to ggplot2
All, now that I know it is better to supply an example of my problem, please help me resolve a mystery I am running into.  My code is below, along with a sample data set and two pics.  I am new to ggplot2 and R, so perhaps I am doing something wrong.  Here are my 2 issues:

- Pic 1 depicts State vs. Sales.  I am using jitter, and you can see that the boxplot outliers still appear.  What is the best way to not show these?  Should I use different code?
- Pic 2 is the mystery.  I now plot State vs. Sales2, for which the sales numbers are all true zeros, yet the points are still jittered away from zero.  How can I resolve this?

Please assist as I have not been successful.  Thank you.

Regards,
JOH



ggplot(table, aes(x=State,y=Sales2, fill=State)) + geom_boxplot() + geom_jitter(position=position_jitter(width=0.1),size=4, alpha=0.3)

State Sales Sales2
CA 122 0
CA 13 0
CA 12 0
CA 12 0
CA 31 0
CA 2 0
CA 12 0
CA 31 0
CA 3 0
CA 12 0
NJ 3 0
NJ 123 0
NJ 1 0
NJ 2 0
NJ 12 0
NY 3 0
NY 12 0
NY 31 0
NY 2 0
NY 13 0
AZ 23 0
AZ 1 0
AZ 23 0
AZ 31 0


Inline image 1
Inline image 2

Dennis Murphy

Mar 6, 2014, 6:32:23 PM
to J H, ggplot2
Hi:

Thank you for providing data and code. I named your data DF to avoid potential conflicts with the table() function.
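
For anyone following along, the posted data can be rebuilt as DF roughly like this. It is only a sketch: the values are copied from the table in the original message, and building it with data.frame() and rep() is just one convenient way to do it.

```r
# Rebuild the posted data set as DF (values copied from the original message).
DF <- data.frame(
  State  = rep(c("CA", "NJ", "NY", "AZ"), times = c(10, 5, 5, 4)),
  Sales  = c(122, 13, 12, 12, 31, 2, 12, 31, 3, 12,  # CA
             3, 123, 1, 2, 12,                       # NJ
             3, 12, 31, 2, 13,                       # NY
             23, 1, 23, 31),                         # AZ
  Sales2 = 0                                         # every Sales2 value is zero
)

str(DF)          # quick look at the structure
table(DF$State)  # number of rows per state
```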

Problem 1:

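library(ggplot2)
# Hide the boxplot's own outlier points; the jitter layer already draws every observation
ggplot(DF, aes(x = State, y = Sales, fill = State)) +
   geom_boxplot(outlier.colour = NA) +
   geom_jitter(position = position_jitter(width = 0.1), size = 4, alpha = 0.3)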


Problem 2:

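# height = 0 turns off the vertical jitter; only the horizontal jitter of 0.1 remains
ggplot(DF, aes(x = State, y = Sales2, fill = State)) +
   geom_boxplot() +
   geom_jitter(position = position_jitter(width = 0.1, height = 0), size = 4, alpha = 0.3)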

In the first problem, setting outlier.colour = NA (and you do need the Commonwealth spelling) effectively hides the boxplot's outlier points. In the second problem, you controlled the width of the jitter but not the height. Setting height = 0 jitters the points horizontally only, with width 0.1.
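
If you want to confirm that height = 0 leaves the y values alone, one way is to inspect the data ggplot2 actually draws. This is a rough sketch, assuming a reasonably recent ggplot2 where layer_data() is available; the small DF here is a stand-in with the same group sizes as the posted data.

```r
library(ggplot2)

# Stand-in for the posted data: same group sizes, all Sales2 values zero.
DF <- data.frame(
  State  = rep(c("CA", "NJ", "NY", "AZ"), times = c(10, 5, 5, 4)),
  Sales2 = 0
)

p <- ggplot(DF, aes(x = State, y = Sales2)) +
  geom_jitter(position = position_jitter(width = 0.1, height = 0))

# layer_data() returns the data for a layer after the position adjustment.
built <- layer_data(p, 1)

# Vertical positions should be untouched when height = 0.
all(built$y == 0)

# Horizontal offset of each point from its group's centre (at most 0.1).
x_offset <- as.numeric(built$x) - round(as.numeric(built$x))
max(abs(x_offset))
```

The x offsets change from run to run because the jitter is random; call set.seed() first if you need the same picture twice.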

Dennis




J H

Mar 6, 2014, 6:38:32 PM
to Dennis Murphy, ggplot2
Dennis,
Thank you so much.  One question on problem 2.  I guess I don't understand why R would show zero values at all to begin with.  If I just use the code geom_jitter() I still get the same problem with R even though the values are true zeros.  Am I missing something? Would using height=0 artificially suppress any true >0 values?
Regards,
JOH

Dennis Murphy

Mar 7, 2014, 7:04:15 AM
to J H, ggplot2
Hi:


On Thu, Mar 6, 2014 at 3:38 PM, J H <renalcl...@gmail.com> wrote:
> Dennis,
> Thank you so much.  One question on problem 2.  I guess I don't understand why R would show zero values at all to begin with.

All the Sales2 values were zero. When you specify geom_point(), ggplot2 will plot points. It just does what you ask it to do.

> If I just use the code geom_jitter() I still get the same problem with R even though the values are true zeros.

Which problem? Code example??

> Am I missing something? Would using height=0 artificially suppress any true >0 values?

Jittering can occur both vertically and horizontally. My code suppressed the vertical jitter but left the horizontal jitter, which is why you got a pseudo-rug plot at the median of each box plot. If you had nonzero values, specifying height = 0 in geom_jitter() or position_jitter() will suppress any vertical jittering of the points, but nonzero values would still be plotted with some horizontal jitter unless you remove the jittering altogether, in which case you'll get one value plotted for Sales2 in each group at zero.

The reason your original code plotted points away from zero when Sales2 was the response is because you allowed both horizontal and vertical jitter; see ?geom_jitter for the default values. 
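
To see how much vertical jitter the defaults add, here is a small sketch. It assumes the behaviour documented in ?position_jitter for current ggplot2, where an omitted height defaults to 40% of the resolution of the data; the DF below is a stand-in with the same group sizes as the posted data.

```r
library(ggplot2)

# Stand-in for the posted data: same group sizes, all Sales2 values zero.
DF <- data.frame(
  State  = rep(c("CA", "NJ", "NY", "AZ"), times = c(10, 5, 5, 4)),
  Sales2 = 0
)

# Resolution of the y data; ggplot2 falls back to 1 when all values are equal.
res_y <- resolution(DF$Sales2)

# Only width is given, so the vertical jitter uses its default amount.
p_default <- ggplot(DF, aes(x = State, y = Sales2)) +
  geom_jitter(position = position_jitter(width = 0.1))

# y values as drawn: no longer all zero, but within 0.4 * res_y of zero.
y_default <- layer_data(p_default, 1)$y
range(y_default)
```

The exact range differs on every run because the jitter is random. Adding height = 0 to position_jitter() collapses it back to zero.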

Dennis

JOH JOH

Mar 7, 2014, 1:44:33 PM
to ggplot2
Thank you Dennis.  Makes sense.  Learning ggplot2 on the go.  I thought that if I only specify width that it would only jitter by width.