I’m having some problems with something I think should be really simple.
I’d like to get rid of the horizontal grid lines in my plot. Here’s an
example:
library(lattice)
library(ggplot2)
qplot(yield,site,data=barley)
Now, I *can* remove the horizontal grid lines by adding
scale_y_discrete(breaks=NA), but this also removes the labels in the
left margin. How can I just remove the horizontal grid lines, and keep
the vertical grid lines and everything else the same?
(My real plot uses geom="tile", where removing the horizontal lines
makes much more sense than for this dotplot example.)
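A minimal sketch of one way to do this, assuming a ggplot2 version whose theme system has per-axis grid elements (panel.grid.major.y and panel.grid.minor.y); this was not available when the question was asked. The labels in the left margin are left alone because only the theme is changed, not the scale.

```r
library(lattice)   # provides the barley data set
library(ggplot2)

# Assumption: the installed ggplot2 supports per-axis grid theme elements.
p <- ggplot(barley, aes(yield, site)) +
  geom_point() +
  theme(
    panel.grid.major.y = element_blank(),  # drop horizontal major grid lines
    panel.grid.minor.y = element_blank()   # drop horizontal minor grid lines
  )
print(p)
```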
--
Karl Ove Hufthammer
Actually, for this example it would be even better to move the grid
lines so that they fall midway between two categories on the vertical
axis, so that they would border the tiles. Here’s a simple example:
n=30
k=10
let=sample(letters[1:k],n,replace=TRUE)
years=sample(1998:2002,n,replace=TRUE,prob=c(1,2,5,10,3))
d=data.frame(let,years)
qplot(years,let,data=d,geom="tile")
I want the grid lines to border the tiles. For the x axis I can achieve
this by removing the major grid lines and drawing the minor ones in the
style of the major ones. But for the y axis, the minor ones are not
drawn at all. Is it possible to change this?
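The x-axis trick described above might look roughly like this in a ggplot2 version with per-axis grid theme elements (an assumption; the original post does not show its code). The explicit breaks and the name tile_edges are illustrative choices, and the width of the minor lines is left at the theme default.

```r
library(ggplot2)

set.seed(1)  # only to make the sketch reproducible
d <- data.frame(
  let = sample(letters[1:10], 30, replace = TRUE),
  years = sample(1998:2002, 30, replace = TRUE, prob = c(1, 2, 5, 10, 3))
)

# Tiles are centred on whole years, so their edges fall on the half-years.
tile_edges <- seq(1997.5, 2002.5, by = 1)

p <- ggplot(d, aes(years, let)) +
  geom_tile(width = 1) +
  scale_x_continuous(breaks = 1998:2002, minor_breaks = tile_edges) +
  theme(
    panel.grid.major.x = element_blank(),                # hide lines through tile centres
    panel.grid.minor.x = element_line(colour = "white")  # show lines at the tile edges
  )
print(p)
```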
--
Karl Ove Hufthammer
Rather than moving the grid lines, I'd move the boxes:
ggplot(d) + geom_rect(aes(xmin = years, xmax = years + 1,
ymin = as.numeric(factor(let)), ymax = as.numeric(factor(let)) + 1))
Hadley
--
http://had.co.nz/
Thanks, but now the labels on y axis aren’t centred on the boxes
anymore, which makes the graphic a bit hard to read.
For now, I use a solution which works fine if I only need vertical lines:
Just get rid of all the grid lines, and then draw vertical lines
manually, using geom_vline.
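A rough sketch of that workaround, assuming a current ggplot2 where theme(panel.grid = element_blank()) removes every grid line. The variable edges and the white line colour are illustrative choices, not taken from the original post.

```r
library(ggplot2)

set.seed(1)  # only to make the sketch reproducible
n <- 30
k <- 10
let <- sample(letters[1:k], n, replace = TRUE)
years <- sample(1998:2002, n, replace = TRUE, prob = c(1, 2, 5, 10, 3))
d <- data.frame(let, years)

# Tiles are centred on whole years, so their edges sit at the half-years.
edges <- seq(min(d$years) - 0.5, max(d$years) + 0.5, by = 1)

p <- ggplot(d, aes(years, let)) +
  geom_vline(xintercept = edges, colour = "white") +  # drawn first, so behind the tiles
  geom_tile(width = 1) +
  theme(panel.grid = element_blank())                 # remove all built-in grid lines
print(p)
```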
I have one more question regarding this type of plot, though. I have
several data sets of this form, but with varying numbers of levels of the
‘let’ factor, giving me different numbers of boxes vertically. What is
the best approach to ensure that the *physical* heights of the boxes are
the same when saving the graphs?
If I just use the default widths and heights, datasets with a smaller
number of levels of the ‘let’ factor will give taller boxes, and
datasets with many levels give ‘squashed’ boxes, and the graphs don’t
look very good when displayed together.
I can get an *approximate* solution by letting the height of the figure
be a multiple of the number of levels, but this doesn’t seem like a
proper solution (and doesn’t take margins and headings into account).
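For illustration, the approximate approach could be sketched as below. The helpers tile_plot_height and save_tiles are hypothetical, and per_level and extra are guesses to be tuned by eye; the constant allowance for margins and axis text is only a crude stand-in, so box heights will still not match exactly.

```r
library(ggplot2)

# Hypothetical helper: figure height in inches for a given number of levels.
# per_level and extra are guesses, not values provided by ggplot2.
tile_plot_height <- function(n_levels, per_level = 0.25, extra = 1) {
  n_levels * per_level + extra
}

d_small <- data.frame(let = c("a", "b", "c"), years = c(1998, 1999, 2000))
d_large <- data.frame(let = letters[1:12], years = rep(1998:2001, 3))

# Hypothetical helper: save one tile plot with a height scaled to its levels.
save_tiles <- function(d, file) {
  p <- ggplot(d, aes(years, let)) + geom_tile()
  h <- tile_plot_height(length(unique(d$let)))
  ggsave(file, p, width = 5, height = h)  # width and height are in inches
  h
}

h_small <- save_tiles(d_small, file.path(tempdir(), "small.pdf"))
h_large <- save_tiles(d_large, file.path(tempdir(), "large.pdf"))
```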
--
Karl Ove Hufthammer
Oh yeah, that's another good solution.
> I have one more question regarding this type of plot, though. I have
> several data sets on this form, but with varying number of levels of the
> ‘let’ factor, giving me different numbers of boxes vertically. What is
> the best approach to ensure that the *physical* heights of the boxes are
> the same when saving the graphs?
>
> If I just use the default widths and heights, datasets with smaller
> number of levels of the ‘let’ factor will give higher boxes, and
> datasets with many levels give ‘squashed’ boxes, and the graphs don’t
> look very good when displayed together.
>
> I can get an *approximate* solution by letting the height of the figure
> be a multiple of the number of levels, but this doesn’t seem like a
> proper solution (and doesn’t take margins and headings into account).
There is no good solution to this currently. I've thought up a
solution that I think will work - when I rewrite the layout system,
I'll make it possible to set the size of legends, titles etc to a
fixed amount. Then you'll be able to work out the size exactly (or if
you're really lucky, ggplot2 will come with functions to do that
built in).
Hadley
How many of the lines are what Tufte would call 'chartjunk'? I am sure Tufte would
prefer that graphs contain far fewer grid lines than most ggplot graphs contain. I
also think Tufte goes too far: his aesthetic purity of "bits of info per square inch"
sacrifices some clarity. It's not how much info you can cram onto a page, it's how much the reader
can get out, and how easily.
But how did Hadley decide on the number of grid lines?
How do we adjust it?
Peter
Peter L. Flom, PhD
Statistical Consultant
www DOT peterflomconsulting DOT com
I agree. See the following paper for an example of why you shouldn't
accept Tufte's advice uncritically:
W. A. Stock and J. T. Behrens. Box, line, and midgap plots: Effects of
display characteristics on the accuracy and bias of estimates of
whisker length. Journal of Educational Statistics, 16(1): 1–20, 1991.
And section 3 of the following paper gives some concrete reasons why
you actually want grid lines:
W. Cleveland. A model for studying display methods of statistical
graphics. Journal of Computational and Graphical Statistics, 2:
323–364, 1993. URL http://stat.bell-labs.com/doc/93.4.ps.
> But how did Hadley decide on the number of grid lines?
>
> How do we adjust it?
They just match up to the ticks on the x axis.
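Since the major grid lines follow the ticks, one way to adjust their number is to set the breaks on the scale. A small sketch, assuming a current ggplot2 where minor_breaks = NULL switches off the minor lines; the break values are arbitrary picks for the barley yields.

```r
library(lattice)   # provides the barley data set
library(ggplot2)

x_breaks <- seq(20, 60, by = 20)  # fewer ticks, so fewer major grid lines

p <- ggplot(barley, aes(yield, site)) +
  geom_point() +
  scale_x_continuous(
    breaks = x_breaks,
    minor_breaks = NULL  # no minor grid lines between the ticks
  )
print(p)
```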
Hadley