Fixing graph size

Robin Wilson

Aug 22, 2010, 6:46:14 AM

to ggplot2

Hi,

I'm using the arrange function which was previously posted on this
list to produce one image with 9 plots in it, in a 3 x 3 grid. The
file is available at http://www.rtwilson.com/ExampleGraphs.png. As you
can see, the plots are not all the same size, which makes the grid
look untidy. I'm not entirely sure why this is, and after a lot of
searching I can't find how to fix the size of the plots.

I am creating the plot by creating a number of plot objects (using
qplot() then adding a lot of themes etc) and then passing them to the
arrange function.

Regards,

Robin Wilson
University of Southampton

baptiste auguie

Aug 22, 2010, 7:17:57 AM

to Robin Wilson, ggplot2

Hi,

Each ggplot expands to the full space available, therefore the size of
the plotting region depends on the size of the text labels (and, if
any, legends). This is why you observe a misalignment between the plot
panels.

My suggestion would be to join all your datasets into one and create a
dummy facetting variable. You could then use facet_grid or facet_wrap
to get the alignment you want.

Another possibility would be to use align.plots() instead of
grid.arrange(), but note that this is really a hack.

library(ggplot2)

# create plots with unequal y axis titles
ps <- replicate(5,
                qplot(1:10, rnorm(10),
                      ylab = paste(sample(letters, sample(2:10, 1)),
                                   collapse = "", sep = "")) +
                  opts(axis.title.y = theme_text(angle = 0)),
                simplify = FALSE)

library(gridExtra)
do.call(grid.arrange, ps)

library(ggExtra)
grid.newpage()
do.call(align.plots, ps)

In your case, since you have several columns, you'd want to use a more
complicated version of align.plots that was posted on the list some
time ago (I forget by whom).

HTH,

baptiste

> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> Please provide a reproducible example: http://gist.github.com/270442
>
> To post: email ggp...@googlegroups.com
> To unsubscribe: email ggplot2+u...@googlegroups.com
> More options: http://groups.google.com/group/ggplot2
>

Robin Wilson

Aug 22, 2010, 7:31:29 AM

to ggplot2

Hi,

Thanks for the help again, Baptiste.

I was hoping I could do this through faceting, but I couldn't work out
how to do it. I understand from your post that I'll need to create a
dummy faceting variable. Could you point me to some instructions on
how to do it? At the moment I have one dataframe with a number of
columns (time, mean_len, max_len, etc). Would I need to add another
column?

Regards,

Robin

Brandon Hurr

Aug 22, 2010, 7:44:56 AM

to Robin Wilson, ggplot2

My guess would be that you would stack all of your data in a single column and then create another column (as you suggest) with a facet variable for each graph that you want. Probably best to simply put the name of the old column as the variable name so you can keep them straight. Then tweak until you're happy.

I'm sure the reshape package would do this in a few keystrokes, but I don't know quite how to do it.
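
A minimal base-R sketch of that stacking step, for illustration only. The data frame `wide` and its two measured columns are hypothetical stand-ins modelled on the column names mentioned in the thread; melt() from the reshape package is the usual shortcut for the same operation.

```r
# Hypothetical wide data frame: one shared x column (t) plus measured columns
wide <- data.frame(
  t        = 1:4,
  mean_len = c(37.15435, 42.03569, 44.77685, 43.16876),
  max_len  = c(105.5182, 105.1337, 105.3765, 110.7670)
)

# Every column except the shared x variable gets stacked
measure_cols <- setdiff(names(wide), "t")

# Build one block of rows per measured column, then bind the blocks together.
# The old column name is kept in `variable` so the panels can be told apart.
long <- do.call(rbind, lapply(measure_cols, function(col) {
  data.frame(t = wide$t, variable = col, value = wide[[col]],
             stringsAsFactors = FALSE)
}))

# Store the facet variable as a factor, keeping the original column order
long$variable <- factor(long$variable, levels = measure_cols)

str(long)
```

The resulting `long` has one row per (t, variable) pair, which is the shape that facet_wrap(~ variable) expects.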

Brandon

Brandon Hurr

Aug 22, 2010, 7:52:25 AM

to Robin Wilson, ggplot2

In fact, it appears that what you want to do is the first example of Hadley's Introduction to the Reshape package...

http://had.co.nz/reshape/introduction.pdf

-B

Dennis Murphy

Aug 22, 2010, 7:56:16 AM

to Robin Wilson, ggplot2

Hi:

Perhaps something like this, assuming that Time is the common x variable
in all your plots and that mydata contains Time and the nine responses:

m <- melt(mydata, id = 'Time')

g <- ggplot(m, aes(x = Time, y = value))
g + geom_point() + facet_wrap( ~ variable, ncol = 3, scales = 'free_y')

One of the things you'll have to deal with is getting the titles you want;
in facet_wrap, the strip will contain the names in variable; if you're a
little more clever, you could create a parallel factor to variable that
contains the titles and use it in place of variable in the facet_wrap call.
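
One way the parallel factor could be built, sketched in base R. The `variable` vector below is a hypothetical stand-in for the column that melt() produces, and the titles are illustrative. Looking titles up by name, rather than relying on row position, keeps them attached to the right rows however many rows each variable has.

```r
# Hypothetical stand-in for the `variable` column of a melted data frame
variable <- factor(c("n", "n", "mean_len", "mean_len", "max_len", "max_len"),
                   levels = c("n", "mean_len", "max_len"))

# Lookup table from column name to display title (titles are illustrative)
title_lookup <- c(n = "N", mean_len = "Mean Length", max_len = "Max Length")

# Index the lookup by name so each row receives the title of its own variable;
# the levels are taken in the same order as the levels of `variable`
titles <- factor(unname(title_lookup[as.character(variable)]),
                 levels = unname(title_lookup[levels(variable)]))

levels(titles)
```

Used as m$titles, a factor like this could replace variable in the facet_wrap call.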

HTH,
Dennis

Robin Wilson

Aug 22, 2010, 8:09:33 AM

to ggplot2

Hi Dennis,

That's very helpful. However, when I run the melt command I end up
with a dataframe with lots of NA's in it, and lots of warnings saying:

In `[<-.factor`(`*tmp*`, ri, value = c(0, 1e-06, 0, 0.000602 ... :
invalid factor level, NAs generated

Any idea what's going on here? I've tried various different ways of
specifying the arguments to the melt command, but they all seemed to
end up like this. Just to help - below is a print of my dataframe:

            name t  n mean_len total_len  max_len  min_len stdev_len
1 S5_Pe06_sand02 1 64 37.15435  2377.878 105.5182 15.04988  20.50629
2 S5_Pe06_sand03 2 49 42.03569  2059.749 105.1337 15.81665  23.61410
3 S5_Pe06_sand04 3 42 44.77685  1880.628 105.3765 16.92837  23.16211
4 S5_Pe06_sand05 4 40 43.16876  1726.750 110.7670 15.32033  22.59987

(sorry about the indentation going wrong here in Google Groups)

Thanks again,

Robin

baptiste auguie

Aug 22, 2010, 8:18:07 AM

to Robin Wilson, ggplot2

I had something like this in mind,

library(ggplot2)

# create a list with all the data.frames
ld <- llply(title.names, function(d) data.frame(x=1:10, y=rnorm(10)))
# give them a name
names(ld) <- title.names <- paste("plot", 1:4)

# combine them
m <- melt(ld, id = "x")
str(m)

ggplot(m) + facet_wrap(~L1,scales="free") +
geom_path(aes(x,value)) +
theme_bw() + ylab("")+
opts(strip.background=theme_blank())

HTH,

baptiste

baptiste auguie

Aug 22, 2010, 8:20:04 AM

to Robin Wilson, ggplot2

oops, it should be:

library(ggplot2)

ld <- replicate(4, data.frame(x = 1:10, y = rnorm(10)), simplify = FALSE)
names(ld) <- paste("plot", 1:4)

m <- melt(ld, id = "x")
str(m)

ggplot(m) + facet_wrap(~L1, scales = "free") +
  geom_path(aes(x, value)) +
  theme_bw() + ylab("") +
  opts(strip.background = theme_blank())

Dennis Murphy

Aug 22, 2010, 8:21:26 AM

to Robin Wilson, ggplot2

Hi:

Hopefully this is a little more helpful - you didn't tell me you had a factor variable in there :)
Using tst as the input data frame from your data snippet below,

m <- melt(tst, id = c('name', 't'))
m$titles <- factor(rep(c('N', 'Mean Length', 'Total Length',
                         'Max Length', 'Min Length', 'StDev Length'),
                       each = 4))
g <- ggplot(m, aes(x = t, y = value))
g + geom_point() + geom_line() +
  facet_wrap( ~ titles, ncol = 3, scales = 'free_y')

Except for the x scale at the bottom, which you already know how to remove, this seems to be OK?
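
The warning from the earlier melt call can be reproduced in base R, which hints at the likely cause: with only the time column declared as an id, melt presumably tried to stack the factor column name together with the numeric columns. The sketch below is an assumption-laden illustration, not a trace of what melt does internally; the two-level factor stands in for the name column.

```r
# Hypothetical stand-in for the factor column `name`
name <- factor(c("S5_Pe06_sand02", "S5_Pe06_sand03"))

# Collect warnings instead of printing them, so the effect can be inspected
warnings_seen <- character(0)
withCallingHandlers(
  {
    # Writing a number into a factor: the value is not one of the levels
    name[2] <- 37.15435
  },
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warnings_seen   # the "invalid factor level" warning
is.na(name[2])  # the assigned element has become NA
```

Declaring the factor column as an id variable, as in id = c('name', 't') above, keeps it out of the stacked values.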

HTH,
Dennis

Robin Wilson

Aug 22, 2010, 9:26:03 AM

to ggplot2

Thank you everyone for your help. I have now got my plot almost
exactly how I want it - and it looks so much nicer than it did before!

Just one more question: Is there any way to control the order of my
plots in the facet grid? For example, if I want the Number of Dunes
plot to be first, and Mean Length to be next to Max Length etc. Would
that require some manual editing of the dataframe?

Regards,

Robin

Dennis Murphy

Aug 22, 2010, 9:34:47 AM

to ggplot2

Hi:

Probably the easiest thing to do would be to reorder the column locations in the original data frame before you melt it, something like

mydata <- mydata[, c(2, 3, 1, 4, 8, 5, 7, 6)]

One line - no muss, no fuss :)

HTH,
Dennis

Robin Wilson

Aug 22, 2010, 1:15:38 PM

to ggplot2

Thanks for the idea Dennis. Sadly, I can't seem to get that to work.
Whatever I do - whether it's using the syntax you gave me, or other
ideas I've found (like the subset command), it doesn't seem to change
the order of the plots in the grid.

I've posted my entire r file below, in case that helps.

library(ggplot2)
library(reshape)

df = read.csv("D:\\results.csv", header=T)

df <- df[,-match("z_score",names(df))]
df <- df[,-match("p_value",names(df))]
df <- df[,-match("min_len",names(df))]

names(df)

m <- melt(df, id = c('name', 't'))

m$titles <- factor(rep(c('No of dunes', 'Mean Length', 'Total Length',
                         'Max Length', 'StDev Length', 'Mean Closeness',
                         'StDev Closeness', 'Defect Density', 'NN R-score'),
                       each = 4))

g <- ggplot(m, aes(x = t, y = value))
g + geom_point() + geom_line() +
  facet_wrap( ~ titles, ncol = 3, scales = 'free_y') +
  theme_bw() + ylab("") + xlab("Time") +
  opts(strip.background = theme_blank()) +
  scale_x_continuous(breaks = 0:4, labels = "") +
  opts(axis.title.x = theme_text(size = 10, vjust = 2.5, hjust = 0.5)) +
  opts(title = "Standard DECAL results")

Regards,

Robin

Dennis Murphy

Aug 22, 2010, 1:48:58 PM

to Robin Wilson, ggplot2

Hi:

OK, this will teach me not to post before I test. Oy.

Given the test data tst from an earlier e-mail of yours,

m <- melt(tst, id = c('name', 't'))

This time, we create an ordered factor whose ordering we specify
in levels = and whose labels are given in labels = :

m$titles <- ordered(m$variable,
                    levels = c('min_len', 'mean_len', 'max_len',
                               'total_len', 'stdev_len', 'n'),
                    labels = c('Min Length', 'Mean Length', 'Max Length',
                               'Total Length', 'StDev Length', 'N'))

In ordered(), levels sets the ordering by name and labels are what
should be seen when the factor is used.

The expectation is that the plots will run from upper left to lower right,
row-wise, according to the ordered levels set in the above call...

g <- ggplot(m, aes(x = t, y = value))
g + geom_point() + geom_line() +
  facet_wrap( ~ titles, scales = 'free_y', ncol = 3)

It works for me...hope it's the same on your end!
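
A short base-R check of why the levels matter, using three illustrative titles. When factor() is given a character vector with no levels argument it sorts the distinct values, so the order in which the titles were typed (or the column order before melting) is lost unless the levels are spelled out.

```r
# Titles in the order the panels are wanted (illustrative values)
titles <- c("No of dunes", "Mean Length", "Max Length")

# Without `levels`, factor() sorts the distinct values
default_order <- factor(titles)
levels(default_order)

# With `levels`, the stated order is preserved
explicit_order <- factor(titles, levels = titles)
levels(explicit_order)
```

Panels in facet_wrap follow the level order, which is why setting levels explicitly changes the layout.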

Dennis

Robin Wilson

Aug 22, 2010, 5:02:49 PM

to ggplot2

Hi Dennis,

Thanks for that - it works perfectly.

I've now got exactly the graph I want - so thank you to everyone who
helped. I'm planning to do a bit more research into this, and some
playing around, and then write it up on my blog so that others can
benefit from it.

Thanks again,

Robin

Dennis Murphy

Aug 22, 2010, 5:32:38 PM

to Robin Wilson, ggplot2

Hi:

Since you intend to post a summary of this exchange on your blog, let me make one extended remark. In response to a post here a couple of weeks ago, Hadley chimed in to observe that one can save oneself a lot of unnecessary work in ggplot2 by doing the upfront work outside of ggplot2 and then using the basic tools to render the plot. It may seem like an innocuous statement, but it is very true.

R has many remarkably efficient ways to handle and process data, and it pays to take advantage. Not only is Hadley an author or co-author of several graphics packages in R (e.g., ggplot2, tourr, GGally), but he is also a (co-)author of several packages re data manipulation (plyr, reshape, stringr and the new lubridate, for example). They are meant to work hand in hand.

IMHO, it pays to think about what you want your graphic to look like, which inputs will produce it and how to manipulate the data into a form that meshes easily with the graphics package of choice - i.e., design the graphic beforehand. When I take the time to do this, it sometimes amazes me what can be accomplished with judiciously chosen code. Not only does the graph look nice, but the code is readable and readily understandable, too. That's an important aspect of the end product.

Regards,
Dennis

Hadley Wickham

Aug 23, 2010, 2:18:08 PM

to Dennis Murphy, Robin Wilson, ggplot2

> IMHO, it pays to think about what you want your graphic to
> look like, which inputs will produce it and how to manipulate the data into
> a form that meshes easily with the graphics package of choice - i.e., design
> the graphic beforehand. When I take the time to do this, it sometimes amazes
> me what can be accomplished with judiciously chosen code. Not only does the
> graph look nice, but the code is readable and readily understandable, too.

Totally agreed. A big part of becoming fluent in R is being able to
choose a way of storing your data that makes it easy to do what you
want with it. I think I'm slowly converging on a consistent philosophy
of data that helps you figure this out, and also provides the tools
you need to work with the data.

Hadley

--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/

Robin Wilson

Aug 23, 2010, 2:25:11 PM

to ggplot2

Hi all,

I'm gradually starting to realise that what Hadley and Dennis say is very
true. However, as a relative beginner to R, ggplot2, reshape etc, I'm
finding it difficult to know how best to store and manipulate data.
Has anyone posted any guidelines anywhere about how to do this? Or how
to learn to do this?

I'm happy to document what I learn as I learn it, but I was wondering
if there was anything that could kick-start me?

Best regards,

Robin

Brandon Hurr

Aug 23, 2010, 2:31:42 PM

to Robin Wilson, ggplot2

I think I'm in your shoes as well Robin. I've been learning R and ggplot, plyr as I go and it's been a struggle at times. Hadley has written up a plyr manual that is available online, but most of the time I just try and use it and figure out how and why I've broken it. Takes some time, but I think I learn more that way.

Beyond that, I'd be interested in any other materials/examples as well.

Brandon

Hadley Wickham

Aug 23, 2010, 7:26:24 PM

to Brandon Hurr, Robin Wilson, ggplot2

> I think I'm in your shoes as well Robin. I've been learning R and ggplot,
> plyr as I go and it's been a struggle at times. Hadley has written up a
> plyr manual that is available online, but most of the time I just try and
> use it and figure out how and why I've broken it. Takes some time, but I
> think I learn more that way.
> Beyond that, I'd be interested in any other materials/examples as well.

I'd also recommend the reshape paper
(www.jstatsoft.org/v21/i12/paper), particularly the earlier part
before it gets to casting. I'm thinking my next book will be more
about storing and working with data in R.

Mark Connolly

Aug 24, 2010, 7:36:30 AM

to Hadley Wickham, Brandon Hurr, Robin Wilson, ggplot2

I hesitate somewhat to bring this up since most people aren't
necessarily keen on yet another set of things to learn, but a relational
database can be a very powerful tool for managing and manipulating
data. Not all problems need (or want) the formalization of a database
design, but a little standard normalization of data and a solid schema
definition can pay dividends, especially if the database can grow and be
re-used. Getting poorly structured data out of spreadsheets and into a
database can be very liberating. Plenty of freely available tools and
plenty of references exist to help the process along.

Other than rendering the data with a fine ggplot2 plot, there are few
things as gratifying as adding a view to an existing set of tables and
bringing the data into R with a very simple query against that view --
the query encapsulated into a nicely re-usable function, of course.

That being said, melt can be far more convenient than writing a union
query, so the RDBMS would be complementary to the packages of manipulators.

Robin Wilson

Aug 26, 2010, 5:17:12 PM

to ggplot2

Hi,

Thanks to all those who helped with this problem. I've written up a
summary of my 'journey of discovery' on my blog at
http://blog.rtwilson.com/2010/08/26/producing-grids-of-plots-in-r-with-ggplot2-a-journey-of-discovery/

I hope others will find it useful, and comments are most welcome!

Regards,

Robin
