I'd like to have a go at creating a custom geom (say, ellipses with
varying aspect ratio, angle, fill, size, and colour). I can try to
copy the code of an existing geom (say, points) and change the name,
but it doesn't seem to be enough. Is there a document that describes
the different pieces of code involved in a ggplot and the minimal
required steps to follow? I don't mind trial and error, but
(a) ggplot2 is quite a large piece of code; (b) I have no knowledge
whatsoever of the proto framework, which makes it much harder to track
the links between different pieces of code.
Thanks,
baptiste
> It's almost there for the geom part, but I'm struggling with mapping
> and recycling rules. I generalised my initial idea of ellipses to
> regular polygons with arbitrary number of sides (3 to 12, and 50 for
> smooth paths).
What in particular are you having problems with? I can see a couple
of things that you could change for better compatibility with the rest
of ggplot2:
* use angle instead of theta
* make sure size is interpreted in mm
* when a geom has both colour and fill, alpha should be applied to the fill
I can also see that you'll need some new scales:
* scale_angle should scale to 0, 2 pi
* scale_sides should be a discrete scale
* I'm not sure about the aspect ratio
You'll also need to implement draw_legend eventually, but that should
be pretty simple.
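As a rough illustration of the "scale to 0, 2 pi" idea, here is a minimal base-R sketch of a linear rescaling. The name rescale_angle is hypothetical and is not part of ggplot2; a real scale_angle would wrap something like this in a Scale object.

```r
# Hypothetical helper (not a ggplot2 function): linearly map the
# observed range of x onto [0, 2*pi].
rescale_angle <- function(x, to = c(0, 2 * pi)) {
  rng <- range(x, na.rm = TRUE)
  # A constant input has no range to stretch; use the midpoint instead.
  if (diff(rng) == 0) return(rep(mean(to), length(x)))
  (x - rng[1]) / diff(rng) * diff(to) + to[1]
}

rescale_angle(c(10, 20, 30))
```

The smallest value maps to 0 and the largest to 2 pi, with everything else interpolated linearly in between.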
Hadley
On Mon, May 11, 2009 at 2:42 PM, hadley wickham <h.wi...@gmail.com> wrote:
> What in particular are you having problems with?
say, I want to map the color of the diamonds to the number of sides of
a polygon,
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
d <- ggplot(dsmall, aes(carat, price))
str(dsmall)
d + geom_ngon(aes(fill = carat, sides=as.numeric(color)),
colour="orange",ar=1, size=5, angle=pi/3)
This works, but I shouldn't need as.numeric, and it should also work
for numeric values that are not directly equal to the number of sides
(say, non-integer values as opposed to factor codes). I'm missing a
mapping function somewhere, with different methods for factors,
integers, numeric values. I'm not sure where this transformation
should be written?
> I can see a couple
> of things that you could change for better compatibility with the rest
> of ggplot2:
>
> * use angle instead of theta
OK
> * make sure size is interpreted in mm
I don't really understand the sizes yet, but I'll have a look.
> * when a geom has both colour and fill, alpha should be applied to the fill
>
OK
> I can also see that you'll need some new scales:
>
> * scale_angle should scale to 0, 2 pi
> * scale_sides should be a discrete scale
I guess something like this would be a start,
ScaleSides <- proto(ScaleDiscrete, expr={
  doc <- TRUE
  common <- NULL
  .input <- .output <- "sides"
  output_set <- function(.) c("3", "4", "5", "6", "7", "8",
    "0")[seq_along(.$input_set())]
  max_levels <- function(.) 9
  detail <- "<p></p>"
  # Documentation -----------------------------------------------
  objname <- "sides"
  desc <- "Scale for polygon sides"
  icon <- function(.) {
    #
  }
  examples <- function() {
    #
    # See scale_manual for more flexibility
  }
})
> * I'm not sure about the aspect ratio
It should be continuous, although I'm not convinced a distortion makes
sense for all shapes (it does for ellipses and squares (rectangles)).
>
> You'll also need to implement draw_legend eventually, but that should
> be pretty simple.
>
Yep, should be OK.
Thanks again,
baptiste
Yes, the basics are pretty easy - it's all the edge cases that are hard.
> say, I want to map the color of the diamonds to the number of sides of
> a polygon,
>
>
> dsmall <- diamonds[sample(nrow(diamonds), 100), ]
> d <- ggplot(dsmall, aes(carat, price))
> str(dsmall)
> d + geom_ngon(aes(fill = carat, sides=as.numeric(color)),
> colour="orange",ar=1, size=5, angle=pi/3)
>
> This works, but I shouldn't need as.numeric, and it should also work
> for numeric values that are not directly equal to the number of sides
> (say, non-integer values as opposed to factor codes). I'm missing a
> mapping function somewhere, with different methods for factors,
> integers, numeric values. I'm not sure where this transformation
> should be written?
So this brings you to the next type of ggplot2 object you might want
to create - scales. You basically want to create
scale_sides_discrete and scale_sides_continuous to deal with two of
the possible types of data.
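To make the discrete/continuous split concrete, here is a small sketch of the kind of mapping each scale would perform, written as a plain S3 generic rather than as proto scale objects. The name map_sides and the default range of 3 to 12 sides are assumptions for illustration only; they are not ggplot2 API.

```r
# Hypothetical generic (not ggplot2 API): map data values to a number
# of polygon sides, with one method per type of data.
map_sides <- function(x, ...) UseMethod("map_sides")

# Discrete data: each factor level gets its own number of sides.
map_sides.factor <- function(x, sides = 3:12, ...) {
  stopifnot(nlevels(x) <= length(sides))
  sides[as.integer(x)]
}

# Continuous data: rescale the observed range onto the available
# side counts and round to the nearest one.
map_sides.numeric <- function(x, sides = 3:12, ...) {
  rng <- range(x, na.rm = TRUE)
  if (diff(rng) == 0) return(rep(sides[1], length(x)))
  idx <- round((x - rng[1]) / diff(rng) * (length(sides) - 1)) + 1
  sides[idx]
}

map_sides(factor(c("D", "E", "D")))
map_sides(c(0, 9, 3))
```

With something along these lines inside the scales, the as.numeric() call in the aes() mapping would no longer be needed.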
> ScaleSides <- proto(ScaleDiscrete, expr={
> doc <- TRUE
> common <- NULL
> .input <- .output <- "sides"
>
> output_set <- function(.) c("3", "4", "5", "6", "7", "8",
> "0")[seq_along(.$input_set())]
> max_levels <- function(.) 9
>
> detail <- "<p></p>"
>
> # Documentation -----------------------------------------------
>
> objname <- "sides"
> desc <- "Scale for polygon sides"
>
> icon <- function(.) {
> #
> }
>
> examples <- function() {
> #
> # See scale_manual for more flexibility
> }
>
> })
Again, you'll need
scale_sides <- ScaleSides$build_accessor()
When adding a default scale (e.g. when you use sides as an aesthetic),
ggplot2 categorises the type of data into date, datetime, continuous
and discrete. It then looks for (e.g.) a scale named
scale_sides_discrete or scale_sides_continuous. If it doesn't find a
particular type of scale, it uses the generic, scale_sides.
So if you want different behaviour for continuous and discrete values,
then you need to create scale_sides_continuous and
scale_sides_discrete.
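The lookup rule described above can be sketched in a few lines of base R. This is only an illustration of the naming convention, not ggplot2's actual implementation; find_default_scale is a hypothetical name.

```r
# Hypothetical sketch of the lookup rule (not ggplot2's real code):
# classify the data, then prefer scale_<aes>_<type> and fall back to
# the generic scale_<aes>.
find_default_scale <- function(aesthetic, x, env = parent.frame()) {
  if (inherits(x, "Date")) {
    type <- "date"
  } else if (inherits(x, "POSIXt")) {
    type <- "datetime"
  } else if (is.numeric(x)) {
    type <- "continuous"
  } else {
    type <- "discrete"
  }
  candidates <- c(
    paste("scale", aesthetic, type, sep = "_"),
    paste("scale", aesthetic, sep = "_")
  )
  for (name in candidates) {
    if (exists(name, envir = env, mode = "function")) return(name)
  }
  NULL  # no suitable scale found
}
```

If only scale_sides is defined, both factors and numeric values end up with it; defining scale_sides_continuous as well changes the result for numeric data only.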
Hadley
There are many bits of code I borrowed from other geoms that I don't
understand, but the basic features are somewhat miraculously working.
I've put the code on r-forge,
http://r-forge.r-project.org/projects/ggplot-add-ons/
and I'll try to polish it soon (legends, doc, all scales, etc.)
The basic example looks like,
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
d <- ggplot(dsmall, aes(carat, price))
str(dsmall)
library(ggplotpp)
d + geom_ngon(aes(fill = carat, sides=color), colour="orange",ar=1,
size=5, angle=pi/3)
d + geom_ngon(aes(fill = carat, angle = x, ar=y, size=depth),
sides=50) # ellipses
best wishes,
baptiste
--
Great :) Thanks for being the brave first person to write their own
geom. Hopefully it will get easier with practice (and as the
documentation improves)
> I've put the code on r-forge,
>
> http://r-forge.r-project.org/projects/ggplot-add-ons/
>
> and i'll try to polish it soon (legends, doc, all scales, etc.)
Let me know if you need any help.
Hadley
Is there something else to take care of?
Thanks,
baptiste
--
On a side note I was looking at the grImport package and the
my.symbols function from TeachingDemos --- there might be some
inspiration for a new geom (admittedly for fairly specific purposes).
I wonder whether it would be possible to have a geom take an arbitrary
grob as an argument passed to the draw function (better yet, a
function creating a new grob).
Something along those lines,
GeomCustom <- proto(Geom, {
  objname <- "custom"
  desc <- "custom"
  draw <- function(., data, scales, coordinates, customGrob, ...) {
    with(coordinates$transform(data, scales),
      ggname(.$my_name(),
        customGrob(x, y, size, col=alpha(colour, alpha), ...))
    )
  }
  required_aes <- c("x", "y")
  default_aes <- function(.)
    aes(size=1, colour=NA, alpha = 1, customGrob= pointsGrob)
  default_stat <- function(.) StatIdentity
  guide_geom <- function(.) "polygon"
  examples <- function() {
    #
  }
})
Thanks,
baptiste
On Tue, May 12, 2009 at 2:11 PM, Hadley Wickham <had...@rice.edu> wrote:
> You need to make sure you have
> LazyLoad: false
> in your DESCRIPTION.
>
> For your simple case, you shouldn't need to worry about having zzz.r
> and a collate field, just make sure each accessor function appears
> after the geom definition.
>
> You may also need an explicit require("ggplot2") call.
>
> Hadley
Yes, but it couldn't be an aesthetic (unless you want to create a
scale that maps from a data value to a function!). Just do:
draw <- function(., data, scales, coordinates, customGrob = pointsGrob, ...) {
  with(coordinates$transform(data, scales),
    ggname(.$my_name(),
      customGrob(x, y, size, col=alpha(colour, alpha), ...))
  )
}
and leave it out of the aesthetics. That makes customGrob a parameter
of the geom.
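Outside of ggplot2, the same pattern (a grob-creating function passed as an ordinary argument with a default) can be tried with grid alone. This is a simplified sketch: draw_custom and squareGrob are hypothetical names, and the size is assumed to be given in mm.

```r
library(grid)

# Hypothetical stand-in for the geom's draw method: the grob
# constructor is a plain function argument, defaulting to pointsGrob.
draw_custom <- function(x, y, size, colour, customGrob = pointsGrob, ...) {
  customGrob(x, y, size = unit(size, "mm"), gp = gpar(col = colour), ...)
}

# Any function with a compatible signature can be swapped in.
squareGrob <- function(x, y, size, gp, ...) {
  rectGrob(x, y, width = size, height = size, gp = gp, ...)
}

g1 <- draw_custom(0.5, 0.5, size = 5, colour = "red")
g2 <- draw_custom(0.5, 0.5, size = 5, colour = "red", customGrob = squareGrob)
```

Because customGrob is a parameter rather than an aesthetic, it is the same for every row of the data and never goes through a scale.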
I have two other similar geoms on my to do list:
* geom_ellipse and matching stat_normal_ellipse similar to
car::data.ellipse. But how should the ellipse be parameterised? x,
y, a, b, angle? xmin, ymin, xmax, ymax, angle? centre, cov?
* sunflowers, e.g. http://www.stata.com/meeting/3nasug/sunflower.pdf
Well, both are quite exciting ideas for two complementary scenarios,
1) for kids who want little snoopy characters instead of boring
symbols in a plot (everyone knows kids like to attend scientific
conferences). For a grown-ups presentation, you may have to represent
data where the peanuts characters correspond to factors (snoopy,
charlie brown, etc.). Here you'll need a function that selects the
appropriate grob for each level, but these can all be hard-coded in
advance.
2) the my.symbols example (TeachingDemos) with a function passed as a
symbol: here each symbol is a little sparkline graph that needs to be
created from the data.
Another case is the sunflower plot that seems to require a different
grob for each number of observations: it could also fit in this second
scheme to prevent the user from having to create a new geom for each
new idea.
> I have two other similar geoms on my to do list:
>
> * geom_ellipse and matching stat_normal_ellipse similar to
> car::data.ellipse. But how should the ellipse be parameterised? x,
> y, a, b, angle? xmin, ymin, xmax, ymax, angle? centre, cov?
>
> * sunflowers, e.g. http://www.stata.com/meeting/3nasug/sunflower.pdf
>
Both of these seem quite similar in form to the geom I started
yesterday: the ellipses I implemented as a 50-edges polygon (example
attached), while the sunflower plot could correspond to a starred
version of the convex polygons (I saw a similar function in the
?my.symbols page).
As for the ellipse, I'm biased towards a parametrisation of the form,
x, y, aspect ratio a/b, angle
but that's probably because I had to implement it in this form in a
base plot once. Note that in the end I chose to have the aspect ratio
preserve the area of the ellipse so that it remains an orthogonal
setting to the size parameter.
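A quick numerical check of the area-preserving stretch: scaling x by sqrt(ar) and y by 1/sqrt(ar) has determinant 1, so the area is unchanged while the width/height ratio becomes ar. The helper names below (unit_ngon, stretch, polygon_area) are made up for this sketch and are not the functions used in the package.

```r
# Vertices of a regular n-gon inscribed in the unit circle.
unit_ngon <- function(n) {
  theta <- 2 * pi * seq_len(n) / n
  cbind(cos(theta), sin(theta))
}

# Stretch x by sqrt(ar) and y by 1/sqrt(ar): determinant 1, so the
# area is preserved while width / height is multiplied by ar.
stretch <- function(xy, ar) xy %*% diag(c(sqrt(ar), 1 / sqrt(ar)))

# Shoelace formula for the area of a simple polygon.
polygon_area <- function(xy) {
  x <- xy[, 1]
  y <- xy[, 2]
  j <- c(seq_along(x)[-1], 1)  # index of the next vertex, wrapping around
  abs(sum(x * y[j] - x[j] * y)) / 2
}

ellipse <- stretch(unit_ngon(50), ar = 3)
c(before = polygon_area(unit_ngon(50)), after = polygon_area(ellipse))
```

The two areas agree up to floating-point error, which is what keeps the aspect ratio orthogonal to the size parameter.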
Some people (ellipse package I think) seem to prefer the statistician
perspective of a covariance matrix. I guess it doesn't matter if one
provides an example of the conversion from one to another. I can see
difficulties with my parametrisation when one wants to draw this,
http://weitaiyun.blogspot.com/2009/03/visulization-of-correlation-matrix.html
but then again the size could be adjusted to compensate for the aspect
ratio so that the enclosing square remains constant (admittedly i'd
have to think a little bit to get this one right!)
Here's what I've got for ellipses so far,
# ellipse example (no legend yet)
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
d <- ggplot(dsmall, aes(carat, price))
str(dsmall)
library(ggplotpp)
d + geom_ngon(aes(colour = carat, angle = x, ar=y), fill=NA, sides=50)
You mentioned that I need to create a draw_legend() function in the
definition of the geom. Presumably, this is a grob that gets coloured
etc. with the right parameters and placed on a vertical grid by the
build_legend() function.
I tried this, but nothing changes (I haven't tested the new icon, I'm
guessing it's only for an online doc, is it?).
GeomNgon <- proto(Geom, {
  objname <- "ngon"
  desc <- "Regular polygons"
  draw <- function(., data, scales, coordinates, ...) {
    with(coordinates$transform(data, scales),
      ggname(.$my_name(),
        ngonGrob(x, y, sides, size, angle, ar, col=alpha(colour, alpha),
          fill = alpha(fill, alpha)))
    )
  }
  required_aes <- c("x", "y")
  default_aes <- function(.)
    aes(sides=5, size=1, angle=0, ar=1, colour=NA, fill = "grey50", alpha = 1)
  default_stat <- function(.) StatIdentity
  guide_geom <- function(.) "polygon"
  #
  draw_legend <- function(., data, ...) {
    data <- aesdefaults(data, .$default_aes(), list(...))
    with(data,
      ggname(.$my_name(),
        ngonGrob(0, 0,
          ar=ar,
          size=size,
          sides=sides,
          angle = angle,
          fill=fill)))
  }
  icon <- function(.) {
    ngonGrob(c(1/4, 1/2, 3/4), c(1/4, 1/2, 3/4),
      ar=c(1, 1.5, 2),
      size=1.2*c(0.1, 0.3, 0.1),
      sides=c(5, 6, 50),
      angle = c(0, pi/4, pi/3),
      fill=c("#E41A1C", "#377EB8", "#4DAF4A"))
  }
  examples <- function() {
    dsmall <- diamonds[sample(nrow(diamonds), 100), ]
    d <- ggplot(dsmall, aes(carat, price))
    d + geom_ngon(aes(fill=carat), sides=5, size=2)
  }
})
Cheers,
baptiste
You could draw a geom_point on top.
> One thing I'm not sure about is the legend: to indicate the rotation
> of the grobs, I've added a small segment mapped onto angle, but this
> is only useful for the rotation. Can the grob be different for the
> other variables (fill, aspect ratio, etc)?
Hmmm. Not in the way that legends are currently constructed. You
could check for the case where angle != 0 && sides == 50 && ar == 1
and only draw the line then.
It's looking really good! Personally, I've also found the hardest
part of the process to be coming up with a good legend.
Hadley
Good idea -- it's almost right (plot attached); I'm guessing the small
difference comes from the function creating the polygon vertices. I'll
try to fix it.
>> One thing I'm not sure about is the legend: to indicate the rotation
>> of the grobs, I've added a small segment mapped onto angle, but this
>> is only useful for the rotation. Can the grob be different for the
>> other variables (fill, aspect ratio, etc)?
>
> Hmmm. Not in the way that legends are currently constructed. You
> could check for the case where angle != 0 && sides == 50 && ar == 1
> and only draw the line then.
>
Good idea too, it does the trick (albeit a bit of an ugly construct
but oh well).
> It's looking really good! Personally, I've also found the hardest
> part of the process to be coming up with a good legend.
>
From a foreign perspective, the hardest part was to believe that it
could all work out by simply tweaking bits and pieces borrowed from
other geoms!
I tried to implement a new parameter today, starred or convex version
of the polygons: everything went fine except that it was getting
really messy with the legend and the vast amount of degrees of freedom
assigned to one geom. I gave up, also I wasn't sure how to implement a
binary scale (discrete with TRUE/FALSE values).
I'll try to reuse the starred polygons in a new geom for sunflower
plots though.
Meanwhile, I'm happy to say the ngon geom is now roughly complete and
available on r-forge for those interested in playing with it (several
scales are still missing). Note that many combinations of parameters
are not advisable for a faithful display of information (e.g., the
angle of rotation is mapped onto [0, pi], which makes sense for
ellipses but not for triangles, or even worse for circles!)
Thanks,
baptiste
> Hadley
Well you do need to think about what the "size" of an arbitrary
polygon is. For a circle, you have the radius so it's easy, but what
about for a triangle or square? It seems like you should want to keep
the area constant so that a triangle, square, circle etc of size 1 all
have the same area.
> From a foreign perspective, the hardest part was to believe that it
> could all work out by simply tweaking bits and pieces borrowed from
> other geoms!
Generally, individual geoms are fairly easy to write - it's getting
right all the interactions with other geoms, statistics, coordinate
systems etc.
> I tried to implement a new parameter today, starred or convex version
> of the polygons: everything went fine except that it was getting
> really messy with the legend and the vast amount of degrees of freedom
> assigned to one geom. I gave up, also I wasn't sure how to implement a
> binary scale (discrete with TRUE/FALSE values).
A scale like that doesn't exist currently. You could probably model
it as a discrete scale with only two possible values.
> Meanwhile, I'm happy to say the ngon geom is now roughly complete and
> available on r-forge for those interested in playing with it (several
> scales are still missing).
Great, thanks!
> Note that many combinations of parameters
> should not be advised for faithful display of information (e.g, the
> angle of rotation is mapped onto [0, pi] which makes sense for
> ellipses but not for triangles, or even worse for circles!)
That's a good point - and it clearly reflects the difficulty with
putting angle on the legend.
Hadley
Good idea -- I was thinking of a circumscribing circle but the area
makes more sense quantitatively speaking.
I'm glad mathworld is here to the rescue,
http://mathworld.wolfram.com/RegularPolygon.html
Area = n/2 * R^2 * sin(2*pi / n)
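Inverting that formula gives the circumradius needed for a given area, which is one way to make polygons of the same "size" have the same area whatever the number of sides. A minimal sketch follows; the function names are hypothetical, and using the area of the unit circle (pi) as the reference is an assumption made for the example.

```r
# Area of a regular n-gon with circumradius R (the MathWorld formula).
ngon_area <- function(n, R) n / 2 * R^2 * sin(2 * pi / n)

# Inverse: circumradius that gives a regular n-gon the requested area.
# The default reference area is that of a unit circle (an assumption).
ngon_radius <- function(n, area = pi) {
  sqrt(2 * area / (n * sin(2 * pi / n)))
}

sides <- c(3, 4, 5, 50)
radii <- ngon_radius(sides)
ngon_area(sides, radii)  # all equal to pi, up to rounding error
```

Triangles need a noticeably larger circumradius than 50-gons to cover the same area, so scaling by the radius alone would make them look too small.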
>
>
>> From a foreign perspective, the hardest part was to believe that it
>> could all work out by simply tweaking bits and pieces borrowed from
>> other geoms!
>
> Generally, individual geoms are fairly easy to write - it's getting
> right all the interactions with other geoms, statistics, coordinate
> systems etc.
>
>> I tried to implement a new parameter today, starred or convex version
>> of the polygons: everything went fine except that it was getting
>> really messy with the legend and the vast amount of degrees of freedom
>> assigned to one geom. I gave up, also I wasn't sure how to implement a
>> binary scale (discrete with TRUE/FALSE values).
>
> A scale like that doesn't exist currently. You could probably model
> it as a discrete scale with only two possible values.
>
>
I've quickly put together a new star geom with the hope to mimic the
sunflower plots (I'll need a new stat I suppose). I'm still wondering
whether:
1) the ellipse should be a geom on its own (perhaps additionally)
2) the starred / convex scale wouldn't be a nice feature (I had a need
for a binary scale of symbols with and without filling once, and it
wasn't straightforward as the default point symbols aren't ordered
for this).
Best,
baptiste
--
:)
polygon.regular() and star() are two functions that return a matrix of
XY coordinates for polygons (resp. stars) included in a circle of
radius unity. For instance,
polygon.regular(5)
            [,1]     [,2]
[1,]  0.0000e+00  1.00000
[2,] -9.5106e-01  0.30902
[3,] -5.8779e-01 -0.80902
[4,]  5.8779e-01 -0.80902
[5,]  9.5106e-01  0.30902
[6,]  2.2204e-16  1.00000
polygon.regular(50)->test
range(test) # -1 1
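For readers without the package at hand, a function with the behaviour described above could look roughly like this. It is a guess reconstructed from the printed output (first vertex at the top, counter-clockwise order, closing vertex repeated), not the actual source of polygon.regular().

```r
# Hypothetical reconstruction (not the package source): vertices of a
# regular n-gon inscribed in the unit circle, starting at (0, 1),
# going counter-clockwise, with the first vertex repeated at the end.
polygon_regular_sketch <- function(n) {
  theta <- seq(0, 2 * pi, length.out = n + 1)
  cbind(-sin(theta), cos(theta))
}

p <- polygon_regular_sketch(5)
dim(p)                     # 6 rows: 5 vertices plus the closing point
max(sqrt(rowSums(p^2)))    # every vertex lies on the unit circle
```

Note that the vertices lie on the circle, so the polygon is inscribed in it; the flat edges are closer to the centre than the radius.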
The geom is then defined as,
ngonGrob <- function(x, y, sides=5, size = 1,
                     angle=rep(pi/2, length(x)), ar = rep(1, length(x)),
                     colour = "grey50", fill = "grey90", units.def="native") {
  stopifnot(length(y) == length(x))
  n <- length(x)
  size <- size / 2 # polygon.regular has radius unity
  if(length(size) < n ) size <- rep(size, length.out=n)
  if(length(sides) < n ) sides <- rep(sides, length.out=n)
  ngonC <- llply(sides, polygon.regular)
  # stretch the polygons, then rotate them
  # aspect ratio factor for constant area
  ngonC.list <-
    llply(seq_along(ngonC), function(ii)
      size[ii] * ngonC[[ii]] %*%
        matrix(c(sqrt(ar[ii]), 0, 0, 1/sqrt(ar[ii])), ncol=2) %*%
        matrix(c(cos(angle[ii]), -sin(angle[ii]), sin(angle[ii]),
                 cos(angle[ii])), nrow = 2)
    )
  vertices <- laply(ngonC.list, nrow)
  reps.x <- do.call(c, llply(seq_along(x), function(ii) rep(x[ii], vertices[ii])))
  reps.y <- do.call(c, llply(seq_along(y), function(ii) rep(y[ii], vertices[ii])))
  ngonXY <- do.call(rbind, ngonC.list)
  polygonGrob(
    x = unit(ngonXY[, 1], "mm") + unit(reps.x, units.def),
    y = unit(ngonXY[, 2], "mm") + unit(reps.y, units.def),
    default.units = units.def,
    id.lengths = unlist(vertices), gp = gpar(col = colour, fill = fill)
  )
}
For size=ar=1, the output of polygon.regular is directly interpreted
in mm and shifted to the location of the point in native coordinates.
Where have I gone wrong? (example below and result attached).
#install.packages("ggplotpp", repos="http://R-Forge.R-project.org")
library(ggplotpp)
qplot(0, 0)+
geom_point(size=100, col="blue", pch=21, fill="red")+
geom_ngon(size=100, fill="yellow", alpha=0.5, sides=5, col="blue")
Thanks,
baptiste
What's the problem? That looks exactly like your description of
"XY coordinates for polygons (resp. stars) included in a circle of
radius unity"
Hadley
Oops, no it doesn't - it looks like a circle of radius 1 included
inside the polygon.
Hadley
library(grid)
library(ggplot2) # for alpha()
vp <- viewport(width=0.5, height=0.5, name="main")
pushViewport(vp)
# pch=21 symbol with a nominal size of 10 mm
p <- pointsGrob(0.5, 0.5, size=unit(10, "mm"), pch=21,
  gp=gpar(fill=alpha("blue", 0.5), col=alpha("red", 0.5)))
# 10 mm x 10 mm square for reference
p1 <- rectGrob(0.5, 0.5, width=unit(10, "mm"), height=unit(10, "mm"),
  gp=gpar(fill=alpha("yellow", 0.5), col=alpha("red", 0.5)))
# same symbol with the size multiplied by 2.54 / 2
p2 <- pointsGrob(0.5, 0.5, size=unit(10 / 2 * 2.54, "mm"), pch=21,
  gp=gpar(fill=alpha("blue", 0.5), col=alpha("red", 0.5)))
grid.draw(p)
grid.draw(p1)
grid.draw(p2)
vp <- viewport(width=0.5, height=0.5, name="main")
pushViewport(vp)
checkOneSymbol <- function(pch=0){
  gTree(children=gList(
    rectGrob(0.5, 0.5, width=unit(10, "mm"), height=unit(10, "mm"),
      gp=gpar(lty=2, fill=NA, col=alpha("black", 0.5))),
    pointsGrob(0.5, 0.5, size=unit(10, "mm"), pch=pch,
      gp=gpar(fill=alpha("blue", 0.5), col=alpha("red", 0.5)))
  ))
}
all.symbols <- llply(0:23, checkOneSymbol)
vp <- viewport(width=0.5, height=0.5, name="main")
pushViewport(vp)
pushViewport(viewport(layout=grid.layout(1, 24,
  widths=unit(10, "mm"),
  heights=unit(10, "mm"),
  just="center")))
for(ii in 0:23){
  pushViewport(viewport(layout.pos.col=ii+1, layout.pos.row=1))
  grid.draw(all.symbols[[ii+1]])
  upViewport(1)
}
It seems that the size of the basic R symbols is not quite what one
might have expected, and I can't find any doc saying that the size for
pch=21 should actually be the radius in bloody inches ...
I've had a look at the sources (graphics.c, etc.) but this is way too
deep for me.
Best wishes,
baptiste
checkOneSymbol <- function(pch=0){
  gTree(children=gList(
    rectGrob(0.5, 0.5, width=unit(10, "mm"), height=unit(10, "mm"),
      gp=gpar(lty=2, fill=NA, col=alpha("black", 0.5))),
    pointsGrob(0.5, 0.5, size=unit(10, "mm"), pch=pch,
      gp=gpar(col=alpha("red", 0.5)))
  ))
}
all.symbols <- llply(0:23, checkOneSymbol)
pdf("symbols.pdf", height=1.2/2.54, width=24.2/2.54)
vp <- viewport(width=0.5, height=0.5, name="main")
pushViewport(vp)
pushViewport(viewport(layout=grid.layout(1, 24,
  widths=unit(10, "mm"),
  heights=unit(10, "mm"),
  just="center")))
for(ii in 0:23){
  pushViewport(viewport(layout.pos.col=ii+1, layout.pos.row=1))
  grid.draw(all.symbols[[ii+1]])
  upViewport(1)
}
dev.off()