negative log scale

1,091 views
Skip to first unread message

dpark

unread,
Jan 20, 2010, 4:22:46 PM1/20/10
to ggplot2
Hi folks,

I'm trying to see if a set of p-values matches the expectation of a
uniform distribution. The usual visualization I've seen is a q-q plot
where both axes are on a negative log scale. I realized that I could
make log scales and negative linear scales, but I'm not sure how to
put the two together (which ever scale_() function I add in last takes
precedence I think).

Here's what I have so far:

dat <- read.delim( --a file with a column called pcor-- )

ggplot(subset(dat, !is.na(pcor)), aes(sample=pcor)) +
stat_qq(distribution=qunif) + geom_abline() +
scale_x_log10("theoretical") + scale_y_log10("actual")

If I try something like "scale_x_log10(trans='reverse')", it's not
log10 anymore, just reversed. I can precompute the negative log10
ahead of time, but then I'm not familiar with a function that I can
pass to the distribution parameter of stat_qq.

I'm sure there's a simple answer.. any thoughts?

Juliet Hannah

unread,
Jan 23, 2010, 10:15:20 AM1/23/10
to dpark, ggplot2
You can build the plot manually.

pvals <- data.frame(pv=runif(1000))
pvals <- pvals[order(pvals$pv),,drop=FALSE]

n <- nrow(pvals)
r <- (1:n)/(n + 1)
y <- -log(pvals$pv, base=10)
x <- -log(r, base=10)
mydata <- data.frame(y,x)

library(ggplot2)
p <- ggplot(mydata,aes(x=x,y=y))
p <- p + geom_point()
p <- p + geom_abline(intercept = 0, slope = 1)

> --
> You received this message because you are subscribed to the ggplot2 mailing list.
> To post to this group, send email to ggp...@googlegroups.com
> To unsubscribe from this group, send email to
> ggplot2+u...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/ggplot2
>

dpark

unread,
Jan 26, 2010, 10:01:18 AM1/26/10
to ggplot2
Yeah, manually always works. Thanks!

dpark

unread,
Jan 27, 2010, 8:46:52 AM1/27/10
to ggplot2
Okay I found something slightly better than completely manually. This
is good because making ggplot do more of the work allows me to slip in
stuff like facet_wrap that wouldn't be so straightforward in the do-it-
yourself approach.

qnlog10 <- function(p) {
-log10(qunif(p[length(p):1]))
}
p <- ggplot(mydata, aes(sample=-log10(P))) + stat_qq
(distribution=qnlog10)
p <- p + geom_abline()
p <- p + scale_x_continuous('theoretical -log10 pval') +
scale_y_continuous('actual -log10 pval')
p

In my case, I'm also tossing in a geom_hline at a bonferroni
significance threshold, and a facet_wrap because mydata actually
contains several conditions I want to plot independently.

Took me a while to realize I could just make up my own distribution
function.. and that the function needed to reverse the order of its
input if it's negating stuff.

-danny

Aaron Mackey

unread,
Jan 27, 2010, 9:29:07 AM1/27/10
to dpark, ggplot2
p[length(p):1] is perhaps better written as: rev(p)

Thanks for the example for rolling a custom distributions!

-Aaron

Reply all
Reply to author
Forward
0 new messages