http://www.saoconnell.com/ggplot2/job_timeline_one_month.png
When I plot two or more months, the colors are all mixed up, and there
are warnings when the process runs:
http://www.saoconnell.com/ggplot2/job_timeline_two_month.png
DATA:
http://www.saoconnell.com/ggplot2/job_data.csv
require(ggplot2)
segLines <- read.csv("job_data.csv")
segLines$std <- as.POSIXct(substr(segLines$std, 1, 19))
segLines$etd <- as.POSIXct(substr(segLines$etd, 1, 19))
segLines$Month <- as.factor(segLines$Month)
segLines$job <- as.factor(segLines$job)
## ONE MONTH
seg <- subset(segLines, std > as.POSIXlt("2009-12-01 00:00:00") & etd
< as.POSIXlt("2009-12-30 00:00:00") )
p <- ggplot(seg, aes(x = std, y = job), colour = job)
(p <- p + geom_segment(aes(xend = seg$etd, yend = seg$job), data=seg,
colour = seg$job, size=3))
last_plot() + scale_x_datetime(major = "3 days", format = "%m/%d")
last_plot() + facet_wrap(~ Month, ncol = 1, scales="free_x")
ggsave(file="job_timeline_one_month.png", height = 1.5, width = 10,
dpi=72)
## TWO OR MORE MONTHS
seg <- subset(segLines, std > as.POSIXlt("2009-11-01 00:00:00") & etd
< as.POSIXlt("2009-12-30 00:00:00") )
p <- ggplot(seg, aes(x = std, y = job), colour = job)
(p <- p + geom_segment(aes(xend = seg$etd, yend = seg$job), data=seg,
colour = seg$job, size=3))
last_plot() + scale_x_datetime(major = "3 days", format = "%m/%d")
last_plot() + facet_wrap(~ Month, ncol = 1, scales="free_x")
ggsave(file="job_timeline_two_month.png", height = 1.5, width = 10,
dpi=72)
Any suggestions would be greatly appreciated.
Thanks,
Stephen...
This seemed to work for me:
ggplot(seg, aes(x=std, y=job,color=job)) +
geom_segment(aes(xend=etd,yend=job), size=3) +
scale_x_datetime(major = "3 days", format = "%m/%d") +
facet_wrap(~ Month, ncol = 1, scales="free_x")
Try your code in chunks to make sure each piece really works. Your
single-month example did not work for me. Maybe run update.packages()?
Don't overspecify your parameters (using $ inside aes() is considered
bad form); try to specify only those things that are not inherited
from previous arguments.
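A minimal sketch of the inheritance point, on a made-up three-row data
frame rather than job_data.csv (the job names and times are invented).
It assumes a current ggplot2 release, where datetime breaks are set
with date_breaks/date_labels and line thickness with linewidth; the
calls in this thread use the older major/format/size arguments.

```r
library(ggplot2)

# Made-up stand-in for job_data.csv (three runs of two hypothetical jobs)
jobs <- data.frame(
  job   = factor(c("backup", "sancopy", "backup")),
  std   = as.POSIXct(c("2009-12-01 01:00:00", "2009-12-02 03:00:00",
                       "2009-12-04 01:00:00"), tz = "UTC"),
  etd   = as.POSIXct(c("2009-12-01 05:00:00", "2009-12-02 09:00:00",
                       "2009-12-04 06:00:00"), tz = "UTC"),
  Month = factor("2009-12")
)

# x, y and colour are mapped once in ggplot(); the layer inherits them and
# only adds the aesthetics it needs (xend, yend), using bare column names.
p <- ggplot(jobs, aes(x = std, y = job, colour = job)) +
  geom_segment(aes(xend = etd, yend = job), linewidth = 3) +
  # Assumption: current argument names; ggplot2 0.8.5 in this thread used
  # major = "3 days", format = "%m/%d" instead.
  scale_x_datetime(date_breaks = "3 days", date_labels = "%m/%d") +
  facet_wrap(~ Month, ncol = 1, scales = "free_x")
```

Because colour is mapped inside aes(), each job gets its own colour
from the scale instead of a raw vector handed to the layer.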
I could get the December plots to work (as a single month), but not
the November one (even as a single month). Below, I reduce it to two
minimal examples, one that works and one that doesn't. I followed
Mark's simplification of the ggplot calls.
require(ggplot2)
segLines <- read.csv("http://www.saoconnell.com/ggplot2/job_data.csv")
segLines$std <- as.POSIXct(substr(segLines$std, 1, 19))
segLines$etd <- as.POSIXct(substr(segLines$etd, 1, 19))
segLines$Month <- as.factor(segLines$Month)
segLines$job <- as.factor(segLines$job)
# Eliminate unused columns
segLines$X <- NULL
segLines$date <- NULL
segLines$job_ref <- NULL
Dec <- subset(segLines, std > as.POSIXlt("2009-12-01 00:00:00") &
etd < as.POSIXlt("2009-12-30 00:00:00") )
Nov <- subset(segLines, std > as.POSIXlt("2009-11-01 00:00:00") &
etd < as.POSIXlt("2009-11-30 00:00:00") )
ggplot(Dec, aes(x = std, y = job, colour = job)) +
geom_segment(aes(xend = etd, yend = job), size=3) +
scale_x_datetime(major = "3 days", format = "%m/%d") +
facet_wrap(~ Month, ncol = 1, scales="free_x")
# works
ggplot(Nov, aes(x = std, y = job, colour = job)) +
geom_segment(aes(xend = etd, yend = job), size=3) +
scale_x_datetime(major = "3 days", format = "%m/%d") +
facet_wrap(~ Month, ncol = 1, scales="free_x")
# gives error: Error in seq.int(0, to - from, by) : 'to' must be finite
# Minimal reproducible subsets to show difference
DecMin <- subset(Dec, job=="sancopy")[1:2,c("job","std")]
DecMin$job <- factor(DecMin$job)
NovMin <- subset(Nov, job=="sancopy")[1:2,c("job","std")]
NovMin$job <- factor(NovMin$job)
#> DecMin
# job std
#34 sancopy 2009-12-01 18:13:57
#41 sancopy 2009-12-03 17:19:47
#> NovMin
# job std
#2 sancopy 2009-11-01 14:53:04
#4 sancopy 2009-11-03 16:06:04
ggplot(DecMin, aes(x=std, y=job)) + geom_point()
# works
ggplot(NovMin, aes(x=std, y=job)) + geom_point()
# gives error: Error in seq.int(0, to - from, by) : 'to' must be finite
# dput version of DecMin and NovMin, for reference
DecMin <-
structure(list(job = structure(c(1L, 1L),
.Label = "sancopy", class = "factor"),
std = structure(c(1259720037, 1259889587),
class = c("POSIXt", "POSIXct"), tzone = "")),
.Names = c("job", "std"), row.names = c(34L, 41L),
class = "data.frame")
NovMin <-
structure(list(job = structure(c(1L, 1L),
.Label = "sancopy", class = "factor"),
std = structure(c(1257115984, 1257293164),
class = c("POSIXt", "POSIXct"), tzone = "")),
.Names = c("job", "std"), row.names = c(2L, 4L),
class = "data.frame")
I don't see any substantial differences between the NovMin and DecMin
dates such that one should work and the other not. I do not know what
causes the differences.
http://www.saoconnell.com/ggplot2/job_timeline.png
Thank you!
Stephen...
On Jan 20, 2:58 am, Mark Connolly <mark_conno...@acm.org> wrote:
http://www.saoconnell.com/ggplot2/job_timeline.png
ggplot(seg, aes(x=std, y=job,color=job)) +
geom_segment(aes(xend=etd,yend=job), size=3) +
scale_x_datetime(major = "3 days", format = "%m/%d") +
facet_wrap(~ Month, ncol = 1, scales="free_x")
Sorry for the confusion.
Stephen...
The most likely cause is the automatic generation of the minor breaks
- I suspect if you supply them the problem will go away. Getting tick
positions right automatically is far far more complicated than I ever
imagined, but I have some code that a student wrote which I think will
fix the problem.
Hadley
That doesn't seem to be the problem, or at least specifying major and
minor breaks doesn't help. Using the DecMin and NovMin sets from
before (reproduced below for completeness):
DecMin <-
structure(list(job = structure(c(1L, 1L),
.Label = "sancopy", class = "factor"),
std = structure(c(1259720037, 1259889587),
class = c("POSIXt", "POSIXct"), tzone = "")),
.Names = c("job", "std"), row.names = c(34L, 41L),
class = "data.frame")
NovMin <-
structure(list(job = structure(c(1L, 1L),
.Label = "sancopy", class = "factor"),
std = structure(c(1257115984, 1257293164),
class = c("POSIXt", "POSIXct"), tzone = "")),
.Names = c("job", "std"), row.names = c(2L, 4L),
class = "data.frame")
ggplot(DecMin, aes(x=std, y=job)) + geom_point() +
scale_x_datetime(major="1 days", minor="3 hours")
# works, setting major and minor ticks as requested
ggplot(NovMin, aes(x=std, y=job)) + geom_point() +
scale_x_datetime(major="1 days", minor="3 hours")
# Error in seq.int(0, to - from, by) : 'to' must be finite
> traceback()
29: seq.POSIXt(start, maxx + incr, breaks)
28: seq.int(start, maxx + incr, breaks)
27: cut.POSIXt(date, time, right = TRUE, include.lowest = TRUE)
26: cut(date, time, right = TRUE, include.lowest = TRUE)
25: as.POSIXct(cut(date, time, right = TRUE, include.lowest = TRUE),
tz = attr(date, "tz") %||% "")
24: floor_time(range[1], time)
23: inherits(from, "POSIXt")
22: seq.POSIXt(floor_time(range[1], time), ceiling_time(range[2],
time), by = time)
21: fullseq_time(d, .$break_points()[1])
20: get("input_breaks", env = ., inherits = TRUE)(., ...)
19: .$input_breaks()
18: get("input_breaks_n", env = scales$x, inherits = TRUE)(scales$x,
...)
17: scales$x$input_breaks_n()
16: inherits(x, "factor")
15: is.factor(x)
14: rescale(data, 0:1, range, clip = clip)
13: get("rescale_var", env = ., inherits = TRUE)(., ...)
12: .$rescale_var(scales$x$input_breaks_n(), x.range, TRUE)
11: get("compute_ranges", env = coord, inherits = TRUE)(coord, ...)
10: coord$compute_ranges(scales)
9: FUN(1L[[1L]], ...)
8: lapply(seq_along(data), function(i) {
       layer <- layers[[i]]
       layerd <- data[[i]]
       grobs <- matrix(list(), nrow = nrow(layerd), ncol = ncol(layerd))
       for (i in seq_len(nrow(layerd))) {
           for (j in seq_len(ncol(layerd))) {
               scales <- list(x = .$scales$x[[j]]$clone(), y = .$scales$y[[i]]$clone())
               details <- coord$compute_ranges(scales)
               grobs[[i, j]] <- layer$make_grob(layerd[[i, j]], details, coord)
           }
       }
       grobs
   })
7: get("make_grobs", env = facet, inherits = TRUE)(facet, ...)
6: facet$make_grobs(data, layers, cs)
5: ggplot_build(plot)
4: ggplotGrob(x, ...)
3: grid.draw(ggplotGrob(x, ...))
2: print.ggplot(list(data = list(job = c(1L, 1L), std = c(1257115984,
   1257293164)), layers = list(<environment>), scales = <environment>,
   mapping = list(x = std, y = job), options = list(labels = list(
   x = "std", y = "job")), coordinates = <environment>,
   facet = <environment>, plot_env = <environment>))
1: print(list(data = list(job = c(1L, 1L), std = c(1257115984,
   1257293164)), layers = list(<environment>), scales = <environment>,
   mapping = list(x = std, y = job), options = list(labels = list(
   x = "std", y = "job")), coordinates = <environment>,
   facet = <environment>, plot_env = <environment>))
> sessionInfo()
R version 2.10.1 (2009-12-14)
i386-pc-mingw32
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] grid stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] ggplot2_0.8.5 digest_0.4.2 reshape_0.8.3 plyr_0.1.9 proto_0.3-8
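The traceback bottoms out in cut.POSIXt(), reached through
floor_time(), so the failing step can be tried without ggplot2. This
is a minimal sketch only. The zone name "America/Los_Angeles" is an
assumption standing in for the session's local zone, since the dput
output above has tzone = "" and so carries no zone of its own.
Whether the call still errors will depend on the R version.

```r
# First NovMin timestamp from the dput output, in seconds since the epoch
secs <- 1257115984

# Assumption: "America/Los_Angeles" stands in for the session's local zone
x <- as.POSIXct(secs, origin = "1970-01-01", tz = "America/Los_Angeles")
format(x, "%Y-%m-%d %H:%M:%S")

# Same kind of call as frame 26 of the traceback, wrapped in try() so the
# script keeps going whether or not this R version raises the error
res <- try(cut(x, "1 day", right = TRUE, include.lowest = TRUE),
           silent = TRUE)
if (inherits(res, "try-error")) {
  message("cut() failed: ", conditionMessage(attr(res, "condition")))
} else {
  print(res)
}
```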
Odd, that code runs fine for me.
Hadley
I went back and checked it again, verifying everything was up-to-date
and even running it with R --vanilla, and I still got the error. I
investigated some more and figured out it is not a ggplot2 error, but
a daylight saving time/timezone issue in some base functions.
2009-11-01, the earliest date in the November data, was the day the US
switched from daylight saving time to standard time (and I am in the
US/Pacific ("America/Los_Angeles") timezone). I can get the same
error with far simpler code:
> cut(as.POSIXct("2009-11-01 04:00:00"), "1 day")
Error in seq.int(0, to - from, by) : 'to' must be finite
> cut(as.POSIXct("2009-11-01 01:00:00"), "1 day")
[1] 2009-11-01
Levels: 2009-11-01
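To see why this date is special, the sketch below checks the DST flag
on either side of the switch and the length of the local day. It
assumes "America/Los_Angeles" is available in the system's time zone
database. It also tries the same cut() call in UTC, which has no DST
transitions; that is a possible workaround, not something I have
confirmed fixes the plots.

```r
tz <- "America/Los_Angeles"  # assumed available in the system tz database

before <- as.POSIXct("2009-10-31 12:00:00", tz = tz)
after  <- as.POSIXct("2009-11-01 04:00:00", tz = tz)

# isdst is 1 while daylight saving time is in effect, 0 otherwise
as.POSIXlt(before)$isdst
as.POSIXlt(after)$isdst

# Midnight to midnight across the switch is 25 hours, not 24
day_hours <- as.numeric(difftime(as.POSIXct("2009-11-02", tz = tz),
                                 as.POSIXct("2009-11-01", tz = tz),
                                 units = "hours"))
day_hours

# Possible workaround (an assumption, not tested in this thread):
# interpret the timestamps in UTC, which never changes its offset
utc_cut <- cut(as.POSIXct("2009-11-01 04:00:00", tz = "UTC"), "1 day")
utc_cut
```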
I'm going to put a post on R-devel about this. Sorry for thinking
this was ggplot2 related.
--Brian Diggs
I think I've run into that issue in the past too. I think I
complained about it, but nothing happened.
Hadley