Strange scale_x_datetime offset?

Bill Harris

Jun 2, 2013, 7:26:51 PM
to ggp...@googlegroups.com
I often plot data from a data frame that has both a POSIXct and a
numeric column. The following code creates such a data frame and plots
the data:

require(ggplot2)
require(scales)
require(lubridate)
timelist <- ymd_hm(c("2013-05-01 12:00","2013-05-02 18:00","2013-05-03 06:00"),tz="America/Los_Angeles")
sampleDF <- data.frame(time = timelist, values = c(3,4,2))
qplot(time, values, data = sampleDF)+scale_x_datetime(minor_breaks=date_breaks("6 hour"))

When I run the code above, I get a plot with the three points at the
right times, but the minor break lines are offset apparently by 3 hours
from the axis. I expected the minor breaks to coincide with the major
breaks. That is, I expected minor breaks at 0:00, 6:00, 12:00, 18:00,
..., with major breaks as scales chose, but I get minor breaks at 3:00,
9:00, ....

My time zone is "America/Los_Angeles", but I get the same result when I
use "UTC" for the data instead.

What can I do to get what I expected? What controls where the minor
breaks start?

Thanks,

Bill
--
Bill Harris
Facilitated Systems
http://makingsense.facilitatedsystems.com/

Hadri Commenges

Jun 3, 2013, 2:23:17 PM
to Bill Harris, ggp...@googlegroups.com
Hi,

you can set the limits:

qplot(time, values, data = sampleDF) + scale_x_datetime(limits = c(min(timelist), max(timelist)), minor_breaks=date_breaks("6 hour"))

Hope it helps,

--
Hadrien


2013/6/3 Bill Harris <bill_...@facilitatedsystems.com>

--
--
You received this message because you are subscribed to the ggplot2 mailing list.
Please provide a reproducible example: https://github.com/hadley/devtools/wiki/Reproducibility

To post: email ggp...@googlegroups.com
To unsubscribe: email ggplot2+u...@googlegroups.com
More options: http://groups.google.com/group/ggplot2

---
You received this message because you are subscribed to the Google Groups "ggplot2" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ggplot2+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.






Bill Harris

Jun 3, 2013, 7:59:29 PM
to Hadri Commenges, ggp...@googlegroups.com
Hadri Commenges <hadri.c...@gmail.com> writes:

> scale_x_datetime(limits =
> c(min(timelist), max(timelist)), minor_breaks=date_breaks("6 hour"))

Hadri,

Did the suggested code work for you? When I try it, it's worse.

Here's what I got with your suggestion:

sampleDF.png

Hadri Commenges

Jun 4, 2013, 1:04:27 AM
to Bill Harris, ggp...@googlegroups.com
With the code I sent, I get this result (see attached); whether I use minor_breaks or breaks, the result is the same. I have no idea why you don't get that result. Do you have updated versions of ggplot2 and scales? If you don't get a good technical answer on this list, I would suggest the crude way: uninstall and reinstall ggplot2 and scales.

--
Hadrien


2013/6/4 Bill Harris <bill_...@facilitatedsystems.com>
Here's the data.

> sampleDF
                 time values
1 2013-05-01 12:00:00      3
2 2013-05-02 18:00:00      4
3 2013-05-03 06:00:00      2

Note that the values don't appear at the indicated times.  For example,
row 2 with a value of 4 appears to show up in this new version at about
11 a.m. on May 2 -- 7 hours early.

Oh, and the minor breaks still don't mesh with the major breaks.

Bill
--
Bill Harris





Rplot.png

Bill Harris

Jun 4, 2013, 1:41:57 AM
to Hadri Commenges, ggp...@googlegroups.com
Bill Harris <bill_...@facilitatedsystems.com> writes:

> Note that the values don't appear at the indicated times. For example,
> row 2 with a value of 4 appears to show up in this new version at about
> 11 a.m. on May 2 -- 7 hours early.
>
> Oh, and the minor breaks still don't mesh with the major breaks.

In thinking about it, it's almost as if scale_x_datetime uses a time
zone but neither says what time zone it will use nor provides a way to
adjust it, and minor_breaks does the same thing but with a different
time zone. Is that -- or something close -- possible?
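
One way to check whether a 7-hour shift is consistent with a time-zone mix-up is to print the same instant in two zones. This is a minimal base-R sketch, assuming the comparison zone is UTC; the thread does not confirm which zone the scale actually uses internally.

```r
# The second sample point, created in the data's time zone.
x <- as.POSIXct("2013-05-02 18:00", tz = "America/Los_Angeles")

# Same instant, two displays. Pacific Daylight Time is UTC-7 in May.
local_label <- format(x, "%Y-%m-%d %H:%M", tz = "America/Los_Angeles")
utc_label   <- format(x, "%Y-%m-%d %H:%M", tz = "UTC")

# Difference between the two wall-clock readings, in hours.
offset_hours <- as.numeric(difftime(
  as.POSIXct(utc_label, tz = "UTC"),
  as.POSIXct(local_label, tz = "UTC"),
  units = "hours"
))

print(c(local = local_label, utc = utc_label))
print(offset_hours)
```

If one part of the plotting code reads the clock in one zone and another part in the other, a 7-hour disagreement of this size would be the expected symptom.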

Bill Harris

Jun 4, 2013, 1:56:33 AM
to Hadri Commenges, ggp...@googlegroups.com
Hadri Commenges <hadri.c...@gmail.com> writes:

> With the code I send I get that result (see attached), using minor_breaks()
> or breaks() the result is the same. I have no idea why you don't get that
> result, do you have updated versions of ggplot2 and scales? If you don't
> get a good technical answer on that list, I would suggest the stupid way:
> uninstall and reinstall ggplot2 and scales.

What time zone is your computer set to, if you don't mind my asking?

In your graph, the point at 3 seems to be at 2013-05-01T21:00, the point
at 4 seems to be at 2013-05-03T03:00, and the point at 2 appears to be
at 2013-05-03T15:00.

That's /not/ what I specified in my data as shown below.

>> > sampleDF
>> time values
>> 1 2013-05-01 12:00:00 3
>> 2 2013-05-02 18:00:00 4
>> 3 2013-05-03 06:00:00 2
>>
>> Note that the values don't appear at the indicated times. For example,
>> row 2 with a value of 4 appears to show up in this new version at about
>> 11 a.m. on May 2 -- 7 hours early.

So the plot on my computer showed data 7 hours earlier than I wrote
down. On yours, the plot seems to show up 7 hours later on the first, 9
hours later on the second, and 3 hours earlier on the last. That makes
no sense to me, but it's late, and I may be looking at things wrongly.

Still, what you got and what I got don't make any sense. FWIW, I have
ggplot2 version 0.9.3.1 and scales version 0.2.3. Those appear to be
the current versions. I'm running R version 2.15.3.

Any more ideas?

Hadri Commenges

Jun 4, 2013, 10:47:05 AM
to Bill Harris, ggplot2
I tried your script changing the timezone and only minor breaks move. I think it's because you create a plot with qplot and then you scale only minor breaks. You should scale both major and minor breaks:
 
qplot... + scale_x_datetime( ... breaks = date_breaks("6 hour"), minor_breaks = date_breaks("3 hour"))

Does it work for you?

--
Hadrien


2013/6/4 Bill Harris <bill_...@facilitatedsystems.com>
Bill Harris <bill_...@facilitatedsystems.com> writes:

Bill Harris

Jun 4, 2013, 8:13:26 PM
to Brian "Meyer" Waismeyer, Hadri Commenges, ggplot2
"Brian \"Meyer\" Waismeyer" <meye...@gmail.com> writes:

> To consolidate the discussion and make comparison easy, I'm reporting the
> three code variations below and attaching PNGs of each graph the code
> produced. I comment on how each piece of code behaved for me and then I add
> some additional thoughts and attempts at producing the behavior Bill seems
> to be going for. (Note: I double checked that my R and packages were all
> updated. My timezone is PST.)

Thanks, Brian. I think what you did may help move us forward.

>> sampleDF
> time values
> 1 2013-05-01 12:00:00 3
> 2 2013-05-02 18:00:00 4
> 3 2013-05-03 06:00:00 2
>
> *Here's the first plot code (as provided by Bill):*
> qplot(time, values, data =
> sampleDF)+scale_x_datetime(minor_breaks=date_breaks("6 hour"))
>
> - on my computer: major/minor breaks are oddly aligned so that minor breaks
> don't nicely parse the space between the major breaks - this makes the
> spacing look "uneven" (similar to Bill's, I think)

I was going to attach my results, but this graph is the same as the one
you provided. I perceive the data is at the correct times, as indicated
by the abscissa legend, but the minor breaks are in unexpected places.

> *Here's the first solution proposed (as provided by Hadri):*
> qplot(time, values, data = sampleDF) + scale_x_datetime(limits =
> c(min(timelist), max(timelist)), minor_breaks=date_breaks("6 hour"))
>
> - on my computer: major/minor breaks are still oddly aligned and space
> still looks uneven (similar to Bill's, I think). In fact, I see no changes
> from the graph produced by the original code.

I presume you intended what Hadri suggested in his first email of
2013-06-04 and showed in a graph in his second email of 2013-06-04.
That's not what I saw in his graph. For example, it has an ordinate of
4 at what I read as 3:00 a.m. on May 3, according to the graph's
abscissa, while my graph and the data frame shows it at 6 p.m. on May 2.

Your graph looked the same as mine, though. That leads me to speculate
that it has something to do with local time zones. The difference
between Hadri's time zone and yours and mine is 9 hours--the same as the
difference in the abscissa of that example point. I haven't looked at
the code, but I didn't see anything in the help text that describes how
to set time zones for scales (and breaks and vlines and ...).
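
For readers on later ggplot2 releases (not the 0.9.3.1 version discussed in this thread), scale_x_datetime() has a timezone argument for the axis display, along with date_breaks, date_minor_breaks and date_labels shortcuts. The following is only a sketch under that assumption; whether it removes the offset on a given setup has not been checked here.

```r
library(ggplot2)

timelist <- as.POSIXct(
  c("2013-05-01 12:00", "2013-05-02 18:00", "2013-05-03 06:00"),
  tz = "America/Los_Angeles"
)
sampleDF <- data.frame(time = timelist, values = c(3, 4, 2))

# timezone affects how the axis is displayed; the data are unchanged.
p <- ggplot(sampleDF, aes(x = time, y = values)) +
  geom_point() +
  scale_x_datetime(
    timezone = "America/Los_Angeles",
    date_breaks = "12 hours",
    date_minor_breaks = "6 hours",
    date_labels = "%b %d %H:%M"
  )
```

Printing p draws the plot. Naming the zone explicitly at least makes the display zone visible in the code rather than leaving it to a default.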

> *Here's the second solution proposed (as provided by Hadri):*
> qplot(time, values, data = sampleDF)+scale_x_datetime(breaks=date_breaks("6
> hour"), minor_breaks = date_breaks("3 hour"))
>
> - on my computer: the space is now evenly parsed but the major breaks seem
> oddly placed (on 21, 3, 9, ... instead of 0, 6, 12, ...).

Agreed.

> Even though this solution works, I'm still confused as to why the breaks
> lined up so oddly in the first place. I suspect this solution simply masks
> the mis-alignment going on between the major/minor breaks by having the
> minor breaks "overfill."
>
> For instance, I think Bill wanted this:
> major  minor
> 0      0
>        6
> 12     12

Right.

> But got this:
> major  minor
> 0
>        3
>        9
> 12

Yes.

> The current solution gives this:
> major  minor
>        0
> 3      3
>        6
> 9      9
>        12
>
> The mis-alignment is still there - we've just masked it by having the minor
> breaks fill in the gaps.
>
> In fact, reading through ??scale_x_datetime, I am confused why the
> following doesn't work to give the behavior Bill seems to want.
>
> *third solution (using the default minor breaks)*
> qplot(time, values, data =
> sampleDF)+scale_x_datetime(breaks=date_breaks("12 hour"))
>
> The major breaks occur every 12 hours with a minor break marked in between
> each (the default behavior when minor breaks aren't specified) at the 6
> hour marks. This is the correct spacing and has removed the major/minor
> misalignment. However, now the grid lines up poorly with the data and (at
> least to me) the breaks seem illogical since they are at odd times (21, 3,
> 9, ...).
>
> I think this is what Hadri's first solution was attempting to address - by
> adjusting the start/end points of the scale to the min/max values, we
> should get better aligned breaks. Yet this doesn't work as anticipated - it
> doesn't change my graph at all. I think the graph already assigned these
> values as default.
>
> Perhaps it's the "expand" feature of scale_x_datetime? This feature seems
> to put space between the min/max values on the scale and the edges of the
> drawn grid. I wasn't able to figure out how to make "expand" work directly,
> but I was able to "force" the data into the desired alignment by adjusting
> the values in "limits".
>
> *fourth solution (shifting the scale minimum to line up the first time with
> a major break)*
> qplot(time, values, data =
> sampleDF)+scale_x_datetime(breaks=date_breaks("12 hour"), limits =
> as.POSIXct(c("2013-05-01 03:00:00", "2013-05-03 08:00:00"),
> tz = "America/Los_Angeles"))
>
> The main problem with this "solution" is that it draws too much of the
> range before the first data-point... =/

... and it also makes ggplot hard to use, but most of ggplot is easy and
intuitive, at least after a bit of learning.

> I still suspect that "expand" might be the place to look for an alignment
> solution...
>
> And that's where I stopped for the day. =)

Me, too. :-)

Does this spark any ideas?

Bill Harris

Jun 4, 2013, 8:17:24 PM
to Hadri Commenges, ggplot2
Hadri Commenges <hadri.c...@gmail.com> writes:

> I tried your script changing the timezone and only minor breaks move. I
> think it's because you create a plot with qplot and then you scale only
> minor breaks. You should scale both major and minor breaks:

What time zone did you change? My current speculation is that you'd
need to change the locale of your computer to see what we're seeing --
or that the locale of your computer when combined with the time zone of
your date makes the results look different.

It might be nice -- and it might solve the problem -- if
scale_x_datetime() had an option such as force_tz and with_tz.
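
For reference, here is a small sketch of what the two lubridate functions named above do to one of the sample points. It only illustrates the distinction between them; neither is an option of scale_x_datetime() in the versions discussed here.

```r
library(lubridate)

x <- ymd_hm("2013-05-02 18:00", tz = "America/Los_Angeles")

# with_tz(): same instant, shown on a different clock.
shown_utc <- with_tz(x, "UTC")

# force_tz(): same clock reading, reinterpreted as a different instant.
forced_utc <- force_tz(x, "UTC")

same_instant <- as.numeric(shown_utc) == as.numeric(x)
shift_hours  <- (as.numeric(forced_utc) - as.numeric(x)) / 3600

print(format(shown_utc, "%Y-%m-%d %H:%M %Z"))
print(format(forced_utc, "%Y-%m-%d %H:%M %Z"))
print(shift_hours)
```

The forced version lands 7 hours before the original instant, which is 11:00 Pacific time. That happens to match the 11 a.m. position reported earlier in the thread, though it does not prove this is the cause.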

> qplot... + scale_x_datetime( ... breaks = date_breaks("6 hour"),
> minor_breaks = date_breaks("3 hour"))
>
> Does it work for you?

I got the same result Brian just showed.

Hadri Commenges

Jun 5, 2013, 3:42:24 AM
to Bill Harris, ggplot2
I didn't change the locale, only the option tz = when creating the timelist object.

Another option to add to Brian's survey would be to draw only one level of grid...

--
Hadri

2013/6/5 Bill Harris <bill_...@facilitatedsystems.com>

Brian "Meyer" Waismeyer

Jun 4, 2013, 5:27:25 PM
to Hadri Commenges, Bill Harris, ggplot2
I've been following the conversation and was able to reproduce the issue Bill described. The first solution proposed also failed to solve the problem for me. I'm not sure the second solution quite works either.

To consolidate the discussion and make comparison easy, I'm reporting the three code variations below and attaching PNGs of each graph the code produced. I comment on how each piece of code behaved for me and then I add some additional thoughts and attempts at producing the behavior Bill seems to be going for. (Note: I double checked that my R and packages were all updated. My timezone is PST.)

Here's the data setup code (as provided by Bill):
require(ggplot2)
require(scales)
require(lubridate)
timelist <- ymd_hm(c("2013-05-01 12:00","2013-05-02 18:00","2013-05-03 06:00"),tz="America/Los_Angeles")
sampleDF <- data.frame(time = timelist, values = c(3,4,2))

> sampleDF
                 time values
1 2013-05-01 12:00:00      3
2 2013-05-02 18:00:00      4
3 2013-05-03 06:00:00      2

Here's the first plot code (as provided by Bill):
qplot(time, values, data = sampleDF)+scale_x_datetime(minor_breaks=date_breaks("6 hour"))

- on my computer: major/minor breaks are oddly aligned so that minor breaks don't nicely parse the space between the major breaks - this makes the spacing look "uneven" (similar to Bill's, I think)

Here's the first solution proposed (as provided by Hadri):
qplot(time, values, data = sampleDF) + scale_x_datetime(limits = c(min(timelist), max(timelist)), minor_breaks=date_breaks("6 hour"))

- on my computer: major/minor breaks are still oddly aligned and space still looks uneven (similar to Bill's, I think). In fact, I see no changes from the graph produced by the original code.

Here's the second solution proposed (as provided by Hadri):
qplot(time, values, data = sampleDF)+scale_x_datetime(breaks=date_breaks("6 hour"), minor_breaks = date_breaks("3 hour"))

- on my computer: the space is now evenly parsed but the major breaks seem oddly placed (on 21, 3, 9, ... instead of 0, 6, 12, ...).

Even though this solution works, I'm still confused as to why the breaks lined up so oddly in the first place. I suspect this solution simply masks the mis-alignment going on between the major/minor breaks by having the minor breaks "overfill." 

For instance, I think Bill wanted this:
major  minor
0      0
       6
12     12

But got this:
major  minor
0
       3
       9
12

The current solution gives this:
major  minor
       0
3      3
       6
9      9
       12

The mis-alignment is still there - we've just masked it by having the minor breaks fill in the gaps.

In fact, reading through ??scale_x_datetime, I am confused why the following doesn't work to give the behavior Bill seems to want.

third solution (using the default minor breaks)
qplot(time, values, data = sampleDF)+scale_x_datetime(breaks=date_breaks("12 hour"))

The major breaks occur every 12 hours with a minor break marked in between each (the default behavior when minor breaks aren't specified) at the 6 hour marks. This is the correct spacing and has removed the major/minor misalignment. However, now the grid lines up poorly with the data and (at least to me) the breaks seem illogical since they are at odd times (21, 3, 9, ...).

I think this is what Hadri's first solution was attempting to address - by adjusting the start/end points of the scale to the min/max values, we should get better aligned breaks. Yet this doesn't work as anticipated - it doesn't change my graph at all. I think the graph already assigned these values as default. 

Perhaps it's the "expand" feature of scale_x_datetime? This feature seems to put space between the min/max values on the scale and the edges of the drawn grid. I wasn't able to figure out how to make "expand" work directly, but I was able to "force" the data into the desired alignment by adjusting the values in "limits".

fourth solution (shifting the scale minimum to line up the first time with a major break)
qplot(time, values, data = sampleDF)+scale_x_datetime(breaks=date_breaks("12 hour"), limits = as.POSIXct(c("2013-05-01 03:00:00", "2013-05-03 08:00:00"), tz = "America/Los_Angeles"))

The main problem with this "solution" is that it draws too much of the range before the first data-point... =/

I still suspect that "expand" might be the place to look for an alignment solution...

And that's where I stopped for the day. =)

Warm regards,
Brian

-- 
_____________________________________

Brian Waismeyer, MA
Researcher and Data Geek

_____________________________________
original code.png
first solution.png
second solution.png
third solution.png
fourth solution.png

Brian "Meyer" Waismeyer

Jun 5, 2013, 7:06:40 PM
to Hadri Commenges, Bill Harris, ggplot2
Alright. I /think/ I've arrived at a pair of working solutions and maybe a better sense of why Bill's data graphs oddly.

Here's my logic:
1. The grid breaks seem to be dependent on (or influenced by) the scale minimum. So, for instance, if we happen to have a minimum that is "odd", then we seem to get odd breaks.
2. The scale minimum seems to be influenced both by the "limits" argument and the "expand" argument. "limits" says what the minimum should be and "expand" shapes how far the edge of the graph should be from the minimum.
3. The default setting of "expand" may make time scales behave in unexpected ways. scale_x_datetime seems to derive a lot of its behavior from scale_x_continuous. Continuous scales, however, tend to either be anchored at 0 (which would make for predictable break behavior) or are non-cyclical (perhaps making odd breaks less obvious). 

If this idea is correct, then Bill's graph looks odd because the "expand" default values are shifting the starting point of the drawn grid. This shift, in turn, is shifting the x-axis breaks off of the minimum data value. This produces illogical breaks since we're working with cyclical, time-incremented data. 
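
The arithmetic behind this idea can be checked in base R. The sketch below assumes two things the thread does not confirm: that the default padding is 5% of the data range on each side, and that the break sequence starts at the padded minimum truncated to the hour.

```r
times <- as.POSIXct(
  c("2013-05-01 12:00", "2013-05-02 18:00", "2013-05-03 06:00"),
  tz = "America/Los_Angeles"
)

# Assumed default: pad the data range by 5% on each side.
rng    <- range(times)
pad    <- 0.05 * as.numeric(diff(rng), units = "secs")
padded <- c(rng[1] - pad, rng[2] + pad)

# Assumed: breaks start at the padded minimum, truncated to the hour.
start  <- as.POSIXct(trunc(padded[1], units = "hours"))
breaks <- seq(start, padded[2], by = "6 hours")

break_hours <- format(breaks, "%H:%M", tz = "America/Los_Angeles")
print(break_hours)
```

Under those assumptions the padded minimum is 09:54 on May 1, so the 6-hour sequence falls on 09:00, 15:00, 21:00 and 03:00, the same 3:00, 9:00, ... pattern reported in the original post.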

We should be able to force the breaks to line up with our time data in two ways. 

First, we could set the "expand" argument so that it does nothing - this should force the grid to start at our "limits" minimum. We could either let this be the minimum value in our data (though this will partially occlude that point) or pick a minimum that we find reasonable.

creating the example data (from Bill's original post)
require(ggplot2)
require(scales)
require(lubridate)
timelist <- ymd_hm(c("2013-05-01 12:00","2013-05-02 18:00","2013-05-03 06:00"),tz="America/Los_Angeles")
sampleDF <- data.frame(time = timelist, values = c(3,4,2))

working solution 1: removing the "expand" effect and forcing a desirable start value (see attached PNG)
p <- ggplot(sampleDF, aes(x = time, y = values))
p + geom_point() + scale_x_datetime(breaks = date_breaks("12 hour"), limits = as.POSIXct(c("2013-05-01 00:00:00", "2013-05-04 00:00:00"), tz = "America/Los_Angeles"), expand = c(0, 0))

* NOTE: I switched to ggplot rather than the qplot syntax to help me think this through.

Second, we could give an exact list of our major breaks rather than simply specifying the interval. This should circumvent the default behaviors that are setting odd breaks.

working solution 2: specifying all major breaks explicitly (see attached PNG)
# create desired timelist (sorry this is a REALLY inefficient way of doing this - I think the chron package would handle this better)
startTime <- as.POSIXct("2013-05-01 00:00:00", tz = "America/Los_Angeles")
majorBreak <- dhours(12) # 12-hour lubridate duration between major breaks
increments <- (3 * 24) / 12 # days in range * hours, divide the result by the break interval

majorBreakSet <- c(startTime) 
for(i in (1:increments)) {
  newTime <- majorBreakSet[1] + (majorBreak * i)
  majorBreakSet <- c(majorBreakSet, newTime)
}

# now create the graph with the explicit major breaks - go ahead and allow default minor break and expand behavior, should work fine
p <- ggplot(sampleDF, aes(x = time, y = values))
p + geom_point() + scale_x_datetime(breaks = majorBreakSet)


This solution lets us make use of the default ggplot behaviors. This is especially nice because it ignores some of the sillier explicit breaks I set (the start and end breaks) which fall kinda far from our data. We could, however, force it to respect these breaks by adjusting "limits" if we wanted and still have things nicely lined up the way we intended.
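
A shorter way to build the same vector of explicit breaks is base R's seq() on POSIXct values. This sketch assumes the same start, end and 12-hour spacing as the loop above and needs no extra packages.

```r
startTime <- as.POSIXct("2013-05-01 00:00", tz = "America/Los_Angeles")
endTime   <- as.POSIXct("2013-05-04 00:00", tz = "America/Los_Angeles")

# Every 12 hours from the first midnight to the last, inclusive.
majorBreakSet <- seq(startTime, endTime, by = "12 hours")

break_labels <- format(majorBreakSet, "%m-%d %H:%M", tz = "America/Los_Angeles")
print(break_labels)
```

The resulting majorBreakSet can be passed to scale_x_datetime(breaks = majorBreakSet) in the same way as the loop-built version.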

I hope this helps and thanks for the fun discussion!!

Warm regards,
Brian

force expand.png
explicit breaks.png

Bill Harris

Jun 8, 2013, 6:45:33 PM
to Brian "Meyer" Waismeyer, ggplot2
"Brian \"Meyer\" Waismeyer" <meye...@gmail.com> writes:

> We should be able to force the breaks to line up with our time data in two
> ways.

Thanks, Brian. You've done a lot on this. Hadri has, too, but the
solutions don't work as well for me. That said, these don't seem ideal,
either.

> First, we could set the "expand" argument so that it does nothing - this
> should force the grid to start at our "limits" minimum. We could either let
> this be the minimum value in our data (though this will partially occlude
> that point) or pick a minimum that we find reasonable.

This seems to work. I can simulate the expand option (which I had never
used) artificially by subtracting and adding time to the limits.

Still, the code begins to become baroque, with setting expand and limits
and then rounding and offsetting limits to simulate the effect I wanted
originally.

> *working solution 2: specifying all major breaks explicitly (see attached
> PNG)*

If the first solution is baroque, this seems rococo: to work with
varied data sets, I have to calculate the breaks explicitly from
whatever data I use.
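
The per-data-set calculation can at least be wrapped once and reused. The helper below, aligned_breaks(), is hypothetical (it is not part of ggplot2 or scales) and assumes that breaks should be counted from midnight in the data's own time zone.

```r
# Hypothetical helper: breaks aligned to midnight in the data's time zone.
aligned_breaks <- function(x, by = "6 hours") {
  first_midnight <- as.POSIXct(trunc(min(x), units = "days"))
  last_midnight  <- as.POSIXct(trunc(max(x), units = "days"))
  # Step one calendar day past the last data point so the range is covered.
  end <- seq(last_midnight, by = "DSTday", length.out = 2)[2]
  seq(first_midnight, end, by = by)
}

times <- as.POSIXct(
  c("2013-05-01 12:00", "2013-05-02 18:00", "2013-05-03 06:00"),
  tz = "America/Los_Angeles"
)

breaks <- aligned_breaks(times, by = "6 hours")
hours  <- format(breaks, "%H", tz = "America/Los_Angeles")
print(format(breaks, "%m-%d %H:%M", tz = "America/Los_Angeles"))
```

The returned vector can be given to breaks = or minor_breaks = in scale_x_datetime(). Days containing a daylight-saving change are not handled specially in this sketch.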

Neither seems consistent with the general way other scales that I've used
work. It seems quite arcane and undocumented compared to most of
ggplot2. I'll probably use the first, submit a documentation request to
explain it, and hope it turns into a change in ggplot2.

Thanks to both of you for engaging with this. I've learned a few
things.

sherif

Oct 6, 2015, 7:32:40 AM
to ggplot2, meye...@gmail.com
It's a bit over two years later, but I had been running into this same problem and it had been driving me crazy for hours. This was one of two threads I found discussing it.

I found a solution to my case, so I thought I'd share what I did in case it is helpful for anyone else in this thread, past or future.

What fixed the offset for me was setting the "tz" value in "date_format".

So instead of:

...
    scale_y_datetime(breaks=date_breaks("1 hour"),
                     labels = date_format("%H:%M"),
                     limits = range(keys7$ytime),
                     expand = c(0,60)) +
...
Try

...
    scale_y_datetime(breaks=date_breaks("1 hour"),
                     labels = date_format("%H:%M", tz="America/Toronto"), # <======
                     limits = range(keys7$ytime),
                     expand = c(0,60)) +  
...
And here's a link to a github issue that I suspect might be the same issue I ran into, where I posted a more elaborate comment:
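
The effect of the tz argument can be seen on a sample point from this thread without drawing a plot. This is a minimal sketch, assuming date_format() in the scales package keeps its documented default of tz = "UTC"; the point and zone are taken from the original post, not from the keys7 data above.

```r
library(scales)

x <- as.POSIXct("2013-05-02 18:00", tz = "America/Los_Angeles")

# Labeller with the default tz argument ("UTC").
label_default <- date_format("%H:%M")(x)

# Labeller told to use the data's own zone.
label_local <- date_format("%H:%M", tz = "America/Los_Angeles")(x)

print(c(default = label_default, local = label_local))
```

With the default, the label is 7 hours away from the clock time stored in the data, so tick labels and data can appear shifted relative to each other even when the break positions are right.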