proposal: allow for service periods to be defined by multiple calendar ranges in calendar.txt

Joe Hughes

Mar 11, 2008, 1:11:47 PM
to Google Transit Feed Spec Changes
Summary:
This proposal discusses allowing the same service_id to appear in
multiple rows of calendar.txt, in order to allow a service period to
be defined by multiple disjoint date ranges.

Motivation:
Some systems (particularly in Europe) have both "normal" and
"vacation" service, with the latter occurring for several periods
during the year. Currently, GTFS allows for service periods to be
defined using a single date range (in calendar.txt) and/or by adding or
removing individual days for a service period (in calendar_dates.txt).
For a system with multiple extended "vacation" periods during the
year, the only option currently is to list all the individual vacation
dates, which is unnecessarily verbose.

Proposal:
The semantics of calendar.txt would be adjusted to allow for the same
service_id to appear in multiple rows in order to define multiple date
ranges for a single service_id. These date ranges should not overlap;
in the case of overlap between two date ranges, a parser should
discard the one which occurs later in the calendar.txt file. If the
same service_id is used in both calendar.txt and calendar_dates.txt,
the individual date additions and removals in calendar_dates.txt would
be applied to the date ranges as they are currently.
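
A minimal Python sketch of the overlap rule as worded above, assuming the
rows of calendar.txt have already been parsed into (service_id, start_date,
end_date) tuples in file order and that both endpoints are inclusive. The
function name keep_non_overlapping is hypothetical and not part of any
existing GTFS tool:

```python
from collections import defaultdict
from datetime import date


def keep_non_overlapping(rows):
    """Group date ranges by service_id, in file order.

    A range that overlaps a range already kept for the same service_id
    is discarded, so the earlier row in the file wins.
    """
    kept = defaultdict(list)
    for service_id, start, end in rows:
        # Two inclusive ranges overlap when each starts on or before
        # the day the other one ends.
        overlaps = any(
            start <= kept_end and kept_start <= end
            for kept_start, kept_end in kept[service_id]
        )
        if not overlaps:
            kept[service_id].append((start, end))
    return dict(kept)


if __name__ == "__main__":
    rows = [
        ("VAC", date(2008, 2, 9), date(2008, 2, 24)),
        ("VAC", date(2008, 2, 20), date(2008, 3, 1)),  # overlaps the first row
        ("VAC", date(2008, 4, 5), date(2008, 4, 20)),
    ]
    print(keep_non_overlapping(rows))
```

Ranges belonging to different service_ids are never compared with each
other in this sketch.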

Implications for current users:
Existing feeds would remain valid with this change. Parsers would need
to be updated to expect multiple date ranges.

Thoughts?

Joe Hughes
Google

Tom Brown

Mar 11, 2008, 2:52:13 PM
to gtfs-c...@googlegroups.com
On Tue, Mar 11, 2008 at 10:11 AM, Joe Hughes <joe.hug...@gmail.com> wrote:
> ranges for a single service_id. These date ranges should not overlap;
> in the case of overlap between two date ranges, a parser should
> discard the one which occurs later in the calendar.txt file. If the
> same service_id is used in both calendar.txt and calendar_dates.txt,
> the individual date additions and removals in calendar_dates.txt would
> be applied to the date ranges as they are currently.

On the whole, this looks good. But why burden parsers with checking for overlapping ranges when reading calendar.txt? Validation should fail for overlapping ranges. General parsers should either be strict like the validator and fail to load overlapping ranges or be loose and treat overlapping ranges as both active (probably what the feed creator intended).

Joe Hughes

Mar 11, 2008, 3:02:19 PM
to gtfs-c...@googlegroups.com

My goal here is that the expected behavior of parsers is as
well-defined as possible, especially in edge cases. (In practice, we
should expect that people want to write parsers that give useful
output for as much input as possible.)

I'll count your message as a vote for an alternate interpretation of
this edge case: that overlapping ranges should be allowed, and that
the resulting service period has the union of the active days defined
by all the ranges with that service_id. (Any additions and removals
of dates in calendar_dates.txt are then performed on that union.)
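
A minimal Python sketch of this union interpretation, assuming both files
have already been parsed into dicts keyed by the GTFS field names, with
dates as datetime.date objects and flags as integers. The function name
active_dates is hypothetical:

```python
from datetime import date, timedelta

# calendar.txt weekday columns, in the order used by date.weekday()
WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday")


def active_dates(calendar_rows, calendar_date_rows, service_id):
    """Return the set of dates on which service_id is active."""
    active = set()

    # Union of all ranges for this service_id; overlaps are harmless
    # because a set holds each date only once.
    for row in calendar_rows:
        if row["service_id"] != service_id:
            continue
        day = row["start_date"]
        while day <= row["end_date"]:
            if row[WEEKDAYS[day.weekday()]] == 1:
                active.add(day)
            day += timedelta(days=1)

    # Exceptions are applied to the union:
    # exception_type 1 adds a date, exception_type 2 removes it.
    for row in calendar_date_rows:
        if row["service_id"] != service_id:
            continue
        if row["exception_type"] == 1:
            active.add(row["date"])
        elif row["exception_type"] == 2:
            active.discard(row["date"])

    return active
```

Applying the exceptions after the union has been built means the result
does not depend on the order of rows in calendar.txt.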

Joe

Marc Ferguson

Mar 11, 2008, 3:11:52 PM
to gtfs-c...@googlegroups.com

If you allow overlaps then you run into the possibility that there will be multiple nearly identical trips "active" that are intended to be mutually exclusive.  The times might be slightly different or the stops used may be different because of the "vacation" service.  When a trip is planned on the day where overlap occurs you run the risk of getting a trip that doesn't really run.

I vote for no overlaps

Marc Ferguson
marc.f...@trapezegroup.com
757-961-9224 x304

Joe Hughes

Mar 11, 2008, 3:17:49 PM
to gtfs-c...@googlegroups.com
Keep in mind that what we're discussing is overlap between ranges
specifying a single service period, not ranges specifying different
service periods. Correct me if I'm wrong, but I don't believe that
the case you mention would occur within a single service period.

Joe

Aaron Antrim

Mar 11, 2008, 3:29:53 PM
to gtfs-c...@googlegroups.com
Joe: I support the proposed change. As an anecdote, at one point I assumed this is how the spec for calendar.txt worked, and then one of my feeds failed validation. Several clients are in university towns, and they have different school calendar schedules and vacation schedules.

Here's what my web-app does: Weekly service schedules (Mon-Sun) are assigned to annual service schedules, which are defined by multiple date ranges. When it exports a GTFS feed, a unique service_id is defined for each combination of annual and weekly service schedules. This results in a lot of what are basically duplicated trips and stop times, each assigned to one of these service_ids.

I'll also offer my vote for Tom's suggestion, that in cases of service period overlap, the service period include the union of the overlapping service periods.

mikeness

Mar 12, 2008, 8:11:21 AM
to Google Transit Feed Spec Changes
Joe

Another way of achieving the same effect of compressing file size (and
complexity of generating the file) would be to add a new field (end
date of range) to calendar_dates.txt which would allow it to hold a
range of exceptional dates, rather than the single exceptional date as
now. The range could still be additional dates or no-service dates as
now and would default to a single date if only one date was specified.
At the moment our calendar.txt file is 5 MB and calendar_dates.txt is
23 MB, which would shrink substantially if ranges were allowed.
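
A minimal Python sketch of how a parser might expand such a row, assuming
the new optional column were called end_date (a hypothetical name; no such
field exists in calendar_dates.txt) and that rows are already parsed into
dicts with datetime.date values:

```python
from datetime import date, timedelta


def expand_exception(row):
    """Yield (service_id, date, exception_type) for each day the row covers.

    If the hypothetical end_date is missing or empty, the row covers
    the single day given in its date field, as it does today.
    """
    start = row["date"]
    end = row.get("end_date") or start
    day = start
    while day <= end:
        yield (row["service_id"], day, row["exception_type"])
        day += timedelta(days=1)


if __name__ == "__main__":
    easter_break = {
        "service_id": "VAC",
        "date": date(2008, 4, 5),
        "end_date": date(2008, 4, 20),
        "exception_type": 1,
    }
    print(len(list(expand_exception(easter_break))))
```

Expanding ranges into single dates at load time would let the rest of a
parser keep treating calendar_dates.txt as a list of individual dates.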

As a point of interest, should dates in calendar_dates.txt be within
the range defined in calendar.txt, or can calendar_dates.txt be used to
extend the range of service validity? Equally, does an additional date
in calendar_dates.txt which is not on one of the defined days (Monday to
Sunday) override the defined days? Are dates and days considered as
'and' conditions or as 'or' conditions? The specification does not
make this clear at the moment.

Joe Hughes

Mar 12, 2008, 11:25:48 AM
to gtfs-c...@googlegroups.com
Thanks for your comments, Mike.

On Wed, Mar 12, 2008 at 5:11 AM, mikeness <mike...@dsl.pipex.com> wrote:
> Another way of achieving the same effect of compressing file size (and
> complexity of generating the file) would be to add a new field (end
> date of range) to calendar_dates.txt which would allow it to hold a
> range of exceptional dates, rather than the single exceptional date as
> now. The range could still be additional dates or no-service dates as
> now and would default to a single date if only one date was specified.

That's an interesting proposal. The main disadvantage of doing the
multiple ranges in calendar_dates.txt instead of calendar.txt is that
you lose the ability to specify days of the week, so if you're trying
to specify multiple ranges during the year that have a particular type
of weekend service, you'd need to add a Saturday-Sunday range for each
individual weekend, instead of having ranges a couple months long that
have the saturday & sunday flags set.
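
To illustrate the difference in row counts, here is a small Python sketch
that lists the separate weekend ranges a date-range-only file would need
for a given period. The function name weekend_runs is hypothetical; a
single calendar.txt row with saturday=1 and sunday=1 would cover the same
days:

```python
from datetime import date, timedelta


def weekend_runs(start, end):
    """Return (first_day, last_day) runs of consecutive Saturday/Sunday
    dates between start and end, both inclusive."""
    runs = []
    day = start
    while day <= end:
        if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
            if runs and runs[-1][1] == day - timedelta(days=1):
                # Sunday following a Saturday: extend the current run.
                runs[-1] = (runs[-1][0], day)
            else:
                runs.append((day, day))
        day += timedelta(days=1)
    return runs


if __name__ == "__main__":
    for first, last in weekend_runs(date(2008, 6, 1), date(2008, 7, 31)):
        print(first, last)
```

For June 1 through July 31, 2008, this yields nine separate ranges, where
calendar.txt needs one row.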

> As a point of interest, should dates in calendar_dates.txt be within
> the range defined in calendar.txt, or can calendar_dates.txt be used to
> extend the range of service validity? Equally, does an additional date
> in calendar_dates.txt which is not on one of the defined days (Monday to
> Sunday) override the defined days? Are dates and days considered as
> 'and' conditions or as 'or' conditions? The specification does not
> make this clear at the moment.

You're right--we've iterated that wording in past revisions in an
attempt to improve this, but it's still not clear enough.
calendar_dates.txt modifies whatever range has been set in
calendar.txt, both in terms of adding service dates outside that range
and in removing dates from the range. It can also be used to
enumerate dates for a service_id on its own, without any corresponding
range defined in calendar.txt.

Joe

Jacques chez stibus

Mar 16, 2008, 6:35:14 AM
to Google Transit Feed Spec Changes
Hello,

Indeed, European transit systems, particularly French ones, are often
organized into two periods ("school activity" and "school
vacations"). In fact, many agencies even have three periods ("school
activity", "short school vacations" and "long summer school
vacations"). An extra period for "Sundays and public holidays"
quite often completes the organisation of services.

This kind of organisation is mainly caused by the fact that school
transport is fully integrated (and is frequently an important part) in
our activity.

As for SEMITIB-Stibus (Maubeuge, France), our services alternate
throughout the year between normal "school activity" and recurrent
"school vacations" periods. (That is why our annual timetable begins
in September.)

To get an adequate GTFS feed, I have been forced to create several
fake periods which contain strictly the same information and are
defined by pairs of start/end dates that we know for the whole year
(September to August). The "Sundays and holidays" period is defined for
the entire year and normally applies on Sundays (sunday=1 in
calendar.txt); public holidays are handled as exceptions in
calendar_dates.txt.

That certainly works, but we end up with a heavy, clumsy and complicated
file! Any change made to the timetables must be repeated in every
affected fake period, which multiplies the opportunities to make
mistakes!

Allowing multiple date ranges is a much more elegant and efficient way
to deal with this problem.

As for overlapping dates, it seems to be a good idea that the
resulting service period includes the union of the overlapping service
periods. If an overlap is not wanted, I expect that the people in charge
of planning at agencies would take care to avoid it!

Regards,
Jacques Lys

Aaron Antrim

Oct 13, 2009, 3:23:24 PM
to gtfs-c...@googlegroups.com
I see that allowing multiple date ranges with the same service_id is still an open proposal (http://groups.google.com/group/gtfs-changes/browse_frm/thread/d2ab2f43462c1714/08af018a45184d3a?lnk=gst&q=service_id#08af018a45184d3a).

From looking at the previous discussion, it looks as if there is agreement that this proposal should be incorporated into the GTFS, taking the approach that parsers will consider that the service period has the union of the active days defined by all the ranges with that service_id.

I'm interested in producing feeds that correspond to this GTFS change proposal.  Is this still planned to be included in a future update to the spec?

-Aaron

