Active versus static schedule data in feeds - service id usage


T Sobota

Sep 28, 2009, 10:31:17 AM
to Google Transit Feed Spec Changes
Due to an error message I reported as Issue #145, and resultant
discussion, Tom Brown has recommended that I attempt to offer up a
feed extension to resolve my issue. The full discussion surrounding
issue #145 can be accessed on the thread:
<http://groups.google.com/group/googletransitdatafeed/t/e504652d4dfba90f>.

In sum, what I am seeking the GTFS feed spec to support is trip ids in
the trips.txt table that are keyed to a service id that does not
happen to fall within the date range of the calendar_dates.txt file.

Operationally, these are holiday service trips that are "scheduled" as
part of the overall database for a three-month driver pick period
(i.e. date range of a GTFS feed), but will not actually operate in
that date range since none of the dates in the range are a day on
which we substitute holiday service for weekday service, etc.

For a GTFS feed, seen specifically from the "active" trip planning
engine perspective - I understand presently tagging trip ids with an
unused holiday service id as an [error]. But from the perspective of
a "static" service profile - it is imperative that these trip ids be
present in the feed. One specific implementation is the Timetable
Publisher application - which generates static (printable/web)
schedules for a transit property's website or printed schedule book -
based off the information contained in a GTFS feed. If trip ids with
holiday service ids were suppressed - a transit property would be
unable to generate the static schedules (for holidays) they might
publish on their website, or send to a printer for publication in a
booklet format.

I further brought forth the concept of using a GTFS feed as a "static"
repository of system data - in the sense that third party
applications could tabulate frequency of service to a bus stop under
various service scenarios and draw comparisons, etc. or do other such
analysis.

As far as an extension proposal goes, in light of Tom's comment, I would
first propose a new text file type "services.txt" which would have as
fields - at a minimum:
"service_id" (same as used in trips.txt);
"service_name" (logical name for service, otherwise clarifying
numerical or abbreviated service_id term used)
"service_desc" (optional description of service, such as Holiday
service operates on Christmas Day, New Years Day, etc.)
as well as any other fields needing to be carried over from other GTFS
extensions (i.e. "agency_id")
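
As a rough illustration only, a file along these lines could be read
with Python's csv module. services.txt is just a proposal in this
thread and is not part of GTFS; the sample rows below (service ids
"wkd" and "hol" and their names) are made up for the sketch.

```python
import csv
import io

# Hypothetical services.txt content. The file and its fields are only
# proposed in this thread; they are not part of the GTFS spec.
SERVICES_TXT = """service_id,service_name,service_desc
wkd,Weekday,
hol,Holiday,"Holiday service operates on Christmas Day, New Years Day, etc."
"""

def load_services(text):
    """Return a dict mapping service_id to its descriptive fields."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["service_id"]: row for row in reader}

services = load_services(SERVICES_TXT)
print(services["hol"]["service_name"])  # Holiday
```

An empty service_desc (as on the "wkd" row) simply comes back as an
empty string, which fits the field being optional.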

From a feed validation perspective - I would request that trip ids
with a service id not contained in calendar_dates.txt be flagged as
just a [warning], not an [error].

I welcome others' review of this extension proposal - both as to form
and content.

Much Thanks

Tim Sobota, Metro Transit (City of Madison)

KevinChicago

Sep 28, 2009, 1:28:46 PM
to Google Transit Feed Spec Changes
It took me some time to navigate through this thread to understand the
problem, and I'm still not 100% understanding every nuance. We in
Chicago have complex scheduling, with holidays and quarterly picks
(that are different for our bus versus our rail), and special school
runs that only run on Tuesdays, and a scheduling tradition that starts
and ends the day at 3am, not midnight, and 24 hour services that have
to work within those confines. The whole calendaring issue was
actually one of the most difficult parts of GTFS for us, so I take an
interest in any proposed changes, and I appreciate hearing
about other cities' viewpoints on the matter, because we're all going to
be very different, and we can learn from each other (and Google can
learn from us). With all that said, we got our data to work with
GTFS, so I'm surprised to hear a situation where it doesn't work.

I do have a clarification to ask of you, first, Tim. You said "...to
support trip ids in the trips.txt table that are keyed to a service id
that does not happen to fall within the date range of the
calendar_dates.txt file." and then later: "...request that trip ids
with a service id not contained in calendar_dates.txt be flagged as
just a [warning], not an [error]." What confuses me is that a
service_id doesn't need to be in the calendar_dates.txt file right
now, only the calendar.txt file. You also referred to "date range" in
the calendar_dates.txt file. There isn't a range in
calendar_dates.txt file, just individual dates. Did you mean
calendar.txt file in these sentences or calendar_dates.txt?

So, to clarify why I'm not yet understanding the problem 100%: why is
the following not acceptable?
- in calendar.txt: service_id=regular, start_date=20090301 and
end_date=20090524.
- also in calendar.txt: service_id=holiday, start_date=20090301 and
end_date=20091231, and the Monday thru Sunday fields are all 0.
- in calendar_dates.txt: service_id=holiday has exception_type=1 for
20090525, 20090704,20090907, etc.
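
To make the example above concrete, here is a minimal sketch of how a
consumer might turn those rows into actual service dates. The function
and variable names are my own for illustration, not from any existing
tool; the only spec behavior assumed is that exception_type=1 adds a
date and exception_type=2 removes one.

```python
from datetime import date, timedelta

DAY_FIELDS = ["monday", "tuesday", "wednesday", "thursday",
              "friday", "saturday", "sunday"]

def parse_date(text):
    """Convert a GTFS YYYYMMDD string to a date."""
    return date(int(text[0:4]), int(text[4:6]), int(text[6:8]))

def service_dates(calendar_row, exceptions):
    """Return the set of dates on which one service_id runs.

    calendar_row: dict for the service's calendar.txt row, or None.
    exceptions: (date, exception_type) pairs from calendar_dates.txt.
    """
    active = set()
    if calendar_row is not None:
        day = parse_date(calendar_row["start_date"])
        end = parse_date(calendar_row["end_date"])
        while day <= end:
            # date.weekday() is 0 for Monday, matching DAY_FIELDS order.
            if calendar_row[DAY_FIELDS[day.weekday()]] == "1":
                active.add(day)
            day += timedelta(days=1)
    for day_text, exception_type in exceptions:
        if exception_type == "1":      # service added on this date
            active.add(parse_date(day_text))
        elif exception_type == "2":    # service removed on this date
            active.discard(parse_date(day_text))
    return active

# The "holiday" service from the example: all weekday fields are 0.
holiday_row = dict.fromkeys(DAY_FIELDS, "0")
holiday_row.update(service_id="holiday",
                   start_date="20090301", end_date="20091231")
holiday_exceptions = [("20090525", "1"), ("20090704", "1"),
                      ("20090907", "1")]

holiday = service_dates(holiday_row, holiday_exceptions)
print(sorted(d.isoformat() for d in holiday))
# ['2009-05-25', '2009-07-04', '2009-09-07']
```

With no exceptions at all, the same all-zero row yields an empty set,
i.e. a service that never runs.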

Kevin

ps: We quite often will post an end_date value that is past the last
date of our current scheduling pick because we made the decision it's
worse to not have data past a certain date than to present data that
may change slightly. The reality is that not only will the trips
possibly change, but so may the routes and the stops. There are a lot
of unknowns weeks and months out that will never make it into the GTFS
and, yes, could potentially misinform a customer. But I believe there
is some expectation on the part of the customer that if they ask for
directions many weeks out, they probably should check back closer to
their travel date to see if anything changed. But I appreciate that
part of the point you're making isn't about Trip Planning
functionality, but other "static" uses, so you may very well be on to
something, so let's keep this conversation going.

T Sobota

Sep 28, 2009, 2:29:39 PM
to Google Transit Feed Spec Changes
Kevin-

Reviewing your message and feedback - I note I should clarify the
following.

I am building a feed using the "alternate" method noted under the spec
for calendar_dates.txt:

"Alternate: Omit calendar.txt, and include ALL dates of service in
calendar_dates.txt. If your schedule varies most days of the month, or
you want to programmatically output service dates without specifying a
normal weekly schedule, this approach may be preferable."

Semantically then... while there is no defined "date range" in the
calendar_dates.txt file - there is a start date (with service ids
added) and end date (with service ids added)... and in our situation -
none of the dates listed had the holiday service id added.

I have not attempted what you partially reference as a work-around (at
least for the error message part) - as far as maintaining the
alternate method above, but then (violating text) having a
calendar.txt file with (just?) the holiday service id in it and zeros
for each day.

Where you note the vagaries of "weekday" service (i.e. extras), we
have similar complexities that prompted me to use this alternate
method of calendaring services/trips.

Going beyond the actual driver pick period was also an option
discussed between Tom Brown and myself... to which I raised the same
complication of routes potentially changing if a user were to plan an
"active" trip itinerary off such a feed.

Tom Hixson

Sep 28, 2009, 3:46:54 PM
to Google Transit Feed Spec Changes
I believe the spec already handles this (although its implementation
may not). To create a service calendar (which service on which date)
you start with the calendar.txt template then apply the
calendar_dates.txt exceptions. The spec should assume 1) BOTH are
optional, and 2) Any trip whose service is not used (has no dates)
should be ignored (other than the warning it gets now). If you don't
know when to use something, don't use it.

So, if you use calendar.txt you might set all days of the holiday
service to 0 and have no entry in calendar_dates.txt, like Kevin
says. Here's the trick: This 'all days=0' entry is the same as a
service with no entry at all. That is, a 'zero days' service
(explicit) should be equivalent to a 'null' service (omitted)--in
either case the resulting service calendar is empty. If there are no
exceptions to this 'nothing' (no entries in calendar_dates.txt), the
service ends up with no dates at all, and any trips with that service
should just be ignored.
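
A small sketch of that equivalence, assuming the consumer has already
built a map from service_id to its set of dates (the names and sample
ids here are illustrative only):

```python
def split_trips(trips, service_calendar):
    """Separate trips with at least one service date from those with none.

    service_calendar maps service_id to a set of dates. A service_id
    that is missing from the map is treated like one with an empty set.
    """
    active, unused = [], []
    for trip in trips:
        dates = service_calendar.get(trip["service_id"], set())
        (active if dates else unused).append(trip["trip_id"])
    return active, unused

trips = [
    {"trip_id": "t1", "service_id": "regular"},
    {"trip_id": "t2", "service_id": "holiday"},
]

# 'holiday' present with zero days versus 'holiday' omitted entirely.
explicit_zero = {"regular": {"20090302"}, "holiday": set()}
omitted = {"regular": {"20090302"}}

print(split_trips(trips, explicit_zero))  # (['t1'], ['t2'])
print(split_trips(trips, explicit_zero) == split_trips(trips, omitted))  # True
```

Either way trip t2 lands in the "unused" list, which a trip planner
can ignore and a validator can warn about.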

Tom
Sacramento

KevinChicago

Sep 29, 2009, 10:33:52 AM
to Google Transit Feed Spec Changes
So I guess Tim's basic proposal should still be taken at face value:

If the trips.txt file has a service_id that is not in either
calendar.txt or calendar_dates.txt file, the validator should give it
a warning, but not an error. If the spec currently accepts the
"all_days=0" solution, then why not just accept it when the service_id
is omitted? I think I agree that there's no difference. Allowing
"non-calendared" service_ids to remain in a data feed could have other
uses, so why not? Trip Planners require calendared data; so if
there's no calendar for a trip, don't schedule that trip. Seems
simple.
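
For what it's worth, the check could be as small as the sketch below.
This is not the actual feedvalidator code, just an illustration with
made-up names of reporting the condition as a warning instead of
stopping with an error.

```python
def unscheduled_trip_warnings(trip_rows, calendar_ids, calendar_dates_ids):
    """Return one warning per trip whose service_id has no calendar entry."""
    known = set(calendar_ids) | set(calendar_dates_ids)
    warnings = []
    for row in trip_rows:
        if row["service_id"] not in known:
            warnings.append(
                "Warning: trip_id %s uses service_id %s, which is in "
                "neither calendar.txt nor calendar_dates.txt"
                % (row["trip_id"], row["service_id"]))
    return warnings

trip_rows = [
    {"trip_id": "t1", "service_id": "wkd"},
    {"trip_id": "t2", "service_id": "hol"},
]
for message in unscheduled_trip_warnings(trip_rows, [], ["wkd"]):
    print(message)
```

Here only t2 is reported, since "hol" appears in neither calendar
file, and the feed as a whole would still be accepted.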

To try and take an opposing viewpoint, the only thing I can think of
is that the owner of a Trip Planner app (like Google Maps) may be wary
of allowing more warning-only data through in the sense that if an
agency forgets to validate and unintentionally leaves out some
service_ids, there is a partial headache on the part of the app-owner
to deal with the effects. At this point, that's the only downside I
see; but I don't think it's a good enough reason.

I have not thought through what other "static" uses are out there that
might counter what Tim is proposing. For example, what if there's a
route-pattern that only runs on holidays (maybe some street is closed
or something--just hypothetical here). Let's say there's a bus stop
that gets service only on holidays. Should Google Maps still map
those stop locations in its map tiles? What is the logic behind which
stops get mapped?*

* as a sidebar example: I would prefer that any stop_id that always
has a pickup_type=1 (no pickups allowed) in the stop_times.txt file
never get mapped in a static map. I want it used in a trip-planner,
but not in a static map app. The GTFS spec never really discusses
much the mapping part of Google Transit/Maps.

Tom Hixson

Sep 29, 2009, 1:09:04 PM
to Google Transit Feed Spec Changes
Here's another reason for including unused trips. Our schedulers
create a sign-up by copying the previous one. They want the holiday
service to just come along, 'unhiding' it when appropriate.
Extracting the holiday service from the old sign-up where it was last
used is a hassle for them.

The reason for the warnings is to remind the agency not to forget it!

All stops should be mapped, but the display needs a time dimension
(like 'today') that constrains the stop bubbles' schedules and the
transit layer patterns. This is an issue even without holidays--if
today is a weekend does Maps show weekday patterns and schedules?

We use 'no pickup' for end-of-route stops where the driver can run hot
("head for the barn"). This keeps them out of the trip planner, but
the driver will stop if flagged. Also, drop-offs might like to see
where they're being dropped. So I'd say no display only if there is no
pickup and no dropoff (as in a "control point").
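
As a sketch of that rule (my own illustrative names, and not a
description of how Google Maps actually decides what to draw): hide a
stop only when every one of its stop_times.txt rows has both
pickup_type=1 and drop_off_type=1.

```python
def stops_to_hide(stop_time_rows):
    """Return stop_ids where every stop_time allows neither pickup nor drop-off."""
    seen = set()
    usable = set()
    for row in stop_time_rows:
        seen.add(row["stop_id"])
        # An empty or missing value means regularly scheduled (same as "0").
        no_pickup = row.get("pickup_type", "") == "1"
        no_dropoff = row.get("drop_off_type", "") == "1"
        if not (no_pickup and no_dropoff):
            usable.add(row["stop_id"])
    return seen - usable

rows = [
    {"stop_id": "end_of_route", "pickup_type": "1", "drop_off_type": "0"},
    {"stop_id": "control_point", "pickup_type": "1", "drop_off_type": "1"},
    {"stop_id": "main_st", "pickup_type": "", "drop_off_type": ""},
]
print(stops_to_hide(rows))  # {'control_point'}
```

The end-of-route stop stays visible because riders can still be
dropped there, even though the trip planner won't board anyone.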

The icons should only mark the actual signs, which is fairly static.
The service at those stops, which is dynamic, should be displayed in
other ways.

Devin Braun

Oct 1, 2009, 6:33:17 PM
to Google Transit Feed Spec Changes
I also vote for service_ids and resultant trip_ids not being used in a
feed being flagged as a warning.

The workaround listed with calendar.txt having all zeroes and then
applying the service_ids via calendar_dates.txt is tricky and wouldn't
be intuitive to non-programmer types unless they asked.

My own selfish reason for changing this to a warning is due to special
event service. It is easier just to keep the special service in the
feed all the time for scheduling system exports - and then just
activate the service_ids when needed. Another proposed change to the
merge.exe tool could help this issue - merging two feeds with
overlapping effective dates (which is currently illegal with the
tool). One can keep the special event or holiday service in a
separate feed and then merge the current and special feed together for
posting.

But in the case of static scheduling requirements, I agree that all
service and service_ids should be allowed in the files with just a
warning. Timetablepublisher would definitely require that.

Devin Braun
San Diego MTS

Tom Brown

Oct 2, 2009, 9:54:37 PM
to gtfs-c...@googlegroups.com
Kevin's solution of a calendar.txt row with 0 for every day is a good way to add trips that don't run in the current period without violating the current version of GTFS. You should get a warning that the service_id never runs, which I think is reasonable.

Two changes to GTFS have been suggested in this thread.

1) permit trips.txt to contain rows with service_id not in calendar.txt or calendar_dates.txt

It sounds like adopting this will make producing data easier, an important goal of GTFS. Devin and Tim, you both mentioned TimeTablePublisher as a consumer that makes use of trips with a service_id not found in calendar.txt or calendar_dates.txt. I have a few questions.

Does TTP treat trips with a service_id found in the calendar files differently from those without a calendar? How should the two categories of GTFS consumers, "active" trip planners and "static" service profilers, interpret a trip without a defined calendar?

A small advantage of requiring trips.txt service_id values to appear elsewhere is that it provides a check against errors. If trip planners drop these trips, then adding a space character at the end of a service_id could drop one or many trips with just a warning. I think the only other places where such a small error could change the meaning of a file without making it invalid are zone_id and block_id. Extraneous spaces appear often enough that they deserve their own warning: http://code.google.com/p/googletransitdatafeed/issues/detail?id=169
Maybe a warning is enough.


2) add services.txt that contains a service_id and values associated with the service

Before this can be adopted it should be demonstrated in use by a producer and consumer. TTP might be a good candidate consumer.

Must every service_id in trips.txt appear in at least one of services.txt, calendar.txt or calendar_dates.txt? Must every service_id in services.txt be used by at least one trip?
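
On the extraneous-space concern raised under change 1, a dedicated
whitespace warning might look roughly like the sketch below. This is
an illustration with assumed names, not the feedvalidator
implementation; it relies only on the fact that Python's csv module
does not strip spaces from values by default.

```python
import csv
import io

def whitespace_warnings(file_name, text, column):
    """Warn about values in one column with leading or trailing whitespace."""
    warnings = []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2; line 1 is the header.
    for line_number, row in enumerate(reader, start=2):
        value = row[column]
        if value != value.strip():
            warnings.append(
                "%s line %d: %s value %r has leading or trailing whitespace"
                % (file_name, line_number, column, value))
    return warnings

TRIPS_TXT = "trip_id,service_id\nt1,wkd\nt2,hol \n"
for message in whitespace_warnings("trips.txt", TRIPS_TXT, "service_id"):
    print(message)
```

Run alongside the "service_id has no calendar" warning, this would
point the producer at the likely cause (a stray space) and not just
the symptom.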

T Sobota

Oct 5, 2009, 2:15:54 PM
to Google Transit Feed Spec Changes
Thanks for the comments.

1. I would note that the underlying reason I filed the initial error
message complaint was the factor cited by Tom Hixson on 9/29 - that
schedulers copy databases forward (and then make any modifications).
Our bus driver contract requires that they have four opportunities
each year to select new work (routes to drive) - which means quarterly
"re-affirmation" of the schedule database (whether anything actually
changes or not). Our scheduling software package makes it very easy
to copy databases forward and slap a new label on them, so to the
extent that scheduling software serves as the source for generating
the GTFS feed - it would require filtering out unused services before
publishing a feed.

2. As noted, I am open to suggested work-arounds playing with zero
values in the calendar.txt file. However, I would note that the feed spec
does state that usage of just a calendar_dates.txt file is
permissible... so continuing down such a path would require reworking
the feed spec pages (including language as to how to formulate the
workaround necessary).

3. Per Tom Brown's message of 10/2... TimeTablePublisher (in my
observation) only pays attention to what the user dictates it pay
attention to. From the perspective of the app producing static
schedule output (i.e. HTML pages or PDF files), the user essentially
must pre-define query variables with which to parse the whole of GTFS
data. (i.e. for all trips on Route x with service id "wkd", list
those in a schedule output labeled "weekday service schedule")

4. Tom Brown's message also sought clarification of the proposed
services.txt file. Not recalling right now if it was Tom who
suggested I post to this forum (and to include such an extension) -
but regardless... given Tom's narrative - I could see services.txt
being the error-trigger for trips.txt (if a trip has a service not
enumerated in services.txt - that is an error... rather than comparing
back to calendar or calendar_dates). I would venture to clarify that
a service in trips must also appear in services, but that a service
need not be used in trips. Finally, I see services.txt as a value-
added table - to the extent the feed generator can populate some
beneficial text descriptors (generically "Holiday service operates on
these dates")
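
To illustrate the rule in point 4, assuming a services.txt existed as
proposed (it is not part of GTFS, and the names and sample ids below
are hypothetical): a service used in trips.txt but missing from
services.txt is an error, while a service listed but unused is fine.

```python
def check_services(trip_rows, service_ids):
    """Compare service_ids used by trips against those in services.txt.

    Returns (errors, unused): ids used by a trip but not declared,
    and ids declared but not used by any trip (which is allowed).
    """
    declared = set(service_ids)
    used = {row["service_id"] for row in trip_rows}
    errors = sorted(used - declared)
    unused = sorted(declared - used)
    return errors, unused

trip_rows = [
    {"trip_id": "t1", "service_id": "wkd"},
    {"trip_id": "t2", "service_id": "sat"},
]
errors, unused = check_services(trip_rows, ["wkd", "hol"])
print(errors)  # ['sat']
print(unused)  # ['hol']
```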
