LIRR Train Numbers in GTFS data?

techs...@braynesoft.com

Dec 31, 2010, 11:12:19 AM
to mtadeveloperresources
How does one identify the TRAIN #, as posted on the paper timetable,
within the GTFS data?

Neither trip_id nor block_id appears to contain the Train # in the LIRR
data set, the way it does in the MNR data set.

Is there either (a) a decoder ring, or (b) a way to have the data
adjusted to include the Train #?

EXAMPLE:

Babylon Timetable, PENN STATION->to->BABYLON
Train # 6000 on paper timetable - is coded as - trip_id = GO203_102 in
stop_times.txt

Thank you.

Brett & Wayne

Joe Hughes

Jan 4, 2011, 5:35:32 AM
to mtadevelop...@googlegroups.com
According to the GTFS specification [1], trip_short_name would be the
appropriate field for LIRR to use for the train number:

"The trip_short_name field contains the text that appears in schedules
and sign boards to identify the trip to passengers, for example, to
identify train numbers for commuter rail trips."

Joe

[1] http://code.google.com/transit/spec/transit_feed_specification.html#trips_txt___Field_Definitions
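
For a feed that does populate trip_short_name, reading it could look roughly like the sketch below. It is only an illustration: the function name and the sample rows are made up, and it assumes trips.txt is a plain CSV with a header row. When the column is missing or blank, the trip maps to None.

```python
import csv
import io


def train_numbers_by_trip(trips_txt: str) -> dict:
    """Map trip_id -> trip_short_name, or None when the column is absent or blank."""
    reader = csv.DictReader(io.StringIO(trips_txt))
    result = {}
    for row in reader:
        # row.get() returns None when the feed has no trip_short_name column.
        short_name = (row.get("trip_short_name") or "").strip()
        result[row["trip_id"]] = short_name or None
    return result


# Made-up sample rows, not taken from any real feed.
sample = (
    "route_id,service_id,trip_id,trip_short_name\n"
    "R1,WKDY,T_A,6000\n"
    "R1,WKDY,T_B,\n"
)
print(train_numbers_by_trip(sample))
```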

Michael B. Justice

Jan 4, 2011, 5:57:28 AM
to mtadevelop...@googlegroups.com
The LIRR doesn't announce or post trains by train number. The only time customers see them is in the large timetables, or their HTML representation on the website.

LIRR passengers typically identify their train by the departure time and location of the terminal that is the end of their travel direction. Examples would be the "4:20 to Hempstead" or the "5:23 to Babylon". The LIRR posts trains in a similar manner and relies on signage and announcements to identify individual stops and transfers.

There are obvious issues with this when traveling west to New York (City), but it's a culture that's been in place for decades.

MJ

Wayne

Jan 4, 2011, 7:52:03 AM
to mtadevelop...@googlegroups.com
Understood, and we've considered this as an acceptable alternative to the way we implement the data store.

However, the GTFS files don't contain a trip_short_name; it's not included. Is it expected to be populated in a future release of the data?

Thank you.

Wayne

Wayne

Jan 4, 2011, 8:46:40 AM
to mtadevelop...@googlegroups.com
We're asking from the perspective of development. That is, if the agency (Metro-North, LIRR, etc.) already carries a unique identifier like trip_id (e.g. train number) then we'd prefer to leverage the agency's id rather than craft our own. Crafting our own is very easy to do, but why deviate from the authoritative source?

Our request is specific to the way we handle the data store, rather than how the customer may or may not use it.

Apologies for not being clear in my previous post ;-)

Thank you.

Wayne

Michael B. Justice

Jan 4, 2011, 8:54:35 AM
to mtadevelop...@googlegroups.com
Many moons ago I worked on a project that tracked (LIRR) train movements and found that, even in their printed matter, not all of the train numbers were printed. We wound up creating our own train IDs and storing/displaying their numbers if and when we could find them in the printed timetables. While it would be nice to see this information in the GTFS, I am not holding my breath. :)

The only comprehensive source for LIRR train numbers I've ever seen is an employee timetable.

MJ

Doug Kelly

Jan 5, 2011, 8:50:32 PM
to mtadevelop...@googlegroups.com
The train number is in the trip_id, in most cases.  Train 2 (weekday Babylon local) on the paper timetable is coded as trip_id = GO203_102.  Train 6000 (weekend Babylon local) is trip_id GO203_106000.  I just chop off the "GO203_10" (or whatever GO number with each new timetable) and use what's left--it's "good enough" for my purposes.  (the previous timetable also had variations for Shea^H^H Citi Field service, so if you wanted the whole timetable you couldn't just chop it off like I did)
Some numbers don't match up exactly.  There are no revenue trains in the 3000s, yet there are trips for GO203_103xxx.
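
The "chop off the prefix" approach described above might be sketched like this. The pattern is inferred only from the two trip_ids quoted in this post (GO203_102 and GO203_106000), so treat it as an assumption rather than a documented LIRR format. The function name is hypothetical, and anything that does not fit the pattern returns None instead of a guess.

```python
import re
from typing import Optional

# Assumed shape: "GO" + timetable number + "_10" + train number.
# Inferred from two examples only; not a documented format.
_TRIP_ID_PATTERN = re.compile(r"^GO\d+_10(\d+)$")


def train_number_from_trip_id(trip_id: str) -> Optional[str]:
    """Return the trailing train number, or None if the id has another shape."""
    match = _TRIP_ID_PATTERN.match(trip_id)
    return match.group(1) if match else None


for trip_id in ("GO203_102", "GO203_106000", "SOMETHING_ELSE"):
    print(trip_id, "->", train_number_from_trip_id(trip_id))
```

Returning None for unmatched ids keeps variations (such as the special-service trips mentioned above) from being silently mislabeled.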
See also:

Wayne

Jan 6, 2011, 9:49:48 AM
to mtadevelop...@googlegroups.com
Doug,

We owe you a beverage! 

It's very obvious, now that you've spelled it out ;-) This is very helpful.

Thank you.

Wayne & Brett

Joe Hughes

Jan 6, 2011, 10:14:10 AM
to mtadevelop...@googlegroups.com
Out of curiosity, if you just want to use the ID for internal database
representation, why does it matter that it matches the train number on
the schedules?

While it may work in this case at the moment, it's generally a bad
idea to pull any particular meaning out of the GTFS "*_id" fields,
since the spec treats them as opaque identifiers that the agency could
change at any time. If you're looking for correspondence with the
published train numbers, I'd encourage the agency to populate the
trip_short_name field in the longer term.

Cheers,
Joe

Wayne

Jan 6, 2011, 11:07:12 PM
to mtadevelop...@googlegroups.com
You're missing the point, partly because I'm not going to disclose how we developed our data structures, testing procedure, and application engine.

As I stated in a previous post, it's easy to craft a unique ID, but we'd prefer not to if one already exists from the authoritative source (the agency). Another benefit of using the TRAIN # is in cross-checking the application against the paper schedule; the TRAIN # is the only unique ID for a given trip (column). The TRAIN # is also handy if one wants to assign attributes, like peak/off-peak or club car, to a TRAIN # (a.k.a. trip).

As for a bad idea, lacking any other unique ID for the trip, Doug's suggestion will work fine in a rules-based environment; easily changed if the agency decides to alter the content.
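
One way such a rules-based setup could be arranged, purely as a sketch: prefer trip_short_name when the feed supplies it, and otherwise fall back to a replaceable list of trip_id patterns. The names RULES and resolve_train_number are hypothetical, and the single rule shown is only inferred from the two trip_ids quoted earlier in this thread.

```python
import re
from typing import Optional

# Hypothetical rule list; edit or extend it when the agency changes its ids.
# Each pattern's first capture group is taken as the train number.
RULES = [
    re.compile(r"^GO\d+_10(\d+)$"),  # inferred from GO203_102 / GO203_106000
]


def resolve_train_number(trip: dict, rules=RULES) -> Optional[str]:
    """Prefer trip_short_name; otherwise try each trip_id rule in order."""
    short_name = (trip.get("trip_short_name") or "").strip()
    if short_name:
        return short_name
    trip_id = trip.get("trip_id") or ""
    for pattern in rules:
        match = pattern.match(trip_id)
        if match:
            return match.group(1)
    return None


print(resolve_train_number({"trip_id": "GO203_106000"}))
print(resolve_train_number({"trip_id": "GO203_106000", "trip_short_name": "6000"}))
```

Keeping the patterns in one list means a change in the agency's id scheme touches only that list, not the code that consumes train numbers.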

I agree w/you that the agency should populate the data. Maybe I incorrectly assumed that someone from LIRR's Dev group monitored this list ;-)

Thanks for the feedback Joe.

Wayne
