MTA trip_id realtime vs static GTFS


Ian Dixon

Jan 27, 2015, 9:14:45 AM
to onebusaway...@googlegroups.com
I'm trying to hook up GTFS realtime to an MTA installation, but I always get "Unknown trips/Total Trips x/x", with every trip unmatched. Comparing the trip_ids from the decoded realtime response against the GTFS static trip_ids reveals that the static trip_id is preceded by the service_id. Even stripping that off doesn't give me reliable matches between static and realtime trip_ids.

Anyone have any idea what the issue is? 

Sean Barbeau

Jan 27, 2015, 2:02:55 PM
to onebusaway...@googlegroups.com
Ian,
Can you post links to the GTFS/GTFS-rt data you're using?

Sean

Ian Dixon

Jan 27, 2015, 2:10:44 PM
to onebusaway...@googlegroups.com

Sean,

And this for the real time. This is for Subway data. http://datamine.mta.info/mta_esi.php?key=<api_key>&feed_id=1

An example of the problem I'm seeing is as follows:

The realtime feed gives me a trip_id of 284150_6..S01X004. No such trip_id exists in the static data. I can find plenty of *_6_S01* , but no way to marry up with a particular trip.

Thanks for taking a look.

Ian

Barbeau, Sean

Jan 27, 2015, 2:19:30 PM
to onebusaway...@googlegroups.com

Ian,

Have you tried posting about this on the MTA Developers Google Group?

https://groups.google.com/forum/#!forum/mtadeveloperresources

You might get more mileage there. If you do (or already did), please post a link to that discussion here so others can follow it.

 

But, from a quick look at the MTA docs:

http://datamine.mta.info/sites/all/files/pdfs/GTFS-Realtime-NYC-Subway%20version%201%20dated%207%20Sep.pdf

it seems like this is an implementation limitation. See the TripDescriptor section, starting on page 4:

“The New York City subway is a 24 x 7 operations and as a result is a highly dynamic operation. The majority of repairs and maintenance are performed during live operations so the daily service plan is subject to both planned and unplanned changes. The result of this is that some trips defined in the GTFS trips.txt may change (originating times, trip running times and trip path), cancelled or new trips may be added.

Unfortunately, there is no reliable way for us to determine the relationship between the actual and the static GTFS trip, so we can’t tell if a particular trip is the original one or has been changed or added later so the ScheduleRelationship is not used.

While trip_id in the GTFS-realtime feed will not directly match the trip_id in trips.txt, a partial match should be possible if the trip has been defined in trips.txt. If there is a partial match, the trip is a scheduled trip.”

Examples follow that text.

 

OneBusAway is designed to be used with GTFS-rt feeds that have exactly matching trip_ids, so it likely won’t work out-of-the-box with this feed.  However, you could definitely modify the code to adapt and partially match trip_ids.

 

Sean


Sheldon A. Brown

Jan 27, 2015, 3:17:18 PM
to onebusaway...@googlegroups.com
I'd love to know a bit more as to why you want to do this.

As you may know, the MTA doesn't have a traditional AVL system, so
GTFS-rt isn't an intended output. There are both very short and very
long answers as to why GTFS-rt isn't intended, but I can summarize by
saying it doesn't map cleanly to the source data.

Basically if you let us know what you are trying to do, we may be able
to give you a better way of going about it.

Sheldon

Ian Dixon

Jan 29, 2015, 1:55:05 PM
to onebusaway...@googlegroups.com
Sheldon,

Given a stop ID, I want to predict next arrivals, with realtime data if it's available, and scheduled data as a fallback. You hit the nail on the head. It doesn't seem possible to marry the two sources in a reliable way given the dynamic trip_ids from realtime.

Thanks,
Ian

Kurt Raschke

Jan 29, 2015, 9:46:16 PM
to onebusaway...@googlegroups.com

On Thu, Jan 29, 2015 at 1:55 PM, Ian Dixon <idi...@gmail.com> wrote:
> Given a stop ID, I want to predict next arrivals, with realtime data if it's available, and scheduled data as a fallback. You hit the nail on the head. It doesn't seem possible to marry the two sources in a reliable way given the dynamic trip_ids from realtime.

OneBusAway doesn't currently support the extensions used in the NYCT subway real-time feed.  That said, you don't necessarily need OneBusAway for the task described above (although some of the OBA project libraries will make the task easier).  There are two characteristics of the subway real-time feed (as described in the feed spec, http://datamine.mta.info/sites/all/files/pdfs/GTFS-Realtime-NYC-Subway%20version%201%20dated%207%20Sep.pdf) that lead to a somewhat different consumer implementation than for other GTFS-realtime feeds:

1. The feed completely describes service operating in a specified time period: In most GTFS-realtime feeds, the realtime feed is a layer on top of the static schedule—but for the subway real-time feed, the trips described in the feed completely replace static trips within a given time horizon, described in the trip_replacement_period field in the NyctFeedHeader extension to the FeedHeader message.

2.  TripUpdates completely describe trips: In a conventional GTFS-realtime TripUpdate message, a producer can provide relative or absolute times for one or more stoptimes along the trip, and it is up to the consumer to fill in the gaps (either by straight interpolation or a more complex prediction algorithm), which often necessitates referencing the static GTFS data.  By contrast, in the subway real-time feed, each trip contains absolute arrival times for every stop, so no reference to the static trip definition is required (which is good, as unscheduled trips created on-the-fly in ATS will by definition not exist in the published GTFS).
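As a rough illustration of point 2, here is a minimal sketch that assumes the feed has already been decoded into plain Python objects. The `StopTime` and `TripUpdate` classes are hypothetical stand-ins, not the protobuf classes from any GTFS-realtime binding. Because each trip carries an absolute arrival time for every stop, upcoming arrivals at a stop can be read straight from the feed without consulting stop_times.txt:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StopTime:
    stop_id: str
    arrival: int  # absolute arrival time, POSIX seconds


@dataclass
class TripUpdate:
    trip_id: str
    stop_times: List[StopTime]  # one entry per stop on the trip


def realtime_arrivals(
    updates: List[TripUpdate], stop_id: str, now: int
) -> List[Tuple[int, str]]:
    """Return (arrival, trip_id) pairs at stop_id at or after now, soonest first."""
    arrivals = [
        (st.arrival, update.trip_id)
        for update in updates
        for st in update.stop_times
        if st.stop_id == stop_id and st.arrival >= now
    ]
    return sorted(arrivals)
```

Note that trips added on the fly are handled the same way as scheduled ones here, since nothing depends on the trip_id existing in trips.txt.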

So, to accomplish what you've described above, you can apply a reasonably simple algorithm: if the time for which you want arrivals is within the trip_replacement_period for that route, return arrivals from the real-time data; if not (that is, if it's greater than the 30 minute horizon), then return arrivals from the static GTFS data.
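The rule above might be sketched as follows. The function name, the representation of arrivals as lists of POSIX timestamps, and the fixed 30-minute default are assumptions for illustration; a real consumer would presumably take the horizon from the trip_replacement_period in the feed header rather than from a constant.

```python
from typing import List

DEFAULT_HORIZON_SECONDS = 30 * 60  # assumed; read it from the feed in practice


def arrivals_for_stop(
    query_time: int,
    now: int,
    realtime: List[int],
    scheduled: List[int],
    horizon_seconds: int = DEFAULT_HORIZON_SECONDS,
) -> List[int]:
    """Pick the data source for query_time, then return arrivals from that time on."""
    replacement_end = now + horizon_seconds
    if now <= query_time <= replacement_end:
        # Inside the replacement period the real-time trips replace the schedule.
        source = realtime
    else:
        # Beyond the horizon only the static GTFS schedule is available.
        source = scheduled
    return sorted(t for t in source if t >= query_time)
```

The two sources are never merged: inside the horizon the schedule is ignored entirely, which avoids showing scheduled trips that are not actually running.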

Now, having said all that, there is a _completely wrong_ way to consume the feed in OneBusAway, which I implemented as a simple demonstration: https://github.com/kurtraschke/onebusaway-application-modules/commit/113a9b8da9f8b9a91d8b1810bdf776bd38dfd110

In the subway real-time feed, trip IDs are generally of the form "123550_1..S02R", which you may have noticed is just the end of the GTFS trip ID, which looks like "A20130803WKD_000800_1..S03R".  You can thus match _some_ of the trips in the realtime feed to the GTFS, but this is wrong for two reasons: first, it does nothing for added trips (which thus wouldn't be shown to data users), and second (perhaps worse from a usability perspective), it'll continue to show scheduled trips for which no real-time data is available, when in reality that scheduled trip isn't running if it has no real-time data.  Thus the proper implementation is to simply ignore the scheduled data within the trip_replacement_period horizon and use solely the real-time data.
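For illustration only, the suffix relationship between the two kinds of trip ID can be sketched like this. The helper name is hypothetical, and any static ID used to demonstrate a match would have to be made up, since the two IDs quoted above belong to different trips. As already noted, this kind of matching finds nothing for added trips, so it is not a substitute for relying on the real-time data alone within the replacement horizon.

```python
from typing import Dict, Iterable, List


def match_by_suffix(
    realtime_ids: Iterable[str], static_ids: Iterable[str]
) -> Dict[str, List[str]]:
    """Map each real-time trip ID to the static trip IDs that end with it.

    Static IDs are assumed to look like "<service_id>_<realtime_id>".
    A list is returned because more than one static ID may share a suffix.
    """
    static_list = list(static_ids)
    return {
        rt_id: [s for s in static_list if s.endswith("_" + rt_id)]
        for rt_id in realtime_ids
    }
```

An empty list for a real-time ID means either an added trip or a trip whose ID does not follow the assumed pattern; the sketch cannot tell the two apart.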

(If you've made it this far, sorry this turned out so long...)

-Kurt

Michael Frumin

Jan 29, 2015, 10:24:42 PM
to OneBusAway Developers

Worth stating for the record here: the reason the subway feed is as Kurt describes is (in part) because in the real world trips often get created (or modified) that aren't in the schedule (gasp!).  OneBusAway and GTFS are awesome things (just ask my early 30s) but they do not fully model the domain of real transit operations.


Tony Laidig

Jan 30, 2015, 3:23:44 PM
to onebusaway...@googlegroups.com
Decoding Mike's reply:

The schedule is what is scheduled to run. Because of many external factors, what is scheduled to run is not always run (cf. Snowpocalypse-mageddon 2015).