GTFS-Realtime vs. REST/XML


Bat-Erdene Gombosuren

Nov 23, 2013, 10:58:38 PM
to gtfs-r...@googlegroups.com
It seems that both GTFS-Realtime and REST/XML are used for real-time transit data. What are the advantages and disadvantages of each? Which one is preferred? CTA in Chicago seems to have one of the best-implemented systems, which lets you get information via DIY Info Display, Map, and SMS. But the following link indicates that CTA uses REST/XML: https://code.google.com/p/googletransitdatafeed/wiki/PublicFeedsNonGTFS

But can you do things like that with GTFS-realtime? Thanks!

Batuka

Brian Ferris

Nov 24, 2013, 11:25:55 AM
to gtfs-r...@googlegroups.com
There are actually a few interesting dimensions to consider when comparing GTFS-realtime with the various RESTful web interfaces that agencies have cooked up, and with other real-time data standards (e.g. SIRI):

Bulk vs piece-wise access: GTFS-realtime is designed for bulk access to real-time data (e.g. what's the status of every vehicle in the system right now?).  By comparison, many RESTful APIs provided by agencies are geared towards piece-wise access (e.g. what's the status of the next couple of buses arriving at a particular stop?).  The RESTful API design reflects the typical applications consuming the API: station arrival boards, mobile apps giving next-bus-at-my-stop status, etc.  Those applications are great and provide immediate, obvious benefits to riders, so it's no surprise that agencies provide these kinds of APIs.  However, piece-wise APIs make some kinds of applications hard to write.  For example, building a real-time trip planner on a piece-wise API is tricky, since you need a full picture of the system to do real-time routing.  Bulk access specifications like GTFS-realtime or SIRI (ET, VM, SX) support these types of applications more easily and efficiently.  Generally speaking, anything you can do with a high-level piece-wise API can also be done with a bulk-access API, but it might require a bit more effort on the part of the programmer.

Markup language: Many real-time data formats use an existing markup language or serialization format to encode data - XML, JSON, Protocol Buffers, or perhaps one of their own invention.  Most RESTful APIs tend to use XML or JSON.  GTFS-realtime uses Protocol Buffers.  Each has different trade-offs in terms of ease of use, verbosity, encode-decode speed, size, etc.  These trade-offs tend to become more evident when framed by the "bulk vs piece-wise" access trade-off.

Semantics: Though you can spend lifetimes arguing the merits of XML vs JSON vs ProtoBufs vs other encoding schemes, when you get down to it, most of the actual semantic data coming from real-time systems looks largely the same.  At the end of the day, they're all talking about the same underlying systems, so maybe that's not surprising.  As a result, GTFS-realtime Trip Updates look a lot like SIRI Vehicle Monitoring messages.  Ditto GTFS-realtime Alerts and SIRI Situation Exchange messages.  Many RESTful APIs look a lot like SIRI Stop Monitoring messages.  And all the static discovery APIs look a lot like the data you get from a static GTFS feed (or maybe a NeTEx message at some point?).
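
To make the encoding-versus-semantics point concrete, here is a minimal Python sketch, not taken from any real feed. It encodes one made-up arrival prediction as JSON, as XML, and as a fixed-layout binary record, then decodes all three back to the same values. The field names and the record are assumptions for illustration, and the `struct` layout is only a stand-in for a schema-based binary format; it is not the Protocol Buffers wire format.

```python
import json
import struct
import xml.etree.ElementTree as ET

# One hypothetical stop-time prediction; field names are illustrative only.
record = {"trip_id": 4512, "stop_id": 30021,
          "arrival_epoch": 1385312755, "delay_s": 120}

# JSON: text-based and self-describing.
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

# XML: text-based and self-describing, with each field name written twice.
root = ET.Element("prediction")
for key, value in record.items():
    ET.SubElement(root, key).text = str(value)
as_xml = ET.tostring(root, encoding="utf-8")

# Fixed binary layout: three unsigned 32-bit ints and one signed 32-bit int.
# Stand-in for a schema-based binary encoding, NOT real Protocol Buffers.
LAYOUT = "<IIIi"
as_binary = struct.pack(LAYOUT, record["trip_id"], record["stop_id"],
                        record["arrival_epoch"], record["delay_s"])


def decode_json(data):
    return json.loads(data.decode("utf-8"))


def decode_xml(data):
    return {child.tag: int(child.text) for child in ET.fromstring(data)}


def decode_binary(data):
    trip_id, stop_id, arrival_epoch, delay_s = struct.unpack(LAYOUT, data)
    return {"trip_id": trip_id, "stop_id": stop_id,
            "arrival_epoch": arrival_epoch, "delay_s": delay_s}


# Size in bytes of each encoding of the same record.
sizes = {"json": len(as_json), "xml": len(as_xml), "binary": len(as_binary)}

if __name__ == "__main__":
    print(sizes)
    print(decode_json(as_json) == decode_xml(as_xml) == decode_binary(as_binary))
```

All three decoders return the same dictionary; only the byte count and the decoding work differ, and that difference matters more as the payload grows from one stop to a whole system.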

tl;dr - There are different trade-offs in real-time API design.  Almost every agency / real-time vendor in North America that publishes an API has approached those trade-offs differently, and they've all got slightly different APIs as a result (hurray!).  GTFS-realtime and some of the RESTful SIRI work that Mike Frumin has championed at NYC MTA are efforts to bring order to the madness, but they are the new kids on the block, so time will tell.

Me personally?  I want GTFS-realtime because it gives me the most flexibility to write cool transit apps and run them everywhere.  RESTful next-bus-at-stop APIs have their place, but the journey doesn't stop when your bus arrives to pick you up.  Having access to information about the entire transit system is critical for giving riders up-to-date information about their entire trip, not just the start.
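
As a rough illustration of the earlier point that piece-wise answers can be derived from bulk data, here is a minimal Python sketch. The snapshot structure, trip IDs, stop IDs, and function names are all made up for the example; the structure loosely resembles a list of trip updates but is not the GTFS-realtime schema. Arrival times are plain seconds to keep the arithmetic visible.

```python
from collections import defaultdict

# Hypothetical bulk snapshot: one entry per trip with its predicted arrivals.
# Illustrative structure only, NOT the GTFS-realtime schema.
snapshot = [
    {"trip_id": "A1", "route": "10",
     "stops": [("S1", 600), ("S2", 900), ("S3", 1500)]},
    {"trip_id": "B7", "route": "22",
     "stops": [("S2", 720), ("S4", 1300)]},
    {"trip_id": "A2", "route": "10",
     "stops": [("S1", 1200), ("S2", 1500)]},
]


def index_by_stop(trips):
    """Build stop_id -> list of (arrival_seconds, route, trip_id), sorted by time."""
    index = defaultdict(list)
    for trip in trips:
        for stop_id, arrival in trip["stops"]:
            index[stop_id].append((arrival, trip["route"], trip["trip_id"]))
    for arrivals in index.values():
        arrivals.sort()
    return index


def next_arrivals(index, stop_id, now, limit=2):
    """Answer the piece-wise question: what arrives next at this stop?"""
    upcoming = [a for a in index.get(stop_id, []) if a[0] >= now]
    return upcoming[:limit]


if __name__ == "__main__":
    stop_index = index_by_stop(snapshot)
    print(next_arrivals(stop_index, "S2", now=700))
```

With this data, asking for stop "S2" at time 700 returns `[(720, '22', 'B7'), (900, '10', 'A1')]`. Going the other way, rebuilding the full snapshot from a per-stop API, would take one request per stop.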


--
You received this message because you are subscribed to the Google Groups "GTFS-realtime" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gtfs-realtim...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gtfs-realtime/7a667c4f-1743-4d8d-ab3f-f4b6e2b3695e%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Frumin, Michael

Nov 24, 2013, 12:08:03 PM
to gtfs-r...@googlegroups.com
Thank you, sir. This is the best and simplest exposition I have seen of this topic, which so many have a hard time understanding. Could you perhaps post it on the GTFS-realtime site, for example on a new page titled something along the lines of "GTFS-Realtime vs Other APIs"?

Ethan Arutunian

Nov 24, 2013, 1:58:48 PM
to gtfs-r...@googlegroups.com

Bat-Erdene Gombosuren

Nov 24, 2013, 9:24:31 PM
to gtfs-r...@googlegroups.com
Very useful information, thanks a lot. I'll ask again if I need more. Thanks again.

B

Brian Ferris

Nov 24, 2013, 9:35:12 PM
to gtfs-r...@googlegroups.com
I guess I'm not opposed, but I might need to expand a bit beyond NA-specifics and remove a bit of the snark ;)  Anyone else who would like to see this added to the official spec page?  Anyone who wouldn't?


Landon Reed

Nov 25, 2013, 11:54:43 AM
to gtfs-r...@googlegroups.com
+1 on adding to GTFS-realtime site

I think GTFS-realtime could definitely benefit from having a "layman's" explanation that gets to the purpose/principles of GTFS-realtime a bit more.

Stefan de Konink

Nov 25, 2013, 11:58:44 AM
to gtfs-r...@googlegroups.com
+1 for translating this into *multiple* languages.

Michael Smith

Nov 25, 2013, 12:11:21 PM
to gtfs-r...@googlegroups.com
I agree that this info is extremely helpful and should be added to the GTFS-rt documentation.

I suggest adding a couple of items to make it even more complete. For the Markup Language section, I recommend including a bit on how the programming language used to parse the feed also affects the decision on which format is best. It is not just a question of "bulk vs piece-wise" data (though that is likely the most important issue). For example, if JavaScript is used, then a Protocol Buffers feed is probably not the best choice.

And another aspect to keep in mind is that with respect to XML and JSON one should most likely use a compressed feed for efficiency, especially when the feed might go directly to smartphones over a cellular connection. The availability of compression should be taken into account when comparing formats.
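
As a rough way to check how much compression helps for a given feed, here is a minimal Python sketch using only the standard library. The 200 predictions and their field names are invented for the example, so the ratios it prints say nothing about any real agency feed. Note also that the client still has to decompress and parse the payload after download.

```python
import gzip
import json
import xml.etree.ElementTree as ET

# Hypothetical feed of 200 predictions; field names are illustrative only.
predictions = [
    {"trip_id": 1000 + i, "stop_id": 30000 + (i % 25), "delay_s": (i * 7) % 300}
    for i in range(200)
]

# The same data serialized as JSON and as XML.
json_bytes = json.dumps(predictions).encode("utf-8")

root = ET.Element("predictions")
for prediction in predictions:
    item = ET.SubElement(root, "prediction")
    for key, value in prediction.items():
        ET.SubElement(item, key).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

# gzip-compress both payloads, as an HTTP server with gzip enabled might.
json_gz = gzip.compress(json_bytes)
xml_gz = gzip.compress(xml_bytes)


def ratio(raw, packed):
    """Compressed size as a fraction of the raw size."""
    return len(packed) / len(raw)


if __name__ == "__main__":
    print("json:", len(json_bytes), "->", len(json_gz), round(ratio(json_bytes, json_gz), 2))
    print("xml: ", len(xml_bytes), "->", len(xml_gz), round(ratio(xml_bytes, xml_gz), 2))
```

Running this against a sample of your own feed, rather than the made-up data here, gives the comparison that actually matters.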

Mike


Stefan de Konink

Nov 25, 2013, 12:15:59 PM
to gtfs-r...@googlegroups.com
On Mon, 25 Nov 2013, Michael Smith wrote:

> And another aspect to keep in mind is that with respect to XML and JSON one
> should most likely use a compressed feed for efficiency, especially when the
> feed might go directly to smartphones over a cellular connection. The
> availability of compression should be taken into account when comparing
> formats.

Please, please: it is never about compression. It is about the cost of
serialisation and deserialisation of the information, while the entropy of
that information remains equal.

Sean Barbeau

Nov 25, 2013, 12:41:42 PM
to gtfs-r...@googlegroups.com
+1 for adding somewhere on the GTFS-realtime site.

Brian pretty much nailed the basics - if you're interested in pretty pictures to go along with this, I did a presentation on "Open Transit Data - A Developer's Perspective" at APTA 2013, which is mostly aimed at a layperson (e.g., transit agency) trying to understand these different data formats.

Here's the presentation from APTA 2013 on SlideShare:
http://www.slideshare.net/sjbarbeau/apta-transitech-2013-open-transit-data-a-developers-perspective

I also did a follow-up webinar on the same topic, the video for which is available online at CUTR's website (free to view and no key required, but you do have to supply contact info):
http://bit.ly/CUTRWebcastOpenTransitData

One thing I would add is that if an agency can supply both a bulk and a piece-wise API ("firehose" and "faucet", as I call them in my presentation), this is really the best of both worlds for developers.  Many casual mobile app developers don't want to have to run their own server to process bulk data formats.  If you have to pick one, then, as Brian says, bulk is preferred, since you can derive piece-wise from bulk.

Finally, if you're providing a RESTful API for mobile apps, please use JSON instead of XML (see http://bit.ly/Perf_Eval_Data_Mobile_Devices for performance differences on Android).

Sean



Kurt Raschke

Nov 25, 2013, 8:30:55 PM
to gtfs-r...@googlegroups.com
Not to pile on, but, +1 for posting this somewhere widely available, and +1 for leaving the snark in.

(maybe also mention that rate-limiting your API to the point that developers can't keep data fresh is not helpful?)

-Kurt