Route Variants and Route Relationships


Dave Barker, MBTA

Aug 14, 2018, 5:22:41 PM
to General Transit Feed Spec Changes

Dear GTFS stakeholders,


At the MBTA we’re giving ourselves 6 weeks to add something to our GTFS feed to help clients better define the different variations of routes, and the relationships between routes. We’d much prefer to leverage existing GTFS extensions where possible; where not possible we expect to create something new. I wanted to ask what other challenges users have encountered in this area, and what existing approaches are out there.


It’s probably a strength of GTFS that it launched without variants (or variations or patterns), since that means there’s no need to think about them if you don’t want to. There are routes, and within routes there are trips, with no in-between grouping to worry about. But as we build more systems around GTFS we’re encountering a pattern of needs in this area that we’re not meeting. It doesn’t help that the MBTA has an unusually complicated network in which routes can have many variations.


Some user stories in this area that we can’t currently address, but that might be addressable with information about route variants or about the relationships between routes:


As a developer of an application that presents schedule information to users, I can…

  1. Show a shape and list of stops that is typical or representative of the route and direction.

  2. Produce a list of stops served by a route that excludes those only served by highly unusual trips that may only run once a day or once a week.

  3. Let a user select from a list of different route variants, presented as meaningful text descriptions, to show one on a map.

  4. Let a user switch from a view showing a variant in one direction to a view showing the same variant in the opposite direction.

  5. Tell a user, with a string, what is unusual about a particular set of trips (“bypasses Longwood”). This may also include a single-letter identifier for the "note" (e.g. “C”, as in “C: Serves Carey Square”).

  6. Include a public-facing short identifier for a trip without giving it a unique route_id that separates it from other trips on its route (showing a trip as “111C” but still listing it on the “route 111” page.)

  7. Identify branches, and the relationship between branches (e.g. to best represent the Green Line’s B, C, D, and E services.)

  8. Show schedule information on closely related routes together (e.g. show routes 116 and 117 together, since they mostly overlap.)

  9. Identify that a route exists solely in relationship to another route or routes (e.g. the Orange Line shuttle bus during a scheduled diversion of part of the heavy-rail Orange Line.)
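
As a rough illustration of story 2, here is a minimal sketch of what a consumer can approximate today without any new fields. It assumes stop_times.txt rows have already been joined to route_id and direction_id from trips.txt; the function name `typical_stops` and the 10% threshold are illustrative assumptions, not part of GTFS or of any MBTA feed.

```python
from collections import defaultdict


def typical_stops(trip_stops, min_share=0.1):
    """Return stops served by at least `min_share` of each route/direction's trips.

    trip_stops: iterable of (route_id, direction_id, trip_id, stop_id) tuples,
    e.g. built by joining trips.txt with stop_times.txt.
    """
    trips = defaultdict(set)    # (route, direction) -> trip_ids
    serving = defaultdict(set)  # (route, direction, stop) -> trip_ids
    for route_id, direction_id, trip_id, stop_id in trip_stops:
        trips[(route_id, direction_id)].add(trip_id)
        serving[(route_id, direction_id, stop_id)].add(trip_id)

    result = defaultdict(set)
    for (route_id, direction_id, stop_id), trip_ids in serving.items():
        total = len(trips[(route_id, direction_id)])
        if len(trip_ids) / total >= min_share:
            result[(route_id, direction_id)].add(stop_id)
    return dict(result)


# Hypothetical data: ten regular trips plus one rare trip that serves stop Z.
rows = [("111", 0, f"t{i}", s) for i in range(10) for s in ("A", "B", "C")]
rows.append(("111", 0, "t_rare", "A"))
rows.append(("111", 0, "t_rare", "Z"))  # served by only 1 of 11 trips
print(typical_stops(rows))
```

A share-of-trips threshold like this ignores how often each trip's service actually runs (once a day versus once a week), and it cannot say which shape is "representative", which is part of why explicit data may be preferable.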


With that to give a general idea of the space we’re interested in:


A. What needs do others have in this area? Does anything above resonate? What's missing?

B. What approaches (successful or unsuccessful) have others tried or seen to address challenges in this area?


Thanks for your input!

-Dave Barker, MBTA

PS: This came out of a needs assessment that led us to create a 6-month GTFS roadmap for ourselves, which you can see here: https://groups.google.com/forum/#!topic/massdotdevelopers/zdL0OnuC4Y8

Stefan de Konink

Aug 14, 2018, 5:33:51 PM
to gtfs-c...@googlegroups.com
Hi Dave,

I first want to make a point of clarification, which I would like you to
resolve in your text. A Route in GTFS corresponds to what is commonly
defined as a "Line" in Transmodel/NeTEx. This is unlike the Route in
Transmodel, which is the shape, and the JourneyPattern, which contains the
stops in sequence and variants.

Since a Route in GTFS is analogous to a Line, it should contain variants.
Variants in GTFS fall through into the next grouping, the trip, and trips
may share the same shape_id or have different ones.

--
Stefan

Dave Barker, MBTA

Aug 14, 2018, 6:28:01 PM
to General Transit Feed Spec Changes
Thanks Stefan. You're correct: in this case, when I say "route" I mean "a route as identified in GTFS by a route_id and defined in routes.txt." When I say "variant" I'm describing a transit concept, not something currently in GTFS. It's a concept that not all agencies define in the same way or use the same words for, but roughly speaking it is "an identifier for a path of travel and set of stops that a subset of a route's trips have in common."

-Dave Barker, MBTA

Leo Frachet (MobilityData)

Aug 20, 2018, 9:19:55 AM
to General Transit Feed Spec Changes
So, if we try to do a (partial) overview of what is already being used or being asked about those subjects:

## Regarding #1: Canonical shape or list of stops & identification of unusual patterns

• Transit (transit.app) internally uses the field `trip_exclude_in_route_view` on the `trip` to flag the « unusual » trips which should not be displayed when the full shape of the route is drawn.

• TriMet (trimet.org) internally uses `stop_route` to define the canonical order of the stops of a route, to sort them in a timetable for example.

• Bileto (bileto.com), in Czechia, told me that all rail schedules in the country should be flagged if they weren’t following the typical pattern. They flag them by adding « Výlukový » in the planned schedules.


## Regarding #3: Meaningful pattern description

• Johannes Rudolph posted 3 years ago about the need to have a trip_desc field for each trip, to add « specification informations for a specific trip to give information to the travelers », and Steven Judd also wanted it. Do we know with which datasets they were working? (Source: https://groups.google.com/forum/#!topic/gtfs-changes/BUBhQrpMqgQ)


## Regarding #5 and #6: Public-facing identifier and its definition

• TTC (ttc.ca) uses letters to distinguish between patterns, some of them very specific (like the one at the end of the school day which makes a detour to pick up students). E.g. for route 32, the branch will be « A », and this information will be in the `trip_headsign` field, as « 32A Eglinton West Towards… » (Detours of the 32: http://www.ttc.ca/Routes/32/Eastbound.jsp)

• NJ Transit (njtransit.com) also uses letters to distinguish between patterns, but does not always put them in its GTFS. E.g. for route 114, the schedule defines two special patterns: X and m. But the GTFS only contains the X. As with TTC, it’s encoded in the `trip_headsign` field, which contains « 114X Bridgewater Express ».
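
To make the consequence for consumers concrete, here is a minimal sketch of how an application might recover the variant letter from such headsigns. The pattern is an assumption based only on the two examples above (digits, an optional single letter, then the destination text); real feeds may not follow it consistently, and `split_headsign` is a hypothetical helper, not part of any GTFS library.

```python
import re

# Assumed convention: the headsign starts with the route short name, then an
# optional one-letter variant code, then whitespace and the destination text.
HEADSIGN_RE = re.compile(r"^(?P<route>\d+)(?P<variant>[A-Za-z])?\s+(?P<text>.+)$")


def split_headsign(trip_headsign):
    """Split a headsign like '32A Eglinton West' into (route, variant, text).

    Returns (None, None, original text) when the headsign does not match
    the assumed convention.
    """
    match = HEADSIGN_RE.match(trip_headsign)
    if match is None:
        return None, None, trip_headsign
    return match.group("route"), match.group("variant"), match.group("text")


print(split_headsign("114X Bridgewater Express"))
print(split_headsign("32 Eglinton West"))
```

Having to parse free text this way is fragile, which is one argument for carrying the identifier in its own field.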


## Regarding #8: Showing merged schedule information when two routes are close to one another

• Metro LA (metro.net) runs short versions of its routes on weekends, which are often the common portion of different routes. The vehicle is therefore running on the 35 and the 38 at the same time. They represent that in the GTFS as another route, with the route short name « 35/38 ».


## Regarding #9: Identifying a route which solely exists in relationship to another route, like a shuttle route

• Kisio Digital has extended/modified the GTFS format and created another level on top of the routes, which they call lines (lines.txt with line_id). This allows them to group different routes which are branded as one line (a bit like a « parent route »). The grouped routes can have distinct route_type values but share a single branding.


This was just a food-for-thought email. I’ll try to think about a more consistent solution to the needs you’ve listed, and I’ll come back with (hopefully) something more proposal-oriented…

Dave Barker, MBTA

Aug 21, 2018, 4:54:45 PM
to General Transit Feed Spec Changes
Thank you Leo! This is all really interesting stuff, very useful. Adding some of our own findings (from researching old posts and feeds on transitfeeds.com) and additional commentary below.


On Monday, August 20, 2018 at 9:19:55 AM UTC-4, Leo Frachet (MobilityData) wrote:
So, if we try to do a (partial) overview of what is already being used or being asked about those subjects:

## Regarding #1: Canonical shape or list of stops & identification of unusual patterns

• Transit (transit.app) internally uses the field `trip_exclude_in_route_view` on the `trip` to flag the « unusual » trips which should not be displayed when the full shape of the route is drawn.

• TriMet (trimet.org) internally uses `stop_route` to define the canonical order of the stops of a route, to sort them in a timetable for example.

• Bileto (bileto.com), in Czechia, told me that all rail schedules in the country should be flagged if they weren’t following the typical pattern. They flag them by adding « Výlukový » in the planned schedules.

TriMet also has a trip_type field in which they label certain trips "Express" in part to mark them as not representative. I saw other agencies include a trip_type field but its use wasn't clear to me; an enumerated trip_type was discussed briefly some time ago. 

MPK Wroclaw (Poland) associates trips with variants and includes a variants.txt file that indicates whether the variant is a "main" variant or not, and if not, what the corresponding main variant is. 
 


## Regarding #3: Meaningful pattern description

• Johannes Rudolph posted 3 years ago about the need to have a trip_desc field for each trip, to add « specification informations for a specific trip to give information to the travelers », and Steven Judd also wanted it. Do we know with which datasets they were working? (Source: https://groups.google.com/forum/#!topic/gtfs-changes/BUBhQrpMqgQ)

Maanteeamet (Estonia) has a trip_long_name field which lists areas served by the trip in order, and a direction_code instead of a direction_id. 
 


## Regarding #5 and #6: Public-facing identifier and its definition

• TTC (ttc.ca) uses letters to distinguish between patterns, some of them very specific (like the one at the end of the school day which makes a detour to pick up students). E.g. for route 32, the branch will be « A », and this information will be in the `trip_headsign` field, as « 32A Eglinton West Towards… » (Detours of the 32: http://www.ttc.ca/Routes/32/Eastbound.jsp)

• NJ Transit (njtransit.com) also uses letters to distinguish between patterns, but does not always put them in its GTFS. E.g. for route 114, the schedule defines two special patterns: X and m. But the GTFS only contains the X. As with TTC, it’s encoded in the `trip_headsign` field, which contains « 114X Bridgewater Express ».



## Regarding #8: Showing merged schedule information when two routes are close to one another

• Metro LA (metro.net) runs short versions of its routes on weekends, which are often the common portion of different routes. The vehicle is therefore running on the 35 and the 38 at the same time. They represent that in the GTFS as another route, with the route short name « 35/38 ».

Interesting that this is almost exactly what TTC and NJ Transit do (above), except to different ends -- as I understand it, TTC and NJ Transit are doing this to identify patterns/branches/variants, and LA Metro is doing it to group closely related routes together. One drawback is that it makes it harder to find information about a particular route if it's represented in GTFS as part of a group of routes (if you're looking up the schedule for route 76 you need to know it's filed under "62" for "62/76"). The closest things we've come across to indicating "routes A and B are related" while keeping routes at the route_id level are probably the ability to indicate in gtfs-to-html that they should share a timetable, and the lines.txt example you posted below.



## Regarding #9: Identifying a route which solely exists in relationship to another route, like a shuttle route

• Kisio Digital has extended/modified the GTFS format and created another level on top of the routes, which they call lines (lines.txt with line_id). This allows them to group different routes which are branded as one line (a bit like a « parent route »). The grouped routes can have distinct route_type values but share a single branding.

That's very interesting. Are there any examples or documentation available? 

l...@frachet.ca

Aug 22, 2018, 9:25:33 AM
to General Transit Feed Spec Changes
About the lines.txt extension used by Kisio Digital in their NTFS format, you can find the documentation (in French, alas) here:

https://github.com/CanalTP/navitia/blob/dev/documentation/ntfs/ntfs_fr.md#linestxt-requis

Since the field names are in English, you'll get the gist even without understanding the description.

Be careful, though, if you start reading the whole NTFS spec: it is not just an extension of GTFS. It is a distinct format heavily inspired by GTFS, and if you put an NTFS feed into a GTFS parser, it will break.

Alan Gorman

Aug 22, 2018, 7:54:25 PM
to gtfs-c...@googlegroups.com
Shapes (shape_id) equate to route patterns in our feed, not yours?

--

Alan

sja...@camsys.com

Aug 24, 2018, 11:33:17 AM
to General Transit Feed Spec Changes
I think a GTFS extension for route variants would definitely help GTFS consumers understand how agencies want their services to be displayed. In particular:

> Show a shape and list of stops that is typical or representative of the route and direction.

"Typical" and "representative" can often be difficult or impossible to infer from individual stopping patterns of trips; can vary based on how agencies want to market services; and individual trips can vary in their relationship to a marketed route based on temporary stop closures, express service, detours, etc...

So I would propose explicitly specifying route variants separately from trips in GTFS. Perhaps something like the following:

1) A new "route_variants.txt" file with columns variant_id, route_id, variant_name, variant_desc, shape_id, and perhaps others. This is sort of "parallel" to "trips.txt"

2) A new "route_variant_stop_times.txt" with columns variant_id, stop_id, arrival_time, departure_time, stop_sequence. The times would be optional, and if present, only the durations would be relevant, not specific times.

3) A new column variant_id in trips.txt, to optionally associate a trip with a variant.
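
The three pieces above could fit together roughly as follows. This is only a sketch: the column names are the ones proposed in this message, while all sample rows, IDs, and the helper names `read_rows` and `variant_stops` are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows using the proposed column names.
ROUTE_VARIANTS = """variant_id,route_id,variant_name,variant_desc,shape_id
111-main,111,Main,Typical weekday pattern,shp-111-1
111-c,111,C,Serves Carey Square,shp-111-2
"""

# Times are left empty here, since the proposal makes them optional.
ROUTE_VARIANT_STOP_TIMES = """variant_id,stop_id,arrival_time,departure_time,stop_sequence
111-main,s1,,,1
111-main,s2,,,2
111-c,s1,,,1
111-c,s9,,,2
111-c,s2,,,3
"""

TRIPS = """route_id,service_id,trip_id,variant_id
111,wk,t1,111-main
111,wk,t2,111-c
111,wk,t3,
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def variant_stops(stop_time_rows):
    """Map variant_id -> stop_ids ordered by stop_sequence."""
    grouped = defaultdict(list)
    for row in stop_time_rows:
        grouped[row["variant_id"]].append((int(row["stop_sequence"]), row["stop_id"]))
    return {vid: [stop for _, stop in sorted(pairs)] for vid, pairs in grouped.items()}


variants = {row["variant_id"]: row for row in read_rows(ROUTE_VARIANTS)}
stops_by_variant = variant_stops(read_rows(ROUTE_VARIANT_STOP_TIMES))

for trip in read_rows(TRIPS):
    variant = variants.get(trip["variant_id"])  # None when the optional column is empty
    name = variant["variant_name"] if variant else "(no variant)"
    print(trip["trip_id"], name, stops_by_variant.get(trip["variant_id"], []))
```

Because variant_id is optional on trips, a consumer still needs a fallback for trips such as t3 that are not associated with any variant.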

One outstanding question is whether variants should have directions, or if they should be assumed to be bidirectional. I think they should be unidirectional, since loop routes can't be traversed in the opposite direction.

Thoughts?

Simon

Dave Barker, MBTA

Sep 12, 2018, 5:49:31 PM
to General Transit Feed Spec Changes
We've been working on our ideas on how to add data to our GTFS feed to better describe the relationships of different kinds of trips on the same route, the relationship of different routes to each other, and other route information. (We're looking at adding calendar_attributes as part of the same effort, even though it's not strictly related.) We have some proposals that we’re evaluating and we’d love to hear people's feedback:
The document specifies that feedback can be sent to deve...@mbta.com, but you can reply to this message as well so we can share observations as a group. Thanks for taking a look!

-Dave Barker, MBTA

Josh Fabian, MBTA

Oct 16, 2018, 1:27:21 PM
to General Transit Feed Spec Changes
After receiving various feedback from our stakeholders, we have made a number of modifications to the original proposal for changes relating to routes and route variants (which we've since renamed to route patterns), posted above in September. For example, we are looking to add calendar_attributes.txt and directions.txt, but have proposed to add new fields to increase the machine readability of the data.

The full list of changes and additions can be found at https://docs.google.com/document/d/1I1WHU0uWFS79acblwLSAlfZGcbVEAN_7q99PlWr1HaY/edit?usp=sharing. A small sample of the affected files, containing data for a handful of routes, can be found at https://drive.google.com/file/d/1ZA2g2ATKGIpnQwG3qVpKZCo4mFfGG5pu/view?usp=sharing.

We will likely proceed to implement and deploy the proposals over the next month, but we'd still like to hear any thoughts and suggestions from this group as we do so and continue to evolve our GTFS feed. Thank you in advance!

- Josh Fabian, MBTA

Nikhil VJ

May 9, 2019, 4:30:40 AM
to General Transit Feed Spec Changes
Hi All, 

Apologies for barging in here after many months. I read through the use cases and the MBTA proposal doc, and this might really help with the data situation for some large bus systems in Indian cities that we're working on or analysing.

We can take New Delhi for example. A static GTFS is released here and you can see it visualized on ScheduleViewer here. Their GTFS closely matches the data internally maintained by the transit agencies. One route is fractured into several separate routes, each being a variant along a common pattern. In fact, even up and down directions are being stored separately as that is how their internal data maintains things. So if we had a variants.txt as defined in the proposal, we'd put this whole list into variants.txt instead of routes.txt and then map bunches of variants to a route.

In many cases the first trip or last trip of the day on a route would cover some extra stops because it's going to the depot, or would stop or start some stops short of the regular pattern. Organizing these is difficult and being able to explain the variations as given in the proposal doc would be helpful.

We've also observed there being "families" of routes, so being able to group them together ("parent_route_group_id") while still maintaining them as separate routes can help.

You may be thinking about this in terms of communicating data in GTFS to commuters. But there is another angle: converting the data, as internally managed by the transit agency, to GTFS as seamlessly as possible. The concept of variants can greatly assist in this regard as well. In the Delhi case, I would store timing info with the variants in additional columns, use "representative_trip_id" to define the sequence of stops for each variant, and then compute the final stop_times.txt file from these.
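
A minimal sketch of that last step, under loud assumptions: the timing info stored with a variant is taken to be a plain list of trip start times, and every generated trip reuses the running times of the representative trip. Neither detail comes from the MBTA proposal; `expand_variant` and the generated trip_id scheme are hypothetical.

```python
def to_seconds(hms):
    """Parse a GTFS HH:MM:SS time (hours may exceed 23) into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total):
    """Format a number of seconds as a GTFS HH:MM:SS string."""
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def expand_variant(variant_id, representative_stop_times, departures):
    """Generate stop_times rows for every departure of a variant.

    representative_stop_times: (stop_sequence, stop_id, arrival, departure)
    tuples taken from the variant's representative trip.
    departures: start times of the trips to generate (the assumed
    "timing" column stored with the variant).
    """
    pattern = sorted(representative_stop_times)
    origin = to_seconds(pattern[0][3])  # departure from the first stop
    rows = []
    for index, start in enumerate(departures, start=1):
        shift = to_seconds(start) - origin
        trip_id = f"{variant_id}-{index}"  # hypothetical trip_id scheme
        for sequence, stop_id, arrival, departure in pattern:
            rows.append({
                "trip_id": trip_id,
                "arrival_time": to_hms(to_seconds(arrival) + shift),
                "departure_time": to_hms(to_seconds(departure) + shift),
                "stop_id": stop_id,
                "stop_sequence": sequence,
            })
    return rows


representative = [(1, "s1", "06:00:00", "06:00:00"), (2, "s2", "06:10:00", "06:11:00")]
for row in expand_variant("v1", representative, ["07:30:00", "23:55:00"]):
    print(row)
```

Times past midnight are written with hours above 23, as GTFS allows for trips that continue into the next service day.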


I'm looking forward to this. Breaking the proposal up into parts might help: for example, "parent_route_group_id", "trip_group_id", and "route_fare_class" seem external to the original route variants proposal (please correct me if I'm wrong) and might be better treated as separate proposals / extensions, especially as the parent route entries would need to be recognized by the validator and not flagged for not having any trips.

---------------

Question regarding the "route_fare_class" field: it was mentioned in the MBTA proposal doc. In the bus systems I'm seeing, express or AC bus services are being introduced on some routes, and their fares are different from those of the regular buses. The route number stays the same, though, so having a way to distinguish them would be very helpful. But I did not see in the proposal doc how fare_rules.txt and fare_attributes.txt would use this new field. I'm guessing fare_rules.txt would add this column as well?


Regards
Nikhil VJ
Pune, India

Stefan de Konink

May 9, 2019, 4:36:59 AM
to Nikhil VJ, General Transit Feed Spec Changes
Hi Nikhil,

I think your proposal is very valid, but it is also solved in standards built
for operational data exchange (NeTEx/SIRI). I do not feel that GTFS is moving
in that direction; instead, the focus is on the traveler and travel
information services. GTFS is explicitly denormalised, while many agencies
use normalised data.

So what is your aim in exchanging an explicit variant, as opposed to
letting the receiver deduce the information?

Stefan
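
For reference, here is a minimal sketch of what "letting the receiver deduce the information" could look like: trips are grouped by their ordered stop sequence, and each distinct sequence is treated as one pattern. The function name `deduce_patterns` and the sample data are illustrative assumptions.

```python
from collections import Counter, defaultdict


def deduce_patterns(stop_times):
    """Group trips by their ordered stop sequence.

    stop_times: iterable of (trip_id, stop_sequence, stop_id) tuples,
    as found in stop_times.txt.
    Returns a list of (stop_id tuple, trip count), most common first.
    """
    by_trip = defaultdict(list)
    for trip_id, stop_sequence, stop_id in stop_times:
        by_trip[trip_id].append((stop_sequence, stop_id))

    counts = Counter(
        tuple(stop for _, stop in sorted(pairs)) for pairs in by_trip.values()
    )
    return counts.most_common()


# Hypothetical data: two trips serve A-B-C, one trip skips B.
sample = [
    ("t1", 1, "A"), ("t1", 2, "B"), ("t1", 3, "C"),
    ("t2", 1, "A"), ("t2", 2, "B"), ("t2", 3, "C"),
    ("t3", 1, "A"), ("t3", 2, "C"),
]
print(deduce_patterns(sample))
```

Deduction of this kind yields groupings and counts, but not the public-facing names, descriptions, or "main variant" designations that the user stories earlier in the thread ask for.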