Making it official: platform_code in stops.txt


Brian Ferris

Mar 18, 2013, 1:54:16 PM
to gtfs-c...@googlegroups.com
A while back, I proposed the "platform_code" field in stops.txt to allow agencies to specify platform identifiers for stops in a large transit station:


Since that initial discussion, Google Transit has added support for the extension and a few agencies have added it to their feeds.  For example, 9292 in the Netherlands includes platform_code in their publicly available GTFS and you can see platform information on Google Maps as a result:


Note the "Platform 1b" next to Nijmegen in the trip details for the first itinerary.

Given that we've got GTFS producers and consumers using this proposed extension, I'd like to nominate it for official inclusion in the spec.  To recap, here's the current proposal:

File: stops.txt
Field Name: platform_code
Required: No
Description: Indicates the platform identifier for a stop in a station complex. This should be just the platform identifier (eg. "G" or "3"). Words like “platform” or "track" (or the feed’s language-specific equivalent) should not be included.  This allows feed consumers to more easily internationalize and localize the platform identifier into other languages.
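
As a rough illustration of the proposed field, here is a minimal Python sketch that reads a stops.txt fragment carrying platform_code. The sample rows, stop IDs, and the `platform_codes` helper are made up for this example; only the column names come from GTFS and the proposal above.

```python
import csv
import io

# Hypothetical stops.txt fragment; IDs, names, and coordinates are invented.
STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon,platform_code
S1,Central Station Platform 1b,51.8430,5.8530,1b
S2,Central Station Platform 3,51.8431,5.8532,3
S3,Market Square,51.8470,5.8630,
"""


def platform_codes(stops_txt):
    """Map stop_id to platform_code, skipping stops that leave it empty."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    codes = {}
    for row in reader:
        # platform_code is optional, so tolerate a missing column or empty value.
        code = (row.get("platform_code") or "").strip()
        if code:
            codes[row["stop_id"]] = code
    return codes


print(platform_codes(STOPS_TXT))
```

Because the field is optional, a feed without the column simply yields an empty mapping.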

Thoughts?  Concerns?  If I don't hear any negative feedback this week, I will propose that we start the clock on making this part of the official spec.

Thanks,
Brian


Thomas

Mar 18, 2013, 2:29:07 PM
to gtfs-c...@googlegroups.com
In relation to parent_station, it is not directly obvious what the
hierarchy implication of platform_code will be. Is it only something
visible as a marker? Or a true identifier which results in a unique
key constraint with {agency,stop_id,platform_code}? What does a trip
refer to in this case? It should be clear what identifies a unique
stop in the first place (like quay in NeTEx), and whether part of the
stop_name relates to its local identifier (here: platform_code);
otherwise I would propose the proper use of parent_station and an
enumeration that clearly states what kind of stop is being described.

How does this relate to GTFS-Realtime? Especially for rail there are a
lot of stations with dynamic platform allocation, and quite a few
larger bus stations that do the same. There is a case to be made for
switching stop_ids as the actual boarding location changes: in some
cases the position shifts a lot, but sometimes it is just a matter of
deciding on which side of the platform (left vs right) to board. It
could also be simpler to support just a change of platform_code.


PS.
Good to see Google finally has some time for The Netherlands again.
If you need some help to debug the mess in The Netherlands, just mail?

StuartJReynolds

Mar 18, 2013, 3:27:28 PM
to gtfs-c...@googlegroups.com
I agree with Koch's proposed enumeration.

In the UK a bus station may have "bays" or "stands" or (in Scotland) "stances". These are all UK names, and will vary from bus station to bus station depending on the owner. Not to mention platform if it happened to be a rail station!
So having it determined by localisation would, for us, be wrong.

Stuart

Bradley Tollison

Mar 18, 2013, 3:28:59 PM
to gtfs-c...@googlegroups.com
Yes, even here in Los Angeles we have bays and docks meaning the same thing, depending on who runs the transit center.

Brian Ferris

Mar 18, 2013, 4:23:26 PM
to gtfs-c...@googlegroups.com
Regarding localization, I'd point out that agencies are still encouraged to use stop_name to identify the stop in a locale-appropriate way ("Bay 5", "Platform 6", "Quay E", etc).  In fact, we imagine most clients will continue to use stop_name to identify stops to clients.

We're proposing platform_code primarily so that we can easily localize the platform code into lots of different languages.  We plan to tweak localizations per language / region / agency as needed to get the context right.

Brian Ferris

Mar 18, 2013, 4:34:12 PM
to gtfs-c...@googlegroups.com
Stops already have a unique identifier: stop_id.  I don't propose to change that.  Instead, platform_code is a human-readable label for a platform / bay / dock / quay / whatever identifier.  We imagine it will be used in conjunction with the stop-station hierarchy, but it's not required that it be.  For example, there are plenty of agencies who don't use the stop-station hierarchy for complexes with multiple stops.  These agencies could still specify platform_code for these stops.

Regarding GTFS-realtime, it's already possible to switch platform assignments dynamically.  Assuming you model platforms as individual stops in stops.txt, you could schedule a trip per normal in your GTFS.  If the assigned platform changes, you can cancel the scheduled stop-time and replace it with a new stop-time for the actual platform stop.
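
The swap described above can be sketched in plain Python. This is only an illustration: the dictionaries loosely mirror GTFS-realtime StopTimeUpdate field names (stop_sequence, stop_id, schedule_relationship), `reassign_platform` is a hypothetical helper, and how a particular consumer treats the replacement entry is an assumption rather than something the spec guarantees.

```python
def reassign_platform(stop_sequence, scheduled_stop_id, actual_stop_id):
    """Return two stop-time updates: skip the scheduled platform stop,
    then serve the stop that models the actual platform."""
    if scheduled_stop_id == actual_stop_id:
        return []  # nothing changed, no update needed
    return [
        {
            "stop_sequence": stop_sequence,
            "stop_id": scheduled_stop_id,
            "schedule_relationship": "SKIPPED",
        },
        {
            # Assumption: the replacement reuses the same position in the trip.
            "stop_sequence": stop_sequence,
            "stop_id": actual_stop_id,
            "schedule_relationship": "SCHEDULED",
        },
    ]


# Hypothetical stop IDs: platform 4a was scheduled, platform 5 is used instead.
for update in reassign_platform(7, "NM:4a", "NM:5"):
    print(update)
```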

Brian

Thomas

Mar 18, 2013, 4:47:33 PM
to gtfs-c...@googlegroups.com
Another issue, which I encountered while converting the rail data in
the Netherlands, is a different arrival and departure platform.
A train arrives at platform 4a (first half of platform 4), merges with
another train and departs from platform 4. It's a minor difference but
can increase transfer times significantly. I thought about duplicating
the stop and hacking around with pickup_type and drop_off_type but that
seems a bit too hackish to me.

Not sure how the current NL rail feed used in Google Transit handles
the "vleugeltreinen"[1] issue regarding departure/arrival and
block_ids.

Kind regards,

Thomas Koch

[1] http://en.wikipedia.org/wiki/Portion_working

Nicholas Albion

Mar 18, 2013, 8:29:48 PM
to gtfs-c...@googlegroups.com
The destination in that link is simply "51.283217, 6.079225" - it doesn't tell you the name of the station where you should get off.

Brian Ferris

Mar 19, 2013, 5:15:53 AM
to gtfs-c...@googlegroups.com
Nicholas: The first itinerary asks you to get off at the Reuver station. We just launched the 9292 feed, so that station icon hasn't been added to the map tiles yet, which might be part of the confusion.


Brian Ferris

Mar 19, 2013, 5:18:51 AM
to gtfs-c...@googlegroups.com
It's true that GTFS currently doesn't do a good job of handling vleugeltreinen / portion working trains.  I don't think I've ever seen a concrete proposal for modeling it in GTFS, but I'd definitely be up for suggestions.

Thomas

Mar 19, 2013, 6:33:55 AM
to gtfs-c...@googlegroups.com
Hi,

The way portion working is now handled in the rail feed I converted to
GTFS[1] is as follows. There is a list of block_ids, each of which
contains one or more trips. Each block represents a transfer-free
journey from a to c via b under two trip_ids.
When converting this feed I just took all the trips individually and
included the block_id in the trip_id. This generates a lot of
duplicate trip warnings in the validator but does the trick pretty
well, with the exception of the arrival/departure platform problem.
The duplicate trip warnings can be ignored if it's decided that a
trip with a different block_id is not a duplicate trip, or if it
becomes possible to assign multiple block_ids to one trip.
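
A minimal sketch of the conversion approach described above, under the assumption that each block simply lists the trips of one transfer-free journey. The block and trip IDs are invented, and `trips_per_block` is a hypothetical helper, not code from the actual converter.

```python
# Hypothetical blocks: each block lists the trips that form one
# transfer-free journey. Trip T100 runs in both blocks before the split.
BLOCKS = {
    "B1": ["T100", "T200"],
    "B2": ["T100", "T300"],
}


def trips_per_block(blocks):
    """Emit one row per trip per block, embedding the block_id in the
    trip_id so that a trip shared by several blocks stays distinct."""
    rows = []
    for block_id, trip_ids in blocks.items():
        for trip_id in trip_ids:
            rows.append({"trip_id": f"{trip_id}_{block_id}", "block_id": block_id})
    return rows


for row in trips_per_block(BLOCKS):
    print(row)
```

The shared trip T100 is emitted twice with otherwise identical content, which is the kind of repetition that would trigger duplicate trip warnings.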

Kind regards,

Thomas Koch

[1] http://gtfs.ovapi.nl/ns/gtfs-iffns-latest.zip

John L

Mar 19, 2013, 7:49:20 AM
to gtfs-c...@googlegroups.com

Thumbs up for the concept, thumbs down for the implementation.

Coming late to the conversation, so this may have been stated already.

Without a platform or track metadata element, there is no way to discern whether the data represents a platform or a track, and the two are not interchangeable.

Metro-North Railroad in New York uses only tracks, as there exist many instances of "island" platforms where one platform serves two tracks.

A bus, train, tram, funicular, ferry, etc. all possess the concept of a track/berth/bay where they arrive or depart from.

Platforms, ramps, docks, etc. are attached to a track/berth/bay as amenities or egress to these areas.

If track/berth/bay were implemented instead of platform in the stops file and platforms/ramp/dock were in an optional amenities file, this would all be a thumbs up for me.

Brian Ferris

Mar 19, 2013, 9:46:38 AM
to gtfs-c...@googlegroups.com
I'd argue that this proposal refers only to platforms, the generic name I'm using to refer to a location or area where a rider waits to board a transit vehicle.  I'm not making any claims about track assignments here.

Though it's not explicitly stated, I think the general convention with GTFS is to use stops.txt entries to identify the points where riders wait to board a transit vehicle.  For example, most agencies place a bus stop on the side of the road where the rider waits, not in the middle of the road where the vehicle actually drives.  Arguably the only concept of a track assignment in GTFS at the moment comes from shapes.txt.

Now, it's true that plenty of agencies tell their riders "Train arriving on track 5."  But in most cases, the agency isn't advising the rider to go stand out in the middle of track 5 but instead advising them to wait at the platform with a "5" hanging above it.  By extension in GTFS, you probably wouldn't place your stops.txt entry in the center of the track either, but instead you'd place it at the platform location with a platform_code of "5".  You could still make the stop_name "Track 5" if you prefer.

What about island platforms?  Island platforms are certainly pretty common in most rail networks.  However, most agencies still have signage identifying opposite sides of the platform with a track identifier to distinguish the two sides of the island.  If I were modeling this with GTFS, I would introduce two entries in stops.txt, one for each side of the island platform, and each with a unique platform_code value.
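
A minimal sketch of that island-platform modeling, using invented stop IDs and names: one parent station (location_type 1) with two child stops, one per side of the island, each carrying its own platform_code. The `duplicate_platform_codes` helper is hypothetical and only checks that codes are unique within a parent station.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stops.txt fragment for a station with one island platform.
STOPS_TXT = """stop_id,stop_name,location_type,parent_station,platform_code
ST,Example Station,1,,
ST_5,Example Station Track 5,0,ST,5
ST_6,Example Station Track 6,0,ST,6
"""


def duplicate_platform_codes(stops_txt):
    """Return {parent_station: [codes used more than once]}."""
    seen = defaultdict(list)
    for row in csv.DictReader(io.StringIO(stops_txt)):
        code = (row.get("platform_code") or "").strip()
        parent = (row.get("parent_station") or "").strip()
        if code and parent:
            seen[parent].append(code)
    return {
        parent: sorted({c for c in codes if codes.count(c) > 1})
        for parent, codes in seen.items()
        if len(set(codes)) != len(codes)
    }


print(duplicate_platform_codes(STOPS_TXT))
```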

I don't dispute that there may be value in modeling track assignments explicitly, but I think that's beyond the scope of my current proposal.

Do you have a specific station in mind where the conventions I outlined above would break down?

Andrew Byrd

Mar 19, 2013, 10:38:07 AM
to gtfs-c...@googlegroups.com
On 03/19/2013 02:46 PM, Brian Ferris wrote:
> I'd argue that this proposal refers only to platforms, the generic name
> I'm using to refer to a location or area where a rider waits to board a
> transit vehicle. I'm not making any claims about track assignments here.
>
> Though it's not explicitly stated, I think the general convention with
> GTFS is to use stops.txt entries to identify the points where riders
> wait to board a transit vehicle. For example, most agencies place a bus
> stop on the side of the road where the rider waits, not in the middle of
> the road where the vehicle actually drives. Arguably the only concept
> of a track assignment in GTFS at the moment comes from shapes.txt.

On this point I think everyone agrees: 'Stops' represent places where
passengers board vehicles, to whatever degree of precision the
operator/feed producer desires or can provide. A track/platform/quai/bay
identifier can be included in the "stop" name accordingly.

The platform_code field is apparently intended for isolating the most
precise location information (the track or quai or whatever within a
station area) so it can be presented to the user as such, independent of
the interface language. This all sounds reasonable.

But if I understand correctly, the problem being raised here is that
when translated, the term used in the UI (the equivalent of 'platform')
may not be the correct one for the entity truly represented by this 'Stop'.

Stepping back a moment, I first have to wonder whether this degree of
internationalization is necessary or even possible with the suggested
extension. Is there a concern that someone will see 'Gare Centrale voie
8B' or 'Centraal Station perron 6A' and not be able to identify which
part of that name is the track number? Even with very descriptive stop
names and a translated UI, I suppose the user would see something like
'Gare Centrale voie 8B, perron 8B'. Is that better? Would 'Gare Centrale
voie 9 bay 9' be improved by rendering it 'Gare Centrale voie 9 track 9'?

There are of course many multilingual environments where it is not
appropriate to deliver a GTFS feed with localized platform names in stop
IDs because it implies the primacy of one linguistic community, or
because none of the supplied languages is a lingua franca. Feed
producers in these places might find it convenient to keep the platform
information completely separate from the name, i.e. when a platform_code
is supplied that information is not included in the stop name. On the
other hand, in these places it would often be necessary to translate
place names, yielding two separate feeds.

The temptation to micro-model the world can be strong, but when creating
a model, what you abstract away is just as important as the detail you
include. GTFS doesn't seem destined to be an exhaustive representation
of a transportation system like Transmodel/NeTEx, and I don't see a need
for another such system. GTFS provides the right information to give
useful itineraries to passengers, while remaining simple enough to
encourage data sharing by agencies who may not have a great deal of
internal technical capacity. Might it be sufficient to just include
platform details in the "stop" name?

-Andrew

Brian Ferris

Mar 19, 2013, 11:26:12 AM
to gtfs-c...@googlegroups.com
I'll admit that as an experienced rider of transit who speaks absolutely no French, I could figure out that 'Gare Centrale voie 8B' mentions a platform identifier.  I worry more about inexperienced transit riders who speak absolutely no French.  Mostly, I think of my parents on vacation ; )

As such, we provide transit directions in Google Maps in over 50 different languages.  We think one compelling feature of our service is that a tourist or traveler who doesn't speak a word of the local language can pull out their phone and get directions localized in their native tongue in hundreds of cities around the world.  Towards that goal, we see value in being able to present a localized platform identifier.  Having access to the platform_code stripped of any language-specific word for platform allows us to do that in a more consistent way.

I agree that GTFS does not intend to go into the exhaustive level of detail of other specifications, but I do think this is a feature that adds value for riders without adding a ton of complexity to the spec.  I'd also point out that for better or worse, the ambiguity in stop_name makes it difficult to know how to present it properly in a stop-station hierarchy context.  For example, some agencies include the full name of the parent station in the stop_name for a platform stop, but some don't.  As such, just showing the platform stop_name to identify the stop may not be enough, but combining the stop_names of the parent station and platform stop may lead to redundant display of information.  The platform_code field allows us to be a bit more precise in how we display a stop name.

If the community doesn't agree that it adds value to the spec, then we will probably just continue to support it as a Google-specific extension and the world will move on.  However, if there are better ways to support this feature, then I'm happy to hear suggestions as well.


Andrew Byrd

Mar 19, 2013, 11:57:50 AM
to gtfs-c...@googlegroups.com
On 03/19/2013 04:26 PM, Brian Ferris wrote:
> As such, we provide transit directions in Google Maps in over 50
> different languages. We think one compelling feature of our service is
> that a tourist or traveler who doesn't speak a word of the local
> language can pull out their phone and get directions localized in their
> native tongue in hundreds of cities around the world. Towards that
> goal, we see value in being able to present a localized platform
> identifier. Having access to the platform_code stripped of any
> language-specific word for platform allows us to do that in a more
> consistent way.

It sounds like other people on the list are also interested in providing
localization. I wasn't trying to demonstrate that the extension was
unnecessary, but to outline the things that should be clarified if it
is to be adopted.

In short: the platform_code is more useful as a localization enabler if it
is not duplicated in the stop name. In order for it to be properly
localized, we need to know what kind of entity it refers to so the
appropriate word can be selected in the target language. I suppose you
could just choose the most generic word available in each target
language, but that will make for a poorer interface, and assumes such a
word exists in all languages.

The need to allow feed producers to provide a 'better' name for the
platform in a preferred language (within the stop name itself) is
evidence that platforms are not described in sufficient detail.

I believe this is why people are suggesting a platform type enumeration.

> I'd also point out that for better or worse, the ambiguity in stop_name
> makes it difficult to know how to present it properly in a stop-station
> hierarchy context. For example, some agencies include the full name of
> the parent station in the stop_name for a platform stop, but some don't.

I've noticed this. Is the proper way to deal with this to remove the
ambiguity in the spec, or for feed consumers to pre-process the data?
Pre-processing might be able to avoid parent station name duplication,
but cannot reliably remove platform names that have been included in
stop names according to unknown rules in the original language.

> The
> platform_code field allows us to be a bit more precise in how we display
> a stop name.

I hear concern in previous posts that you also need a platform type
enumeration to do that correctly.

> If the community doesn't agree that it adds value to the spec, then we
> will probably just continue to support it as a Google-specific extension
> and the world will move on.

It seems like it could add value to the spec, but can we at least
clarify whether redundant information is allowed? If it is, does this
hint that platform descriptions are inadequate?

If there is concern about having multiple words in a target language for
different platform types, the location_type field could play a role.
However, at least in languages I am familiar with, the choice of
'platform-word' is largely determined by the mode. Does anyone know of
exceptions to this rule, i.e. a language where it is not possible to
choose a decent platform-word based on the mode used at the stop?

-Andrew


Brian Ferris

Mar 19, 2013, 1:49:55 PM
to gtfs-c...@googlegroups.com
I agree that vehicle type would be one of the primary signals that we would use when performing localization.  When it comes to a platform type enumeration, however, it's hard for me to imagine an enumeration that would provide more detailed information than the vehicle type that wouldn't be intrinsically tied to the local language of the agency as opposed to some general attribute of service.  For example, as Bradley mentioned, two agencies using bays and docks to refer to the same concept seems largely a distinction of English and agency preference, as opposed to a more general property of the platform.  If anyone has counter-examples, I'm definitely interested.
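
One way to picture this kind of localization is a lookup keyed by target language and vehicle type. The sketch below is an assumption-laden illustration, not how any particular consumer does it: the word table is tiny and hand-picked, `localize_platform` is a hypothetical helper, and only the route_type values 2 (rail) and 3 (bus) are taken from GTFS.

```python
# route_type values from GTFS routes.txt: 2 = rail, 3 = bus.
RAIL, BUS = 2, 3

# Hand-picked example words; a real consumer would need far more coverage.
PLATFORM_WORDS = {
    ("en", RAIL): "Platform",
    ("en", BUS): "Bay",
    ("nl", RAIL): "Spoor",
    ("fr", RAIL): "Voie",
}


def localize_platform(platform_code, language, route_type):
    """Render a bare platform_code with a word chosen by language and mode."""
    word = PLATFORM_WORDS.get((language, route_type))
    if word is None:
        # No suitable word known: show the bare identifier rather than guess.
        return platform_code
    return f"{word} {platform_code}"


print(localize_platform("1b", "en", RAIL))
print(localize_platform("1b", "nl", RAIL))
print(localize_platform("C", "en", BUS))
```

This only works cleanly when platform_code holds the bare identifier, as the proposal requires; a value such as "Bay C" would be rendered with the word doubled.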

For these reasons, I still think it's OK for there to be some duplication between stop_name and platform_code.  It's a bit redundant, but stop_name could provide the localized platform name while platform_code would be for localization if a GTFS consumer wished to do so.

As for the ambiguity around stop naming conventions in stop-station hierarchy, I'm not opposed to trying to nail this down.  That said, I'm not totally optimistic either ; )  (Witness previous discussions around route and trip naming).  I guess I would need to do a more detailed analysis of how many agencies are using one convention vs the other in order to determine how much existing feeds would need adjustment.

Brian




Thomas

Mar 19, 2013, 1:55:22 PM
to gtfs-c...@googlegroups.com
It might not have been clear from my earlier posts, but I am for
including this in the spec. Besides localization, this also enables a
better UX in which not every agency represents the platform
differently. That's why I am also for not duplicating the platform in
the stop_name when platform_code is present, but I'm not sure if this
has to be enforced or just used as a rule of thumb.

Brian Ferris

Mar 20, 2013, 9:27:26 AM
to gtfs-c...@googlegroups.com
So I think there is general consensus that a platform_code field would be used by feed producers and consumers.  I don't think there is consensus on how platform_code will co-exist with stop_name.  To summarize the discussion so far, it seems there are two schools of thought:

1) stop_name can hold the fully localized name for a platform stop (eg. "Bay C") while platform_code would hold just the platform identifier "C".  There is some duplication of information here, but it supports existing GTFS clients that use only stop_name or clients that want to display the local stop name as determined by the agency.  Clients who wish to provide a localized platform name or to show a stylized representation of the platform identifier would use platform_code, ignoring stop_name.

2) Let's avoid duplication between stop_name and platform_code.  I think the idea here is that when platform_code is specified, stop_name would be blank (?) and instead the local word for platform would be provided in some field (eg. "Bay", "Platform", ...) or perhaps as an enum (1=bay, 2=platform, ...).

I'll admit that I'm firmly in position #1, and I'd love to hear if I'm misrepresenting #2 or if there is perhaps a third way.  I'm in favor of position #1 because I think it supports existing clients in a graceful way with only a bit of data duplication.  Also, I'm skeptical that position #2 can be implemented in a locale-neutral way.

Thoughts?

Brian

StuartJReynolds

Mar 20, 2013, 9:44:06 AM
to gtfs-c...@googlegroups.com
I'm +1 for option #2

Yes, #1 is a graceful move - but it really doesn't take you forward.

#2 on the other hand more closely matches the data structures I already have - in the UK we have the concept of "Common Name" and "Indicator" e.g. "Southend Travel Centre" is the common name for all of the bays in Southend Travel Centre, and the detail is in the indicator - Stop A, Stop B, etc.

#2 is also more versatile. If you are used to consuming data written as e.g. "Southend Travel Centre Stop A" then you can build this out of two fields to get what you want to display. If, on the other hand, you needed to "show all the departures from Southend Travel Centre" (a not uncommon request) then the reverse is not true - there is no linkage between stops without parsing the text strings.
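
A small sketch of the two directions described above, using hypothetical records with a common name and an indicator (the field names `common_name` and `indicator` are illustrative labels for this example, not GTFS columns). Building a display string from the two fields is trivial, and grouping by common name needs no text parsing.

```python
from collections import defaultdict

# Hypothetical stop records; IDs are invented.
STOPS = [
    {"stop_id": "1", "common_name": "Southend Travel Centre", "indicator": "Stop A"},
    {"stop_id": "2", "common_name": "Southend Travel Centre", "indicator": "Stop B"},
    {"stop_id": "3", "common_name": "High Street", "indicator": ""},
]


def display_name(stop):
    """Join common name and indicator, omitting an empty indicator."""
    return " ".join(part for part in (stop["common_name"], stop["indicator"]) if part)


def stops_by_common_name(stops):
    """Group stop_ids under their shared common name."""
    groups = defaultdict(list)
    for stop in stops:
        groups[stop["common_name"]].append(stop["stop_id"])
    return dict(groups)


print(display_name(STOPS[0]))
print(stops_by_common_name(STOPS)["Southend Travel Centre"])
```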

But even with #2 there is nothing to prevent users writing the names the way that they do now, if they really want to. But I would suggest for "niceness" that the validator should then throw a warning if it finds "platform", "dock", "bay" etc as part of the stop name.
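
Such a check might look like the sketch below. It is not part of any existing validator: the word list is an English-only assumption, and `platform_word_warnings` is a hypothetical helper that flags stops whose stop_name contains a platform word while platform_code is set.

```python
import re

# English-only example list; other languages would need their own words.
PLATFORM_WORDS = ("platform", "dock", "bay", "track", "stand", "stance")
_PATTERN = re.compile(r"\b(" + "|".join(PLATFORM_WORDS) + r")\b", re.IGNORECASE)


def platform_word_warnings(stops):
    """Yield a warning for each stop that sets platform_code but also
    spells out a platform word in stop_name."""
    for stop in stops:
        if not stop.get("platform_code"):
            continue
        match = _PATTERN.search(stop.get("stop_name", ""))
        if match:
            yield (
                f"stop {stop['stop_id']}: stop_name contains "
                f"'{match.group(1)}' although platform_code is set"
            )


stops = [
    {"stop_id": "1", "stop_name": "Travel Centre Bay C", "platform_code": "C"},
    {"stop_id": "2", "stop_name": "Travel Centre", "platform_code": "D"},
    {"stop_id": "3", "stop_name": "Bay Street", "platform_code": ""},
]
for warning in platform_word_warnings(stops):
    print(warning)
```

A word match is only a hint: a stop on "Bay Street" that also has a platform_code would be flagged too, which is a reason to keep this a warning rather than an error.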

Stuart

Andrew Byrd

Mar 20, 2013, 9:56:39 AM
to gtfs-c...@googlegroups.com
On 03/20/2013 02:27 PM, Brian Ferris wrote:
> So I think there is general consensus that a platform_code field would
> be used by feed producers and consumers. I don't think there is
> consensus on how platform_code will co-exist with stop_name.

Thanks for summarizing. Yes, at this point I think the discussion is
about how platform_code will co-exist with stop_name.

> 1) stop_name can hold the fully localized name for a platform stop (eg.
> "Bay C") while platform_code would hold just the platform identifier
> "C". There is some duplication of information here, but it supports
> existing GTFS clients that use only stop_name or clients that want to
> display the local stop name as determined by the agency. Clients who
> wish to provide a localized platform name or to show a stylized
> representation of the platform identifier would use platform_code,
> ignoring stop_name.

The problem here is that there is simply no way to reliably strip a
pre-localized platform-word out of the stop_name unless 1. the stop_name
is known to be only the platform name, e.g. because it is included in a
parent station that provides the rest of the place name, 2. the
stop_name is known to not include any platform information, 3. there is
some convention by which platform names are set apart from the rest of
the stop_name.

It seems like this would be as much a problem for Google as anyone else.
Do we want to guess which part of the stop name is the platform-word and
remove it, possibly yielding a nonsensical result?

> 2) Let's avoid duplication between stop_name and platform_code. I think
> the idea here is that when platform_code is specified, stop_name would
> be blank (?) and instead the local word for platform would be provided
> in some field (eg. "Bay", "Platform", ...) or perhaps as an enum (1=bay,
> 2=platform, ...).

The enum option was proposed, but if mode is sufficient to choose a good
platform-word in most languages it seems redundant.

The idea is that if platform_code is specified, we should at least have
a clear definition of what is in stop_name. It might depend on what
other information is supplied (parent station or platform). It could be
identical to the parent station name, or a pre-localized platform name,
or the dash-separated concatenation of the parent station name and the
pre-localized platform name. Whatever it is, it should be predictable
and allow properly rendering place names without repetition and awkward
translation mistakes.

As Thomas said, the conventions do not have to be strictly enforced.
Feeds that do not follow the recommendations in every detail should
still be usable for routing, but with no guarantee that platform IDs
will display correctly on various clients, be free of repetition, etc.

> I'll admit that I'm firmly in position #1, and I'd love to hear if I'm
> misrepresenting #2 or if there is perhaps a third way. I'm in favor of
> position #1 because I think it supports existing clients in a graceful
> way with only a bit of data duplication. Also, I'm skeptical that
> position #2 can be implemented in a locale-neutral way.

Option #1 is not locale-neutral. The stop_names will contain a
pre-localized platform string that cannot be reliably separated from the
rest of the place name.

-Andrew