Re: [GTFS-realtime] Proposal: Add vehicle occupancy to GTFS-realtime


Roger Morton

Jul 29, 2014, 9:58:53 PM
to gtfs-r...@googlegroups.com
In our system, we always refer to loads as a percentage of seats.  Locally, we set our planning standard for capacity as a percentage of seats. For example, we consider most routes to be "over capacity" if they exceed 150% of the seated load.  In an articulated bus, this might be 82 people. Often we have more people than that in a crush load condition, perhaps 180% of seated load or even more.  Still, customers understand 150% of vehicle seats.  If this is implemented, perhaps the reported occupancy could be expressed as a percentage of vehicle seats.


On Tue, Jul 29, 2014 at 10:36 AM, Aaron Steinfeld <aaron.s...@gmail.com> wrote:
On Tuesday, July 29, 2014 1:46:18 PM UTC-4, Sean Barbeau wrote:
For example, Tiramisu Transit (http://www.tiramisutransit.com/) by Carnegie Mellon crowd-sources occupancy from riders using their mobile app (i.e., not from a traditional AVL/APC system).  It looks like they present occupancy to users in the app as "Bus Load", with the values "Many Seats", "Few Seats", "No seats", and "Full".  I don't think they are presenting this data via an API, though.


Hi everyone, I'm one of the Tiramisu Transit people. Sean sent me an email asking for our rationale and approach for our fullness metric. A quick bit of history: originally, we used Empty, Seats Avail, No Seats, and Full. This seemed to be confusing for our users, especially for the bottom two levels, so we moved to the terms Sean mentions above. We're close to releasing a major refresh and will be shortening the above terms to: Many, Few, Stand, and Full. This will save screen real estate.

Our system is crowdsourced and therefore relies on rider perception of vehicle fullness. We didn't want to ask people to count passengers, and we were concerned about offering too many levels, both because of rater repeatability (the odds that two people would give the same rating for the same stimulus) and because of the actual functional impact of each rating level.

The functional impact part is really important. What does "this bus is 75% full" really mean to a rider? This doesn't disambiguate whether you're going to get a seat - which is the key question for a rider. An APC or farebox can't really tell if a seat is filled by luggage, a backpack, or a slouched passenger. Therefore, I think the spec needs to have the option to document seat fullness as an alternative to raw APC/farebox fullness calculations. An ordinal scale like Nisar's or our approach is also nice since it supports crowdsourced data techniques which can't capture numerical or percentage fullness. As an aside, we've also heard from colleagues that APC counts can be very noisy and often contain erroneous values. Therefore, placing too much faith in the accuracy of their raw numerical values could also be a problem and it might make sense to map their data to 4-6 levels in order to mask the noise.
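
As a rough illustration of mapping noisy counts onto a few levels, the Python sketch below buckets a passenger count by its ratio to seats. The cut points in THRESHOLDS and the function name fullness_level are assumptions made up for this example, not values from Tiramisu or any agency; the labels are the four Tiramisu terms mentioned earlier in the thread.

```python
from bisect import bisect_right

# Hypothetical cut points, expressed as passengers / seats.
# Illustrative assumptions only; a real producer would pick its own.
THRESHOLDS = [0.6, 1.0, 1.5]
LEVELS = ["Many Seats", "Few Seats", "No Seats", "Full"]


def fullness_level(passengers: int, seats: int) -> str:
    """Map a raw (possibly noisy) passenger count to one of four ordinal levels."""
    if seats <= 0:
        raise ValueError("seats must be positive")
    # Clamp negative counts, which noisy APC data can produce, to zero.
    load = max(passengers, 0) / seats
    # bisect_right returns how many cut points are <= load, i.e. the level index.
    return LEVELS[bisect_right(THRESHOLDS, load)]
```

With these assumed cut points, counts of 41 and 44 on a 40-seat bus both map to "No Seats", so a small counting error does not change what the rider sees.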

Another key motivator for us was the functional availability of wheelchair seating spots (our research sponsor is oriented on disability issues). In the US, a "Full" rating is pretty much the same as "no room for a wheelchair." We didn't want to explicitly ask about wheelchair room for several reasons: 

1) Improper interpretation of wheelchair space. We all know a driver will tell people to move out of the wheelchair seats and make room, but many end users might just mark such spaces as full.

2) Malicious mis-reporting that spaces are full. While unlikely, it is quite possible an end user might start to mark wheelchair spots as full as a method for discouraging wheelchair users from seeking out their bus (long boarding times).

3) Serving riders who need seats due to their physical capability. Asking about fullness instead of wheelchair seats is more universal and covers a wider range of disabilities. This extends the value of the data to frail older adults, people with disabilities that impact balance, and those who tire easily.

In reality, the mapping of our labels to wheelchair spot availability is pretty good. This mapping breaks down when both wheelchair spots (common configuration in US) are consumed on a bus that isn’t full. The odds of this are pretty low and that would probably be during off-peak times.

We've been deployed for a while and have thousands of contributed trips (over 145k). We're in discussions with the local transit agency to compare their APC and farebox counts to our fullness ratings. We're still early in this discussion and hope to comment publicly on it sometime over the next year.


--
You received this message because you are subscribed to the Google Groups "GTFS-realtime" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gtfs-realtim...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gtfs-realtime/7ad94bf9-2eb7-48f1-8be6-eed3d2eda76d%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
J. Roger Morton
President and General Manager
Oahu Transit Services, Inc.
Honolulu, HI 96819
Phone  (808) 848-4508
Fax      (808) 848-4419
email    rmo...@thebus.org

Aaron Steinfeld

Jul 30, 2014, 11:56:13 AM
to gtfs-r...@googlegroups.com
To be clear, I'm not advocating percentage vs. some kind of ordinal level. Instead, I'm suggesting support for both is the right way to go. In fact, there may be cases where you want both reported for the same transit vehicle. For example, the local transit agency might want to parameterize their numerical data into one kind of ordinal scale for use by riders yet still support raw values for regional planning and operations.

Aaron

Sean Barbeau

Jul 31, 2014, 3:47:24 PM
to gtfs-r...@googlegroups.com
I've been able to collect a few more data points (TransLoc, TriMet) for existing real-time occupancy data producers, and I've organized this and previous examples in a Google Spreadsheet:
https://docs.google.com/spreadsheets/d/1mRsFuTcaPy0YHe23lSCk2dqwqScfv8EDrC4mZy3uv8I/edit?usp=sharing

If you find more examples, please add them to the spreadsheet.

Based on this info and previous discussion, here's a Google Doc with a new proposal for occupancy that includes both ordinal and numeric values in a new optional OccupancyDescriptor message:
https://docs.google.com/document/d/1kQQClFwWo_peh-fAUyEwigkclQu1gIYs6AqOIzKgvJs/edit?usp=sharing

As long as producers aren't required to be able to report all of the ordinal values, the values that Eric initially proposed (which is what I've included in this proposal) should work for existing producers (TriMet is in the process of implementing a way to report NOT_ACCEPTING_PASSENGERS via their CAD/AVL).

A CapacityType enum is also included to describe how the producer defines capacity (which is required only if the producer populates the occupancy_percentage or capacity fields).

Please feel free to comment here or in the Google Doc.

Thanks!

Sean

Sean Barbeau

Aug 18, 2014, 10:36:27 AM
to gtfs-r...@googlegroups.com
Just wanted to follow up on this vehicle occupancy proposal.  Are there any objections to the new representation I outlined in the Google Doc?

If not, what would be the next steps towards adoption?

We'd like to start testing vehicle occupancy in our system, and we could do it as a custom extension, or as a proposed part of the spec, depending on how the community wants to proceed.

Thanks,
Sean

Eric Andresen

Aug 19, 2014, 7:56:18 AM
to gtfs-r...@googlegroups.com
I think it makes sense to start as an extension with a limited lifespan, and plan to move it into the spec once it's had some lifetime to make sure it fits real-life scenarios.

Before incorporating into the spec, there should be a more detailed analysis of how competing standards model this information (e.g. SIRI), and what systems are able to produce what form of data, as well as what consumers will be able to do with it generally.

As such a generic consumer, for example, I suspect only the OccupancyStatus enum will be of practical use, because something like '130% SITTING capacity' means nothing to me without additional context, whereas any of the OccupancyStatus values represents a state that we can report on a UI to an end-user with a clear meaning and intent.



--
Eric Andresen
ean...@google.com

Barbeau, Sean

Aug 19, 2014, 10:04:12 AM
to gtfs-r...@googlegroups.com

Eric,

Good idea, I’ve added how SIRI represents occupancy (ordinal - seatsAvailable, standingAvailable, full) as well as TCIP (numeric - passenger count [0-255]) to the spreadsheet of known real-world examples of real-time occupancy.

In the list of existing real-world examples I’ve found, 3 of the 4 are presenting occupancy to end-users as a percentage.  Regular riders here at USF tell me that percentage is useful (even without an indicator as to whether it’s SITTING only or SITTING_AND_STANDING), as they are able to generalize what this means based on past experience (e.g., they’ll take a different route, or may have to wait for the next vehicle).  I agree that for new riders percentage likely isn’t as meaningful.

Beyond the contents of the Google Doc spreadsheet (and obviously more data there), what would you like to see in terms of what systems are able to produce and how it’s shown to users?

Sean


Aaron Steinfeld

Aug 19, 2014, 3:45:22 PM
to gtfs-r...@googlegroups.com
In addition to novice users, percentage is also unhelpful for people trying to determine if there is room for a wheelchair. We've found that many buses technically have room but standing riders won't move back, thereby preventing access to the wheelchair seat locations. This is one of the reasons Tiramisu emphasizes a standing rating as one of the ordinal ratings (currently "No Seats" but will soon be renamed "Stand"). We didn't look at SIRI's spec when picking our fullness categories, but it sounds like the SIRI spec uses very similar categories. The only difference is our extra level in the "seats available" range. 

By the way, we did try "Empty" as our lowest rating but found that it really confused users. They were unclear on whether empty meant truly empty or just figuratively empty. Some users would only assign this rating if they were the sole rider. We renamed it "Many Seats" to remove the confusion. One idea is to use Empty when buses are running dead-head between routes or back to the barn. It is a special case of Not_Accepting_Passengers since I think the current semantic meaning for that is the bus is too full to board. To disambiguate, it may be appropriate to change Empty to Out_Of_Service. While OOS trips are not in the GTFS-Realtime spec (correct?), this may be desired in the future when developing AVL tools for systems managers.

Aaron

Eric Andresen

Aug 22, 2014, 7:32:42 AM
to gtfs-r...@googlegroups.com
Thanks, Sean and Aaron.

At this point, I think it would be best if we limit the addition to the official spec to just the ordinal enumeration value, and not the other fields. It is less specific, but has significantly less ambiguity for presenters of this information, and leaves the choice of semantic mapping to the data providers which have a better understanding of their riders and users.

As such, here's my revised proposal based off of Sean's:

Cheers,
-- Eric



Filippos Karapetis

Sep 3, 2014, 9:19:38 AM
to gtfs-r...@googlegroups.com
This idea could be extended to include the seat type, together with its availability.

- Total seats per type for each trip (e.g. 100 A class seats, 300 B class seats)
- Available seats per type for each trip

The above requirements could be added as additional optional fields in the trip updates feed, modeled as follows:

Message: TripUpdate
Fields:
- name: availability_update, type: AvailabilityUpdate, cardinality: repeated

Message: AvailabilityUpdate
Fields:
- name: stop_id_from, type: string, cardinality: optional, description: The starting stop ID. If empty, the availability is given for the whole trip
- name: stop_id_to, type: string, cardinality: optional, description: The ending stop ID. If empty, the availability is given for the whole trip
- name: seat_availability, type: AvailabilityUpdate, cardinality: repeated, description: The available seats per type

Message: AvailabilityUpdate
Fields:
- name: seat_type, type: string, cardinality: optional, description: The type of the seats
- name: total_seats, type: uint32, cardinality: optional, description: The total number of seats
- name: available_seats, type: uint32, cardinality: optional, description: The available number of seats

header {
  gtfs_realtime_version: "1.0"
  incrementality: FULL_DATASET
  timestamp: 1284457468
}
entity {
  id: "simple-trip"
  trip_update {
    trip {
      trip_id: "trip-1"
    }
    stop_time_update {
      stop_sequence: 3
      arrival {
        delay: 5
      }
    }
    stop_time_update {
      stop_sequence: 8
      arrival {
        delay: 1
      }
    }
    stop_time_update {
      stop_sequence: 10
    }
    availability_update {
      stop_id_from: "stop 1"
      stop_id_to: "stop 2"
      seat_availability {
        seat_type: "Class A"
        total_seats: 100
        available_seats: 30
      }
      seat_availability {
        seat_type: "Class B"
        total_seats: 300
        available_seats: 90
      }
    }
  }
}

Barbeau, Sean

Sep 3, 2014, 9:33:47 AM
to gtfs-r...@googlegroups.com

Eric,

If the easiest step forward is to simply start with the enumeration, that’s fine with me.  I added a single comment to the Google Doc, if you have a chance to review.

If there are no objections to starting with the enumeration, what’s the next step?

We’d like to start including this data in our GTFS-rt feed.

Sean

 


Eric Andresen

Sep 5, 2014, 5:10:14 AM
to gtfs-r...@googlegroups.com
Hi Sean,

Thank you for your reminder. We will be proceeding with adding this enum to the published version of the spec as per my latest proposal; I expect it will be able to happen in the next week or so.

Cheers,
-- Eric


Sean Barbeau

Sep 5, 2014, 1:34:15 PM
to gtfs-r...@googlegroups.com
Great, thanks!

Sean

Eric Andresen

Sep 12, 2014, 12:42:39 PM
to gtfs-r...@googlegroups.com
Hi Sean, et al,

I just wanted to let you know: we haven't forgotten about this! However, we're probably going to have to classify the new field as one of "experimental" or "proposed", while still reserving the proto tag number. We're still working out the precise wording and documentation.

In the meantime, this is basically the relevant section if you want to get started using it:

message VehicleDescriptor {
  // ...existing stuff here

  // The degree of occupancy of the vehicle
  enum OccupancyStatus {
    // The vehicle is considered empty by most measures, and has few or no
    // passengers onboard, but is still accepting passengers.
    EMPTY = 0;

    // The vehicle has a relatively large percentage of seats available.
    // What percentage of free seats out of the total seats available is to be
    // considered large enough to fall into this category is determined at the
    // discretion of the producer.
    MANY_SEATS_AVAILABLE = 1;

    // The vehicle has a relatively small percentage of seats available.
    // What percentage of free seats out of the total seats available is to be
    // considered small enough to fall into this category is determined at the
    // discretion of the feed producer.
    FEW_SEATS_AVAILABLE = 2;

    // The vehicle can currently accommodate only standing passengers.
    STANDING_ROOM_ONLY = 3;

    // The vehicle can currently accommodate only standing passengers
    // and has limited space for them.
    CRUSHED_STANDING_ROOM_ONLY = 4;

    // The vehicle is considered full by most measures, but may still be
    // allowing passengers to board.
    FULL = 5;

    // The vehicle is not accepting additional passengers.
    NOT_ACCEPTING_PASSENGERS = 6;

  }
  optional OccupancyStatus occupancy_status = 4;

  // ...existing stuff here...
}

Specifics of names, values and tag are still subject to change, but unlikely to.
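
For consumers who want to try this out, here is a minimal Python sketch of turning the draft enum into rider-facing text. The enum values mirror the draft in this message (which is still subject to change); the label strings, RIDER_LABELS, and rider_label are hypothetical names chosen for illustration, not part of the spec.

```python
from enum import IntEnum


class OccupancyStatus(IntEnum):
    """Mirror of the draft enum values quoted in this message."""
    EMPTY = 0
    MANY_SEATS_AVAILABLE = 1
    FEW_SEATS_AVAILABLE = 2
    STANDING_ROOM_ONLY = 3
    CRUSHED_STANDING_ROOM_ONLY = 4
    FULL = 5
    NOT_ACCEPTING_PASSENGERS = 6


# Hypothetical UI strings; each consumer would choose its own wording.
RIDER_LABELS = {
    OccupancyStatus.EMPTY: "Empty",
    OccupancyStatus.MANY_SEATS_AVAILABLE: "Many seats",
    OccupancyStatus.FEW_SEATS_AVAILABLE: "Few seats",
    OccupancyStatus.STANDING_ROOM_ONLY: "Standing room only",
    OccupancyStatus.CRUSHED_STANDING_ROOM_ONLY: "Very crowded",
    OccupancyStatus.FULL: "Full",
    OccupancyStatus.NOT_ACCEPTING_PASSENGERS: "Not boarding",
}


def rider_label(value: int) -> str:
    """Return a display label, tolerating values added to the enum later."""
    try:
        return RIDER_LABELS[OccupancyStatus(value)]
    except ValueError:
        return "Unknown"
```

Falling back to "Unknown" for unrecognized numbers is one way to stay tolerant if names or values change before the field is finalized.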

Apologies for the delay.

-- Eric



Chiara Micca

Jan 9, 2015, 10:49:09 AM
to gtfs-r...@googlegroups.com
Hi all,

The new field is now published as experimental, see:
https://developers.google.com/transit/gtfs-realtime/reference#VehiclePosition
https://developers.google.com/transit/gtfs-realtime/reference#OccupancyStatus_VehiclePosition

Since VehiclePosition includes the mutable statuses for a vehicle which can change over the course of its run, while VehicleDescriptor is used solely for identification, we chose to add OccupancyStatus as a field of VehiclePosition for consistency. Any feedback or thoughts on this?

Thanks!
Chiara


Brian Ferris

Jan 9, 2015, 10:51:09 AM
to gtfs-r...@googlegroups.com
Sean, I know you are already probably using a flavor of this proposal in some of your production systems based on the updates we did to onebusaway-gtfs-realtime-api.  Thoughts on the proposed location-change for the occupancy_status field?  If you think the change is reasonable, I can help on the OBA side with migrating the field.

Barbeau, Sean

Jan 9, 2015, 11:42:16 AM
to gtfs-r...@googlegroups.com

Chiara/Brian,

We’re actually in the process of adding this field downstream in our GTFS-rt feed, so it hasn’t been deployed in production yet.  So, timing is still good for changes on our end, if we can reach a conclusion in the next week.  We’d of course still need to change onebusaway-gtfs-realtime-api with any updates.

 

I don’t think I agree with this change, though. 

 

1 - Is the mutability of VehicleDescriptor more of an implementation issue on Google’s end?  I don’t see where the mutability of VehicleDescriptor or VehiclePosition is defined in the spec – or am I missing something?  If it’s a matter of the VehicleDescriptor definition (“Identification information for the vehicle performing the trip”) it seems more logical to me to change the definition.  Logically, IMHO occupancy fits better under VehicleDescriptor.

 

2 - More importantly, I believe we’d lose the ability to represent vehicle occupancy in GTFS-rt TripUpdate feeds if occupancy is moved to VehiclePosition, since VehiclePositions aren’t included in TripUpdate feeds, while VehicleDescriptors are.

 

Also, a semi-related observation – I just noticed that in the hierarchy of objects in the GTFS-rt Element Index at the top of the page, VehicleDescriptor is missing from the VehiclePosition node:

https://developers.google.com/transit/gtfs-realtime/reference

 

VehicleStopStatus and CongestionLevel objects are also missing as children of the same VehiclePosition node.


Brian Ferris

Jan 9, 2015, 12:12:30 PM
to gtfs-r...@googlegroups.com
It's not an implementation detail.  I suggested the change because most of the properties of VehicleDescriptor (and similarly TripDescriptor) are things that don't change over the course of the vehicle journey, whereas fields in the parent entity do.  I think that seems like a good convention to keep when deciding whether to add things to the descriptors vs the parent entity.

As for wanting occupancy status in a TripUpdate, that's a fair point, but really, you can make that same argument for every field in VehiclePosition.  If you want to cross-reference the two, I think that's an argument for providing both vehicle positions and trip updates as feeds (perhaps in the same stream even).

Sean Barbeau

Jan 9, 2015, 5:32:03 PM
to gtfs-r...@googlegroups.com
> It's not an implementation detail.  I suggested the change because most of the properties of VehicleDescriptor (and similarly TripDescriptor) are things that don't change over the course of the vehicle journey, whereas fields in the parent entity do.  I think that seems like a good convention to keep when deciding whether to add things to the descriptors vs the parent entity.

Everything else being equal, I'd agree.  But...

> As for wanting occupancy status in a TripUpdate, that's a fair point, but really, you can make that same argument for every field in VehiclePosition.
> If you want to cross-reference the two, I think that's an argument for providing both vehicle positions and trip updates as feeds (perhaps in the same stream even).

I see your point, but IMHO this adds unnecessary complexity for the consumer, to the point where it doesn't justify sticking to the above current design conventions. 

Most real-world GTFS-rt feeds I've seen (including the one we've built that will soon include occupancy) are split with TripUpdates and VehiclePositions in separate streams on separate ports.  In this design, consumers showing trip delays to end users only need to parse the TripUpdate feed.  If the producer adds occupancy to only the VehiclePosition feed and the consumer wants to show occupancy next to trip delay info, now the consumer is forced to also process the VehiclePosition feed, cross-reference vehicle_ids between the feeds (assuming the producer has the foresight to include vehicle_ids in both feeds; it's optional) and merge the data.

Until recently, this cross-referencing actually wouldn't have been possible in our GTFS-rt feeds, since we didn't have access to a vehicle_id from the underlying system that's supplying the predictions.  We're fortunate that now we do have access to vehicle_ids tied to predictions as well.  Even if you can do this matching, though, there is no guarantee about the atomic nature of combined delay/occupancy info across feeds.  In other words, each feed represents its own snapshot in time, which likely won't be identical.  As a producer/consumer, IMHO we should be able to export occupancy to the TripUpdates feed if it's at our disposal.  I'd prefer to keep things simpler client-side.
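
To make the extra consumer work concrete, here is a minimal Python sketch of the cross-referencing step described above. It uses plain dicts as stand-ins for parsed feed entities; the dict keys and the function name attach_occupancy are assumptions for illustration, not any library's API.

```python
def attach_occupancy(trip_updates, vehicle_positions):
    """Copy occupancy from the vehicle-position feed onto matching trip updates.

    Trip updates without a vehicle_id, or with no matching vehicle position,
    get occupancy_status=None.
    """
    # Index occupancy by vehicle id, skipping positions that lack either field.
    occupancy_by_vehicle = {
        vp["vehicle_id"]: vp["occupancy_status"]
        for vp in vehicle_positions
        if vp.get("vehicle_id") and "occupancy_status" in vp
    }
    merged = []
    for tu in trip_updates:
        occupancy = occupancy_by_vehicle.get(tu.get("vehicle_id"))
        merged.append({**tu, "occupancy_status": occupancy})
    return merged
```

Note that the two inputs are separate snapshots, so a delay and the occupancy attached to it here may have been produced at different moments.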

As far as the other fields in VehiclePosition - as a producer/consumer, I don't want to show any of these other fields to end users along with the trip delay info, but I do want to show occupancy with trip delay info.  So, in this sense, you could argue that occupancy is more strongly related to the VehicleDescriptor fields like "label", which is also user-facing, than the other VehiclePosition fields.

If this is a real sticking point I'll agree to move it to VehiclePosition, but that wouldn't be my first choice.

Either way, have a great weekend! :)

Brian Ferris

Jan 9, 2015, 5:46:48 PM
to gtfs-r...@googlegroups.com
I have wanted to display vehicle position + display information together.

I agree that cross referencing trip updates and vehicle positions is tricky if they are in separate feeds (and impossible if vehicle_ids aren't included).  That said, even if you really want occupancy status in TripUpdate, I'd prefer any of the following solutions:

1) Just put OccupancyStatus as a field in TripUpdate.
2) Heck, make VehiclePosition a field of TripUpdate ;)
3) Encourage agencies to provide combined vehicle-position + trip-update feeds, and make explicit that VehicleDescriptor should match between the two.


Kurt Raschke

Jan 9, 2015, 9:06:21 PM
to gtfs-r...@googlegroups.com

On Fri, Jan 9, 2015 at 5:46 PM, 'Brian Ferris' via GTFS-realtime <gtfs-r...@googlegroups.com> wrote:
3) Encourage agencies to provide combined vehicle-position + trip-update feeds, and make explicit that VehicleDescriptor should match between the two.

Are such combined feeds spec-valid?  Perhaps more importantly, are they supported by major feed consumers?  My recollection from the initial public release of the GTFS-realtime spec was that combined feeds were prohibited, although some feed producers published combined feeds anyway.

https://developers.google.com/transit/gtfs-realtime/ (admittedly a non-normative document) says that "[u]pdates of each type are provided in a separate feed".  The spec itself says that "[a] feed should contain only items of the appropriate applications; all the other entities will be ignored".

-Kurt

Brian Ferris

Jan 9, 2015, 9:23:30 PM
to gtfs-r...@googlegroups.com
I'm proposing to relax that restriction (as one option).

I think it can make sense in some systems to model alerts and trips/vehicles separately, since they are often updated on different time-lines.  By comparison, I think combining trips + vehicles in the same feed makes a lot of sense, given that they are often updated from the same input data.


Eric Andresen

Jan 12, 2015, 5:39:03 AM
to gtfs-r...@googlegroups.com
Google does not currently support these mixed feeds, but there are very few technical obstacles to doing so, so we are considering it for the future -- likely it would happen when we have providers requiring such mixed feeds.

That said, vehicle_id is not the only way to match these two feeds: the TripDescriptor in the VehiclePositions feed is also capable of performing the appropriate matching.
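
A small Python sketch of that fallback, assuming parsed entities are represented as plain dicts with optional vehicle_id and trip_id keys (the representation and the name find_position are hypothetical):

```python
def find_position(trip_update, vehicle_positions):
    """Find the vehicle position for a trip update.

    Tries vehicle_id first, then falls back to the trip_id carried in the
    vehicle position's TripDescriptor. Returns None when nothing matches.
    """
    by_vehicle = {vp["vehicle_id"]: vp for vp in vehicle_positions if vp.get("vehicle_id")}
    by_trip = {vp["trip_id"]: vp for vp in vehicle_positions if vp.get("trip_id")}
    match = by_vehicle.get(trip_update.get("vehicle_id"))
    if match is None:
        match = by_trip.get(trip_update.get("trip_id"))
    return match
```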

The bigger concern I have is how infrequently the 'timestamp' fields in both of the TripUpdate and VehiclePosition entities are set by producers. These are quite critical to knowing the age of the data about a particular trip/vehicle, and may have very little connection to the overall feed's timestamp (which in itself is optional, but for the love of all, please always set it). That'd then lead me to complain about time being only seconds-resolution, and not requiring any proper synchronization (we receive one feed that has a clock 3 minutes ahead!), but that's for another day... :-)
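
As a sketch of why these timestamps matter to a consumer, the Python function below computes the age of an entity's data, preferring the entity's own timestamp and falling back to the feed header's. The function name and the use of None for an unset field are assumptions for illustration.

```python
from typing import Optional


def data_age_seconds(
    now: int,
    entity_timestamp: Optional[int],
    header_timestamp: Optional[int],
) -> Optional[int]:
    """Return the age of the data in seconds, or None if no timestamp was set.

    All values are POSIX times in whole seconds. A negative result suggests
    the producer's clock is ahead of the consumer's.
    """
    timestamp = entity_timestamp if entity_timestamp is not None else header_timestamp
    if timestamp is None:
        return None
    return now - timestamp
```

When neither timestamp is set, the consumer cannot tell how old the data is, which is the situation described above.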

Anyway, I think that VehiclePosition is currently the best place for this new field, as it best aligns with the rest of the data. I too feel the pain of having to pull it from the separate feed though.

Sean Barbeau

Jan 12, 2015, 2:56:52 PM
to gtfs-r...@googlegroups.com
1) Just put OccupancyStatus as a field in TripUpdate.

Of the above options, this is the simplest solution that requires the least disruption to the current GTFS-rt ecosystem and is the easiest for current producers to implement given existing feeds.  I'm not wild about the organization of fields, but it gets the job done.  The pragmatic side of me wants to pick this option :).
 
2) Heck, make VehiclePosition a field of TripUpdate ;)

I like this option since TripUpdates and VehiclePositions are tightly-coupled, and it would likely have fewer issues with mismatched/missing ids and/or timestamps than when trying to reconcile separate objects in separate (or same) feeds.  This makes consumers simpler and removes unnecessary complexity.  However, it is a larger change to the GTFS-rt ecosystem than #1.

For existing producers adding this new field in their feed, I guess we would need a requirement that additions of VehiclePosition to TripUpdate would need to be identical to any existing VehiclePosition exports in separate feeds.  You'd have legacy consumers that were still consuming the original separate VehiclePositions feed, and updated consumers that want to pull VehiclePositions from the TripUpdates feed.  Consumers couldn't effectively transition to pulling VehiclePositions from TripUpdates unless the two were identical.  And, producers should still export the TripUpdate.VehicleDescriptor field to support legacy consumers, even though that's duplicated in TripUpdate.VehiclePosition.VehicleDescriptor.

This design does create a roadmap for AVL system evolution and export to GTFS-rt feeds.  If an AVL system can only provide simple position information but no predictions, then they can export a VehiclePosition feed.  When prediction technology is added, they can export TripUpdates that include the VehiclePositions (and deprecate the original VehiclePosition feed?).

However, the reverse option would be to add a TripUpdate field to VehiclePosition.  In the above evolution for new producers IMO it makes more sense (when you can export predictions you're just adding a field to an existing feed, instead of creating a new feed with the old feed as a subset).  This also potentially reduces some field repetition, since you could potentially have a single vehicle generating predictions for multiple trip instances (e.g., loop routes).  So, if you have vehicle A generating predictions for trips X and Y, you'd have:

vehicle {
    id: "A"
    ...
    trip_update {
        trip_id: "X"
    }
    trip_update {
        trip_id: "Y"
    }
}

...instead of:

trip_update {
    trip_id: "X"
    vehicle {
        id: "A"
        ...
    }
}
trip_update {
    trip_id: "Y"
    vehicle {
        id: "A"
        ...
    }
}

A downside of this design is that it's possible that some systems that generate predictions don't share the vehicle_id that generated the prediction.  As mentioned earlier, that was our scenario until recently.

A third variant is to allow both TripUpdate.VehiclePosition and VehiclePosition.TripUpdate.  Seems like this opens up more fragmentation that clients would need to handle, though.

3) Encourage agencies to provide combined vehicle-position + trip-update feeds, and make explicit that VehicleDescriptor should match between the two

I think this is the largest disruption to the current GTFS-rt ecosystem, and requires the most work for producers and consumers to be able to adopt.  It still requires consumers to match fields since they are loosely coupled, and opens up potential for holes in producer implementations (though, granted, making explicit requirements would help).  Endpoint names would need to change to properly reflect the contents (e.g., /trip-updates would no longer be accurate in name, since it would also contain root VehiclePositions elements).

Of the above 3 options/variants, what are others' preferences?

Brian - what's your top choice?

Sean
 

Brian Ferris

Jan 14, 2015, 2:21:43 AM
to gtfs-r...@googlegroups.com
As stated before, I'm not a fan of #1, so where does that leave us?

I think you'd probably agree that TripUpdate and VehiclePosition really should have been unified from day one.  But they weren't so here we are.

In terms of legacy feeds, I think that no matter what we decide here, it makes sense to stress in the spec that if both vehicle positions and trip updates are provided, then providers are strongly encouraged to supply VehicleDescriptor or TripDescriptor values that match between the two feeds.

But either way, separate vehicle position and trip update feeds are already here and probably won't go away.  Is it a pain to unify these two feeds for feed consumers?  Yes but it's certainly not THAT hard. Will some feed providers fail to add the necessary references?  Yes but we will have to do what we can to prevent that going forward (better docs + validation tools).

So in terms of minimal disruption to the spec, we add OccupancyStatus to VehiclePosition and encourage feed providers to provide common id references and we're done.

If we want to go a step further, then I still think my preferred solution is to allow feeds that mix vehicle position and trip update entities.  I like it because:
1) It doesn't change the data model.
2) You can leverage the same logic for handling both split and merged feeds.

Specifically, if you've written a method that can process a feed containing both vehicle positions and trip updates, then implementing the separate feed case is mostly trivial: download the two feeds separately, concat the feed entities, and pass them to your original method.
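
As a rough illustration of the approach described above, here is a minimal Python sketch. It assumes feed entities have already been decoded into plain dicts that loosely mirror the GTFS-realtime message layout (a `vehicle` key for a VehiclePosition, a `trip_update` key for a TripUpdate, each carrying a vehicle descriptor with an `id`). The function names `process_entities` and `process_split_feeds` are hypothetical and not part of any library.

```python
from collections import defaultdict


def process_entities(entities):
    """Group a mixed list of entities by vehicle id.

    Assumption: each entity is a dict with either a "vehicle" key
    (VehiclePosition) or a "trip_update" key (TripUpdate).
    """
    by_vehicle = defaultdict(lambda: {"position": None, "trip_updates": []})
    for entity in entities:
        if "vehicle" in entity:
            vehicle_position = entity["vehicle"]
            vehicle_id = vehicle_position.get("vehicle", {}).get("id")
            if vehicle_id is not None:
                by_vehicle[vehicle_id]["position"] = vehicle_position
        elif "trip_update" in entity:
            trip_update = entity["trip_update"]
            vehicle_id = trip_update.get("vehicle", {}).get("id")
            # Entities without a vehicle id cannot be matched and are skipped.
            if vehicle_id is not None:
                by_vehicle[vehicle_id]["trip_updates"].append(trip_update)
    return dict(by_vehicle)


def process_split_feeds(vehicle_entities, trip_update_entities):
    """Handle the separate-feed case by concatenating and reusing the same logic."""
    return process_entities(list(vehicle_entities) + list(trip_update_entities))
```

The matching step only works when producers supply the same vehicle id in both feeds, which is why the stronger spec language about id references matters.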

I think the proposals for adding TripUpdate to VehiclePosition or vice-versa are tempting, but ultimately, they would require a separate code path from the code you'd write for handling separate vehicle positions and trip updates.  At this point, I don't think the fork in the data model and the additional code paths required to handle it separately from the existing data model are worth the effort.

So to summarize, my preferences:
1) Put OccupancyStatus in VehiclePosition.
2) Add stronger language about id references between VehiclePosition and TripUpdate.
3) Allow merged VehiclePosition + TripUpdate feeds if an agency desires.

Brian


To unsubscribe from this group and stop receiving emails from it, send an email to gtfs-realtime+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gtfs-realtime/d42c99ff-7d36-451d-a2c4-40f33881a74c%40googlegroups.com.

Sean Barbeau

Jan 14, 2015, 12:07:34 PM
to gtfs-r...@googlegroups.com
In terms of legacy feeds, I think that no matter what we decide here, it makes sense to stress in the spec that if both vehicle positions and trip updates are provided, then providers are strongly encouraged to supply VehicleDescriptor or TripDescriptor values that match between the two feeds.

I definitely agree with this; the stronger the language, the better.
 
So to summarize, my preferences:
1) Put OccupancyStatus in VehiclePosition.
2) Add stronger language about id references between VehiclePosition and TripUpdate.
3) Allow merged VehiclePosition + TripUpdate feeds if an agency desires.

I'm fine with the above, provided that the stronger language is added for matching id references.

What would be the next steps from here?

Sean

Brian Ferris

Jan 14, 2015, 1:30:18 PM
to gtfs-r...@googlegroups.com
Next steps:

1) See if anyone strongly disagrees with this proposal.
2) Propose some text for the spec page.
3) Profit.

Sean Barbeau

Jul 28, 2014, 10:25:45 AM
to gtfs-r...@googlegroups.com
Background

Some AVL systems provide “occupancy” data, which is the current number/percentage of riders on-board a given vehicle.  We would like to expose this data via GTFS-realtime.  See https://groups.google.com/forum/#!topic/gtfs-realtime/Nwn2P9L-oQs for a discussion.

Proposal

Add the following optional fields to GTFS-rt:
  • occupancy = (int32) The number of passengers currently on-board a vehicle.  Valid values are non-negative integers.
  • occupancy_percentage =  (float) The number of passengers currently on-board a vehicle, as a percentage of the total number of passengers the vehicle can carry.  For example, if a vehicle is rated to hold 10 passengers, and there are currently 5 passengers on board, occupancy_percentage would be 0.5.  Valid values are [0,1].   A value of 1 indicates that the vehicle is full and no more passengers can board.

These fields would be added to both VehiclePosition.VehicleDescriptor and TripUpdate.VehicleDescriptor.

Thoughts?

Thanks,
Sean

Sean Barbeau
Center for Urban Transportation Research
University of South Florida

Sirinya Matute

Jul 28, 2014, 2:25:42 PM
to gtfs-r...@googlegroups.com

Hi everyone,

I’m writing as someone who works in marketing and is not a computer programmer.

I would have to ask riders for their opinion on having this kind of information available. I know some people personally who would think this is extremely cool. There are riders who will want this information because, quite frankly, we have a lot of bus bunching along our lines (it happens when there is a lot of congestion), and it is useful to know if, say, the first bus in a bunch is full so that they can board the next one, which will be arriving momentarily. That said, we don’t consider a bus ‘crush load’ (aka full) until the passenger load is 130% of the bus’s seated capacity. I understand this might be an industry standard.  I guess I’m trying to think through what value would be meaningful to people. You suggested that valid values would be in [0,1], but I have the feeling that some riders would find it very useful to tell apart a bus that still has standing room from one where there is no room at all: you cannot get on the bus, and the bus driver will consequently be skipping your stop.

If this isn’t feasible from a programming standpoint, please disregard my suggestion!

Best regards,


Sirinya Matute

Santa Monica Big Blue Bus


Michael Smith

Jul 28, 2014, 2:25:42 PM
to gtfs-r...@googlegroups.com
Passenger count info is incredibly useful, so it would be great to add to any feed, including GTFS-realtime.

Couple of notes. Occupancy in big cities frequently exceeds the vehicle capacity. They call it "crush loading" and if you have ever ridden such a bus you would know why. So the percentage should not be limited to 1.

It would be great to also know for each stop how many passengers embark/disembark but that would be difficult to add to the feed and would probably be best left out for now.

The sensors are not that accurate so the CAD/AVL system needs to do appropriate filtering of the data. Since that isn't always done it is important to take current ridership values with a grain of salt.

Mike



Barbeau, Sean

Jul 28, 2014, 3:58:59 PM
to gtfs-r...@googlegroups.com

Mike/Sirinya,

Good point on exceeding 100% capacity.  I guess it comes down to what our definition of occupancy is – a) the number of passengers that the vehicle is rated for, or b) the number of passengers that the vehicle can physically carry (sitting and standing room).  And, how the data is actually represented by the APC system.

In our case, we’ve been told by the AVL vendor that the occupancy_percentage value provided by the AVL system will never exceed 100%.  Our transit operations people say that a driver should technically refuse any more passengers when this hits 100% (although, it sounds like reality might be a bit different).

If others represent this differently, sounds like our choices are:

1. Allow occupancy_percentage > 1
2. Require that producers normalize their data, so 100% occupancy is when they cut off picking up more passengers

#1 is definitely the simplest, although it places the burden on the rider to guess whether or not the vehicle is accepting more passengers.  But, given that it sounds like there are different interpretations of “full” and the cutoff point may be more of a driver judgment call, it seems like it’s the best approach.

So, a new definition of occupancy_percentage:

  • occupancy_percentage =  (float) The number of passengers currently on-board a vehicle, as a percentage of the total number of passengers the vehicle can carry.  For example, if a vehicle is rated to hold 10 passengers, and there are currently 5 passengers on board, occupancy_percentage would be 0.5 (i.e., 50%).  If there are currently 15 passengers on board a vehicle that is rated with a capacity of 10, then occupancy_percentage would be 1.5 (i.e., 150%).
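
The arithmetic in this revised definition can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `occupancy_ratio` and the choice to reject a non-positive capacity are hypothetical and not part of the proposal.

```python
def occupancy_ratio(passenger_count: int, rated_capacity: int) -> float:
    """Return passengers on board divided by the vehicle's rated capacity.

    The result is deliberately not capped at 1.0, so crush loads
    above the rated capacity remain visible to consumers.
    """
    if passenger_count < 0:
        raise ValueError("passenger_count must be non-negative")
    if rated_capacity <= 0:
        raise ValueError("rated_capacity must be positive")
    return passenger_count / rated_capacity


half_full = occupancy_ratio(5, 10)    # the 50% example from the definition
overloaded = occupancy_ratio(15, 10)  # the 150% example from the definition
```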

Sean


Frumin, Michael

Jul 28, 2014, 5:17:05 PM
to <gtfs-realtime@googlegroups.com>
"Total number of passengers the vehicle can carry" is ambiguous.  What is meant by "can" ?

Barbeau, Sean

Jul 28, 2014, 5:27:34 PM
to gtfs-r...@googlegroups.com

What is meant by “is”? ;)

The way I phrased it below was intended to capture the number of passengers at which no additional passengers would be allowed to board the vehicle.  So, “true” capacity vs. “rated” capacity.  Occupancy measured against true capacity should not exceed 100%, while occupancy measured against rated capacity can.  It sounds like our AVL vendor uses more of a true capacity measure, if occupancy_percentage never exceeds 100%, while others are using rated capacity.


Sean

Frumin, Michael

Jul 28, 2014, 8:48:04 PM
to <gtfs-realtime@googlegroups.com>
You imply that through the example but it's not actually stated in your proposal.  In practice, if this is expected to be broadly used, different places will end up using it differently and the interpretation of 'can' will be implementation-specific and a function of vendor system specs, operational procedure, local policy, etc.  I recommend saying as much up front.  

Thanks
Mike

Ivan Egorov

Jul 29, 2014, 4:10:25 AM
to GTFS RT group
2014-07-29 2:48 GMT+02:00 Frumin, Michael <Michael...@nyct.com>:
You imply that through the example but it's not actually stated in your proposal.  In practice, if this is expected to be broadly used, different places will end up using it differently and the interpretation of 'can' will be implementation-specific and a function of vendor system specs, operational procedure, local policy, etc.  I recommend saying as much up front.

Don't most vehicles developed for urban transportation have a nominal capacity declared by the manufacturer? I'd think of this number as 100%, in which case it is possible to have an overloaded vehicle carrying more than 100% of its nominal capacity.
By the way, I've come across a distinction between sitting and standing capacity, and I can imagine that the number of places for disabled persons might be valuable information in a realtime context.

Eric Andresen

Jul 29, 2014, 10:08:21 AM
to gtfs-r...@googlegroups.com
Nit: if you're representing the value as nominally in the range [0,1], do not call the field "percentage". It's a ratio.

It's unclear however that we currently have a consensus on what data we need to represent, and what it means. The meaning of "1.0" needs to be made entirely clear for the data to be useful, and not have to be interpreted with locale-specific rules. Most likely the ratio is in itself not a useful value though, and we'd need further breakdowns -- or take an entirely different approach, and just aim for a less-specific value with semantic meaning instead. For example, something along the lines of: "0: empty", "1: many seats available", "2: few seats available", "3: standing room only", "4: crushed standing only", "5: completely full", "6: not accepting boarding passengers".
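
To make the ordinal alternative concrete, here is a small Python sketch of the seven levels listed above. The class name `OccupancyLevel`, the member names, and the two helper functions are assumptions made for illustration; in particular, where to draw the "can I still board?" line is a guess, not something the thread has agreed on.

```python
from enum import IntEnum


class OccupancyLevel(IntEnum):
    """Ordinal occupancy levels, numbered as in the list above."""
    EMPTY = 0
    MANY_SEATS_AVAILABLE = 1
    FEW_SEATS_AVAILABLE = 2
    STANDING_ROOM_ONLY = 3
    CRUSHED_STANDING_ONLY = 4
    COMPLETELY_FULL = 5
    NOT_ACCEPTING_PASSENGERS = 6


def seat_likely(level: OccupancyLevel) -> bool:
    """Assumed reading: a rider can expect a seat up to 'few seats available'."""
    return level <= OccupancyLevel.FEW_SEATS_AVAILABLE


def boarding_possible(level: OccupancyLevel) -> bool:
    """Assumed reading: boarding is possible below 'completely full'."""
    return level < OccupancyLevel.COMPLETELY_FULL
```

Because the values are ordered, a consumer can compare levels directly without knowing any vehicle's capacity.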

Cheers,
-- Eric





--
Eric Andresen
ean...@google.com

Stefan de Konink

Jul 29, 2014, 10:21:51 AM
to 'Eric Andresen' via GTFS-realtime
On Tue, 29 Jul 2014, 'Eric Andresen' via GTFS-realtime wrote:

> Nit: if you're representing the value as nominally in the range [0,1], do
> not call the field "percentage". It's a ratio.

I like the concept of a ratio, but it is by definition as ambiguous as the
concept of 'fullness' when people are boarding the wrong part of the
vehicle. For example, with trains, the front can be overloaded while the
back is still empty.

The ratio could be a nice aggregated value. A better nuance is obviously an
explicit number of seats free (not taken!). But to make it more expressive
we might also want to consider a way to provide this information at
sub-vehicle level, for information such as: board in front, middle, back etc.

Stefan

Andrew Byrd

Jul 29, 2014, 11:34:59 AM
to gtfs-r...@googlegroups.com
Hello,

Is there a technical or practical reason to use a ratio rather than
separate capacity and rider count fields, both of which are in units of
"people"? Which figures are most commonly provided by AVL systems?

It would of course be possible to include an optional seated capacity
field as complementary information to the total capacity.

Rider count and capacity are very useful information to include in
realtime messages, as it would allow riders to make an informed decision
to wait a few minutes for the following vehicle in order to obtain a
seat or at least some breathing room. This has the potential to improve
efficiency (especially when bunching occurs) by spreading riders more
evenly over vehicles, and significantly improve the rider's experience.

-Andrew Byrd


Nisar Ahmed

Jul 29, 2014, 1:03:43 PM
to gtfs-r...@googlegroups.com
Let me bring in a perspective for operations where an expensive AVL system is non-existent but real-time information may be provided with crowd-sourced data. I am thinking of transit systems in the developing world. For those systems, Eric's idea of "0: empty", "1: many seats available", "2: few seats available", "3: standing room only", "4: crushed standing only", "5: completely full", "6: not accepting boarding passengers" might be a more user-friendly option for mobile apps to capture and disseminate an occupancy approximation.

--Nisar


Sirinya Matute

Jul 29, 2014, 1:06:35 PM
to gtfs-r...@googlegroups.com
I like Nisar's perspective. I would be curious as to how we arrive at these intervals but I know that if it is feasible, riders would find this really useful. We here at Big Blue Bus in Los Angeles give our motor coach operators the option of indicating that they cannot stop through their headsign. It's had mixed results (literally the sign says "Sorry, Bus Full" and the sign doesn't always get adjusted when enough passengers disembark.)

-Sirinya Matute
Big Blue Bus
Santa Monica, Calif.

Barbeau, Sean

Jul 29, 2014, 1:11:38 PM
to gtfs-r...@googlegroups.com
In the AVL system we're working with (Syncromatics), the ratio (i.e., percentage) is the only real-time data that we have for occupancy. We don't have access to the actual count or total capacity values.

Sean


Stefan de Konink

Jul 29, 2014, 1:13:54 PM
to gtfs-r...@googlegroups.com
On Tue, 29 Jul 2014, Barbeau, Sean wrote:

> In the AVL system we're working with (Syncromatics), the ratio (i.e., percentage) is the only real-time data that we have for occupancy. We don't have access to the actual count or total capacity values.

I'm curious. How does the system calculate it?

Barbeau, Sean

Jul 29, 2014, 1:22:46 PM
to gtfs-r...@googlegroups.com
We've been told that internally a capacity value ("the maximum number of people the bus can hold" were the exact words) is defined for each vehicle. The APC system counts the number of people on/off at each stop, and the percentage is (current_count / capacity). The percentage is capped at 100% (presumably, if there is an erroneous over-count, the current_count is capped/reset).
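
For readers who want to see that calculation spelled out, here is a minimal Python sketch of a running APC count with a capped percentage. The class `ApcCounter` and its method names are hypothetical, and the clamping behaviour is only a guess at what "capped/reset" might mean; it is not the vendor's actual implementation.

```python
class ApcCounter:
    """Running passenger count for one vehicle, clamped to [0, capacity]."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.current_count = 0

    def record_stop(self, boarded: int, alighted: int) -> None:
        """Apply the on/off counts observed at one stop."""
        count = self.current_count + boarded - alighted
        # Assumption: erroneous over- or under-counts are clamped, not kept.
        self.current_count = max(0, min(count, self.capacity))

    def percent_full(self) -> float:
        """Percentage as (current_count / capacity), never above 100."""
        return 100.0 * self.current_count / self.capacity
```

Clamping like this hides crush loads, which is the trade-off raised earlier in the thread about allowing values above 100%.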

This information is presented to the user in the vendor's web interface as "Percent Full" - see:
http://www.usfbullrunner.com/map

Sean

Juan Matute

Jul 29, 2014, 1:34:57 PM
to gtfs-r...@googlegroups.com
The "maximum number of people that a bus can hold" varies based on passengers and other variables.  Even on the same bus, the crush load occupancy value for a load of students after middle school in the spring is quite different from that for a load of adults with heavy jackets in mid-winter.

Juan Matute
Associate Director, UCLA Institute of Transportation Studies



Frumin, Michael

Jul 29, 2014, 1:42:31 PM
to gtfs-r...@googlegroups.com
I wouldn't hang this proposal's hat on the conventions and assumptions of a single vendor.




Barbeau, Sean

Jul 29, 2014, 1:46:18 PM
to gtfs-r...@googlegroups.com
Does anyone else have other examples of APC data accessible in real-time from a real system? It would be useful to try and generalize the spec around real-world examples for what data is available and how this data is presented to riders.

For example, Tiramisu Transit (http://www.tiramisutransit.com/) by Carnegie Mellon crowd-sources occupancy from riders using their mobile app (i.e., not from a traditional AVL/APC system). It looks like they present occupancy to users in the app as "Bus Load", with the values "Many Seats", "Few Seats", "No seats", and "Full". I don't think they are presenting this data via an API, though.

Sean


Aaron Steinfeld

Jul 29, 2014, 4:36:15 PM
to gtfs-r...@googlegroups.com
On Tuesday, July 29, 2014 1:46:18 PM UTC-4, Sean Barbeau wrote:
For example, Tiramisu Transit (http://www.tiramisutransit.com/) by Carnegie Mellon crowd-sources occupancy from riders using their mobile app (i.e., not from a traditional AVL/APC system).  It looks like they present occupancy to users in the app as "Bus Load", with the values "Many Seats", "Few Seats", "No seats", and "Full".  I don't think they are presenting this data via an API, though.


Hi everyone, I'm one of the Tiramisu Transit people. Sean sent me an email asking for our rationale and approach for our fullness metric. A quick bit of history: originally, we used Empty, Seats Avail, No Seats, and Full. This seemed to be confusing for our users, especially for the bottom two levels, so we moved to the terms Sean mentions above. We're close to releasing a major refresh and will be shortening the above terms to: Many, Few, Stand, and Full. This will save screen real estate.

Our system is crowdsourced and therefore relies on rider perception of vehicle fullness. We didn't want to ask people to count passengers, and we were concerned about offering too many levels because of rater repeatability (the odds two people would give the same rating for the same stimuli) and about the actual functional impact of each rating level.

The functional impact part is really important. What does "this bus is 75% full" really mean to a rider? This doesn't disambiguate whether you're going to get a seat - which is the key question for a rider. An APC or farebox can't really tell if a seat is filled by luggage, a backpack, or a slouched passenger. Therefore, I think the spec needs to have the option to document seat fullness as an alternative to raw APC/farebox fullness calculations. An ordinal scale like Nisar's or our approach is also nice since it supports crowdsourced data techniques which can't capture numerical or percentage fullness. As an aside, we've also heard from colleagues that APC counts can be very noisy and often contain erroneous values. Therefore, placing too much faith in the accuracy of their raw numerical values could also be a problem and it might make sense to map their data to 4-6 levels in order to mask the noise.
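
One way to picture mapping noisy numeric data onto a few levels is the Python sketch below. It buckets a seated-load ratio (passengers divided by seats) into the four short labels mentioned earlier. The cut points in `THRESHOLDS` are purely hypothetical placeholders, not values used by Tiramisu or any agency, and the function name `fullness_label` is likewise an assumption.

```python
import bisect

# Labels in increasing order of fullness.
LABELS = ["Many", "Few", "Stand", "Full"]

# Hypothetical cut points on the seated-load ratio; an agency would
# need to choose its own values.
THRESHOLDS = [0.5, 1.0, 1.3]


def fullness_label(seated_load_ratio: float) -> str:
    """Map passengers/seats to a coarse label to mask counting noise."""
    if seated_load_ratio < 0:
        raise ValueError("seated_load_ratio must be non-negative")
    # bisect_right returns how many thresholds are <= the ratio,
    # which is the index of the matching label.
    return LABELS[bisect.bisect_right(THRESHOLDS, seated_load_ratio)]
```

With coarse buckets, a count that is off by a few passengers usually lands in the same label, which is the noise-masking effect described above.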

Another key motivator for us was the functional availability of wheelchair seating spots (our research sponsor is oriented on disability issues). In the US, a "Full" rating is pretty much the same as "no room for a wheelchair." We didn't want to explicitly ask about wheelchair room for several reasons: 

1) Improper interpretation of wheelchair space. We all know a driver will tell people to move out of the wheelchair seats and make room, but many end users might just mark such spaces as full.

2) Malicious mis-reporting that spaces are full. While unlikely, it is quite possible an end user might start to mark wheelchair spots as full as a method for discouraging wheelchair users from seeking out their bus (long boarding times).

3) Serving riders who need seats due to their physical capability. Asking about fullness instead of wheelchair seats is more universal and covers a wider range of disabilities. This extends the value of the data to frail older adults, people with disabilities that impact balance, and those who tire easily.

In reality, the mapping of our labels to wheelchair spot availability is pretty good. This mapping breaks down when both wheelchair spots (common configuration in US) are consumed on a bus that isn’t full. The odds of this are pretty low and that would probably be during off-peak times.

We've been deployed for a while and have thousands of contributed trips (over 145k). We're in discussions with the local transit agency to compare their APC and farebox counts to our fullness ratings. We're still early in this discussion and hope to comment publicly on it sometime over the next year.

