Trip schedule relationship UNSCHEDULED

Ivan Volosyuk

Jan 4, 2022, 11:59:47 PM
to gtfs-r...@googlegroups.com
I'm working for a large gtfs-rt consumer and we currently don't
support the UNSCHEDULED trip relationship. Looking for the right way
to add support for it, I'm kinda puzzled. Some agencies don't
specify the start_time for the trips in the UNSCHEDULED trip
descriptors. The start_time field is described as "The initially
scheduled start time of this trip instance". Not scheduled - no start
time, right? But this way we don't have a stable identifier for the
trip instance over time, which is {trip_id, start_date, start_time}.

I guess we may get away with that for the trip_update feed, but it
definitely doesn't work for vehicle position feeds, where
{lat,lng} doesn't give sufficient information to handle loops and
self-intersections. We do stateful processing for vehicle position feeds
and a stable identifier for a trip is a requirement for that. Any way
forward?
--
Thanks,
Ivan

Joachim Pfeiffer

Jan 5, 2022, 1:35:38 AM
to gtfs-r...@googlegroups.com
Scheduled, Added and Canceled follow a schedule/headway;
"Unscheduled" is service without a headway.

Sean Barbeau

Jan 5, 2022, 1:21:04 PM
to GTFS-realtime
Ivan,
Yes, true frequency-based trips (exact_times=0) are tricky in GTFS Realtime. Unlike normal or exact_times=1 trips, each real-time trip instance is materialized in real-time because it isn't previously defined in the GTFS schedule data (hence UNSCHEDULED).

>Some agencies don't specify the start_time for the trips in the UNSCHEDULED trip descriptors

Hmmm, that's not good - it's an error.

The canonical spec on GitHub outlines how UNSCHEDULED should be used:

Some relevant sections:

> When the trip_id corresponds to a frequency-based trip defined in GTFS frequencies.txt, start_time is required and must be specified for trip updates and vehicle positions. If the trip corresponds to exact_times=1 GTFS record, then start_time must be some multiple (including zero) of headway_secs later than frequencies.txt start_time for the corresponding time period. If the trip corresponds to exact_times=0, then its start_time may be arbitrary, and is initially expected to be the first departure of the trip. Once established, the start_time of this frequency-based exact_times=0 trip should be considered immutable, even if the first departure time changes -- that time change may instead be reflected in a StopTimeUpdate. If trip_id is omitted, start_time must be provided. Format and semantics of the field is same as that of GTFS/frequencies.txt/start_time, e.g., 11:15:35 or 25:15:35.

>The trip_id field cannot, by itself or in combination with other TripDescriptor fields, be used to identify multiple trip instances. For example, a TripDescriptor should never specify trip_id by itself for GTFS frequencies.txt exact_times=0 trips because start_time is also required to resolve to a single trip instance starting at a specific time of the day. If the TripDescriptor does not resolve to a single trip instance (i.e., it resolves to zero or multiple trip instances), it is considered an error and the entity containing the erroneous TripDescriptor may be discarded by consumers.

>UNSCHEDULED - A trip that is running with no schedule associated to it - this value is used to identify trips defined in GTFS frequencies.txt with exact_times = 0. It should not be used to describe trips not defined in GTFS frequencies.txt, or trips in GTFS frequencies.txt with exact_times = 1. Trips with schedule_relationship: UNSCHEDULED must also set all StopTimeUpdates schedule_relationship: UNSCHEDULED
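
For illustration, the exact_times=1 rule in the first quoted section could be checked roughly as follows. This is only a sketch: the function names are hypothetical, and treating the frequencies.txt end_time as an exclusive upper bound is an assumption that should be checked against the GTFS reference.

```python
def parse_gtfs_time(value: str) -> int:
    """Convert a GTFS HH:MM:SS string (hours may exceed 23) to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def is_valid_exact_times_start(start_time: str, window_start: str,
                               window_end: str, headway_secs: int) -> bool:
    """Check a start_time against one frequencies.txt row with exact_times=1."""
    start = parse_gtfs_time(start_time)
    first = parse_gtfs_time(window_start)
    last = parse_gtfs_time(window_end)
    # Assumption: the window is [start_time, end_time), end exclusive.
    if headway_secs <= 0 or not (first <= start < last):
        return False
    # start_time must be a whole number of headways (zero included)
    # after the start of the frequency window.
    return (start - first) % headway_secs == 0
```

For exact_times=0 trips no such check applies, since the quoted text allows an arbitrary start_time.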

UNSCHEDULED should only be used with exact_times=0 trips. And, when providing TripUpdates, you should provide the trip_id, start_time (based on the above highlighted definition - it's materialized on-the-fly and is somewhat arbitrary), and start_date - the combination of these fields is your stable identifier and shouldn't change while that same vehicle is serving that same trip.
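
On the consumer side, that identifier could be modeled roughly like this. It is a minimal sketch, not taken from any existing library: TripInstanceKey and trip_instance_key are hypothetical names, and dropping descriptors that lack any of the three fields is one possible policy, in line with the quoted text saying such entities may be discarded.

```python
from typing import NamedTuple, Optional


class TripInstanceKey(NamedTuple):
    trip_id: str
    start_date: str  # YYYYMMDD
    start_time: str  # HH:MM:SS, hours may exceed 23


def trip_instance_key(trip_id: Optional[str], start_date: Optional[str],
                      start_time: Optional[str]) -> Optional[TripInstanceKey]:
    """Return a stable key for an UNSCHEDULED descriptor, or None if incomplete."""
    if not trip_id or not start_date or not start_time:
        # Without all three fields the descriptor cannot resolve to a single
        # trip instance, so this sketch rejects it instead of guessing.
        return None
    return TripInstanceKey(trip_id, start_date, start_time)
```

Because the key is a hashable tuple, it can index whatever per-trip state a stateful vehicle position pipeline keeps between feed fetches.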

If you have a loop route, your start_time should change based on the loop iteration that the vehicle is serving when it reaches the end of the GTFS trip. Note that as the vehicle nears the end of one loop iteration, you can (and should) have two simultaneous TripUpdates for loop routes that both refer to the same vehicle but different loop iterations (loop that is finishing and upcoming loop) with two different start_times. This avoids "pop-in" for predictions for travelers waiting at the first few stops in the trip.
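
A rough sketch of what that looks like on the producer side, using plain Python objects instead of the real protobuf classes. LoopTripUpdate and updates_for_loop_vehicle are hypothetical names, and how the producer decides that a loop is "nearly finished" is left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LoopTripUpdate:
    trip_id: str
    start_date: str
    start_time: str
    vehicle_id: str


def updates_for_loop_vehicle(trip_id: str, start_date: str, vehicle_id: str,
                             current_start_time: str,
                             next_start_time: Optional[str] = None
                             ) -> List[LoopTripUpdate]:
    """Build the TripUpdates published for one vehicle on a loop route."""
    updates = [LoopTripUpdate(trip_id, start_date, current_start_time, vehicle_id)]
    if next_start_time is not None:
        # Near the end of a loop, the upcoming iteration is published as well,
        # with its own start_time but the same vehicle.
        updates.append(
            LoopTripUpdate(trip_id, start_date, next_start_time, vehicle_id))
    return updates
```

Both entries share trip_id, start_date and vehicle; only start_time tells the two loop iterations apart.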

Does that make sense?

Sean

Ivan Volosyuk

Jan 5, 2022, 9:35:45 PM
to gtfs-r...@googlegroups.com
Right. It makes sense. I think the message "The initially scheduled
start time of this trip instance." is a bit confusing for an
UNSCHEDULED relationship.
https://github.com/google/transit/blob/master/gtfs-realtime/proto/gtfs-realtime.proto#L749
Otherwise, there are actually enough hints that start_time is required
for frequency-based trips. Thanks for the answer.
--
Thanks,
Ivan

Sean Barbeau

Jan 6, 2022, 1:01:58 PM
to GTFS-realtime
Ivan,
That's a good point, the word "scheduled" should probably be removed there or clarified.

Sean

Sean Barbeau

Jan 6, 2022, 1:16:43 PM
to GTFS-realtime
I opened a pull request to clarify this wording here:

Feedback is welcome!

Sean

t...@google.com

Jan 6, 2022, 8:40:29 PM
to GTFS-realtime
Is it correct to say that these 2 enum values (SCHEDULED and UNSCHEDULED) don't add any useful information to the RT message, if they're fully determined by what the static feed declared already? In other words, they're just duplicated information from the static, and they are not allowed to diverge from what the static feed already declared for the corresponding trip.

Note that this isn't true for the other values of the enum, CANCELED and ADDED/DUPLICATED, which definitely add meaning to that RT entity.

Sean Barbeau

Jan 10, 2022, 11:15:51 AM
to gtfs-r...@googlegroups.com
>Is it correct to say that these 2 enum values (SCHEDULED and UNSCHEDULED) don't add any useful information to the RT message, if they're fully determined by what the static feed declared already?

Semantically, yes, that's correct - those enums don't add any additional transit information to the RT message.

However, given how protobufs work, we need a default value, and that's SCHEDULED. And, it doesn't make sense to indicate SCHEDULED for exact_times=0 trips, so therefore we need UNSCHEDULED for those. In hindsight, it probably would have made more sense to have a DEFAULT/UNCHANGED/STATIC or another similar neutral default that could have encompassed both schedule and frequency-based trips to avoid having two different "default" values.
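
Since SCHEDULED and UNSCHEDULED are fully determined by the static feed, a consumer could cross-check them along these lines. This is a sketch under assumptions: the function names are hypothetical, and exact_times_by_trip is an assumed pre-built mapping from trip_id to exact_times (as an int, with an empty value already normalized to 0) for trips listed in frequencies.txt.

```python
def expected_default_relationship(trip_id: str, exact_times_by_trip: dict) -> str:
    """Return the non-modifying value the static feed implies for a trip."""
    # Trips missing from the mapping are ordinary schedule-based trips.
    if exact_times_by_trip.get(trip_id) == 0:
        return "UNSCHEDULED"
    return "SCHEDULED"


def is_consistent_with_static(trip_id: str, relationship: str,
                              exact_times_by_trip: dict) -> bool:
    """Flag SCHEDULED/UNSCHEDULED values that contradict the static feed."""
    if relationship not in ("SCHEDULED", "UNSCHEDULED"):
        # CANCELED, ADDED and DUPLICATED carry their own meaning
        # and are not checked here.
        return True
    return relationship == expected_default_relationship(
        trip_id, exact_times_by_trip)
```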

Sean


t...@google.com

Jan 12, 2022, 1:08:58 AM
to GTFS-realtime
Thanks Sean, this is a great explanation.
Wouldn't it be worth updating the description of the field accordingly, with something like:

// Indicates whether this TripDescriptor modifies a trip declared in static (either SCHEDULED or UNSCHEDULED depending on the type of trip), adds a new one (ADDED or DUPLICATED) or removes one (CANCELED).

Instead of what's there currently, which IMO is incomplete and misleading: 
// The relation between this trip and the static schedule. If a trip is done
// in accordance with temporary schedule, not reflected in GTFS, then it
// shouldn't be marked as SCHEDULED, but likely as ADDED.

Sean Barbeau

Jan 12, 2022, 12:07:11 PM
to gtfs-r...@googlegroups.com
Yes, I agree - that description in the .proto is very outdated, especially given that ADDED is "unspecified". 

So I agree it should be updated, but I'd remove the direct reference to ADDED in new text. And to avoid this text becoming outdated again, I think I'd just prefer a short definition at the top of the enum, and leave the specified behavior to be described for each of the enum values.

So something like:

    // The relation between this trip and the trip declared in static data.

Looking at the .proto further, I also realized that TripDescriptor.schedule_relationship doesn't have an explicit default:
    optional ScheduleRelationship schedule_relationship = 4;
By protocol buffer rules, in this case the first enum, SCHEDULED, is then the default. But that's not obvious to someone who's not familiar with protocol buffers. And the reference document doesn't specify that SCHEDULED is the default for TripDescriptor.schedule_relationship either:

Conversely, StopTimeUpdate schedule_relationship does explicitly declare SCHEDULED as default in the .proto:
    optional ScheduleRelationship schedule_relationship = 5 [default = SCHEDULED];
And it is explicitly mentioned as the default in the reference too:

I think we should modify TripDescriptor.schedule_relationship to be the same - declare the default explicitly in the .proto and mention it in the reference. 

To be clear this doesn't change any existing behavior, just makes it clearer to new readers of the spec and .proto what the behavior is.
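
The effect of that implicit default can be imitated without the protobuf library. In this sketch an unset field is represented by None, and both the ScheduleRelationship class and resolve_schedule_relationship are hypothetical stand-ins, not the generated GTFS Realtime bindings.

```python
from enum import Enum
from typing import Optional


class ScheduleRelationship(Enum):
    # Stand-in for the TripDescriptor enum; SCHEDULED is declared first.
    SCHEDULED = "SCHEDULED"
    ADDED = "ADDED"
    UNSCHEDULED = "UNSCHEDULED"
    CANCELED = "CANCELED"
    DUPLICATED = "DUPLICATED"


def resolve_schedule_relationship(value: Optional[str]) -> ScheduleRelationship:
    """Map a possibly unset field to the value a reader ends up seeing."""
    if value is None:
        # An unset field reads back as the default, SCHEDULED, whether
        # or not the .proto spells that default out.
        return ScheduleRelationship.SCHEDULED
    return ScheduleRelationship(value)
```

A feed that omits the field for an exact_times=0 trip is therefore read as SCHEDULED, which is why producers need to set UNSCHEDULED explicitly for those trips.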

I added the changes you mentioned (with my edits) and further clarification on the above to the same pull request:

Please feel free to take a look and let me know what you think.

Sean

t...@google.com

Jan 19, 2022, 2:41:22 AM
to GTFS-realtime
Thanks, I'll have a look there.

>To be clear this doesn't change any existing behavior, just makes it clearer to new readers of the spec and .proto what the behavior is.

Yes exactly. This field was undoubtedly confusing with its current documentation.

Sean Barbeau

Mar 4, 2022, 11:34:34 AM
to GTFS-realtime
I'd like to call for a vote on the clarifications I've proposed in the spec to address these issues. Voting will be open until Friday 2022-03-11 23:59:59 UTC.

Please vote by commenting with a +1, -1, or abstain in comments on the pull request proposal:

Thanks,
Sean
