Duplicate/almost identical trips in GTFS Bus Data from Feb, 2025


elif ensari

10:19 AM
to mtadeveloperresources
Hi, 
We are working with the GTFS bus schedules from February 2025 and are finding several Weekday trip ids representing identical or very similar trips. 

Why do these trips repeat, and is there an easy way to clean these up based on trip id or other data? Currently, we drop duplicates based on first departure, last arrival, number of stops, direction and route.
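For reference, the de-duplication key described above (route, direction, number of stops, first departure, last arrival) could be sketched roughly as follows. This is only an illustration in plain Python: the function names and the toy rows (`T1`, `T2`, `T3`, `R1`, `S1`, `S2`) are made up, and the rows are assumed to be dicts of the kind `csv.DictReader` yields for trips.txt and stop_times.txt.

```python
from collections import defaultdict

def trip_signatures(trips, stop_times):
    """Map trip_id -> (route_id, direction_id, n_stops, first_departure, last_arrival)."""
    by_trip = defaultdict(list)
    for row in stop_times:
        by_trip[row["trip_id"]].append(row)
    sigs = {}
    for trip in trips:
        rows = sorted(by_trip.get(trip["trip_id"], []),
                      key=lambda r: int(r["stop_sequence"]))
        if not rows:
            continue  # trip has no stop_times, nothing to compare
        sigs[trip["trip_id"]] = (
            trip["route_id"],
            trip["direction_id"],
            len(rows),
            rows[0]["departure_time"],
            rows[-1]["arrival_time"],
        )
    return sigs

def group_duplicates(sigs):
    """Return only the signatures shared by more than one trip_id."""
    groups = defaultdict(list)
    for trip_id, sig in sigs.items():
        groups[sig].append(trip_id)
    return {sig: ids for sig, ids in groups.items() if len(ids) > 1}

# Toy data (hypothetical ids, not taken from the MTA feeds).
TRIPS = [
    {"trip_id": "T1", "route_id": "R1", "direction_id": "0"},
    {"trip_id": "T2", "route_id": "R1", "direction_id": "0"},
    {"trip_id": "T3", "route_id": "R1", "direction_id": "1"},
]
STOP_TIMES = [
    {"trip_id": "T1", "stop_sequence": "1", "stop_id": "S1",
     "arrival_time": "08:00:00", "departure_time": "08:00:00"},
    {"trip_id": "T1", "stop_sequence": "2", "stop_id": "S2",
     "arrival_time": "08:10:00", "departure_time": "08:10:00"},
    {"trip_id": "T2", "stop_sequence": "1", "stop_id": "S1",
     "arrival_time": "08:00:00", "departure_time": "08:00:00"},
    {"trip_id": "T2", "stop_sequence": "2", "stop_id": "S2",
     "arrival_time": "08:10:00", "departure_time": "08:10:00"},
    {"trip_id": "T3", "stop_sequence": "1", "stop_id": "S2",
     "arrival_time": "08:00:00", "departure_time": "08:00:00"},
    {"trip_id": "T3", "stop_sequence": "2", "stop_id": "S1",
     "arrival_time": "08:10:00", "departure_time": "08:10:00"},
]

SIGS = trip_signatures(TRIPS, STOP_TIMES)
DUPLICATES = group_duplicates(SIGS)
for sig, ids in DUPLICATES.items():
    print(sig, ids)
```

Note that this key only looks at the endpoints of each trip, so two trips that differ at an intermediate stop would still be grouped together.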

Some examples of identical trips:
B74:
UP_A5-Weekday-SDon-086500_B74_605, UP_A5-Weekday-SDon-086500_B6_204, UP_A5-Weekday-SDon-086500_B6_281, UP_A5-Weekday-SDon-086500_B6_206

and 

Q22:
41914478-FRPA5-FR_A5-Weekday-10-SDon, 41914490-FRPA5-FR_A5-Weekday-10-SDon, 41914500-FRPA5-FR_A5-Weekday-10-SDon, 41914714-FRPA5-FR_A5-Weekday-10-SDon


There are also trips where the majority of stop times are identical, but one or more stop times differ by anywhere from a few seconds to a couple of minutes. Example trip ids:

B8 - 50th stop time shifts by 11 seconds:
JG_A5-Weekday-147400_B8_150 and JG_A5-Weekday-SDon-147400_B8_150

Q54 - all stop times shift by 1:50 after the 7th stop:
FP_A5-Weekday-121700_Q54_725 and FP_A5-Weekday-SDon-121700_Q58_880
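One way to find pairs like these is to compare two trips stop by stop and report the time offset at each stop. The sketch below is only an illustration: the function names, the 120-second tolerance, and the toy stop ids and times are all made up, and each trip is assumed to be a list of (stop_id, arrival_time) pairs already sorted by stop_sequence.

```python
def to_seconds(hms):
    """Convert a GTFS HH:MM:SS time to seconds (hours may exceed 23)."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def stop_time_diffs(trip_a, trip_b):
    """Return [(position, stop_id, delta_seconds)] for stops whose times differ.

    Returns None when the two trips do not visit the same stops in the
    same order, since they are then not comparable stop by stop.
    """
    if [stop for stop, _ in trip_a] != [stop for stop, _ in trip_b]:
        return None
    diffs = []
    for pos, ((stop, time_a), (_, time_b)) in enumerate(zip(trip_a, trip_b), start=1):
        delta = to_seconds(time_b) - to_seconds(time_a)
        if delta != 0:
            diffs.append((pos, stop, delta))
    return diffs

def is_near_duplicate(trip_a, trip_b, tolerance_s=120):
    """True when both trips share a stop pattern and every offset is within tolerance."""
    diffs = stop_time_diffs(trip_a, trip_b)
    return diffs is not None and all(abs(d) <= tolerance_s for _, _, d in diffs)

# Toy trips (hypothetical): the second stop is 11 seconds later in TRIP_B.
TRIP_A = [("S1", "08:00:00"), ("S2", "08:05:00"), ("S3", "08:10:00")]
TRIP_B = [("S1", "08:00:00"), ("S2", "08:05:11"), ("S3", "08:10:00")]

print(stop_time_diffs(TRIP_A, TRIP_B))
print(is_near_duplicate(TRIP_A, TRIP_B))
```

The tolerance is a judgment call; 120 seconds is used here only because it would cover both the 11-second and the 1:50 offsets mentioned above.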


Thank you



Stephen Bauman

1:03 PM
to mtadeveloperresources
What's the date on the datasets you are working with? I've archived one set that was posted on the website on 2024-12-31 and another on 2025-02-15. Both are the complete 6 sets for the boroughs and MTA Bus.

This may not apply to your study, but you have to be careful when combining the datasets into a single one. I've noticed location discrepancies for the same bus stop between different datasets, so I've had to modify the stop id to include the dataset. This is probably a bigger problem in Queens, which is served by both NYCT and MTA Bus. As per the GTFS spec, IDs have to be unique (have unique properties) within the same GTFS. There is no guarantee among different GTFS schedules. I've been burned in the past.
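A minimal sketch of the stop id change described above, in plain Python. The helper name, the `dataset:stop_id` format, the dataset labels, and the toy stop rows are all assumptions for illustration, not the actual scheme or data.

```python
def namespace_ids(rows, dataset, id_fields=("stop_id",)):
    """Return copies of rows with each id field prefixed by the dataset label."""
    out = []
    for row in rows:
        new_row = dict(row)  # leave the caller's rows untouched
        for field in id_fields:
            if new_row.get(field):
                new_row[field] = f"{dataset}:{new_row[field]}"
        out.append(new_row)
    return out

# Toy stops.txt rows from two feeds (hypothetical ids and coordinates):
# same stop_id, slightly different location.
FEED_A_STOPS = [{"stop_id": "1001", "stop_lat": "40.7000", "stop_lon": "-73.8000"}]
FEED_B_STOPS = [{"stop_id": "1001", "stop_lat": "40.7002", "stop_lon": "-73.8001"}]

MERGED = (namespace_ids(FEED_A_STOPS, "feed_a")
          + namespace_ids(FEED_B_STOPS, "feed_b"))
print([row["stop_id"] for row in MERGED])
```

The same prefix would need to be applied to stop_id in stop_times.txt (and to any other file that references it) so the references still resolve after merging.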

Steve