Duplicate/almost identical trips in GTFS Bus Data from Feb, 2025

elif ensari

May 13, 2026, 10:19:33 AM
to mtadeveloperresources
Hi, 
We are working with the GTFS bus schedules from February 2025 and are finding several Weekday trip ids representing identical or very similar trips. 

Why do these trips repeat, and is there an easy way to clean these up based on trip id or other data? Currently, we drop duplicates based on first departure, last arrival, number of stops, direction and route.
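The duplicate check described above can be sketched roughly as follows. This is only an illustration, assuming `trips.txt` and `stop_times.txt` have already been read into lists of dicts (for example with `csv.DictReader`); the function names and the sample rows (`T1`, `T2`, `T3`, route `R`) are made up. It groups matching trips instead of deleting them, so each group can be inspected before anything is dropped.

```python
from collections import defaultdict


def gtfs_seconds(hms):
    """Convert a GTFS HH:MM:SS time (hours may exceed 23) to seconds."""
    h, m, s = (int(part) for part in hms.strip().split(":"))
    return h * 3600 + m * 60 + s


def trip_signatures(trips_rows, stop_time_rows):
    """Map trip_id -> (route, direction, first departure, last arrival, stop count)."""
    stops = defaultdict(list)
    for row in stop_time_rows:
        stops[row["trip_id"]].append(
            (int(row["stop_sequence"]),
             gtfs_seconds(row["arrival_time"]),
             gtfs_seconds(row["departure_time"]))
        )
    signatures = {}
    for trip in trips_rows:
        seq = sorted(stops.get(trip["trip_id"], []))
        if not seq:
            continue  # trip without stop_times; ignored in this sketch
        signatures[trip["trip_id"]] = (
            trip["route_id"], trip.get("direction_id", ""),
            seq[0][2], seq[-1][1], len(seq),
        )
    return signatures


def group_duplicates(signatures):
    """Return sorted lists of trip_ids that share a signature (groups of 2+)."""
    groups = defaultdict(list)
    for trip_id, sig in signatures.items():
        groups[sig].append(trip_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Made-up sample rows: T1 and T2 match on every key, T3 runs an hour later.
trips = [{"trip_id": t, "route_id": "R", "direction_id": "0"}
         for t in ("T1", "T2", "T3")]
stop_times = []
for trip_id, first, last in (("T1", "08:00:00", "08:10:00"),
                             ("T2", "08:00:00", "08:10:00"),
                             ("T3", "09:00:00", "09:10:00")):
    stop_times.append({"trip_id": trip_id, "stop_sequence": "1",
                       "arrival_time": first, "departure_time": first})
    stop_times.append({"trip_id": trip_id, "stop_sequence": "2",
                       "arrival_time": last, "departure_time": last})

duplicate_groups = group_duplicates(trip_signatures(trips, stop_times))
print(duplicate_groups)
```

Because the key only uses the first and last times, trips that differ at intermediate stops would still land in the same group.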

Some examples of identical trips:
B74:
UP_A5-Weekday-SDon-086500_B74_605, UP_A5-Weekday-SDon-086500_B6_204, UP_A5-Weekday-SDon-086500_B6_281, UP_A5-Weekday-SDon-086500_B6_206

and 

Q22:
41914478-FRPA5-FR_A5-Weekday-10-SDon, 41914490-FRPA5-FR_A5-Weekday-10-SDon, 41914500-FRPA5-FR_A5-Weekday-10-SDon, 41914714-FRPA5-FR_A5-Weekday-10-SDon


There are trips where a majority of stop times are identical but there are a few seconds of difference for one or more stop times. Example trip ids:

B8 - 50th stop time shifts by 11 seconds:
JG_A5-Weekday-147400_B8_150 and JG_A5-Weekday-SDon-147400_B8_150

Q54 - all stop times shift by 1:50 after the 7th stop:
FP_A5-Weekday-121700_Q54_725 and FP_A5-Weekday-SDon-121700_Q58_880
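Shifts like these can be located by lining up two trips stop by stop. The following is a minimal sketch, assuming each trip has been reduced to a list of `(stop_sequence, stop_id, arrival_time)` tuples; the helper name `stop_time_shifts` and the three-stop sample trips are hypothetical, not taken from the feed.

```python
def gtfs_seconds(hms):
    """Convert a GTFS HH:MM:SS time (hours may exceed 23) to seconds."""
    h, m, s = (int(part) for part in hms.strip().split(":"))
    return h * 3600 + m * 60 + s


def stop_time_shifts(trip_a, trip_b):
    """Return (position, stop_id, shift_in_seconds) for every stop whose
    arrival time differs between the two trips.

    Each trip is a list of (stop_sequence, stop_id, arrival_time) tuples.
    """
    a, b = sorted(trip_a), sorted(trip_b)
    if [stop[1] for stop in a] != [stop[1] for stop in b]:
        raise ValueError("trips do not serve the same stops in the same order")
    shifts = []
    for position, (row_a, row_b) in enumerate(zip(a, b), start=1):
        delta = gtfs_seconds(row_b[2]) - gtfs_seconds(row_a[2])
        if delta != 0:
            shifts.append((position, row_a[1], delta))
    return shifts


# Made-up three-stop trips: only the second stop differs, by 11 seconds.
trip_a = [(1, "S1", "08:00:00"), (2, "S2", "08:05:00"), (3, "S3", "08:10:00")]
trip_b = [(1, "S1", "08:00:00"), (2, "S2", "08:05:11"), (3, "S3", "08:10:00")]
shifts = stop_time_shifts(trip_a, trip_b)
print(shifts)
```

An empty result means the two trips have identical arrival times at every stop.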


Thank you



Stephen Bauman

May 13, 2026, 1:03:00 PM
to mtadeveloperresources
What's the date on the datasets you are working with? I've archived one set that was posted on the website on 2024-12-31 and another on 2025-02-15. Both dates have the complete 6 sets for the boroughs and MTA Bus.

This may not apply to your study, but you have to be careful when combining the datasets into a single one. I've noticed location discrepancies for the same bus stop between different datasets. I've had to modify the stop id to include the dataset. This is probably a bigger problem in Queens, which is served by both NYCT and MTA Bus. As per the GTFS spec, IDs have to be unique (have unique properties) within the same GTFS. There is no guarantee across different GTFS schedules. I've been burned in the past.
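A rough sketch of that workaround, assuming each feed's `stops.txt` has been loaded into a list of dicts: one helper flags stop ids whose coordinates disagree between feeds, the other prefixes the stop id with a feed label. The feed labels (`queens`, `mtabc`), the `:` separator, and the sample coordinates are all invented for illustration.

```python
def conflicting_stop_ids(feeds):
    """Return stop_ids that appear in several feeds with different coordinates.

    feeds maps a feed label to that feed's stops.txt rows (list of dicts).
    """
    seen = {}          # stop_id -> first (lat, lon) encountered
    conflicts = set()
    for rows in feeds.values():
        for row in rows:
            location = (row["stop_lat"], row["stop_lon"])
            first = seen.setdefault(row["stop_id"], location)
            if first != location:
                conflicts.add(row["stop_id"])
    return conflicts


def namespace_stop_ids(feed_label, rows):
    """Copy rows, rewriting stop_id as '<feed_label>:<stop_id>'."""
    return [{**row, "stop_id": f"{feed_label}:{row['stop_id']}"} for row in rows]


# Made-up feeds: stop "100" has a slightly different latitude in each.
feeds = {
    "queens": [{"stop_id": "100", "stop_lat": "40.7000", "stop_lon": "-73.8000"}],
    "mtabc": [{"stop_id": "100", "stop_lat": "40.7001", "stop_lon": "-73.8000"},
              {"stop_id": "200", "stop_lat": "40.7100", "stop_lon": "-73.8100"}],
}
conflicts = conflicting_stop_ids(feeds)
merged_stops = [row for label, rows in feeds.items()
                for row in namespace_stop_ids(label, rows)]
print(conflicts, [row["stop_id"] for row in merged_stops])
```

The same prefix would have to be applied wherever a stop id is referenced, such as the `stop_id` column of `stop_times.txt`.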

Steve

Jayden Lin

May 13, 2026, 6:58:58 PM
to mtadeveloperresources
(If the name seems familiar: I took another look at the trips and found the actual reason why.)

Those identical B74 and Q22 trips are actual separate bus trips, all run at the same time. They are what the MTA refers to as school trippers, and they only run on school days. There are 4 of each because 4 buses start from the first stop, which is a school, so these are not duplicate trips but separate trips.

In both the B8 and Q54 examples, the two trips seem the same but are technically different. The ones with "A5-Weekday" are the trips operated when school is closed. They may look the same, but internally they belong to a different schedule from the ones with "A5-Weekday-SDon", which are the trips operated when school is open. Traffic conditions vary depending on whether it's a school day, and the schedule reflects that to some extent; there are also additional trips on school days that do not run when school is closed.
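As a small illustration, the two schedules can be told apart by looking for an "SDon" token in the trip id. This relies on the naming pattern seen in this thread, which is an observed convention and not a documented format; the helper names are hypothetical. In GTFS itself, the `service_id` column of `trips.txt` together with `calendar.txt` and `calendar_dates.txt` is what formally defines the days a trip runs.

```python
def has_sdon_tag(trip_id):
    """True if the trip id contains 'SDon' as a hyphen-separated token."""
    return "SDon" in trip_id.split("-")


def split_by_school_tag(trip_ids):
    """Partition trip ids into (school-open, other) lists, keeping order."""
    school_open = [t for t in trip_ids if has_sdon_tag(t)]
    other = [t for t in trip_ids if not has_sdon_tag(t)]
    return school_open, other


# Trip ids quoted earlier in this thread.
sample_ids = [
    "UP_A5-Weekday-SDon-086500_B74_605",
    "41914478-FRPA5-FR_A5-Weekday-10-SDon",
    "JG_A5-Weekday-147400_B8_150",
    "JG_A5-Weekday-SDon-147400_B8_150",
]
school_open, other = split_by_school_tag(sample_ids)
print(school_open)
print(other)
```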

elif ensari

Jun 1, 2026, 3:12:10 PM
to mtadeveloperresources
Thank you Stephen,
Yes, we did find those identical stops with different locations too.

elif ensari

2:43 PM
to mtadeveloperresources
I have one follow-up question to clarify:

If I want to take into account all the trips on a school day, should I look only at trips with an -SDon- tag? Or should I keep the unique trips (those that don't have identical routes and stop times) but get rid of any trip that is identical to one with an "-SDon-" tag and doesn't have "-SDon-" in its own trip id?
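For concreteness, the second option could look something like the sketch below. It assumes each trip has been reduced to a pattern of `(route_id, direction_id, stop times)` and that "identical" means an exact match on that pattern; the function name and sample trip ids are made up. Whether this or the first option is the right filter depends on how the feed's calendar treats the non-SDon service on school days, which is not settled here.

```python
def drop_untagged_twins(trips):
    """Keep every SDon trip, plus untagged trips with no identical SDon twin.

    trips maps trip_id -> (route_id, direction_id, stop_times), where
    stop_times is a tuple of (stop_id, arrival_time, departure_time).
    """
    def tagged(trip_id):
        return "SDon" in trip_id.split("-")

    sdon_patterns = {pattern for trip_id, pattern in trips.items() if tagged(trip_id)}
    return sorted(
        trip_id for trip_id, pattern in trips.items()
        if tagged(trip_id) or pattern not in sdon_patterns
    )


# Made-up trips: X-Weekday-1 is an exact twin of the SDon trip, X-Weekday-2 is not.
morning = (("S1", "08:00:00", "08:00:00"), ("S2", "08:10:00", "08:10:00"))
midday = (("S1", "12:00:00", "12:00:00"), ("S2", "12:10:00", "12:10:00"))
sample_trips = {
    "X-Weekday-SDon-1": ("B8", "0", morning),
    "X-Weekday-1": ("B8", "0", morning),
    "X-Weekday-2": ("B8", "0", midday),
}
kept = drop_untagged_twins(sample_trips)
print(kept)
```

With exact matching, an untagged trip that differs from its SDon counterpart by even a few seconds at one stop would be kept.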

Thank you 