Data Quality in GTFS-ride


Phillip Ryan Carleton

Sep 19, 2018, 8:08:15 PM
to GTFS-ride

Greetings consortium members and transit stakeholders,

This thread focuses on the issues surrounding data quality and its implications for the GTFS-ride standard. More specifically, please comment if you have information or opinions related to the following questions:

  • Do you (or your organization) have concerns about the quality of your collected (raw) ridership data?
  • If concerns do exist, what would be needed to help your organization feel comfortable with publishing ridership data in the GTFS-ride standard? For example, a categorical field could be added to the standard to describe the level of data quality (e.g., “uncorrected raw data”, “complete data”, “incomplete data”, “complete data w/inferences”, etc.).
  • Will having ridership data in GTFS-ride help your organization facilitate quality assurance and/or quality control procedures?
  • What types of additional metadata regarding ridership data quality do you feel are needed for GTFS-ride? (The current file ride_feed_info.txt and the fields record_use and schedule_relationship in board_alight.txt could be considered metadata already in the GTFS-ride standard.)
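
To make the categorical-field idea in the second question concrete, here is a minimal Python sketch of how such a field might be checked. The column name data_quality and the allowed values are assumptions taken from the examples in the question; neither is part of the GTFS-ride standard, and the sample rows use a simplified, made-up subset of board_alight.txt columns.

```python
import csv
import io

# Hypothetical category values, copied from the examples in the question above.
# Neither the field name nor these values are defined by the GTFS-ride standard.
DATA_QUALITY_LEVELS = {
    "uncorrected raw data",
    "complete data",
    "incomplete data",
    "complete data w/inferences",
}


def invalid_quality_rows(csv_text, field="data_quality"):
    """Return (line_number, value) for rows whose quality label is not recognized."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        value = (row.get(field) or "").strip()
        if value not in DATA_QUALITY_LEVELS:
            problems.append((line_number, value))
    return problems


# Made-up sample rows for illustration only.
sample = (
    "trip_id,stop_id,boardings,alightings,data_quality\n"
    "t1,s1,4,0,complete data\n"
    "t1,s2,2,1,estimated\n"
)
print(invalid_quality_rows(sample))  # [(3, 'estimated')]
```

A closed set of values like this would let data consumers filter records by quality level, at the cost of requiring agreement on what each label means.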


The GTFS-ride project team highly values your feedback and feels these discussion threads will help engage the stakeholder community and further the development of the GTFS-ride data standard. Therefore, please feel free to provide any feedback or comments that you think are relevant to this topic. If you have a specific concern or recommendation for the standard, we invite you to start an issue or pull request in the GitHub repository.


Sep 20, 2018, 6:48:20 PM
to GTFS-ride
  • Our GTFS-ride feed had a very small percentage of boardings data dropped in the conversion process. I have very minor concerns about releasing data that likely won't match our NTD report (if a whole year were released at a time).
  • I don't think it's a big deal, but it is something to be aware of. Data consistency is important when releasing information to the public.
  • I'm not sure how it would at this point.
  • I'm not sure what else would be needed at this point.
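
The mismatch described in the first point above can be expressed as the share of boardings lost in conversion. The following is a rough Python sketch; the function name and the totals are made up for illustration and do not come from any agency's data.

```python
def dropped_share(source_boardings, feed_boardings):
    """Fraction of source boardings missing from the converted feed."""
    if source_boardings <= 0:
        raise ValueError("source_boardings must be positive")
    return (source_boardings - feed_boardings) / source_boardings


# Made-up totals for illustration only.
share = dropped_share(source_boardings=1_000_000, feed_boardings=995_000)
print(f"{share:.2%}")  # 0.50%
```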
