Data Quality in GTFS-ride

Phillip Ryan Carleton

Sep 19, 2018, 8:08:15 PM

to GTFS-ride

Greetings consortium members and transit stakeholders,

This thread focuses on the issues surrounding data quality and its implications for the GTFS-ride standard. More specifically, please comment if you have information or opinions related to the following questions:

1. Do you (or your organization) have concerns about the quality of your collected (raw) ridership data?
2. If concerns do exist, what would be needed to help your organization feel comfortable with publishing ridership data in the GTFS-ride standard? For example, a categorical field could be added to the standard to describe the level of data quality (e.g., “uncorrected raw data”, “complete data”, “incomplete data”, “complete data w/inferences”).
3. Will having ridership data in GTFS-ride help your organization facilitate quality assurance and/or quality control procedures?
4. What types of additional metadata regarding ridership data quality do you feel are needed for GTFS-ride? (The current file ride_feed_info.txt and the fields record_use and schedule_relationship in board_alight.txt could be considered metadata currently in the GTFS-ride standard.)
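
To make the categorical-field idea more concrete, here is a minimal Python sketch of how a consumer might read such a field. Everything specific in it is an assumption for illustration only: the column name `data_quality`, the numeric codes, and the choice to place the field in ride_feed_info.txt are hypothetical and not part of the current GTFS-ride standard. The category labels are the examples listed above.

```python
import csv
import io
from enum import Enum


class DataQuality(Enum):
    """Hypothetical codes for a data-quality field (not in the standard)."""
    UNCORRECTED_RAW = "0"           # "uncorrected raw data"
    COMPLETE = "1"                  # "complete data"
    INCOMPLETE = "2"                # "incomplete data"
    COMPLETE_WITH_INFERENCES = "3"  # "complete data w/inferences"


def parse_data_quality(feed_info_text: str) -> DataQuality:
    """Read the hypothetical data_quality column from the first data row."""
    reader = csv.DictReader(io.StringIO(feed_info_text))
    row = next(reader, None)
    if row is None:
        raise ValueError("no data row found")
    # Enum lookup by value raises ValueError for an unknown code.
    return DataQuality(row["data_quality"])


# Illustrative input only; a real ride_feed_info.txt has other columns.
sample = "data_quality\n3\n"
quality = parse_data_quality(sample)
```

A closed set of codes like this would let a data consumer filter or flag feeds by quality level without parsing free text, at the cost of having to agree on the categories up front.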

The GTFS-ride project team highly values your feedback and feels these discussion threads will help engage the stakeholder community and further the development of the GTFS-ride data standard. Therefore, please feel free to provide any feedback or comments that you think are relevant to this topic. If you have a specific concern or recommendation for the standard, we invite you to start an issue or pull request in the GitHub repository.

andrew.martin

Sep 20, 2018, 6:48:20 PM

to GTFS-ride

1. Our GTFS-ride feed had a very small percentage of boardings data dropped in the conversion process. I have small concerns about releasing data that likely won't match our NTD report (if a whole year were released at a time). These concerns are very minor.
2. I don't think it's really a big deal, but it is something to be aware of. Data consistency is important when releasing information to the public.
3. I'm not sure how it would at this point.
4. I'm not sure what else would be needed at this point.
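
One way to quantify the dropped boardings mentioned above is to compare the feed's total against a reference total such as the figure used for the NTD report. The Python sketch below is only an illustration under stated assumptions: it assumes the feed's board_alight.txt has a column named `boardings`, the helper names are hypothetical, and the numbers are made up for the example rather than taken from any real feed.

```python
import csv
import io


def total_boardings(board_alight_text: str, column: str = "boardings") -> int:
    """Sum a boardings column, skipping rows where the value is blank."""
    reader = csv.DictReader(io.StringIO(board_alight_text))
    total = 0
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:
            total += int(value)
    return total


def dropped_share(reference_total: int, feed_total: int) -> float:
    """Fraction of the reference total that is missing from the feed."""
    if reference_total <= 0:
        raise ValueError("reference_total must be positive")
    return (reference_total - feed_total) / reference_total


# Made-up rows for illustration; the second row has no boardings value.
sample = "trip_id,stop_id,boardings\nT1,S1,4\nT1,S2,\nT1,S3,6\n"
feed_total = total_boardings(sample)

# Made-up reference total standing in for an externally reported figure.
share = dropped_share(reference_total=1000, feed_total=990)
```

Publishing a figure like this alongside the feed would let readers see how far the feed is expected to differ from other reported totals.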
