I've been making some changes to the feed validator to warn about
unusual syntax that is not interpreted in the same way by all parsers.
Each of these changes was motivated by one or more problematic GTFS
files that I've seen.
One change adds a warning if a file contains a header such as
stop_id, stop_name, stop_lat , stop_lon
The spaces before the names are okay, but the space after stop_lat is
considered by many parsers, including Excel, to be part of the value.
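
To illustrate the kind of check described above, here is a minimal sketch in
Python. It is not the validator's actual code; the function name
find_headers_with_trailing_space is made up for this example, and it only
looks at the header line.

```python
import csv


def find_headers_with_trailing_space(header_line):
    """Return the raw header fields that end in whitespace.

    Hypothetical helper for illustration only, not part of the feed
    validator. Leading spaces are tolerated; trailing spaces are reported.
    """
    # Parse the single header line as CSV so quoted names are handled.
    fields = next(csv.reader([header_line.rstrip("\r\n")]))
    return [field for field in fields if field != field.rstrip()]


if __name__ == "__main__":
    header = "stop_id, stop_name, stop_lat , stop_lon\n"
    for field in find_headers_with_trailing_space(header):
        print("Warning: header %r has trailing whitespace" % field)
```

With the header above, only the stop_lat column is reported, since the spaces
before stop_name and stop_lon come before the name rather than after it.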
Another change checks that each line ends in CRLF or LF.
Files created on very old Apple computers, which used a bare CR as the
line ending, will need to be converted.
It also warns about some hard-to-find corruption issues.
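
A rough sketch of how a line-ending check like this could work, again in
Python. This is an assumption about the approach rather than the validator's
real implementation, and the name find_bare_cr_lines is hypothetical. It does
not try to handle quoted fields that legitimately contain line breaks.

```python
def find_bare_cr_lines(data):
    """Return 1-based numbers of lines containing a CR not followed by LF.

    Hypothetical helper for illustration only. `data` is the raw file
    content as bytes, so no newline translation has happened yet.
    """
    problems = []
    # Split on LF only, so a bare CR stays visible inside a "line".
    lines = data.split(b"\n")
    for number, line in enumerate(lines, start=1):
        is_last = number == len(lines)
        # Every piece except the last was followed by LF, so a CR at its
        # end is half of a valid CRLF pair and can be dropped.
        if not is_last and line.endswith(b"\r"):
            line = line[:-1]
        if b"\r" in line:
            problems.append(number)
    return problems


if __name__ == "__main__":
    old_mac_style = b"stop_id,stop_name\rS1,Main St\r"
    print(find_bare_cr_lines(old_mac_style))
```

A file that uses only CR shows up as one long "line" with CRs inside it, which
is why the whole file gets flagged at line 1.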
Before making these changes I checked all the feeds we have at Google
to see whether the new checks would cause widespread problems, and they
didn't seem to.
Most or all of these problematic files were created when people
modified files by hand, not by tools that systematically interpret the
format differently.
Directly refer to http://tools.ietf.org/html/rfc4180 as the expected
format for GTFS CSV files. There are a couple of deviations from RFC 4180
common in GTFS files:
1) there are often one or more space characters after the comma
between fields. Tools that parse GTFS should skip these spaces.
2) some GTFS files start with a utf-8 byte order marker which parsers