Semicolon as comment indicator

Yuriy Yakimenko

Sep 18, 2008, 3:18:29 PM
to Google Transit Feed Spec Changes
I don't see that GTFS allows comments in the data files. Sometimes
it's useful to have a line in a .txt (essentially, a text-based table)
commented out but not removed. How about treating a semicolon as the
first character of a line as a comment indicator, as is done in some
scripting languages?

If the semicolon is the actual first character of the first field, the
value can be put inside quotes.
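
The rule proposed above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the semicolon marker is the proposal, not part of GTFS, and `read_rows` and `COMMENT_CHAR` are hypothetical names introduced here.

```python
import csv
import io

# Assumption: ";" is the proposed comment marker; GTFS itself defines none.
COMMENT_CHAR = ";"


def read_rows(text_stream):
    """Yield CSV rows, skipping lines whose first character is COMMENT_CHAR.

    Limitation of this sketch: the filter works on physical lines, so a
    quoted field that spans several lines and has a continuation line
    starting with ";" would be dropped by mistake.
    """
    data_lines = (
        line for line in text_stream if not line.startswith(COMMENT_CHAR)
    )
    yield from csv.reader(data_lines)


sample = (
    "stop_id,stop_name\n"
    ";S1,Commented-out stop\n"
    '";S2",Stop whose id starts with a semicolon\n'
    "S3,Main St\n"
)

rows = list(read_rows(io.StringIO(sample, newline="")))
for row in rows:
    print(row)
```

The second data line is dropped as a comment, while the quoted `";S2"` survives because its line starts with a quote character rather than a semicolon.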

Joe Hughes

Sep 18, 2008, 3:31:35 PM
to Google Transit Feed Spec Changes
Thanks for the proposal, Yuriy. Comments in the files could certainly
be useful in some cases.

The main question for me is whether common CSV parser libraries (and
apps like Excel, OpenOffice, etc.) would do the right thing when
encountering a semicolon at the beginning of the line. Are you
willing to do some research and share the results with the group?

Thanks,
Joe

Yuriy Yakimenko

Sep 18, 2008, 5:33:16 PM
to gtfs-c...@googlegroups.com
I will check whether some software (like R, a statistical package) that
works with .csv understands it.

Yuriy

Yuriy Yakimenko

Sep 20, 2008, 3:38:14 PM
to gtfs-c...@googlegroups.com
It looks like comments are not supported in csv files. Excel does not
understand them. I found proposals to include a comment symbol (#) as
I searched on the subject. At the same time, the .txt files in GTFS
are not specified as CSV.

Yuriy

Tom Brown

Sep 20, 2008, 4:11:17 PM
to gtfs-c...@googlegroups.com
On Sat, Sep 20, 2008 at 12:38, Yuriy Yakimenko <inth...@gmail.com> wrote:

> It looks like comments are not supported in csv files. Excel does not
> understand them. I found proposals to include a comment symbol (#) as
> I searched on the subject. At the same time, the .txt files in GTFS
> are not specified as CSV.

The file names don't end in .csv, but the spec explicitly says they are like Excel CSV files:

http://code.google.com/transit/spec/transit_feed_specification.html
"This is consistent with the manner in which Microsoft Excel outputs comma-delimited (CSV) files. For more information on the CSV file format,..."

As an alternate solution to your problem, it is not uncommon to find stray files such as "stop-old.txt" lying around in GTFS zips.

Yuriy Yakimenko

Sep 20, 2008, 5:08:32 PM
to Google Transit Feed Spec Changes
I guess I'll implement this comment logic for my conversion utility
only for now. I need it for debugging purposes. If anyone finds an
"official" comment syntax for CSV, please post it here.

Joe Hughes

Sep 20, 2008, 5:18:39 PM
to gtfs-c...@googlegroups.com
Yuriy,

I think the core of this decision is whether, on the whole, it would
make it easier or harder to work with GTFS files.

As you mentioned, adding comments allows you to document files (though
it's likely that most GTFS files are generated by scripts and
exports), and to temporarily comment out some of the data.

On the other hand, it would be unfortunate if comments interfered with
the operation of the tools that are commonly used to work with GTFS
data:
1) CSV libraries in languages such as Python, Java, and C++
2) off-the-shelf software like Excel, MS Access, and other databases

It would be interesting to hear more about what tools the members of
this group use to work with GTFS data. I'd love to hear some details
on this (and thoughts on whether others are interested in using
comments in their data).

In the GoogleTransitDataFeed open-source project, we've used the
Python "csv" package (though we do some pre-processing to handle
character encoding issues). Joachim Pfeiffer's Java code appears to
just write out CSV using raw string concatenation, in the
TransxchangeHandlerEngine class. Both of those pieces of code could
probably be made to work with comments without much effort.

Joe
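
On the writing side, the quoting rule from the original proposal matters too: with its default minimal quoting, Python's `csv.writer` does not quote a field merely because it starts with a semicolon, so such a row would look like a comment to a comment-aware reader. A minimal sketch of one way to handle that, assuming the proposed semicolon marker (`write_rows` and `COMMENT_CHAR` are hypothetical names, not part of any existing GTFS tool):

```python
import csv
import io

# Assumption: ";" is the proposed comment marker; GTFS itself defines none.
COMMENT_CHAR = ";"


def write_rows(stream, rows):
    """Write rows as CSV, quoting any row whose first field starts with ";".

    Rows that could be mistaken for a comment are written with every field
    quoted, so the line begins with a quote character instead of ";".
    All other rows use the csv module's default minimal quoting.
    """
    plain = csv.writer(stream, lineterminator="\n")
    quoted = csv.writer(stream, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for row in rows:
        first = str(row[0]) if row else ""
        writer = quoted if first.startswith(COMMENT_CHAR) else plain
        writer.writerow(row)


buf = io.StringIO()
write_rows(
    buf,
    [
        ["stop_id", "stop_name"],
        [";S2", "Odd id"],
        ["S3", "Main St"],
    ],
)
output = buf.getvalue()
print(output, end="")
```

Quoting the whole row is a simplification; quoting only the first field would also satisfy the rule but needs hand-written escaping, which is the kind of raw string handling that is easy to get wrong.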
