On Mon, Nov 23, 2009 at 1:49 PM, Martin Akerman <make...@gmail.com> wrote:
> When I say on-demand, I mean agencies will be publishing as many times
> as they are queried. A simple php, perl, java or ruby script can
> create the text files and zip them on-demand so that a package is
> fresh and hot off the press every time google or anybody fetches the
> files. I don't know of any agencies doing that at the moment but maybe
> somebody else here does.
>
In that sense of on-demand, gtfs-data-exchange does md5 checks on the
zip files when retrieved to ensure uniqueness (i.e., you can't currently
re-upload the same file multiple times), and it will probably do md5s
on the contained files in the future. (Also, as far as on-demand
publishing goes, it (to me) almost indicates an even stronger need for
a developer to know when the underlying data changes: do I refresh my
system every month, day, hour, minute?)
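To make the zip-vs-contents distinction concrete, here is a rough
sketch in Python. It is not the actual gtfs-data-exchange code; the
function names (zip_md5, contents_md5, build_feed) and the sample feed
are made up for illustration. It assumes an on-demand script stamps
each file with the time of the request, so the zip bytes change on
every fetch even when the schedule data does not.

```python
import hashlib
import io
import zipfile


def zip_md5(data: bytes) -> str:
    """MD5 of the raw zip bytes (hypothetical helper)."""
    return hashlib.md5(data).hexdigest()


def contents_md5(data: bytes) -> str:
    """MD5 over member names and member contents, in sorted name order.

    Timestamps and other zip metadata are ignored, so two archives
    holding the same files produce the same fingerprint.
    """
    digest = hashlib.md5()
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in sorted(archive.namelist()):
            digest.update(name.encode("utf-8"))
            digest.update(b"\0")  # separator between name and content hash
            digest.update(hashlib.md5(archive.read(name)).digest())
    return digest.hexdigest()


def build_feed(files: dict, timestamp: tuple) -> bytes:
    """Simulate an on-demand export: zip the files, stamped with `timestamp`."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, text in files.items():
            info = zipfile.ZipInfo(name, date_time=timestamp)
            archive.writestr(info, text)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample data, fetched on two different days.
    feed = {"agency.txt": "agency_id,agency_name\n1,Example Transit\n"}
    monday = build_feed(feed, (2009, 11, 23, 13, 49, 0))
    tuesday = build_feed(feed, (2009, 11, 24, 13, 49, 0))

    print("zip md5 equal:     ", zip_md5(monday) == zip_md5(tuesday))
    print("contents md5 equal:", contents_md5(monday) == contents_md5(tuesday))
```

With a check like contents_md5, a freshly regenerated package would
only count as new when one of the contained files actually changed.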
> Just to illustrate, I created an export tool for a different purpose
> in 2006 that shows a file generated on-demand.
>
> http://floridatransitindicators.org/detail.php?chart=5a -> Use "XML
> Data" and "Most Recent Excel Data" to see "XML-RPC" and "CSV
> on-demand" in action.
> It is only a little out of date because it has not been updated but I
> think it illustrates what I'm getting at.
>
> If what you are suggesting is a historical index, I'm for it. However,
> I'd be sure to get permission from the agencies you index.
> The indexing system you are speaking of would not hurt the publishing
> of new GTFS files and I can see the value to having an archive for
> historical information.
It's designed as both a historical archive and an index of sources;
however, with regard to replacing the PublicFeeds page, it's the
index-of-sources functionality that matters.
>
> Metrolink is a perfect example of an agency that may not want to be
> included in the index.
>
> http://www.metrolinktrains.com/tripplanner/schedule_data.php
> They have some rules before the package can be downloaded. They also
> state "Keep your work up to date. Check this page frequently and note
> when schedules are updated. Please don't distribute the raw files: We
> want to avoid out-of-date versions of schedules and other information
> being circulated".
>
gtfs-data-exchange does check for updated files daily, in keeping with
the goal of solving the out-of-date schedule problem and with the 'check
frequently' request. However, Metrolink contradicts itself on the terms
of use for those files, as the license that page points to clearly
says:
"... hereby grants you (Licensee) non-exclusive, limited and revocable
rights to use, reproduce, and redistribute SCRRA Data (Data)..."
which, of course, is the whole point of publishing schedule data to
developers in the first place: so that it can be redistributed and get
into the hands of riders.
> I like the web site you put together.
Thanks.
> I'd still like agencies to host their packages so to not interfere
> with the evolution of distribution of transit data.
Me too; I want agencies to be directly involved in publishing data,
and I don't want gtfs-data-exchange to be a required middleman.
> The future is most likely in some form of XML-RPC variation hosted at
> the agency and not in large CSVs.
I doubt many small agencies will ever move beyond static schedule
files, but yes, some larger agencies will be moving towards more
interactive API endpoints, especially with regard to realtime data;
but that's a different topic.