libroutez import status


William Lachance

Jan 28, 2009, 11:59:58 AM
to my...@googlegroups.com
Hey all,

So I finally got the graph imported into libroutez. It turns out that
the myttc feed broke several assumptions I had baked into the library
(notably: that trip ids are numbered 0-N with no gaps, and ditto for
route and stop ids), so I needed to rework some things to use a mapping
between internal libroutez ids and gtfs ids. (Actually, libroutez won't
use this mapping internally, but it will be necessary for turning the
directions it spits out into something sensible in a web application.)
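
For concreteness, here's a rough sketch in python of the kind of mapping
I mean. The class and method names are made up for illustration; this
isn't the actual libroutez interface:

class IdMap:
    """Assign consecutive internal ids (0..N-1) to arbitrary GTFS ids."""

    def __init__(self):
        self._to_internal = {}   # gtfs id -> internal id
        self._to_gtfs = []       # internal id -> gtfs id

    def intern(self, gtfs_id):
        """Return the internal id for gtfs_id, allocating one if needed."""
        if gtfs_id not in self._to_internal:
            self._to_internal[gtfs_id] = len(self._to_gtfs)
            self._to_gtfs.append(gtfs_id)
        return self._to_internal[gtfs_id]

    def gtfs(self, internal_id):
        """Map an internal id back to the original GTFS id."""
        return self._to_gtfs[internal_id]

So with trip_ids = IdMap(), trip_ids.intern("36167") might hand out 0,
and trip_ids.gtfs(0) gives back "36167" when it's time to turn a route
plan into readable directions.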

Incidentally, I think I need to write a replacement for processing gtfs
feeds. google's transit feed python library is really slow for large
datasets. :(
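
To give a sense of what a replacement could look like: for graph
building I really only need to stream the text files row by row,
something like the sketch below. It only uses the stdlib csv module and
the standard GTFS column names (nothing libroutez-specific), and it
glosses over details like blank times:

import csv

def iter_stop_times(path="stop_times.txt"):
    # Stream stop_times.txt one row at a time instead of building a full
    # in-memory schedule object first.
    with open(path) as f:
        for row in csv.DictReader(f):
            yield (row["trip_id"], int(row["stop_sequence"]),
                   row["arrival_time"], row["departure_time"],
                   row["stop_id"])

The importer could then walk iter_stop_times() and add edges to the
graph as it goes.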

Anyway, while parsing the feed using this library, I got a ton of
complaints exactly like this one (the value is always 664):

Invalid value 664 in field shape_id

And also some like this one (the sequence and trip numbers vary):

Timetravel detected! Arrival time is before previous departure at
sequence number 69 in trip 36167.
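
I'm thinking a standalone check along these lines would at least pin
down the offending rows. It's only a sketch: it assumes stop_times.txt
rows are grouped by trip and ordered by stop_sequence, with times
filled in on every row:

import csv

def to_secs(hhmmss):
    h, m, s = (int(x) for x in hhmmss.split(":"))
    return h * 3600 + m * 60 + s

def find_timetravel(path="stop_times.txt"):
    last_departure = {}  # trip_id -> departure (secs) at previous stop
    with open(path) as f:
        for row in csv.DictReader(f):
            trip = row["trip_id"]
            arrival = to_secs(row["arrival_time"])
            previous = last_departure.get(trip)
            if previous is not None and arrival < previous:
                print("timetravel in trip %s at stop_sequence %s" %
                      (trip, row["stop_sequence"]))
            last_departure[trip] = to_secs(row["departure_time"])

Any row it prints corresponds to a hop whose duration (arrival minus
the previous departure) comes out negative.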

Unfortunately I can't plot paths with the resulting graph yet, as there
are negative-weight edges in it (possibly/probably related to the
above). I'll need to figure out what's going on and fix that first. But
FWIW, a process which does nothing but keep the TTC dataset (+ GTA OSM)
in memory uses 324.1 megs of RAM.
--
William Lachance <wrl...@gmail.com>

Joe Hughes

Jan 28, 2009, 8:51:59 PM
to MyTTC
On Jan 28, 8:59 am, William Lachance <wrl...@gmail.com> wrote:
> So I finally got the graph imported into libroutez. It turns out that
> the myttc feed broke several assumptions I had baked into the library
> (notably: that trip ids are numbered 0-N with no gaps, and ditto for
> route and stop ids), so I needed to rework some things to use a mapping
> between internal libroutez ids and gtfs ids. (Actually, libroutez won't
> use this mapping internally, but it will be necessary for turning the
> directions it spits out into something sensible in a web application.)

BTW, have you tried running your library against many other bits of
publicly available GTFS data? There should be a pretty good diversity
in the data that's out there now.

> Incidentally, I think I need to write a replacement for processing gtfs
> feeds. google's transit feed python library is really slow for large
> datasets. :(

Patches welcome! :] Even if you don't feel like hacking the code,
it'd be worth discussing approaches on the GoogleTransitDataFeed
list. I know Tom Brown has been spending some time working on
optimizations and restructuring the object model to work better.

> Anyway, while parsing the feed using this library, I got a ton of
> complaints exactly like this one (the value is always 664):
>
> Invalid value 664 in field shape_id

Looks like there's no shape defined in shapes.txt with shape_id==664.
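
A quick way to confirm that, assuming the feed ships the standard
trips.txt and shapes.txt files (just a throwaway sketch on my part, not
part of the transitfeed validator):

import csv

def missing_shapes(trips_path="trips.txt", shapes_path="shapes.txt"):
    # shape_ids referenced from trips.txt but never defined in shapes.txt
    with open(shapes_path) as f:
        defined = set(row["shape_id"] for row in csv.DictReader(f))
    with open(trips_path) as f:
        referenced = set(row["shape_id"] for row in csv.DictReader(f)
                         if row.get("shape_id"))
    return referenced - defined

If my guess is right, missing_shapes() on the MyTTC feed would include
'664'.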

Joe

William Lachance

Jan 29, 2009, 5:30:54 PM
to my...@googlegroups.com
On Wed, 2009-01-28 at 17:51 -0800, Joe Hughes wrote:
> On Jan 28, 8:59 am, William Lachance <wrl...@gmail.com> wrote:
> > So I finally got the graph imported into libroutez. It turns out that
> > the myttc feed broke several assumptions I had baked into the library
> > (notably: that trip ids are numbered 0-N with no gaps, and ditto for
> > route and stop ids), so I needed to rework some things to use a mapping
> > between internal libroutez ids and gtfs ids. (Actually, libroutez won't
> > use this mapping internally, but it will be necessary for turning the
> > directions it spits out into something sensible in a web application.)
>
> BTW, have you tried running your library against many other bits of
> publicly available GTFS data? There should be a pretty good diversity
> in the data that's out there now.

To be honest, no. The MyTTC feed is the first one I've tried other than
my (in-development) feed for the city of Halifax. Long term, of course,
I want something that does a good job of handling everything out there.

> > Incidentally, I think I need to write a replacement for processing gtfs
> > feeds. google's transit feed python library is really slow for large
> > datasets. :(
>
> Patches welcome! :] Even if you don't feel like hacking the code,
> it'd be worth discussing approaches on the GoogleTransitDataFeed
> list. I know Tom Brown has been spending some time working on
> optimizations and restructuring the object model to work better.

Of course you're right. I did some poking around in the code and I
already see some low-hanging fruit. I'll try to find the time to email
the list about it tomorrow...
--
William Lachance <wrl...@gmail.com>
