TransXChange2GTFS with NaPTAN v2

Andrew Byrd

Apr 21, 2013, 9:55:13 AM
to googletran...@googlegroups.com
I am experimenting with converting the TfL data set to GTFS, with the intent to eventually convert data for other parts of the UK.

In the process, I ran into a couple of issues and patched the source code before realizing that these were known issues and patches were included in these existing tickets:
Issue 334: Transxchange2GoogleTransit doesn't support NaPTAN v2.x CSV
Issue 336: Transxchange2GoogleTransit has an ArrayOutOfBoundsException when NaPTAN BusStopTypes are empty

My patch (attached) is slightly different in that it detects whether quotes are present before attempting to strip them out, and should work with either v1 or v2 NaPTAN files, simply falling back on the v2 field name when the v1 field name is not present (as suggested in https://groups.google.com/d/msg/googletransitdatafeed/XoeWPfS-UCs/b9NVhUnusP4J). Tabs and spaces might be mixed because they were mixed in the original code and I wasn't sure which to use on this project.
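For readers without the attachment, here is a rough sketch of the two ideas described above: strip quotes only when they are actually present, and look up a column by its v1 header name before falling back to the v2 name. This is not the code from naptan_v2.patch. The class name, method names, and the header names used in the usage note are made up for illustration, and the real NaPTAN column names should be checked against the files themselves.

```java
import java.util.List;

// Illustrative sketch only; names here are hypothetical and do not come from the patch.
final class NaptanFieldSketch {

    // Remove one pair of surrounding double quotes, but only if both are present.
    static String stripQuotes(String field) {
        String s = field.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    // Return the index of the first candidate name found in the header row,
    // trying the candidates in the order given, or -1 if none is present.
    static int findColumn(List<String> header, String... candidateNames) {
        for (String name : candidateNames) {
            for (int i = 0; i < header.size(); i++) {
                if (stripQuotes(header.get(i)).equalsIgnoreCase(name)) {
                    return i;
                }
            }
        }
        return -1;
    }
}
```

A call such as `findColumn(header, "OldName", "NewName")` (placeholder names) would then prefer the first spelling and only use the second when the first is missing, so the same code path could serve both file versions.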

A couple of questions:
1. The currently available NaPTAN data seems to be version 2, so the converter available via SVN at http://code.google.com/p/googletransitdatafeed/ fails on current data and must be patched. Is the code available there still maintained, or has the main repo moved elsewhere?
2. Can anyone point me to some v1 NaPTAN (or just a Stops file) for testing purposes?

Thanks!
Andrew
naptan_v2.patch

JP

Apr 21, 2013, 11:39:00 AM
to GoogleTransitDataFeed
Hi Andrew,
The converter code on the Google Code page http://code.google.com/p/googletransitdatafeed/
is still being maintained. There is one fork that I am aware of, as
described here:
http://groups.google.com/group/googletransitdatafeed/browse_thread/thread/5e87963df4be502b#

There was an export of all of Great Britain in 2011, but that was
it. http://data.gov.uk/dataset/nptdr
That made me lose interest in pushing the converter code base forward, so
I did not pursue the integration of the changes documented in the
issue tracker. Switching over to the v2 CSV format, however, should be a
pretty straightforward exercise, so let me see if I can roll your
patches into the converter code base. I haven't reviewed them yet, but
thank you for the contribution!

On the radar... in the meantime, I've become aware of another source
for country wide data. Coincidentally, I put the paperwork in the mail
yesterday, so if everything checks out I think I will be compelled to
put work into the converter code base again.

JP


Andrew Byrd

Apr 21, 2013, 1:26:10 PM
to googletran...@googlegroups.com
On 04/21/2013 05:39 PM, JP wrote:
> Hi Andrew,
> The converter code on the Google Code page http://code.google.com/p/googletransitdatafeed/
> is still being maintained. There is one fork that I am aware of, as
> described here:
> http://groups.google.com/group/googletransitdatafeed/browse_thread/thread/5e87963df4be502b#

Thanks for the clarification. It does look like Daniel has already
addressed several of the issues I am running into.

> Switching over to v2 CSV format however should be a
> pretty straightforward exercise, so let me see if I can roll your
> patches into the converter code base. I haven't reviewed them yet, but
> thank you for the contribution!

I finished a conversion of the London data using the v2 NaPTAN, and
while the patch allows the conversion to run to completion without
errors, the stops.txt is missing many fields. This appears to be caused
(at least partly) by column misalignment: the fields are being split on
the 3-character sequence "," but some fields are not quoted in the input
data. It looks like the new NaPTAN data would require a rework of the
CSV parser (or use of a third-party library) to handle quoted fields
throughout.
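To make the misalignment concrete: a line that mixes quoted and unquoted fields cannot be split reliably on the "," sequence, whereas a small state machine that tracks whether it is inside quotes can. The following is only a minimal sketch of that idea, not the converter's parser and not the rework mentioned above. The class and method names are hypothetical, and it assumes one record per line (no line breaks inside quoted fields).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not part of the converter code base.
final class CsvLineSketch {

    // Split one CSV line into fields, accepting both quoted and unquoted fields.
    // Commas inside quotes are kept; a doubled quote inside quotes is a literal quote.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;                 // skip the second quote of the pair
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }
}
```

With this approach an empty field between two commas still yields an empty string, so column positions stay aligned with the header row. A maintained third-party CSV library would likely be the safer choice for production use.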

If I am indeed able to get current data for other parts of the UK I will
continue to work on the code.

> On the radar... in the meantime, I've become aware of another source
> for country wide data. Coincidentally, I put the paperwork in the mail
> yesterday, so if everything checks out I think I will be compelled to
> put work into the converter code base again.

I was expecting to use this one, which appears to be updated weekly:
http://data.gov.uk/dataset/traveline-national-dataset
http://traveline.info/tnds-login-or-register.html

"The weekly TNDS, for the whole UK and by region, is released under the
Open Government Licence and is free to user."

I just realized that the London data may contain enough information to
export GTFS without cross-referencing with the NaPTAN data (I see stop
name elements) and am re-running the conversion. However, the example
converter config files in the London converter script at
https://code.google.com/p/googletransitdatafeed/wiki/LondonConvertScript
do use NaPTAN CSV...

-Andrew

Joachim Pfeiffer

Apr 22, 2013, 12:20:38 AM
to googletran...@googlegroups.com
Hi Andrew,
I am trying to access the source inside the .patch file. Could you do me a favor and zip up the source files you've changed so I can review them and introduce them into the code base?
JP


--
You received this message because you are subscribed to the Google Groups "GoogleTransitDataFeed" group.
To unsubscribe from this group and stop receiving emails from it, send an email to googletransitdat...@googlegroups.com.
To post to this group, send email to googletran...@googlegroups.com.
Visit this group at http://groups.google.com/group/googletransitdatafeed?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Andrew Byrd

Apr 22, 2013, 2:30:27 AM
to googletran...@googlegroups.com

The output of the conversion using that patch was not satisfactory due
to the way fields are quoted in the current NaPTAN file. I am about half
finished with a more robust CSV parser, and will post the changes as
soon as I have been able to produce a usable GTFS feed.

Sure, I can supply entire source files if the patches are not working out.

-Andrew

Joachim Pfeiffer

Apr 22, 2013, 9:54:46 AM
to googletran...@googlegroups.com
Great, thanks.
The converter project started before the emergence of Git, so no Git spoken here (;->).
