Hi all,
Just a brief email to let you know the outcome of an interesting email
exchange that started last Monday. I'll write more about this once
I've got the data uploaded, but the short story is that I have
obtained from Transit 2.2 million trackpoints from the 2008 High Speed
Data Collection Survey and I've just uploaded the first region (North
Canterbury of course) to OSM.
<http://www.transit.govt.nz/hsdc/overview.jsp>
I've got quite a few regions that need converting, but have a good
process worked out now so hope to get most of them up evenings this
week. I'm tagging them with TransitHSDC2008 in addition to the usual
NZ and New+Zealand tags.
<http://openstreetmap.org/traces/tag/TransitHSDC2008>
Will write more once they're uploaded, including thanks to others that
have helped make this happen.
Cheers Gav
hi gavin, that looks hugely useful info, i'm sure it'll help lots,
particularly the areas with no yahoo aerial photo coverage.
one thing that concerns me though, is the same question that came out
of the discussions a couple of months ago on the linz data. there was
no consensus reached on how osm were going to display attribution, and
the suggestion was not to include the data, at least until the new
license is settled; i'm concerned these track points from transit fall
into the same category, and could potentially pollute the data.
that's assuming this is crown copyright also, and that there will be
similar restrictions placed on its use?
1. As the trackpoints will for most people blur into a general mess,
it won't be an issue to identify and attribute individual points.
2. Not everyone is going to be accessing the trackpoints - they are
raw data to produce structured data.
3. Each GPX file has had a disclaimer included in the header
description tag. I have also pointed the URL in the GPX header to a
page on the gis.org.nz wiki which is going to be the effective home
page for this data outlining (again) the disclaimer etc.
The disclaimer is as follows.
"DISCLAIMER: The data contained in this data set is collected and used
by Transit New Zealand for specific purposes. Transit and its
employees or agents involved in preparation of this database cannot
accept liability for its contents or for any consequences arising from
its use. People using the contents of the database should apply, and
rely upon, their own skill and judgement. The contents should not be
used in isolation from other sources of advice and information."
I was quite up front in explaining this to my contact at Transit, and
after a couple of emails, he was happy with what I proposed.
It might be a different matter if I had attempted to get shapefiles of
the network, but given that trackpoints blend into everyone else's I
don't think there are going to be any pollution issues, not when
compared to loading structured data from LINZ.
Also on this note, a new guideline was developed on the Govt Web
Standards wiki earlier this year that is about to be more widely
promoted to encourage more flexible licensing and the sort of thing
you are after.
<http://www.gis.org.nz/wiki/Government_Geospatial_Information_Web_Access_Guideline>
So I think Government are becoming more aware of the approach that is
needed for this sort of community-led venture, and these are good
means of getting Government to test the waters of opening up more
information.
Cheers Gav
Hi Gavin ... what can I say, nice work.
Slight issue -- I notice that the track lines are not broken between
segments and so you get massive "leaps" in the data. for example:
http://openstreetmap.org/user/rediguana/traces/132459
Perhaps OSM's filters take care of that automatically? If not, it is
easily fixed with GpsBabel. In the following I split a track into
multiple segments when the jump between points > 100m. (FWIW I got an
identical split with a 50m threshold)
gpsbabel -i gpx -f transit-20080312-coastalotago.gpx \
    -x track,sdistance=0.1k \
    -o gpx -F transit-20080312-coastalotago_split.gpx
For all tracks in a directory, a little Unix shell script magic:
# may need to add the "pack" filter, -x track,pack,sdistance=0.1k
for MAP in transit-*.gpx ; do
    # skip output from a previous run, which the wildcard also matches
    case "$MAP" in *_split.gpx) continue ;; esac
    gpsbabel -i gpx -f "$MAP" -x track,sdistance=0.1k \
        -o gpx -F "$(basename "$MAP" .gpx)_split.gpx"
done
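For anyone without GpsBabel to hand, the idea behind the distance split can be sketched in plain shell and awk. This is only an illustration of the technique, not what GpsBabel does internally: the function name split_by_gap, the "lat lon" input format and the spherical-earth haversine distance (radius 6371 km) are all assumptions made for the example.

```shell
#!/bin/sh
# split_by_gap MAX_M  (hypothetical helper, not part of GpsBabel)
# Reads "lat lon" pairs in decimal degrees on stdin, one per line, and
# prints "segment lat lon". A new segment starts whenever the distance
# to the previous point is more than MAX_M metres.
split_by_gap() {
    awk -v max="$1" '
    function rad(deg) { return deg * 3.141592653589793 / 180 }
    # Haversine distance in metres, assuming a sphere of radius 6371 km.
    function dist(lat1, lon1, lat2, lon2,    s1, s2, a) {
        s1 = sin(rad(lat2 - lat1) / 2)
        s2 = sin(rad(lon2 - lon1) / 2)
        a = s1 * s1 + cos(rad(lat1)) * cos(rad(lat2)) * s2 * s2
        return 2 * 6371000 * atan2(sqrt(a), sqrt(1 - a))
    }
    NR == 1 { seg = 1 }
    NR > 1 && dist(plat, plon, $1, $2) > max { seg++ }
    { print seg, $1, $2; plat = $1; plon = $2 }
    '
}

# Example: the third point is about 1.1 km from the second, so with a
# 100 m threshold it opens segment 2.
printf '%s\n' '-45.8700 170.5000' '-45.8701 170.5000' \
              '-45.8800 170.5000' '-45.8801 170.5000' | split_by_gap 100
```

Because the only test is the distance to the previous point, this works on files whose timestamps are all identical, but it still depends on the points being listed in travel order.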
After that it all looks good. To test I loaded them into a GRASS GIS
lat/lon WGS84 location using the v.in.gpsbabel module:
# import as track
v.in.gpsbabel -t in=transit-20080312-coastalotago.gpx \
    out=coastalotago_312_trk format=gpx
# import as points (grass 6.4svn)
v.in.gpsbabel -tp in=transit-20080312-coastalotago.gpx \
    out=coastalotago_312_pts format=gpx
# import as points (grass 6.3.0)
grep '<trkpt' transit-20080312-coastalotago.gpx | \
    cut -f2-4 -d'"' | sed -e 's/" lon="/|/' | \
    v.in.ascii out=coastalotago_312_pts x=2 y=1
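To see what the grep/cut/sed pipeline in the 6.3.0 variant hands to v.in.ascii, here is a small sketch that runs the same text steps on a single made-up trackpoint. The function name trkpt_to_ascii and the sample coordinates are hypothetical; it assumes one <trkpt> per line with lat written before lon, which is what the pipeline above relies on.

```shell
#!/bin/sh
# trkpt_to_ascii (hypothetical name): turn GPX <trkpt> lines on stdin
# into "lat|lon" records. Assumes one <trkpt> element per line with the
# lat attribute before lon.
trkpt_to_ascii() {
    grep '<trkpt' | cut -f2-4 -d'"' | sed -e 's/" lon="/|/'
}

# Made-up sample point; prints: -45.8741|170.5036
printf '%s\n' '<trk><trkseg>' \
    '<trkpt lat="-45.8741" lon="170.5036"></trkpt>' \
    '</trkseg></trk>' | trkpt_to_ascii
```

Latitude comes out in the first column and longitude in the second, which is why the v.in.ascii call uses x=2 y=1.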
regards,
Hamish
Gavin:
> Yep - the data I received had only date, not timestamps, and the points were
> sometimes out of order in the text files I received.
I assume they were not random; did the dump from the DB list them north->south, or..?
Presumably they were originally loaded into the system in a sequential way.
>> Perhaps OSM's filters take care of that automatically?
>
> If the points were properly ordered by time, they should have, but in this
> case I couldn't see an easy fix, and as the main application is viewing as
> points, rather than worrying about the connecting segments, I wasn't too
> worried.
ok, understood.
>> If not, it is easily fixed with GpsBabel. In the following I split a track
>> into multiple segments when the jump between points > 100m.
>> (FWIW I got an identical split with a 50m threshold)
>
> Nice - I didn't even think of splitting the tracks.
>
> I'll give that a go now. Note that pack/merge won't work with this data as
> all of them have timestamps set to 00:00:00, which for GPSBabel means
> that you're going to end up with only one point at that time - additional points
> with the same time are dropped. So distance is the only way we can do this
> with this data.
ah, I hadn't realized that gpsbabel used the timestamp as the key.
I hadn't actually tried pack,merge. No matter: it turns out the files
that triggered the warning "trackfilter-split: Cannot split more than
one track, please pack (or merge) before!" were not the transit files
at all, just my own experiments that the wildcard caught.
> Update - it appears to work fine on removing the ugly rendering appearance
> on the tracklogs, but there are still some areas that are not properly
> formed - e.g. where there are points on both sides of the highway, the
> tracks zig-zag from one side to the next, and these are only 20m apart, so I
> think the files will be unreasonably large if we split the tracks to that
> level. Note that even 0.1km has significantly increased the file size in
> some files, by anywhere from 20% to 60%, because of the additional
> overhead associated with creating multiple tracks.
ok; if these are going in as points, then not a problem for the purpose.
The one road I looked at with travel on both sides of the street
formed nice lines, but ok, not all of them will be like that. Of
course multi-lane roads will have positional issues too.
> So, I probably won't modify the source files as they are uploaded to OSM as
> their original task was to be standalone trackpoints rather than polylines,
> and this was reflected in the data I got from Transit.
sounds fine.
> I have added your
> code suggestions to the wiki page for the data, so that anyone that visits
> it can understand the limitations of the data, and see how to refine it
> further, and tidy up the rendering of the tracklogs. Hope you're OK with me
> recording your advice there :)
sure; it's a public list with the word "open" in it after all. :)
> <http://www.gis.org.nz/wiki/Transit_High_Speed_Data_Collection_Survey#GPX_File_Limitations>
>
> This is probably as far as we can take this without asking Transit to
> provide timestamps so we can properly order the tracklogs. I will suggest
> that we try to get timestamps for the 2009 survey as long as there are no
> privacy concerns.
there may well be privacy concerns. I know if I were doing the survey
it would be annoying for some jerk to go into the data and analyze how
long I stopped for a pie at lunch or when I didn't slow down to 50kph
until 12m after the speed zone change then start writing letters to
the editor/boss...
But using timestamp as the key is mostly just a quirk of the gpsbabel
tool, not a serious shortcoming of the data. As long as they are
sequential in time other tools may be easily used.
re. wiki page:
----------
"Data Quality
From Transit's website.
The SCRIM+ is fitted with Trimble GPS equipment sampling the
Omni-Star satellite to record the differential GPS coordinates of the
centreline. Tilt sensors for crossfall and gradient together with a
gyroscope provide alignment details when out of sight of satellites.
The original data was provided by Transit in GD49/NZMG and was
reprojected using the NTv2 grid to WGS84 to maintain a high level of
accuracy. Each point should be accurate to ~+/- 1m. "
----------
I am not sure what DGPS level they used, but it is a real shame they
chose to go with NZMG. I hope internally they saved as lat/lon WGS84!
NZMG is both mathematically and practically meaningless below ~1m, so
it would be a shame if they paid a lot of money and went to a lot of
effort to get sub-meter accuracy then degraded it by using an inferior
projection system when NZTM+GD2k is right there.
Another thing, great that you used the NTv2 grid to do the datum
transform, but what assurance do you have that Transit* did the same
when they converted from the GPS's raw WGS84 to NZMG on the way into
the dataset?
* actually from what I've seen/recall the Trimble software does this
internally somewhere.
Regardless, even 5m error probably wouldn't be noticed. Attached find
a little test image where I put the transit data (red dots, green
line) over the top of LINZ topo4 road_cl layer (some years old;
reprojected from NZTM -> LL/wgs84), where Highway 1 South heads past
Dunedin's Octagon. At the top right the GPS track is about 24m NW of
the old LINZ road data.
I'd be interested to know how the latest v14 LINZ data lines up there.
Maybe this is part of the reason why Transit commissioned the job?
---
If anyone is interested, here is the PROJ.4 (proj.maptools.org)
command to do NZMG/GD49->LL/WGS84 with the distortion grid:
cs2cs -f "%.7f" +init=epsg:27200 +nadgrids=/path/to/nzgd2kgrid0005.gsb \
+to +proj=longlat +datum=WGS84 \
< coord_NZMG.txt > coord_LLwgs84.txt
Data files should be two columns separated by whitespace, "easting northing".
enough from me already,
Hamish
does anyone know what happened with this; did all the tracks get uploaded?
gavin?
> I'm tagging them with TransitHSDC2008 in addition to the usual
> NZ and New+Zealand tags.
> <http://openstreetmap.org/traces/tag/TransitHSDC2008>
yes, i did. the reason i ask is because when i looked at osm within
potlatch, not all the roads (in fact very few) had gps tracks
associated with them. i was under the impression the transit logs were
for every road in the country? my question would have been better
worded as; were there any problems/missing data i guess?
i've tried the other link (to transit), but that appears to be dead
Yes - that would be the disconnect. This data was solely for the National Highway network. So you should only see data for the highways and nothing else; roads that are not state highways are not included in that upload.
As I understand it, and unless things have changed with the newer NZ Transport Agency, the 73 (for the time being) road controlling authorities are the agencies that hold the authoritative road data for their jurisdictions. The old Transit was only authoritative for the highway network.
Hope that clears it up! :)
Cheers Gav