Hi,
Some quick thoughts:
- We should help users who do geocoding (searching for coordinates by
attribute) so that they can type partial strings and the application can
automatically build a selection list of candidates. This would mean LIKE
queries. Is it still true that LIKE in SQLite cannot use an index, and
are the tricks in these documents still valid?
http://joshua.perina.com/africa/gambia/fajara/post/converting-to-sqlite-and-like-query-optimization
http://web.utk.edu/~jplyon/sqlite/SQLite_optimization_FAQ.html
- For reverse geocoding Spatialite should do a "Nearest One" or perhaps
"Nearest N" query (because the first hit from OSM address data may be
rubbish), and such a function does not exist yet. Something like the
indexed nearest N search in PostGIS
http://boundlessgeo.com/2011/09/indexed-nearest-neighbour-search-in-postgis/
would be nice to have, not only for addresses but also in general.
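To make the two points above concrete, here is a rough sketch using only Python's built-in sqlite3 module, not Spatialite. The table layout, the column names and the sample addresses are made up for illustration, and coordinates are assumed to be projected metres. The prefix search is written as a range comparison on a lower-cased key, which is one common workaround for LIKE (SQLite's own LIKE optimization has several preconditions). The "Nearest N" part simply doubles a search window until it holds enough hits; in Spatialite the R*Tree spatial index would play the role of the plain index on x.

```python
import math
import sqlite3

# Prefix match expressed as a range, so an ordinary index on label_lc can be used.
PREFIX_SQL = (
    "SELECT label FROM address "
    "WHERE label_lc >= ? AND label_lc < ? ORDER BY label_lc LIMIT ?"
)


def build_demo_db():
    """Create an in-memory table with a few made-up addresses (projected metres)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE address (label TEXT, label_lc TEXT, x REAL, y REAL)")
    conn.execute("CREATE INDEX idx_address_label_lc ON address (label_lc)")
    conn.execute("CREATE INDEX idx_address_x ON address (x)")
    rows = [
        ("Harbour Street 4", 0.0, 0.0),
        ("Harbour Street 8", 10.0, 0.0),
        ("Harbour Lane 1", 0.0, 30.0),
        ("Market Square 2", 60.0, 60.0),
        ("Mill Road 7", 500.0, 500.0),
    ]
    conn.executemany(
        "INSERT INTO address VALUES (?, ?, ?, ?)",
        [(label, label.lower(), x, y) for label, x, y in rows],
    )
    return conn


def prefix_candidates(conn, typed, limit=10):
    """Geocoding: candidate labels that start with the typed string."""
    low = typed.lower()
    # Upper bound: the prefix followed by the highest Unicode code point.
    high = low + chr(0x10FFFF)
    return [row[0] for row in conn.execute(PREFIX_SQL, (low, high, limit))]


def nearest_n(conn, x, y, n=3, radius=50.0, max_radius=100000.0):
    """Reverse geocoding: (distance, label) of the n nearest addresses."""
    while True:
        rows = conn.execute(
            "SELECT label, x, y FROM address "
            "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ?",
            (x - radius, x + radius, y - radius, y + radius),
        ).fetchall()
        hits = sorted((math.hypot(px - x, py - y), label) for label, px, py in rows)
        # Keep only hits inside the circle; anything nearer cannot lie outside the window.
        hits = [hit for hit in hits if hit[0] <= radius]
        if len(hits) >= n or radius >= max_radius:
            return hits[:n]
        radius *= 2
```

Calling `prefix_candidates(conn, "Harb")` gives the selection list for a partially typed street name, and `nearest_n(conn, x, y, n=3)` returns several candidates so that a rubbish first hit can be skipped by the user.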
Creating the address database from OSM data will be painful. The first
simple thing to do is to select the points that have some address tag from
the OSM data and save them into a database. Next, the same should be done
for the polygons, because a large share of the addresses is stored on
building polygons. I have used the centroid function and appended those
features to the same table as the addresses collected from points. I am
not sure if lines may have usable addresses, but if they do then the
centroid might suit those too.
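The collection step could look roughly like this in plain Python. The input layout (dicts with "tags", "x", "y" and "ring") is hypothetical, and ring_centroid is only a stand-in for the centroid function of Spatialite: it computes the area-weighted centroid of a single outer ring and ignores holes.

```python
def has_address(tags):
    """True if any tag key starts with 'addr:'."""
    return any(key.startswith("addr:") for key in tags)


def ring_centroid(ring):
    """Area-weighted centroid of a simple polygon ring given as (x, y) tuples."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0.0:
        raise ValueError("degenerate ring")
    # area2 is twice the signed area, so 6 * area equals 3 * area2.
    return cx / (3.0 * area2), cy / (3.0 * area2)


def collect_address_points(nodes, buildings):
    """Merge addressed nodes and building centroids into one list of rows."""
    rows = []
    for node in nodes:
        if has_address(node["tags"]):
            rows.append(
                {"x": node["x"], "y": node["y"], "source": "node", "tags": node["tags"]}
            )
    for building in buildings:
        if has_address(building["tags"]):
            x, y = ring_centroid(building["ring"])
            rows.append({"x": x, "y": y, "source": "polygon", "tags": building["tags"]})
    return rows
```

Keeping a "source" column in the merged table makes it possible to tell later whether an address came from a point or from a building polygon.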
This is where the problem begins. All kinds of funny features in OSM can
have addresses: restaurants, shops, bus stops and so on. The same address
may be found on a building polygon and on several POIs inside the
building. Very many addresses are incomplete and have only some of
the common address tags
http://wiki.openstreetmap.org/wiki/Proposed_features/House_numbers/Karlsruhe_Schema#Tags
In addition, the addr:city information is often found in the "is_in" tag.
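A first look at the scale of the problem could be a simple grouping like the one below. This is only a sketch: the record layout is hypothetical, and using street plus house number as the key is my assumption. Real data would also need the city or some distance check, because the same street name and number can exist in several places.

```python
from collections import defaultdict

# Assumed key tags; city is left out because it is so often missing.
KEY_TAGS = ("addr:street", "addr:housenumber")


def normalise(value):
    """Collapse whitespace and ignore case so trivial variants compare equal."""
    return " ".join((value or "").split()).casefold()


def group_addresses(records):
    """Split records into groups sharing the same address key, plus incomplete ones."""
    groups = defaultdict(list)
    incomplete = []
    for record in records:
        key = tuple(normalise(record["tags"].get(tag)) for tag in KEY_TAGS)
        if all(key):
            groups[key].append(record)
        else:
            incomplete.append(record)
    return dict(groups), incomplete
```

Groups with more than one member are the building-plus-POIs duplicates, and the incomplete list shows how many features carry only part of the address tags.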
I would not rely too much on the "addr:city" and "is_in" tags. Instead I
would look for some good old polygon dataset of municipality borders and
fill in the addr:city data through a "within" spatial query. At least I
would use that for filling in the missing data and for quality control of
the existing tags.
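The logic of that "within" step, for both filling and quality control, can be sketched without any GIS library. The municipality names and rings in the usage are invented, the dict layout is hypothetical, and the ray-casting test below only handles a single ring without holes. In Spatialite the same thing would be a spatial join on the real border polygons.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    ring = list(ring)
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        # Count edges that cross the horizontal ray going right from the point.
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def check_city(address, municipalities):
    """Return (city, status) for an address dict with 'x', 'y' and 'tags'."""
    tagged = address["tags"].get("addr:city")
    for name, ring in municipalities.items():
        if point_in_ring(address["x"], address["y"], ring):
            if tagged is None:
                return name, "filled"
            if tagged.casefold() == name.casefold():
                return name, "confirmed"
            return name, "mismatch"
    return tagged, "outside"
```

The "mismatch" cases are the interesting ones for quality control: either the tag is wrong or the point sits on the wrong side of a border.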
I haven't thought at all about interpolated street numbers and the
other advanced features described in
http://wiki.openstreetmap.org/wiki/Addresses
Perhaps it would be good to materialize some interpolated address points
with, for example, the "Line_Interpolate_Equidistant_Points" function. If
there is going to be a dedicated geocoding tool, it could be made clever
enough to place a not-really-in-the-data address "Harbour Street 6" in
between the existing "Harbour Street 4" and "Harbour Street 8" even
without a materialized point feature in the address db.
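The on-the-fly variant could be as simple as the sketch below. It is an assumption-laden simplification: it interpolates along a straight line between the two neighbouring known points, ignores the street geometry and the odd/even sides, and refuses to extrapolate beyond the known numbers. The function name and the input dict are hypothetical.

```python
def interpolate_address(known, wanted):
    """Estimate (x, y) for house number `wanted` on one street.

    `known` maps integer house numbers to (x, y) tuples.
    """
    if wanted in known:
        return known[wanted]
    lower = max((n for n in known if n < wanted), default=None)
    upper = min((n for n in known if n > wanted), default=None)
    if lower is None or upper is None:
        return None  # outside the known range: do not extrapolate
    t = (wanted - lower) / (upper - lower)
    (x0, y0), (x1, y1) = known[lower], known[upper]
    return x0 + t * (x1 - x0), y0 + t * (y1 - y0)
```

With "Harbour Street 4" at (0, 0) and "Harbour Street 8" at (40, 0), asking for number 6 gives the midpoint between them.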
It should also be easy to use address sources other than OSM. Some
countries have already published their official addresses as open data and
there are more to come. Real OSM mappers live in the faith that all data
become better when they are converted and imported into OSM, but I often
prefer using the native data. It should be no problem to have several
address tables in Spatialite and let users select which one(s) the
geocoder tool uses, as long as the tables share a common schema.
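Selecting the sources could then be little more than a UNION ALL over the chosen tables. The sketch below again uses plain sqlite3; the common schema (street, housenumber, x, y) and the function name are my assumptions, and the table names in the usage are only examples.

```python
import sqlite3


def search_sources(conn, tables, street_prefix):
    """Search the chosen address tables, assumed to share (street, housenumber, x, y)."""
    existing = {
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    }
    unknown = [name for name in tables if name not in existing]
    if unknown:
        raise ValueError(f"unknown address table(s): {unknown}")
    parts, params = [], []
    for name in tables:
        quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier
        parts.append(
            f"SELECT ? AS source, street, housenumber, x, y FROM {quoted} "
            "WHERE street LIKE ?"
        )
        params += [name, street_prefix + "%"]
    sql = " UNION ALL ".join(parts) + " ORDER BY street, housenumber, source"
    return conn.execute(sql, params).fetchall()
```

For example `search_sources(conn, ["osm_address", "official_address"], "Harb")` would return hits from both tables with the source name in the first column, so the user can see where each candidate comes from.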
I have Finnish OSM addresses in my WFS and they can be used for testing.
Here is a sample GetFeature with maxfeatures=10
http://hip.latuviitta.org/cgi-bin/tinyows?service=wfs&version=1.0.0&maxfeatures=10&request=getfeature&typename=lv:osm_address
I also have some other addresses. This query takes 10 features from the
complete, high-quality address data of the City of Helsinki
http://hip.latuviitta.org/cgi-bin/tinyows?service=wfs&version=1.0.0&maxfeatures=10&request=getfeature&typename=lv:hki_osoiteluettelo
This zip should contain a few million Finnish addresses from the open data
of the National Land Survey of Finland, in a Spatialite database
http://latuviitta.org/documents/mtk_osoitteet_2012.zip
The accuracy of these addresses is not very good, but it is good enough for
playing.
-Jukka Rahkonen-