Strategies for enriching data with addresses


Bèr

Dec 17, 2018, 11:22:34 AM
to imposm
Hello,

During or after the import, I'd like to enrich the data with address details such as street, city, country, country_code and so on, both on the initial import and on updates.

Most of this data can be inferred, but is not available as `addr:*` tags; when tags like `addr:city` are empty on nodes, they can be inferred from administrative boundaries instead. A service like Nominatim, however, can provide this data. Maybe other services can too; I'm not too familiar with this area.

Has someone done this with imposm already? If so, is there a standard or common set-up to get such data into postgresql/postgis?

I'm currently thinking of writing an external event streamer that tails the imposm logs and emits "import" events. An event handler can subscribe to these and follow along with imposm, updating records with address details from such an "external tool", maybe even by just reading the data over HTTP from Nominatim. This would be slow, but would ensure that, eventually, the address is in the Postgres OSM database. Did anyone already connect imposm to an event bus or some such successfully?
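For illustration only, the "read the data over HTTP from Nominatim" step might look roughly like the sketch below. It assumes the public Nominatim `/reverse` endpoint with `format=jsonv2`; the function names, the user agent string and the choice of output fields are hypothetical, and nothing here is part of imposm.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public Nominatim instance; a self-hosted instance would use its own base URL.
NOMINATIM_REVERSE = "https://nominatim.openstreetmap.org/reverse"


def reverse_url(lat, lon, base=NOMINATIM_REVERSE):
    """Build a reverse-geocoding URL for one coordinate."""
    query = urlencode(
        {"format": "jsonv2", "lat": lat, "lon": lon, "addressdetails": 1}
    )
    return f"{base}?{query}"


def pick_address(payload):
    """Reduce the 'address' object of a Nominatim response to the wanted fields."""
    addr = payload.get("address", {})
    # Nominatim reports the settlement under different keys depending on its size.
    city = addr.get("city") or addr.get("town") or addr.get("village")
    return {
        "street": addr.get("road"),
        "city": city,
        "country": addr.get("country"),
        "country_code": addr.get("country_code"),
    }


def lookup(lat, lon, user_agent="poi-enricher-example"):
    """Fetch and reduce one address (performs a network request)."""
    req = Request(reverse_url(lat, lon), headers={"User-Agent": user_agent})
    with urlopen(req, timeout=10) as resp:
        return pick_address(json.load(resp))
```

Calling `lookup()` once per POI is what makes this approach slow, and the public instance is rate-limited, so a self-hosted Nominatim would probably be needed for anything beyond small data sets.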

Or am I missing an important detail or feature of imposm and is this far easier than thought?

Thanks for your time,

Bèr




Sven Geggus

Dec 17, 2018, 11:57:20 AM
to imp...@googlegroups.com
Bèr wrote on Monday, 17 December, at 17:22:

> During, or after, the import, I'd like to enrich the data with address
> details such as street, city, country, country_code and so on. On both
> import and updates.

Did you read my question in the latest thread about calling a script or
SQL command after a database update done by "imposm run"?

I think this question is at least related.

I made a patch to call an external script after replication, but nobody
has commented on it yet.

Sven


--
This APT has Super Cow Powers.
(apt-get --help on debian woody)

/me is giggls@ircnet, http://sven.gegg.us/ on the Web

Imre Samu

Dec 17, 2018, 12:18:27 PM
to imp...@googlegroups.com
> Most of this data can be inferred, but is not available as `addr:*` tags; when tags like `addr:city` are empty on nodes, they can be inferred from administrative boundaries instead. ....
> Has someone done this with imposm already? If so, is there a standard or common set-up to get such data into postgresql/postgis?

IMHO, in this case you can import the admin polygons (with imposm3) and add the missing addr:city tags with SQL post-processing (https://postgis.net/docs/ST_Contains.html). That is faster than calling external services.

Imre

 


Bèr Kessels

Dec 18, 2018, 9:01:47 AM
to imp...@googlegroups.com
Hello,

On 17-12-18 at 18:18, Imre Samu wrote:
> > Most of this data can be inferred, but is not available as `addr:*`
> > tags; when tags like `addr:city` are empty on nodes, they can be
> > inferred from administrative boundaries instead. ....
> > Has someone done this with imposm already? If so, is there a standard
> > or common set-up to get such data into postgresql/postgis?
>
> IMHO, in this case you can import the admin polygons (with imposm3) and
> add the missing addr:city tags with SQL post-processing
> (https://postgis.net/docs/ST_Contains.html). That is faster than
> calling external services.

This, together with the SQL-post-processing as proposed by Sven, seems
like a good strategy.

It has some downsides though: administrative boundaries don't provide
street-level address details, since those are not inferred with
ST_Contains but by more complex logic, such as interpolating house
numbers and looking at the closest way-polygon. Nominatim, or even
closed-source services like OpenCage, do provide this data.

Using this strategy would work for providing "country", "city",
"province" and in some cases "postal code regions" though. Not a full
address, but a long way towards it.

It does imply that my import fills at least a table with administrative
boundaries, which is then used to update the POIs that lack addr: tags.
Something like the following seems a good strategy. I like it
especially, because it would keep everything contained in
imposm+postgres. No external dependencies to install, maintain and monitor!

I could trigger a SQL script that copies the data from an
osm_administrative table into the POI table. This duplicates and
denormalises the data, but ensures that each POI has some basic addr
data filled in.
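A minimal sketch of such a denormalising step, written as a Python helper that builds the SQL. The table names (`osm_pois`, `osm_administrative`), the columns (`addr_city`, `name`, `admin_level`, `geometry`) and the choice of admin_level 8 for cities are all assumptions that depend on the imposm mapping and the country. The `%(admin_level)s` placeholder assumes a psycopg2-style driver.

```python
def build_city_update(poi_table="osm_pois",
                      admin_table="osm_administrative",
                      admin_level=8):
    """Return (sql, params) that fill empty addr_city values from boundaries.

    Table names are interpolated directly, so they must come from trusted
    configuration, never from user input.
    """
    sql = f"""
UPDATE {poi_table} AS p
SET addr_city = a.name
FROM {admin_table} AS a
WHERE a.admin_level = %(admin_level)s
  AND (p.addr_city IS NULL OR p.addr_city = '')
  AND ST_Contains(a.geometry, p.geometry);
""".strip()
    return sql, {"admin_level": admin_level}


def run_city_update(conn):
    """Execute the update on an open DB-API connection (e.g. psycopg2)."""
    sql, params = build_city_update()
    with conn.cursor() as cur:
        cur.execute(sql, params)
    conn.commit()
```

Only rows with an empty `addr_city` are touched, so values that mappers tagged explicitly are kept. If boundaries of the same level overlap, PostgreSQL picks one matching row arbitrarily.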

The alternative is to join an `administrative` table on ST_Contains()
and then fetch country, province/district, city, etc. from the join and
use that. This keeps the data normalised, but makes the queries rather
complex.
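One way to contain that query complexity might be to hide the joins behind a view. The sketch below generates such a view definition. The view name `poi_addresses`, the table and column names (`osm_pois`, `osm_administrative`, `osm_id`, `name`, `admin_level`, `geometry`) and the level mapping 2/4/8 are assumptions; admin levels in particular differ per country.

```python
def build_address_view(view="poi_addresses",
                       poi_table="osm_pois",
                       admin_table="osm_administrative",
                       levels=None):
    """Return a CREATE VIEW statement joining each POI to its boundaries."""
    # Hypothetical mapping of address field -> OSM admin_level.
    levels = levels or {"country": 2, "province": 4, "city": 8}
    columns, joins = [], []
    for field, level in levels.items():
        alias = f"a_{field}"
        columns.append(f"{alias}.name AS {field}")
        # One LEFT JOIN per level, so POIs outside any boundary are kept.
        joins.append(
            f"LEFT JOIN {admin_table} AS {alias}\n"
            f"  ON {alias}.admin_level = {int(level)}\n"
            f" AND ST_Contains({alias}.geometry, p.geometry)"
        )
    select = ",\n  ".join(["p.osm_id"] + columns)
    return (
        f"CREATE OR REPLACE VIEW {view} AS\n"
        f"SELECT\n  {select}\n"
        f"FROM {poi_table} AS p\n" + "\n".join(joins) + ";"
    )
```

Queries can then select `country`, `province` and `city` from the view without repeating the spatial joins. A spatial index on the boundary geometries would likely be needed for this to perform acceptably.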

Thanks for the thoughts! I'll report back after I've played around with
this and know how it works, mostly about how far this gets me and
whether getting streets and house numbers right is remotely possible
this way.

Bèr

