Allen Briscoe-Smith writes...
...what products do you use to ETL spatial data into the Postgres DB?
This would be the reference layers like Parcels or Streets that the
application uses to derive and “seed” spatial information into the
address layer/table.
Allen
The city of San Francisco uses the Feature Manipulation Engine (FME)
as the backbone of its ETL work - it's a very good tool. You can
certainly use it by itself but I would bet on the need to automate and
rerun whatever you build. I would also bet on running into something
that FME does not do or does not do well. Therefore, we have built a
simple Python framework that can run FME workspaces as well as handle
other tasks that help automate our ETL jobs. To check this out you
can look here:
http://code.google.com/p/eas/source/browse/trunk/etl/src/config/examples/config_jobs.py
It's based on the command pattern.
The idea is to allow you to work mostly at the "macro" level in
config_jobs.py.
Like this:
- using DEV | QA | PROD
- ftp a file from source to target
- use an FME workspace to load the data I just FTP'd
- run this bunch of SQL on the target database
- run this proc on the target database
- send an email indicating success
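The command pattern behind that list can be sketched roughly like this: each step is a command object with an execute() method, and a job is just an ordered list of commands that the framework runs in turn. The class names below (Job, FtpFetch, RunSql, SendEmail) are invented for illustration and are not taken from the EAS repo; the real config_jobs.py defines its own vocabulary of commands.

```python
# Hypothetical sketch of a command-pattern ETL job framework.
# Each command stubs out its real work and returns a status string.

class Command:
    """One step in an ETL job."""
    def execute(self):
        raise NotImplementedError

class FtpFetch(Command):
    """Stand-in for 'ftp a file from source to target'."""
    def __init__(self, url):
        self.url = url
    def execute(self):
        # A real version would download the file here.
        return "fetched %s" % self.url

class RunSql(Command):
    """Stand-in for 'run this bunch of SQL on the target database'."""
    def __init__(self, sql):
        self.sql = sql
    def execute(self):
        # A real version would run the SQL against DEV | QA | PROD.
        return "ran: %s" % self.sql

class SendEmail(Command):
    """Stand-in for 'send an email indicating success'."""
    def __init__(self, subject):
        self.subject = subject
    def execute(self):
        return "emailed: %s" % self.subject

class Job:
    """An ordered list of commands, run one after another."""
    def __init__(self, name, commands):
        self.name = name
        self.commands = commands
    def run(self):
        return [c.execute() for c in self.commands]

job = Job("parcels_daily", [
    FtpFetch("ftp://example.org/parcels.zip"),
    RunSql("UPDATE ..."),
    SendEmail("parcels_daily succeeded"),
])
results = job.run()
print(results[0])  # fetched ftp://example.org/parcels.zip
```

The payoff of this pattern is that config_jobs.py stays declarative: adding a step to a job means appending one command object, and the same command classes get reused across jobs.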
We tend to use FME to move data from one place to another - and prefer
SQL for the transformations.
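To make the "SQL for the transformations" point concrete, here is a hypothetical example of seeding a column in an address table from a reference layer with a single UPDATE. All table and column names are invented. In PostGIS the join condition would be spatial (e.g. ST_Within(address.geom, parcel.geom)); this sketch uses an in-memory SQLite database and a plain attribute join as a stand-in so that it is runnable anywhere.

```python
import sqlite3

# Invented schema: seed address.parcel_id from a parcel reference layer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel  (id INTEGER PRIMARY KEY, block_lot TEXT);
    CREATE TABLE address (id INTEGER PRIMARY KEY, block_lot TEXT,
                          parcel_id INTEGER);
    INSERT INTO parcel VALUES (1, '0001/001'), (2, '0001/002');
    INSERT INTO address (id, block_lot)
        VALUES (10, '0001/001'), (11, '0001/002');

    -- The transformation itself is pure SQL: one correlated UPDATE.
    -- In PostGIS this WHERE clause would be a spatial predicate
    -- such as ST_Within(address.geom, p.geom).
    UPDATE address
    SET parcel_id = (SELECT p.id FROM parcel p
                     WHERE p.block_lot = address.block_lot);
""")
rows = conn.execute(
    "SELECT id, parcel_id FROM address ORDER BY id").fetchall()
print(rows)  # [(10, 1), (11, 2)]
```

Keeping transformations in SQL like this means they live next to the data, are easy to rerun, and don't depend on what FME can or cannot express.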
Now, more to your question.
Our initial ETL of the addresses was a one-off (unautomated) process and
is described here:
http://code.google.com/p/eas/source/browse/trunk/docs/fargeo/OneTimeAddressETL_draft.pdf
We are currently automating part of that - if you want to look at it
let me know.
We have a daily process for parcels and streets which you can see here
(again):
http://code.google.com/p/eas/source/browse/trunk/etl/src/config/examples/config_jobs.py
FME workspaces have been excluded from the public repo because they
contain passwords.
I can elaborate on this if you think it would be helpful.
Regards,
Paul