Derek Eder
geomancer.io
Python, Flask, csvkit
Data sources: Census, Bureau of Labor Statistics, Bureau of Economic Analysis, etc., each implemented differently
Started with the ones easy to work with.
Overall architecture is similar to Open States, which faces a similar problem: 50 states, each with its own legislature. Open States developed a framework that everything feeds into.
We followed a similar pattern with Geomancer. Different data sources plug in, and you can add your own.
Goal is to match data based on shared geography. Geography types: 5- or 9-digit ZIP code, city, congressional district, county, FIPS code (census tract, county, state), school district, state.
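The matching idea can be sketched in a few lines of Python. This is only an illustration of joining on a shared geography key, not Geomancer's actual code: the column names, the sample rows, and the join_on_geography helper are all made up for the example.

```python
import csv
import io


def join_on_geography(rows, lookup, geo_column):
    """Append lookup columns to each row whose geography key matches."""
    joined = []
    for row in rows:
        key = row[geo_column].strip().lower()  # normalize the shared key
        extra = lookup.get(key, {})            # no match: row passes through unchanged
        joined.append({**row, **extra})
    return joined


# Hypothetical uploaded spreadsheet (placeholder values, not real data).
uploaded = io.StringIO("state,facilities\nIllinois,10\nOhio,7\n")
rows = list(csv.DictReader(uploaded))

# Hypothetical data-source table keyed by normalized state name.
lookup = {
    "illinois": {"median_income": "111"},
    "ohio": {"median_income": "222"},
}

result = join_on_geography(rows, lookup, "state")
```

Rows with no match keep their original columns, so nothing the user uploaded is dropped.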
Built in a way that others can add stuff.
Troy: Demo
2. Click Get Started
3. Upload a spreadsheet with a header row, for instance state data
4. Select geography. It will try to recognize the geography level by default
5. Select data to add, such as Educational Attainment (shows how many columns are added, plus a "margin of error")
Note: Geomancer does not store your data, but it caches it for faster response
6. Download the spreadsheet
Demo of county match of environmental to income data
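The automatic recognition in step 4 could work along these lines. The notes do not say how Geomancer actually detects geography types, so the rules below (regular expressions over a column's values) and the guess_geo_type name are assumptions for illustration.

```python
import re

# Assumed patterns, checked in order.
PATTERNS = [
    ("zip_9", re.compile(r"^\d{5}-\d{4}$")),
    ("zip_5", re.compile(r"^\d{5}$")),
    ("state_fips", re.compile(r"^\d{2}$")),
    ("census_tract_fips", re.compile(r"^\d{11}$")),
]


def guess_geo_type(values):
    """Return the first geo type whose pattern matches every non-empty value."""
    cleaned = [v.strip() for v in values if v.strip()]
    if not cleaned:
        return None
    for name, pattern in PATTERNS:
        if all(pattern.match(v) for v in cleaned):
            return name
    return None  # e.g. city names: fall back to asking the user
```

A five-digit value could equally be a county FIPS code, which is one reason the user still needs to be able to override the guess.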
Eric:
Will document the API
Q: Can you support geocoding, i.e. getting lat-longs back for addresses?
A: It's actually on our someday-maybe wishlist.
You can use two separate columns to identify geography.
They got a year of funding.
Extending the open source code
Documentation on the site gives you a brief overview.
On GitHub, ...geomancer/mancers/base.py has a lot of documentation.
In the DataMade repo there is a closed pull request for BEA.
The trickiest dataset was from the EPA. It used uninteresting geographies such as regions, and the data was attached to particular facilities, so we couldn't use it.
As a developer, you could make a custom geography
BLS mancer to be added.
TODO option: Add PANDA as a back end
If there is a data point that we should have, let us know
The site is intended as an appliance that you can incorporate into your own tools.
Serdar: You can spin up your own instance.
The geomancer.io site might not be up forever. Also, if you're working with sensitive data, you need to install your own instance behind your org's firewall.
Road map from here:
Some more data sources
decennial census
BLS data
PANDA integration
More ideas
International data
Workflow is to add a new geotype (countries).
The hardest thing is wrapping your brain around international APIs, such as the UN's. Comment: the World Bank API has "just about everything you would want."
OpenElections data, precincts
Q: Can you upload a data source?
A: Currently you have to write a "mancer."
A mancer is a wrapper, but the wrapper doesn't know the ins and outs of your data, such as the fuzzy matching that would make it easy to link up.
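To make the wrapper idea concrete, here is a rough sketch of what a plugin interface of this kind can look like. It does not reproduce the real base class in base.py: SketchMancer, InMemoryMancer, and every method name below are hypothetical, and the values are placeholders.

```python
from abc import ABC, abstractmethod


class SketchMancer(ABC):
    """Hypothetical plugin interface; not the real class in base.py."""

    name = ""

    @abstractmethod
    def geo_types(self):
        """Return the geography types this source can match on."""

    @abstractmethod
    def columns(self):
        """Return the names of the data columns this source can add."""

    @abstractmethod
    def lookup(self, geo_type, geo_id):
        """Return a dict of column name -> value for one geography."""


class InMemoryMancer(SketchMancer):
    """Toy source backed by a dict instead of a remote API."""

    name = "toy"

    def __init__(self, table):
        self.table = table  # {(geo_type, geo_id): {column: value}}

    def geo_types(self):
        return sorted({geo_type for geo_type, _ in self.table})

    def columns(self):
        return sorted({col for row in self.table.values() for col in row})

    def lookup(self, geo_type, geo_id):
        return self.table.get((geo_type, geo_id), {})


# Placeholder values, not real statistics.
mancer = InMemoryMancer({
    ("state", "IL"): {"population": 1},
    ("county", "17031"): {"population": 2, "income": 3},
})
```

The wrapper only answers "what do you have for this geography?"; anything source-specific, such as cleaning up messy place names, would still have to be written per source.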
PANDA might be a challenge.
Note: csvkit can join files on shared columns (csvjoin).
re: international data
ArcGIS Open Data, search by country
Idea: Upload a shapefile
The geographies don't necessarily match up with all the data sources
Idea: in several states (Michigan, Wisconsin, etc.) the sub-county level is its own government. MCD (minor civil division) codes would be huge for journalists in those 12 states.
Note: FIPS codes are being retired. USGS will be the official keeper, so things are migrating to GNIS as the new standard.
The GNIS file includes a cross-reference to FIPS codes.
By this year or next, the data will have to include GNIS.
Same with school districts, USA spending, etc.
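With a cross-reference file, adding GNIS IDs alongside existing FIPS codes is mostly mechanical. The sketch below assumes a two-column CSV crosswalk; the column names fips and gnis_id and every ID in it are placeholders, not values from the real GNIS file.

```python
import csv
import io


def load_crosswalk(handle, fips_column="fips", gnis_column="gnis_id"):
    """Build a FIPS -> GNIS dict from a CSV crosswalk (column names assumed)."""
    return {row[fips_column]: row[gnis_column] for row in csv.DictReader(handle)}


def add_gnis(rows, crosswalk, fips_column="fips"):
    """Return copies of rows with a gnis_id column; None when there is no match."""
    return [{**row, "gnis_id": crosswalk.get(row[fips_column])} for row in rows]


# Placeholder IDs for illustration only.
crosswalk_file = io.StringIO("fips,gnis_id\n00001,A100\n00002,A200\n")
crosswalk = load_crosswalk(crosswalk_file)

rows = [{"fips": "00001", "value": "x"}, {"fips": "99999", "value": "y"}]
with_gnis = add_gnis(rows, crosswalk)
```

Codes are kept as strings throughout so that leading zeros survive the round trip.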
Detailed writeup