Open Cities/Counties (and beyond)


Justin

Jan 8, 2013, 3:02:41 PM
to fifty-sta...@googlegroups.com
I would like to open up discussion about processing city and county data (votes, legislators, committees, etc). 

As of 2012, there were roughly 5,500 cities and counties in the US with a population of more than 10,000 people. Maintaining scrapers for all of these entities is impractical. This is particularly true for smaller governments with fewer resources. In short, much of the data isn't parsable, the format changes often, or the data is, at best, haphazard to process (e.g., using OCR for scanned PDFs).

"Open government" is a very loose term. As far as I know, there aren't any federal or state guidelines defining what an ideal open, accountable, and transparent government should look like. FOIA merely states that "each agency shall make the raw statistical data used in its reports available electronically to the public upon request". It certainly doesn't dictate that government entities must make their data (votes, journals, etc.) machine-processable and free.

One could envision a daily or weekly process that queries local government data from the inside (a push rather than a pull method) and sends it to a centralized remote endpoint (e.g., the Sunlight Foundation).

Arguably, our modern governments as we know them today would not have come to fruition without the printing press. Just as Benjamin Franklin believed in the printing press as a way of catapulting news and ideas to the people, I'm of the belief that we're on the cusp of a large movement with the internet and government. However, without definitive requirements, I'm afraid these lower-level governments will implement ad hoc services (just as their states have).

TL;DR I want to live to see “Open States” happen for all levels of government. 

Shauna Gordon-McKeon

Jan 8, 2013, 3:12:33 PM
to fifty-sta...@googlegroups.com
Justin,

This is something I've thought a bit about.  You're absolutely right that maintaining scrapers for thousands of cities is impractical - what seems more achievable is making the scraping process simpler such that random, interested citizens of various towns could apply them to their local governments as desired.  

Related to this could be a standard way to present data that individual citizens could bring to their local officials.  Who knows how successful this would be, but presumably at least a few local governments would be willing to change how their data is published so that collecting it could be easily automated.

I wonder, too, if this is something that local, traditional media outlets such as newspapers could be enlisted to help with.  

I'm curious whether you have specific ideas to make open cities happen?  I agree with you, it's a formidable task.

- Shauna


--
You received this message because you are subscribed to the Google Groups "Open State Project" group.
To view this discussion on the web visit https://groups.google.com/d/msg/fifty-state-project/-/EFPn8mQzL5YJ.
To post to this group, send email to fifty-sta...@googlegroups.com.
To unsubscribe from this group, send email to fifty-state-pro...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/fifty-state-project?hl=en.

Justin

Jan 8, 2013, 5:02:09 PM
to fifty-sta...@googlegroups.com
I think there is something to be said for Municode. I just scraped each state's library page (e.g., http://www.municode.com/Library/TN), and they've managed to aggregate and unify municipal law and ordinances for ~2,645 cities and counties. I don't know any details about the backend or how they achieved this, but in reading their company history page ... 

"It has always been our goal to integrate technology into our workflow and workforce for the benefit of our customers. For instance, we have an email address for submission of ordinances, customer service email and an FTP site for posting of ordinances that cannot be emailed."

One key point that struck me was that they use email for code submission. As archaic and antiquated as it may seem, using email or a form is still a very common practice. I don't want to get too technical here. Maybe someone more knowledgeable than I am can chime in about Municode.

I think it's going to take a hybrid solution where some governments are scraped and others submit their data through an API. IMHO, it would behoove Sunlight to set up an endpoint to process inputs (via PUT/POST requests).
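To make the push idea concrete, here is a minimal sketch of what a government-side job might build before sending. The endpoint URL, the API key header, and the payload fields are all assumptions made up for illustration; no such Sunlight endpoint is described in this thread.

```python
import json
import urllib.request

# Hypothetical endpoint and key: neither exists, they only show the shape of a push.
ENDPOINT = "https://example.org/api/v1/local/records"
API_KEY = "demo-key"


def build_push_request(jurisdiction, records, endpoint=ENDPOINT, api_key=API_KEY):
    """Build (but do not send) a POST request carrying one batch of records."""
    payload = {"jurisdiction": jurisdiction, "records": records}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-API-Key": api_key,  # assumed auth scheme, not a real requirement
        },
    )


request = build_push_request(
    "City of Miami, FL",
    [{"type": "vote", "motion": "Adopt consent agenda", "yes": 5, "no": 0}],  # made-up record
)
print(request.get_method(), request.full_url)
# A scheduled daily or weekly job would then call urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the sketch runnable offline and makes the payload easy to inspect.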

But here's the important part. I'm no sales and marketing guru, but there's an age-old saying: "features tell, benefits sell". How will this ultimately benefit the city or county? More than likely, you won't get the needed support without obvious benefits. If one of those benefits is a more efficient and flexible government, which might result in the incumbent being re-elected, you may have an even better chance of legislation being passed to mandate the initiative.

BTW - are you the author of this page? 


James Turk

Jan 8, 2013, 5:09:09 PM
to fifty-sta...@googlegroups.com
I'm happy to see that there's real interest in this. It has long been a goal of ours on the project, and we're hopefully going to have something to announce fairly soon, but the shape of that isn't yet fully formed.

The one thing that is safe to say is that just like we wouldn't have completed Open States without this great community, we're going to be relying a lot more on interested parties to help us shape the future of whatever it is we end up doing at the municipal level.  

As for a push endpoint, that isn't something we've seen interest in from state governments, but it's certainly easy enough for us to provide should any interest appear down the line.

-James

Shauna Gordon-McKeon

Jan 8, 2013, 5:10:03 PM
to fifty-sta...@googlegroups.com

BTW - are you the author of this page? 


I am.  I'm impressed that you've read it - I made that less than a day ago!



 

Justin

Jan 8, 2013, 5:24:19 PM
to fifty-sta...@googlegroups.com
Haha. I noticed you contributed to opencities-ma and one thing led to another.


Justin

Jan 8, 2013, 6:00:03 PM
to fifty-sta...@googlegroups.com
I'm a bit surprised you haven't seen interest from state governments for a push endpoint. You definitely have the support, or so it appears, of the federal government, seeing as how the Sunlight Foundation has already been mentioned 3 times this year in the House of Representatives.

...
The Sunlight Foundation reports that during the 2012 election cycle alone, super PACs, as they are called, spent more than $620 million to affect the Federal elections.
...
The nonpartisan Sunlight Foundation recently praised our endeavors in that effort by saying: "It is clear that the House has become a more transparent institution over the last 2 years."
...
This is a strongly held bipartisan measure that has received praise from a number of transparency groups, including the Sunlight Foundation, as I mentioned at the outset. 
....

Justin

Jan 8, 2013, 11:18:17 PM
to fifty-sta...@googlegroups.com
I wonder, too, if this is something that local, traditional media outlets such as newspapers could be enlisted to help with.  

 That's a good idea. I'm going to reach out to a few contacts from my local media. 

Justin

Jan 9, 2013, 2:18:54 AM
to fifty-sta...@googlegroups.com
Here's another interesting tidbit to consider.

When it comes to scraping legislators and other elected officials, most local governments only have a dozen or so people. I can copy and paste 9-15 people's information faster than I can write a script to parse it.

Take the City of Miami and Miami-Dade County for example, with 23 people.

City
Mayor Tomas P. Regalado
Commissioner Wifredo (Willy) Gort
Commissioner Marc Sarnoff (Vice Chairman)
Commissioner Frank Carollo
Commissioner Francis Suarez (Chairman)
Commissioner Michelle Spence Jones
City Manager Johnny Martinez
City Attorney Julie O. Bru 
City Clerk Dwight S. Danie

County 
Mayor Carlos A. Gimenez
District 1 - Barbara J. Jordan
District 2 - Jean Monestime
District 3 - Audrey Edmonson
District 4 - Sally A. Heyman
District 5 - Bruno A. Barreiro
District 6 - Rebeca Sosa
District 7 - Xavier L. Suarez
District 8 - Lynda Bell
District 9 - Dennis C. Moss
District 10 - Javier D. Souto
District 11 - Juan C. Zapata
District 12 - José "Pepe" Diaz
District 13 - Esteban Bovo, Jr.

Justin

Jan 9, 2013, 3:16:22 AM
to fifty-sta...@googlegroups.com
That being said, for legislators, it's almost as if you just need to monitor the sites for name changes to see if someone was added or removed. Speaking of which, local governments typically meet 2-3 times per month, some more, some less. Accordingly, the data (votes/minutes) isn't updated on the website as frequently as federal and state data. Scrapers may actually be overkill for the majority. However, you would still probably want scrapers where prudent (e.g., larger populations, easy-to-parse sites, etc.).
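Monitoring for name changes can be as small as comparing two snapshots of a roster. The following is only a sketch under that assumption: the function name and the sample snapshots are hypothetical, and "Jane Doe" is a made-up name.

```python
def roster_changes(previous, current):
    """Return (added, removed) names between two snapshots of an officials page."""
    def normalize(name):
        # Collapse whitespace and ignore case so cosmetic edits are not reported.
        return " ".join(name.split()).casefold()

    before = {normalize(n): n for n in previous}
    after = {normalize(n): n for n in current}
    added = sorted(after[k] for k in after.keys() - before.keys())
    removed = sorted(before[k] for k in before.keys() - after.keys())
    return added, removed


last_month = ["Barbara J. Jordan", "Jean Monestime", "Audrey Edmonson"]
this_month = ["Barbara J.  Jordan", "Jean Monestime", "Jane Doe"]  # "Jane Doe" is made up
added, removed = roster_changes(last_month, this_month)
print("added:", added)
print("removed:", removed)
```

The names still have to be collected somehow (by hand or by a scraper); the comparison only flags when a closer look is needed.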

This whole copy-paste and update frequency thing reminds me of a browser scraper idea I had a few months ago but never had a good use case. 

Most of you probably use browser plugins or extensions for Firefox or Chrome. They're very powerful. Here's how one might see themselves scraping.

For illustration, I'm using Chrome and the Adblock Plus extension. The options and design would obviously be changed to reflect the Open Cities/Counties extension.

- Configure the scraper. What are you scraping (legislators, minutes, committees)? What city/county? Your authorization key, etc. etc.

- If you highlight some text and then right-click, depending on the extension, you are given an option to do something with that DOM element.


- With the right host permissions, extensions aren't limited by the same-origin policy. You could send the data to a remote API endpoint using AJAX for processing.

James Turk

Jan 9, 2013, 10:33:26 AM
to fifty-sta...@googlegroups.com
We've discussed some of these ideas before, and there is merit in them but there are a few things that they don't really account for.

* As you realized, the hard part isn't collecting the names; it is monitoring for changes.  If we were just collecting them, we could use a Wikipedia-like approach (or why bother, just update Wikipedia).
* The idea for a Chrome/Firefox extension is interesting, but these sites can change frequently and often break in subtle ways, not to mention that a Chrome extension wouldn't give you the flexibility needed to pull all of the information.  Sure, you could say "pull the first and 3rd columns of a table", but how would you say "also click the link in the 3rd or 4th column (it may vary; the HTML is sloppy) and pull the img element which has one of the following 3 class names", as our scrapers often have to do?
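As a rough illustration of the kind of per-site logic described in the second point, here is a sketch that pulls the first and third columns and looks for a link in either the 3rd or the 4th column. The markup and names are made up, and the sketch uses the standard library's `xml.etree.ElementTree`, which only accepts well-formed markup; a real scraper would need a more forgiving HTML parser.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup; real council pages are rarely this tidy.
PAGE = """
<table>
  <tr><td>Jane Doe</td><td>District 1</td><td>555-0100</td><td><a href="/doe">bio</a></td></tr>
  <tr><td>John Roe</td><td>District 2</td><td><a href="/roe">555-0101</a></td><td></td></tr>
</table>
""".strip()


def scrape_rows(markup):
    rows = []
    for tr in ET.fromstring(markup).findall(".//tr"):
        cells = tr.findall("td")
        # The detail link may sit in the 3rd or the 4th column, so check both.
        link = None
        for cell in cells[2:4]:
            anchor = cell.find(".//a")
            if anchor is not None:
                link = anchor.get("href")
                break
        rows.append({
            "name": "".join(cells[0].itertext()).strip(),
            "phone": "".join(cells[2].itertext()).strip(),
            "detail_url": link,
        })
    return rows


for row in scrape_rows(PAGE):
    print(row)
```

Even this small case needs a conditional that a point-and-click column picker could not easily express, which is the flexibility gap being described.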

This isn't meant to discourage creative thinking like this, but simply to point out that if there's one thing I've learned from working on Open States, it's that things are always harder than they first appear.

PS.  I have often wanted to write a Chrome extension to help, in fact. I don't think it'd be possible to have it do all of the work, but something that helps identify elements, so that you can experiment with XPath in the browser as you write code, has been on my todo list for a long time.  If you're interested in playing with Chrome extensions, I'd be happy to chat more about what would be useful there.  



Justin

Jan 9, 2013, 2:38:56 PM
to fifty-sta...@googlegroups.com
Cool. We can take the discussion elsewhere. Are you on IRC? If not I will email you my thoughts.

If anyone is interested and/or has experience with browser extensions feel free to email me.

David Moore :: OpenCongress

Jan 9, 2013, 2:57:09 PM
to fifty-sta...@googlegroups.com

Hi Justin, 

Agreed - an accountable, transparent gov't at all levels (federal, state, county, city, local) is necessary for a more participatory democracy. We think we'll eventually get there -- and cities offer a great point of leverage now.

--  that's where we're headed with our free, libre & open-source Ruby gem GovKit & our Rails web app OpenGovernment, now in active development with James McKinney of OpenNorth as technical lead.  

We're currently working on three cities w/ funding from the Knight Foundation and with open data from the awesome Open States team (for D.C. and states), but we're already looking ahead to expand into supporting more municipalities w/ opengovdata offerings (e.g. some of those on CFA wiki). 

Jump into #opengovernment on Freenode.net anytime to chat on IRC, and ping me anytime to connect by voice. We're a small open-source effort getting ready to make a call for more volunteer contributions this spring before our next release. We would love to collaborate on a joint funding proposal and on finding developers who share our goal of making it easy to contact elected officials at the city & state levels. 

Join our list-serv - still ramping back up - and feel free to follow our soc. media acc't @open_gov. Hope you'll get in touch w/ me & James to get involved more at the city level - and the Code For America folks too, nationwide. 

Andrew Hoppin

Jan 9, 2013, 6:41:55 PM
to fifty-sta...@googlegroups.com, fifty-sta...@googlegroups.com
Awesome.  For our part we at nuams are working on making a robust open data platform really simple and cheap (and open-source) to implement for smaller towns and counties, so that at least a larger proportion of these smaller jurisdictions will have an easier time publishing their data in a standards-compliant manner friendly to scrapers and search engines alike, and with an API to boot.

Best,
Andrew

Sent from my iPhone

Jerry Hall

Jan 9, 2013, 7:03:15 PM
to fifty-sta...@googlegroups.com
Great Andrew. There's definitely a need for a toolkit smaller municipalities can utilize and it just makes sense to develop a common ontology so we don't end up back where we're starting :)

Jerry



Justin

Jan 19, 2013, 4:31:14 PM
to fifty-sta...@googlegroups.com
In an attempt to create a schema for storing resolutions, ordinances, and votes for historical analysis, I've been reviewing a lot of local government websites. As many of you probably know, the vast majority of the votes and resolution/ordinance descriptions are stored within the minutes or agendas. I've probably looked at 150 documents/pages from various cities, counties, and meeting/committee types from around the US in the last couple of days.

In particular though, I've really been channeling my scraping and data mining efforts on Minutes for counties in South Carolina. Like many other counties and local governments throughout the US, they are in PDF format. Here are a few of my findings.

- More than 80% of the counties make their minutes available.

- However, as of 2012, about 40% of the counties store their minutes as scanned PDFs, which effectively makes parsing impractical, at best.

- The formats vary widely, even for the same county. If not from committee to committee, from month to month or year to year.

- There are a slew of other sub topics/items woven into these meetings (especially for larger bodies) but *most* have the following items, in no particular order, in common (meaning, they're identifiable and parsable):
-- meeting name, date, time, location
-- roll call
-- agenda approval, agenda amendments
-- appointments
-- reports
-- public input
-- consent agenda
-- resolutions
-- ordinances
-- executive session
-- adjournment

- Regardless of the legalities, you can't rely with confidence on SIRE- or Municode-based sites for extracting ordinance or resolution descriptions (or votes, for that matter). For example, the data from last week's or last month's council meeting probably isn't going to be there. Sometimes the sites aren't updated for months or even years.

- I'm certainly no linguistics or English major, but I was somewhat surprised to find as many grammar and spelling errors as I did. These documents are littered with them. I note this because, in an effort to normalize, it's imperative to address the irregularities in the data. The trouble is that you have to implement those measures, sometimes very specific ones, for every instance. Ideally, it would be nice to consolidate these efforts globally using backend technologies. I imagine these documents do not get the same amount of attention and proofreading as similar documents published by federal and state governments. It's evident humans do in fact create these documents. :) The NLTK library might come in handy here.

- Piggybacking on my last point, a "Robert's Rules of Order" library would also come in handy.
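One way to handle misspelled section headings globally rather than case by case is fuzzy matching against the list of common items above. This is a minimal sketch using the standard library's `difflib` in place of NLTK; the cutoff value, the function name, and the sample headings are assumptions for illustration, not something tested against real minutes.

```python
import difflib

# Canonical section names, taken from the list of items most minutes share.
SECTIONS = [
    "roll call", "agenda approval", "appointments", "reports", "public input",
    "consent agenda", "resolutions", "ordinances", "executive session", "adjournment",
]


def classify_heading(line, cutoff=0.8):
    """Map a possibly misspelled heading to a canonical section name, or None."""
    cleaned = " ".join(line.casefold().replace(":", " ").split())
    matches = difflib.get_close_matches(cleaned, SECTIONS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Made-up headings, including two typos and one unknown section.
for heading in ["ROLL CALL", "Excutive Sesion:", "Adjournement", "Old business"]:
    print(repr(heading), "->", classify_heading(heading))
```

A higher cutoff reduces false matches at the cost of missing heavier typos, so the value would need tuning on actual documents.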


James McKinney

Jan 20, 2013, 11:20:11 AM
to fifty-sta...@googlegroups.com
Hi Justin,

Minutes and agendas at the local level are indeed quite a mess! I think an important question to start with is, what are we trying to do with these documents? And to then ask, is it necessary to those goals to have a precise model of the documents' contents?

I think splitting these documents into individual items and then tagging them as "agenda amendment", "adoption of minutes", "tabling of report", etc. is too difficult a task, and I don't think it's necessary to achieve many useful goals.

For example, if we want to add upcoming agendas to Scout (https://scout.sunlightfoundation.com/) so that residents can receive notice when an upcoming agenda mentions keywords they care about (so that they can testify at the meeting to try to influence outcomes), you just need the agendas in text format. If it's straightforward to split agendas into individual items, it would be a bonus to be able to point to the specific agenda item when sending the Scout alert, but it would still be a useful service if breaking up the agenda proved too difficult.
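A keyword alert over plain agenda text could look roughly like the sketch below. It is not how Scout works internally; the function names, the numbered-line splitting rule, and the sample agenda are assumptions for illustration. When no numbered items are found, the whole agenda is treated as a single item, so the alert still fires.

```python
import re


def split_agenda_items(text):
    """Best-effort split: start a new item at each line like '3. Something'."""
    items, current = [], []
    for line in text.splitlines():
        if re.match(r"\s*\d+\.\s", line) and current:
            items.append(" ".join(current))
            current = []
        if line.strip():
            current.append(line.strip())
    if current:
        items.append(" ".join(current))
    return items


def find_alerts(text, keywords):
    """Return (keyword, item) pairs for every item that mentions a keyword."""
    alerts = []
    for item in split_agenda_items(text):
        for kw in keywords:
            if re.search(r"\b" + re.escape(kw) + r"\b", item, re.IGNORECASE):
                alerts.append((kw, item))
    return alerts


# Made-up agenda text.
AGENDA = """County Council Agenda
1. Call to order
2. Ordinance 2013-04: rezoning of parcels on Main Street
3. Resolution supporting the bike lane study
"""

for keyword, item in find_alerts(AGENDA, ["rezoning", "bike lane"]):
    print(keyword, "->", item)
```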

In terms of which documents are most interesting, I think it's more interesting and important to improve access to documents that have to do with decisions that haven't been made yet, like agendas of upcoming meetings. I think it's much less interesting to improve access to documents about decisions that have already been made (minutes, roll calls and votes), because there are far fewer opportunities for residents to have any meaningful participation in the legislative process at that point.

In terms of where to focus attention at local transparency - if there's a choice between helping people track what's been done, and helping them influence what will be done, I'll always choose the latter.

James


Steven Clift

Jan 20, 2013, 11:48:02 AM
to fifty-sta...@googlegroups.com, democr...@forums.e-democracy.org, publicm...@forums.e-democracy.org
Say, I just noticed this exchange:
https://groups.google.com/d/topic/fifty-state-project/6lrqRE_8qIY/discussion



On public meeting notices, these resources may be of use:
http://publicmeetings.info
http://www.openpublicmeetings.info/recommendations.html

Our Public Meetings online working group has 117 members to connect with:
http://forums.e-democracy.org/groups/publicmeetings

Also, in terms of information on officials or even an open data set of
the official URLs of all local government jurisdictions, I am curious
what mySociety is planning with their services (and Sunlight) noting
their Google.org announcements.

Our DemocracyMap online working group has 96 members interested in the
"who represents me" slice:
http://forums.e-democracy.org/groups/democracymap
Also see: http://democracymap.org


I am always interested in connecting these ideas to efforts with
sustained resources. When it comes to creating new open data
collections across multiple local governments related to _democracy_
and _power_ there is a huge market failure. Stuff like fire hydrants
are far easier, but if we care about impact we need the public to have
timely access (and notification) of information about upcoming
government decisions or actions that matter to them (by location,
interest, cost, etc.).

Cheers,
Steve

Justin

Jan 20, 2013, 2:37:38 PM
to fifty-sta...@googlegroups.com
One of the main focal points is identifying historical trends in voting records and participation. Most Federal and State leaders didn't make it to that level of government without serving elsewhere, at the local level. This is not their first rodeo. Over the years they've voiced their opinion and probably voted on a matter many times, be it serving as a Chairman of a board/council or participating at a local meeting as a citizen.

Before I vote someone into office, I want to be able to assess their stance, and I want my knowledge about that person to be driven not entirely by a debate, website, billboard, or TV ad but by real data. 

"The best predictor of future behavior is past behavior"

This is difficult, but not impossible. One thing is for sure, good ole regular expressions and splits aren't going to cut it. This is a great book about text processing - http://gnosis.cx/TPiP/ (particularly relevant is chapter 4).

Justin

Jan 20, 2013, 4:32:36 PM
to fifty-sta...@googlegroups.com
There are several byproducts of analyzing minutes; evaluating one's voting record is just one of them. It's great if you know an issue you feel close to is coming up on next month's agenda. But who do you go to with questions or to lobby, and why?  Because you or an acquaintance knows them? Or because they have a proven track record, backed by data, of supporting your case on similar or adjacent issues?

IMO, minutes are the proverbial bread and butter of local government. It's where the action happens, and they're one of the very few resources we have, publicly, to measure the effectiveness of local government ... and ... AFAIK, they're just idly sitting there. Everything from company and organizational alliances, public input/temperature, and who's who to ordinance and resolution relationships between local and state government is stored within the minutes.




Philip Ashlock

Jan 20, 2013, 4:34:19 PM
to fifty-sta...@googlegroups.com, democr...@forums.e-democracy.org
Cross posting this to both the Open States mailing list (http://groups.google.com/group/fifty-state-project) and the DemocracyMap mailing list (http://forums.e-democracy.org/groups/democracymap)

Continuing James' point about the incremental steps that can start to get you a minimal viable product to accomplish your goals, continuing your earlier emails about looking up elected officials, and to continue various other emails about doing basic queries for the primary information associated with local government bodies and elected officials, I thought I'd mention some updates about the DemocracyMap effort.

Justin, earlier you used Miami as an example where it would make more sense to just copy and paste information about elected officials rather than write a scraper just for the one city, but it turns out that about half the states (including Florida) have an organization that aggregates and openly publishes information about elected officials for most of the cities of the state. I've started writing scrapers for a few of these states, but I haven't written one for Florida yet ;)

I'm trying to keep track of the states that need scrapers on this wiki page (http://pages.e-democracy.org/DemocracyMap_Representatives), so please edit that if anyone wants to contribute

I'm using these scrapers and other data sources (including the Sunlight and OpenStates APIs) to feed into a sort of meta-API that aggregates, normalizes, and caches jurisdiction and representative data for the various levels of government in the US.

You can find that API, a Demo, and more information about contributing at:
http://api.democracymap.org

Here are some basic stats:
  • Primary contact information for all US States (website, address, etc)
  • All 50 Governors
  • Primary contact information for all US counties (website, address, etc)
  • All County Officials ~ 36,000 reps
  • Primary contact information for all US cities (website, address, etc)
  • Mayors for major cities ~1200
  • City Reps in California ~10,700
  • City Reps in Washington State ~4400

This work will continue to be coordinated on the democracymap mailing list, but I'd also expect it to be partially merged with the work around states, partly since both PPF and Sunlight are moving to the city level and partly because each state provides a good staging ground for aggregating city data. In fact, I'm really not interested in building any central data store of local government data, but instead a standardized way to query for local government data across each state. I'd like services like this to operate as cascading federated APIs and have already started to mock up this architecture with the DemocracyMap prototype API. Moving forward, I'm also planning to work more closely with the Boundary Services API and the work OpenNorth has been furthering. Up to this point, I've been iterating in a somewhat ad hoc manner as I get a better sense of the data that's available.

As far as the overall strategy for getting data in, I think it will start off as volunteer-managed scrapers with some manual data entry to complement that, but the long-term strategy will be to provide better data publishing tools for each state's League of Cities or state government, as well as tools for each city to publish hyperlocal jurisdiction data. Unfortunately, there are lots of other non-technical challenges as well, such as convincing some state organizations to openly publish their data rather than charge upwards of $300 for it. I think that comes with giving them tools that make it easier for them to manage the data and highlighting the fact that other states are already openly publishing the data. Much of the challenge thus far has just been finding the obscure websites that are hosting this data. There were actually a few states that publish this as raw machine-readable data, and I think the best one overall is from the state of Pennsylvania, but that was also one of the hardest ones to find.

Ok, more updates soon and I hope some of you will help write scrapers. For now, I'm off to dive into the beginnings of the inauguration festivities :)

Phil

