## why'd they vote that way?
[Ebonya Washington found][Washington 2007] [PDF] that politicians with
daughters tend to be better on women's issues. [David Wheeler
argues][Wheeler 2008] that state income, state dependence on fossil
fuels, and political ideology explained the voting on the
Warner-Lieberman global warming bill.
[Washington 2007]:
http://www.econ.yale.edu/faculty1/washington/genderpap10.pdf
[Wheeler 2008]: http://www.cgdev.org/content/publications/detail/16387
Your task is to write some automated code, ideally using the amazing
[TETRAD][], that analyzes votes on bills along with the other data in
watchdog and tries to explain why politicians voted the way they did.
[TETRAD]: http://www.phil.cmu.edu/projects/tetrad
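As a much cruder first pass than TETRAD's causal search, one could simply rank candidate factors by how strongly they correlate with a yes/no vote. The sketch below assumes each legislator is a dict with a 0/1 `vote` field and numeric factors. The field names (`state_income`, `fossil_share`) and all the numbers are made up for illustration and are not real watchdog data. Correlation alone says nothing about causation, so this is only a way to pick factors worth a closer look.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot explain anything
    return cov / sqrt(var_x * var_y)

def rank_factors(rows, vote_key, factor_keys):
    """Sort factors by absolute correlation with the vote, strongest first."""
    votes = [row[vote_key] for row in rows]
    scores = {
        key: pearson([row[key] for row in rows], votes)
        for key in factor_keys
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical rows: vote is 1 for yes, 0 for no.
rows = [
    {"vote": 1, "state_income": 52, "fossil_share": 0.1},
    {"vote": 1, "state_income": 48, "fossil_share": 0.2},
    {"vote": 0, "state_income": 40, "fossil_share": 0.7},
    {"vote": 0, "state_income": 45, "fossil_share": 0.6},
]

ranking = rank_factors(rows, "vote", ["state_income", "fossil_share"])
for name, score in ranking:
    print(name, round(score, 3))
```

A negative score means higher values of the factor go with more "no" votes in this toy data.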
## contribution clustering
Most campaign contribution data, like that on Open Secrets, is
[grouped into a bunch of basic industry categories][os]: Investment,
Real Estate, Entertainment, Lobbyists. These categories are
preselected, and donations are placed into them by researching the
employer of each donor, which is a time-consuming manual process.
We could automate a lot of it, perhaps, but what if we tried something
different: what if we ran [a clustering algorithm][wk] (see [this
poorly-named book][pci] for explanation and Python examples) on the
data and let the data determine which clusters are most relevant?
Obviously, we'll need humans to interpret the data at the end, but
that's a lot less work and a lot more interesting.
[os]: http://www.opensecrets.org/politicians/industries.php?cycle=2008&cid=N00001821
[wk]: http://en.wikipedia.org/wiki/Data_clustering
[pci]: http://books.theinfo.org/go/0596529325
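To make the idea concrete, here is a minimal k-means sketch in plain Python. It assumes each donor has already been turned into a vector of dollar amounts given to each of a few recipients. That representation, the sample numbers, and the choice of k-means rather than some other clustering method are all assumptions for illustration and do not come from watchdog itself.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns (cluster index per point, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an emptied cluster keeps its old centroid
                centroids[c] = mean_vector(members)
    return assignment, centroids

# Hypothetical donors: dollars given to recipients A, B, and C.
donors = [
    [500, 0, 0],
    [450, 50, 0],
    [0, 400, 300],
    [0, 350, 250],
    [480, 20, 0],
    [10, 420, 310],
]

assignment, centroids = kmeans(donors, k=2)
print(assignment)
```

The cluster numbers themselves are arbitrary. What matters is which donors end up grouped together, and a human still has to look at each group and decide what it means.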
## voter/contributor heatmaps
Soon we'll have [[vrdb|voter registration data]] and individual
contribution data for much of the country. But this is a lot of data
-- we'll want nicer ways of visualizing it.
One obvious way is through maps: show where the most registered voters
are clustered; show where political contributions come from. Fundrace
does [a version of this][fm] that I find faintly hideous. _The New
York Times_, as you might imagine, [was a bit more tasteful][nyt].
(More from those guys: [1][], [2][].) Surely we can do better. Or at
least something.
[fm]: http://fundrace.huffingtonpost.com/neighbors.php?type=city&city=manhattan
[nyt]: http://www.nytimes.com/imagepages/2004/06/23/politics/20040623_BLOCK_GRAPHIC.html
[1]: http://www.nytimes.com/interactive/2007/10/23/business/20071104_MEGACHURCH_GRAPHIC.html
[2]: http://www.nytimes.com/interactive/2007/10/23/business/20071104_MEGACHURCH_GRAPHIC.html
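Whatever ends up drawing the map, the first step is likely to be binning points into grid cells and totaling them. Here is a rough sketch of that step, assuming each contribution comes as a `(latitude, longitude, amount)` tuple. The tuple layout, the half-degree cell size, and the sample coordinates are hypothetical.

```python
from collections import Counter
from math import floor

def bin_contributions(records, cell_size):
    """Total the amounts falling in each cell_size-degree grid cell.

    Cells are keyed by (row, column) integer indexes, where
    row = floor(lat / cell_size) and column = floor(lon / cell_size).
    """
    totals = Counter()
    for lat, lon, amount in records:
        cell = (floor(lat / cell_size), floor(lon / cell_size))
        totals[cell] += amount
    return totals

# Hypothetical records: (latitude, longitude, dollars).
records = [
    (40.71, -74.00, 250),   # lower Manhattan
    (40.73, -73.99, 100),   # a few blocks away, same cell
    (34.05, -118.24, 500),  # Los Angeles
]

heat = bin_contributions(records, cell_size=0.5)
for cell, total in heat.most_common():
    print(cell, total)
```

For voter registration data, the same function works with an amount of 1 per voter, which gives a head count per cell instead of a dollar total.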
This graph should be easily created from the data we have, and a small
article using the data would be a good way to spread awareness of the
project.
It would be even cooler if we could provide tools which let users play
with these numbers in real time; then someone could share a deep link
into the arranged graph and let people explore out from there. I know
there are tools and sites which currently do something like this for
various data sets, but I can't remember enough about them to even
google them.
-Alex
This would be a fun project for a volunteer.
> It would be even cooler if we could provide tools which let users play
> with these numbers in real time; then someone could share a deep link
> into the arranged graph and let people explore out from there. I know
> there are tools and sites which currently do something like this for
> various data sets, but I can't remember enough about them to even
> google them.
You're probably thinking of swivel.com and many-eyes.com.