Hi!
I'm a statistician from Brazil. rOpenSci is really a great initiative!
Here in Brazil, elections are held electronically. That means votes are counted very efficiently (the final results come out less than three hours after the polls close). It also means that we have a great repository of election data.
The Superior Court of Elections (TSE) makes this data publicly available here: http://www.tse.jus.br/hotSites/pesquisas-eleitorais/resultados.html. The data is available by state (we have 27), in CSV format. I would like to build an R interface to query this data so that it can be easily analyzed and visualized. The two steps I can imagine to accomplish this task are:
1) Transfer the data to a public, "queryable" repository (by queryable I mean that one could download either all the data or only some parts of it. A kind of API?).
2) Create an R package with some functions to gather data and make it tidy.
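To make step 2 more concrete, here is a minimal sketch of what one such function might look like. It is only an illustration: the function name `read_state_votes` and the column names (`state`, `city`, `candidate_*`) are hypothetical, and the toy file stands in for a real download, since the actual TSE files will have a different layout.

```r
# Hypothetical helper: read one state's CSV and return a tidy (long) data frame.
# The column names used here are made up for illustration only.
read_state_votes <- function(path) {
  raw <- read.csv(path, stringsAsFactors = FALSE)

  # Assumed layout: one column of vote counts per candidate
  vote_cols <- grep("^candidate_", names(raw), value = TRUE)

  # Build one row per (state, city, candidate) instead of one column per candidate
  tidy <- do.call(rbind, lapply(vote_cols, function(col) {
    data.frame(
      state = raw$state,
      city = raw$city,
      candidate = sub("^candidate_", "", col),
      votes = raw[[col]],
      stringsAsFactors = FALSE
    )
  }))
  rownames(tidy) <- NULL
  tidy
}

# Toy file standing in for one state's download (not real election data)
toy <- tempfile(fileext = ".csv")
writeLines(c(
  "state,city,candidate_A,candidate_B",
  "SP,Sao Paulo,10,20",
  "SP,Campinas,5,7"
), toy)

votes <- read_state_votes(toy)
print(votes)
```

A function like this would only work on one state's file at a time, so it does not solve the storage question in step 1; it just shows the kind of reshaping I have in mind.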
I think the biggest problem is the data size. It's not huge, but at about 10GB it's too large to simply bundle in an R package.
My questions are
a) Am I posting these questions in the right place?
b) Is there any free solution for step 1? I considered Google BigQuery, to use with Hadley's bigrquery package, but I can't create a free, public data repository there...
c) Is rOpenSci a good place to share my package?
Thanks in advance,
Julio Trecenti
P.S.: Sorry for my bad English ;)