A great first day at PyCon.

Sarah Bird

Apr 13, 2014, 8:26:47 PM

to publi...@webfoundation.org

Hi all,

Just thought I'd give you a little update on what happened at PyCon today.

It started earlier than expected this morning, when a very helpful developer, Jean-Paul, helped me debug a Unicode issue I had been having with the Open Data Comparison site. Yay PyCon community, and thanks Jean-Paul (bcc'd on this mail along with others I met today).

This morning we held an "Open Space"[1] on Open Data. We had a nice turnout: somewhere between 25 and 30 people showed up over 3 hours, and we chatted about open data and started making connections for the sprints tomorrow. There was a great range of people, from those just interested in learning a little more about open data to people deeply involved: Ian and Bruce from CKAN, Bob and Drew from Sunlight Foundation, Herb Lainchbury from Open Data Society British Columbia, Jason Leveille from National Priorities Project, and many more whose full names and contact details I didn't manage to get.

We got into the format discussion and got some great perspectives from Herb, like maybe it's just easier for people to make .xls the standard format! (And maybe .odf would work as a more open middle ground!) We also touched on the different needs of publishers and users, devs and non-devs, when it comes to standardized data.

We got some great background from Jason Leveille of the National Priorities Project on the challenge they have every year trying to get the same data from the same publishers as they tweak the way it's published.

Eric Canen from the University of Wyoming talked about how messy health data can get and how hard it can be to get hold of.

Herb also talked about trying to build a top-10 list of datasets that municipalities should be publishing, and strategies for getting publishers over the hump of disclosing the first thing.

We broke into smaller groups and lots of different discussions happened that I wasn't able to be part of. Bob and Drew were a goldmine of knowledge and happily will be here for all the sprint days - we are co-sprinting with Sunlight Foundation so we get all the open data folks geeking out together.

Ana worked with Eric Canen and Dana to start digging deep into the data and they will be continuing on that tomorrow.

I also got to bump into the awesome Asheesh, who is partly responsible for last year's successful sprint, as he taught me to scrape through his PyCon talk on Scrapy.

At 5pm all the sprints were introduced, and Bob made an excellent, impassioned pitch to free the data!

After the introductions, we had a few interested people, and Brantley, Nick and I started work! Nick has been kind enough to offer to help with a lot of the kinks in Open Data Comparison, and Brantley is going to be working on the data landscape map tooling to see if we can apply some machine learning to the problem.

Am super excited for tomorrow.

All the best,

Sarah Bird

[1] An Open Space is just a time slot where anyone interested in a topic can show up and we see what happens.

--

Sarah Bird

sa...@aptivate.org
skype: birdsarah

www.linkedin.com/in/birdsarah/

Aptivate - Ethical IT for International Development

www.aptivate.org

mr...@worldbank.org

Apr 13, 2014, 9:22:19 PM

to publi...@webfoundation.org

Thanks for the brief Sarah, all sounds great. Good luck tomorrow!

Marcela

Sent from my iPhone

--
You received this message because you are subscribed to the Google Groups "Public OCDS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to public-ocds...@webfoundation.org.
To post to this group, send email to publi...@webfoundation.org.
Visit this group at http://groups.google.com/a/webfoundation.org/group/public-ocds/.
To view this discussion on the web visit https://groups.google.com/a/webfoundation.org/d/msgid/public-ocds/CA%2B-Xp4-K5wjZtRH5r0LO5QZ-xg2f8k%3DxF5%3DBw9nu3HVaYg1GoA%40mail.gmail.com.

James McKinney

Apr 13, 2014, 10:26:40 PM

to Sarah Bird, publi...@webfoundation.org

FYI, the top 10 datasets survey can be accessed here: https://www.surveymonkey.com/s/CNCY5Z8

While CSV (I don’t think anything would be lost by converting Excel to CSV) may be a viable format, the challenge is still to determine standard column headings - or, put more broadly, to enumerate the terms (the classes and properties) that will make up the specification. Putting those terms into a particular format (RDF, JSON, CSV, etc.) is the simpler part of the process. But I can certainly see how plugging into tools that publishers already use would increase adoption - whether that’s by defining an Excel template that can be exported as CSV, or coming up with other tools that can integrate with whatever other systems they are using.
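To make the column-headings point concrete, here is a minimal sketch of how a publisher's CSV export could be checked against an agreed list of headings. Everything named here is an assumption for illustration: the heading names, the `check_headings` helper and the sample data are invented, and none of them come from any actual specification.

```python
import csv
import io

# Hypothetical standard headings, invented for this example only;
# a real specification would define its own terms.
STANDARD_HEADINGS = [
    "contract_id",
    "buyer_name",
    "supplier_name",
    "award_date",
    "amount",
]


def check_headings(csv_text, expected=STANDARD_HEADINGS):
    """Compare the first row of a CSV export with the expected headings.

    Returns a (missing, unexpected) pair of lists.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    found = [h.strip() for h in header]
    missing = [h for h in expected if h not in found]
    unexpected = [h for h in found if h not in expected]
    return missing, unexpected


# A made-up export that uses "Vendor" instead of "supplier_name"
# and leaves out "award_date".
sample = "contract_id,buyer_name,Vendor,amount\n1,City Hall,Acme Ltd,1000\n"
missing, unexpected = check_headings(sample)
print("missing:", missing)
print("unexpected:", unexpected)
```

A check like this only covers the easy part (the format). Agreeing on what the headings should be, and what they mean, is still the real work.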

See you tomorrow!

James

Pascal Robichaud

Jun 29, 2014, 9:27:22 PM

to publi...@webfoundation.org, sa...@aptivate.org

Indeed, the big systems should have export tools.

Otherwise, more and more systems use data warehouses, from which the extraction could be done.

That said, part of the responsibility falls to the organizations and the software vendors.

Pascal
