Check this out :) Looks like a great talk!
Pia

Open Data and Content Mining
University of Cambridge and Open Knowledge Foundation
Seminar Room, Level 2 Fisher Library, 2 pm, Wednesday October 31st, 2012
Publicly funded research in the Scientific, Technical and Medical (STM)
literature contains billions of dollars of unused value. Most
scientific articles contain names, numbers, places, chemicals,
organisms, graphs, tables, etc. which can be extracted and re-used. This
leads to better science, new information products, startup companies,
better information for policy makers and much more, which I have
estimated at "low billions" for chemistry alone. For STM, especially
medicine, the figure is much higher. Yet this is currently unavailable
for two reasons: (a) publishing uses PDF, which is a very poor way of
conveying the information; (b) publishers actively prevent mining of the
content to preserve their revenues.
We must change this, and soon, through (a) evangelism of the
opportunity, (b) lobbying for our rights and (c) building the next generation
of tools. I shall cover all these, including our Manifesto on Open
Content Mining and demonstrations of AMI2 - a weakly intelligent
amanuensis for the scientist (based initially on understanding PDFs).
This offers great opportunities for citizenry in general to liberate
this vast resource of valuable information.
All welcome, no RSVP needed
(Host: Dr Mat Todd, School of Chemistry, matthe...@sydney.edu.au)