There is a project called "The Gamma", written by Tomas Petricek, that tries to answer this kind of question.
I tried to produce a visualization for your questions, but sadly I couldn't find data on opiate- and obesity-related deaths. Still, I hope you get the idea:
Also interesting (and, I think, the inspiration for "The Gamma") are Type Providers in F#, which make it possible to consume dynamic data from a statically compiled language in a type-safe way.
They differ from code generation techniques in that the type information is produced on demand, and the type definitions can be generated lazily, so a big data source (like the World Bank data) can be navigated efficiently.
You can think of a type provider as a compiler extension for an external dynamic data source.
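To make the "on demand" and "lazily" part more concrete, here is a rough Python analogy. It is not how F# type providers are implemented: they work at compile time inside the F# compiler, while this sketch only mimics the lazy navigation at runtime. The `CATALOGUE` dictionary, `fetch_members` and `LazyNode` are all made-up names for illustration, and the dictionary stands in for a remote data source.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a large remote catalogue (countries -> indicators).
# A real provider would query a web API here instead of reading a dict.
CATALOGUE: Dict[str, List[str]] = {
    "": ["Countries"],
    "Countries": ["Austria", "Germany"],
    "Countries/Austria": ["Population", "GDP"],
    "Countries/Germany": ["Population", "GDP"],
}

fetch_log: List[str] = []  # records which parts of the schema were requested


def fetch_members(path: str) -> List[str]:
    """Pretend to ask the data source which members exist under `path`."""
    fetch_log.append(path)
    return CATALOGUE.get(path, [])


class LazyNode:
    """A node whose members are discovered only when first accessed."""

    def __init__(self, path: str = "") -> None:
        self._path = path
        self._members: Optional[List[str]] = None  # not fetched yet

    def _load(self) -> List[str]:
        if self._members is None:
            self._members = fetch_members(self._path)
        return self._members

    def __dir__(self):
        # Lets a REPL list the available members (a crude autocomplete).
        return self._load()

    def __getattr__(self, name: str) -> "LazyNode":
        # Only called for names not found normally, i.e. data-driven members.
        if name.startswith("_") or name not in self._load():
            raise AttributeError(name)
        child_path = f"{self._path}/{name}" if self._path else name
        child = LazyNode(child_path)
        setattr(self, name, child)  # cache so repeated access reuses the node
        return child


root = LazyNode()
node = root.Countries.Austria
print(dir(node))   # members of Austria, fetched only now
print(fetch_log)   # the paths that were actually requested
```

Navigating to Austria only requests the schema along that path, so the entry for Germany is never fetched. A type provider gives you the same effect, with the important difference that the members are real static types checked by the compiler.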
I think the biggest obstacle is that there is no common standard (at least none that I know of) for describing public data in a way that carries semantic meaning for a computer.
As you have already mentioned, there are big chunks of raw data (without schema information consumable by computers), some difficult-to-understand APIs (I can imagine that every department and country produces its own API) and a bunch of HTML pages which are only human-readable (and which no human reads voluntarily, because the presentation is mediocre).