Project idea - UI for mortality data

Evan Czaplicki

Oct 17, 2017, 4:07:31 PM

to elm-dev

A while ago I tried to make some graphs of U.S. mortality data over time. In particular, I wanted to see:

Opiate-related deaths over time by state.
Obesity-related deaths over time by state.

All of this information is publicly available, but extremely difficult to get a hold of in any useful format. For example, there is:

Decades and gigabytes of raw data.
A public API that is somewhat limited and extremely difficult to understand.
The ICD-10 system used for categorizing diseases, searchable online.

So it is "all available" and yet I do not know anyone who can ask "How many people died of X in my state in this decade?" and actually get a nice graph showing year-by-year changes. It is insane how such basic information is effectively unavailable for guiding discussions around these topics.
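As a rough illustration of the question above, here is a minimal Python sketch. It assumes the raw data has already been reduced to records with `year`, `state`, and `cause` fields. Those field names, the sample records, and the `deaths_per_year` helper are all hypothetical and do not reflect the real data layout.

```python
from collections import Counter

# Hypothetical, made-up records; the real data uses a different layout.
RECORDS = [
    {"year": 2010, "state": "OH", "cause": "X42"},
    {"year": 2010, "state": "OH", "cause": "X42"},
    {"year": 2011, "state": "OH", "cause": "X42"},
    {"year": 2011, "state": "CA", "cause": "X42"},
    {"year": 2011, "state": "OH", "cause": "E66"},
]


def deaths_per_year(records, state, cause_prefix, first_year, last_year):
    """Count deaths per year for one state and one cause-code prefix."""
    counts = Counter(
        r["year"]
        for r in records
        if r["state"] == state
        and r["cause"].startswith(cause_prefix)
        and first_year <= r["year"] <= last_year
    )
    # Include years with zero matching records so a graph has no gaps.
    return {year: counts.get(year, 0) for year in range(first_year, last_year + 1)}


print(deaths_per_year(RECORDS, "OH", "X42", 2010, 2012))
```

The resulting year-to-count mapping is the shape a year-by-year chart would need. Matching on a code prefix is only one possible way to group causes.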

Point is, I think this would be an extremely interesting project. The UI work will be interesting, scoping the project requires interesting trade-offs, and the end result brings more data to any discussion of health in the U.S. So I think this would be amazing to have, and I wanted to share the idea!

Ben Goldsmith

Oct 21, 2017, 1:08:21 PM

to elm-dev

I think you've hit the nail on the head with some of the problems of collating and viewing this data. It's often in different formats and requires skills in many disciplines just to get a basic visualization!

I was lucky enough to chat with Cesar Hidalgo, whose team at the MIT Media Lab is working on this stuff. They've built some sites that are pretty close to what you're talking about:

Data USA

Obesity map
Drug overdose maps
Anywhere it says "add to cart" is kind of misleading -- it simply lets you collate and join the data. You don't have to pay for any of it.

Data Viva (like Data USA, but for Brazil)
Observatory of Economic Complexity (visualizing trade data)
DIVE (a tool for ingesting and auto-visualizing data)

I'm sure these aren't exactly what you're looking for, but I wanted to drop them in here as inspiration or useful resources.

Tobias Burger

Oct 23, 2017, 2:01:54 PM

to elm-dev

There exists a project called "The Gamma", written by Tomas Petricek.

This project tries to answer these kinds of questions.

I've tried to produce a visualization for your questions. Sadly, I haven't found data about opiate- and obesity-related deaths, but I hope you get the idea:

https://gallery.thegamma.net/64/mortality-rate-in-usa

Also interesting (and I think the inspiration for "The Gamma") are Type Providers in F#, which make it possible to consume dynamic data from statically compiled languages in a type safe way.

It differs from code generation techniques in that the type information gets produced on demand and it is also possible to generate the type definition lazily, so a big data source (like the worldbank data) can be navigated efficiently.

You can think of a type provider as a compiler extension for an external dynamic data source.
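Python has no type providers, so the following is only a rough runtime analogue of the lazy, on-demand part described above. It is not how F# or "The Gamma" implement it, and unlike a real type provider it fails at runtime rather than at compile time. The `LazyNode` class, the `fetch` callback, and the `SCHEMA` contents are all hypothetical.

```python
class LazyNode:
    """Navigate a nested data source, listing child names only when accessed."""

    def __init__(self, fetch_children, path=()):
        self._fetch_children = fetch_children  # callable: path -> list of names
        self._path = path
        self._children = None  # not fetched until first access

    def _names(self):
        if self._children is None:
            self._children = self._fetch_children(self._path)
        return self._children

    def __dir__(self):
        # Lets interactive tools offer the child names as completions.
        return self._names()

    def __getattr__(self, name):
        # Only called for attributes that normal lookup did not find.
        if name.startswith("_") or name not in self._names():
            raise AttributeError(name)
        return LazyNode(self._fetch_children, self._path + (name,))


# Hypothetical stand-in for a large remote data source.
SCHEMA = {(): ["usa", "brazil"], ("usa",): ["mortality", "population"]}
calls = []


def fetch(path):
    calls.append(path)  # record which parts of the source were touched
    return SCHEMA.get(path, [])


root = LazyNode(fetch)
node = root.usa.mortality
print(node._path)
print(calls)
```

Only the root and the `usa` branch are queried here; the `brazil` branch is never fetched, which is the property that makes a large source cheap to navigate.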

I think the biggest obstacle is that there doesn't exist a common standard to describe public data (at least I don't know of any common standard) with any semantic meaning to a computer.

As you have already mentioned, there are big chunks of raw data (without schema information consumable by computers), some difficult-to-understand APIs (I can imagine that every department and country produces their own API), and a bunch of HTML pages that are only human-readable (and that no human reads voluntarily because the presentation is mediocre).

W. Brian Gourlie

Oct 23, 2017, 6:14:47 PM

to elm-dev

Hey Evan,

This looks like a really interesting project. I'm trying to figure out the best way to get the data into a usable state. My first instinct is to better understand how to interpret the raw data, normalize it, and then dump it into a database.
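A minimal sketch of that normalize-and-load step, assuming each raw record is a fixed-width line of text. That format is an assumption, and the column names and offsets in `LAYOUT` are made up for illustration; they are not the real layout of the data.

```python
import sqlite3

# Hypothetical fixed-width layout: (column name, start, end).
# These offsets are placeholders, not the real file format.
LAYOUT = [("year", 0, 4), ("state", 4, 6), ("cause", 6, 10)]


def parse_line(line):
    """Slice one raw line into a dict of trimmed fields."""
    row = {name: line[start:end].strip() for name, start, end in LAYOUT}
    row["year"] = int(row["year"])
    return row


def load(lines):
    """Normalize raw lines and insert them into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE deaths (year INTEGER, state TEXT, cause TEXT)")
    conn.executemany(
        "INSERT INTO deaths VALUES (:year, :state, :cause)",
        (parse_line(line) for line in lines if line.strip()),
    )
    conn.commit()
    return conn


# Made-up sample lines in the hypothetical layout.
raw = ["2010OHX42 ", "2011OHX42 ", "2011CAE66 "]
conn = load(raw)
rows = conn.execute(
    "SELECT year, COUNT(*) FROM deaths WHERE state = ? GROUP BY year ORDER BY year",
    ("OH",),
).fetchall()
print(rows)
```

Once the records are in a table, per-state and per-year questions become ordinary `GROUP BY` queries. The hard part remains working out the real layout.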

The problem with the raw data is that it's mostly opaque: the column names are pretty cryptic. Do you know if there is a resource for better understanding the raw data?

Brian

Evan Czaplicki

Oct 23, 2017, 6:32:01 PM

to elm-dev

Brian, I do not know a resource for figuring out the column names. That is part of what I got stuck on too :)

I recall seeing PDFs like this one on pages like this one that seem to be a key of some sort. I could not work it out though.

I suspect the fastest way to proceed would be to find someone who has worked with this data successfully before. I asked on Twitter here to see if anyone knows anyone. We shall see!

--
You received this message because you are subscribed to the Google Groups "elm-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elm-dev+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elm-dev/b551535f-5df0-48c7-b16c-d3e1821eeabd%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

W. Brian Gourlie

Oct 24, 2017, 1:56:27 AM

to elm-dev

Evan,

I figured out how to interpret the raw data and have documented it here: https://gist.github.com/bgourlie/044ca86397003b5bd8d1f30533322ab3

Brian
