full Sandbox dataset


mrob...@ipro.us

Aug 30, 2018, 7:12:43 PM
to Developer Group for CMS Blue Button API
I reviewed the posts in this forum and I apologize if I missed a thread that discusses my question.  

We are trying to develop some very condition/disease-specific applications (such as an app solely addressing heart disease) and want to review some of the synthetic patients within the Blue Button sandbox that have the particular CPT codes we're looking for.

I went to the Blue Button API website at https://bluebutton.cms.gov/developers and downloaded the CSV file from the link called "CSV of 100 sample beneficiaries with rich claims data".

I was expecting to see a dataset with a sample of 100 of the 30,000 synthetic beneficiaries, but the spreadsheet that I got seems to show all 30,000 synthetic users' login credentials and some columns with claim counts for each of the beneficiaries. I attached the file to this thread as a reference.

Would it be possible to get a file with the 100 sample beneficiaries, or even better, the full 30,000?

Thanks for all of your work!
synthetic_users_by_claim_count_full.csv

karl....@cms.hhs.gov

Aug 31, 2018, 10:15:39 AM
to Developer Group for CMS Blue Button API
mroberts,

Good timing! I wrote a script yesterday to pull all of the synthetic data as ndjson (for a completely different reason, coincidentally). But here you go: synthetic_data_30000.ndjson.gz (345M compressed, 16G uncompressed). It includes all resource types (Patient, Coverage, and ExplanationOfBenefit) for all synthetic beneficiaries.

Please also see this earlier post with all of the data in relational sqlite format, which may be easier for you to analyze: https://groups.google.com/d/msg/developer-group-for-cms-blue-button-api/7hClK4aFaHY/jxQswFzEBAAJ.

[Small note: I pulled it from our backend system, not the frontend, so the URLs in there won't match what you see coming from our sandbox. Everything else should be the same, though.]
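For anyone who wants to search an extract like this for particular procedure codes, here is a minimal sketch of one way to stream through it without unpacking all 16G first. It is not the script used to produce the file. It assumes each line holds either a single FHIR resource or a Bundle of resources, and that procedure codes sit under `item[].service.coding[].code` (the STU3 layout) or `item[].productOrService.coding[].code` (the R4 layout); check a few lines of the real file before relying on either. The function names are made up for illustration.

```python
import gzip
import json
from collections import Counter
from typing import Iterator


def iter_resources(path: str) -> Iterator[dict]:
    """Yield FHIR resources from a gzipped NDJSON file, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Assumption: a line may be a single resource or a Bundle of them.
            if record.get("resourceType") == "Bundle":
                for entry in record.get("entry", []):
                    if "resource" in entry:
                        yield entry["resource"]
            else:
                yield record


def item_codes(eob: dict) -> set:
    """Collect the service codes found on an ExplanationOfBenefit's line items."""
    codes = set()
    for item in eob.get("item", []):
        # Assumption: STU3 uses "service", R4 uses "productOrService".
        concept = item.get("service") or item.get("productOrService") or {}
        for coding in concept.get("coding", []):
            if "code" in coding:
                codes.add(coding["code"])
    return codes


def scan(path: str, wanted_codes: set):
    """Count resources by type and collect patient references on matching claims."""
    counts = Counter()
    patients = set()
    for resource in iter_resources(path):
        rtype = resource.get("resourceType", "unknown")
        counts[rtype] += 1
        if rtype == "ExplanationOfBenefit" and item_codes(resource) & wanted_codes:
            ref = resource.get("patient", {}).get("reference")
            if ref:
                patients.add(ref)
    return counts, patients
```

A call such as `counts, patients = scan("synthetic_data_30000.ndjson.gz", {"93000"})` would then return the per-type resource counts and the set of patient references whose claims carry that code. The code `93000` is only an illustrative value. Reading line by line keeps memory use flat regardless of the file size.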

Best regards,
Karl M. Davis
Blue Button 2.0 API, Engineering Lead

ja...@govrock.com

Aug 31, 2018, 6:37:26 PM
to Developer Group for CMS Blue Button API
The synthetic data for 30,000 users in ndjson format would be really useful to me too!
But the link doesn't work :) Can we get a link update?

Thanks,
Jason

karl....@cms.hhs.gov

Aug 31, 2018, 7:19:37 PM
to Developer Group for CMS Blue Button API
Jason,

Whoops! Thanks for pointing that out. Here you go:
synthetic_data_30000.ndjson.gz (345M compressed, 16G uncompressed)

Best regards,
Karl M. Davis
Blue Button 2.0 API, Engineering Lead


malamahe...@gmail.com

Nov 27, 2019, 2:00:29 AM
to Developer Group for CMS Blue Button API
Awesome! I almost started writing a script on my end to automate users and fetch all of their pages of data.
Then I stopped, wondering whether I would be violating anything. We are trying to do some data analysis and wanted more data than just one Patient/Member.

Thanks for this!!

Chulmin Lee

Dec 14, 2020, 12:46:55 PM
to Developer Group for CMS Blue Button API
The link seems to be broken again. Could you please update the link one more time so I can get the synthetic data set?

Thanks!

Sasha Cuerda

Dec 15, 2020, 4:07:56 PM
to Developer Group for CMS Blue Button API
Would also be interested in this synthetic data set if someone can make it available.

Jack Williams

Dec 17, 2020, 3:57:56 PM
to Developer Group for CMS Blue Button API
All,

We are working on providing a new, updated file. Due to the holidays, it will be sometime towards the end of January before we are able to finalize it and release it to the group. As soon as it is available, we will post it for download. Thank you for your continued patience.

Regards,
Jack Williams
Blue Button 2.0 API, Developer Evangelist

Chulmin Lee

Jan 15, 2021, 7:58:52 AM
to Developer Group for CMS Blue Button API
Are there any updates on the availability of the synthetic dataset?

Jack Williams

Jan 20, 2021, 1:30:52 PM
to Developer Group for CMS Blue Button API
Good afternoon,

The latest update is that we are working towards being able to provide this file by mid-February.

Regards,
Jack Williams
Developer Evangelist - BlueButton Team

Chulmin Lee

Feb 25, 2021, 4:50:59 PM
to Developer Group for CMS Blue Button API
Have there been any updates on the availability of this?

Thanks,
Simon.

Corinne Stroum

Apr 6, 2021, 12:39:47 PM
to Developer Group for CMS Blue Button API
Hi!  I'd love this full extract if it's available.