How to load data into the smart-dev-sandbox


Scott

Nov 27, 2020, 7:44:12 PM
to SMART on FHIR
Hi,
Is it possible to load data into the Smart Dev Sandbox? We would like to create some custom Synthea data for dev.

If so, how? I'm very new to this, so any help would be appreciated.

-Scott

Luis Sayago

Nov 28, 2020, 8:57:59 AM
to SMART on FHIR
Raj Vansia created a Python script that you can use to load patients from the Synthea files into pretty much any FHIR endpoint. I have not tried it personally yet, but I have the same use case and was planning to soon, so if it works for you, let us know!

Tim Harsch

Nov 30, 2020, 11:49:20 AM
to SMART on FHIR
I've been wanting to try this out for a while, and was very pleased to see the blog you posted, so I gave it a try just now. Overall, it shows a lot of promise. I already had the smart-dev-sandbox stack up and running, so I used that. Here's what I found on my first attempt:

1) I tried using DSTU2 with smart-dev settings R2_ENABLED=0
2) I used synthea with properties in src/main/resources/synthea.properties set to:

  • exporter.hospital.fhir_dstu2.export = true
  • exporter.fhir_dstu2.export = true
  • exporter.practitioner.fhir_dstu2.export = true

3) ./run_synthea -p 10 # 10 patients

4) The script in the blog worked great after I changed a few paths.

5) The HAPI server in smart-dev-sandbox threw errors on import, stating that some Resource types and properties weren't valid for DSTU2, and it seemed the import did not succeed.
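
For readers without the blog script at hand, the upload in step 4 can be sketched roughly as below. This is not Raj's script: it is a minimal sketch that assumes Synthea wrote transaction bundles as `.json` files and that the server accepts a bundle POSTed to its base URL. The function names and the example base URL are hypothetical.

```python
import json
import urllib.request
from pathlib import Path


def build_request(bundle_path, base_url):
    """Build a POST request that submits one bundle file to the server base URL."""
    body = Path(bundle_path).read_bytes()
    json.loads(body)  # fail early if the file is not valid JSON
    return urllib.request.Request(
        base_url.rstrip("/"),
        data=body,
        # R4 media type; older DSTU2 servers may expect application/json+fhir.
        headers={"Content-Type": "application/fhir+json"},
        method="POST",
    )


def upload_all(output_dir, base_url):
    """POST every .json file in output_dir, one request per file."""
    for path in sorted(Path(output_dir).glob("*.json")):
        with urllib.request.urlopen(build_request(path, base_url)) as resp:
            print(path.name, resp.status)


# Hypothetical usage; substitute your own output directory and FHIR base URL:
# upload_all("output/fhir", "http://localhost:8080/fhir")
```

Running it one file at a time also makes it easier to see which bundle the server rejected.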

I then switched my smart-dev stack to use R4_IMAGE=smartonfhir/hapi-5:r4-empty and changed the DSTU2 export settings in synthea.properties back to false. You have to clean the output directory before each run or the old records are left behind, so I ran `rm -rf output/*`. I then ran `./run_synthea -p 1` and the upload script. It seemed to succeed without a problem. I note that the upload of one record took quite a long time, on the order of 3-4 minutes. My previous run with 10 records took about the same amount of time, so there is probably some large init time going on.

I then used the patient browser to find the new patient (there should be only 1 at this point).  The patient shows up fine in the browser, complete with address and MRN and such.  Clicking on the patient yielded an error in the UI.  The HAPI server threw some null pointer error.

Things I want to do before I try again:

1.  git pull on both synthea and smart-dev-sandbox. My repos are probably a few months old.

2.  For smart-dev, build the Docker images locally. I went straight to `docker-compose up` to run the stack, which I think pulled the images directly from Docker Hub, and those images look like they haven't been updated in 6 months.

Hope this helps,

Tim


carl.a...@gmail.com

Nov 30, 2020, 2:38:06 PM
to SMART on FHIR
This is a problem I tried to tackle earlier this summer for the Argonaut Patient Lists initiative. I generated 1000 us-core patients in Synthea and then loaded them into an empty HAPI server, file by file, as Raj had done. The drawback with this approach is that each patient file also includes the definitions of the resources it references, so the resulting server contains many duplicate resources of certain types.

For example, if patient A has an encounter with provider Z at location M, and patient B also has an encounter with the same provider at the same location - there will be two identical resources loaded for both location M and provider Z, differing only in ID (let's call them location M' and provider Z').  Further, patient A will reference location M while patient B will reference location M'.
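
To make the M / M' problem concrete, here is a minimal sketch of one way to collapse such duplicates before loading. It is not my actual tool: it assumes resources are plain dicts, that references use the `Type/id` form, and that only the types in `SHARED_TYPES` are shared. All names here are hypothetical.

```python
import hashlib
import json

# Assumption: the resource types that get repeated across patient files.
SHARED_TYPES = {"Location", "Organization", "Practitioner"}


def fingerprint(resource):
    """Hash a resource's content while ignoring its id."""
    content = {k: v for k, v in resource.items() if k != "id"}
    canonical = json.dumps(content, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def rewrite_refs(node, id_map):
    """Recursively point 'reference' values at the surviving copy."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                node[key] = id_map.get(value, value)
            else:
                rewrite_refs(value, id_map)
    elif isinstance(node, list):
        for item in node:
            rewrite_refs(item, id_map)


def dedupe(resources):
    """Drop identical shared resources and repoint references to the first copy."""
    seen = {}    # fingerprint -> "Type/id" of the first copy
    id_map = {}  # "Type/id" of a duplicate -> "Type/id" of the first copy
    kept = []
    for res in resources:
        ref = f"{res['resourceType']}/{res['id']}"
        if res["resourceType"] in SHARED_TYPES:
            fp = fingerprint(res)
            if fp in seen:
                id_map[ref] = seen[fp]
                continue
            seen[fp] = ref
        kept.append(res)
    for res in kept:
        rewrite_refs(res, id_map)
    return kept
```

With this, patient B's encounter ends up referencing location M rather than M'. The sketch makes a single pass, so shared resources that differ only in their references to other duplicates would still be kept as separate copies.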

I realized this would not be ideal, so I wrote a fair bit of Python to make my corpus of Synthea-generated patients loadable into HAPI while also eliminating duplicate, identical shared entities.

For now, the result of that work lives here:

But I intend to refactor the tool into its own repo here:
The tool does more than just load the data, too.  It tags the loaded resources, so it's possible to delete what's been loaded from a server without affecting other resources in the server.
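
The tagging idea can be sketched as follows. This is only an illustration, not the tool's code: it assumes the data is a FHIR bundle held as a dict, and the tag system and code shown in the usage comment are made-up values.

```python
def tag_resources(bundle, system, code):
    """Add a meta.tag to every resource in the bundle (idempotent)."""
    tag = {"system": system, "code": code}
    for entry in bundle.get("entry", []):
        resource = entry.get("resource")
        if resource is None:
            continue
        tags = resource.setdefault("meta", {}).setdefault("tag", [])
        if tag not in tags:
            tags.append(dict(tag))
    return bundle


# Hypothetical usage with made-up tag values:
# tag_resources(bundle, "urn:example:load-batch", "synthea-2020-11-30")
```

Tagged resources can later be found with the standard `_tag` search parameter (for example `Patient?_tag=system|code`) and deleted without touching anything else on the server.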

One caveat I found was a bug in Synthea where the Location resources contained an ID that was not unique across patient files (leading to unavoidable duplicates), but this was fixed in September. If you're trying to load Synthea data generated by a version that predates this fix, you will still see duplicate Location resources (although nothing else should be duplicated).

If anyone here is interested in helping me kick the tires of this tool as I migrate it to a new repo, I would be grateful!

Keith Boone

Nov 30, 2020, 3:00:55 PM
to carl.a...@gmail.com, SMART on FHIR

I did something similar. It works well enough for Connectathon testing, but it needs a better solution because I never did deal with dependency checking.

You can see my solution here: https://github.com/AudaciousInquiry/hapi-fhir-jpaserver-starter/blob/master/src/main/java/com/ainq/saner/SanerServerCustomizer.java

Keith

Keith Boone, Enterprise Architect
kbo...@ainq.com | (617) 640-7007

AUDACIOUS INQUIRY
Bold Solutions for Connected Healthcare®
ainq.com|Twitter|YouTube|Facebook| LinkedIn


carl.a...@gmail.com

Nov 30, 2020, 3:06:33 PM
to SMART on FHIR
Quick update to my previous reply:
The duplicate Location bug was corrected in mid-October, not September, FWIW. So it's not in the latest Synthea release, v2.6.1, but it is merged into the master branch.

The fix was only applied to the R4 exporter. If you are interested in seeing this fix applied to STU3 or DSTU2, you may want to weigh in here:

Also, there is a blog post which talks about how to avoid duplicates by generating transaction bundles in Synthea and loading those (which is more straightforward).
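
One practical detail with that approach is load order. Here is a minimal sketch, assuming the shared hospital and practitioner bundles must be on the server before any patient bundle that refers to them, and assuming Synthea names those files with the prefixes `hospitalInformation` and `practitionerInformation`. Check both assumptions against your own output directory; the function name is hypothetical.

```python
from pathlib import Path

# Assumed file name prefixes for the shared bundles, in the order to load them.
SHARED_PREFIXES = ("hospitalInformation", "practitionerInformation")


def load_order(paths):
    """Sort bundle files so shared bundles come first, then patients by name."""
    def rank(path):
        name = Path(path).name
        for position, prefix in enumerate(SHARED_PREFIXES):
            if name.startswith(prefix):
                return position
        return len(SHARED_PREFIXES)

    return sorted(paths, key=lambda p: (rank(p), Path(p).name))
```

Feeding the sorted list to whatever upload loop you use keeps patient bundles from arriving before the organizations and practitioners they point at.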

Tim Harsch

Dec 3, 2020, 7:03:32 PM
to SMART on FHIR
I weighed in on that issue and got a response that it was fixed in the other exporters too.

michael....@gmail.com

Dec 11, 2020, 2:57:08 AM
to SMART on FHIR
FWIW I use Tag-uploader for this. https://github.com/smart-on-fhir/tag-uploader
Michael
