Understanding the test data in public HAPI server


Tim Harsch

Mar 4, 2025, 5:59:03 PM
to HAPI FHIR
Hi,
I need to pick out a few choice patients from the public HAPI server. They should meet criteria such as having >100 observations, >20 conditions, etc. In previous incarnations of the server I could just randomly sample a handful of patients until I found an interesting one. But now, with over 1.7M patients and over 4M observations in total, random sampling isn't practical without putting a lot of load on the server.

Is there a repo somewhere of the data that was constructed for the server, so I can maybe load it locally and query it with SQL?
-or-
Is there a FHIR REST query that might help in this regard?  Or maybe a combination of a few queries I could do without having to issue 100s of queries.

Thanks,
Tim
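
On the REST question, one hedged option: FHIR search supports `_summary=count`, which asks the server to return only a Bundle whose `total` holds the number of matches. That lets a candidate patient be sized up with one small request per resource type instead of paging through results. A minimal Python sketch, assuming the public R4 base URL; the helper names are made up for illustration, and this still needs a candidate patient ID to start from:

```python
import json
import urllib.parse
import urllib.request

BASE = "https://hapi.fhir.org/baseR4"  # assumed: the public HAPI R4 endpoint


def count_url(resource_type, patient_id, base=BASE):
    """Build a search URL that asks only for the number of matches."""
    query = urllib.parse.urlencode(
        {"subject": f"Patient/{patient_id}", "_summary": "count"},
        safe="/",  # keep the reference readable as Patient/<id>
    )
    return f"{base}/{resource_type}?{query}"


def total_from_bundle(bundle):
    """Return Bundle.total, or None if the server did not include it."""
    return bundle.get("total")


def fetch_count(resource_type, patient_id):
    """Issue one GET and return the match count (network call, not run on import)."""
    req = urllib.request.Request(
        count_url(resource_type, patient_id),
        headers={"Accept": "application/fhir+json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return total_from_bundle(json.load(resp))
```

The `subject` parameter applies to Observation, Condition, and Procedure in R4; AllergyIntolerance is searched by `patient` instead, so the sketch would need a small change for that type.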


James Agnew

Mar 5, 2025, 9:56:32 AM
to Tim Harsch, HAPI FHIR
Probably the easiest thing would be to create your own data using Synthea.

FWIW though, here's a patient with a lot of conditions that I've used for testing in the past: https://hapi.fhir.org/baseR4/Condition?subject=Patient/gtp101

And here's one with a lot of observations: https://hapi.fhir.org/baseR4/Observation?subject=Patient/30

Cheers,
James
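
For the Synthea route, the FHIR export writes one Bundle file per generated patient, so a patient with enough Observations, Conditions, and so on can be picked locally before anything touches the server. A rough Python sketch, assuming the export directory contains those JSON bundles (`output/fhir` is Synthea's default location as far as I know); the thresholds and helper names are illustrative only:

```python
import json
from collections import Counter
from pathlib import Path


def count_resource_types(bundle):
    """Count the resources in a FHIR Bundle by resourceType."""
    return Counter(
        entry["resource"]["resourceType"]
        for entry in bundle.get("entry", [])
        if "resource" in entry
    )


def meets_minimums(counts, minimums):
    """True if every listed resource type reaches its minimum count."""
    return all(counts.get(rtype, 0) >= n for rtype, n in minimums.items())


def find_rich_patients(export_dir, minimums):
    """Yield (file name, counts) for each patient bundle that meets the minimums."""
    for path in sorted(Path(export_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            counts = count_resource_types(json.load(fh))
        # Skip bundles without exactly one Patient (e.g. hospital or practitioner files).
        if counts.get("Patient", 0) == 1 and meets_minimums(counts, minimums):
            yield path.name, counts
```

For example, `find_rich_patients("output/fhir", {"Observation": 100, "Condition": 20})` would list the bundles matching the criteria from the original question.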


Tim Harsch

Mar 5, 2025, 3:30:19 PM
to James Agnew, HAPI FHIR
Hi James,
Thanks for the response and the sample patients. They could be useful. I would prefer to have a single patient with tens to hundreds of Observations, Conditions, Allergies, Procedures, etc. On the old dataset we had a patient named "Berry Keebler" that was pretty good, but the data has been recreated since then and he is gone now.

To be clear, I don't want the patient(s) for automated testing; we already have automated tests that run against local resources. These are just good patients for manual testing, when we pull a test patient's records from HAPI, so the impact on the server is low.

Is the code for the Synthea build process you used to create the test server's dataset available in a GitHub repository? Are the data files from the completed run available somewhere, such as S3?

Thanks,
Tim

James Agnew

Mar 5, 2025, 5:18:15 PM
to Tim Harsch, HAPI FHIR

We don't actually supply any of the data on the server. It was a blank slate when we first turned it on, and any data that has been created or deleted since was done by the public. Well, we did wipe it once a few years back, but afterwards it was the same blank slate.

Cheers,
James


Tim Harsch

Mar 5, 2025, 6:24:08 PM
to James Agnew, HAPI FHIR
Oh, OK. Thanks for clarifying. Maybe we will just upload some data for one or two users then, and if you ever wipe the server in the future we could just put them back.

Thanks,
Tim
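
For anyone wanting to do the same kind of upload: a FHIR transaction is submitted by POSTing a Bundle of type `transaction` to the server's base URL. A minimal Python sketch, assuming the public R4 base URL and a bundle already loaded as a dict; the function names are made up, and a generated patient bundle may reference Organization or Practitioner resources that need to be loaded first:

```python
import json
import urllib.request

BASE = "https://hapi.fhir.org/baseR4"  # assumed: the public HAPI R4 endpoint


def build_transaction_request(bundle, base=BASE):
    """Prepare (but do not send) a POST of a transaction Bundle to the base URL."""
    if bundle.get("resourceType") != "Bundle" or bundle.get("type") != "transaction":
        raise ValueError("expected a FHIR Bundle of type 'transaction'")
    return urllib.request.Request(
        base,
        data=json.dumps(bundle).encode("utf-8"),
        headers={
            "Content-Type": "application/fhir+json",
            "Accept": "application/fhir+json",
        },
        method="POST",
    )


def upload_bundle(bundle, base=BASE):
    """Send the transaction and return the parsed response Bundle (network call)."""
    request = build_transaction_request(bundle, base)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)
```

Keeping the source bundle files around is what makes the "put them back after a wipe" plan cheap: re-running `upload_bundle` on the same files restores the patients.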