Scaling HAPI FHIR JPA Server

Ben Li-Sauerwine

Mar 2, 2021, 4:13:20 AM

to HAPI FHIR

The documentation here claims that "Many large architectures, including enterprise messaging systems, regional data repositories, telehealth solutions, etc. have been created using HAPI FHIR JPA server as a backend. These systems often scale to handle millions of patients and beyond." I'm testing on an auto-scaling cluster in which each JPA server instance has 1 CPU and 2GB of memory, all backed by a single managed Postgres database. I start seeing instability after just a few hundred resources while importing Synthea transaction bundles for 100 patients.

For each of the 100 patient bundles, I ingested the bundle, waited for the server's response, waited a further 60 seconds for the server to stabilize, and then ingested the next one. In the end, only 57 of the 100 succeeded. I need to be able to scale to O(millions) of individual FHIR resources, so clearly I need more power behind the system.
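
For reference, the ingestion procedure can be sketched roughly as follows. This is a minimal illustration, not the exact script used: the function names `post_bundle` and `ingest_sequentially`, the 300-second request timeout, and the JSON content type are assumptions. It relies only on the standard FHIR convention that a transaction bundle is POSTed to the server's base URL.

```python
import time
import urllib.request
from pathlib import Path


def post_bundle(base_url, bundle_path, timeout=300):
    """POST one transaction bundle to the server base URL.

    Returns True on a 2xx response, False on an HTTP error,
    a connection failure, or a timeout. (Hypothetical helper.)
    """
    request = urllib.request.Request(
        base_url,
        data=Path(bundle_path).read_bytes(),
        headers={"Content-Type": "application/fhir+json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def ingest_sequentially(bundle_paths, post, pause_seconds=60, sleep=time.sleep):
    """Send bundles one at a time, pausing between consecutive bundles.

    `post` is any callable that takes a path and returns True on success.
    Returns two lists: the paths that succeeded and the paths that failed.
    """
    succeeded, failed = [], []
    for index, path in enumerate(bundle_paths):
        if index > 0:
            sleep(pause_seconds)  # let the server settle before the next bundle
        (succeeded if post(path) else failed).append(path)
    return succeeded, failed
```

With a hypothetical server URL and output directory, this could be run as `ingest_sequentially(sorted(Path("output/fhir").glob("*.json")), lambda p: post_bundle("http://localhost:8080/fhir", p))`, and the lengths of the two returned lists give the success and failure counts.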

For anyone managing millions of patients, can you provide a hint as to the size and type of backing database you use, as well as how much CPU and memory you allocate to each instance of the JPA server? If you can recommend any specific configuration settings that you found important, that would also be helpful.

Thanks!

~Ben

Xiaocheng Luan

Mar 3, 2021, 11:04:18 AM

to Ben Li-Sauerwine, HAPI FHIR

Millions of individual resources should be well within the capacity of a single HAPI server instance. "Millions of patients" may be a different story if you mean all the different resource types for those patients, which could easily total billions of resources.
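
To make the difference in scale concrete, here is a back-of-the-envelope calculation. The per-patient resource counts (100 and 1,000) are purely illustrative assumptions, not measurements from any real system.

```python
def estimate_total_resources(patients, resources_per_patient):
    """Rough total resource count: patients times average resources each."""
    return patients * resources_per_patient


# The per-patient figures below are assumed values for illustration only.
for patients in (10_000, 1_000_000):
    for per_patient in (100, 1_000):
        total = estimate_total_resources(patients, per_patient)
        print(f"{patients:,} patients x {per_patient:,} resources each = {total:,}")
```

Under these assumptions, tens of thousands of patients stay in the millions of resources, while millions of patients reach hundreds of millions to a billion.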

I'm wondering what each JPA server instance looks like in your setup. Do you use the JPA server starter package, or did you build your own service from the HAPI FHIR library (or significantly customize the existing server starter package)? We are facing similar issues and are also in search of a solution.

To view this discussion on the web visit https://groups.google.com/d/msgid/hapi-fhir/e1380fd5-dfc0-4f82-9e80-d3a847de07c0n%40googlegroups.com.

Ben Li-Sauerwine

Mar 3, 2021, 2:18:10 PM

to HAPI FHIR

In the near term, we're looking at roughly O(millions) of individual FHIR resources representing tens of thousands of patients, but we could eventually end up with many more.

In our current configuration, we're using an auto-scaling Kubernetes cluster running the JPA server, with each instance having 1 CPU and 2GB RAM. This is currently backed by a 2 vCore Postgres instance in Azure.

Clearly, I'm going to need much more in the way of computational resources. I'm hoping that someone with experience managing records for a large healthcare system can provide some hints as to the best CPU and memory parameters for the JPA server and the ideal ratio of JPA server instances to database power.

Xiaocheng Luan

Mar 3, 2021, 8:51:24 PM

to Ben Li-Sauerwine, HAPI FHIR

I'm not sure how much exploration you have done, but if you haven't already, I would suggest starting with a single 4-core/8GB instance, or better, 8-core/16GB, to get a feel for the system; it should be able to handle a few million resources. A cluster may come into the picture down the road for scalability and availability, which is something I'm starting to think about but haven't gotten anywhere with yet.
