HAPI FHIR Server Persistence Unit

Kevin Mayfield

Aug 7, 2016, 3:20:07 AM
to HAPI FHIR
I'm finding the FHIR Server very useful and I'm looking at moving it to our production environments.

The main problem I have is how long some of the lookups take. Each lookup takes around 150ms on a system with fewer than 1 million resources. My client is Apache Camel + HAPI HL7v2 calling HAPI FHIR via REST; the lookups are used to assemble the resource, so posting a resource can take 1-2 seconds.

I've tried different SQL databases, JVM configurations and load balancing across multiple application (FHIR) servers.
What I want to look at next is calling HAPI_PU directly. I think this means using hapi-fhir-jpaserver-base directly?

However, I don't think that will give me the performance increase I'm after, as I suspect it's the DB structure, not REST, that's the problem.
Most of my lookups use identifiers, which I understand are stored in the HFJ_SPIDX_TOKEN table. Its RES_TYPE and SP_SYSTEM columns are both strings; to get some performance increase I'd want to replace them with integer IDs that refer to resource-type and system lists held in separate tables [it's quicker to search integers than strings]. I would also want to make similar changes to other objects and tables.

I would like to do some more investigation and make changes to test this, ideally committing (or sharing) the code if I've made some improvement. However, I'm new to Hibernate, so my initial focus would be learning (but I'm not new to SQL and I pick things up fast). [If I get the go-ahead to use HAPI FHIR Server I should get more developer resources to help.]

What would be the best approach to contributing? 

Kevin Mayfield

Aug 7, 2016, 3:33:26 AM
to HAPI FHIR
Some background info:

We're looking at using HAPI FHIR Servers to provide FHIR APIs where we have PAS systems that don't support FHIR. These servers will be updated by HL7v2 ADT feeds.

Also to provide XDS Document Registries. If we need to support traditional XDS queries we would be mapping them to FHIR queries. So this isn't MHD where FHIR queries map to XDS.

So we're only looking at supporting a small number of resources if that makes a difference (should we use a different approach to persistence?)

James Agnew

Aug 8, 2016, 7:19:25 AM
to Kevin Mayfield, HAPI FHIR
Hi Kevin,

This is interesting. When people talk about performance with the JPA server, generally their issue is with searches that return large numbers of results. One of the areas we're actively working on right now is speeding that up (I'm building a test rig as we speak for this exact purpose). That doesn't sound like your scenario though.

I'm surprised that writes are problematic for you though; I haven't found this to be the case in the scenarios I've run. Would you be able to elaborate a bit on the specifics of what you're doing? I'm guessing from your description that you are posting a transaction to your server, and that this transaction has a bunch of conditional creates or conditional updates? Assuming so, could you share the match URL being used?

In terms of how to get started trying to work on the JPA code itself- I guess the best place to start would be to plug in a database browser to the database so you can have a look at the schema it generates. I would probably start by trying to build manual SQL queries for the things you want to do. If you have ideas for new approaches in the schema, I could probably give you pointers about where to start in the codebase for those.

Cheers,
James


Kevin Mayfield

Aug 8, 2016, 7:58:34 AM
to James Agnew, HAPI FHIR
thanks.

It's not a problem with posting resources; it's the lookups that build the resource to be added to the server. I do a lookup/search on each of the following (these are all on identifiers, to get the id, e.g. GET http://localhost:8080/hapi-fhir-jpaserver/baseStu3/Patient?identifier=urn:fhir.nhs.uk:id/NHSNumber|9876543210):

Organization
Practitioner
Episode
Encounter
Location
Patient

These take around 200ms each. 

Actual posting is taking around 80ms

The combination of the above takes between 2300ms and 4000ms depending on resource.
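
For reference, each lookup above is a FHIR token search of the form identifier=system|value. A minimal Python sketch of building that URL follows; the helper name identifier_search_url is hypothetical, and the base URL, system and value are simply the ones from the example request above. It only constructs the URL and makes no claim about how the Camel route actually does it.

```python
from urllib.parse import urlencode


def identifier_search_url(base_url, resource_type, system, value):
    """Build a FHIR token search URL: [base]/[type]?identifier=system|value."""
    # urlencode percent-encodes ':', '/' and '|' so the token survives
    # intact as a single query parameter value.
    query = urlencode({"identifier": f"{system}|{value}"})
    return f"{base_url.rstrip('/')}/{resource_type}?{query}"


url = identifier_search_url(
    "http://localhost:8080/hapi-fhir-jpaserver/baseStu3",
    "Patient",
    "urn:fhir.nhs.uk:id/NHSNumber",
    "9876543210",
)
print(url)
```

One such GET is issued per resource type in the list, which is where the per-lookup cost adds up.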

I'll have a look at going directly to the DB, but a quick test doesn't look promising:

SELECT RES_ID FROM hapifhirstu3.HFJ_SPIDX_TOKEN
where RES_TYPE='Patient' and SP_SYSTEM = 'urn:fhir.nhs.uk:id/NHSNumber' and SP_VALUE='9876543210';

It does sound like you're already looking into these issues.

Kev


James Agnew

Aug 8, 2016, 9:06:44 AM
to Kevin Mayfield, HAPI FHIR
Hi Kevin,

I wouldn't expect that specific query to be very fast, as there is no index on that combination of columns. The database should have an index named 'IDX_SP_TOKEN' on the columns RES_TYPE, SP_NAME, SP_SYSTEM, SP_VALUE, and that's the combination the JPA module uses.

How does the following query perform for you?

select * from HFJ_SPIDX_TOKEN where RES_TYPE= 'Patient' and SP_NAME='Token' and SP_SYSTEM = 'urn:fhir.nhs.uk:id/NHSNumber' and SP_VALUE='9876543210'
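
The effect of leaving SP_NAME out can be seen in a query plan. The following is a rough Python sketch using an in-memory SQLite database, not the real HAPI schema: the table is cut down to a handful of columns, the SP_NAME value 'identifier' is an assumption (substitute whatever your database actually stores), and other database engines will plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table; only the columns discussed here.
conn.executescript("""
    CREATE TABLE HFJ_SPIDX_TOKEN (
        SP_ID     INTEGER PRIMARY KEY,
        RES_ID    INTEGER,
        RES_TYPE  TEXT,
        SP_NAME   TEXT,
        SP_SYSTEM TEXT,
        SP_VALUE  TEXT
    );
    CREATE INDEX IDX_SP_TOKEN
        ON HFJ_SPIDX_TOKEN (RES_TYPE, SP_NAME, SP_SYSTEM, SP_VALUE);
""")


def plan(sql):
    """Return SQLite's query plan description for the given statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Skips SP_NAME, so the index cannot be followed past its first column.
partial_plan = plan(
    "SELECT RES_ID FROM HFJ_SPIDX_TOKEN "
    "WHERE RES_TYPE = 'Patient' "
    "AND SP_SYSTEM = 'urn:fhir.nhs.uk:id/NHSNumber' "
    "AND SP_VALUE = '9876543210'"
)

# Constrains all four indexed columns in index order.
full_plan = plan(
    "SELECT * FROM HFJ_SPIDX_TOKEN "
    "WHERE RES_TYPE = 'Patient' AND SP_NAME = 'identifier' "
    "AND SP_SYSTEM = 'urn:fhir.nhs.uk:id/NHSNumber' "
    "AND SP_VALUE = '9876543210'"
)

print("without SP_NAME:", partial_plan)
print("with SP_NAME:   ", full_plan)
```

In the second plan all four columns appear as index constraints, whereas the first can narrow by the leading column at most and has to filter the remaining rows one by one.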

Cheers,

James


Kevin Mayfield

Aug 8, 2016, 9:36:44 AM
to James Agnew, HAPI FHIR
Much better - 32 ms. :)

I'll have a look at moving my code directly on to the persistence unit or SQL. 

James Agnew

Aug 8, 2016, 9:55:50 AM
to Kevin Mayfield, HAPI FHIR
This is still weird then... If a manual search takes 32ms, I would expect the entire lookup to take not much longer than that.

Presumably this lookup when you're actually doing it in your real environment is only returning 1 result? Possibly part of the problem here is that even small search results are being paged back into the database (useful for large searches but not useful for small ones). That sounds like something worth fixing.

Out of curiosity, what is the lookup for? If you're just trying to get the ID of the resource with the given NHS number so that you can update it, would it be feasible to do all of this in a FHIR transaction using conditional create/update depending on the logic you're trying to achieve? You'll get a fair bit of optimization for free that way- most importantly the lookups are done in-memory instead of by using database paging, but you also get savings because only a single database transaction is created and there is much less HTTP back-and-forth.

(There's actually a rudimentary example of how to do this here if it's useful.)

Cheers,
James

Kevin Mayfield

Aug 8, 2016, 10:22:59 AM
to James Agnew, HAPI FHIR
If I understand this correctly, in that example:

You build a Patient resource and give it a random ID. This will be posted to the database only if no Patient with the identifier http://acme.org/mrns|12345 already exists.

The observation resource references subject/patient by this random ID. 

When this bundle is processed the observation will reference the database Id of the Patient resource (and it doesn't matter if the Patient resource existed beforehand or was newly created).
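
As a rough illustration of that description (not the linked example itself, which isn't reproduced in this thread), the transaction bundle might look like the following when built as plain JSON in Python. The Observation content and the variable names are made up for the sketch; the ifNoneExist field and the urn:uuid placeholder reference follow my understanding of the FHIR transaction rules.

```python
import json
import uuid

# Temporary IDs; the server replaces these when it processes the bundle.
patient_url = f"urn:uuid:{uuid.uuid4()}"
observation_url = f"urn:uuid:{uuid.uuid4()}"

bundle = {
    "resourceType": "Bundle",
    "type": "transaction",
    "entry": [
        {
            "fullUrl": patient_url,
            "resource": {
                "resourceType": "Patient",
                "identifier": [
                    {"system": "http://acme.org/mrns", "value": "12345"}
                ],
            },
            "request": {
                "method": "POST",
                "url": "Patient",
                # Conditional create: only create if no match exists.
                "ifNoneExist": "identifier=http://acme.org/mrns|12345",
            },
        },
        {
            "fullUrl": observation_url,
            "resource": {
                "resourceType": "Observation",
                "status": "final",
                "code": {"text": "Example observation"},  # placeholder content
                # Points at the Patient entry by its temporary ID.
                "subject": {"reference": patient_url},
            },
            "request": {"method": "POST", "url": "Observation"},
        },
    ],
}

print(json.dumps(bundle, indent=2))
```

The whole bundle is POSTed once to the server base URL, so the identifier lookup and both writes happen in a single request.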


Yes, this does sound exactly the same. 

Appreciated.
K

Kevin Mayfield

Aug 9, 2016, 7:26:14 AM
to hapi...@googlegroups.com

I'm not getting a clear answer on performance yet. It seems faster, but my results are clouded by other issues (lack of memory and processing power at present).

The code is certainly a lot cleaner and clearer.

On 8 August 2016 at 19:38, Kevin Mayfield <mayfiel...@gmail.com> wrote:
I'll have a better answer tomorrow, but it seems much faster.

Total post time seems to be 200ms, which is the same as before, but that now also includes the other handling - down from 3000ms. [It may be better as I'm running on laptop on mo]

On 8 August 2016 at 16:09, James Agnew <james...@gmail.com> wrote:
Yup, exactly.

If this does work for you, I'd love to hear about what your new performance numbers look like- as I mentioned, JPA performance is an active topic right now so hearing how people are managing with bigger datasets is nice.

Actually, for that matter if you're able to share that DB you're using and how many resources you're storing (or any other numbers you think might be interesting) I'd love to hear about it. :)

Cheers,
James