Delete and expunge of resources is very slow .. is this expected?


Chuck Summers

Aug 29, 2022, 8:11:22 PM
to hapi...@googlegroups.com

Hi,

 

BACKGROUND:

 

Our organization has been using the 5.6.3 HAPI-FHIR JPA server for storage and analysis of patient generated health data (PGHD).  This PGHD comes from a variety of sources: Fitbit, Garmin, Dexcom, and other wearable devices.  It is streamed into the HAPI-FHIR server throughout the day.  Independent and/or sample-series measurements are stored as FHIR R4 observations that reference back to the patient they are associated with.  This production operation has been running for almost 2 years to date.  Today there are 30M+ observations spread across 30K+ patients in our HAPI-FHIR implementation.  The observation count per patient varies widely: some patients have a handful of observations and some have 20K+ attached.

 

THE PROBLEM:

 

It has recently come to our attention that one of our applications that injects measurements has been recording them with the wrong measurement collection date-time.  The application has fixed the flaw and would like to replay what measurement data it can from archived sources.  In preparation for this replay, we are trying to completely remove the bad observations from our HAPI-FHIR database.  The staff and researchers using the HAPI-FHIR data feel it is best that ALL of the known-bad measurements be completely scrubbed from the system.  We have a complete list of the observations to remove, by resource id.  The list is 10M observations long.  You read that correctly: this is 1/3 of the present observations that we are trying to remove from an active production HAPI-FHIR service.  We have a script that calls the HAPI-FHIR API for each individual bad observation resource to soft delete and then hard delete it (i.e. expunge).  The expunge action is done exactly as documented in 5.15.1 and 5.15.5 (see instance-level expunge).  Given the complexity of the HAPI-FHIR JPA schema, it appears better to do this through HAPI-FHIR mechanics than for my team to hack away at the underlying resource, version, and spidx database tables and hope for the best.
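
For reference, here is a simplified sketch of what our script does for each observation on the bad list (the base URL and resource ids below are placeholders; the $expunge parameters follow the instance-level expunge documentation):

import requests

BASE = "http://localhost:8080/fhir"  # placeholder base URL for our HAPI-FHIR service

EXPUNGE_PARAMS = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "expungeDeletedResources", "valueBoolean": True},
        {"name": "expungePreviousVersions", "valueBoolean": True},
    ],
}

def delete_and_expunge(observation_id):
    # Step 1: soft delete (standard FHIR delete).
    requests.delete(f"{BASE}/Observation/{observation_id}").raise_for_status()
    # Step 2: hard delete via the instance-level $expunge operation.
    requests.post(
        f"{BASE}/Observation/{observation_id}/$expunge",
        json=EXPUNGE_PARAMS,
        headers={"Content-Type": "application/fhir+json"},
    ).raise_for_status()

for obs_id in ["example-id-1", "example-id-2"]:  # the real list has 10M entries
    delete_and_expunge(obs_id)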

 

IT IS VERY SLOW:

 

We began the delete operation last Friday.  Over 2+ days, the deletion / expunge rate was roughly 65K observations per day.  Deletion and expunge is occurring; however, it is unbearably slow.  We have a 4-node HAPI-FHIR cluster backed by a shared Oracle database (i.e. not Oracle RAC).  The FHIR and Oracle nodes are not under stress; we see no signs that any system or database resources are being exhausted.  We know that the delete / expunge API calls are the cause of the slowness: our script has a dry-run mechanism that goes through every step except the delete / expunge API calls, and dry-run mode executes quickly.

 

This morning we stopped the job and altered it to push 5X more delete / expunge operations in parallel at the HAPI-FHIR service.  This has been running for 6+ hours; so far the change does not look to have improved anything.  I will post more tomorrow.

 

If anyone has any thoughts or suggestions, that would be much appreciated, even if the comment is simply confirming that this is expected to be slow.

 

An alternative way to accomplish the above may be to extract / copy out the good observation data, insert it into a new HAPI-FHIR R4 service / database set-up, and then switch the new copy into production operation once the copy is complete.  We have migrated 20M+ resources this way before, to upgrade existing FHIR resource data from DSTU2 to R4 format; that took roughly 3.5 days.  In comparison, the current in-place delete / expunge rate of 65K per day for 10M resources is projected to complete in .. wait for it .. 10M / 65K per day = 153+ days .. :^(.
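
If we do go the copy-out route, the rough shape would be something like the sketch below: page through the existing server's observations, skip the ids on our bad list, and write the rest to the new server.  The URLs, page size, and bad-id set are placeholders, client-assigned ids would need to be permitted on the new server, and a real migration would also have to carry over patients and any other referenced resources first:

import requests

OLD_BASE = "http://old-server:8080/fhir"    # placeholder
NEW_BASE = "http://new-server:8080/fhir"    # placeholder
BAD_IDS = {"example-id-1", "example-id-2"}  # stand-in for our 10M-entry bad list

url = f"{OLD_BASE}/Observation?_count=500"
while url:
    bundle = requests.get(url).json()
    for entry in bundle.get("entry", []):
        resource = entry["resource"]
        if resource["id"] in BAD_IDS:
            continue  # skip the known-bad observations
        # PUT keeps the original resource id on the new server.
        requests.put(
            f"{NEW_BASE}/Observation/{resource['id']}",
            json=resource,
            headers={"Content-Type": "application/fhir+json"},
        ).raise_for_status()
    # Follow the "next" link until the search is exhausted.
    url = next((l["url"] for l in bundle.get("link", []) if l["relation"] == "next"), None)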

 

It appears to me that the HAPI-FHIR delete and expunge operations are burdened with far more work to do than insertion is.

 

Regards

Chuck


James Agnew

Aug 30, 2022, 11:22:12 AM
to Chuck Summers, HAPI FHIR
Hi Chuck,

For what it's worth, deleting is definitely not as heavily optimized in HAPI FHIR as writing is. That said:

- You aren't the only one who has noticed and commented on this. A fair bit of optimization has happened in the last few releases, so it would certainly be worth considering an upgrade if that is possible for you.

- One specific enhancement I'll point to is the "$delete-expunge" operation, which is a batch operation that skips a lot of the processing that slows down traditional deletes. This is documented here: https://smilecdr.com/docs/fhir_repository/deleting_data.html (Smile CDR docs, but this operation exists in HAPI FHIR too). A rough sketch of invoking it is shown after this list.

- If all else fails, one thing you could do is just manually mark the resources as deleted in the database using a SQL query: the HFJ_RESOURCE.RES_DELETED_AT column just needs to be non-null in order to mark a resource as deleted. If you do go down this path, you should also run a manual reindex job, which will clean up after your deletion and improve search performance overall. Until you do this, searches will still find your deleted resources, but they will be filtered from results after the fact so that clients don't ultimately see them; after you reindex, searches won't find these resources any more. A sketch of this approach is also included below.
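
To illustrate the two suggestions above: a minimal sketch of calling $delete-expunge from Python might look like this (the base URL and the search expression are placeholders; see the docs linked above for the full parameter list):

import requests

BASE = "http://localhost:8080/fhir"  # placeholder base URL

# The operation takes one or more "url" parameters, each a search
# expression selecting the resources to delete and expunge.
params = {
    "resourceType": "Parameters",
    "parameter": [
        {"name": "url", "valueString": "Observation?_id=example-id-1,example-id-2"},
    ],
}
requests.post(
    f"{BASE}/$delete-expunge",
    json=params,
    headers={"Content-Type": "application/fhir+json"},
).raise_for_status()

And the manual route is essentially a SQL update followed by a reindex. A sketch against Oracle using the python-oracledb driver is below; the connection details, the internal resource pids, and the exact reindex operation for your HAPI FHIR version are assumptions you would need to verify for your own setup:

import oracledb
import requests

conn = oracledb.connect(user="hapi", password="secret", dsn="db-host/ORCLPDB1")  # placeholders
cur = conn.cursor()

# Mark the resources as deleted; RES_DELETED_AT just needs to be non-null.
cur.executemany(
    "UPDATE HFJ_RESOURCE SET RES_DELETED_AT = SYSTIMESTAMP WHERE RES_ID = :1",
    [(12345,), (12346,)],  # placeholder internal resource pids
)
conn.commit()

# Then trigger a reindex so the search indexes catch up with the deletions
# (the operation name varies by HAPI FHIR version; check your version's docs).
requests.post("http://localhost:8080/fhir/$mark-all-resources-for-reindexing").raise_for_status()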

Cheers,
James



Rafael Weingartner-Ortner

Feb 21, 2023, 7:04:53 AM
to HAPI FHIR
I have a similar case to the one described above: I would like to
1. delete all resources of one specific resource type (DiagnosticReport)
2. delete some resources of type Observation (with a specific "code")

I want to fully delete those resources.

Thanks, James, for the information about the "$delete-expunge" operation and the description of manually marking resources as deleted in the database. I have a question about the latter: if I mark those rows as deleted in my database, are they also actually deleted at some point (via a SQL DELETE), or are they only removed from the index?
Can I execute a SQL DELETE on those rows myself, or would that corrupt the database?

Thanks, Rafael
