
ID already exists in Database Sharding


Ragul Venkatesan

Apr 22, 2025, 5:27:03 AM
to HAPI FHIR
I wanted to bring to your attention an issue we are facing with our current setup of the HAPI FHIR server. We have implemented multiple PostgreSQL databases for a single HAPI FHIR server using database sharding to address performance issues related to resource searching.

However, we are encountering the following problem:

Initially, we saved 70 resources in Database 1 (DB1).
Subsequently, we saved another 70 new resources in Database 2 (DB2), during which some negative keys were generated.
When attempting to save additional resources in DB2, the system responds with an "ID already exists" error.

Could you please advise on how we might resolve this issue or suggest any best practices for managing resource IDs across sharded databases?

James Agnew

Apr 22, 2025, 10:13:16 AM
to Ragul Venkatesan, HAPI FHIR
This probably depends on how you are actually doing the sharding. Postgres doesn't have any native sharding capabilities, so you are presumably using an extension?

HAPI has been tested using Postgres' native partitioning, which is a related concept. A sample schema is available here, and requires the newly added Database Partitioning Mode (which I will caution is still a new concept and will probably require some experimentation in the starter project to get working).
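
For a rough idea of what the HAPI side involves, partitioning is switched on via a PartitionSettings bean, along the lines of the sketch below (illustrative only; the newly added Database Partitioning Mode needs its own additional configuration in the starter project):

import ca.uhn.fhir.jpa.model.config.PartitionSettings;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class PartitioningConfig {

    // Minimal sketch: enable HAPI's partitioning support. Database
    // Partitioning Mode layers on top of this and requires further setup,
    // so treat this as a starting point only.
    @Bean
    public PartitionSettings partitionSettings() {
        PartitionSettings settings = new PartitionSettings();
        settings.setPartitioningEnabled(true);
        return settings;
    }
}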

Cheers,
James


Ragul Venkatesan

Apr 23, 2025, 3:06:50 AM
to James Agnew, HAPI FHIR
Thanks for your response, James.

We extended the class org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource and created an interceptor class (shown below) to choose the database based on a request header:

@Component
@Interceptor
public class ShardInterceptor extends InterceptorAdapter {

    // Capture the shard key from the incoming request header.
    @Hook(Pointcut.SERVER_INCOMING_REQUEST_PRE_HANDLED)
    public void routeToShard(RequestDetails requestDetails) {
        ShardContextHolder.setShardKey(requestDetails.getHeader("X-Country-ID"));
    }

    // Clear the shard key once the response has been sent.
    @Hook(Pointcut.SERVER_OUTGOING_RESPONSE)
    public void clearShardContext() {
        ShardContextHolder.clear();
    }
}
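
Roughly, the routing side looks like this (a simplified sketch: ShardContextHolder is just a ThreadLocal holder, and the routing DataSource below is illustrative rather than our exact code):

// --- ShardContextHolder.java (a plain ThreadLocal holder for the shard key) ---
public class ShardContextHolder {
    private static final ThreadLocal<String> SHARD_KEY = new ThreadLocal<>();

    public static void setShardKey(String key) {
        SHARD_KEY.set(key);
    }

    public static String getShardKey() {
        return SHARD_KEY.get();
    }

    public static void clear() {
        SHARD_KEY.remove();
    }
}

// --- ShardRoutingDataSource.java ---
import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource;

// Resolves the current shard key to one of the configured target
// DataSources (e.g. keys "DB1", "DB2").
public class ShardRoutingDataSource extends AbstractRoutingDataSource {
    @Override
    protected Object determineCurrentLookupKey() {
        return ShardContextHolder.getShardKey();
    }
}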


From Database 1 (70 resources):

[screenshot of resource IDs omitted]

From Database 2 (70 resources):

[screenshot of resource IDs omitted]

The following error is thrown when attempting to post a few more resources:


{
    "resourceType": "OperationOutcome",
    "issue": [
        {
            "severity": "error",
            "code": "processing",
            "diagnostics": "HAPI-0550: could not execute batch [Batch entry 0 insert into hfj_spidx_string (hash_exact,hash_identity,hash_norm_prefix,sp_missing,sp_name,partition_date,partition_id,res_id,res_type,sp_updated,sp_value_exact,sp_value_normalized,sp_id) values (-5011986586831392247,-2891289766040777762,2328326519963105475,'FALSE','description',NULL,NULL,20,'SearchParameter','2025-04-23 06:55:23.661+00','Search Patient by phone number','SEARCH PATIENT BY PHONE NUMBER',211) was aborted: ERROR: duplicate key value violates unique constraint \"hfj_spidx_string_pkey\"\n  Detail: Key (sp_id)=(211) already exists.  Call getNextException to see other errors in the batch.] [insert into hfj_spidx_string (hash_exact,hash_identity,hash_norm_prefix,sp_missing,sp_name,partition_date,partition_id,res_id,res_type,sp_updated,sp_value_exact,sp_value_normalized,sp_id) values (?,?,?,?,?,?,?,?,?,?,?,?,?)]"
        }
    ]
}

Now I'm planning to change how primary key (ID) generation works in HAPI FHIR. Please guide me and share any tips.

Thanks and Regards,
RAGUL V

James Agnew

Apr 23, 2025, 6:34:51 AM
to Ragul Venkatesan, HAPI FHIR
Unfortunately, this kind of configuration isn't something we support in HAPI FHIR. It may well be possible to get it working, but we don't design or test for scenarios like this, so I don't really have any advice on what it would take.

My advice would be to look into whether you can achieve the outcome you're looking for using native database partitioning, paired with HAPI's partitioning, possibly even the new Database Partition Mode. This is tested and supported, so it's likely to be a much smaller lift.
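
As a very rough illustration of the HAPI-partitioning approach (instead of swapping DataSources underneath), a partition-selection interceptor can map your X-Country-ID header onto a partition name, something like the sketch below (based on the partition interceptor pattern in the partitioning docs; verify the hook signatures against your HAPI version):

import ca.uhn.fhir.interceptor.api.Hook;
import ca.uhn.fhir.interceptor.api.Interceptor;
import ca.uhn.fhir.interceptor.api.Pointcut;
import ca.uhn.fhir.interceptor.model.RequestPartitionId;
import ca.uhn.fhir.rest.api.server.RequestDetails;
import org.hl7.fhir.instance.model.api.IBaseResource;

@Interceptor
public class CountryPartitionInterceptor {

    // Writes: place the resource in the partition named after the header value.
    @Hook(Pointcut.STORAGE_PARTITION_IDENTIFY_CREATE)
    public RequestPartitionId identifyForCreate(IBaseResource theResource, RequestDetails theRequestDetails) {
        return RequestPartitionId.fromPartitionName(theRequestDetails.getHeader("X-Country-ID"));
    }

    // Reads/searches: target the same partition.
    @Hook(Pointcut.STORAGE_PARTITION_IDENTIFY_READ)
    public RequestPartitionId identifyForRead(RequestDetails theRequestDetails) {
        return RequestPartitionId.fromPartitionName(theRequestDetails.getHeader("X-Country-ID"));
    }
}

The partitions themselves need to be created up front via the partition management operations, and because all partitions live in a single database, the colliding ID sequences you're seeing don't come into play.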

If you really need fully segregated physical database shards and are open to a commercial solution, MegaScale might be an option too.

Cheers,
James