Data Loss Prevention and Pub/Sub sendForBulkImport


Vineeth Kanaparthi

Sep 18, 2024, 3:36:05 PM
to GCP Healthcare Discuss
Hi team,

Can you please let us know what happens when sendForBulkImport is enabled?

Is it triggered for each instance import or is it triggered after an entire study/series is imported? 

Also, this article contains a diagram that mentions the Data Loss Prevention API. Is there additional documentation on how to implement this for DICOM data?

We are trying to process the study after all series and all instances of that study are imported. Can you guide us on how to achieve this?  

Thanks
Vineeth

Truc Le

Sep 18, 2024, 3:56:16 PM
to Vineeth Kanaparthi, GCP Healthcare Discuss
Hi Vineeth,

If you enable sendForBulkImport, we send a notification for every instance ingested into the DICOM store through the import method (i.e., it is triggered for each instance).
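
To illustrate what per-instance notifications mean for a subscriber, here is a minimal sketch. It assumes that each notification carries the resource path of the stored instance as its Pub/Sub message data, in the `.../dicomWeb/studies/.../series/.../instances/...` layout; please verify that against the current Healthcare API documentation. The helper name `parse_instance_path` and the `InstanceRef` type are hypothetical, not part of any Google library.

```python
import re
from typing import NamedTuple


class InstanceRef(NamedTuple):
    dicom_store: str
    study_uid: str
    series_uid: str
    instance_uid: str


# Assumed layout of the resource path carried in the notification data.
_PATH_RE = re.compile(
    r"^(?P<store>projects/[^/]+/locations/[^/]+/datasets/[^/]+/dicomStores/[^/]+)"
    r"/dicomWeb/studies/(?P<study>[^/]+)"
    r"/series/(?P<series>[^/]+)"
    r"/instances/(?P<instance>[^/]+)$"
)


def parse_instance_path(data: bytes) -> InstanceRef:
    """Split a notification payload into store path and UIDs (hypothetical helper)."""
    path = data.decode("utf-8").strip()
    match = _PATH_RE.match(path)
    if match is None:
        raise ValueError(f"unexpected notification payload: {path!r}")
    return InstanceRef(
        match["store"], match["study"], match["series"], match["instance"]
    )
```

A subscriber callback could call `parse_instance_path(message.data)` and group the results by `study_uid` or `series_uid`.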

Import does not guarantee any processing order, so we don't know when a whole series or study has been imported. If this is a new DICOM store, you can wait for the import operation to complete and then retrieve the studies through DICOMweb. Otherwise, you would need to use DICOMweb's StoreInstances to ingest each study.
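
The "wait for the import to complete" option could look roughly like the sketch below. It assumes the import call returns a long-running operation whose state exposes the standard `done` and `error` fields. The `get_operation` callable is a placeholder you would implement yourself (for example, by calling the operations `get` method with your own client); it is not a real library function.

```python
import time
from typing import Any, Callable, Dict


def wait_for_operation(
    get_operation: Callable[[], Dict[str, Any]],
    poll_seconds: float = 10.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until the operation reports done, then return its final state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        op = get_operation()  # hypothetical: fetches the current operation state
        if op.get("done"):
            if "error" in op:
                raise RuntimeError(f"import failed: {op['error']}")
            return op
        if time.monotonic() >= deadline:
            raise TimeoutError("import did not finish before the timeout")
        sleep(poll_seconds)
```

Once this returns, the studies can be listed through DICOMweb and processed. As noted above, this only identifies the newly imported studies cleanly if the store held no other data beforehand.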

Data Loss Prevention is another GCP product. Without understanding your goals, we cannot give advice.

Best,
Truc



Paul Church

Sep 18, 2024, 4:13:31 PM
to Truc Le, Vineeth Kanaparthi, GCP Healthcare Discuss
That diagram refers to the healthcare de-identification API, which integrates with DLP. See also https://cloud.google.com/healthcare-api/docs/concepts/de-identification and https://cloud.google.com/healthcare-api/docs/how-tos/dicom-deidentify for documentation related to using de-id with DICOM stores.

Truc Le

Sep 18, 2024, 5:10:07 PM
to Vineeth Kanaparthi, Paul Church, GCP Healthcare Discuss
In what ways are DICOM instances ingested into your DICOM store? Only through import, or also through StoreInstances?

If it's only through import, you could file a feature request for notifications at the study or series level instead of the instance level. The reason is that during import we process instances in parallel to maximize efficiency, so we don't know that a study is fully ingested until we have processed the entire input.

However, if your DICOM store also has data coming from StoreInstances, it's not clear to me how you define "after ingestion".

Thanks,
Truc


If you use StoreInstances, which accepts multiple files in one request, you can trigger your function once the method succeeds.
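
A rough sketch of that pattern follows. It builds a DICOMweb STOW-RS request body (`multipart/related` with `application/dicom` parts) holding several files, and runs a follow-up step only when the store call succeeds. The `post` and `on_stored` callables are hypothetical: `post` stands in for whatever authenticated HTTP client sends the body to the store's `.../dicomWeb/studies` endpoint and returns the HTTP status. Treating only status 200 as success is an assumption to check against the API documentation.

```python
import uuid
from typing import Callable, Iterable, List, Tuple


def build_stow_request(dicom_files: Iterable[bytes]) -> Tuple[str, bytes]:
    """Return (Content-Type header value, body) for a multipart STOW-RS request."""
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    parts = []
    for payload in dicom_files:
        # Each part is one complete DICOM file (Part 10 format).
        header = delimiter + b"\r\nContent-Type: application/dicom\r\n\r\n"
        parts.append(header + payload + b"\r\n")
    body = b"".join(parts) + delimiter + b"--\r\n"
    content_type = f'multipart/related; type="application/dicom"; boundary={boundary}'
    return content_type, body


def store_then_trigger(
    dicom_files: List[bytes],
    post: Callable[[str, bytes], int],  # hypothetical: sends the request, returns status
    on_stored: Callable[[], None],      # hypothetical: your downstream function
) -> bool:
    """Store all files in one request; run on_stored only if the store succeeded."""
    content_type, body = build_stow_request(dicom_files)
    status = post(content_type, body)
    if status == 200:  # assumption: 200 means every instance was stored
        on_stored()
        return True
    return False
```

Because the whole study (or series) goes out in one request, the caller knows exactly when ingestion of that batch has finished, which is the piece of information that import notifications cannot provide.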


On Wed, Sep 18, 2024 at 4:33 PM Vineeth Kanaparthi <vkana...@promaxo.com> wrote:
Thank you for your quick responses.

For additional context, we would like to trigger functions for each series (verification, ML models, etc.) after ingestion.

Truc Le, can you please elaborate on "If that is a new DICOM store, waiting for the import completion and retrieving studies by dicomweb works. Otherwise, you would need to use dicomweb's StoreInstances to ingest each study."?

I understand that import does not guarantee any order, but DICOM tags do carry information about related instances. A native Google Healthcare API feature that sends notifications based on these tags after a series/study is imported would be awesome for further processing of the study/series.

(0020,1200)  IS  Number of Patient Related Studies
(0020,1202)  IS  Number of Patient Related Series
(0020,1204)  IS  Number of Patient Related Instances
(0020,1206)  IS  Number of Study Related Series
(0020,1208)  IS  Number of Study Related Instances
(0020,1209)  IS  Number of Series Related Instances

Truc Le

Sep 18, 2024, 7:31:37 PM
to Vineeth Kanaparthi, Paul Church, GCP Healthcare Discuss
I understand your point, and from the PACS perspective it makes sense. However, as a DICOM API, it's non-trivial for us to make this assumption because:
  1. The tags above are not reliable: they are not mandatory, and their values would need to be consistent across all instances within the study/series.
  2. We need to handle the API as a whole system. For example, if a deletion happens during an import, those counters become invalid.
  3. There could be multiple ingestion workflows that add more series to a study. I understand that this is not your case, but it can happen in practice.
In any case, this would be a new feature, so I'd recommend filing a feature request.

Thanks,
Truc


On Wed, Sep 18, 2024 at 6:00 PM Vineeth Kanaparthi <vkana...@promaxo.com> wrote:
For the import job to run:
  • The data must already be in the source bucket.
  • So whoever or whatever uploads to the source bucket has to trigger the import job after the upload is done, to be sure that all the data has arrived.
  • This is an additional step.
  • Even if we use the StoreInstances API directly, we still have to do this additional step (triggering another API/function/notification to indicate that ingestion is done).
  • Some clients won't be able to do this additional step. Imagine all the different PACS systems out there that support only basic DICOMweb or DIMSE.
But if there were a native feature in the Google Healthcare API to determine that all files related to a study/series have been uploaded (by using the DICOM tags mentioned earlier in this email chain), we could piggyback on that trigger to run our other workflows.

Without this, we will have to depend on the instance-level triggers and maintain atomic counters to trigger the downstream workflow once a counter reaches a specified number.

Thanks
Vineeth
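
For reference, the counter approach described above might look like the following in-memory sketch. It tracks instance UIDs per series in a set rather than a bare counter, because Pub/Sub delivers at least once and a redelivered notification should not be counted twice. The class name `SeriesCompletionTracker` is made up for illustration. The `expected` count has to come from somewhere, such as (0020,1209), which is optional and may be inconsistent as noted earlier in the thread. A real deployment with several workers would need a shared store with atomic updates instead of a process-local lock.

```python
import threading
from typing import Callable, Dict, Set


class SeriesCompletionTracker:
    """In-memory sketch: fire a callback once a series has all expected instances."""

    def __init__(self, on_complete: Callable[[str], None]) -> None:
        self._on_complete = on_complete
        self._lock = threading.Lock()
        self._seen: Dict[str, Set[str]] = {}
        self._fired: Set[str] = set()

    def record(self, series_uid: str, instance_uid: str, expected: int) -> bool:
        """Record one notification; return True if this call completed the series."""
        with self._lock:
            seen = self._seen.setdefault(series_uid, set())
            seen.add(instance_uid)  # a set ignores redelivered notifications
            complete = len(seen) >= expected and series_uid not in self._fired
            if complete:
                self._fired.add(series_uid)  # make sure the callback runs only once
        if complete:
            # Run the callback outside the lock so slow work does not block others.
            self._on_complete(series_uid)
        return complete
```

Each instance-level notification would call `record(...)` with the UIDs taken from the message, and the downstream workflow starts from `on_complete`. This does not address the deletion and multi-workflow cases raised above, where the expected count itself can change.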