Subjects anonymization

david.b...@brain-dynamics.es

Jan 25, 2017, 10:59:07 AM
to xnat_discussion
Hello. I'm using XNAT 1.7.3 and I'd like to know how to achieve a full subject anonymization. If I have understood how XNAT handles the anonymization, it overrides or deletes DICOM headers according to the anonymization scripts. However, the actual patient data read from the DICOM files is used and stored in the database. If you upload a DICOM image of a new subject, it will even create a new subject with the patient's name. What I want to do is remove the patient data and maybe other data too BEFORE it is used, thus making XNAT oblivious to that information. Is there a way to do that through configuration or do I have to add or modify the code? Thanks in advance.

Cinly Ooi

Jan 25, 2017, 11:13:59 AM
to xnat_di...@googlegroups.com
If you push data to the XNAT Prearchive, a copy of your data will be placed in Prearchive, in my case /export/data/xnat/prearchive, where /export/data/xnat is the top-level directory for all XNAT data. The data stored in Prearchive is an exact copy of what you uploaded.

When you move the data into the archive, the anonymization script runs and pushes the data into the archive, so the data in Archive is anonymized as instructed by the anonymization script. The original data in Prearchive is deleted.

So, yes, there is a period of time, i.e., while the data is in Prearchive, when the data in XNAT is not anonymized.

As I understand it, this is not good enough for you. You want the data to be anonymized before XNAT gets it, i.e. you want the data in Prearchive to be anonymized.  For this, I am afraid that the only way you can do it is to anonymize the data first before sending it to XNAT.

The rule of thumb is that any webserver handling file uploads must store the original file somewhere, exactly as received, before it acts on the data. As such, if you do not want non-anonymized data to appear on your webserver, you must anonymize it yourself before uploading.

I am not sure whether tools like XNAT Desktop will do the anonymization first before sending data to XNAT.

You can use DICOMBrowser to anonymize DICOM data en masse and send it to your XNAT server.


Best Regards,
Cinly

*****
“There should not be an over-emphasis on what computers tell you, because they only tell you what you tell them to tell you,” -- Joe Sutter, Boeing 747 Chief Engineer.

--
You received this message because you are subscribed to the Google Groups "xnat_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xnat_discussion+unsubscribe@googlegroups.com.
To post to this group, send email to xnat_discussion@googlegroups.com.
Visit this group at https://groups.google.com/group/xnat_discussion.
For more options, visit https://groups.google.com/d/optout.

McKay, Mike

Jan 25, 2017, 11:45:50 AM
to xnat_di...@googlegroups.com

Another option is to upload the data using the XNAT Upload Applet or the XNAT Upload Assistant. If you upload the data via either of these methods and have saved a site-wide anonymization script, your data should be anonymized before being uploaded to XNAT. You can go to https://wiki.xnat.org/display/XW2/Step+4+of+8:+Write+Anonymization+Scripts for more information on this. Hope that helps!

 

-Mike

The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail.

Herrick, Rick

Jan 25, 2017, 2:08:17 PM
to xnat_di...@googlegroups.com

This actually isn’t how anonymization works through the receive/prearchive/archive lifecycle, or at least that description doesn’t take the site anonymization script into consideration. Incoming DICOM data is actually anonymized at two separate points in this process:

 

- As the data is received and written to the prearchive, the site-wide anonymization script is applied
- As the data is moved from the prearchive to the archive, the anonymization script for the project is applied

 

To illustrate this, I set up a site-wide anonymization script that had the following expression:

 

(0008,1090) := "SITE ANON"

 

I also set up a script for my project with this:

 

(0010,21B0) := "PROJECT ANON"

 

I then set some values in the DICOM I was sending:

 

$ dcmdump +P "0008,1090" +P "0010,21B0" ~/DICOM/test-anon/000001.dcm
(0008,1090) LO [TOTALLY PRIVATE STUFF]                  #  22, 1 ManufacturerModelName
(0010,21b0) LT [PRETTY PRIVATE STUFF]                   #  20, 1 AdditionalPatientHistory

 

I sent the data via C-STORE to XNAT and it landed in the prearchive. Looking at the data that lands in the prearchive yields:

 

$ find . -type f -name '*.dcm' | xargs dcmdump +P "0008,1090" +P "0010,21B0" | more
(0008,1090) LO [SITE ANON]                              #  10, 1 ManufacturerModelName
(0010,21b0) LT [PRETTY PRIVATE STUFF]                   #  20, 1 AdditionalPatientHistory

 

So the tag specified in the site-wide anon script has been anonymized away, while the project-specific tag still has its original value. Now I moved the data from the prearchive to the archive project. The data that lands there shows this:

 

$ find . -type f -name '*.dcm' | xargs dcmdump +P "0008,1090" +P "0010,21B0" | more
(0008,1090) LO [SITE ANON]                              #  10, 1 ManufacturerModelName
(0010,21b0) LT [PROJECT ANON]                           #  12, 1 AdditionalPatientHistory

 

The original tag values are completely gone from the system at this point: they can’t be found in log files, the database, catalog XML files, nor in the DICOM data stored in the archive. They’re also not accessible in the DICOM header dumps through UI, which are actually generated on demand and so will only return what’s in the DICOM itself.
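The two-stage behavior demonstrated above can be modeled in a few lines of Python. This is only an illustrative sketch, not XNAT code: plain dictionaries stand in for DICOM headers, and the function names (apply_script, receive_to_prearchive, move_to_archive) are made up for the example. The tag values are the ones from the test above.

```python
# Illustrative model only: plain dicts stand in for DICOM headers, and the
# two "scripts" mirror the single-assignment examples from the post above.
SITE_SCRIPT = {"(0008,1090)": "SITE ANON"}        # applied on receipt into the prearchive
PROJECT_SCRIPT = {"(0010,21B0)": "PROJECT ANON"}  # applied on the move to the archive


def apply_script(headers, script):
    """Return a copy of headers with every tag named in script overwritten."""
    updated = dict(headers)
    updated.update(script)
    return updated


def receive_to_prearchive(headers):
    # Stage 1: only the site-wide script has run at this point.
    return apply_script(headers, SITE_SCRIPT)


def move_to_archive(prearchive_headers):
    # Stage 2: the project script runs on top of the site-wide result.
    return apply_script(prearchive_headers, PROJECT_SCRIPT)


sent = {
    "(0008,1090)": "TOTALLY PRIVATE STUFF",
    "(0010,21B0)": "PRETTY PRIVATE STUFF",
}
in_prearchive = receive_to_prearchive(sent)
in_archive = move_to_archive(in_prearchive)

print(in_prearchive)
print(in_archive)
```

In this model, any tag that only the project script handles keeps its original value for as long as the data sits in the prearchive.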

 

From this, we can say that your DICOM data is completely secure and anonymized with a couple of caveats:

 

- Data en route from the sender to the DICOM receiver would be vulnerable to network sniffing or man-in-the-middle attacks (XNAT doesn’t support TLS encryption on its receiver, although that would be something that could be set up; support for TLS encryption in many sender tools is a bit spotty as well)
- If there is a significant number of DICOM tags that may contain sensitive data for one project but contain important values to retain for another project (meaning that those values can’t be anonymized or deleted at the site-wide level), those fields would be exposed for the period of time that the data sits in the prearchive before being moved to the target project
- If your anonymization scripts don’t delete or transform sensitive data values, those fields would be exposed

 

Like Mike said, you can use the upload assistant application to send data as well. That has a number of advantages:

 

- The upload assistant applies the site-wide and project-specific anonymization scripts to the data before it is sent across the wire. This means that even a MITM exploit would fail to extract sensitive information (again, presuming the anon scripts are effectively scrubbing that data).
- It uses http(s) for transfer, meaning that the reliability of the network connection is much better and more suited for long-haul data transfers (e.g. between institutions), whereas C-STORE traffic has a very high failure rate outside of fairly closely tied networks.

 

It has its disadvantages as well, since it requires a separate installation, isn’t integrated directly into most clinical workflows the way that standard DICOM composite operations are, etc.

 

HTH.

 

-- 

Rick Herrick

Sr. Programmer/Analyst

Neuroinformatics Research Group

Washington University School of Medicine

 

Daniel Marcus

Jan 25, 2017, 2:19:23 PM
to xnat_di...@googlegroups.com
Rick, thanks for the correction to Cinly's response and the very helpful explanation of XNAT's DICOM anonymization workflow. 

It's also worth mentioning a few more options:
- You can submit DICOM securely via the REST API.
- You can use RSNA's Clinical Trial Processor (CTP) as an anonymizing intermediate between the DICOM sender and XNAT.
- You can use an intermediate XNAT to receive the data, anonymize it, and then forward on to a 2nd XNAT for permanent storage.  This functionality depends on XNAT 1.7 and the minty fresh xsync plugin.
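As a rough illustration of the first option, a zipped DICOM study could be sent over HTTPS with a request along the lines below. This is a sketch under assumptions: the /data/services/import path and its query parameters are written from memory of XNAT's import service and should be checked against the API documentation for your XNAT version, and the server URL, project ID and credentials are placeholders. The function only builds the request; nothing is sent.

```python
import base64
import urllib.parse
import urllib.request


def build_import_request(base_url, project, zip_bytes, user, password):
    """Build (but do not send) a POST that uploads a zipped DICOM study.

    ASSUMPTION: the /data/services/import path and its query parameters
    are written from memory of XNAT's import service; verify them against
    the API documentation for your XNAT version before relying on this.
    """
    query = urllib.parse.urlencode({
        "import-handler": "DICOM-zip",
        "inbody": "true",
        "dest": "/prearchive/projects/" + project,
    })
    url = base_url.rstrip("/") + "/data/services/import?" + query
    token = base64.b64encode((user + ":" + password).encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=zip_bytes,
        method="POST",
        headers={
            "Content-Type": "application/zip",
            "Authorization": "Basic " + token,
        },
    )


# Hypothetical server, project and credentials, for illustration only.
request = build_import_request(
    "https://xnat.example.org", "MY_PROJECT", b"placeholder zip bytes", "user", "secret"
)
print(request.get_method(), request.full_url)
# Sending would be: urllib.request.urlopen(request)
# Use an https:// base URL so the upload is encrypted in transit.
```

Targeting the prearchive rather than the archive leaves room to review the session before anything is written to the database.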

Hopefully one of the options suggested in this thread fits your workflow!

-Dan

david.b...@brain-dynamics.es

Jan 26, 2017, 4:52:54 AM
to xnat_discussion
Thank you all for your answers. So, if I've understood it correctly, uploading images to the Prearchive using any of the upload options XNAT offers (I use the compressed uploader) will trigger the site-wide anonymization script. That means the images should be stored in the prearchive already anonymized. When I check the DICOMs stored in the prearchive using this method, I can see the anonymization script has been applied to the DICOM files. In particular, data like the patient's name has been removed.

However, in the prearchive, the subject has already been created in XNAT using the patient's name. So, XNAT reads the DICOM headers before the anonymization has taken place, creates the subject and stores it, and then proceeds to anonymize the DICOM files. How can I stop this data from being registered at all in XNAT? I guess if I could get the anonymization script to run before XNAT reads the data from the DICOM to create the subject data, the private information wouldn't be stored and the subject created would have the name I've specified in the site-wide anonymization script, right? Thanks again!

Daniel Marcus

Jan 26, 2017, 9:45:45 AM
to xnat_di...@googlegroups.com
Hi Dave,

You got the DICOM file anonymization right, but the rest is incorrect. 

When a DICOM study is in the prearchive, nothing has been written to the database yet. That's the point of the prearchive. Subjects aren't created in the XNAT database until the data are archived. 

When the study is being archived, XNAT will inspect the DICOM to figure out the subject ID. Recall that the site anonymization script will have already run, so if you're concerned about PHI, you've likely set up site anonymization to remove PHI at this point. But if there's still PHI in the DICOM, then XNAT may well guess at a subject ID that includes that PHI. On the archive page, the user has the opportunity to enter a different ID. When they submit that page, the project anonymization script is executed and the subject ID is assigned and written to the database.

Given that workflow, there are several points at which you and your users can remove PHI before anything is written to the database.
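One way to act on this before archiving is to check which patient-identifying tags still carry a value once the site-wide script has run. The sketch below is illustrative only: a dictionary stands in for the DICOM header, the tag list is an assumption rather than XNAT's own rule for deriving a subject ID, and the sample values are invented.

```python
# Tags checked here are standard DICOM patient attributes; the list itself is
# an illustrative assumption, not XNAT's own rule for deriving a subject ID.
SENSITIVE_TAGS = {
    "(0010,0010)": "PatientName",
    "(0010,0020)": "PatientID",
    "(0010,0030)": "PatientBirthDate",
}


def remaining_phi(headers):
    """Return the sensitive tags that still carry a non-empty value."""
    return {
        tag: name
        for tag, name in SENSITIVE_TAGS.items()
        if str(headers.get(tag, "")).strip()
    }


# Hypothetical headers as they might look in the prearchive after a
# site-wide script that blanks the name but leaves the patient ID alone.
prearchive_headers = {
    "(0010,0010)": "",
    "(0010,0020)": "HOSP-0042",
    "(0008,1090)": "SITE ANON",
}

leftover = remaining_phi(prearchive_headers)
print(leftover)
```

Any tag reported here would need to be handled by the site-wide script, or replaced by hand on the archive page, before the session is archived.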

-Dan

Maria de la Iglesia Vayá

Jan 26, 2017, 2:33:31 PM
to xnat_di...@googlegroups.com

Hi all


In our case, we have connected the regional PACS to our XNAT instance through Clinical Trial Processor (CTP), which is the best solution in our opinion. In CTP you can write scripts that implement Part 15 of the DICOM standard (the part covering the Attribute Confidentiality Profiles), as you can see on slide 19 of the following presentation.


I hope this helps,

----

María de la Iglesia Vayá, PhD
Deputy Directorate- General for Health Information Systems (CEIB-CS)
Regional Ministry for Health
The scientific representative of Spain in the Interim Board of EuroBioimaging (Medical Imaging)    
Brain Connectivity Lab. Neurological Impairment Program
Joint Unit  FISABIO & Prince Felipe Research Center (CIPF)
C/Eduardo Primo Yúfera (Científic), nº 3 (Junto Oceanográfico)
46012 Valencia, Spain

Daniel Marcus

Jan 26, 2017, 7:20:09 PM
to xnat_di...@googlegroups.com
Nice work Maria!

david.b...@brain-dynamics.es

Jan 27, 2017, 6:32:34 AM
to xnat_discussion
OK, I think I've got it this time. I've created an anonymization script and set it site-wide. Then, I've uploaded an image with private information to the prearchive. Through the prearchive view, as Daniel pointed out, I can overwrite the subject information and other information. Then, when I archive it, the data I can see in XNAT is properly anonymized: the subject doesn't have the name read from the DICOM and the DICOM headers have been anonymized as instructed by the site-wide script. However, if I go to the subject and view its XML, I can see, at the bottom of the XML, the data read from the DICOM before the anonymization, that is, the private information I was trying to avoid in the first place (tags like xnat:dcmPatientName).
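A check like the one described above (looking for leftover values in the XML) can be scripted. The following Python sketch scans an XML document for non-empty elements whose local name starts with "dcm". Only the element name xnat:dcmPatientName comes from this thread; the sample document, its root element and the namespace URI are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down stand-in for an XNAT XML export; only the element
# name xnat:dcmPatientName comes from the post above, and the namespace URI
# and overall layout are assumptions.
SAMPLE_XML = """
<xnat:Subject xmlns:xnat="http://nrg.wustl.edu/xnat" label="anon_subject">
  <xnat:dcmPatientName>DOE^JOHN</xnat:dcmPatientName>
</xnat:Subject>
"""


def find_dcm_fields(xml_text):
    """Return {local element name: text} for non-empty elements named dcm*."""
    found = {}
    for element in ET.fromstring(xml_text.strip()).iter():
        local_name = element.tag.rsplit("}", 1)[-1]  # drop the "{namespace}" prefix
        text = (element.text or "").strip()
        if local_name.startswith("dcm") and text:
            found[local_name] = text
    return found


print(find_dcm_fields(SAMPLE_XML))
```

Running such a check on the XML of a freshly archived test subject would show whether values read from the original DICOM are still being recorded.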

So, even though it works as expected, it doesn't quite suit my needs here. I could do what Maria and Daniel suggested before and use an intermediary between the user and my destination XNAT to perform the anonymization. However, I'm trying to keep this as simple and transparent to the users as possible, so I'm trying to stick to XNAT alone. In the end, I think the way to do what I'm trying to do is to modify the code. So far I've managed to modify the pre-archive importer so it performs the anonymization before the XML with the data read from the DICOMs is created. As this XML is the one that contains the fields I can later see by viewing a subject, it solves that aspect of the problem. I have some loose ends that need sorting out, but I think I'm on the right path, don't you think?

Thank you all for the help!

Rick Herrick

Feb 8, 2017, 1:14:25 PM
to xnat_discussion
Hey David,

We've actually managed to re-create this issue, or at least something very like it, in another context. We're looking into the cause right now and will let you know what we find out.

Qingyan Guan

Sep 3, 2018, 4:38:50 AM
to xnat_discussion
Hi Rick,

Is there a follow-up solution to this issue? I had the same XML file problem.