How to, and best way to, search XNAT and download data with demographics

Robin Kämpe

Dec 10, 2019, 9:03:07 AM
to xnat_discussion
Hi Xnat-crew!

We have acquired a pretty decent sized database with over 1000 imaging sessions.

I would now like to search my XNAT for:

A specific range of projects which I know have healthy controls

Get their ages in a spreadsheet

Only show matches for subjects who have imaging sessions with the scans MP_RAGE_64 (Siemens T1) or T1_w (Philips T1)

Then download only the T1w for all subjects and a spreadsheet with the corresponding names and their ages.

I have played around with the advanced search function but I get a bit confused with joining columns and I can't find any search criteria for scans (e.g. only show MP_RAGE_64), I only get to choose within subject and session based properties.

Can someone link me to a good guide or let me know if there is a better way?

Thanks!!

Moore, Charlie

Dec 10, 2019, 10:56:53 AM
to xnat_di...@googlegroups.com

Robin,

 

Based on those scan types, I’m assuming you’re dealing with MR here. That looks like you’re doing a search on the Subject datatype and then joining the MR Session datatype, right? I think you can do everything you need with just the MR Session datatype. Try this:

1. Do an Advanced Search with MR Sessions as the pivot. On step 3, select your desired projects.

2. The search results should include a “Scans” column. Click the column header, and add a “LIKE” filter with the value “MP_RAGE_64”, then click “More…” and repeat for your other scan type. Note that things like “MP_RAGE_64foobar” will probably also match this. If that’s a dealbreaker, then this entire approach probably won’t work.

3. At this point you should have only the sessions in your desired list of projects with at least one of those scan types. If you click Options > Spreadsheet at this point in the table, you should get a CSV representation of the table.

   a. Note that you will get a row for every session, not every subject. Is that desired/OK?

   b. Age should be an included column, but if there are columns you don’t want in your spreadsheet, you can click any of the column headers and click “Hide Column”.

4. Finally, when you click Options > Download, it should take you to the download page with only your filtered sessions in the list to download. You can then select the scan types you wanted on that page and get your data.
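
If the substring matching noted in step 2 is a concern, one workaround is to post-filter the exported spreadsheet for exact scan-type names. A minimal Python sketch, assuming the CSV has a "Scans" column holding a comma-separated list of scan types, each optionally followed by a count in parentheses. The column layout, the session labels and the ages in the sample are made up for illustration, so check them against your own export:

```python
import csv
import io
import re

# Scan types named in the original question.
WANTED = {"MP_RAGE_64", "T1_w"}


def scan_types(cell):
    """Split a Scans cell into a set of exact scan-type names."""
    names = set()
    for token in cell.split(","):
        # Drop an optional trailing count such as " (2)" (assumed format).
        name = re.sub(r"\s*\(\d+\)\s*$", "", token).strip()
        if name:
            names.add(name)
    return names


def rows_with_exact_scan(csv_text, column="Scans", wanted=WANTED):
    """Keep only rows whose Scans cell contains a wanted type exactly."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if scan_types(row[column]) & wanted]


# Hypothetical export; the real column names may differ.
export = (
    "Label,Age,Scans\n"
    'sess_A,34,"localizer, MP_RAGE_64"\n'
    'sess_B,41,"MP_RAGE_64foobar"\n'
    'sess_C,29,"T1_w (2), T2_w"\n'
)

kept = [row["Label"] for row in rows_with_exact_scan(export)]
print(kept)  # sess_B is dropped because its scan type only starts with MP_RAGE_64
```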

 

Thanks,

Charlie

--
You received this message because you are subscribed to the Google Groups "xnat_discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to xnat_discussi...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/xnat_discussion/849fc158-02ba-43b4-9cbc-d3de48703ca3%40googlegroups.com.

 


Robin Kämpe

Dec 11, 2019, 6:32:25 AM
to xnat_discussion
Thank you!

I could probably survive with this approach =).

It would be good though if a subject could only be displayed once, since I only want 1 T1 scan from each subject and they might have 2 sessions within a project (which you asked about). Can I use wildcards? We usually denote the sessions XXXX_1 and XXXX_2 or XXXX_S1 and XXXX_S2, so could I add a filter on _1 and _S1?

I tried this for data from two projects with the sessions project1: _1, project2: _S1 and _S2, where I want to keep _1 and _S1

Adding
(Label LIKE _S1)
Only left the _S1 scans (good)

Adding to this (via OR)
(Label Like _1)
Re-added all my _1 scans (Good)

But it also re-added 2 out of the 5 _S2 scans:

Label
3451_S2
and
7310_S2

I don't see the pattern _1 in these

Otherwise this is very nice=)


Moore, Charlie

Dec 11, 2019, 10:33:41 AM
to xnat_di...@googlegroups.com

Robin,

 

I think the “_” is being interpreted by some regex along the way as a control/wildcard character when you intend it as a literal. Try putting a backslash in front of it (i.e. “\_1” and “\_S1”). That seemed to work for me on my development environment.
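
For reference, in SQL LIKE patterns `_` matches any single character and `%` matches any run of characters. The sketch below assumes the filter value ends up in such a pattern wrapped as `%value%` (an assumption about XNAT's internals, not something confirmed in this thread). Under that assumption it shows why 3451_S2 and 7310_S2 would match `_1`: both contain some character followed by a 1 ("51" and "31"). The `like` helper and the labels 2208_S2 and 0042_1 are hypothetical and only there for illustration:

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a SQL LIKE pattern into an equivalent compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # Escaped character: match the next character literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")  # % matches any run of characters
        elif ch == "_":
            parts.append(".")   # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern):
    """Return True if value matches the LIKE pattern as a whole."""
    return like_to_regex(pattern).fullmatch(value) is not None


# The first two labels come from the thread; the last two are made up.
labels = ["3451_S2", "7310_S2", "2208_S2", "0042_1"]

unescaped = [label for label in labels if like(label, "%_1%")]
escaped = [label for label in labels if like(label, r"%\_1%")]

print(unescaped)  # labels containing any character followed by "1"
print(escaped)    # labels containing a literal "_1"
```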

Robin Kämpe

Dec 11, 2019, 5:33:48 PM
to xnat_discussion
Thanks Charlie!!

One last question on this subject since another well timed problem came up. We are going to download the OASIS3 dataset (a lot of MRI data on an xnat webserver). They have a lot of different data types and we managed fine to e.g. only get the subjects with a dementia (data type CDR) score of 0.0 by using "=". One problem though:

Each subject has up to 4 sessions. The session names are always in the format d0000, where 0000 is the number of days since the start of the study or similar. The important thing is that a subject can have e.g. d0134, d0245, d0310, d0505 (scanned at reference day + 134 days, + 245 days, + 310 days and + 505 days). We only want their first scan (d0134) since we only want 1 T1w image per subject.

Is there any way to get this? I think we can display scan dates but we would still always end up with 1-4 sessions per subject but only want the earliest for each subject.

Worst case I download all 2600 T1s and delete all the later scans, but we would save spreadsheet size, time and labor if we could somehow filter. Is this mission impossible? =)

Thanks again!

Moore, Charlie

Dec 11, 2019, 6:04:33 PM
to xnat_di...@googlegroups.com

Robin,

 

Hmm, I have a method that might work depending on how strict your requirements are. Trying a search at the subject level with MR Sessions specified as “detailed” in the additional data types section will give you one row per subject, but:

1. I’m not sure what determines which session gets included in the join when a subject has >1 session in the system.

2. Because the base data type is for subjects, you won’t get the Options > Download link.

 

Snag #2 here is an annoyance, but snag #1 is the potentially fatal one. If the join happens to work exactly by taking the first session (note: first by upload date or by study date?), or if you’re fine taking an arbitrary session from each subject, then we’ve passed the fatal snag. If not… I’m not sure how to do this. It might be possible with a very fancy custom search XML, but I don’t know for sure.

 

The hacky part to actually get to the download from here would be:

1. Filter to whatever subset of subjects you actually wanted.

2. Hide every column in the table except for the one corresponding to the session label. Note that if the session labels aren’t unique site-wide, it would be better to use the actual session IDs (e.g. XNAT_EXXXXX) rather than the labels. I think you can add the column for ID instead of the label.

3. Download a spreadsheet of these session labels/IDs (Options > Spreadsheet).

4. Go to the XNAT index page (the page you land on after logging in).

   a. Find the “MR” tab in the middle of the page.

   b. Click the “Search by Exact IDs” option.

   c. Select “SESSION ID”.

   d. Paste the contents of the CSV, without the “Label” header.

5. Hopefully you should now be on a search result table with the sessions specified from before, so Options > Download should take you to the download page as before.
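
Step 4d can be scripted if the spreadsheet is large. A minimal Python sketch, assuming the exported CSV has a single header row and the session labels/IDs in its first column. The sample IDs below are made up, following the XNAT_EXXXXX shape mentioned in step 2:

```python
import csv
import io


def ids_for_pasting(csv_text, column=0):
    """Return one CSV column, header dropped, as one value per line."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    values = [
        row[column].strip()
        for row in rows[1:]  # rows[0] is the header, e.g. "Label"
        if len(row) > column and row[column].strip()
    ]
    # Remove duplicates while preserving the original order.
    unique = list(dict.fromkeys(values))
    return "\n".join(unique)


# Hypothetical export with a repeated ID and a blank line.
example = "Label\nXNAT_E00001\nXNAT_E00002\n\nXNAT_E00001\n"

print(ids_for_pasting(example))
```

The printed text can then be pasted into the "Search by Exact IDs" box.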

Robin Kämpe

Dec 13, 2019, 9:13:34 AM
to xnat_discussion
Thanks for going into this rabbit hole with me =).

I tried the approach. Both:

Performing the search on subjects (like you suggested)
and
Performing the search on subjects (like you suggested) + filter, download .csv + search again using the csv

Both seem to provide a random session.
e.g.
subject OAS30038 has these sessions:
MR_d1214
MR_d2242
MR_d3376
MR_d4495

Just searching on Subjects with MR-session and ClinicalData as additional searches gives the subject
OAS30038 with the session  d4495 (i.e. last session).

If I go further and filter, then just download the labels and search again like you described, I get the session
OAS30038_MR_d3376
i.e. the second-to-last session. When looking over the results I see no pattern, so it does not seem to work (if the goal is to get the earliest session in time). That makes some sense, since there is no date on these sessions, just the label in dXXXX format.

My suggestion might be:
to just search on MR-session with ClinicalData (detailed) and filter on CDR = 0.0 and then either:
1) remove all columns but the label (like you suggested), download the spreadsheet (i.e. ca. 4 sessions per subject), select only the earliest session per subject (e.g. using a Python script) and perform the MR search like you suggested on the labels I want (i.e. the earliest MR-session). This would probably give me just my wanted sessions!

or

2) Download all files and delete the ones I don't want.
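
The "select only the earliest session" step in option 1) could look roughly like the following Python sketch. It assumes every label follows the SUBJECT_MR_dXXXX shape seen above (e.g. OAS30038_MR_d3376) and that the smallest dXXXX value is the earliest session. The function name is made up:

```python
import re

# Assumed label shape: <subject>_MR_d<days>, e.g. OAS30038_MR_d1214.
LABEL_RE = re.compile(r"^(?P<subject>.+)_MR_d(?P<day>\d+)$")


def earliest_sessions(labels):
    """Return {subject: label}, keeping the smallest dXXXX day per subject."""
    best = {}
    for raw in labels:
        label = raw.strip()
        match = LABEL_RE.match(label)
        if match is None:
            continue  # skip the header row or labels in another format
        subject, day = match["subject"], int(match["day"])
        if subject not in best or day < best[subject][0]:
            best[subject] = (day, label)
    return {subject: label for subject, (day, label) in best.items()}


# Session labels for the two subjects discussed in this thread,
# in arbitrary order, preceded by the spreadsheet header.
labels = [
    "Label",
    "OAS30038_MR_d3376",
    "OAS30038_MR_d1214",
    "OAS30038_MR_d4495",
    "OAS30038_MR_d2242",
    "OAS30331_MR_d2824",
    "OAS30331_MR_d2229",
    "OAS30331_MR_d4694",
    "OAS30331_MR_d3478",
]

selected = earliest_sessions(labels)
print("\n".join(selected.values()))
```

Comparing the day as an integer rather than as text avoids surprises if the number of digits ever differs between labels.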

I like approach 1), but there is a problem: I'm not sure how the filtering is done and whether I can trust it, given how this database seems to be structured. When I select ClinicalData (detailed) and enter cdr 0.0 in the box prior to submitting, I only get MR-sessions with cdr column values of 0.0, which is good. But the CDR value comes from the ClinicalData sessions, which are not connected to the MR-sessions as far as I know, since they are not on exactly the same dates (dXXXX). They are independent sessions of ClinicalData type.

If I do, like above, set the CDR field to 0.0 prior to submitting the search I get, for e.g.  subject OAS30331, 4 MR-sessions:
MR_d2229
MR_d2824
MR_d3478
MR_d4694

The ages at these sessions were 73, 75, 77 and 80. All display a cdr column value of 0.0 in each of these 4 rows, implying no dementia at any of these sessions / ages.

This person has 12 ClinicalData sessions from d0000 until d4618 (a little before the last MR session). The person scores cdr 0.0 until ClinicalData_d3393, at which a score of 0.5 is reached. A year later (ClinicalData_d4183) the score is 0.5 and at ClinicalData_d4618 the score is 1.0 (dementia).
I.e. according to the ClinicalData sessions, dementia is diagnosed at 76.77 years of age.

So MR sessions d3478 and d4694 (77 and 80 years at scanning) should have a cdr of at least 0.5, which is not the case (0.0 for all).

If I do no CDR filtering, just searching on MR-session but also adding ClinicalData, then we find e.g. subject OAS30334. This subject has 2 ClinicalData sessions and 2 MR sessions (both shown in the search results).

Clinical_Sessions:
OAS30334_ClinicalData_d0000, at age 78.38, CDR = 0.5
OAS30334_ClinicalData_d0594, at age 80.01, CDR = 1.0

MR_sessions:
MR_session_d0000 at age 78.38, CDR columns says 0.5 (correct)
MR_session_d0889 at age 80.92, CDR column says 0.5 (should be 1.0).

This does not add up. So even if I filter on CDR 0.0, a subject like the one above who had CDR 0.0 at the first session and CDR 1.0 at the second could show up with cdr 0.0 for both sessions, which makes me not trust this.

I know this must be confusing for XNAT, but my point is that if I set a cdr filter of 0.0 it only displays rows with cdr 0.0. If I can trust that this at least means that one of the subject's ClinicalData sessions has CDR = 0.0, then I can use the first MR-session like I suggested and be fairly sure that CDR was 0.0 at that point in time.
If I can't trust the cdr score in the search result at all, I don't know what to do.

Thanks again!

Moore, Charlie

Dec 13, 2019, 10:34:02 AM
to xnat_di...@googlegroups.com

Robin,

 

I’ll see if I can get someone else with detailed knowledge of the joins in here to respond. However, I think the answer is unfortunately that ClinicalData and MR Sessions are siblings in the experiment hierarchy, so matching between them by date or age isn’t something that XNAT has any concept of.
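
Since the matching has to happen outside XNAT, it can be done on the exported spreadsheets. A minimal Python sketch, assuming the rule "use the most recent ClinicalData visit on or before the MR day" (one possible matching rule, not something XNAT or OASIS prescribes). The function name is made up, and the data are the OAS30334 values quoted earlier in the thread:

```python
from bisect import bisect_right


def cdr_at_scan(clinical, mr_day):
    """Return the CDR from the latest clinical visit on or before mr_day.

    clinical is a list of (day, cdr) tuples. Returns None when no
    clinical visit precedes the scan.
    """
    visits = sorted(clinical)
    days = [day for day, _ in visits]
    idx = bisect_right(days, mr_day)  # number of visits with day <= mr_day
    if idx == 0:
        return None
    return visits[idx - 1][1]


# OAS30334: ClinicalData_d0000 (CDR 0.5) and ClinicalData_d0594 (CDR 1.0).
clinical_oas30334 = [(0, 0.5), (594, 1.0)]
# MR sessions at d0000 and d0889.
mr_days = [0, 889]

result = {day: cdr_at_scan(clinical_oas30334, day) for day in mr_days}
print(result)
```

For this subject the rule gives 0.5 for the d0000 scan and 1.0 for the d0889 scan, which is the pairing expected above.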

Robin Kämpe

Dec 13, 2019, 11:06:16 AM
to xnat_discussion
I understand. It's a dataset that we just got access to and I am very confused about how to handle it without intense manual labour.

Thank you for your help!! I think I can probably filter on cdr 0.0 and download the spreadsheet, eliminate all later sessions for each subject and then input it under the MR tab to search and download. Then we can spot-check a couple of samples to confirm that these subjects had a 0.0 value in the beginning.

Thanks again!!

Robin Kämpe

Jan 9, 2020, 3:15:49 AM
to xnat_discussion
Hi again Charlie!

I managed to make this work by applying your method but in multiple steps (1: Search on clinical data for cdr = 0.0, 2: Get those subjects, who have at least one clinical session with cdr 0.0, 3: Get all sessions of those subjects, 4: Use Python to get the session with the lowest dXXXX value in the session name, 5: Search on these sessions and download).


I now plan to get all relevant data from our own XNAT database. I have made our users create subjects in an umbrella subject project and then share these to MR projects, which creates only one subject even if a person enters 2 studies.
In the umbrella project they are called their date of birth + 4 digits.
In the respective projects (project, share, label) they are called e.g. sub01, proj_name_01, placebo_01 etc.

When I search for MR-sessions in a project there is no single column which contains only this project subject name (e.g. placebo_01), only the umbrella subject name. The project name can be found under "Labels(Subject)" but it is in this format:
sub_name_in_umbrella_project, sub_name_in_study1, sub_name_in_study_2 (this person was shared to two projects/studies, but even though I only search in one study (study1), the study2 label and the umbrella label still show up).

Q: Is there no way to get only the subject name in the searched project to show up in a single column?

I can probably manage this by using the session as an identifier but just curious for future searches.

Thanks!

Moore, Charlie

Jan 9, 2020, 9:48:54 AM
to xnat_di...@googlegroups.com

Hi Robin,

 

It sounds like for this search, you’re searching in only 1 project, and you want the results (labels, etc.) to reflect the context of that project, right? I think it will probably do what you want if you bring up the project page itself and filter on the MR Sessions and/or Subjects table at the bottom. I believe those should be using the identifiers for the project page that you’re on. Give that a try and see if that addresses the issue.

Robin Kämpe

Jan 15, 2020, 2:59:35 AM
to xnat_discussion
Hej!

I searched within the project like you suggested which made it possible to filter on the subject codes. Then I joined it to MR-sessions and this provided all MR-session accession numbers. I could then copy those and paste into the search field to get all of them =).

Why is there no download option when you go to one project and search? I can only get the spreadsheet? My method worked fine but it would be one step less if I could download straight from the search within the project. We run xnat 1.7.4.1.

Thanks, problem is solved =)

Moore, Charlie

Jan 15, 2020, 10:05:49 AM
to xnat_di...@googlegroups.com

Robin,

 

Unless I’m misunderstanding, I think it behaves the same way as the site-level search. If you started at the Subject level and joined MR sessions to it, you won’t get the Download link since you have to start at the session level for that option to appear. Can you do a similar search by clicking on the “MR Sessions” tab in that table? (if it’s not available, there should be a dropdown to add listings for datatypes in that table).

Robin Kämpe

Jan 16, 2020, 4:29:21 AM
to xnat_discussion
Oooh, awesome. I had to "Add Tab" to get the download option. Then I can filter on the "Subject" column. This is great for next time.

Thanks a lot Charlie, you have been insanely helpful =).