Downloading files (using pyxnat) uploaded to experiment?


Torsten Rohlfing

May 20, 2013, 5:25:36 PM
to xnat_di...@googlegroups.com

Hi -

I am trying to use pyxnat to download files uploaded to experiments on our xnat 1.6.1 server via "tagged uploads." Somehow, I seem to be unable to get this right, probably due to my complete lack of understanding of what I am doing ;)

I can get a list of files for a given resource:

>>> print xnat.select.experiment('NCANDA_E00288').resources('3213').files().get()
['NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt', 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.edat2']

But trying to "get" one of these files gives me only a list with the file name as the sole entry:
 
>>> print xnat.select.experiment('NCANDA_E00288').resources('3213').file( 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt' ).get()
['NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt']
 

The "get_copy()" member doesn't seem to exist at all:

>>> print xnat.select.experiment('NCANDA_E00288').resources('3213').file( 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt' ).get_copy()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Files' object has no attribute 'get_copy'


Adding project and subject IDs to the query changes things, but remains unsuccessful:

>>> print xnat.select.project('xxx_incoming').subject('NCANDA_S00170').experiment('NCANDA_E00288').resource('3213').files().get()
['NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt', 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.edat2']
>>> print xnat.select.project('xxx_incoming').subject('NCANDA_S00170').experiment('NCANDA_E00288').resource('3213').file( 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt' ).get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/fs/u00/torsten/lib/python2.7/site-packages/pyxnat-0.9.4-py2.7.egg/pyxnat/core/resources.py", line 1749, in get
    raise DataError('Cannot get file: does not exists')
pyxnat.core.errors.DataError: Cannot get file: does not exists

Any help greatly appreciated!

Torsten

Torsten Rohlfing

May 20, 2013, 6:16:45 PM
to xnat_di...@googlegroups.com

So I understand now why this fails:



On Monday, May 20, 2013 2:25:36 PM UTC-7, Torsten Rohlfing wrote:

>>> print xnat.select.project('xxx_incoming').subject('NCANDA_S00170').experiment('NCANDA_E00288').resource('3213').files().get()
['NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt', 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.edat2']
>>> print xnat.select.project('xxx_incoming').subject('NCANDA_S00170').experiment('NCANDA_E00288').resource('3213').file( 'NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt' ).get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/fs/u00/torsten/lib/python2.7/site-packages/pyxnat-0.9.4-py2.7.egg/pyxnat/core/resources.py", line 1749, in get
    raise DataError('Cannot get file: does not exists')
pyxnat.core.errors.DataError: Cannot get file: does not exists



The uploaded file is in a sub-folder, "stroop", so this works:

>>> print xnat.select.project('xxx_incoming').subject('NCANDA_S00170').experiment('NCANDA_E00288').resource('3213').file( 'stroop/NCANDAStroopMtS_3cycles_7m53stask_100SD-001-5229.txt' ).get()
/var//tmp/user@host/04c3aa98d7f8d42a90ac6a9edcead8b6

Which leads me to ask a follow-up question - if I didn't know about the sub-folder (via the web interface), then how would I get pyxnat to tell me about it? There seems to be no hint of the existence of the folder, nor of its name.

Again, thanks!

Torsten

Rick Herrick

May 21, 2013, 2:10:56 PM
to xnat_di...@googlegroups.com
Hey Torsten,

What you're seeing is, I *think*, a bug. I qualify that because I'm not sure whether we had the ability to hit a resource folder by label in 1.6.1, although I think we did. Anyway, this is resolved in the latest development tip.

I can talk about this in terms of the URLs for the REST API, but I haven't really done much pyxnat work with resources, so someone else may know something about how pyxnat handles these and can chime in on that.

That number that you're seeing is the abstract resource ID for the resource folder. You should be able to reference a given resource folder via either the abstract resource ID *or* the resource folder label. You can get that info with this call:


That will return something like this:

{"ResultSet": {
    "Result": [
        {
            "cat_id": "XNAT_E00001",
            "element_name": "xnat:resourceCatalog",
            "category": "resources",
            "xnat_abstractresource_id": "29",
            "label": "res1",
            "cat_desc": " "
        }
    ],
    "totalRecords": "0",
    "title": "Resources"
}}
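
As a rough illustration of how a response like the one above could be used, here is a minimal Python sketch that maps each resource folder label to its abstract resource ID. The sample payload is copied from the response shown above; the function name is made up for this example, and fetching the JSON from the server is deliberately left out.

```python
import json

# Sample payload copied from the response shown above.
SAMPLE = """
{"ResultSet": {
    "Result": [
        {
            "cat_id": "XNAT_E00001",
            "element_name": "xnat:resourceCatalog",
            "category": "resources",
            "xnat_abstractresource_id": "29",
            "label": "res1",
            "cat_desc": " "
        }
    ],
    "totalRecords": "0",
    "title": "Resources"
}}
"""


def resource_ids_by_label(payload):
    """Map each resource folder label to its abstract resource ID.

    Hypothetical helper for illustration; not part of pyxnat or XNAT.
    """
    rows = json.loads(payload)["ResultSet"]["Result"]
    return {row["label"]: row["xnat_abstractresource_id"] for row in rows}


ids = resource_ids_by_label(SAMPLE)
```

With the sample above, `ids["res1"]` is the string "29", which is the value that would go into the resource part of the URL.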

You can then use the abstract resource ID or the label (in the dev tip) to retrieve the contents of that resource folder:


Again, there may be some convenience functions for pyxnat to resolve this, but that may also be getting stymied by the bug where we lost support for retrieving by resource folder label.

Torsten Rohlfing

May 21, 2013, 4:06:43 PM
to xnat_di...@googlegroups.com
Hi Rick -

To me, this *seems* like a pyxnat bug. Or at least something pyxnat should be able to sort out.

When I get a resource like so:

https://SERVER/xnat/data/experiments/ID_E00300/resources/3454/

I get something like this:

<cat:Catalog ID="QA">
<cat:metaFields>
<cat:metaField name="AUDIT">6758:Thu May 16 13:54:23 PDT 2013=Removed:1|5459:Wed May 08 19:29:58 PDT 2013=Added:1|5458:Wed May 08 19:29:58 PDT 2013=Added:1|6757:Thu May 16 13:54:23 PDT 2013=Removed:1|5457:Wed May 08 19:29:11 PDT 2013=Added:1|6756:Thu May 16 13:53:40 PDT 2013=Removed:1</cat:metaField>
</cat:metaFields>
<cat:entries>
<cat:entry ID="QA/t1.nii.gz" URI="t1.nii.gz" content="ADNI Phantom QA File" createdBy="RestAPI" createdEventId="5457" createdTime="2013-05-08T19:29:11.203" format="nifti_gz" modifiedBy="RestAPI" modifiedEventId="6756" modifiedTime="2013-05-16T13:53:40.065" name="t1.nii.gz"><cat:tags><cat:tag>qa adni nifti_gz</cat:tag></cat:tags>
</cat:entry>
</cat:entries>
</cat:Catalog>

So, here, the file entry for "t1.nii.gz" has two attributes - "URI" without the "folder", but also "ID" with it.

Now, I don't have a full overview of all of this, but in pyxnat/resources.py, function "attributes", I see:

        return self._getcells(['URI', 'Name', 'Size', 'path',
                               'file_tags', 'file_format', 'file_content'])

and in function "get":

        if not self._absuri:
            self._absuri = self._getcell('URI')

Looks to me like using "ID" instead of "URI" would be a better idea here?
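
To compare the two attributes across several experiments, a small sketch like the following could list the "ID" and "URI" of every catalog entry side by side. The sample is a trimmed copy of the catalog above; the xmlns:cat declaration is an assumption added only so the snippet parses on its own, and the function name is made up for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the catalog shown above. The xmlns:cat declaration is an
# assumption added so that this sample is parseable by itself.
SAMPLE = """<cat:Catalog xmlns:cat="http://nrg.wustl.edu/catalog" ID="QA">
<cat:entries>
<cat:entry ID="QA/t1.nii.gz" URI="t1.nii.gz" name="t1.nii.gz"/>
</cat:entries>
</cat:Catalog>"""


def entry_paths(catalog_xml):
    """Return (ID, URI) pairs for every catalog entry.

    Hypothetical helper for illustration; not part of pyxnat.
    """
    root = ET.fromstring(catalog_xml)
    pairs = []
    for elem in root.iter():
        # Tags look like '{namespace}entry'; compare the local name only,
        # so the result does not depend on the exact namespace URI.
        if elem.tag.rsplit("}", 1)[-1] == "entry":
            pairs.append((elem.get("ID"), elem.get("URI")))
    return pairs


pairs = entry_paths(SAMPLE)
```

Running this over catalogs from more than one experiment would show quickly whether either attribute carries the folder prefix consistently.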

Best,
  Torsten

Torsten Rohlfing

May 21, 2013, 4:50:32 PM
to xnat_di...@googlegroups.com
Ah, never mind ... I just found a different experiment where the "URI" has the correct path and "ID" has a "null/" prefix instead.

Bummer.

Torsten Rohlfing

May 21, 2013, 5:21:03 PM
to xnat_di...@googlegroups.com
Okay, hopefully the final word from me on this matter - thanks to Rick's useful hints.

So this does NOT work (if files are stored with some sort of "folder" prefix):

    # Get list of resource files that match the file name pattern
    experiment_files = []
    for resource in experiment.resources().get():
        experiment_files += [ (resource, file) for file in experiment.resource( resource ).files().get() if re.match( r'^.*\.txt$', file ) ]

    tempdir = tempfile.mkdtemp()
    for (resource, file) in experiment_files:
        # Download file from XNAT into temporary directory
        file_path = experiment.resource( resource ).file( '%s' % file ).get_copy( os.path.join( tempdir, file ) )
 
But this DOES work:

    # Get list of resource files that match the file name pattern
    experiment_files = []
    for resource in experiment.resources().get():
        # Ask the REST API directly, so each file's URI includes any folder prefix
        resource_files = xnat._get_json( '/data/experiments/%s/resources/%s/files?format=json' % ( xnat_eid, resource ) )
        experiment_files += [ (resource, re.sub( r'.*/files/', '', file['URI'] ) ) for file in resource_files if re.match( r'^.*\.txt$', file['Name'] ) ]

    tempdir = tempfile.mkdtemp()
    for (resource, file) in experiment_files:
        # Download file from XNAT into temporary directory
        file_path = experiment.resource( resource ).file( '%s' % file ).get_copy( os.path.join( tempdir, file ) )

Of course this ONLY works if there is no second "files/" in the file name itself, but if there is, one could always fix the regular expression in re.sub() I guess.
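
One possible way around that caveat, sketched here under assumptions: the re.sub() pattern is greedy, so it strips everything up to the LAST "/files/" in the URI. Splitting on the FIRST "/files/" instead keeps any later "files/" folder intact. The function name and the sample URI below are made up for illustration; the URI just follows the shape of the REST path used in the request above.

```python
def relative_file_path(uri):
    """Return the part of a file URI after the first '/files/' segment.

    Hypothetical helper for illustration; not part of pyxnat.
    """
    head, sep, tail = uri.partition('/files/')
    if not sep:
        # No '/files/' segment at all: refuse to guess.
        raise ValueError('no /files/ segment in %r' % uri)
    return tail


# Made-up URI with a sub-folder that is itself named "files".
sample_uri = '/data/experiments/E1/resources/1/files/stroop/files/a.txt'
path = relative_file_path(sample_uri)
```

Here `path` keeps the full relative path, stroop/files/a.txt, whereas the greedy pattern would reduce it to a.txt.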

Okay, my problem solved, moving on... :)

Thanks again, Rick!

Torsten

gregh...@gmail.com

Feb 16, 2016, 9:11:39 AM
to xnat_discussion, torsten...@gmail.com
Hi all,

Sorry to resurrect an old thread, but I would like to know: has pyxnat been modified so that it can retrieve resources stored in a 'hidden' sub-directory?

Thank you.

Grégory