Generic way to access dicom files in a zip upload

David Sinn

Dec 14, 2020, 12:21:42 PM
to QATrack+
Hello everyone,

I've used Pylinac to analyze .zip files uploaded through QATrack+ before, but I'm wondering if there's a clean way to do this for modules that don't have built-in zip support.

In my case, I want the user to upload a zip of 4 picket fence tests at different gantry angles, and I want to access the constituent DICOM files.

Any good, clean ideas?

Thanks

Randle Taylor

Dec 14, 2020, 12:40:03 PM
to David Sinn, QATrack+
Something similar to this will work:

import io
import pydicom
import zipfile

zf = ZipFile(BIN_FILE)
f = io.BytesIO(zf.read("gantry_90.dcm"))
dcm = pydicom.dcmread(f)  # dcmread supersedes the older read_file alias
# do something with dcm
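
For the case in the original question (several picket fence images in one zip), the same idea can be extended to loop over every DICOM member of the archive. The following is only a minimal sketch using the standard library: iter_zip_members is a hypothetical helper name (not part of QATrack+ or Pylinac), and it assumes the DICOM files inside the zip have names ending in ".dcm". In a QATrack+ calculation, BIN_FILE would be passed as the source.

```python
import io
import zipfile


def iter_zip_members(zip_source, suffix=".dcm"):
    """Yield (name, BytesIO) for each archive member whose name ends with suffix.

    zip_source may be a path or a binary file object (e.g. BIN_FILE in QATrack+).
    iter_zip_members is a hypothetical helper, not part of QATrack+ or Pylinac.
    """
    with zipfile.ZipFile(zip_source) as zf:
        for name in sorted(zf.namelist()):
            # Case-insensitive match so "IMAGE.DCM" is also picked up
            if name.lower().endswith(suffix):
                # Copy the member into memory so the buffer is fully seekable
                yield name, io.BytesIO(zf.read(name))
```

Each yielded buffer can then be handed to pydicom, e.g. pydicom.dcmread(buf), since dcmread accepts file-like objects as well as paths.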




Randle Taylor

Dec 14, 2020, 12:59:59 PM
to David Sinn, QATrack+
If you need the files on disk for some reason, you can also use the tempfile module to create a temporary directory and extract the zip file into it.

e.g.

import os
import tempfile
import zipfile

tempdir = tempfile.mkdtemp()
zf = zipfile.ZipFile(BIN_FILE)
zf.extractall(tempdir)

fpaths = [os.path.join(tempdir, f) for f in os.listdir(tempdir)]

Then you can do whatever you want with the paths.
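
One caveat with tempfile.mkdtemp is that the directory it creates is not removed automatically. The following is a minimal sketch of a variant that cleans up after itself by using tempfile.TemporaryDirectory, and that uses os.walk so files inside subfolders of the zip are found too. The names extract_and_process and process are hypothetical, chosen only for illustration.

```python
import os
import tempfile
import zipfile


def extract_and_process(zip_source, process):
    """Extract a zip to a temporary directory and call process(path) per file.

    Returns the list of values returned by process, ordered by file path.
    The temporary directory is deleted on exit, so process must finish
    reading each file before it returns.
    """
    results = []
    with tempfile.TemporaryDirectory() as tempdir:
        with zipfile.ZipFile(zip_source) as zf:
            zf.extractall(tempdir)
        # os.walk also picks up files inside subfolders of the archive
        fpaths = sorted(
            os.path.join(root, fname)
            for root, _dirs, fnames in os.walk(tempdir)
            for fname in fnames
        )
        for fpath in fpaths:
            results.append(process(fpath))
    return results
```

For example, process could be a function that opens the path with pydicom and returns the one tag of interest, so nothing needs the files after the directory is removed.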

RT

David Sinn

Dec 14, 2020, 1:18:08 PM
to QATrack+
Perfect, thank you very much.

In the first reply I just had to change zf = ZipFile(BIN_FILE) to zf = zipfile.ZipFile(BIN_FILE), but otherwise that's awesome!

Benjamin Bown

Mar 11, 2021, 10:33:23 AM
to QATrack+
Hi,

I'm trying to do something similar but I'm having trouble. I have an upload test for Catphan analysis with Pylinac, for which a zip file is uploaded. After the analysis runs, I want to read the DICOM series description tag so I can use it later to name the PDF report.
So far I have:

import io
from pylinac import CatPhan604
import matplotlib.pyplot as plt
from zipfile import ZipFile
import pydicom

upload = BIN_FILE.path
mycbct = CatPhan604.from_zip(upload)
mycbct.analyze()
with ZipFile(' upload ', 'r') as zf:
        # Do stuff with zip
#etc.

However, I get this error which I have attached. (Sorry, I couldn't find the log to copy/paste!)

pydicom_error.jpg

Any help appreciated. Thanks.

Ben

Randle Taylor

Mar 11, 2021, 4:14:46 PM
to Benjamin Bown, QATrack+
Hi Ben,

In the line:

with ZipFile(' upload ', 'r') as zf:

you are passing the string ' upload ' to ZipFile rather than the upload variable. You should be able to pass either the upload variable, "with ZipFile(upload, 'r') as zf:", or the BIN_FILE object itself, "with ZipFile(BIN_FILE, 'r') as zf:".

Hope that helps!

RT

Benjamin Bown

Mar 12, 2021, 4:18:23 AM
to QATrack+
Hi Randy,

Thanks for getting back to me. Apologies, I posted my previous code incorrectly. I didn't actually have the string quotes in QATrack+; they were there because I copied and pasted it from my local IDLE session, where I was working with a file directly (which works).
I also tried using BIN_FILE directly, as you suggested, and still have the same problem. Side question: is there any difference between using BIN_FILE and BIN_FILE.path?

Here is the code now, copied correctly from QATrack+:
import io
from pylinac import CatPhan604
import matplotlib.pyplot as plt
from zipfile import ZipFile
import pydicom


mycbct = CatPhan604.from_zip(BIN_FILE)
mycbct.analyze(hu_tolerance=40, scaling_tolerance=1, thickness_tolerance=0.2,
                low_contrast_tolerance=1, cnr_threshold=15, zip_after=False)

# Get DICOM tag (0008,103E), Series Description: user-provided description of the series. Used to name the PDF report by scan protocol.

with ZipFile(BIN_FILE, 'r') as zf:
        file = zf.namelist()
        list_dcm_files = [s for s in file if ".dcm" in s]
        dcm_file = zf.open(list_dcm_files[0])
        try:
            ds = pydicom.dcmread(dcm_file, stop_before_pixels=True)
            series_description = ds.SeriesDescription
        except AttributeError:
            series_description = 'CBCT' #If series description not found then it's because it's a CBCT image from TrueBeam

#etc

And the fail error in QATrack+;

Invalid Test Procedure: catphan_604_analysis.py", line 18, in Test: Catphan 604 Upload & Analysis
  File "/home/generic/venvs/qatrack3/lib/python3.6/site-packages/pydicom/filereader.py", line 871, in dcmread
    force=force, specific_tags=specific_tags)
  File "/home/generic/venvs/qatrack3/lib/python3.6/site-packages/pydicom/filereader.py", line 668, in read_partial
    preamble = read_preamble(fileobj, force)
  File "/home/generic/venvs/qatrack3/lib/python3.6/site-packages/pydicom/filereader.py", line 625, in read_preamble
    logger.debug("{0:08x}: 'DICM' prefix found".format(fp.tell() - 4))
io.UnsupportedOperation: seek

Thanks for your help,

Ben

Randle Taylor

Mar 12, 2021, 9:06:26 AM
to Benjamin Bown, QATrack+
Hi Ben,

I tried this myself and figured out what I think the issue is.  In this bit of code:

dcm_file = zf.open(list_dcm_files[0])
 
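dcm_file is a file-like object for the member inside the zip archive, which is decompressed as it is read. When you open a file with pydicom, I think it reads a bit of the file to check whether it's a DICOM file and then needs to query or move its position in the file (your traceback fails at fp.tell() while checking for the 'DICM' prefix). This fails because the object returned by zf.open() does not support seeking on your Python version (3.6, going by the paths in the traceback). Instead, reading the zip member into an in-memory BytesIO object and passing that to pydicom should do the trick for you: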

dcm_file = zf.open(list_dcm_files[0])
dcm_file = io.BytesIO(dcm_file.read())
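
As a self-contained illustration of the two lines above, here is a small sketch that builds a throwaway zip in memory and then wraps one member in a BytesIO. The member name and its contents are made up for the example (a 128-byte preamble followed by the "DICM" marker, which is how a DICOM file begins); in QATrack+ the archive would come from BIN_FILE instead.

```python
import io
import zipfile

# Build a small zip in memory to stand in for the uploaded file
# (the member name and contents here are made up for illustration).
upload = io.BytesIO()
with zipfile.ZipFile(upload, "w") as zf:
    zf.writestr("slice_001.dcm", b"\x00" * 128 + b"DICM" + b"payload")
upload.seek(0)

with zipfile.ZipFile(upload, "r") as zf:
    dcm_names = [n for n in zf.namelist() if n.lower().endswith(".dcm")]
    with zf.open(dcm_names[0]) as member:
        # Copy the decompressed bytes into a BytesIO, which supports
        # seek() and tell() regardless of the Python version in use
        dcm_file = io.BytesIO(member.read())

# dcm_file can now be read, rewound and re-read freely
dcm_file.seek(128)
prefix = dcm_file.read(4)
dcm_file.seek(0)
```

The cost is that the whole member is held in memory, which should be fine for single DICOM slices but is worth keeping in mind for very large files.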

The difference between BIN_FILE and BIN_FILE.path is that BIN_FILE is an actual file object, while BIN_FILE.path is just a string containing the path to the file on disk. Many Python libraries and functions accept either the file object or the file path.
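
To make the path versus file object distinction concrete, here is a short sketch using zipfile.ZipFile, which accepts either. BIN_FILE only exists inside QATrack+, so a temporary file stands in for the upload here; the file and member names are made up.

```python
import os
import shutil
import tempfile
import zipfile

# Write a small zip to disk to stand in for an uploaded file
tempdir = tempfile.mkdtemp()
zip_path = os.path.join(tempdir, "upload.zip")
with zipfile.ZipFile(zip_path, "w") as zf:
    zf.writestr("image.dcm", b"data")

# Open by path (a string, analogous to BIN_FILE.path)
with zipfile.ZipFile(zip_path) as zf:
    names_from_path = zf.namelist()

# Open by file object (analogous to BIN_FILE); it must be in binary mode
with open(zip_path, "rb") as fobj:
    with zipfile.ZipFile(fobj) as zf:
        names_from_file = zf.namelist()

# Remove the stand-in directory again
shutil.rmtree(tempdir)
```

Both calls see the same archive contents; the practical difference is that a file object has a read position that may need rewinding with seek(0) if something else has already read from it.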

Hope that helps!

Randy

Benjamin Bown

Mar 12, 2021, 10:43:21 AM
to QATrack+
Excellent! That works perfectly.

Thank you so much!

Ben