Issue 113 in pydicom: read_file() returns incomplete dataset for DICOM file with nested private sequences

pyd...@googlecode.com

Feb 29, 2012, 1:50:10 PM
to pydic...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium Difficulty-Medium

New issue 113 by d.j.hun...@gmail.com: read_file() returns incomplete
dataset for DICOM file with nested private sequences
http://code.google.com/p/pydicom/issues/detail?id=113

I have a DICOM file that contains a couple of private sequences of
undefined length, which themselves contain undefined length sequences
nested within them. The transfer syntax is implicit VR. When I attempt to
read this file, many of the data elements (including the pixel data) are
missing from the returned dataset.

Looking at the code, the problem appears to originate when
data_element_generator() reaches a private sequence whose VR is unknown.
Under this condition, the sequence is treated as binary data of undefined
length and read_undefined_length_value() is called, which parses the file
until a sequence delimiter tag is reached. However, in the case of nested
sequences, the next sequence delimiter to be reached corresponds to the end
of the first nested sequence rather than that of the parent sequence.

As such, the parent sequence is only partially read, and the rest of the
sequence is parsed as if it were part of the top-level dataset. When the
parent’s actual sequence delimiter is reached, it is detected by
read_dataset(), ultimately causing read_file() to terminate early.
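
To illustrate the failure mode, here is a minimal sketch that builds a nested undefined-length sequence by hand (implicit VR, little endian) and scans for the first sequence delimiter. The private tags (0009,1001) to (0009,1003) and the helper names are made up for the example, and naive_read_until_delimiter() is only a simplified stand-in for the binary read, not pydicom's actual code:

```python
import struct

# Implicit VR little endian encodings of the three special tags.
ITEM = struct.pack("<HHL", 0xFFFE, 0xE000, 0xFFFFFFFF)  # Item, undefined length
ITEM_DELIM = struct.pack("<HHL", 0xFFFE, 0xE00D, 0)     # Item Delimitation Item
SEQ_DELIM = struct.pack("<HHL", 0xFFFE, 0xE0DD, 0)      # Sequence Delimitation Item


def undefined_length_element(group, elem, value):
    """Element header with undefined length (0xFFFFFFFF), followed by its value."""
    return struct.pack("<HHL", group, elem, 0xFFFFFFFF) + value


# Hypothetical private elements, chosen only for this example.
leaf = struct.pack("<HHL", 0x0009, 0x1003, 2) + b"AB"
inner_sq = undefined_length_element(
    0x0009, 0x1002, ITEM + leaf + ITEM_DELIM + SEQ_DELIM)

# Value of the parent sequence (0009,1001): one item holding the inner sequence.
outer_sq_value = ITEM + inner_sq + ITEM_DELIM + SEQ_DELIM


def naive_read_until_delimiter(value_bytes):
    """Treat the value as opaque binary and stop at the first sequence delimiter."""
    return value_bytes[:value_bytes.index(SEQ_DELIM)]


naive = naive_read_until_delimiter(outer_sq_value)
expected = outer_sq_value[:-len(SEQ_DELIM)]

# Whatever follows the first delimiter is left for the caller to misparse.
leftover = outer_sq_value[len(naive) + len(SEQ_DELIM):]
```

Here naive is shorter than expected because the scan stops at the inner sequence's delimiter, and leftover still holds the parent's item delimiter and sequence delimiter, which would then be interpreted at the top level.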

As a workaround, I’ve modified data_element_generator() to check all data
elements of undefined length to see if they are sequences (based on the
assumption that the first four bytes of an SQ data element value will
always be an Item Tag):


--- a/src/oxmorf/dicom/filereader.py
+++ b/src/oxmorf/dicom/filereader.py
@@ -247,7 +247,13 @@
                     VR = dictionaryVR(tag)
                 except KeyError:
                     pass
-            if VR == 'SQ':
+
+            bytes = fp_read(4)
+            fp.seek(fp_tell()-4)
+            possible_group, possible_elem = unpack(endian_chr+"HH", bytes)
+            possible_item_tag = TupleTag((possible_group, possible_elem))
+
+            if (VR == 'SQ') or (possible_item_tag == ItemTag):
                 if debugging:
                     logger_debug("%04x: Reading and parsing undefined length sequence"
                                  % fp_tell())


The hope is that this should prevent any sequences being read as binary
data (it seems to work ok so far, although I've not properly tested it).
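
For anyone who wants to try the look-ahead on its own, the same idea can be sketched outside pydicom. This is only an illustration under assumptions: next_tag_is_item() and ITEM_TAG are names invented for the example, and it works on any seekable binary file object rather than on pydicom's internals:

```python
import io
import struct

ITEM_TAG = (0xFFFE, 0xE000)  # (group, element) of the DICOM Item Tag


def next_tag_is_item(fp, is_little_endian=True):
    """Peek at the next four bytes and report whether they encode an Item Tag.

    The file position is restored before returning, so the caller can go on
    to read the value as either a sequence or raw bytes.
    """
    start = fp.tell()
    raw = fp.read(4)
    fp.seek(start)
    if len(raw) < 4:
        # Too few bytes left to hold a tag.
        return False
    endian_chr = "<" if is_little_endian else ">"
    return struct.unpack(endian_chr + "HH", raw) == ITEM_TAG


# A value that starts with an Item of undefined length, and one that does not.
sq_value = io.BytesIO(struct.pack("<HHL", 0xFFFE, 0xE000, 0xFFFFFFFF))
binary_value = io.BytesIO(b"\x01\x02\x03\x04\x05\x06")

sq_result = next_tag_is_item(sq_value)
binary_result = next_tag_is_item(binary_value)
```

Note that this remains a heuristic: binary data that happens to begin with the bytes of an Item Tag would also be classified as a sequence.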

If needed, I should shortly be able to provide the DICOM file in question.

Many thanks,

David

pyd...@googlecode.com

Feb 29, 2012, 10:31:54 PM
to pydic...@googlegroups.com
Updates:
Status: Accepted
Labels: Milestone-Release1.0

Comment #1 on issue 113 by darcy...@gmail.com: read_file() returns
incomplete dataset for DICOM file with nested private sequences
http://code.google.com/p/pydicom/issues/detail?id=113

Thanks for the detailed report, and yes, a file would be helpful (as
always, with no confidential information of any kind).

I think I will leave this until after the 0.9.7 release (after which
pydicom moves towards Python 3), and backport the solution to the Python 2.x
branch, with thorough testing in place.

pyd...@googlecode.com

Mar 5, 2012, 9:20:27 AM
to pydic...@googlegroups.com

Comment #2 on issue 113 by d.j.hun...@gmail.com: read_file() returns
incomplete dataset for DICOM file with nested private sequences
http://code.google.com/p/pydicom/issues/detail?id=113

Here's the offending file.

David

Attachments:
nestedPrivateSQ.dcm 80.9 KB

pyd...@googlecode.com

Jun 13, 2012, 3:13:15 PM
to pydic...@googlegroups.com
Updates:
Status: Fixed

Comment #3 on issue 113 by Sue...@gmail.com: read_file() returns incomplete
dataset for DICOM file with nested private sequences
http://code.google.com/p/pydicom/issues/detail?id=113

This issue was closed by revision 84af4b240add.

pyd...@googlecode.com

Jun 13, 2012, 3:15:36 PM
to pydic...@googlegroups.com

Comment #4 on issue 113 by Sue...@gmail.com: read_file() returns incomplete
dataset for DICOM file with nested private sequences
http://code.google.com/p/pydicom/issues/detail?id=113

David,

I was able to come up with a patch based on your suggestion. The one
modification I made was to set the VR to 'SQ' if an item tag was indeed
found. This allows proper parsing of the file as well as proper formatting
while printing the sequences.
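
Roughly, the change amounts to something like the sketch below. This is not the committed code (see the revision for that); resolve_vr() is a name invented for the illustration, and limiting the check to elements whose VR is otherwise unknown is an assumption made here:

```python
import struct

# Little endian encoding of the Item Tag (FFFE,E000).
ITEM_TAG_BYTES_LE = struct.pack("<HH", 0xFFFE, 0xE000)


def resolve_vr(known_vr, value_prefix):
    """Return 'SQ' when the VR is unknown and the value begins with an Item Tag.

    known_vr is None when the tag could not be looked up (e.g. a private tag
    read with implicit VR); value_prefix is the start of the element's value.
    """
    if known_vr is None and value_prefix[:4] == ITEM_TAG_BYTES_LE:
        return "SQ"
    return known_vr


vr_for_private_sq = resolve_vr(None, ITEM_TAG_BYTES_LE + b"\xff\xff\xff\xff")
vr_for_private_binary = resolve_vr(None, b"ABCD")
vr_for_known_element = resolve_vr("OB", ITEM_TAG_BYTES_LE)
```

Recording the VR as 'SQ', rather than only branching on the look-ahead, is what lets later code (such as printing) treat the element as a sequence.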

Additionally, I created a very simple example file as well as a unittest.

-Suever
