OpenJPEG fails to decode a JPEG-2000 codestream, jasper apparently can

Marco Sambin

Sep 14, 2017, 11:52:51 AM

to OpenJPEG

Hi all.
I have a sample JPEG-2000 codestream (extracted from a DICOM medical image) that Jasper-based implementations can apparently decode correctly, but that OpenJPEG cannot decode: it gives a "Stream too short" error.

My application uses an old OpenJPEG 1.4 implementation, but I also get the "Stream too short" error with the latest OpenJPEG 2.2 (using opj_decompress against the extracted JPEG-2000 codestream).

Here is the sample codestream:

https://drive.google.com/file/d/0B1KymQKQ9pcsMWxDRWROeVpna2s/view?usp=sharing

Maybe some expert can shed some light on what's "wrong" with this codestream, and on whether there is a workaround to decode it correctly?

Thanks in advance for your help.
Regards,

Marco

Even Rouault

Sep 14, 2017, 12:24:50 PM

to open...@googlegroups.com, Marco Sambin

Marco,
 
this is indeed an invalid codestream. If you replace the terminating 0x00 byte with an end-of-codestream (EOC) marker, 0xFF 0xD9, then openjpeg 2.2 will successfully decompress it.
 
It could be desirable for openjpeg to be a bit less strict about this sort of non-conformity (with warnings instead of errors).
 
Even
 
-- 
Spatialys - Geospatial professional services
http://www.spatialys.com

Marco Sambin - NeoLogica

Sep 15, 2017, 4:24:24 AM

to open...@googlegroups.com, Even Rouault

Dear Even,

thank you so much for your valuable feedback!

Applying your suggestion indeed allows OpenJPEG (even my older 1.4) to correctly decode this codestream!

One question: in my own application-level code, before passing the JPEG-2000 codestream's byte array to the OpenJPEG library for decoding, I already look for the SOC and EOC markers, in order to trim any additional "outer" padding present in the source DICOM image, which may "hurt" the decoding routine of OpenJPEG (at least, with the older v1.4). With this sample JPEG-2000 codestream, my application-level code does not find the EOC marker. Is there a reliable (or at least, meaningful) way to "guess" where to insert the EOC marker when it is missing? The aim is just to be less "strict", and to attempt to decode images that have some "small" non-conformities in their JPEG-2000 codestream...

For instance, as you suggested, in this case it works if I insert the two-byte EOC in place of the last byte (which is set to "0"). When an explicit EOC is not found, would it make sense to always look for "0" byte(s) starting from the end of the codestream, and replace it (or them) with the EOC marker?

Thank you very much in advance for your feedback.

Best regards,

Marco


Even Rouault

Sep 15, 2017, 5:48:20 AM

to Marco Sambin - NeoLogica, open...@googlegroups.com

On Friday 15 September 2017 10:24:21 CEST you wrote:
> Dear Even,
> 
> thank you so much for your precious feedback!
> 
> Applying your suggestion indeed allows OpenJPEG (even my older 1.4) to
> correctly decode this codestream!
> 
> One question: in my own application-level code, before passing the
> JPEG-2000 codestream's byte array to the OpenJPEG library for decoding, I
> am already looking for the SOC and EOC markers, in order to cut eventual
> additional "outer" padding present in the source DICOM image, which may
> "hurt" the decoding routine of OpenJPEG (at least, with the older v1.4).
> With this sample JPEG-2000 codestream, indeed, the EOC marker isn't found
> by my application-level code. Is there a reliable (or at least, meaningful)
> way to "guess" where to insert the EOC marker when missing? Just in an
> effort to be less "strict", and attempt to decode also images which have
> some "small" non-conformities in their JPEG-2000 codestream...
> 
> For instance, as you suggested, in this case it works if I insert the
> two-bytes EOC in place of the last byte (which is set to "0"). When an
> explicit EOC is not found, would it make sense to always look for "0"
> byte(s) starting from the end of codestream, and replace it (or them) with
> the EOC marker?
 

That's a bit tricky. The codestream (excluding the SOD marker) might potentially be terminated by a NUL byte... I actually figured out what to do by using the
https://github.com/OSGeo/gdal/blob/trunk/gdal/swig/python/samples/dump_jp2.py
script that comes with the GDAL library.
 
It uses the DumpJPK2CodeStream() function of
https://github.com/OSGeo/gdal/blob/trunk/gdal/gcore/gdaljp2structure.cpp#L926
to navigate through the codestream. Basically, it reads 2 bytes to figure out which marker it is, then reads the next 2 bytes, which give the marker segment size (for markers with an explicit size), skips over that segment, and loops over the file that way.
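The loop described above can be sketched in a few lines of Python. This is not the GDAL code: walk_codestream and make_demo_codestream are hypothetical names, the demo codestream is synthetic (its SIZ segment is a fake placeholder), and the sketch assumes that every marker other than SOC, SOD and EOC carries a 2-byte length, and that Psot is non-zero so the tile-part data after SOD can be skipped.

```python
import struct

SOC, SOT, SOD, EOC = 0xFF4F, 0xFF90, 0xFF93, 0xFFD9


def walk_codestream(data: bytes):
    """Walk the markers of a raw codestream.

    Returns (markers, error_offset): markers is a list of (code, offset)
    pairs, error_offset is None when EOC was reached, otherwise the offset
    at which no valid marker could be read.
    """
    markers = []
    pos = 0
    tile_part_end = None  # set from Psot when an SOT segment is seen
    while True:
        if pos + 2 > len(data):
            return markers, pos
        (code,) = struct.unpack_from(">H", data, pos)
        if code >> 8 != 0xFF:  # every marker starts with 0xFF
            return markers, pos
        markers.append((code, pos))
        if code == EOC:
            return markers, None
        if code == SOC:  # delimiter without a length field
            pos += 2
            continue
        if code == SOD:
            if tile_part_end is None:
                # Psot == 0 (or no SOT seen): not handled by this sketch
                return markers, pos + 2
            pos = tile_part_end  # jump over the packet data
            tile_part_end = None
            continue
        if pos + 4 > len(data):
            return markers, pos
        (seg_len,) = struct.unpack_from(">H", data, pos + 2)
        if seg_len < 2:  # the length field counts itself
            return markers, pos
        if code == SOT and seg_len == 10 and pos + 12 <= len(data):
            # Psot is measured from the first byte of the SOT marker
            (psot,) = struct.unpack_from(">I", data, pos + 6)
            tile_part_end = pos + psot if psot else None
        pos += 2 + seg_len


def make_demo_codestream(tail: bytes) -> bytes:
    """Build a tiny synthetic codestream (not decodable) ending in `tail`."""
    payload = b"\x01\x02\x03"
    psot = 12 + 2 + len(payload)  # SOT segment + SOD marker + data
    return (
        b"\xff\x4f"                                          # SOC
        + b"\xff\x51" + struct.pack(">H", 4) + b"\x00\x00"   # fake SIZ
        + b"\xff\x90" + struct.pack(">HHIBB", 10, 0, psot, 0, 1)  # SOT
        + b"\xff\x93" + payload                              # SOD + data
        + tail
    )
```

Calling walk_codestream(make_demo_codestream(b"\x00")) mimics the situation in this thread: the walk stops at the last byte of the buffer because a single 0x00 byte cannot be a marker.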
 
 
The script's output on your file was (cutting the beginning):
 
[...]
  <Marker name="SOT" offset="561117" length="12">
    <Field name="Isot" type="uint16">0</Field>
    <Field name="Psot" type="uint32">94040</Field>
    <Field name="TPsot" type="uint8">6</Field>
    <Field name="TNsot" type="uint8">0</Field>
  </Marker>
  <Marker name="PLT" offset="561129" length="20" />
  <Marker name="SOD" offset="561149" length="94008" />
  <Error message="Cannot read marker" offset="655157" />
</JP2KCodeStream>
 
So it managed to read the last SOD marker (Start Of Data) in full, and failed to read the following marker at offset 655157 (the utility counts offsets from 0, the first byte).
The file size is 655158 bytes, so there was one extraneous byte after the data part of the codestream. Since it was 0x00, it was not a valid marker (markers must start with 0xFF). So I removed it and put 0xFF 0xD9 instead.
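The arithmetic and the manual fix can be written down as a short Python sketch. The numbers are taken from the dump above; patch_missing_eoc is a hypothetical helper name, and it assumes the caller already knows where the last tile-part ends (for example from the SOT offset plus Psot). It is only an illustration of this particular repair, not a general recipe for broken codestreams.

```python
EOC = b"\xff\xd9"


def patch_missing_eoc(data: bytes, tile_part_end: int) -> bytes:
    """Return data cut at tile_part_end and terminated by an EOC marker.

    tile_part_end is the offset of the first byte after the last tile-part.
    """
    if tile_part_end > len(data):
        raise ValueError("codestream is truncated before the tile-part end")
    if data[tile_part_end:tile_part_end + 2] == EOC:
        # EOC already present: only drop whatever padding follows it
        return data[:tile_part_end + 2]
    # Drop the extraneous trailing byte(s) and append the EOC marker
    return data[:tile_part_end] + EOC


# Values reported by dump_jp2.py for the last tile-part of the sample file
sot_offset, psot = 561117, 94040
tile_part_end = sot_offset + psot   # offset where the next marker must start
sod_offset, sod_length = 561149, 94008
assert tile_part_end == sod_offset + sod_length == 655157
```

With the sample file read into a bytes object, patch_missing_eoc(data, tile_part_end) would replace the single trailing 0x00 at offset 655157 with 0xFF 0xD9.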

Marco Sambin - NeoLogica

Sep 15, 2017, 9:05:51 AM

to open...@googlegroups.com, Even Rouault

Dear Even,

thank you very much for your valuable feedback, once again.

I will see what I can do in my code.

Best regards,

Marco
