Validating an MRC image/stack before running a program on it


Christopher Lilienthal

Jun 7, 2016, 10:29:20 AM
to EMAN2
Hello,

This is not specific to EMAN2, but it does affect the program suite along with every other cryo-EM program used to process MRC files. Is there a good way to validate an MRC image/stack before a program operates on it, thus preventing a crash due to bad, corrupt, or incomplete data?

Steve Ludtke

Jun 7, 2016, 5:47:00 PM
to em...@googlegroups.com
Not sure what situation you are contending with which would require this?  The only situations I can think of where I have ever encountered a corrupt MRC file were A) when the disk became full during writing, B) a program was killed in the middle of writing a file or C) an incomplete file transfer.

I am not aware of any cryoEM image processing pipeline which does "prevalidation" of input images. If you have a bad image file, then the actual process fails. There is little point in detecting it in advance, since if you have a bad image file, there is no way you could recover automatically anyway.  Does that make sense?

Anyway, if you want to detect whether an image file is readable, you can:

e2iminfo.py -s y.mrc

which will fail if the file is corrupt. If it is a stack file and you wish to check all of the images, you could

e2iminfo.py -as y.mrcs
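
For anyone who would rather script the same readability check from Python, here is a minimal sketch. It relies on the behaviour visible in the e2display.py traceback later in this thread, where EMData(filename, 0) raises RuntimeError on an incomplete file. The helper name first_unreadable_image and its read_image parameter are hypothetical (the parameter only exists so the loop can be exercised without EMAN2), and the sketch assumes every unreadable image surfaces as RuntimeError.

```python
def first_unreadable_image(filename, n_images, read_image=None):
    """Return the index of the first image that cannot be read, or None.

    read_image(filename, index) must raise RuntimeError on a failed read.
    By default EMAN2's EMData is used, which needs an EMAN2 installation.
    """
    if read_image is None:
        from EMAN2 import EMData  # imported lazily so the helper loads without EMAN2
        read_image = EMData
    for index in range(n_images):
        try:
            read_image(filename, index)
        except RuntimeError:
            # Assumption: a truncated or corrupt image shows up as RuntimeError,
            # as in the "incomplete data read" traceback in this thread.
            return index
    return None
```

With EMAN2 available, n_images can be taken from EMUtil.get_image_count(filename); a return value of None then means every image in the stack could be read.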

----------------------------------------------------------------------------
Steven Ludtke, Ph.D.
Professor, Dept. of Biochemistry and Mol. Biol.                Those who do
Co-Director National Center For Macromolecular Imaging            ARE
Baylor College of Medicine                                     The converse
slu...@bcm.edu  -or-  ste...@alumni.caltech.edu               also applies
http://ncmi.bcm.edu/~stevel

----------------------------------------------------------------------------------------------
You received this message because you are subscribed to the Google
Groups "EMAN2" group.
To post to this group, send email to em...@googlegroups.com
To unsubscribe from this group, send email to eman2+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/eman2


Mario J. Borgnia

Jun 8, 2016, 9:16:40 AM
to em...@googlegroups.com
Hi Steve,
I am not sure about the specifics of this case. I came across a
situation yesterday trying to open an image exported from the viewer in
EPU. I tried opening it with e2display.py and got the error messages
pasted below. I faced a similar problem when trying to open the image
with IMOD. I checked the size of the image and header and it seemed OK.
I then tried opening the image in Chimera and it worked, saved it to a
new file and I was able to open it in both IMOD and EMAN2. The image is
too large to attach to this message, you can download it from
https://www.dropbox.com/s/hj8ezrs6sm2mllt/grid1_001.mrc?dl=0.
Thanks!

Mario

Here is the error:

> e2display.py grid1_001.mrc
Reached premature end-of-file reading region from image/slice number 0 of file with 33555456 bytes.
Traceback (most recent call last):
  File "/usr/local/EMAN2-20160603/bin/e2display.py", line 295, in <module>
    main()
  File "/usr/local/EMAN2-20160603/bin/e2display.py", line 127, in main
    display_file(i,app,options.singleimage,usescenegraph=options.newwidget)
  File "/usr/local/EMAN2-20160603/bin/e2display.py", line 229, in display_file
    w = EMWidgetFromFile(filename,application=app,force_2d=force_2d)
  File "/usr/local/EMAN2-20160603/lib/emimage.py", line 206, in __new__
    data = [EMData(filename,0)]
  File "/usr/local/EMAN2-20160603/lib/EMAN2db.py", line 380, in db_emd_init
    self.__initc(*parms)
RuntimeError: ImageReadException at /build/co/eman2.daily/libEM/emutil.cpp:1160: error with 'Unknownfilename': 'incomplete data read' caught

Steven Ludtke

Jun 8, 2016, 10:40:31 AM
to em...@googlegroups.com
Hi Mario. This is, indeed, an invalid MRC file. The NSYMBT field in the header says that a 128k extended header should be present in the file. However, the length of the file is exactly 4096x4096x2+1024 bytes, i.e. the image data plus the standard 1024-byte header, with no extended header. My suspicion is that Chimera either doesn't recognize the extended header block at all, is ignoring the missing data, or is doing some sort of non-standard error detection and assuming that the extended header value is incorrect. Software which follows the MRC standard

Cheng, A., Henderson, R., Mastronarde, D., Ludtke, S. J., Schoenmakers, R. H., Short, J., Marabini, R., Dallakyan, S., Agard, D. & Winn, M. (2015) MRC2014: Extensions to the MRC format header for electron cryo-microscopy and tomography. J. Struct. Biol. PMID: 25882513

SHOULD produce an error when reading this file. Where did this file come from? There is an IMOD tag in the header, which seems to imply it produced the file, but this flag wasn't present in IMOD files until a couple of years ago, and IMOD supposedly follows the extended header conventions as outlined in the paper above...
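
The size reasoning above can be turned into a quick pre-check. What follows is only a sketch under stated assumptions, not how EMAN2 or IMOD validate files: it assumes a little-endian header, handles only the common pixel modes listed in the table, and the function name check_mrc_size is made up for illustration. It reads NX, NY, NZ and MODE (header words 1 to 4) and NSYMBT (word 24) and compares the file length the header implies with the length on disk.

```python
import os
import struct

HEADER_BYTES = 1024
# Bytes per pixel for common MRC modes; other modes are not handled in this sketch.
BYTES_PER_PIXEL = {0: 1, 1: 2, 2: 4, 3: 4, 4: 8, 6: 2}


def check_mrc_size(path):
    """Return (expected_bytes, actual_bytes) for the MRC file at path."""
    with open(path, "rb") as f:
        header = f.read(HEADER_BYTES)
    if len(header) < HEADER_BYTES:
        raise ValueError("file is shorter than the 1024-byte MRC header")
    # Assumption: little-endian header. NX, NY, NZ, MODE are words 1-4.
    nx, ny, nz, mode = struct.unpack_from("<4i", header, 0)
    # NSYMBT (extended header size in bytes) is word 24, i.e. byte offset 92.
    (nsymbt,) = struct.unpack_from("<i", header, 23 * 4)
    if mode not in BYTES_PER_PIXEL:
        raise ValueError("unhandled MRC mode %d" % mode)
    expected = HEADER_BYTES + nsymbt + nx * ny * nz * BYTES_PER_PIXEL[mode]
    return expected, os.path.getsize(path)
```

For the file discussed here, and assuming "128k" means NSYMBT = 131072 and a single 4096x4096 image at 2 bytes per pixel, the header would imply 1024 + 131072 + 33554432 = 33686528 bytes, while the file on disk has 33555456, so the two values would differ by exactly the size of the missing extended header.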

----------------------------------------------------------------------------
Steven Ludtke, Ph.D.
Professor, Dept of Biochemistry and Mol. Biol. (www.bcm.edu/biochem)
Co-Director National Center For Macromolecular Imaging (ncmi.bcm.edu)
Co-Director CIBR Center (www.bcm.edu/research/cibr)
Baylor College of Medicine
slu...@bcm.edu





Mario J. Borgnia

Jun 9, 2016, 4:41:24 PM
to em...@googlegroups.com
Hi Steve,
Thanks for looking into this. The file is an image exported from the
"preparation" screen in EPU on a Titan Krios. I have a number of them
and they all have the same problem. I guess that it is a bug in EPU.
Best,

Mario

Steve Ludtke

Jun 9, 2016, 5:54:08 PM
to em...@googlegroups.com
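EPU is producing images with an IMOD tag in the header?!?! That's just too funny! It seems like someone must have gotten lazy and rather than understanding the MRC header, they just copied the block from an existing image. It is a trivial problem to fix for existing images. This will fix a bunch of files all at once (it edits the files in place, so keep copies if you are unsure):

#!/usr/bin/env python
# Zero the 8 header bytes starting at NSYMBT (word 24, byte offset 92)
# in every file named on the command line.
from sys import argv

for fn in argv[1:]:
    # "r+b": update in place, in binary mode so the bytes are written unchanged
    with open(fn, "r+b") as f:
        f.seek(23 * 4)
        f.write(b"\000" * 8)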
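
A more cautious variant of the script above, offered only as a sketch: it keeps a backup copy and leaves a file untouched when NSYMBT is already zero. The function name clear_nsymbt and the ".bak" suffix are hypothetical. The returned value is decoded as little-endian, which is an assumption, but the zero test itself does not depend on byte order.

```python
import shutil
import struct

NSYMBT_OFFSET = 23 * 4  # same offset as f.seek(23*4) in the script above


def clear_nsymbt(path, backup_suffix=".bak"):
    """Zero the 8 bytes at NSYMBT_OFFSET, keeping a backup; return the old NSYMBT."""
    with open(path, "rb") as f:
        f.seek(NSYMBT_OFFSET)
        raw = f.read(4)
    if len(raw) < 4:
        raise ValueError("file too short to contain NSYMBT")
    (old,) = struct.unpack("<i", raw)  # assumes a little-endian header
    if old == 0:
        return 0  # nothing to clear, so the file is not modified
    shutil.copy2(path, path + backup_suffix)  # keep the original bytes
    with open(path, "r+b") as f:
        f.seek(NSYMBT_OFFSET)
        f.write(b"\x00" * 8)
    return old
```

Calling clear_nsymbt("grid1_001.mrc") would leave the original next to it as grid1_001.mrc.bak. Like the script, this only clears the header field; it does not check that the remaining file length matches the image dimensions.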


Mario J. Borgnia

Jun 9, 2016, 6:11:39 PM
to em...@googlegroups.com
My bad, the IMOD tag must have been the result of an attempt that I made
to restore the image using alterheader. I have put the two files here:

https://www.dropbox.com/sh/p44ejmeyvwjau7b/AAB4PoNamMU6LennqfRwTMeDa?dl=0

Steve Ludtke

Jun 9, 2016, 8:34:46 PM
to em...@googlegroups.com
Umm, that seems to be an entire temporary folder. Not sure what files you're pointing at. Also not sure if you wanted to share that dropbox link with the world...

Does the script I sent fix the issue?

Mario Borgnia

Jun 9, 2016, 8:41:57 PM
to EMAN2

I meant to share this one
https://www.dropbox.com/sh/jt5186cry0pm9or/AAA37eIXemWZDZlxcpI4O7mLa?dl=0
The other one is irrelevant anyway
The script works, of course :)
Thanks
