Lossy compression of movies


vadim....@gmail.com

May 9, 2022, 7:34:47 AM
to EMAN2

Dear EMAN2 team,

 I am currently looking for options for long-term storage of cryo-EM data, i.e. movies in TIFF, MRC or EER format. I was impressed with the compression capabilities of EMAN's HDF format; however, I am still worried about the possibility of information loss. Do you perhaps have any guidance on how to select the bit compression values? What are the critical points to consider for the input files (super-resolution or not, image size in pixels, distribution of pixel values)?

 Many thanks for your kind help!

Best wishes,

Vadim Kotov

Ludtke, Steven J.

May 9, 2022, 8:33:12 AM
to em...@googlegroups.com
We're planning to send the final revision of the manuscript back to JSB (it's effectively accepted at this point). The publication process took a lot longer than we were expecting for this paper (partially our fault). We would have posted the first draft to bioRxiv, but felt it was important to see whether the reviewers found any significant issues with it before putting it out there as a standard. Anyway, if anyone wants a preprint I'm happy to share it individually. It seems pointless to post it to bioRxiv at this point, since it should be out reasonably soon.

The basic advice is:

- for raw counting-mode movie frames, compressed TIFF movies with an associated gain normalization image are probably the best approach. Bit truncation is only appropriate once counts reach ~5-10 e-/pixel, i.e. for gain-normalized movie averages
- for gain-normalized movie averages, 3-4 bits is enough in most cases. 5 bits is enough in pretty much any situation we could imagine. The only exception is microED data (i.e. data which is recorded in Fourier space with high dynamic range)
- for class averages and 3-D maps anything from 8 - 12 bits is good, depending on specific needs. At this level it is generally more a question of what the data will be used for than information loss.
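
To make the bit counts above concrete, here is a minimal Python sketch of bit truncation followed by lossless compression. It assumes plain min/max linear quantization and zlib, and it fakes a gain-normalized movie average with Gaussian noise around 10 e-/pixel. None of this is taken from EMAN2's code or from the paper, and all function names are made up for illustration.

```python
import random
import zlib
from array import array

def quantize(values, bits):
    """Linearly map floats onto integer codes 0 .. 2**bits - 1 (assumed scheme)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step

def dequantize(codes, lo, step):
    """Invert quantize() up to the rounding error of at most step / 2."""
    return [lo + c * step for c in codes]

def compressed_bytes(codes):
    """Size after zlib compression, storing each code in a 16-bit container."""
    return len(zlib.compress(array("H", codes).tobytes(), 9))

def rms_error(a, b):
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

# Hypothetical stand-in for a gain-normalized average: mean 10 e-/pixel,
# Gaussian approximation of the shot noise (sigma = sqrt(10)).
rng = random.Random(0)
pixels = [rng.gauss(10.0, 10.0 ** 0.5) for _ in range(20000)]

for bits in (3, 4, 5, 8, 12):
    codes, lo, step = quantize(pixels, bits)
    err = rms_error(pixels, dequantize(codes, lo, step))
    print(f"{bits:2d} bits: {compressed_bytes(codes):6d} bytes, "
          f"RMS error {err:.4f} e-/pixel")
```

Comparing the printed RMS error with the noise level of the simulated image (about 3.2 e-/pixel here) gives a feel for when extra bits stop buying anything and only add bytes.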

It's all explained in quite a lot of detail in the paper... email me if you want to see the preprint.

--------------------------------------------------------------------------------------
Steven Ludtke, Ph.D. <slu...@bcm.edu>                      Baylor College of Medicine
Charles C. Bell Jr., Professor of Structural Biology        Dept. of Biochemistry 
Deputy Director, Advanced Technical Cores                   and Molecular Biology
Academic Director, CryoEM Core
Co-Director CIBR Center




Niels Volkmann

May 9, 2022, 9:22:33 AM
to em...@googlegroups.com
Hi Steve,

I hope you are doing well. I would like to get a pre-print of the paper.

Cheers,
Niels

Steve Ludtke

May 9, 2022, 9:36:39 AM
to em...@googlegroups.com
Hi Niels,
missed you at the Tahoe meeting. preprint attached
-----------------------------------------
Steven Ludtke, slud...@gmail.com 


compression_preprint.pdf

vadim....@gmail.com

May 9, 2022, 10:15:27 AM
to EMAN2
Dear Steve,

many thanks for a detailed response, and also for sharing the manuscript!

I have a short follow-up question regarding the raw movies. I tried to use e2proc2d.py to convert a movie in MRC format to TIF, however, the command would only save the first frame. How can I create a multi-frame TIF using EMAN2?

% e2version.py
EMAN 2.91 final ( GITHUB: 2021-03-08 11:36 - commit: 81caed2 )
Your Python version is: 3.7.9
% e2iminfo.py test_movie.mrc
test_movie.mrc     1 images in MRC format     4096 x 4096 x 24
representing 0 particles
% e2proc2d.py --threed2twod test_movie.mrc output.tif
Process 3D as a stack of 24 2D images
24 images, processing 0-23 stepping by 1
1 images
% e2iminfo.py output.tif
output.tif     1 images in TIFF format     4096 x 4096
representing 0 particles


The same issue occurs when I rename "test_movie.mrc" to "test_movie.mrcs".

Ludtke, Steven J.

May 9, 2022, 11:06:43 AM
to em...@googlegroups.com
Ahh, sorry for the misunderstanding there. 
- HDF can certainly be used for raw frames with compression. I was just suggesting that truncating bits from data with only 1 or 2 bits (typical movie frames) wasn't very profitable. Storing in compressed HDF with more bits than the data possesses will not truncate any bits, i.e. 4 or 5 bits is fine for individual movie frames too, and will still take advantage of unused bits on the upper end.

- My statement about TIFF movies was pointing out that the compressed TIFF movies produced by many microscope facilities are already compressed as much as they safely can be if you wish to retain the raw movie frames. The uncertainty concept invoked in the paper isn't useful until expectation values are at least 3-4 bits.

- We don't support writing TIFF stacks in EMAN2 yet (reading is fine). This was not intentional, but was an oversight which then got caught up in the whole issue with TIFF support for large files. We hope to rectify this soon. As we discuss in the manuscript, TIFF would make a very nice alternative to HDF if not for a couple of specific issues the TIFF community hasn't really managed to address properly yet.
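
The first point above (storing low-count frames with more bits than they contain) can be illustrated with a small Python sketch. It assumes a writer that only rescales values when their range exceeds what the requested number of bits can hold. That rule, the `store`/`restore` names, and the count probabilities of the simulated frame are all assumptions made for illustration, not EMAN2's implementation.

```python
import random
import zlib

def store(counts, bits):
    """Encode integer counts; rescale only if they do not fit in `bits` (assumed rule)."""
    lo, hi = min(counts), max(counts)
    levels = (1 << bits) - 1
    span = hi - lo
    step = 1 if span <= levels else span / levels
    codes = [round((c - lo) / step) for c in counts]
    return codes, lo, step

def restore(codes, lo, step):
    return [lo + c * step for c in codes]

# Hypothetical counting-mode frame: mostly zeros, values never above 3 (2 bits).
rng = random.Random(1)
frame = rng.choices([0, 1, 2, 3], weights=[90, 8, 1.5, 0.5], k=10000)

codes, lo, step = store(frame, bits=5)
print("lossless at 5 bits:", restore(codes, lo, step) == frame)

codes1, lo1, step1 = store(frame, bits=1)
print("lossless at 1 bit: ", restore(codes1, lo1, step1) == frame)

# The unused upper bits of each byte compress away almost for free.
packed = zlib.compress(bytes(codes), 9)
print("raw bytes:", len(codes), "compressed bytes:", len(packed))
```

In this sketch, asking for 5 bits leaves the 2-bit data untouched, and the lossless compressor absorbs the unused upper bits, which is consistent with the remark that 4 or 5 bits is harmless for individual movie frames.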


