Expected memory allocations


Bruce Dawson

Jul 28, 2023, 12:33:01 AM
to libjpeg-turbo User Discussion/Support
While looking at Chrome's performance when loading a huge .jpg image (10,000 x 8,000) I noticed that libjpeg-turbo was making three 160,000,000-byte allocations when decoding this image. I debugged the realize_virt_arrays code a bit and talked to some people and learned that it is allocating an array of JBLOCK structures. Each of those covers an 8x8-pixel region, so there are 1,250 x 1,000 of them. The image is being decoded to YUV420 format, and each JBLOCK is 128 bytes (64 two-byte DCT coefficients), so that's where the 160,000,000 bytes comes from.
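That arithmetic can be sanity-checked directly (illustrative code, not from libjpeg-turbo; it assumes 8x8 blocks and 2-byte coefficients, i.e. sizeof(JCOEF) == 2):

```c
#include <assert.h>

/* Bytes of JBLOCK storage needed for one full-resolution component
 * plane: one 8x8 block per 64 pixels, 64 coefficients x 2 bytes each.
 */
long jblock_plane_bytes(long width, long height)
{
    long blocks_w = (width + 7) / 8;   /* 10,000 px -> 1,250 blocks */
    long blocks_h = (height + 7) / 8;  /*  8,000 px -> 1,000 blocks */
    long jblock_size = 64 * 2;         /* 128 bytes per JBLOCK      */
    return blocks_w * blocks_h * jblock_size;
}
```

For a 10,000 x 8,000 plane this gives 1,250 x 1,000 x 128 = 160,000,000 bytes, matching each of the observed allocations.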

The question, then, is whether it is expected that realize_virt_arrays would do three of these allocations. I can't figure out what the other two are used for and so I was wondering if anybody here has any insights. 480,000,000 bytes is more than I had expected.

It could be something with how Chrome is using libjpeg-turbo. I haven't tried creating an isolated repro of the issue.

DRC

Jul 28, 2023, 4:53:44 PM
to libjpeg-t...@googlegroups.com
OK, this provides more context for the issue you filed:

https://github.com/libjpeg-turbo/libjpeg-turbo/issues/713

However, I don't think that the libjpeg API library actually zeroes any
of those buffers.  Correct me if I'm wrong.

When decompressing a large baseline JPEG image, the libjpeg API library
doesn't use more than 100 kB or so.  When decompressing a large
progressive JPEG image, however, the library will allocate a whole-image
buffer for it.  (The whole-image buffer is needed in order to collect
the decompressed output of each scan before decompressing the next
scan.)  In that case, the library should allocate about 6 bytes per
pixel when decompressing a 4:4:4 8-bit-per-sample progressive JPEG image
and about 3 bytes per pixel when decompressing a 4:2:0 8-bit-per-sample
progressive JPEG image.
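Those per-pixel figures follow from the coefficient-buffer layout. A rough model (illustrative, not libjpeg code; assumes 2-byte coefficients and ignores rounding of partial edge blocks):

```c
#include <assert.h>

/* Approximate bytes of whole-image DCT-coefficient storage per pixel
 * for an 8-bit JPEG, given the subsampling divisors of the two chroma
 * components (1,1 = 4:4:4; 2,1 = 4:2:2; 2,2 = 4:2:0).
 */
double coef_bytes_per_pixel(int h_div, int v_div)
{
    double luma = 128.0 / 64.0;        /* one 128-byte JBLOCK per 64 px */
    double chroma = 2.0 * luma / (h_div * v_div);  /* Cb and Cr planes  */
    return luma + chroma;
}
```

So 4:4:4 comes out to 6 bytes per pixel and 4:2:0 to 3. At 10,000 x 8,000 pixels, 6 bytes per pixel is 480 MB, which would line up with the three observed 160,000,000-byte allocations being one full-resolution coefficient array per component.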

In other words, I would expect the library to allocate about 480 MB if
the 10,000 x 8,000 image uses 4:4:4 subsampling.  (You mention that the
image is being decoded into YUV420 format, but you never mentioned
whether the JPEG image itself was 4:2:0.)

I don't really know of a way around that, unfortunately, except to use
partial image decompression.
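For reference, partial image decompression in the libjpeg-turbo API (jpeg_crop_scanline / jpeg_skip_scanlines, available since 1.5) looks roughly like this. This is a sketch only: error handling and the data source are omitted, and for a progressive image the whole-image coefficient buffer may still be allocated regardless:

```c
#include <jpeglib.h>

/* Decode only rows [band_top, band_top + band_height) of an image.
 * cinfo must already have a data source set and jpeg_read_header() done.
 */
static void decode_band(j_decompress_ptr cinfo,
                        JDIMENSION band_top, JDIMENSION band_height)
{
    jpeg_start_decompress(cinfo);

    /* One-row output buffer, freed automatically with the image pool. */
    JSAMPARRAY buf = (*cinfo->mem->alloc_sarray)
        ((j_common_ptr)cinfo, JPOOL_IMAGE,
         cinfo->output_width * cinfo->output_components, 1);

    jpeg_skip_scanlines(cinfo, band_top);            /* rows above band */
    while (cinfo->output_scanline < band_top + band_height)
        jpeg_read_scanlines(cinfo, buf, 1);          /* the band itself */
    jpeg_skip_scanlines(cinfo,
        cinfo->output_height - cinfo->output_scanline);  /* rows below */

    jpeg_finish_decompress(cinfo);
}
```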

DRC

Bruce Dawson

Jul 28, 2023, 6:05:04 PM
to libjpeg-t...@googlegroups.com
I played around a bit with memory breakpoints to see when the three buffers are used. Memory breakpoints are amazing. It was while doing this that I hit the code that zeroes the 160,000,000 byte buffers. This is the call stack (from a debug build, for ease of debugging):

  vcruntime140d.dll!memset_repstos() Line 35 Unknown
> blink_platform.dll!chromium_jzero_far(void * target, unsigned __int64 bytestozero) Line 133 C
  blink_platform.dll!access_virt_barray(jpeg_common_struct * cinfo, jvirt_barray_control * ptr, unsigned int start_row, unsigned int num_rows, int writable) Line 968 C
  blink_platform.dll!consume_data(jpeg_decompress_struct * cinfo) Line 205 C
  blink_platform.dll!chromium_jpeg_consume_input(jpeg_decompress_struct * cinfo) Line 332 C
  blink_platform.dll!blink::JPEGImageReader::Decode(blink::JPEGImageDecoder::DecodingMode decoding_mode) Line 654 C++

Note that Chromium uses the preprocessor to rename jzero_far and jpeg_consume_input, but they are otherwise unchanged.

It zeroes the memory one row at a time, so 160,000 bytes at a time in this case. As you say in the issue, changing that would be challenging and would only help the progressive-JPEG case. The comment above that block of code in access_virt_barray says:

  /* Ensure the accessed part of the array is defined; prezero if needed.
   * To improve locality of access, we only prezero the part of the array
   * that the caller is about to access, not the entire in-memory array.
   */
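The lazy prezeroing that comment describes can be illustrated like this (a simplified sketch with made-up names, not the actual jmemmgr.c code):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned char *data;  /* the large allocation, not zeroed up front */
    size_t row_bytes;     /* 160,000 bytes per block row in this case  */
    size_t rows_zeroed;   /* rows [0, rows_zeroed) are already defined */
} virt_array;

/* Make rows [start, start + count) safe to access: zero any
 * still-undefined rows up to the end of the requested region,
 * one row (one memset) at a time.
 */
void access_rows(virt_array *a, size_t start, size_t count)
{
    size_t end = start + count;
    for (size_t r = a->rows_zeroed; r < end; r++)
        memset(a->data + r * a->row_bytes, 0, a->row_bytes);
    if (end > a->rows_zeroed)
        a->rows_zeroed = end;
}
```

Because only a high-water mark advances, the first access to each block row pays one full-row memset, which is what the memset_repstos frame in the call stack is doing.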

With my test image, on my machine (I'm not sure what might cause results to vary, but maybe it's timing-dependent?), my memory breakpoints also showed that the second of the three buffers is never used. My guess is that this depends on how many passes the progressive JPEG contains.

Aside: the more I look at this, the more I dislike progressive JPEGs - there are clearly many situations where they make things worse.

I do recognize that libjpeg-turbo is under-resourced, so I assume that if I want my issues worked on, I'll probably have to do it myself or convince Google to donate more money. I appreciate your taking the time to answer my naive questions.

