Google Chrome toBlob / drawImage vs OpenCV imencode / resize differences

VR

Jun 1, 2023, 10:58:12 AM6/1/23
to Chromium-dev, Daniel Toth
Hello everyone,

TL;DR
I am trying to understand how and why Google Chrome's toBlob & drawImage methods differ from OpenCV's imencode & resize functions. I created a small, reproducible example showing that the same image, resized and/or encoded by both, sometimes produces different results in Google Chrome and OpenCV.

How
I created a Python script which takes the relative path of an image (only tested with .bmp, .jpg & .png), the destination width/height and the destination encoding format. The script encodes the raw bytes of the input image to base64, sends it to Google Chrome, loads it into an image and finally resizes (drawImage) and encodes (toBlob) it.
In parallel, the script applies the exact same operations using the equivalent OpenCV functions (resize & imencode). Finally, it prints the mean absolute error along with the source & destination parameters.

Expectation
The expected result is that both pipelines produce very close results.

Limits
The experiment was run on only a few frames; running it on thousands of frames would confirm whether the behavior holds on other data.

Result
Interpretation of the current results (here "reproducible" means the Chrome and OpenCV outputs match):
  • Without resizing
    • Using BMP as source image
      • encoding to JPEG: not reproducible
      • encoding to BMP: reproducible
      • encoding to PNG: reproducible
    • Using PNG as source image
      • encoding to JPEG: not reproducible
      • encoding to BMP: reproducible
      • encoding to PNG: reproducible
    • Using JPG as source image: never reproducible
  • With resizing: never reproducible
My interpretations are:
  • The resize operation is not reproducible
  • PNG & BMP decoding and encoding are reproducible
  • JPEG decoding and/or encoding is not reproducible

Code
compare.py
import cv2
import base64
import numpy as np
import os

from selenium.webdriver import Chrome


def compute(path, d_mime_type, d_width=None, d_height=None):
    """This function computes the mean absolute difference between an image resized then encoded
    from the browser and the same image resized then encoded from opencv.

    Args:
        path (str): Path to the image to resize and encode.
        d_mime_type (str): Destination mime type of the image to encode.
        d_width (int, optional): Destination width of the image to encode. Defaults to None.
        d_height (int, optional): Destination height of the image to encode. Defaults to None.

    Returns:
        float: Mean absolute difference between the two images.

    """

    # Decoding input image
    image = cv2.imread(path)

    # Setting default values
    s_width = image.shape[1]
    s_height = image.shape[0]
    d_width = d_width if d_width else s_width
    d_height = d_height if d_height else s_height
    s_mime_type = os.path.splitext(path)[-1][1:]

    # Encoding raw image to base64 (for the browser)
    encoded_image = f"data:image/{s_mime_type};base64," + base64.b64encode(open(path, "rb").read()).decode()

    # Getting results from the browser
    driver = Chrome()
    driver.get("file://" + os.path.join(os.getcwd(), "index.html"))
    browser_result = driver.execute_script(
        "return encodeAndResize(arguments[0], arguments[1], arguments[2], arguments[3])",
        encoded_image,
        f"image/{d_mime_type}",
        d_width,
        d_height
    ).split(",")[-1]
    browser_result = np.frombuffer(base64.b64decode(browser_result), np.uint8)
    browser_result = cv2.imdecode(browser_result, cv2.IMREAD_COLOR).astype(np.int16)
    driver.quit()

    # Getting results from opencv
    opencv_result = cv2.resize(image, (d_width, d_height), interpolation=cv2.INTER_LINEAR)
    _, opencv_result = cv2.imencode(f".{d_mime_type}", opencv_result)
    opencv_result = cv2.imdecode(opencv_result, cv2.IMREAD_COLOR).astype(np.int16)

    # Comparing results
    mae = np.mean(np.abs(browser_result - opencv_result))
    print(f"MAE {s_mime_type} ({s_width}x{s_height}) -> {d_mime_type} ({d_width}x{d_height}): {mae}")
    return mae


compute("example.bmp", "jpeg", 100, 100)
compute("example.bmp", "jpeg", 250, 250)
compute("example.bmp", "jpeg", 500, 500)
compute("example.bmp", "jpeg")

compute("example.bmp", "bmp", 100, 100)
compute("example.bmp", "bmp", 250, 250)
compute("example.bmp", "bmp", 500, 500)
compute("example.bmp", "bmp")

compute("example.bmp", "png", 100, 100)
compute("example.bmp", "png", 250, 250)
compute("example.bmp", "png", 500, 500)
compute("example.bmp", "png")

compute("example.jpg", "jpeg", 100, 100)
compute("example.jpg", "jpeg", 250, 250)
compute("example.jpg", "jpeg", 500, 500)
compute("example.jpg", "jpeg")

compute("example.jpg", "bmp", 100, 100)
compute("example.jpg", "bmp", 250, 250)
compute("example.jpg", "bmp", 500, 500)
compute("example.jpg", "bmp")

compute("example.jpg", "png", 100, 100)
compute("example.jpg", "png", 250, 250)
compute("example.jpg", "png", 500, 500)
compute("example.jpg", "png")

compute("example.png", "jpeg", 100, 100)
compute("example.png", "jpeg", 250, 250)
compute("example.png", "jpeg", 500, 500)
compute("example.png", "jpeg")

compute("example.png", "bmp", 100, 100)
compute("example.png", "bmp", 250, 250)
compute("example.png", "bmp", 500, 500)
compute("example.png", "bmp")

compute("example.png", "png", 100, 100)
compute("example.png", "png", 250, 250)
compute("example.png", "png", 500, 500)
compute("example.png", "png")


index.html
<!doctype html>
<html lang="fr">
<head>
<meta charset="utf-8">
<script src="script.js"></script>
</head>
<body>
</body>
</html>


script.js
async function encodeAndResize(base64Image, dMimeType, dWidth, dHeight) {

    // Decode the image
    const image = new Image();
    image.src = base64Image;
    await image.decode();

    // Write the image to canvas
    const canvas = document.createElement("canvas");
    const { width, height } = image;
    dWidth = dWidth || width;
    dHeight = dHeight || height;
    canvas.width = dWidth;
    canvas.height = dHeight;
    const context = canvas.getContext("2d");
    context.drawImage(image, 0, 0, width, height, 0, 0, dWidth, dHeight);

    // Encode to the destination mime type
    const blob = await new Promise(resolve => canvas.toBlob(resolve, dMimeType));
    const data = await new Promise(resolve => {
        const reader = new FileReader();
        reader.onloadend = () => resolve(reader.result);
        reader.readAsDataURL(blob);
    });
    return data;
}


Environment
  • Ubuntu 22.04 LTS
  • Google Chrome 112.0.5615.121 (official build)
  • Python 3.10.6
  • numpy==1.21.5
  • selenium==4.9.0
  • opencv-python==4.7.0.72

Thank you in advance for any help.




Justin Novosad

Jun 6, 2023, 1:53:15 PM6/6/23
to vincent...@odin-vision.com, Chromium-dev, Daniel Toth
There are two main issues at play here: compression quality and resize quality.

Compression quality 

BMP and PNG are "lossless" image formats, so when you do an encode/decode roundtrip through those formats you will get back the exact same pixel values that you put in. The only exceptions are cases where pixel value conversions are required. For example, if the source image has a higher bit depth or a different colorspace than the image file's encoding, then you can expect a loss of precision.

JPEG, on the other hand, is a lossy image format that sacrifices fidelity to achieve higher compression ratios. This tradeoff varies from one JPEG encoder library to another, and most JPEG encoders have parameters that allow the user to control the quality vs compression tradeoff. Different web browsers may use different encoders or encoder settings for producing JPEG image files. HTMLCanvasElement.toBlob has a third argument that you can use to control compression quality. See the spec, here: https://html.spec.whatwg.org/multipage/canvas.html#dom-canvas-toblob. The quality argument has no effect on lossless formats like bmp or png; it only affects encoders that actually have a quality tradeoff. The behaviour of the quality setting is not completely defined in the spec and may vary between web browsers. This is normal considering that there is no universal way to parameterize JPEG encoding quality with a single number. The spec says:
Different implementations can have slightly different interpretations of "quality". When the quality is not specified, an implementation-specific default is used that represents a reasonable compromise between compression ratio, image quality, and encoding time.

If you are concerned about image fidelity, I recommend you use the highest quality setting of 1.0 when using toBlob to make JPEG files.
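
A minimal sketch of what that looks like, adapting the toBlob call from the script.js in the original post (the explicit "image/jpeg" and 1.0 values are just my suggestion):

    // Request maximum JPEG quality from toBlob.
    // The third argument only applies to lossy encoders such as image/jpeg;
    // it is ignored for lossless formats like image/png.
    const blob = await new Promise(resolve =>
        canvas.toBlob(resolve, "image/jpeg", 1.0)
    );

Even at quality 1.0 the JPEG roundtrip is still lossy, so some pixel difference against OpenCV's encoder is still to be expected; the setting only narrows the gap.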

Resize quality

Resizing an image requires resampling the pixels, and there isn't a single universal way to do that. As with compression, there are tradeoffs; in this case the main tradeoff is between performance and visual quality. When upscaling images we typically interpolate pixel values, and for downscaling we either sample a single pixel from the source image (which may lead to aliasing artifacts such as Moiré patterns), or we do a weighted average of the values of the source pixels that intersect the destination image's pixel (higher quality). The weighted average approach can be implemented in many ways, with different algorithms for computing the weights. In GPU-accelerated rendering, we often use a computationally efficient approach known as mipmapping.

When using CanvasRenderingContext2D.drawImage to resample an image, the resampling quality is controlled by the rendering context's imageSmoothingQuality attribute. This attribute can be set to "low", "medium", or "high". By default it uses "low", which typically means bilinear interpolation for both upscaling and downscaling. This is fast but may yield poor quality results, especially for downscaling.

The HTML spec does not specify exactly which algorithms get used for each of the low/medium/high settings, because different web browsers may use different graphics libraries that provide different image resampling algorithms. Therefore, you cannot expect the exact behavior of imageSmoothingQuality to be interoperable between browsers (as with JPEG compression quality). In fact, the behavior is currently not even exactly interoperable between different computers running the same version of Chrome, because the "medium" and "high" settings use the mipmapping algorithm provided by the graphics driver. So two computers with different GPU brands/models/driver versions may sometimes produce slightly different results.
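
A minimal sketch of requesting the higher-quality path in the original script.js (the attribute values shown are the standard ones; which filter you actually get remains implementation-defined):

    // Ask for higher-quality resampling before calling drawImage.
    // "low" is the default; "medium"/"high" may map to different filters
    // (e.g. mipmap-based) depending on the browser, GPU and driver.
    const context = canvas.getContext("2d");
    context.imageSmoothingEnabled = true;    // already true by default
    context.imageSmoothingQuality = "high";
    context.drawImage(image, 0, 0, width, height, 0, 0, dWidth, dHeight);

Note that OpenCV's INTER_LINEAR is a pure CPU bilinear filter, so even the "low" setting is unlikely to be bit-exact against it.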

