Intent to Ship: Deprecate module size limit for WebAssembly.Module()

Andreas Haas

Apr 5, 2023, 9:05:43 AM
to blink-dev

Contact emails

ah...@google.com

Explainer

None

Specification

None

Summary

There exists a limit on the size of a module that can be compiled with `new WebAssembly.Module()` on the main thread. This limit is 4KB, and it was introduced when WebAssembly modules got compiled eagerly with an optimizing compiler, which could block the main thread for many seconds and even minutes. In the meantime V8 launched lazy compilation for WebAssembly modules, and the execution time of `new WebAssembly.Module()` is below 1 second even for the biggest modules we see, even on the weakest devices we measured. Therefore it is time to remove this limit.
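
For readers unfamiliar with the two entry points, here is a minimal sketch of synchronous versus asynchronous compilation. The 8-byte empty module is an illustrative stand-in, not something taken from the intent; a real module would be far larger, which is exactly where the limit applies.

```javascript
// The smallest valid module: the "\0asm" magic number followed by version 1.
const emptyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Synchronous: blocks the calling thread until the module is ready.
// This is the call that the size limit applies to on the main thread.
const syncModule = new WebAssembly.Module(emptyModuleBytes);

// Asynchronous: returns a promise and leaves the calling thread free.
const asyncModulePromise = WebAssembly.compile(emptyModuleBytes);

console.log(syncModule instanceof WebAssembly.Module);
console.log(asyncModulePromise instanceof Promise);
```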



Blink component

Blink>JavaScript>WebAssembly

TAG review

None

TAG review status

Not applicable

Risks



Interoperability and Compatibility



Gecko: Shipped/Shipping

WebKit: Shipped/Shipping

Web developers: Strongly positive. We received repeated bug reports because of this limit. Synchronous compilation with `new WebAssembly.Module()` is especially useful for tests, but the size limit prevents larger tests from using it.

Other signals:

WebView application risks

Does this intent deprecate or change behavior of existing APIs, such that it has potentially high risk for Android WebView-based applications?

None



Debuggability



Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes

Is this feature fully tested by web-platform-tests?

No

Flag name



Requires code in //chrome?

False

Estimated milestones

Shipping on desktop: 114
Shipping on Android: 114
Shipping on WebView: 114


Anticipated spec changes

Open questions about a feature may be a source of future web compat or interop issues. Please list open issues (e.g. links to known github issues in the project for the feature specification) whose resolution may introduce web compat/interop risk (e.g., changing to naming or structure of the API in a non-backward-compatible way).

None

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5080569152536576

Links to previous Intent discussions



This intent message was generated by Chrome Platform Status.

--

Andreas Haas

Software Engineer

ah...@google.com


Google Germany GmbH

Erika-Mann-Straße 33

80636 München


Managing directors: Paul Manicle, Liana Sebastian

Registration court and number: Hamburg, HRB 86891

Registered office: Hamburg


This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.


Daniel Bratell

Apr 5, 2023, 9:55:01 AM
to Andreas Haas, blink-dev

LGTM1

This doesn't show up in our chromestatus UI. Have you sent it for "shipping" there? If no further comments arrive, it may be that it has fallen off our radar because of that.

/Daniel

--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/blink-dev/CAELSTve0zdDNeCDXvG%3D73-zVy8Fps_9eFErWfOocSfxbzOxGHQ%40mail.gmail.com.

Yoav Weiss

Apr 5, 2023, 9:57:15 AM
to Andreas Haas, blink-dev
Is it interoperably tested by other means? I'm not super familiar with WASM testing.
 

Andreas Haas

Apr 5, 2023, 10:09:15 AM
to Yoav Weiss, blink-dev
Hi Yoav,

I'm not sure what you mean. At the moment this 4KB limit exists in Chrome, but it does not exist in Safari or Firefox. I tested this locally on my Macbook. I don't know if there exists another test at the moment which passes on Safari and Firefox but fails on Chrome, and would pass on Chrome after we remove the limit.

Cheers, Andreas

Alex Russell

Apr 13, 2023, 2:21:46 PM
to blink-dev, Andreas Haas, blink-dev, Yoav Weiss
"Below 1 second" for something that can block the main thread is not particularly heartening. Can you please provide the histogram data you're seeing to justify this? Would you be happy to raise the cap to a larger (but still fixed) size based on a baseline device config instead?, e.g.:

https://infrequently.org/2022/12/performance-baseline-2023/

Best,

Alex

Andreas Haas

Apr 14, 2023, 5:00:08 AM
to Alex Russell, blink-dev, Yoav Weiss
Hi Alex,


I think the question is rather: how can we justify such a limit? I agree that it is not a good experience if the main thread is blocked for 1 second, but we have to consider the scenario in which this happens: the main thread is blocked for one second after a WebAssembly module tens of megabytes in size has been downloaded.

Additionally, in the current environment it is not likely that you end up serving a big WebAssembly module to the user with synchronous compilation by accident. WebAssembly modules are typically generated by compilers which also generate the JS glue code around it. These compilers produce glue code that uses asynchronous compilation or even streaming compilation. Therefore a developer would have to make an effort to even serve a big WebAssembly module with synchronous compilation.

There are scenarios where developers make this effort, and I don't think we should prevent developers when they make this conscious decision. One such scenario is tests. It is much easier to write and run tests with synchronous compilation. We run nearly all our WebAssembly tests in V8 with synchronous compilation. We also got bug reports repeatedly where developers struggle with their tests because of the 4KB limit.

So overall I think the limit was justified in the beginning, but now with lazy compilation and baseline compilation this justification is gone. I don't think this limit makes the web a better place anymore, it just makes the life of developers difficult in specific niche situations.

Cheers, Andreas


Ian Kilpatrick

Apr 14, 2023, 2:18:23 PM
to Andreas Haas, Alex Russell, blink-dev, Yoav Weiss
Out of curiosity - 
What is performance like on a low tier Android phone (I see only a Pixel 7 tested above)?
What is the performance of your benchmark on other browsers - across device classes? (Even if they don't have this limit - this intent will mean that it'll be interoperable to use the sync method - potentially causing compat problems for the other browsers).

Ian

Andreas Haas

Apr 17, 2023, 6:12:04 AM
to Ian Kilpatrick, Alex Russell, blink-dev, Yoav Weiss
Hi Ian,


You need corp access for it, and I didn't have access to low tier Android phones with corp access.

Safari also compiles lazily, so their compile times are similar to ours. Firefox compiles modules eagerly, and therefore takes longer. I don't really have the devices or the setup to do the measurements on other browsers. I measured the performance of Firefox on my workstation, where the compilation of the 80MB module takes slightly less than 1.6 seconds. This is about 60% slower than Chrome with eager compilation. I tried Chrome with eager compilation on the atlas Chromebook. Compilation of the 80MB module takes 2.8 seconds there.

Cheers, Andreas

Philip Jägenstedt

Apr 19, 2023, 11:59:21 AM
to Andreas Haas, Ian Kilpatrick, Alex Russell, blink-dev, Yoav Weiss
Hey Andreas,

Do you know what the limits of other browsers are? If testing a 1 GB module is too slow to be reliable (sometimes timing out) then perhaps there's a large-ish module you can test with that still exceeds the current limits?

Note that you could also add a manual test in WPT for the real limit (1 GB) and run it at least once manually to ensure it works the same in all browsers.

Best regards,
Philip

Andreas Haas

Apr 20, 2023, 6:58:25 AM
to Philip Jägenstedt, Ian Kilpatrick, Alex Russell, blink-dev, Yoav Weiss
Hi Philip, Yoav,

I added a test to the wasm spec tests now, see https://github.com/WebAssembly/spec/pull/1642. It creates modules of size 1GB and 1GB+1 and checks that compilation passes or fails, respectively. The modules consist of a single custom section, so that the test adds minimal processing and module-creation time.
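
To illustrate the idea of a module that consists of nothing but one custom section, here is a minimal sketch. It is not the code from the linked pull request: the section name "pad", the helper names, and the 8192-byte payload are assumptions chosen for illustration, and the payload is kept small so the example runs quickly.

```javascript
// Unsigned LEB128 encoding, as used for sizes in the WebAssembly binary format.
function encodeU32Leb(value) {
  const out = [];
  do {
    let byte = value & 0x7f;
    value >>>= 7;
    if (value !== 0) byte |= 0x80; // more bytes follow
    out.push(byte);
  } while (value !== 0);
  return out;
}

// Builds a module made of the 8-byte header plus one custom section named
// "pad" (a hypothetical name) carrying `payloadSize` zero bytes.
function buildPaddedModule(payloadSize) {
  const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00];
  const name = [0x03, 0x70, 0x61, 0x64]; // length-prefixed "pad"
  const sectionSize = encodeU32Leb(name.length + payloadSize);
  const prefix = [...header, 0x00 /* custom section id */, ...sectionSize, ...name];
  const bytes = new Uint8Array(prefix.length + payloadSize); // payload stays zero-filled
  bytes.set(prefix, 0);
  return bytes;
}

const paddedBytes = buildPaddedModule(8192); // larger than the old 4KB limit
const paddedModule = new WebAssembly.Module(paddedBytes);
const padSections = WebAssembly.Module.customSections(paddedModule, 'pad');
console.log(paddedBytes.length, padSections.length, padSections[0].byteLength);
```

Because the engine skips the contents of an unknown custom section, the module size can be scaled up by changing only `payloadSize`.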

As far as I know, the other browsers never had a special limit on the module size, other than the spec'ed 1GB limit. I confirmed that now with Firefox.

Cheers, Andreas

Yoav Weiss

Apr 20, 2023, 7:06:30 AM
to Andreas Haas, Philip Jägenstedt, Ian Kilpatrick, Alex Russell, blink-dev
LGTM2

Thanks for testing this! :)

Alex Russell

Apr 20, 2023, 2:58:27 PM
to blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Alex Russell, blink-dev, Andreas Haas
Thanks for the document, Andreas.

The numbers in it are still hugely concerning, and I'm a -1 until and unless we have data from P75-P90 Androids and Windows devices. Our telemetry from Edge shows that nearly half of users are on slow, spinning rust and 2-4 core devices, and the only system in your test list that reflects something in this range is the N2840 in the HP Chromebook 13 G1.

Motivations in the document regarding testing are not compelling. Folks doing testing can provide flags to the browser (which you could expose to raise the limit without an Intent).

Would you be willing to accept a higher cap? Your document suggests that ~10MiB (unzipped) might be reasonable, but I would want to see data from low-end Android before going even that far.

Thanks.


Andreas Haas

Apr 21, 2023, 7:47:29 AM
to Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick
Hi Alex,

I will try to organize a low-end device to do the measurements you are asking for. Could you please describe how the results would guide your decision on this issue?

I would also ask you to clarify your concerns. What are the scenarios you have in mind, and what are the improvements a user would get from this limit?

When I think of concrete scenarios, I end up concluding either that the scenario is unlikely, or that the limit would not make a big difference for the user experience:
* Unlikely, because it is hard to end up with a big WebAssembly module that you compile synchronously. You don't write WebAssembly modules by hand, you generate them with a compiler like Emscripten. The compiler typically generates not only the WebAssembly module but also the code that downloads and compiles it. A scenario where the limit may matter is one where the compiler decides to use synchronous compilation instead of asynchronous compilation. However, why would the compiler do that?
    * Because synchronous compilation provides benefits? What would these benefits be? The WebAssembly module already gets downloaded asynchronously, so continuing with synchronous compilation has no advantage over asynchronous compilation.
    * By accident, because the compiler writer does not know better? I think it's unlikely that a compiler writer would choose synchronous compilation by accident, that this compiler is then used to produce a WebAssembly module of tens of MB, and that the resulting WebAssembly module is then used in any reasonable webpage.
* The limit would not make a difference for the following reason: on the website you cite above, the average network bandwidth is 9Mbps. Even compressed, the 80MB module in my measurements would take tens of seconds to download. Does blocking the main thread for less than 1 second really matter to the user after they waited many times as long for the download?
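
The download-time estimate in the last bullet can be checked with a few lines of arithmetic. This is a sketch: the 80MB size and the 9Mbps bandwidth come from the message above, while the compression ratio of 0.3 is a hypothetical value picked only to show the order of magnitude.

```javascript
// Time to transfer a payload of sizeMB megabytes over a link of linkMbps
// megabits per second.
function downloadSeconds(sizeMB, linkMbps) {
  const megabits = sizeMB * 8; // 1 byte = 8 bits
  return megabits / linkMbps;
}

const uncompressedSeconds = downloadSeconds(80, 9);

// Hypothetical compression ratio, chosen only for illustration.
const assumedRatio = 0.3;
const compressedSeconds = downloadSeconds(80 * assumedRatio, 9);

console.log(uncompressedSeconds.toFixed(1), compressedSeconds.toFixed(1));
```

Uncompressed, the transfer takes 640 / 9 seconds, a little over 71 seconds. Under the assumed ratio the compressed transfer still lands in the tens of seconds, well above the sub-second compile time being discussed.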

As I wrote before, I understand why this limit was introduced originally. However, these reasons don't exist anymore, and all that remains is a spec violation, see https://webassembly.github.io/spec/js-api/index.html#limits. Even a limit of 10MB is still a spec violation, and as I wrote before, I don't see any justification for that.

Cheers, Andreas


K. Moon

Apr 21, 2023, 9:44:23 AM
to Andreas Haas, Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick
I don't have any authority here, but my two cents: Given there's a synchronous and an asynchronous API already, why not let the page author decide the right trade-off for themselves? I know in general we'd like to discourage authors from blocking the renderer main thread, but there already seem to be plenty of incentives not to do that already, and the well-lit paths don't encourage large synchronous compilations. It'd be different if this somehow required blocking the browser main thread, of course.

Alex Russell

Apr 24, 2023, 1:13:23 PM
to blink-dev, Andreas Haas, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Alex Russell
  Hey Andreas,

First, I deeply appreciate your willingness to do the work to get us data.

I'm generally interested in understanding the potential for modules to create main-thread jank. If we accept the CWV INP threshold of ~200ms, I would like to make sure that the vast majority of users (P90? P95?) don't hit synchronous operations longer than that. So the question about sizing is relative to that goal.

Put differently, if we raise the limit to XMB vs. YMB, and developers begin to develop to that new threshold, how much jank can this introduce?

As for "does blocking the main thread for 1s really matter?", all I can tell you is that this project has worked diligently for more than a decade to try to eliminate sources of main-thread jank. It would be surprising if we relaxed that goal for this specific feature.

Best,

Alex




Andy Wingo

Apr 26, 2023, 3:30:06 AM
to 'Andreas Haas' via blink-dev, Alex Russell, Andreas Haas, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick
Hi,

Just a note about use-cases:

On Fri 21 Apr 2023 13:47, "'Andreas Haas' via blink-dev" <blin...@chromium.org> writes:

> * You don't write WebAssembly modules by hand, you generate them with
> a compiler like Emscripten.

The WebAssembly module itself may well generate, compile, and
instantiate a module -- a form of JIT code generation. See
https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly
for a longer discussion. Probably at some point there will be a
WebAssembly proposal to address this use case.

Sometimes synchronous compilation can be useful for the JIT use case,
for example so you can embed pointers to linear memory in the generated
module and be sure they are still valid by the time the generated module
runs. Of course, generally you would still want to design for
asynchronous compilation to avoid blocking the main loop, but in
development at least it's nice that synchronous instantiation is
possible.
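
As a rough illustration of this use case, here is a minimal sketch that generates a tiny module at run time and instantiates it synchronously, so that an embedded address is used immediately. It is a simplification: the generator runs in JavaScript rather than inside a WebAssembly module, the generated module does not import the memory, and the export name "get" and the offset 1024 are hypothetical.

```javascript
// Signed LEB128 encoding, the format i32.const uses for its immediate.
function encodeS32Leb(value) {
  const out = [];
  while (true) {
    const byte = value & 0x7f;
    value >>= 7;
    const signBitClear = (byte & 0x40) === 0;
    if ((value === 0 && signBitClear) || (value === -1 && !signBitClear)) {
      out.push(byte);
      return out;
    }
    out.push(byte | 0x80); // more bytes follow
  }
}

// Emits a module exporting one function, "get" (a hypothetical name), that
// returns `address` as an i32 constant.
function generateConstModule(address) {
  const body = [0x00, 0x41, ...encodeS32Leb(address), 0x0b]; // no locals; i32.const; end
  const code = [0x01, body.length, ...body]; // one function body, size-prefixed
  return new Uint8Array([
    0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
    0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,             // type section: () -> i32
    0x03, 0x02, 0x01, 0x00,                               // function section: func 0 has type 0
    0x07, 0x07, 0x01, 0x03, 0x67, 0x65, 0x74, 0x00, 0x00, // export section: "get" = func 0
    0x0a, code.length, ...code,                           // code section
  ]);
}

const memory = new WebAssembly.Memory({ initial: 1 }); // one 64KiB page
const address = 1024; // hypothetical offset into linear memory
const generated = new WebAssembly.Module(generateConstModule(address));
const instance = new WebAssembly.Instance(generated);
console.log(instance.exports.get() === address, address < memory.buffer.byteLength);
```

Because compilation and instantiation both finish before the next statement runs, nothing else can move or invalidate the data at `address` in between.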

Regards,

Andy

Andreas Haas

Apr 26, 2023, 11:49:31 AM
to Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick

Hi Alex,

Here are the performance measurements you requested on some Android phones I found.

size        Nokia 1     Pixel       Pixel 2     Pixel 3     Pixel 7
713B        4.57ms      2.96ms      1.46ms      0.47ms      0.22ms
25.92KB     5.54ms      3.79ms      1.44ms      1.77ms      0.98ms
35.12KB     86.39ms     46.52ms     16.32ms     13.17ms     8.70ms
249.84KB    13.99ms     7.53ms      3.17ms      3.27ms      2.16ms
294.95KB    14.70ms     8.02ms      3.86ms      3.23ms      2.34ms
378.2KB     18.61ms     10.56ms     7.06ms      4.83ms      4.48ms
768.98KB    78.11ms     38.79ms     28.45ms     18.43ms     5.70ms
1.26MB      45.29ms     22.59ms     18.35ms     7.20ms      5.20ms
2.89MB      89.30ms     35.56ms     32.37ms     13.77ms     11.06ms
3.28MB      215.52ms    83.29ms     84.42ms     32.57ms     21.96ms
6.65MB      127.58ms    59.96ms     42.95ms     22.00ms     16.34ms
8.9MB       398.86ms    218.55ms    114.45ms    49.70ms     42.54ms
10.08MB     535.84ms    281.89ms    144.76ms    65.43ms     53.22ms
13.56MB     327.44ms    136.32ms    87.69ms     51.83ms     44.26ms
19.12MB     504.83ms    232.17ms    133.18ms    88.73ms     75.36ms
34.54MB     1541.58ms   910.30ms    423.14ms    261.57ms    275.40ms
60.37MB     OOM         609.99ms    307.16ms    219.67ms    183.00ms
63.77MB     1491.76ms   735.30ms    360.98ms    214.80ms    168.56ms
64.08MB     1473.18ms   727.87ms    350.73ms    251.27ms    180.56ms
67.41MB     1499.28ms   809.60ms    380.74ms    298.00ms    228.82ms
83.26MB     OOM         708.36ms    384.34ms    272.90ms    (one value missing in the source; column alignment uncertain)
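
Numbers of this kind can be approximated with a small timing harness around the synchronous constructor. The sketch below is an assumption about method, not the harness that produced the table; the helper name `timeSyncCompile` is hypothetical, and the 8-byte empty module is only a stand-in for a real application module.

```javascript
// Times one synchronous compilation of `bytes` and returns the compiled
// module together with the elapsed milliseconds.
function timeSyncCompile(bytes) {
  const start = performance.now();
  const compiled = new WebAssembly.Module(bytes); // blocks until compilation finishes
  const elapsedMs = performance.now() - start;
  return { compiled, elapsedMs };
}

// Stand-in input: the 8-byte empty module. Real measurements would pass the
// bytes of an actual application module instead.
const standInBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const result = timeSyncCompile(standInBytes);
console.log(`compiled in ${result.elapsedMs.toFixed(2)}ms`);
```

A single run is noisy, so repeating the call and reporting a median would likely give steadier figures.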

As far as I understand the discussion here there are worries that if we remove this limit, then the user experience will regress, especially on low-end devices.

On the other hand, my goal with this proposal is to improve the developer experience.

Additionally I want to mention that this limit on synchronous WebAssembly compilation is an inconsistency in the web platform, and a violation of the WebAssembly standard which recommends a maximum WebAssembly module size of 1GB. In principle we should implement the standard, and if we don't agree with the standard, then we should work on changing the standard and not ignore it.

Back to user experience vs developer experience. I think for both sides this limit does not make a big difference. Developers will use asynchronous compilation anyways because that's what to-WebAssembly compilers generate automatically. Also, typically a WebAssembly module has to be downloaded first, and with the `WebAssembly.compileStreaming()` API there exists a tool that allows you to download and compile a WebAssembly module at the same time. So in the typical scenario where you first download a WebAssembly module and then you compile it, synchronous WebAssembly compilation does not provide any benefits to the developer, and it is also not the default. So developers would only decide consciously in special situations to use synchronous compilation.
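
The typical asynchronous path described above can be sketched as follows. This is an illustration rather than the glue code any particular toolchain emits; the function name `loadModule` and the fallback branch are assumptions.

```javascript
// Downloads and compiles a module without blocking the calling thread.
async function loadModule(url) {
  const responsePromise = fetch(url);
  if (typeof WebAssembly.compileStreaming === 'function') {
    // Compilation starts while the bytes are still arriving.
    return WebAssembly.compileStreaming(responsePromise);
  }
  // Fallback: wait for the full download, then compile asynchronously.
  const response = await responsePromise;
  return WebAssembly.compile(await response.arrayBuffer());
}
```

A caller would write something like `const compiled = await loadModule('app.wasm')`, where the file name is hypothetical. Note that `compileStreaming` expects the response to be served with the `application/wasm` MIME type.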

So which scenarios would cause a bad user experience?
A bad user experience would be if the main thread is blocked repeatedly for longer amounts of time, or at bad moments in time like during an animation. 
Now, WebAssembly module compilation takes time because new code gets provided to the web page. However, there is only so much code you want to load in your web page, especially code in modules that are bigger than 1MB. So the chance that the main thread gets blocked repeatedly for longer amounts of time is low.
So what about jank during animations? If we look at JavaScript, then JavaScript code gets typically loaded in one big script, or in many small scripts. The big script gets loaded during the startup of a web page, small scripts may get loaded over time. For WebAssembly it is the same so far, and will probably stay the same: big modules get loaded during the startup of a web page, only smaller modules would get loaded later. During page startup, users are used to waiting a bit for the page to start up, to load data from the server and to execute initialization code. Jank introduced by synchronous WebAssembly compilation would only be a small part during startup. Note that downloading the big WebAssembly module takes typically much longer than compiling it.
To sum up, jank introduced by synchronous WebAssembly compilation would either happen during page startup, where it is dominated by other delays like the much longer module download, or come from small modules compiled later, in which case the jank is small because the module is small.

Then, how does the limit make the developer experience bad?
First, as mentioned above, the limit is an inconsistency in the web platform and a violation of the WebAssembly standard. Inconsistencies in the web platform are known as big developer pain points, and the bug reports we got about this limit show the same.
Second, there are special scenarios where synchronous compilation has advantages. I already mentioned testing, where the order in which you compile and execute tests is important. Saying that developers should just set special browser flags to disable the limit sounds arrogant to me. You want to test your code in as realistic a context as possible, because any special configuration of the testing setup may lead to missed bugs.
But there are also other situations, especially for experimental projects where synchronous compilation could make things easier, like the JIT compiler that Andy Wingo mentioned.
There is also the situation where you would like to add a custom section to your WebAssembly module to support debugging. One example would be a name section that can be loaded by DevTools. This custom section would not change the compile time at all because custom sections get ignored by the WebAssembly engine. For the developer this custom section can be very valuable, but also very big. It would be a big developer pain point if a module is far below the limit in release mode but exceeds it once the debugging information in the custom section is added, so that the developer can no longer run their web page, at least not in Blink.
To sum up, the web developers only notice the limit in very specific scenarios, but when they do, the limit is surprising and a big developer pain point.

So overall I don't think this limit matters for most users or developers, but in the cases where it matters it is much more painful for the developer than for the user.

To answer your email specifically:

I don't think that the CWV INP threshold is a good guideline here, because, as I mentioned above, the main-thread jank caused by WebAssembly compilation is likely to happen when it matters less, and when it's dominated by other delays like the module download.

Cheers, Andreas


Alex Russell

Apr 27, 2023, 4:30:13 AM
to blink-dev, Andreas Haas, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Alex Russell, Elliott Sprehn, Chris Wilson, Jeffrey Yasskin
Hey Andreas,

Thank you for collecting that data! Making a decision based on evidence is much easier and less contentious. The Nokia 1 is a classic "all-slow-A53-cores, all the time" device on that terrible 28nm process that is (finally!) leaving the ecosystem. Devices with that profile are likely leaving service, and the Pixel and Pixel 2 are more evocative of where the low-end is likely to be today and in the near future, so I'm comfortable anchoring on that.

This raises some questions about your data, particularly the jumps up and down in the first six rows. Do you have an intuition for why the 35K module was so much more costly than those around which are nominally larger? Did it exercise different features or tickle different compilation paths? Should that sort of variance be taken into consideration here?

Your post also raises a set of points that have a history which I'm going to try to avoid fully recounting here. As I no longer serve as Standards TL for Chrome, you might consult with Chris and Jeff (cc'd) regarding the project's overall orientation towards letter vs. spirit of the law when it comes to standards making and our priority of constituencies in browser making and standards setting. Suffice to say, we have inherent leeway to go our own way when standards are a hazard to users, and the Blink Launch Process has been carefully designed and tended to maintain that flexibility.

I will, however, dig into a specific point from the initial design discussions which involved debate between the W3C TAG (on which I served at the time), the WASM WG, and the then-serving API OWNERS.

In the initial design, the current spec language around synchronous compilation existed in roughly its current form, and this was caught rather late as a sub-point of larger concerns about platform integration that the WASM WG had (generously) overlooked; things like CSP controls, etc. It was confidently asserted by some at the time that "nobody will ship synchronous compilation to their users", but when prodded by Elliott (cc'd), myself, and others, it turned out that this was exactly what the popular toolchains of the day were doing.

Blink took the position, based on the devices and networks that we serve the majority of our users on, that this was not a responsible approach and would lead to large regressions, akin to much of the badness that the V8 team has spent huge resources to mitigate with main-thread JS compilation (background). The explicit calculus with the 4K limit was based on a market reality: even if Chromium were the only engine to act responsibly, we would still generate the intended positive effect in the ecosystem. All of the points you've raised about developer needs and benefits were litigated at the time, and found wanting based on the project's overall goals.

It is, of course, frustrating that the WASM WG (including our participants there) have not seen fit to update the spec with more realistic guidance for browser embedders, but suffice to say, Chromium's approach has worked. The net effect of a fully unbounded proposal in this intent would be to create a predictable free-fire zone on main thread blocking, which is something that dozens of your colleagues have spent hundreds of person-years to reduce.

Obviously, we are no longer discussing a fully unbounded proposal (thank you!), but as we look to set a new limit, it will be helpful to know more details about where we can make tradeoffs that help developers without harming the user experience of the web. For example, do you (or other folks here) have an intuition (or data) about the needs of JITs? Will memory regions need to be self-contained WASM modules to enable reasonable behaviour? Or is chunking into more (smaller) modules acceptable in some situations?

Thanks again for gathering data and helping us make informed decisions. I deeply appreciate your willingness to compromise here.

Best,

Alex

To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+unsubscribe@chromium.org.


--

Andreas Haas

Software Engineer

ah...@google.com


Google Germany GmbH

Erika-Mann-Straße 33

80636 München


Geschäftsführer: Paul Manicle, Liana Sebastian

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg


Diese E-Mail ist vertraulich. Falls sie diese fälschlicherweise erhalten haben sollten, leiten Sie diese bitte nicht an jemand anderes weiter, löschen Sie alle Kopien und Anhänge davon und lassen Sie mich bitte wissen, dass die E-Mail an die falsche Person gesendet wurde.

    

This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.


--
You received this message because you are subscribed to the Google Groups "blink-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blink-dev+unsubscribe@chromium.org.

Andreas Haas

Apr 27, 2023, 10:16:19 AM
to Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Elliott Sprehn, Chris Wilson, Jeffrey Yasskin
Hi Alex,

On Thu, Apr 27, 2023 at 10:30 AM Alex Russell <sligh...@chromium.org> wrote:
Hey Andreas,

Thank you for collecting that data! Making a decision based on evidence is much easier and less contentious. The Nokia 1 is a classic "all-slow-A53-cores, all the time" device on that terrible 28nm process that is (finally!) leaving the ecosystem. Devices with that profile are likely leaving service, and the Pixel and Pixel 2 are more evocative of where the low-end is likely to be today and in the near future, so I'm comfortable anchoring on that.

This raises some questions about your data, particularly the jumps up and down in the first six rows. Do you have an intuition for why the 35K module was so much more costly than those around it, which are nominally larger? Did it exercise different features or tickle different compilation paths? Should that sort of variance be taken into consideration here?

The main difference is the number of imported functions. At the moment we still compile wrapper functions for JavaScript functions that get imported to WebAssembly. The number of imports is typically low. In the traces I analyzed, mostly of bigger modules, this wrapper compilation gets dominated completely by function validation. However, we have plans to implement a generic wrapper builtin so that we can avoid these wrapper compilations during startup.
 
Your post also raises a set of points that have a history which I'm going to try to avoid fully recounting here. As I no longer serve as Standards TL for Chrome, you might consult with Chris and Jeff (cc'd) regarding the project's overall orientation towards letter vs. spirit of the law when it comes to standards making and our priority of constituencies in browser making and standards setting. Suffice to say, we have inherent leeway to go our own way when standards are a hazard to users, and the Blink Launch Process has been carefully designed and tended to maintain that flexibility.

I will, however, dig into a specific point from the initial design discussions which involved debate between the W3C TAG (on which I served at the time), the WASM WG, and the then-serving API OWNERS.

In the initial design, the current spec language around synchronous compilation existed in roughly its current form, and this was caught rather late as a sub-point of larger concerns about platform integration that the WASM WG had (generously) overlooked; things like CSP controls, etc. It was confidently asserted by some at the time that "nobody will ship synchronous compilation to their users", but when prodded by Elliott (cc'd), myself, and others, it turned out that this was exactly what the popular toolchains of the day were doing.

Blink took the position, based on the devices and networks that we serve the majority of our users on, that this was not a responsible approach and would lead to large regressions, akin to much of the badness that the V8 team has spent huge resources to mitigate with main-thread JS compilation (background). The explicit calculus with the 4K limit was based on a market reality that, if no other engine were to be responsible, but Chromium would, that we would still generate the intended positive effect in the ecosystem. All of the points you've raised about developer needs and benefits were litigated at the time, and found wanting based on the project's overall goals.

It is, of course, frustrating that the WASM WG (including our participants there) have not seen fit to update the spec with more realistic guidance for browser embedders, but suffice to say, Chromium's approach has worked. The net effect of a fully unbounded proposal in this intent would be to create a predictable free-fire zone on main thread blocking, which is something that dozens of your colleagues have spent hundreds of person-years to reduce.

Obviously, we aren't discussing a fully unbounded proposal anymore (thank you!), but as we look to set a new limit, it will be helpful to know more details about where we can make tradeoffs that help developers without harming the user experience of the web. For example, do you (or other folks here) have an intuition (or data) about the needs of JITs? Will memory regions need to be self-contained WASM modules to enable reasonable behaviour? Or is chunking into more (smaller) modules acceptable in some situations?

I have no more data than my intuition. There are two kinds of JIT that I can think of: 1) a JIT as part of an execution environment (imagine V8) where code is generated just in time to be executed immediately. In that case, you would probably have many small modules that are linked together through shared tables and memories. 2) an online IDE which compiles a bigger project that can then be tested and debugged. In that case you would probably end up with one single module.
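
To make the first kind of JIT a bit more concrete, here is a minimal sketch of two tiny modules that are compiled synchronously and linked through one shared memory. The hand-assembled bytes, the import name env.mem, and the export names store and load are all assumptions made for illustration; they are not taken from V8 or from any toolchain mentioned in this thread.

```javascript
// Hypothetical example: two tiny modules linked through one shared memory.
const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]; // "\0asm", version 1

// Import section used by both modules: (import "env" "mem" (memory 1))
const importMem = [
  0x02, 0x0c, 0x01,
  0x03, 0x65, 0x6e, 0x76,       // "env"
  0x03, 0x6d, 0x65, 0x6d,       // "mem"
  0x02, 0x00, 0x01,             // memory, no maximum, minimum 1 page
];

// Module A: (func (export "store") (i32.store (i32.const 0) (i32.const 42)))
const writerBytes = new Uint8Array([
  ...header,
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                               // type: () -> ()
  ...importMem,
  0x03, 0x02, 0x01, 0x00,                                           // one function, type 0
  0x07, 0x09, 0x01, 0x05, 0x73, 0x74, 0x6f, 0x72, 0x65, 0x00, 0x00, // export "store"
  0x0a, 0x0b, 0x01, 0x09, 0x00,                                     // code: one body, no locals
  0x41, 0x00, 0x41, 0x2a, 0x36, 0x02, 0x00, 0x0b,                   // const 0, const 42, i32.store, end
]);

// Module B: (func (export "load") (result i32) (i32.load (i32.const 0)))
const readerBytes = new Uint8Array([
  ...header,
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,                         // type: () -> i32
  ...importMem,
  0x03, 0x02, 0x01, 0x00,                                           // one function, type 0
  0x07, 0x08, 0x01, 0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00,       // export "load"
  0x0a, 0x09, 0x01, 0x07, 0x00,                                     // code: one body, no locals
  0x41, 0x00, 0x28, 0x02, 0x00, 0x0b,                               // const 0, i32.load, end
]);

// Both modules are compiled synchronously and share the same memory object.
const sharedMemory = new WebAssembly.Memory({ initial: 1 });
const linkImports = { env: { mem: sharedMemory } };
const writer = new WebAssembly.Instance(new WebAssembly.Module(writerBytes), linkImports);
const reader = new WebAssembly.Instance(new WebAssembly.Module(readerBytes), linkImports);

writer.exports.store();                  // module A writes into the shared memory
const loaded = reader.exports.load();    // module B reads the value back
console.log(loaded);
```

Each module here is only a few dozen bytes, which is the shape a function-at-a-time JIT would produce; the second kind of JIT (an online IDE) would instead hand one large buffer to a single compile call.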
 

Elliott Sprehn

Apr 27, 2023, 5:01:17 PM
to Andreas Haas, Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin
There seem to be a lot of different things being discussed here:

(1) The limit seems too low for newer hardware or engine architectures.
(2) In some scenarios during development it would be nice to have synchronous compiles.
(3) The spec doesn't say anything about sync compile limits.
(4) There's an expectation (from some) that developers will "do the right thing" for performance.
(5) We should remove the limit entirely.

For (1), and from looking at the table, it seems like maybe ~25k is a more modern limit? That seems fine to me. Why is 35k slower than 250k though?

For (2) Alex is correct that the right "escape hatch" around best practices for development is runtime flags. There are many examples of this in the web platform, for example the ability to use invalid certs for localhost. It's also repeated multiple times in this thread that production code should be using async compiles, and that tooling usually generates that too. You also state that development should match production (which I agree with), so then development should also be async?

For (3) that sounds like a spec problem. The spec should be adjusted to allow the existence of implementation defined limits.

For (4) I think it's objectively false. Not just within performance, but engineering in general (see ex. Rust for safety by default :)). As someone who represents thousands of engineers building web apps I can tell you that it's incredibly hard to get performance right, and every time the platform imposes constraints to keep folks from shooting their feet off the outcomes at scale are better.

For (5) I think it would be a mistake to remove the limit entirely. I was very involved in adding this limit originally, and it came from looking at traces across many websites and working with many companies for years trying to make their web apps fast. We also noticed that early on folks using WASM were trying to compile larger and larger files on the main thread. That we avoided the mistakes of JS code loading for WASM should be viewed as a victory for the platform, not a wart.

A rough analogy would be to think about this from the angle of disk IO on the main thread. It would both make development/testing easier, and probably be fast some of the time, ex. ~3ms for readFileSync of .gitconfig on my Macbook. Even so, the web doesn't allow sync file IO from the main thread because the risks outweigh the DX improvements.

- E

Andreas Haas

Apr 28, 2023, 9:23:07 AM
to Elliott Sprehn, Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin
Hi Elliott,

On Thu, Apr 27, 2023 at 11:01 PM Elliott Sprehn <esp...@chromium.org> wrote:
There seem to be a lot of different things being discussed here:

(1) The limit seems too low for newer hardware or engine architectures.
(2) In some scenarios during development it would be nice to have synchronous compiles.
(3) The spec doesn't say anything about sync compile limits.
(4) There's an expectation (from some) that developers will "do the right thing" for performance.
(5) We should remove the limit entirely.

For (1), and from looking at the table, it seems like maybe ~25k is a more modern limit? That seems fine to me. Why is 35k slower than 250k though?

The difference is the number of imported functions declared in the wasm module. We have plans to improve the handling of imported functions, but in our metrics the absolute time spent on the handling of imported functions was always quite small, so other optimizations had priority so far. 
 
For (2) Alex is correct that the right "escape hatch" around best practices for development is runtime flags. There are many examples of this in the web platform, for example the ability to use invalid certs for localhost. It's also repeated multiple times in this thread that production code should be using async compiles, and that tooling usually generates that too. You also state that development should match production (which I agree with), so then development should also be async?

There are not only end-to-end tests where the testing environment should be as close to production as possible, but there are also other kinds of tests, like unit tests. After writing the async test runner for the core wasm spec tests I can tell you that it can be a huge pain to write async tests.
 
For (3) that sounds like a spec problem. The spec should be adjusted to allow the existence of implementation defined limits.

As stated before: we are currently violating the spec. In Chrome/V8, we should implement the spec, not our personal opinions. If you think that the spec should be changed, please direct your arguments at the public Wasm community (github.com/WebAssembly/spec/issues). If the spec is changed to include a limit for sync compilation, then this limit will naturally be implemented in V8/Blink. Until then, we should be spec-compliant by not having a limit.

FWIW, the Wasm-JS API spec doesn't simply allow arbitrary implementation-defined limits, but in the interest of interoperability of implementations defines exactly what the limits should be: https://webassembly.github.io/spec/js-api/index.html#limits
 
For (4) I think it's objectively false. Not just within performance, but engineering in general (see ex. Rust for safety by default :)). As someone who represents thousands of engineers building web apps I can tell you that it's incredibly hard to get performance right, and every time the platform imposes constraints to keep folks from shooting their feet off the outcomes at scale are better.

I think it would be good if we did not discuss engineering in general here, but WebAssembly compilation specifically. The typical data flow for wasm compilation is that a wasm module first gets downloaded and then compiled. For sync compilation this would look like the following (example from https://web.dev/loading-wasm/):

(async () => {
  const response = await fetch('fibonacci.wasm');
  const buffer = await response.arrayBuffer();
  return new WebAssembly.Module(buffer);
})();

With async compilation it would be the following:

(async () => {
  const response = await fetch('fibonacci.wasm');
  const buffer = await response.arrayBuffer();
  return await WebAssembly.compile(buffer);
})();

Streaming compilation would even be shorter:

(async () => {
  const response = await fetch('fibonacci.wasm');
  return await WebAssembly.compileStreaming(response);
})();
 
If you look at these three examples, you can see that sync compilation is not the easiest to use: it just saves one await after the two awaits in the lines before. Streaming compilation, by contrast, is even easier to use, and it would also be faster.

And even more than that, the chance that this code ever gets written by hand is small, as this code typically gets produced by a tool chain like emscripten.

Additionally, the time it takes to download the module completely dominates the time it takes to compile the module. So if there is a performance problem, then it's because the module that gets downloaded is too big; the compilation most likely barely matters.
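
For anyone who wants to check the compile side of that comparison on their own modules, here is a minimal sketch that times one synchronous compile. The helper name timeSyncCompile is hypothetical, and the 8-byte empty module only stands in for real bytes; the download side would have to be measured separately.

```javascript
// Smallest valid module: only the magic number and the version, no sections.
const emptyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Hypothetical helper: compile `bytes` synchronously and report the wall time.
function timeSyncCompile(bytes) {
  const start = performance.now();
  const compiled = new WebAssembly.Module(bytes); // blocks the calling thread
  const elapsedMs = performance.now() - start;
  return { compiled, elapsedMs };
}

const timing = timeSyncCompile(emptyModuleBytes);
console.log(`sync compile took ${timing.elapsedMs.toFixed(2)} ms`);
```

Replacing emptyModuleBytes with the buffer from response.arrayBuffer() would give the compile time to compare against the download time of the same module.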

For (5) I think it would be a mistake to remove the limit entirely. I was very involved in adding this limit originally, and it came from looking at traces across many websites and working with many companies for years trying to make their web apps fast. We also noticed that early on folks using WASM were trying to compile larger and larger files on the main thread. That we avoided the mistakes of JS code loading for WASM should be viewed as a victory for the platform, not a wart.

The situation back then and now is quite different. Back then we did not have lazy compilation, and we did not have a baseline compiler. Therefore the compilation of large modules could take long, even more than a minute on a low-end device. But nowadays we have lazy compilation and a baseline compiler. Even on a low-end phone like the Pixel 1 the compilation of a module with 200'000 functions finishes in just 800ms.

About module sizes, yes, modules were getting bigger, an 80MB module did not exist back then. But these modules come from big legacy desktop apps that were ported to the web platform, with WebAssembly. 

If we look at these big desktop apps in detail, then they were all compiled with emscripten, which uses streaming compilation by default, and would have never used synchronous compilation. So for big apps, the limit did not make any difference, and the situation would most likely be exactly the same as it is now, even without the limit.

The main difference this limit makes is that it makes the life of a few web developers really difficult who have a justified use case for sync compilation. For the user the limit most likely has not made any difference so far, and will also not make a difference in the future. That's why I consider it a wart.
 
A rough analogy would be to think about this from the angle of disk IO on the main thread. It would both make development/testing easier, and probably be fast some of the time, ex. ~3ms for readFileSync of .gitconfig on my Macbook. Even so, the web doesn't allow sync file IO from the main thread because the risks outweigh the DX improvements.

I don't think this is a useful analogy here. readFileAsync cannot easily replace readFileSync in the way it would be used normally. Async WebAssembly compilation, however, can most of the time directly replace sync compilation, as I showed above with the code snippets. Also, before a WebAssembly module can be compiled it first has to be acquired, and typically this takes much longer than the compilation itself.

Also, the frequency and timings of file accesses are very different from the frequencies and timings of WebAssembly compilations. There are barely any similarities, so I don't think there is any value in bringing up file accesses here.

Alex Russell

Apr 28, 2023, 10:05:52 AM
to blink-dev, Andreas Haas, Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
On Friday, April 28, 2023 at 2:23:07 PM UTC+1 Andreas Haas wrote:
As stated before: we are currently violating the spec. In Chrome/V8, we should implement the spec, not our personal opinions.

Again, I'd refer you to Chris and the priority of constituencies. We do not ship features we do not agree with, and we do not take risks blindly. Adjudicating these risks is what the API OWNERS do, and in this thread you're hearing directly from OWNERS (and emeritus OWNERS) who are guiding you gently (but firmly) to understand that spec fiction does not take priority.

The way the Intent process is structured, your role as the Intent proposer is to convince us that the risks of the proposed change are acceptable. For more on this, see the talk that Mike and I put together a few years ago:

 
If you think that the spec should be changed, please direct your arguments at the public Wasm community (github.com/WebAssembly/spec/issues).

It is generally the engineers already working in an area who engage with Working Groups. A fully uncapped sync compile proposal will remain blocked here until there is evidence that the risks to the user experience are not high, and that puts the onus on you (and the V8 team) to advocate for alignment with ground reality within the WG, not the other way around.

Adding Shu, who may be able to help.
 
If the spec is changed to include a limit for sync compilation, then this limit will naturally be implemented in V8/Blink. Until then, we should be spec-compliant by not having a limit.

That is not an argument that addresses the key question at the heart of the Blink Launch Process: "does this change solve an important problem well?"

Best,

Alex
 



Andreas Haas

Apr 28, 2023, 1:16:16 PM
to Alex Russell, blink-dev, Yoav Weiss, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Hi,

sorry that I misunderstood the process.

I guess I understand that you will not agree to remove this limit entirely. I don't understand your reasons for that, or the concrete scenarios that we try to avoid, but I can accept that.

For the issues we received bug reports about I guess a higher limit will be sufficient.

I would propose a limit of 8MB. My reasoning behind that limit is the following:
The maximum function size according to the spec is 7,654,321 bytes [1], so just below 8MB. A module with 1 function plus metadata should therefore most likely be below 8MB. JIT compilers would therefore not be blocked by this limit if they generate functions one by one.

According to the measurements, an 8MB limit should also be mostly fine. If we take the Pixel 1 as the baseline then compiling an 8MB module should take around 200ms, which is the CWV INP threshold.
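
The arithmetic behind that proposal, and how a page could pick a compile path around such a limit, can be sketched as follows. This is only an illustration: the constant names and helper names are hypothetical, "8MB" is read as 8 MiB (an assumption), and the limit itself is the value proposed in this message, not something defined in a spec.

```javascript
// Figures from the proposal above; names and the MiB reading are assumptions.
const MAX_FUNCTION_BODY_BYTES = 7654321;            // per-function limit cited as [1]
const PROPOSED_SYNC_LIMIT_BYTES = 8 * 1024 * 1024;  // "8MB" read as 8 MiB

// Room left for module metadata next to one maximum-size function.
const headroomBytes = PROPOSED_SYNC_LIMIT_BYTES - MAX_FUNCTION_BODY_BYTES;

// Hypothetical check a page could run before choosing a compile path.
function fitsProposedSyncLimit(byteLength) {
  return byteLength <= PROPOSED_SYNC_LIMIT_BYTES;
}

// Compile synchronously when the buffer is small enough, otherwise fall back
// to the asynchronous API. Always returns a Promise for a uniform call site.
function compileRespectingLimit(bytes) {
  return fitsProposedSyncLimit(bytes.byteLength)
    ? Promise.resolve(new WebAssembly.Module(bytes))
    : WebAssembly.compile(bytes);
}

console.log(`headroom for metadata: ${headroomBytes} bytes`);
```

Under these assumptions a module holding a single maximum-size function still has several hundred kilobytes left for its other sections.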

So, would a limit of 8MB be acceptable for you?

Cheers, Andreas 

Yoav Weiss

May 2, 2023, 5:34:00 AM
to Andreas Haas, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Still LGTM2


On Fri, Apr 28, 2023 at 7:16 PM Andreas Haas <ah...@google.com> wrote:
Hi,

sorry that I misunderstood the process.

I guess I understand that you will not agree to remove this limit entirely. I don't understand your reasons for that, or the concrete scenarios that we try to avoid, but I can accept that.

Let me try to expand on that a bit - specifications and implementations are both tools to enable us to deliver the experience we want to our users (and developers).
In this case there's some tension between the current specification and the desired user experience. Diverging from the spec creates interoperability tension, but this can be a reasonable tradeoff in order to improve the overall user experience. We can then go back and try to convince folks to modify the spec and other implementations to converge on an interoperable developer experience that would lead to a good user experience.
  
In this specific case, I think Alex believes that removing the limit entirely would result in developers shipping extremely large modules for sync JIT compilation, resulting in main-thread responsiveness issues. I tend to agree that some limit is warranted.
  

For the issues we received bug reports about I guess a higher limit will be sufficient.

I would propose a limit of 8MB. My reasoning behind that limit is the following:
The maximum function size according to the spec is 7,654,321 bytes [1], so just below 8MB. A module with 1 function plus metadata should therefore most likely be below 8MB. JIT compilers would therefore not be blocked by this limit if they generate functions one by one.

According to the measurements, an 8MB limit should also be mostly fine. If we take the Pixel 1 as the baseline then compiling an 8MB module should take around 200ms, which is the CWV INP threshold.

So, would a limit of 8MB be acceptable for you?

That seems like a reasonable place to set the limit. IIUC, this will enable the use case that this intends to enable (JIT compilation of any function), while at the same time bringing us to the INP threshold on our baseline device. "Eating up" the INP threshold for compilation is not ideal, but at the same time, this would be the extreme case of the imposed limit, so I wouldn't expect it to happen often.

As an aside: There seems to be a non-linear "jump" on all devices between the values for 6.5M and 8.9M. It seems worthwhile to look into that and figure out at what value this jump happens, whether it's above or below 8M, and whether further optimizations would help there in case it's below the limit.

Andreas Haas

May 2, 2023, 7:55:26 AM
to Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Hi Yoav,

Thank you for your explanation on why you think some limit is justified.

There are 4 phases that have the biggest influence on the execution time of sync compilation:

1) Validation time. All functions require validation. The validation time is linear in the total number of instructions across all wasm functions in the module. The validation time typically dominates the total execution time, especially on low-end devices.

2) Decoding time. The whole module gets decoded, the header sections in detail (e.g. which function has which signature), the code section only gets split into functions. Custom sections like the name section would get mostly ignored.

3) Creation of the internal data structure that represents the WebAssembly module. On low-end devices this phase is insignificant, on high-end devices where validation is much faster the memory operations in this phase are noticeable.

4) Compilation of wrappers for imported functions. Calls from WebAssembly to JavaScript, and vice versa, need to translate all parameters and return values from the WebAssembly value space (i32, i64, f32, f64) to the JavaScript value space (Number, BigInt). This translation is happening in wrapper functions. One wrapper has to be compiled per function signature that is used for an import. In my list of imports, the 8.9MB module is the one with the second highest number of different signatures for imports, with 27 different signatures. The highest is the 35KB module with 31 different signatures for imports.
We have plans to introduce a generic wrapper builtin which is able to deal with all signatures. This would eliminate the wrapper compilation completely during WebAssembly module compilation.
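
To get a rough feeling for phase 4 on a given module, the function imports can be listed with the standard reflection API. This is a minimal sketch with a hand-assembled module that imports one function; the bytes and the names m and f are assumptions for illustration. Note that WebAssembly.Module.imports() does not report signatures, so the count below is only an upper bound on the number of distinct import signatures.

```javascript
// Hand-assembled module: (module (import "m" "f" (func)))
const importingModuleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,         // "\0asm", version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00,                     // type section: () -> ()
  0x02, 0x07, 0x01, 0x01, 0x6d, 0x01, 0x66, 0x00, 0x00,   // import "m" "f", function, type 0
]);

const importingModule = new WebAssembly.Module(importingModuleBytes);

// Each descriptor has the shape { module, name, kind }.
const functionImports = WebAssembly.Module.imports(importingModule)
  .filter((descriptor) => descriptor.kind === 'function');

console.log(`function imports: ${functionImports.length}`);
```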

Tracing on the atlas Chromebook for the 8.9MB module shows the following times:
Validation time: 24.9ms
Decoding time: 6.5ms
Data structure creation time: 0.9ms
Wrapper compilation time: 9.7ms

The times for the 6.6MB module are:
Validation time: 12.5ms
Decoding time: 2.8ms
Data structure creation time: 0.3ms
Wrapper compilation time: 0.5ms

Note that the 6.6MB module has a name section which makes up about 30% of the whole module, and which gets ignored completely during module compilation.

I hope this makes the performance numbers now a bit more understandable.
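
The per-phase numbers above can be added up to see where the difference between the two modules comes from. This is a small sketch over the traced values quoted in this message; the object layout and the helper name summarize are hypothetical.

```javascript
// Traced times in milliseconds on the atlas Chromebook, copied from above.
const phaseTimesMs = {
  '8.9MB': { validation: 24.9, decoding: 6.5, dataStructures: 0.9, wrappers: 9.7 },
  '6.6MB': { validation: 12.5, decoding: 2.8, dataStructures: 0.3, wrappers: 0.5 },
};

// Hypothetical helper: total time plus each phase's share of that total.
function summarize(phases) {
  const total = Object.values(phases).reduce((sum, ms) => sum + ms, 0);
  const shares = Object.fromEntries(
    Object.entries(phases).map(([phase, ms]) => [phase, ms / total])
  );
  return { total, shares };
}

for (const [moduleName, phases] of Object.entries(phaseTimesMs)) {
  const { total, shares } = summarize(phases);
  const wrapperPercent = (shares.wrappers * 100).toFixed(0);
  console.log(`${moduleName}: ${total.toFixed(1)} ms total, wrappers ${wrapperPercent}%`);
}
```

The four phases sum to roughly 42 ms for the 8.9MB module and roughly 16 ms for the 6.6MB module, with wrapper compilation at about 23% of the former and about 3% of the latter.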

Cheers, Andreas

Andreas Haas

May 3, 2023, 2:05:37 PM
to Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Hi Yoav,

I know that the API owners have much more experience than me when it comes to web APIs, and what kind of code gets shipped on the web. So when you think that some limit for synchronous WebAssembly compilation is warranted, then I believe you and accept it.

What is not clear to me yet is whether you are aware of the magnitudes we are talking about here.

The huge modules I measured contain around 200'000 functions, the newer modules a bit less because of inlining. The 80MB module was the main module for Photoshop, so that's already a really big application. Also, it's a quite old legacy application, so I guess there is also quite some dead code in this module. As it turns out, such a big module has many downsides, from loading time to debugging and so on, so there are plans to support splitting such modules into smaller modules to make the handling easier. So my intuition is that modules will not get significantly bigger than 80MB. Are you worried that modules could get much bigger than 80MB and 200'000 functions, and that's why you think a limit is warranted?

Or do you have a world in mind where developers load many external libraries, and where the loading of these external libraries will cause too much blocking of the main thread? If libraries are loaded, how big would such libraries typically be? I mean, an 8MB module would probably also contain around 20'000 functions, which seems quite a lot to me. How many libraries with 20'000 functions each are to be expected?

When I did my measurements I thought they showed that even modules which get compiled with streaming compilation anyway can be compiled reasonably fast with synchronous compilation. I do not assume that any meaningful webpage will ever compile such big modules synchronously, simply because with the currently existing tools it is hard to accumulate such an amount of code and use synchronous compilation. As far as I understand, you do seem to assume that if unlimited synchronous compilation exists, then it will also be used for the biggest modules. Is that true? Do you make this assumption just to be on the safe side, or do you have other reasons?

One more question: Assuming there would not be a limit, and the main thread does get blocked by synchronous compilation. Does any single jank matter, no matter when it happens, or is it repeated janks that should be avoided? My assumption was that having one jank during startup would not matter so much, and that avoiding repeated janks later during the app execution would matter more. Especially since other steps during startup like downloading the wasm module would take much longer.

I guess these are a lot of questions now, but I would like to understand your thought process better for future projects.

Thanks, Andreas

On Tue, May 2, 2023 at 11:33 AM Yoav Weiss <yoav...@chromium.org> wrote:

Yoav Weiss

May 4, 2023, 4:39:28 AM
to Andreas Haas, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
On Wed, May 3, 2023 at 8:05 PM Andreas Haas <ah...@google.com> wrote:
Hi Yoav,

I know that the API owners have much more experience than me when it comes to web APIs, and what kind of code gets shipped on the web. So if you think that some limit for synchronous WebAssembly compilation is warranted, then I believe you and accept it.

What is not clear to me yet is whether you are aware of the magnitudes we are talking about here.

The huge modules I measured contain around 200'000 functions, the newer modules a bit less because of inlining. The 80MB module was the main module for Photoshop, so that's already a really big application. Also, it's a quite old legacy application, so I guess there is also quite some dead code in this module. As it turns out, such a big module has many downsides, from loading time to debugging and so on, so there are plans to support splitting such modules into smaller modules to make the handling easier. So my intuition is that modules will not get significantly bigger than 80MB. Are you worried that modules could get much bigger than 80MB and 200'000 functions, and that's why you think a limit is warranted?

I think a limit is warranted as a guardrail (or a pre-baked intervention, if you will). We want to make sure that the user experience on the web is a good one, even once developers start using jitted WASM modules en masse.
I agree that loading an 8MB (let alone an 80MB) module is already likely to create significant drag on the user experience, just from its loading. But as long as it's not blocking the main thread, one can imagine e.g. sites providing users with something else to do (or at worst, updating them on the loading status) while this happens. That cannot be the case if the main thread is blocked for large periods of time.
 

Or do you have a world in mind where developers load many external libraries, and where the loading of these external libraries will cause too much blocking of the main thread? If libraries are loaded, how big would such libraries typically be? I mean, an 8MB module would probably also contain around 20'000 functions, which seems quite a lot to me. How many libraries with 20'000 functions each are to be expected?

I don't know. If these limits are never reached, that's great. It also means they have no meaningful cost on developers.
 

When I did my measurements I thought they showed that even modules which get compiled with streaming compilation anyway can be compiled reasonably fast with synchronous compilation. I do not assume that any meaningful webpage will ever compile such big modules synchronously, simply because with the currently existing tools it is hard to accumulate such an amount of code and use synchronous compilation. As far as I understand, you do seem to assume that if unlimited synchronous compilation exists, then it will also be used for the biggest modules. Is that true? Do you make this assumption just to be on the safe side, or do you have other reasons?

As I explained above, we need to provide guardrails to prevent developers from shooting their users in the foot, due to lack of awareness or asymmetry between the devices they test their sites on, and the ones used by their users.
 

One more question: Assuming there would not be a limit, and the main thread does get blocked by synchronous compilation. Does any single jank matter, no matter when it happens, or is it repeated janks that should be avoided? My assumption was that having one jank during startup would not matter so much, and that avoiding repeated janks later during the app execution would matter more. Especially since other steps during startup like downloading the wasm module would take much longer.

Neither is great. Main thread jank that prevents the user from interacting with the page is arguably worse, but I'm not sure how that fits into the calculus.
 

I guess these are a lot of questions now, but I would like to understand your thought process better for future projects.

Happy to chat about it if that helps! :)
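
To make the blocking versus non-blocking distinction above concrete, here is a minimal sketch of the two compilation paths. It uses only the standard `new WebAssembly.Module()` constructor, `WebAssembly.compile()` and `performance.now()`; the helper names `compileSync` and `compileAsync` and the 8-byte empty module are illustrative choices, not something taken from this thread.

```javascript
// Smallest valid WebAssembly binary: the "\0asm" magic number followed by version 1.
const emptyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Synchronous path: the calling thread is blocked until the module is ready.
// blockedMs records how long the caller could not do anything else.
function compileSync(bytes) {
  const start = performance.now();
  const wasmModule = new WebAssembly.Module(bytes);
  return { wasmModule, blockedMs: performance.now() - start };
}

// Asynchronous path: returns a promise, so the calling thread stays free to
// handle input or update a loading indicator while compilation is in progress.
function compileAsync(bytes) {
  return WebAssembly.compile(bytes);
}

const syncResult = compileSync(emptyModuleBytes);
console.log(`synchronous compile blocked the caller for ${syncResult.blockedMs}ms`);

compileAsync(emptyModuleBytes).then((wasmModule) => {
  console.log("asynchronous compile finished:", wasmModule instanceof WebAssembly.Module);
});
```

On a real page, `blockedMs` for the synchronous path is time during which the page cannot respond to the user, which is the cost the guardrail is meant to bound.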

PhistucK

May 4, 2023, 4:59:01 AM
to Yoav Weiss, Andreas Haas, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Could a middle ground be having a developer tools toggle, or a command line flag, to remove the limit altogether in case this is needed during development?

PhistucK


Andreas Haas

May 4, 2023, 6:40:25 AM
to PhistucK, Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo
Hi Yoav, PhistucK,

I think I realized now where some of the confusion and misunderstanding came from:

My goal was to remove the limit because it's a spec violation, and I don't see a justification to keep the limit. There are some use cases like testing and user-space JIT compilers, but in both cases a higher limit is sufficient.

So, to become spec compliant for me the default was to implement the current spec and to remove the limit unless there is a reason to keep it.

I guess for the API owners it's the other way around, removing the limit may allow unwanted behavior, so for you the default would be to keep the limit, unless there is a use case to remove the limit.

In a practical sense there may not be a difference between a limit of 8MB and not having a limit at all.

My problem now is that with the higher limit we still violate the spec, and with the spec test I introduced during this discussion the spec violation is even more visible. As someone wrote before, a solution to this problem could be to change the spec, but just as there is no reason to keep or remove the limit in Chrome, there is no reason to introduce such a limit into the spec.

Cheers, Andreas

Mike Taylor

May 4, 2023, 1:13:25 PM
to Andreas Haas, PhistucK, Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Elliott Sprehn, Shu-yu Guo

On 5/4/23 6:40 AM, 'Andreas Haas' via blink-dev wrote:

Hi Yoav, PhistucK,

I think I realized now where some of the confusion and misunderstanding came from:

My goal was to remove the limit because it's a spec violation, and I don't see a justification to keep the limit. There are some use cases like testing and user-space JIT compilers, but in both cases a higher limit is sufficient.
Adding a flag or setting in DevTools at least for the testing use-case does seem useful.


So, to become spec compliant for me the default was to implement the current spec and to remove the limit unless there is a reason to keep it.

I guess for the API owners it's the other way around, removing the limit may allow unwanted behavior, so for you the default would be to keep the limit, unless there is a use case to remove the limit.

In a practical sense there may not be a difference between a limit of 8MB and not having a limit at all.

My problem now is that with the higher limit we still violate the spec, and with the spec test I introduced during this discussion the spec violation is even more visible. As someone wrote before, a solution to this problem could be to change the spec, but just as there is no reason to keep or remove the limit in Chrome, there is no reason to introduce such a limit into the spec.

This has already been noted elsewhere, but I'll try to reinforce. It's entirely OK for us to make choices on behalf of our users, even if a spec says otherwise. Specs change all the time (and frequently have bugs in them). My POV here wouldn't be that we're violating the spec, but that the spec allows for a potentially harmful user experience. At the very least, there's a reasonable argument to be made that the limit should be implementation defined, if other engines have different thresholds on blocking the main thread. And maybe an informative note explaining what the consequences of having no limits might be.

Another path is to attempt to spec the 8MB limit, and perhaps it can be made larger in the future - acknowledging that convincing the other engines to agree requires non-zero effort.

Elliott Sprehn

May 5, 2023, 12:13:30 AM
to Andreas Haas, PhistucK, Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Shu-yu Guo


On Thu, May 4, 2023 at 6:40 AM Andreas Haas <ah...@google.com> wrote:
Hi Yoav, PhistucK,

[...]
 
My problem now is that with the higher limit we still violate the spec, and with the spec test I introduced during this discussion the spec violation is even more visible. As someone wrote before, a solution to this problem could be to change the spec, but just as there is no reason to keep or remove the limit in Chrome, there is no reason to introduce such a limit into the spec.


Multiple experts in web performance have given reasons to keep the limit in this thread. That would be the reason to introduce it to the spec.

- E

Chris Harrelson

May 9, 2023, 5:00:35 PM
to Elliott Sprehn, Andreas Haas, PhistucK, Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Shu-yu Guo
LGTM3 to change the limit to 8MB, for the reasons Andreas outlined (maximum function size, reasonable runtime on a low-end phone).

Also, I can totally see increasing the limit in the future, as the implementation is optimized further, typical hardware speeds increase, or there are compelling examples of developer use cases that come up. For now, 8MB seems a reasonable & conservative choice.

Thanks Andreas for your patience on this thread, and for providing such useful data points!

Chris
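
For page authors, one way to stay within such a guardrail is to pick the compilation path from the buffer size. The sketch below is only an illustration: it reads "8MB" as 8 * 1024 * 1024 bytes and treats the limit as inclusive, neither of which is stated in this thread, and the names `ASSUMED_SYNC_LIMIT_BYTES`, `chooseCompilePath` and `compileWithinLimit` are hypothetical.

```javascript
// Assumption: "8MB" means 8 * 1024 * 1024 bytes and the limit is inclusive.
// The thread does not give the exact byte count or the comparison used.
const ASSUMED_SYNC_LIMIT_BYTES = 8 * 1024 * 1024;

// Hypothetical helper: decide which compilation path a buffer of this size takes.
function chooseCompilePath(byteLength, limit = ASSUMED_SYNC_LIMIT_BYTES) {
  return byteLength <= limit ? "sync" : "async";
}

// Hypothetical helper: always returns a promise, so callers treat both paths alike.
// Wrapping the constructor in a promise executor turns a synchronous throw
// (for example on an invalid module) into a rejection.
function compileWithinLimit(bytes, limit = ASSUMED_SYNC_LIMIT_BYTES) {
  if (chooseCompilePath(bytes.byteLength, limit) === "sync") {
    return new Promise((resolve) => resolve(new WebAssembly.Module(bytes)));
  }
  return WebAssembly.compile(bytes);
}

// Smallest valid WebAssembly binary: magic number plus version 1.
const tinyModuleBytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

compileWithinLimit(tinyModuleBytes).then((wasmModule) => {
  console.log("compiled:", wasmModule instanceof WebAssembly.Module);
});
```

Returning a promise from both branches keeps the caller's code the same if the threshold changes later.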


Andreas Haas

May 9, 2023, 5:21:35 PM
to Chris Harrelson, Elliott Sprehn, PhistucK, Yoav Weiss, Alex Russell, blink-dev, Philip Jägenstedt, Ian Kilpatrick, Chris Wilson, Jeffrey Yasskin, Shu-yu Guo
Thank you very much!