Reasons for optimizations not supported by TurboFan

doit_man

Sep 13, 2023, 12:51:18 AM

to v8-dev

Hi v8 developers,

I’m curious about the specific optimizations that TurboFan does not currently support.

Ahead-of-time (AOT) compilers perform a wide range of optimizations, while TurboFan has comparatively fewer.

For instance, the LLVM compiler employs optimization rules such as ```A+B --> A|B``` provided that A and B have no overlapping bits set.

Similarly, ```(A&B)|(A&C) --> A&(B|C)``` and ```(X * C1) / C2 --> X / (C2 / C1)``` are examples of optimizations present in LLVM but absent in TurboFan.

I think these kinds of rules could enhance TurboFan’s optimization capabilities.

I’m interested in understanding whether this absence is due to TurboFan’s current state of implementation, inherent limitations of a just-in-time (JIT) compiler, or if there are other factors at play.

Thanks.

dmerc...@google.com

Sep 13, 2023, 3:43:09 AM

to v8-dev

Hi,

In general, LLVM and ahead-of-time compilers have all the time in the world to optimize a function, while Turbofan tries to save every millisecond it can. (That's not quite true: LLVM also tries to compile somewhat quickly, but it's orders of magnitude slower than Turbofan.) As a result, Turbofan does fewer optimizations than LLVM, in particular when they have non-linear cost. That being said, Turbofan is not set in stone, and we are always happy to add new optimizations, provided that their cost can be justified by the improvements in generated code.

> ```(A&B)|(A&C) --> A&(B|C)```

Optimizations such as this one are fairly cheap and straightforward to implement. I'm guessing that the reason for their absence is that we didn't think about them or didn't see a specific case where they would improve the generated code. Feel free to submit a patch to add them to Turboshaft's MachineOptimizationReducer (https://source.chromium.org/chromium/chromium/src/+/main:v8/src/compiler/turboshaft/machine-optimization-reducer.h).
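
As a quick sanity check of the rewrite quoted above, here is a minimal sketch that compares both sides exhaustively over all 4-bit operands. It is plain Python, not Turboshaft code: the function names are hypothetical, and nothing here reflects how MachineOptimizationReducer would actually express the rule.

```python
from itertools import product


def original(a: int, b: int, c: int) -> int:
    """Left-hand side of the rule: (A & B) | (A & C)."""
    return (a & b) | (a & c)


def rewritten(a: int, b: int, c: int) -> int:
    """Right-hand side of the rule: A & (B | C)."""
    return a & (b | c)


def identity_holds(bits: int = 4) -> bool:
    """Compare both sides for every operand triple of the given bit width."""
    values = range(1 << bits)
    return all(
        original(a, b, c) == rewritten(a, b, c)
        for a, b, c in product(values, repeat=3)
    )


if __name__ == "__main__":
    print(identity_holds(4))
```

Because AND and OR act on each bit position independently, agreement on small widths carries over to wider words. The rewritten form uses two operations instead of three.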

> ```(X * C1) / C2 --> X / (C2 / C1)```

You probably meant "X * (C2 / C1)" instead of "X / (C2 / C1)". Such mistakes are a good argument against implementing such optimizations unless we see a clear use-case: if they are wrong but almost never used, then they might cause random crashes (or security issues) that would be very hard to debug.

Additionally, this simplification could be invalid depending on multiplication overflow, integer division and floating point approximations, which once again could easily introduce subtle bugs.

> ```A+B --> A|B``` provided that A and B have no overlapping bits set.

This one as well, I'm not sure we'd really want. First, it's probably rare that we know for a fact that 2 values don't have overlapping bits. And second, on most architectures that we support, an addition and a bitwise or have the same latency and throughput.
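
For reference, the precondition on this rule can be illustrated with a small sketch. It assumes fixed-width unsigned words (the 32-bit default is an arbitrary choice for illustration) and uses hypothetical helper names; it relies on the identity `a + b == (a | b) + (a & b)`, so the sum and the bitwise or agree exactly when `a & b` is zero, as long as the addition does not wrap.

```python
def disjoint_bits(a: int, b: int) -> bool:
    """True when a and b have no set bit in common."""
    return (a & b) == 0


def add_equals_or(a: int, b: int, width: int = 32) -> bool:
    """True when a + b (wrapped to `width` bits) equals a | b."""
    mask = (1 << width) - 1
    return ((a + b) & mask) == (a | b)


if __name__ == "__main__":
    # Disjoint bits: 0b1010 + 0b0101 and 0b1010 | 0b0101 are both 0b1111.
    print(add_equals_or(0b1010, 0b0101))
    # Overlapping bits: 0b0110 + 0b0011 is 9, but 0b0110 | 0b0011 is 7.
    print(add_equals_or(0b0110, 0b0011))
```

The compiler would have to prove `disjoint_bits` statically, which is the "know for a fact" condition mentioned above.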

Best,

Darius

Marja Hölttä

Sep 13, 2023, 4:57:14 AM

to v8-...@googlegroups.com

> > ```(X * C1) / C2 --> X / (C2 / C1)```

> You probably meant "X * (C2 / C1)" instead of "X / (C2 / C1)".

Nah, they probably meant just that. (X * 2) / 10 is X / (10 / 2) aka X / 5, not X * (10 / 2) aka X * 5.

It would be the same as X * (C1 / C2) though, that's another possible simplification (if things don't overflow and so on).
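
The "if things don't overflow and so on" caveat can be made concrete for integers with a short sketch. It assumes unsigned 32-bit wraparound for the multiplication and truncating division on non-negative values; both are assumptions chosen for illustration, and the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # assumed word size: unsigned 32-bit


def mul_then_div(x: int, c1: int, c2: int) -> int:
    """(X * C1) / C2, with the product wrapped to 32 bits."""
    return ((x * c1) & MASK32) // c2


def div_by_quotient(x: int, c1: int, c2: int) -> int:
    """X / (C2 / C1), with both divisions done on integers."""
    return x // (c2 // c1)


if __name__ == "__main__":
    # C1 divides C2 and nothing overflows: both forms agree.
    print(mul_then_div(7, 2, 10), div_by_quotient(7, 2, 10))
    # C1 does not divide C2: 10 // 3 loses precision, so the forms differ.
    print(mul_then_div(100, 3, 10), div_by_quotient(100, 3, 10))
    # The product wraps to 0 in 32 bits: the forms differ again.
    print(mul_then_div(2**31, 2, 10), div_by_quotient(2**31, 2, 10))
```

Under these assumptions the rewrite is only safe when C1 divides C2 exactly and X * C1 is known not to wrap.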

v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev

Leszek Swirski

Sep 13, 2023, 5:28:30 AM

to v8-dev

This (in both cases) is a trivially unsafe transformation though (if applied without any range analysis), since you can have:

X = MaxFloat64

C1 = 2

C2 = 4

(X*C1)/C2 = (MaxFloat64 * 2) / 4 = Infinity / 4 = Infinity
X*(C1/C2) = MaxFloat64 * (2 / 4) = MaxFloat64 * 0.5 = whatever half of MaxFloat64 is
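
The example above can be reproduced with IEEE 754 doubles in any language. Here is a minimal Python sketch, where `sys.float_info.max` stands in for MaxFloat64 and the function names are hypothetical:

```python
import math
import sys

MAX_FLOAT64 = sys.float_info.max  # largest finite double


def mul_then_div(x: float, c1: float, c2: float) -> float:
    """(X * C1) / C2: the product can overflow to infinity first."""
    return (x * c1) / c2


def mul_by_ratio(x: float, c1: float, c2: float) -> float:
    """X * (C1 / C2): the constants are folded before touching X."""
    return x * (c1 / c2)


def div_by_ratio(x: float, c1: float, c2: float) -> float:
    """X / (C2 / C1): the other rewritten form discussed in the thread."""
    return x / (c2 / c1)


if __name__ == "__main__":
    print(mul_then_div(MAX_FLOAT64, 2.0, 4.0))
    print(mul_by_ratio(MAX_FLOAT64, 2.0, 4.0))
    print(div_by_ratio(MAX_FLOAT64, 2.0, 4.0))
```

The unoptimized expression yields infinity while both rewritten forms stay finite, so the rewrite changes observable results unless a range analysis rules out the overflow.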

doit_man

Sep 13, 2023, 9:35:10 PM

to v8-dev

I appreciate your helpful responses.

It's evident that implementing optimizations in TurboFan entails careful consideration of potential unsafe scenarios.

I now recognize the significance of TurboFan's compilation time.

What steps can I take to validate the investment in optimization (cost of optimization)? I'm curious if there are specific benchmarks you rely on or if there are alternative methods.

Thank you

dmerc...@google.com

Sep 14, 2023, 3:02:44 AM

to v8-dev

Apologies for my `X / (C2 / C1)`, and thanks Marja for pointing it out :)

> What steps can I take to validate the investment in optimization (cost of optimization)? I'm curious if there are specific benchmarks you rely on or if there are alternative methods.

In general, I'd recommend talking to us before spending months implementing an optimization that could turn out to not be worth it.

If you have an optimization ready and want to benchmark it, you can look at its impact on JetStream2 and Speedometer2: the performance of the generated code will be correlated with the scores of these benchmarks. The cost of running the optimization itself can be measured with the --turbo-stats flag. Depending on what the optimization does, using micro-benchmarks (or any other benchmark) to prove its usefulness can also be acceptable.

Best,

Darius

doit_man

Sep 14, 2023, 10:07:00 PM

to v8-dev

Oh, I understand. Thank you for your detailed and kind reply!

Best regards.
