LLVM V8 Code Generator for WebAssembly

manji...@gmail.com

Nov 20, 2019, 7:02:34 PM

to v8-dev

Hi,

I am working on a new code generator that is based on LLVM and generates code for both V8 built-ins and WebAssembly.

I compiled libffmpeg as a video decoder and measured a time reduction of 50% or more, mainly because LLVM supports auto-vectorization.

The project only supports arm32 Android for now.

Anyone interested can check out the source from https://github.com/linzj/llvm-toy and switch to the branch named arm-tf-7_8_279_17_725bcedcb69.

The hash at the end is the git commit hash.

manji...@gmail.com

Nov 21, 2019, 1:35:34 AM

to v8-dev

Here is the Octane benchmark improvement, measured with no-opt set manually.

Yang Guo

Nov 21, 2019, 2:46:19 AM

to v8-...@googlegroups.com

Interesting. Thanks for sharing!

I noticed that the source code you linked to has .gyp files. Which V8 version does it target? Do you have steps to build it and reproduce your benchmark results?

Cheers,

Yang

--
--
v8-dev mailing list
v8-...@googlegroups.com
http://groups.google.com/group/v8-dev
---
You received this message because you are subscribed to the Google Groups "v8-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to v8-dev+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/v8-dev/13052418-21de-4311-8244-008ee0a1db2f%40googlegroups.com.

manji...@gmail.com

Nov 21, 2019, 6:48:29 PM

to v8-dev

The latest version is for V8 7.8. The last field of the branch name is the git commit hash.

You can check out that version using this hash, e.g. git checkout 725bcedcb69.

You should check out LLVM version 8 (the final release), patch it with llvm-patch-by-far.patch, and build it into libLLVM-8.so. See config32.sh for the build script.

Then patch V8 with v8.patch.

Then rsync src/ to v8/src/.

Then copy libLLVM-8.so to v8/lib/.

Then you can build it with Chrome. Oh, don't forget to add "UC_BUILD_TF_LLVM_BACKEND" to the defines.

manji...@gmail.com

Nov 25, 2019, 8:35:39 PM

to v8-dev

You may want to test the arm-6-9 branch first. The performance data shown above comes from that branch, and it is stable enough to have survived Alibaba's 11.11 sale.

mvst...@google.com

Nov 27, 2019, 9:53:13 AM

to v8-dev

Hi, I've created a tracking bug on which we'll apply some more investigation (https://bugs.chromium.org/p/v8/issues/detail?id=10023). We'll have a detailed look in January. Are you interested in upstreaming your work? We're interested in using this for the built-ins and interpreter bytecode handlers, as part of the mksnapshot build step. We wouldn't try to do this at runtime.

All the best,

--Michael Stanton

--V8 Compiler Team, Manager

manji...@gmail.com

Dec 2, 2019, 1:10:33 AM

to v8-dev

Yes, thanks. I am the only one maintaining this feature, and it is tiring to fix all kinds of bugs whenever new V8 changes arrive, such as ephemerons or the removal of the isolate parameter from RecordWrite.

manji...@gmail.com

Dec 2, 2019, 1:12:32 AM

to v8-dev

I didn't check my mail in time; sorry for the late response.

manji...@gmail.com

Dec 2, 2019, 1:31:26 AM

to v8-dev

We don't use this at runtime either. The compile time is horrible, and V8's binary size would grow dramatically.

We did try it for wasm, though, and got performance close to that of the clang-compiled version.

manji...@gmail.com

Dec 4, 2019, 3:44:54 PM

to v8-dev

V8 now records safepoints for C-call functions. I had been tracking down a GC crash for days and finally found that this changed since version 6.9.

manji...@gmail.com

Jan 7, 2020, 5:44:21 AM

to v8-dev

Hi,
Any progress? I am looking forward to upstreaming my compiler. Also, please grant me access to the bug tracking system.

Michael Stanton

Jan 7, 2020, 6:56:44 AM

to v8-...@googlegroups.com

I've added your email on cc to the tracking bug. We're scheduled to have a look in mid-January, and we'll update the tracking bug when we have more information.

--

Michael Stanton

Manager, V8 Compiler Team

mvst...@google.com

manji...@gmail.com

Jan 9, 2020, 7:21:04 PM

to v8-dev

2020-01-09 17-56-56 的屏幕截图.png

2020-01-09 17-59-12 的屏幕截图.png

Here's the profiler output from running the following test on a Huawei Honor 9 (荣耀9) phone with V8 7.8:

<body>
<script>
(function() {
  // Push 0x1000000 (about 16.7 million) integers and time the loop.
  let a = [];
  let i;
  let t0 = performance.now();
  for (i = 0; i < 0x1000000; ++i) {
    a.push(i);
  }
  let t1 = performance.now();
  // Elapsed time in milliseconds.
  console.log('time: ' + (t1 - t0));
})()
</script>
</body>

I am glad to see that LdaNamedProperty, my original optimization target, shows a remarkable improvement.

The compiler for version 7.8 is pretty stable now; I have fixed many bugs recently.

Dan Elphick

Jan 10, 2020, 5:54:27 AM

to v8-...@googlegroups.com

These results look very impressive. Do you think you could post the generated assembly code for something simple like StackCheck with and without LLVM generation? I wonder if it's improving the stack check itself or the dispatch.

Thanks,

Dan

manji...@gmail.com

Jan 12, 2020, 6:51:12 PM

to v8-dev

Hi Dan,

Here it is.

Both builds have the untrusted code mitigations disabled.

code.7z

Dan Elphick

Jan 13, 2020, 6:27:04 AM

to v8-...@googlegroups.com

Thanks for sending this. There are some really interesting results here.

For instance in StackCheck, TF generates this sequence:

0x4560805c  1c  e30f903d  movw r9, #61501
0x45608060  20  e34f9fff  movt r9, #65535
0x45608064  24  e08a9009  add r9, r10, r9
0x45608068  28  e5999000  ldr r9, [r9, #+0]

where LLVM generates:

0x455f61fc 1c e51a0fc3 ldr r0, [r10, #-4035]

This is definitely something we could do better in TF.
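
To see why the two StackCheck sequences load from the same address, it may help to decode the constant that TF builds with movw/movt. The sketch below assumes the usual ARM semantics (movw writes the low 16 bits and zeroes the upper half, movt then writes the high 16 bits); the helper name decodeMovwMovt is made up for illustration and is not part of V8 or LLVM.

```javascript
// Hypothetical helper: combine a movw/movt immediate pair into the signed
// 32-bit value that ends up in the register.
function decodeMovwMovt(low16, high16) {
  // movw sets bits 0..15 (upper half zeroed); movt then sets bits 16..31.
  // "| 0" reinterprets the result as a signed 32-bit integer.
  return ((high16 << 16) | (low16 & 0xffff)) | 0;
}

// Immediates from the TF listing: movw r9, #61501 ; movt r9, #65535
const offset = decodeMovwMovt(61501, 65535);
console.log(offset); // -4035, the same offset LLVM folds into the ldr
```

In other words, TF spends three instructions materializing and adding an offset that fits directly in the load instruction.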

We did notice some problems though in the generated code. For SetPendingMessage, TF generates:

0x45608140   0  e30015f1  movw r1, #1521
0x45608144   4  e08a1001  add r1, r10, r1
0x45608148   8  e5913000  ldr r3, [r1, #+0]
0x4560814c   c  e5810000  str r0, [r1, #+0]
0x45608150  10  e2855001  add r5, r5, #1
0x45608154  14  e7d61005  ldrb r1, [r6, +r5]
0x45608158  18  e7982101  ldr r2, [r8, +r1, lsl #2]
0x4560815c  1c  e1a00003  mov r0, r3
0x45608160  20  e12fff12  bx r2
0x45608164  24  e1a00000  mov r0, r0

whereas LLVM generates:

0x455f62c0   0  e2855001  add r5, r5, #1
0x455f62c4   4  e58a05f1  str r0, [r10, #+1521]
0x455f62c8   8  e7d60005  ldrb r0, [r6, +r5]
0x455f62cc   c  e7981100  ldr r1, [r8, +r0, lsl #2]
0x455f62d0  10  e59a05f1  ldr r0, [r10, #+1521]
0x455f62d4  14  e12fff11  bx r1

Again, this highlights how TF is not generating the most efficient instruction sequences for loads, but it looks like there's a problem in the LLVM code. It loads into the accumulator (r0) the value it has just stored to [r10, #+1521], so the accumulator is left unchanged by this bytecode. In the original source, however, SetPendingMessage sets the accumulator to the previous pending message, which is what the TF code does.
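
The ordering problem can be illustrated with a toy model. This is only a sketch and not V8 code: the object root stands in for the memory the root register (r10) points to, and the function and field names are invented for the example.

```javascript
// Toy model: `root` stands in for the memory the root register (r10) points
// to; `slot` is a made-up key for the pending-message field.
function setPendingMessageExpected(root, slot, accumulator) {
  const previous = root[slot]; // load the old message first
  root[slot] = accumulator;    // then store the new one
  return previous;             // accumulator becomes the previous message
}

function setPendingMessageMiscompiled(root, slot, accumulator) {
  root[slot] = accumulator;    // store happens first
  return root[slot];           // reload sees the value just stored
}

const rootA = { pendingMessage: "old" };
const rootB = { pendingMessage: "old" };
console.log(setPendingMessageExpected(rootA, "pendingMessage", "new"));    // "old"
console.log(setPendingMessageMiscompiled(rootB, "pendingMessage", "new")); // "new"
```

The first function mirrors the load-then-store order in the TF listing; the second mirrors the store-then-load order in the LLVM listing.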

Dan

manji...@gmail.com

Jan 13, 2020, 8:40:50 AM

to v8-dev

Oh, you found a bug!
I will fix it tomorrow.
It is caused by a recent patch that makes LLVM think the root register points to a read-only memory region.

A safepoint recording bug was also fixed just today.

manji...@gmail.com

Jan 13, 2020, 8:49:16 AM

to v8-dev

By default, LLVM generates the same worse code as V8; that's a bug. I made it realize that a GetElementPtr in the IR can use a constant offset in the range (-4096, 4096) at zero cost.
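
The (-4096, 4096) range corresponds to the immediate-offset form of the A32 ldr/str instructions for words and unsigned bytes, which encode a 12-bit magnitude plus an add/subtract bit. A minimal sketch of that check follows; the helper name fitsLdrImmediate is hypothetical and does not come from the LLVM patch.

```javascript
// Hypothetical helper: can `offset` be encoded directly in an A32
// ldr/str (word or unsigned byte) immediate? The encoding has a 12-bit
// magnitude plus an add/subtract bit, so the open interval is (-4096, 4096).
function fitsLdrImmediate(offset) {
  return Number.isInteger(offset) && offset > -4096 && offset < 4096;
}

console.log(fitsLdrImmediate(-4035)); // true  (StackCheck offset above)
console.log(fitsLdrImmediate(1521));  // true  (SetPendingMessage offset above)
console.log(fitsLdrImmediate(4096));  // false (needs a separate add)
```

Both root-relative offsets seen in the listings earlier in the thread fall inside this range, which is why a single load or store suffices.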

manji...@gmail.com

Jan 13, 2020, 9:27:50 AM

to v8-dev

How did you find this bug so quickly?
Also note that the LLVM LdaNamedProperty is 1/3 smaller than the TF one, which should explain why it takes 1/3 less time.

Dan Elphick

Jan 13, 2020, 10:35:17 AM

to v8-...@googlegroups.com

It was the builtin next to StackCheck and was quite short, so it seemed worth looking at. Also, the original interpreter authors were standing next to me.

manji...@gmail.com

Jan 13, 2020, 9:21:29 PM

to v8-dev

Here's the fixed code.

SetPendingMessage breaks the assumption that the root pointer refers to a read-only region.

The screen captures compare TurboFan and LLVM; LLVM still does better in most builtins.

LLVM's LdaNamedProperty still takes 31.79% less time than TurboFan's, and although the fixed version is 20 bytes larger than the old one, it is still 24.70% smaller than the TurboFan version.

I think V8 should make the memory region that the root pointer refers to read-only. That would enable LLVM to generate better code.

code_llvm.7z

2020-01-14 10-02-11 的屏幕截图.png

2020-01-14 10-02-10 的屏幕截图.png

manji...@gmail.com

Jan 14, 2020, 4:49:05 AM

to v8-dev

This build re-enables root region rematerialization; the previous one had a problem and disabled rematerialization.

I have not gotten my phone back yet, so it has not been tested.

code_llvm.7z

Dan Elphick

Jan 14, 2020, 5:28:27 AM

to v8-...@googlegroups.com

In both LLVM builtin dumps you posted, I noticed that EphemeronKeyBarrier is significantly shorter for LLVM: 80 bytes vs. 348 for the original.

It looks like LLVM has optimised away several conditionals, including the check for fp_mode, instead assuming it's kSaveFPRegs. Since fp_mode is a parameter to the builtin, this does not seem like a safe optimisation.

manji...@gmail.com

Jan 14, 2020, 10:24:34 AM

to v8-dev

I made the prologue save all floating-point registers that are not callee-saved under the C calling convention, so the floating-point registers are preserved no matter what fp_mode is. It is definitely not the most efficient way; it is a conservative solution. We could optimize the calls to RecordWrite in the code assembly phase, when the floating-point usage can be discovered.

Kurten Chan

Mar 5, 2020, 12:45:16 AM

to v8-dev

Hi,

Any progress on this investigation?

Our fast-start project may need this feature.

Dan Elphick

Mar 6, 2020, 4:29:34 AM

to v8-...@googlegroups.com

Hi there,

The tracking bug for this effort is https://bugs.chromium.org/p/v8/issues/detail?id=10023, which has some further discussion adding to what’s in this thread. We’ve made several changes to TurboFan generation in response to this and we’re always happy to accept patches or look at specific cases where our pipeline is deficient.

However at this point, we don't have the resources to commit to a project of this size. We're still open to the concept of using LLVM for builtin generation and would accept compelling patches, but to put this in context, we expect a change of this nature would incur significant costs in terms of maintaining patches to LLVM as well as greatly complicating the build processes for V8 and Chrome.

Thanks again for your impressive work and proving that this is possible!

manji...@gmail.com

Mar 8, 2020, 10:02:29 PM

to v8-dev

I think this message is addressed to me.

OK, I understand. It is too complicated to involve another big project like LLVM. I have made a lot of modifications to it, including codegen, arch, analysis, and so on, so the many people who own these different modules would have to get involved.

It's much simpler for a third party to handle this work.

I will continue this project in UC Browser, however. It has already shipped (RTM) in UC Browser and Taobao Mobile/Alipay Mobile with version 6.9, and I will continue to improve it in the upcoming version 7.8.
