JITed very simple method.


John Hening

Feb 3, 2018, 6:31:28 PM
to mechanica...@googlegroups.com
public static int dontOptimize;

public static int simple() {
    return dontOptimize;
}

And JITed version of that:


 
0x00007fdc75112aa0: mov    %eax,-0x14000(%rsp)
0x00007fdc75112aa7: push   %rbp
0x00007fdc75112aa8: sub    $0x30,%rsp
0x00007fdc75112aac: mov    $0x7fdc7443be90,%rax  ;   {metadata(method data for {method} {0x00007fdc7443bb30} 'simple' '()I' in 'Main')}
0x00007fdc75112ab6: mov    0xdc(%rax),%esi
0x00007fdc75112abc: add    $0x8,%esi
0x00007fdc75112abf: mov    %esi,0xdc(%rax)
0x00007fdc75112ac5: mov    $0x7fdc7443bb30,%rax  ;   {metadata({method} {0x00007fdc7443bb30} 'simple' '()I' in 'Main')}
0x00007fdc75112acf: and    $0x1ff8,%esi
0x00007fdc75112ad5: cmp    $0x0,%esi
0x00007fdc75112ad8: je     0x00007fdc75112af7
0x00007fdc75112ade: mov    $0x76d2414f0,%rax  ;   {oop(a 'java/lang/Class' = 'Main')}
0x00007fdc75112ae8: mov    0x68(%rax),%eax    ;*getstatic dontOptimize
                                              ; - Main::simple@0 (line 11)
0x00007fdc75112aeb: add    $0x30,%rsp
0x00007fdc75112aef: pop    %rbp
0x00007fdc75112af0: test   %eax,0x1861060a(%rip)        # 0x00007fdc8d723100
                                              ;   {poll_return}
0x00007fdc75112af6: retq





Though I know assembly, I cannot understand such simple code. In particular, what do the lines between 0x00007fdc75112aac and 0x00007fdc75112ad8 mean? I highlighted them.

Alex Blewitt

Feb 4, 2018, 6:28:54 AM
to mechanica...@googlegroups.com

On 3 Feb 2018, at 23:31, John Hening <goci...@gmail.com> wrote:

Though I know assembly, I cannot understand such simple code. In particular, what do the lines between 0x00007fdc75112aac and 0x00007fdc75112ad8 mean? I highlighted them.

This is profiling information. The code loads the method data (the profiling metadata for the method) and adds 8 to a counter stored there each time the method is called. The comparison masks the updated counter with 0x1ff8 and jumps elsewhere when the masked bits are all zero, which happens once every 1024 calls.

It's effectively equivalent to:

count += 8; if ((count & 0x1ff8) == 0) goto elsewhere

The purpose is to determine whether the method has been called many times (i.e. it is "hot"); if so, it is handed to the C2 compiler to be re-optimised.
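
As a rough illustration of the arithmetic in that listing, here is a minimal Java sketch. It only mirrors the `add $0x8` / `and $0x1ff8` / `cmp $0x0` / `je` sequence from the disassembly; the class and method names are hypothetical, the `counter` field merely stands in for the value at `0xdc(%rax)`, and nothing here is HotSpot's actual implementation.

```java
public class InvocationCounterSketch {
    // Constants taken from the disassembly in the original post.
    static final int INCREMENT = 0x8;   // add $0x8,%esi
    static final int MASK = 0x1ff8;     // and $0x1ff8,%esi

    // Hypothetical stand-in for the counter stored at 0xdc(%rax).
    private int counter;
    // Counts how often the "je" branch would be taken.
    private int notifications;

    void onInvocation() {
        counter += INCREMENT;            // load, add 8, store back
        if ((counter & MASK) == 0) {     // and + cmp $0x0 + je
            notifications++;             // the "elsewhere" branch
        }
    }

    // Simulates the given number of calls and reports how many times
    // the branch was taken.
    static int notificationsAfter(int calls) {
        InvocationCounterSketch sketch = new InvocationCounterSketch();
        for (int i = 0; i < calls; i++) {
            sketch.onInvocation();
        }
        return sketch.notifications;
    }

    public static void main(String[] args) {
        // 0x1ff8 covers 10 bits above the low 3, so the masked value
        // returns to zero once every 1024 increments of 8.
        System.out.println(notificationsAfter(2048)); // prints 2
    }
}
```

Under these assumptions the branch is taken on the 1024th and 2048th calls, so the check is a periodic notification rather than a one-time threshold.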

Alex 

John Hening

Feb 4, 2018, 7:08:58 AM
to mechanica...@googlegroups.com
Alex,

thanks for your response. But how can one know that? I suspected that it is some kind of statistic (because it is metadata and it is incremented every time), but I didn't know exactly what it is.

P.S. Presumably such a counter is thread-local? I don't see any synchronization here, so it must be thread-local, right?

Aleksey Shipilev

Feb 4, 2018, 3:18:05 PM
to mechanica...@googlegroups.com, John Hening
On 02/04/2018 12:31 AM, John Hening wrote:
> public static int dontOptimize;
>
> public static int simple() {
>     return dontOptimize;
> }
>
> And JITed version of that:
>
>   0x00007fdc75112aa0: mov    %eax,-0x14000(%rsp)
>   0x00007fdc75112aa7: push   %rbp
>   0x00007fdc75112aa8: sub    $0x30,%rsp
>   0x00007fdc75112aac: mov    $0x7fdc7443be90,%rax  ;   {metadata(method data for {method} {0x00007fdc7443bb30} 'simple' '()I' in 'Main')}
>   0x00007fdc75112ab6: mov    0xdc(%rax),%esi
>
> Though I know assembly, I cannot understand such simple code. In particular, what do the lines between
> 0x00007fdc75112aac and 0x00007fdc75112ad8 mean? I highlighted them.

This is most likely the method compiled at level 2/3 (C1 with profiling), and the updates you see at
...aac are updates of profiling data. It is unlikely to be on the hot path -- if it is, this is a
performance bug in the tiered compilation machinery.

Zen question: How do you know that the version you got is the one actually executing on the hot
path? With tiered compilation, on-stack replacement, deoptimization, etc. there are many more than
one version of the method. JMH's -prof perfasm can be used to highlight which version is actually
executing. For example:

public static int dontOptimize;

@Benchmark
@CompilerControl(CompilerControl.Mode.DONT_INLINE)
public int test() {
    return dontOptimize;
}


Yields:

C1, level 1, org.openjdk.StaticBench::test, version 443 (36 bytes)

...
[Verified Entry Point]
  3.16%  0x00007f63c91b0e80: mov    %eax,-0x14000(%rsp)
  3.35%  0x00007f63c91b0e87: push   %rbp
  1.65%  0x00007f63c91b0e88: sub    $0x30,%rsp
  2.39%  0x00007f63c91b0e8c: movabs $0x782737b50,%rax  ;   {oop(a 'java/lang/Class' = 'org/openjdk/StaticBench')}
  2.86%  0x00007f63c91b0e96: mov    0x68(%rax),%eax    ;*getstatic dontOptimize
                                                       ; - org.openjdk.StaticBench::test@0 (line 44)
  1.75%  0x00007f63c91b0e99: add    $0x30,%rsp
  1.26%  0x00007f63c91b0e9d: pop    %rbp
  0.82%  0x00007f63c91b0e9e: test   %eax,0x1774025c(%rip)  # 0x00007f63e08f1100
                                                       ;   {poll_return}
  3.14%  0x00007f63c91b0ea4: retq

The method is trivial, and so compilation stops at level 1.

-XX:-TieredCompilation yields:

C2, org.openjdk.StaticBench::test, version 82 (37 bytes)

...
[Verified Entry Point]
  7.11%  0x00007fbdb109c6c0: sub    $0x18,%rsp
  0.18%  0x00007fbdb109c6c7: mov    %rbp,0x10(%rsp)    ;*synchronization entry
                                                       ; org.openjdk.StaticBench::test@-1 (line 44)
  0.04%  0x00007fbdb109c6cc: movabs $0x782647980,%r10  ;   {oop(a 'java/lang/Class' = 'org/openjdk/StaticBench')}
  6.25%  0x00007fbdb109c6d6: mov    0x68(%r10),%eax    ;*getstatic dontOptimize
                                                       ; - org.openjdk.StaticBench::test@0 (line 44)
         0x00007fbdb109c6da: add    $0x10,%rsp
  0.08%  0x00007fbdb109c6de: pop    %rbp
 27.58%  0x00007fbdb109c6df: test   %eax,0xccef91b(%rip)  # 0x00007fbdbdd8c000
                                                       ;   {poll_return}
  0.02%  0x00007fbdb109c6e5: retq

...which is almost the same code, and this is why tiered compilation policy is happy with Level 1
compilation.

-Aleksey


John Hening

Feb 5, 2018, 3:58:03 PM
to mechanical-sympathy
@Aleksey,

1. Why do you consider it a bug in the tiered compilation machinery? It does not look like one to me. The compilation level is: C2, level 4

2. I still have a doubt: why is the profiling counter not synchronized? What if two or more threads are executing the function simple() at the same time?

Can you explain?

Aleksey Shipilev

Feb 6, 2018, 5:55:36 AM
to mechanica...@googlegroups.com, John Hening
On 02/05/2018 09:58 PM, John Hening wrote:
> 1. Why do you consider it a bug in the tiered compilation machinery? It does not look like one to
> me. The compilation level is: C2, level 4

Wait a minute, where exactly does it say "C2, level 4" for you?

I am saying the disassembly you provided in the original message is probably level 2/3, and if
it is hot, that might be a bug in the tiered compilation machinery: we are not supposed to spend a
lot of time in methods with profiling enabled.

But you haven't verified that this disassembly is actually on the hot path. Level 4 code would be
on the hot path, and it would be without profiling.

> 2. I still have a doubt: why is the profiling counter not synchronized? What if two or more
> threads are executing the function

That is a race, so the profile is not very accurate. We (mostly) do not care about that: there is a
tradeoff between profiling overhead and profile accuracy. (Yes, I know racy updates are quirky and
can potentially lose an unbounded number of updates).
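
To make the tradeoff concrete, here is a small Java sketch of the same kind of race. It is only an analogy at the Java level, not HotSpot's profiling code: `RacyCounterSketch`, `racyCounter` and `exactCounter` are hypothetical names, the plain `int` field plays the role of the unsynchronized counter, and the `AtomicInteger` shows what an exact (and more expensive) count would be.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RacyCounterSketch {
    // Plain field updated without synchronization: load, add, store,
    // like the mov/add/mov sequence in the profiled code.
    static int racyCounter;
    // Atomic counter for comparison; it never loses updates.
    static final AtomicInteger exactCounter = new AtomicInteger();

    // Runs the given number of threads, each adding 8 perThread times.
    // Returns { racy total, exact total }.
    static int[] run(int threads, int perThread) {
        racyCounter = 0;
        exactCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    racyCounter += 8;          // racy read-modify-write
                    exactCounter.addAndGet(8); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new int[] { racyCounter, exactCounter.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 1_000_000);
        // The racy total varies from run to run and can never exceed
        // the exact total; any difference is lost updates.
        System.out.println("racy:  " + totals[0]);
        System.out.println("exact: " + totals[1]);
    }
}
```

The racy total is usually somewhat lower than the exact one, which is acceptable for a profile that only needs to say "this method is called a lot".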

-Aleksey


John Hening

Feb 6, 2018, 4:11:44 PM
to mechanical-sympathy
Aleksey, I made a mistake.


The piece of code from the first post was taken with -XX:+PrintAssembly. There was no information about the compilation level. So I ran it under JMH and took the compilation level (C2, level 4) from there. However, the generated code was different.
Sorry about that.

Thanks for your response :)

John


On Sunday, February 4, 2018 at 00:31:28 UTC+1, John Hening wrote: