likely and unlikely macros


Carfield Yim

Jun 24, 2018, 8:52:01 AM
to mechanica...@googlegroups.com
Hi all,

I just learned that the Linux kernel has likely and unlikely macros (https://stackoverflow.com/questions/109710/how-do-the-likely-and-unlikely-macros-in-the-linux-kernel-work-and-what-is-t). Is there similar optimization support in the JVM? Or is something like this not relevant to a high-level language?

Regards

Sergey Melnikov

Jun 24, 2018, 9:51:41 AM
to mechanical-sympathy
Hi,
In fact, HotSpot takes a somewhat broader approach. The key concept here (in general) is code layout. If you look at x86 assembly, there are plenty of ways to emit a plain if statement. For example:

1)
cmp ...
je CONTINUE
//// if's body
CONTINUE:
// next statement

2)
cmp ...
jne IF_BODY /// jne instead of je!!
CONTINUE:
// next statement
// ... tons of code
IF_BODY:
// if's body
jmp CONTINUE

From a performance point of view, each of these options has its own pros and cons. The first option suits cases where the if body executes relatively frequently, because the body sits on the fall-through path. The second option suits cases where the if body executes rarely, because the common path falls straight through and the rare body is moved out of line.

By itself, gcc has no idea whether the if body will be executed often or rarely. The likely/unlikely macros (wrappers around gcc's __builtin_expect) are hints that help gcc's backend lay out the code the right way.

Let's return to the JVM. During the profiling phase, HotSpot emits counters that record how frequently each branch is taken. So, in comparison to gcc, HotSpot/C2 has additional information about which statements execute most often and which execute rarely. C2's code generator uses this information to emit the fastest branch layout.

And, back to your question: there is no need for these intrinsics in the HotSpot JVM. Most of these aspects are already handled by HotSpot/C2. Nevertheless, code layout is critically important for performance, especially on a platform as alignment-sensitive as modern x86.
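As a rough illustration of the profiling point, here is a minimal sketch of a method with a heavily skewed branch. The class and method names are made up for this example, and nothing here is HotSpot-specific API: the idea is only that the profile should show the `if` body as rarely taken, so C2 would be expected to keep the common path on the fall-through.

```java
// Hypothetical demo class; names are illustrative only.
class BranchProfileDemo {

    // The branch is taken once per 1024 calls, so the profile gathered
    // before C2 compiles this method should mark it as rarely taken.
    static long step(long acc, int i) {
        if ((i & 1023) == 0) {      // rare path
            return acc + slowPath(i);
        }
        return acc + i;             // common path
    }

    // Stand-in for work done only on the rare path.
    static long slowPath(int i) {
        return Long.bitCount(i) * 31L;
    }

    // Calls step() often enough for the method to become hot.
    static long run(int iterations) {
        long acc = 0;
        for (int i = 0; i < iterations; i++) {
            acc = step(acc, i);
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(run(5_000_000));
    }
}
```

To look at the layout actually chosen, one option is to run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly, assuming an hsdis disassembler plugin is installed for your JDK.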

--Sergey

--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



Ivan Bogouchev

Jun 25, 2018, 5:46:41 PM
to mechanica...@googlegroups.com

On Sun, 24 Jun 2018 at 14:51, Sergey Melnikov <melnikov.serg...@gmail.com> wrote:
<snip/>
Let's return to the JVM. During the profiling phase, HotSpot emits counters that record how frequently each branch is taken. So, in comparison to gcc, HotSpot/C2 has additional information about which statements execute most often and which execute rarely. C2's code generator uses this information to emit the fastest branch layout.

And, back to your question: there is no need for these intrinsics in the HotSpot JVM. Most of these aspects are already handled by HotSpot/C2. Nevertheless, code layout is critically important for performance, especially on a platform as alignment-sensitive as modern x86.



Well, it's not always about overall speed. Sometimes the path we care about and where latency is critical is not the most frequently executed one.
This has been discussed on the SG14 group list around standardising likely/unlikely gcc attributes [1].

Unfortunately I am not aware of any way to control the code emitted by C2 to favour a codepath that is actually not the one being executed the most.

--
[1] https://groups.google.com/a/isocpp.org/forum/#!msg/sg14/ohFcWdlvrh0/dPrLh5AdAgAJ

Wojciech Kudla

Jun 25, 2018, 6:06:51 PM
to mechanica...@googlegroups.com
So far in HotSpot we've had compiler commands, basically a set of simple directives that let you exercise some control over the compiler's behaviour.
Probably the most commonly used one is 'dontinline'.
Limiting inlining depth is another example of taking control, in that case to keep cascading deopts from propagating downwards.

With the advent of Java 9 and the JVM compiler interface (JVMCI), the possibilities are much wider, up to the point where you can write your own compiler.

There is also AOT compilation in Java for Linux, which is yet another way to enforce stricter control over code layout and the instructions used.

And of course there is a whole range of JVM parameters (-XX:...) that drive compiler behaviour and profiling.
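To make the 'dontinline' directive concrete, here is a small sketch. The class and method names are hypothetical, and the command lines in the comments reflect my understanding of the CompileCommand syntax, so check them against your JDK before relying on them.

```java
// Hypothetical example class. Possible invocations (assumed syntax):
//   java -XX:CompileCommand=dontinline,ColdPathDemo::handleRare ColdPathDemo
//   java -XX:+PrintCompilation ColdPathDemo     (logs JIT compilation events)
//   java -XX:MaxInlineLevel=5 ColdPathDemo      (caps nested inlining depth)
class ColdPathDemo {

    // Rarely needed work; with the dontinline command above, the JIT is
    // asked to keep this out of the caller's compiled body.
    static int handleRare(int x) {
        return Integer.reverse(x) ^ 0x5bd1e995;
    }

    // Hot method: the common case is a single add.
    static int process(int x) {
        if (x % 10_000 == 0) {
            return handleRare(x);
        }
        return x + 1;
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            sum += process(i);
        }
        System.out.println(sum);
    }
}
```

Keeping the rare work in its own method is what lets a per-method directive target it without touching the hot caller.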

Hope this helps 

Gil Tene

Jun 28, 2018, 1:23:49 AM
to mechanical-sympathy
I've been kicking around / proposing the idea of something like a Thread.thisPathShouldBeConsideredHot() hint call, which (similar to Thread.onSpinWait()) would be valid to implement as a no-op, and would act as a hint to the JVM that the sequential code path it is in should be given high weight when performing optimizations. The idea is that you would place this call in rarely-executing-but-speed-critical paths in the code. As in:

if (lookForOneInAMillionOpportunityToMakeTonsOfMoney() == YEAH_BABY) {
    tradeQuicklyBeforeAnyoneBeatsMeToIt();
    Thread.thisPathShouldBeConsideredHot();
} else {
    dangThisIsBoring++;
    cleanFishTank.doIncrement();
    rearrangeItemsInCloset.sinceIHaveNothingBetterToDoNow();
}
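For comparison, Thread.onSpinWait() (available since Java 9) is the existing hint mentioned above. The sketch below shows the shape such a hint takes in practice: it changes no program semantics, and a JVM may treat it as a no-op. The class name and helper are made up for illustration, and Thread.thisPathShouldBeConsideredHot() itself remains only a proposal.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class showing an existing pure-hint API.
class SpinWaitDemo {

    // Spins until the flag becomes true and returns the number of spins.
    // Thread.onSpinWait() only hints that this is a busy-wait loop.
    static long awaitFlag(AtomicBoolean flag) {
        long spins = 0;
        while (!flag.get()) {
            Thread.onSpinWait();
            spins++;
        }
        return spins;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean ready = new AtomicBoolean(false);
        Thread setter = new Thread(() -> {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            ready.set(true);
        });
        setter.start();
        long spins = awaitFlag(ready);
        setter.join();
        System.out.println("spun " + spins + " times");
    }
}
```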

Carfield Yim

Jul 1, 2018, 11:48:50 AM
to mechanica...@googlegroups.com
Yes it would be excellent!


Avi Kivity

Jul 1, 2018, 1:34:57 PM
to mechanica...@googlegroups.com, Carfield Yim

Here's a measurement of 3% improvement from applying such an annotation to one line:


https://groups.google.com/d/msg/seastar-dev/g0FD6NVQ_AI/HNRLcIMOAQAJ


Kirk Pepperdine

Jul 2, 2018, 6:00:13 AM
to mechanica...@googlegroups.com
Indeed, and what about:

Thread.itMightBeNiceIfYouCouldAlignOnInstructionBufferBoundary();
for (int i = 0; i < reallyBigNumber; i++)
    executeMyLoopBody();

Kind regards,
Kirk




Steven Stewart-Gallus

Jan 19, 2019, 11:43:32 PM
to mechanica...@googlegroups.com
You can use a MutableCallSite with MethodHandles.constant if you really need to create "mostly final" variables (see this thread: https://groups.google.com/forum/#!topic/mechanical-sympathy/ERhCVVjzxt0 ). In addition, null checks are heavily optimised by the JVM: it can omit the explicit test and rely on a segmentation fault for the rare null case, so a null check can sometimes be slightly faster than a boolean check. JVM inlining also handles this quite naturally.
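A minimal sketch of the "mostly final" idea using only java.lang.invoke, without any bytecode generation. The class, field, and method names are hypothetical. Whether the JIT actually folds the constant depends on the handle being reachable from a static final field, as it is here.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MutableCallSite;

// Hypothetical example of a rarely changed flag backed by a MutableCallSite.
class MostlyFinalFlag {

    // The call site starts out bound to the constant "false".
    private static final MutableCallSite SLOW_MODE =
            new MutableCallSite(MethodHandles.constant(boolean.class, false));

    // Invoker that always dispatches to the call site's current target.
    private static final MethodHandle SLOW_MODE_GETTER = SLOW_MODE.dynamicInvoker();

    static boolean slowMode() {
        try {
            return (boolean) SLOW_MODE_GETTER.invokeExact();
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }

    // Rare operation: rebinds the constant, then syncAll makes the new
    // target visible to other threads.
    static void setSlowMode(boolean value) {
        SLOW_MODE.setTarget(MethodHandles.constant(boolean.class, value));
        MutableCallSite.syncAll(new MutableCallSite[] { SLOW_MODE });
    }

    static String myMethod() {
        if (slowMode()) {
            return "slow";   // rare path
        }
        return "fast";       // common path
    }

    public static void main(String[] args) {
        System.out.println(myMethod());
        setSlowMode(true);
        System.out.println(myMethod());
    }
}
```

The trade-off is that reads are cheap while the target stays fixed, whereas each setSlowMode call is expensive because compiled code that depended on the old target may have to be thrown away.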

Altogether, the full pattern is something like the following, minus exception handling. I typed this off the top of my head, so I probably got the ByteBuddy details wrong:

public abstract class Trap {
    public static Trap newInstance(CallSite cs) {
        // Generate a final subclass whose getValue() goes through the call site.
        var clazz = new ByteBuddy()
            .subclass(Trap.class, Modifier.PUBLIC | Modifier.FINAL)
            .defineMethod("getValue", Object.class, Modifier.PUBLIC | Modifier.FINAL)
            .intercept(InvokeDynamic::/* some other stuff I don't remember right now */)
            .defineField("methodHandle", CallSite.class, Modifier.PUBLIC | Modifier.FINAL)
            .make()
            .load()
            .getLoaded();
        // set(null, ...) only works if the generated field is static.
        clazz.getField("methodHandle").set(null, cs);
        return clazz.newInstance();
    }

    public abstract Object getValue();
}

public static final MutableCallSite CS = new MutableCallSite(MethodHandles.constant(Object.class, ""));
public static final Trap PEOPLE_ARE_DOING_TRICKY_REFLECTION_STUFF_DEOPTIMIZE = Trap.newInstance(CS);

public void mymethod() {
    fastpath: {
        if (PEOPLE_ARE_DOING_TRICKY_REFLECTION_STUFF_DEOPTIMIZE.getValue() == null) {
            break fastpath;
        }
        // rest of code
        return;
    }
    handleSlowpath();
}


But this is probably overkill, and it depends on the situation. If you need to optimise for a common subclass of an abstract class, you can use an explicit instanceof check.

if (value instanceof Bar) {
    ((Bar) value).foo();
} else {
    value.foo();
}
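A self-contained sketch of the same instanceof guard. The Shape, Square, and Circle types are hypothetical stand-ins for the abstract class and its common subclass. The cast gives the JIT an exact, final receiver type on the common path, which should make that call easy to inline.

```java
// Hypothetical types illustrating an explicit type guard.
class TypeGuardDemo {

    interface Shape {
        double area();
    }

    // The common case: final, so the call target is known exactly.
    static final class Square implements Shape {
        private final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    // A less common implementation.
    static final class Circle implements Shape {
        private final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public double area() { return Math.PI * radius * radius; }
    }

    static double area(Shape s) {
        if (s instanceof Square) {
            return ((Square) s).area();   // guarded, monomorphic call
        } else {
            return s.area();              // general interface dispatch
        }
    }

    public static void main(String[] args) {
        System.out.println(area(new Square(2.0)));
        System.out.println(area(new Circle(1.0)));
    }
}
```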

There are probably a few other tricks as well.
