Need help for optimizing `invoke`


Yichao Yu

Apr 21, 2015, 10:46:36 PM
to juli...@googlegroups.com
Hi,

I tried to optimize the `invoke` function in Julia before
(PR #9642[1]) and got some (IMHO) pretty promising numbers. However,
given the massive changes over the past 4 months, and since I wasn't very
familiar with Julia internals at the time (not that I'm familiar with them
now either...), the patch needs a major refactor to apply to current
master.

Since there are multiple optimizations that can be applied at different
stages, I decided to start with something simple. I chose inlining the
call site in `codegen.cpp` because 1) it's C++, and 2) it's probably less
coupled to other parts (and I kind of want to see #10805[3] and
#9765[2] addressed first...).

Anyway, I tried to port the commit where I did this last time[4] to the
current master. Getting the C++ to compile was not so hard (and thanks to
the new tuple type system, it's much easier to retrieve the type
parameter passed in). However, I couldn't figure out how to generate
the call this time. The difference from last time is that I cannot
find a function to fill in `specFunctionObject`, and
`emit_call_function_object` branches into `emit_jlcall`, which gives a
SEGFAULT when trying to cast a `Value*` to a `Function*` in
`prepare_call`. (Not that I understand why `jl_cstyle_compile` was
necessary or what exactly `cFunctionObject` was last time,
either...)

I guess the performance of `invoke` is probably not very important, but
I find it a fun thing to work on, and it can help me learn some Julia
internals. It would be really nice if someone could give me some help
implementing this.

Thanks.

Yichao Yu

[1] https://github.com/JuliaLang/julia/pull/9642
[2] https://github.com/JuliaLang/julia/issues/9765
[3] https://github.com/JuliaLang/julia/issues/10805
[4] https://github.com/yuyichao/julia/commit/1646afe3b70aa49eb636845d5eeba7061f7ae6f8?diff=unified
[5] https://github.com/yuyichao/julia/commit/1646afe3b70aa49eb636845d5eeba7061f7ae6f8?diff=unified#diff-6d4d21428a67320600faf5a1a9f3a16aR2567

Jameson Nash

Apr 23, 2015, 10:03:58 AM
to juli...@googlegroups.com
It might help if you can break down your questions into smaller bits. There's a lot of content here to address, which is probably slowing down response.

The code layer I believe you are referring to was restructured with the introduction of ABI-compliant cfunction support. The code to support C-compatible function signatures was split off into a separate piece of machinery. `specFunctionObject` (previously `cFunctionObject`) is a `Function*` to a method with a specialized call signature. By contrast, `functionObject` is a `Function*` to a method with a generic, `JL_CALLABLE` signature. If you look at `emit_call_function_object` (and `gen_cfun_wrapper`), you will see that it detects whether the `specFunctionObject` is available and generates an appropriate call site.
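
As a rough illustration of the distinction described above, here is a toy C++ model of a call site that prefers a specialized entry point and falls back to the generic one. This is only a sketch under stated assumptions: `Boxed`, `ToyLambdaInfo`, `add_generic`, `add_spec`, and `toy_call` are hypothetical names invented for this example, and the real codegen emits LLVM IR for the call rather than calling C++ function pointers directly.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a boxed value; NOT the real jl_value_t.
struct Boxed { int64_t value; };

// Generic signature: every argument arrives boxed, packed in an array.
using GenericFn = Boxed (*)(const Boxed *args, uint32_t nargs);
// Specialized signature: arguments arrive unboxed, by value.
using SpecFn = int64_t (*)(int64_t, int64_t);

// Hypothetical stand-in for the per-method record holding both entry points.
struct ToyLambdaInfo {
    GenericFn functionObject;   // generic entry point
    SpecFn specFunctionObject;  // specialized entry point, may be null
};

static Boxed add_generic(const Boxed *args, uint32_t nargs) {
    assert(nargs == 2);
    return Boxed{args[0].value + args[1].value};
}

static int64_t add_spec(int64_t a, int64_t b) { return a + b; }

// Toy analogue of the call-site decision: use the specialized entry point
// when it exists, otherwise box the arguments and use the generic one.
// `path` records which branch was taken so the choice can be inspected.
static int64_t toy_call(const ToyLambdaInfo &li, int64_t a, int64_t b,
                        std::string &path) {
    if (li.specFunctionObject != nullptr) {
        path = "specialized";
        return li.specFunctionObject(a, b);
    }
    path = "generic";
    Boxed boxed[2] = {Boxed{a}, Boxed{b}};  // boxing step of the fallback
    return li.functionObject(boxed, 2).value;
}
```

With both entry points filled in, `toy_call` takes the specialized branch; with `specFunctionObject` left null, it boxes the two arguments and goes through `functionObject` instead.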

Yichao Yu

Apr 23, 2015, 10:47:21 AM
to juli...@googlegroups.com
On Thu, Apr 23, 2015 at 10:03 AM, Jameson Nash <vtj...@gmail.com> wrote:
> It might help if you can break down your questions into smaller bits.
> There's a lot of content here to address, which is probably slowing down
> response.

Thanks for your response.

>
> The code layer I believe you are referring to was restructured with the
> introduction of ABI-complient cfunction support. The code to support
> c-compatible function signatures was split off into a separate piece of
> machinery. `specFunctionObject` (previously `cFunctionObject`) is a
> `Function*` to a method with a specialized call signature. By contrast,
> `functionObject` is a `Function*` to a method with a generic, JL_CALLABLE
> signature. If you look at emit_call_function_object (and gen_cfun_wrapper),
> you will see that it detects whether the specFunctionObject is available and
> generates an appropriate call site.

This is very useful info for me.

Starting from here and trying to break down my questions:

1. Is it true that the `specFunctionObject` (by which I mean calling
the underlying C function) is more efficient than `functionObject`
(presumably because there's less box/unbox overhead?), and so should I try
to emit a specialized C function call when possible?

2. Can I use `emit_call_function_object` to emit a call to the
specialized function for `invoke` (and maybe fall back to emitting a
jlcall if necessary)? It seems to require a non-null
`specFunctionObject`, which I couldn't figure out how to obtain.

The code I have right now is
```
Value *_theF = literal_pointer_val((jl_value_t*)mfunc);
Value *_theFptr = (Value*)mfunc->linfo->functionObject;
return emit_call_function_object(mfunc, _theF, _theFptr, true,
                                 args + 2, nargs - 2, ctx);
```

where `mfunc` is the specialized function for `invoke` (should be the
same one that is called with `jl_apply()` at the end of
`jl_gf_invoke`)

Jameson Nash

Apr 23, 2015, 11:00:29 AM
to juli...@googlegroups.com
On Thu, Apr 23, 2015 at 10:47 AM Yichao Yu <yyc...@gmail.com> wrote:
> 1. Is it true that the `specFunctionObject` (by which I mean calling
> the underlying c function) is more efficient than `functionObject`
> (supposedly because there's less box/unbox overhead?) so should I try
> to emit a specialized c function call when possible?

No. It is more efficient when it is available, but would be less efficient where it is not available. You only need an ABI-compliant function call if you are passing the result to C (it is not faster).

> 2. Can I use `emit_call_function_object` to emit a call to the
> specialized function for `invoke` (and maybe "fallback" to emitting a
> jlcall if necessary)? It seems to require a non-null
> `specFunctionObject` which I couldn't figure out how.

Yes, that is what `emit_call_function_object` does now.

> The code I have right now is
> ```
>         Value *_theF = literal_pointer_val((jl_value_t*)mfunc);
>         Value *_theFptr = (Value*)mfunc->linfo->functionObject;
>         return emit_call_function_object(mfunc, _theF, _theFptr, true,
>                                          args + 2, nargs - 2, ctx);
> ```
>
> where `mfunc` is the specialized function for `invoke` (should be the
> same one that is called with `jl_apply()` at the end of
> `jl_gf_invoke`)

`theFptr` should be the same as in the existing `emit_call` case:
    theFptr = emit_nthptr_recast(theFunc,
        (ssize_t)(offsetof(jl_function_t,fptr)/sizeof(void*)),
        tbaa_func, jl_pfptr_llvmt);
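
The index expression in the line above divides a byte offset by the pointer size to get a slot number, so the entry point is read out of the function object itself. Here is a minimal sketch of that arithmetic in plain C++. It assumes a hypothetical `ToyFunction` layout (not the real `jl_function_t`) and a hypothetical `load_entry` helper, and it performs the load directly instead of emitting LLVM IR for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

using ToyEntry = int (*)(int);

// Hypothetical layout for illustration; NOT the real jl_function_t.
struct ToyFunction {
    void *type_tag;  // slot 0
    ToyEntry fptr;   // slot 1: entry point stored in the object
    void *env;       // slot 2
};

// The memcpy below assumes function pointers are as wide as data pointers.
static_assert(sizeof(ToyEntry) == sizeof(void *),
              "sketch assumes pointer-sized function pointers");

static int twice(int x) { return 2 * x; }

// Position of `fptr` counted in pointer-sized slots rather than bytes,
// mirroring offsetof(...)/sizeof(void*) in the quoted line.
static constexpr std::size_t kFptrSlot =
    offsetof(ToyFunction, fptr) / sizeof(void *);

// Read the nth pointer-sized slot of the object and reinterpret it as the
// entry-point type (a plain C++ analogue of "load nth pointer, then recast").
static ToyEntry load_entry(const ToyFunction *f) {
    const unsigned char *base = reinterpret_cast<const unsigned char *>(f);
    ToyEntry entry;
    std::memcpy(&entry, base + kFptrSlot * sizeof(void *), sizeof entry);
    return entry;
}
```

Calling `load_entry(&f)` on a `ToyFunction` whose `fptr` is `twice` hands back `twice`, which can then be called like any other function pointer.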

Yichao Yu

Apr 23, 2015, 11:09:31 AM
to juli...@googlegroups.com
On Thu, Apr 23, 2015 at 11:00 AM, Jameson Nash <vtj...@gmail.com> wrote:
>
>
> On Thu, Apr 23, 2015 at 10:47 AM Yichao Yu <yyc...@gmail.com> wrote:
>>
>> On Thu, Apr 23, 2015 at 10:03 AM, Jameson Nash <vtj...@gmail.com> wrote:
>> 1. Is it true that the `specFunctionObject` (by which I mean calling
>> the underlying c function) is more efficient than `functionObject`
>> (supposedly because there's less box/unbox overhead?) so should I try
>> to emit a specialized c function call when possible?
>
> No. It is more efficient when it is available, but would be less efficient
> where it is not available. You only need an ABI-compliant function call if
> you are passing the result to C (it is not faster).

I see. (And I guess this is why the `gen_cfun_wrapper` function is not
used in codegen, only as part of the C API?)

>>
>> The code I have right now is
>> ```
>> Value *_theF = literal_pointer_val((jl_value_t*)mfunc);
>> Value *_theFptr = (Value*)mfunc->linfo->functionObject;
>> return emit_call_function_object(mfunc, _theF, _theFptr, true,
>> args + 2, nargs - 2, ctx);
>> ```
>>
>> where `mfunc` is the specialized function for `invoke` (should be the
>> same one that is called with `jl_apply()` at the end of
>> `jl_gf_invoke`)
>
> theFptr should be the same as in the existing emit_call case:
> theFptr = emit_nthptr_recast(theFunc,
> (ssize_t)(offsetof(jl_function_t,fptr)/sizeof(void*)), tbaa_func,
> jl_pfptr_llvmt);

Trying this.

The code I had before (which was working) was copied from the
`jl_apply_generic` case. What is the difference between my case and
that case?

Thanks.

Yichao Yu

Jameson Nash

Apr 23, 2015, 12:34:02 PM
to juli...@googlegroups.com
> The code I had (and was working) before was copied from the
> `jl_apply_generic` case. What is the difference between my case and
> that case?

I agree. They should be very similar.

Yichao Yu

Apr 23, 2015, 2:48:21 PM
to juli...@googlegroups.com
On Thu, Apr 23, 2015 at 12:34 PM, Jameson Nash <vtj...@gmail.com> wrote:
>> The code I had (and was working) before was copied from the
>> `jl_apply_generic` case. What is the difference between my case and
>> that case?
>
> I agree. They should be very similar.

So I guess I still don't really understand this part very well, but the
code you suggested works.

Here is the new PR[1]. It would be nice if someone could review it =).

[1] https://github.com/JuliaLang/julia/pull/10964