[llvm-dev] RFC: Memcpy inlining in IR

Amara Emerson via llvm-dev

Jun 19, 2019, 11:42:35 PM
to llvm-dev
Hi all,

For GlobalISel, we’re exploring options for implementing inlining optimizations for memcpy and friends. However, looking around the existing implementation, I don’t see anything that would particularly be problematic for us to do it at the IR level.

The existing TLI hooks that specify how certain memcpy calls should be lowered don’t have anything too SelectionDAG-specific, and an IR lowering pass could in future be shared between SDAG and GISel. Does anyone see issues with this?

Thanks,
Amara
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
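
As a rough illustration of the kind of decision such hooks drive, here is a minimal sketch of picking a sequence of load/store widths for a fixed-size copy, giving up when the sequence gets too long. It is a simplified model and not LLVM's code: the function name, the parameters (`maxOpBytes`, `maxOps`) and the greedy strategy are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not an LLVM API.
// Greedily splits a copy of `size` bytes into power-of-two sized operations,
// none wider than `maxOpBytes`. Returns std::nullopt when more than `maxOps`
// operations would be needed, which models "do not inline, call memcpy".
std::optional<std::vector<unsigned>> chooseMemOps(std::size_t size,
                                                  unsigned maxOpBytes,
                                                  unsigned maxOps) {
  // The widest operation must be a non-zero power of two.
  assert(maxOpBytes != 0 && (maxOpBytes & (maxOpBytes - 1)) == 0);

  std::vector<unsigned> ops;
  unsigned width = maxOpBytes;
  while (size != 0) {
    // Shrink the operation until it fits in the remaining bytes.
    while (width > size)
      width /= 2;
    // Too many operations: the caller should fall back to a library call.
    if (ops.size() == maxOps)
      return std::nullopt;
    ops.push_back(width);
    size -= width;
  }
  return ops;
}
```

With these assumptions, a 15-byte copy with 8-byte operations and a budget of four comes out as widths 8, 4, 2, 1, while a budget of three rejects the expansion. A real target would also weigh alignment and whether overlapping accesses are cheaper, which this sketch ignores.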

Roman Lebedev via llvm-dev

Jun 20, 2019, 3:05:29 AM
to Amara Emerson, llvm-dev
On Thu, Jun 20, 2019 at 6:42 AM Amara Emerson via llvm-dev
<llvm...@lists.llvm.org> wrote:
>
> Hi all,
>
> For GlobalISel, we’re exploring options for implementing inlining optimizations for memcpy and friends. However, looking around the existing implementation, I don’t see anything that would particularly be problematic for us to do it at the IR level.
>
> The existing TLI hooks that specify how certain memcpy calls should be lowered don’t have anything too SelectionDAG-specific, and an IR lowering pass could in future be shared between SDAG and GISel. Does anyone see issues with this?
Sounds similar to https://reviews.llvm.org/D60318
It should be done *really* late in the middle-end pipeline though.

> Thanks,
> Amara
Roman.

Sjoerd Meijer via llvm-dev

Jun 20, 2019, 6:22:34 AM
to Amara Emerson, Roman Lebedev, llvm-dev
There seem to be a lot of opinions on where memcpy expansion/inlining should happen: in (late) IR, or in the backend; see for example https://reviews.llvm.org/D35035. A complicating factor is that efficient memcpy lowering is crucial for both performance and code size, and memcpy calls occur a lot.

Either way, I agree that the TLI hooks are not SelectionDAG-specific; they can be used in an IR lowering pass.

Cheers,
Sjoerd.


Amara Emerson via llvm-dev

Jun 20, 2019, 6:48:32 PM
to Sjoerd Meijer, llvm-dev
I agree that this should be a very late pass. Doing it in the IR would simplify the implementation in GlobalISel, and it might also let us share one expansion/optimization pass between SDISel and GISel.

Volkan may look at upstreaming a partial implementation he has downstream.

Cheers,
Amara

Matt Arsenault via llvm-dev

Jun 20, 2019, 6:54:19 PM
to Amara Emerson, llvm-dev

We already have lib/Transforms/Utils/LowerMemIntrinsics.cpp; there just isn’t a general pass that expands these for targets. AMDGPU already always uses this for memcpy handling.


-Matt
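
For readers unfamiliar with this kind of expansion, the following is a minimal sketch of the shape a loop-based memcpy expansion typically takes: a main loop over wide chunks followed by a residual for the leftover bytes. It is plain C++ rather than LLVM IR, and the function name and the 8-byte chunk size are assumptions made for the example, not taken from LowerMemIntrinsics.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper illustrating a loop expansion of memcpy.
// Copies `len` bytes from `src` to `dst`; the buffers must not overlap.
void expandedMemcpy(unsigned char *dst, const unsigned char *src,
                    std::size_t len) {
  assert(dst != nullptr && src != nullptr);
  constexpr std::size_t kLoopOpSize = sizeof(std::uint64_t);

  std::size_t i = 0;
  // Main loop: one wide load and one wide store per iteration. The
  // fixed-size std::memcpy calls stand in for single unaligned accesses.
  for (; i + kLoopOpSize <= len; i += kLoopOpSize) {
    std::uint64_t chunk;
    std::memcpy(&chunk, src + i, kLoopOpSize);
    std::memcpy(dst + i, &chunk, kLoopOpSize);
  }
  // Residual: copy the remaining (fewer than kLoopOpSize) bytes one by one.
  for (; i < len; ++i)
    dst[i] = src[i];
}
```

For a 19-byte copy this runs the main loop twice and then copies three residual bytes. Because the expansion contains a loop, it needs control flow, which is why where in the pipeline it runs matters.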

Amara Emerson via llvm-dev

Jun 20, 2019, 7:06:15 PM
to Matt Arsenault, llvm-dev

> On Jun 20, 2019, at 3:54 PM, Matt Arsenault <ars...@gmail.com> wrote:
>
> We already have lib/Transforms/Utils/LowerMemIntrinsics.cpp; there just isn’t a general pass that expands these for targets. AMDGPU already always uses this for memcpy handling.
>
>
> -Matt
>

Sure, that might end up sharing some code, but the key thing is to use the TLI hooks to implement the same optimizations that SelectionDAG currently does.

Amara

David Chisnall via llvm-dev

Jun 21, 2019, 8:16:51 AM
to llvm...@lists.llvm.org
For CHERI, we have to be quite careful with memcpy because any pointer
copy must be done with pointer load / store operations at every
pointer-sized and pointer-aligned location. I believe I've now found
four (maybe five?) different places in LLVM where memcpy is expanded.
Most of those are in the IR, not SelectionDAG.

If you're expanding to loads and stores, then with SelectionDAG it's
much better to do this at the IR level, because there you can insert
control-flow structures (and so can emit loops). I don't believe that is
a problem for GlobalISel. Some targets do not expand to loads and
stores; for example, on some x86 variants memcpy is expanded to a single
REP MOVSB instruction. This is probably much easier to implement as a
DAG pattern.

David
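
To illustrate the constraint described above, here is a minimal sketch that plans a copy so that every pointer-sized, pointer-aligned location is moved with one pointer-width operation and everything else is moved byte by byte. It only models the planning step and is not CHERI or LLVM code: the names `CopyOp` and `planCopy` are hypothetical, and the sketch assumes that when source and destination are misaligned relative to each other, no location can hold a valid pointer on both sides, so a plain byte copy is acceptable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one step of an expanded copy.
struct CopyOp {
  std::size_t offset; // byte offset from the start of both buffers
  std::size_t size;   // number of bytes moved by this step
  bool pointerOp;     // true: must be a pointer load/store pair
};

// Hypothetical planner: pointer-width operations wherever both addresses
// are pointer-aligned and a full pointer fits, single bytes elsewhere.
std::vector<CopyOp> planCopy(std::uintptr_t srcAddr, std::uintptr_t dstAddr,
                             std::size_t len, std::size_t ptrSize) {
  assert(ptrSize != 0);
  std::vector<CopyOp> plan;
  // Assumption of this sketch: pointers can only be preserved when source
  // and destination share the same alignment modulo the pointer size.
  const bool coAligned = (srcAddr % ptrSize) == (dstAddr % ptrSize);

  std::size_t off = 0;
  while (off < len) {
    const bool aligned = coAligned && (srcAddr + off) % ptrSize == 0;
    if (aligned && len - off >= ptrSize) {
      plan.push_back({off, ptrSize, true});
      off += ptrSize;
    } else {
      plan.push_back({off, 1, false});
      ++off;
    }
  }
  return plan;
}
```

For a 20-byte copy between addresses ending in 3 with 8-byte pointers, the plan is five byte operations, one pointer operation at offset 5, and seven trailing byte operations. A real expansion would merge the byte runs into wider integer accesses; the point here is only which locations must not be split.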

JF Bastien via llvm-dev

Jun 21, 2019, 12:13:47 PM
to David Chisnall, llvm...@lists.llvm.org

> On Jun 21, 2019, at 5:16 AM, David Chisnall via llvm-dev <llvm...@lists.llvm.org> wrote:
>
> For CHERI, we have to be quite careful with memcpy because any pointer copy must be done with pointer load / store operations for all pointer sized-and-aligned places. I believe I've now found four (maybe five?) different places in LLVM where memcpy is expanded. Most of those are in the IR, not SelectionDAG.

Maybe I’m wrong, but it seems to me that as soon as there’s a memcpy on a pointer, the IR is incorrect for your target’s semantics. That is, the language frontend should never emit a memcpy on pointers, and it would be wrong for your target to synthesize a new memcpy from loads / stores of pointers. Clang definitely has the required information to respect these constraints, but I don’t think LLVM IR does.
