[llvm-dev] TableGen processing of target-specific intrinsics

Paul C. Anagnostopoulos via llvm-dev

Sep 29, 2020, 3:23:28 PM
to llvm...@lists.llvm.org
Each of the main TableGen files for the supported targets includes

include "llvm/Target/Target.td"

In turn, Target.td includes

include "llvm/IR/Intrinsics.td"

The final lines of Intrinsics.td are

include "llvm/IR/IntrinsicsPowerPC.td"
include "llvm/IR/IntrinsicsX86.td"
include "llvm/IR/IntrinsicsARM.td"
include "llvm/IR/IntrinsicsAArch64.td"
include "llvm/IR/IntrinsicsXCore.td"
include "llvm/IR/IntrinsicsHexagon.td"
include "llvm/IR/IntrinsicsNVVM.td"
include "llvm/IR/IntrinsicsMips.td"
include "llvm/IR/IntrinsicsAMDGPU.td"
include "llvm/IR/IntrinsicsBPF.td"
include "llvm/IR/IntrinsicsSystemZ.td"
include "llvm/IR/IntrinsicsWebAssembly.td"
include "llvm/IR/IntrinsicsRISCV.td"

Why does every target include all the intrinsics for all the targets?

For example, when I process the ARC TableGen file, the records include 1,187 intrinsics for the X86.
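A count like the one above can be reproduced with a short script. The following is only a sketch: it assumes the records were exported with llvm-tblgen's JSON dump, that the result has an "!instanceof" map from class names to def names, and that each intrinsic record exposes its TargetPrefix field as a string. The sample records are hypothetical stand-ins, not real intrinsics.

```python
from collections import Counter


def count_intrinsics_by_target(records):
    """Count Intrinsic defs per TargetPrefix in a parsed JSON record dump."""
    # Assumed layout: "!instanceof" maps a class name to the defs derived from it.
    names = records.get("!instanceof", {}).get("Intrinsic", [])
    counts = Counter()
    for name in names:
        # An empty prefix is taken here to mean a target-agnostic intrinsic.
        prefix = records[name].get("TargetPrefix") or "(generic)"
        counts[prefix] += 1
    return counts


# Tiny hand-built stand-in for real TableGen output (hypothetical records).
sample = {
    "!instanceof": {"Intrinsic": ["int_x86_foo", "int_x86_bar", "int_generic_baz"]},
    "int_x86_foo": {"TargetPrefix": "x86"},
    "int_x86_bar": {"TargetPrefix": "x86"},
    "int_generic_baz": {"TargetPrefix": ""},
}

print(count_intrinsics_by_target(sample))
```

With a real dump, the dictionary would come from json.load on the exported file instead of the hand-built sample.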

_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Nicolai Hähnle via llvm-dev

Sep 30, 2020, 9:20:43 AM
to Paul C. Anagnostopoulos, llvm-dev
Hi Paul,

the intrinsics of all backends together with the target-agnostic
intrinsics are all part of a single large enum, and there are some
subtle assumptions made e.g. about how the values in this enum are
sorted (in order to emit tables that are suitable for binary search).
So this is an area in which to tread carefully.
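The sorting assumption can be illustrated with a small sketch. It is not LLVM's generated code: the name table is hypothetical, and the convention that 0 means "not an intrinsic" and real IDs start at 1 is an assumption made for the example. The point is only that a lookup by binary search is correct exactly as long as the table stays sorted.

```python
from bisect import bisect_left

# Hypothetical name table. The lookup below is only correct while it is sorted.
NAME_TABLE = [
    "llvm.arm.hypothetical",
    "llvm.memcpy",
    "llvm.x86.hypothetical",
]


def lookup_intrinsic_id(name, table=NAME_TABLE):
    """Return the 1-based ID of name, or 0 if name is not in the table."""
    i = bisect_left(table, name)  # binary search for the insertion point
    if i < len(table) and table[i] == name:
        return i + 1  # IDs follow table order, so position and ID coincide
    return 0


print(lookup_intrinsic_id("llvm.memcpy"))
print(lookup_intrinsic_id("llvm.does.not.exist"))
```

Because the ID is derived from the position in the sorted table, inserting one target's intrinsics out of order would silently break both the search and the numbering.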

That said, I don't think there's any inherent reason why we
_shouldn't_ try to disentangle this a bit.

I've been thinking for a while now that in a sense, the
target-specific intrinsics are like a dialect of LLVM IR. It would
make sense to treat them as such more explicitly.

Cheers,
Nicolai

--
Learn how the world really is,
but never forget how it should be.

Paul C. Anagnostopoulos via llvm-dev

Sep 30, 2020, 9:58:36 AM
to Nicolai Hähnle, llvm-dev
If I decide to tackle this issue, I will certainly post a proposal first. I think it will be down the road a bit, as I learn more about how the system works.

Is there a place online where the results of a full build are posted?



Reid Kleckner via llvm-dev

Sep 30, 2020, 4:31:57 PM
to Nicolai Hähnle, llvm-dev, Paul C. Anagnostopoulos
On Wed, Sep 30, 2020 at 6:20 AM Nicolai Hähnle via llvm-dev <llvm...@lists.llvm.org> wrote:
> Hi Paul,
>
> the intrinsics of all backends together with the target-agnostic
> intrinsics are all part of a single large enum, and there are some
> subtle assumptions made e.g. about how the values in this enum are
> sorted (in order to emit tables that are suitable for binary search).
> So this is an area in which to tread carefully.

Strictly speaking, they are no longer part of the same enum:
http://github.com/llvm/llvm-project/commit/5d986953c8b917bacfaa1f800fc1e242559f76be
But yes, they use the same enumerator space, and TableGen processes them all at once.

Processing intrinsics separately by target requires dividing the 32-bit intrinsic opcode space up front. For example, we could use 8 bits for target, 24 for intrinsics. This would make intrinsic ids no longer dense, and no longer suitable as array indices, but perhaps having distinct arrays per target is better anyway.
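As a rough sketch of the 8/24 split described above (the function names and the target numbering are made up for illustration, not an existing LLVM scheme):

```python
TARGET_BITS = 8
INTRINSIC_BITS = 24
INTRINSIC_MASK = (1 << INTRINSIC_BITS) - 1


def pack_id(target, index):
    """Combine a target number and a per-target index into one 32-bit ID."""
    if not 0 <= target < (1 << TARGET_BITS):
        raise ValueError("target does not fit in 8 bits")
    if not 0 <= index <= INTRINSIC_MASK:
        raise ValueError("index does not fit in 24 bits")
    return (target << INTRINSIC_BITS) | index


def unpack_id(packed):
    """Split a packed ID back into (target, index)."""
    return packed >> INTRINSIC_BITS, packed & INTRINSIC_MASK


# Hypothetical numbering: target 2, fifth intrinsic of that target.
packed = pack_id(2, 5)
print(hex(packed), unpack_id(packed))
```

The packed values have large gaps between targets, which is why they no longer work as indices into a single array. A per-target array indexed by the low 24 bits would still be dense.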

Paul C. Anagnostopoulos via llvm-dev

Sep 30, 2020, 5:48:35 PM
to Reid Kleckner, llvm-dev
This has me curious, so I'm going to learn how it works.

Would it be possible to allow each target to generate its own intrinsic tables, and then write a program to merge them into one contiguous space?
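To make the question concrete, here is a sketch of what such a merge step might look like. Everything in it is hypothetical (the function name, the input layout, the choice to reserve 0 and to order targets alphabetically), and it says nothing about whether this would fit LLVM's build.

```python
def merge_tables(per_target):
    """Assign contiguous IDs to per-target name lists.

    per_target maps a target name to its list of intrinsic names.
    Returns (ids, ranges): ids maps each name to its ID, and ranges maps
    each target to the half-open ID interval [start, end) it occupies.
    """
    ids = {}
    ranges = {}
    next_id = 1  # assumption: 0 is reserved for "not an intrinsic"
    for target in sorted(per_target):
        start = next_id
        for name in sorted(per_target[target]):
            ids[name] = next_id
            next_id += 1
        ranges[target] = (start, next_id)
    return ids, ranges


# Hypothetical per-target tables, as if each target had generated its own.
tables = {
    "x86": ["llvm.x86.b", "llvm.x86.a"],
    "arm": ["llvm.arm.a"],
}
ids, ranges = merge_tables(tables)
print(ids)
print(ranges)
```

Each target ends up with one contiguous block, so a range check identifies the target and the IDs stay dense. The cost is that adding an intrinsic to one target shifts the IDs of every target that sorts after it.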


At 9/30/2020 04:31 PM, Reid Kleckner wrote:


>Strictly speaking, they are no longer part of the same enum:
>http://github.com/llvm/llvm-project/commit/5d986953c8b917bacfaa1f800fc1e242559f76be
>But yes, they use the same enumerator space, and TableGen processes them all at once.
>
>Processing intrinsics separately by target requires dividing the 32-bit intrinsic opcode space up front. For example, we could use 8 bits for target, 24 for intrinsics. This would make intrinsic ids no longer dense, and no longer suitable as array indices, but perhaps having distinct arrays per target is better anyway.


Paul C. Anagnostopoulos via llvm-dev

Sep 30, 2020, 6:03:35 PM
to Reid Kleckner, llvm-dev
Another question: Is there a reason why all the TableGen intrinsic files have to be included in every run of TableGen, one per backend per target? Couldn't those files be included only when running backends pertinent to the intrinsics?


