[LLVMdev] Union type, is it really used or necessary?


Neal N. Wang

Jul 20, 2010, 3:41:39 AM
to llv...@cs.uiuc.edu
In the LLVM type system, the union type is similar to the one in C/C++. In the implementation of llvm-gcc, a C union type is converted to a struct of one field that can hold all possible values of the union type, and type casts are used to make the code manipulating the union type "well typed". This approach seems to work very well. Is there really a need to keep the union type in LLVM? Is there a front end emitting union types?

Thanks,
Neal
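
The lowering described above can be sketched in plain C++. This is only an analogy, not llvm-gcc's actual output: the union `IntOrFloat`, the struct `LoweredUnion` and the accessor names are hypothetical, and `std::memcpy` stands in for the pointer casts so the sketch stays within C++ aliasing rules.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical source-level union being lowered:
//   union IntOrFloat { int32_t i; float f; };

// Lowered form: a struct with one field wide enough for every member.
// Both members are 4 bytes here, so a single int32_t slot is enough.
struct LoweredUnion {
    std::int32_t storage;
};

// The "type casts": reinterpret the storage as the requested member.
inline void storeFloat(LoweredUnion &u, float value) {
    static_assert(sizeof(float) == sizeof(std::int32_t),
                  "sketch assumes a 32-bit float");
    std::memcpy(&u.storage, &value, sizeof value);
}

inline float loadFloat(const LoweredUnion &u) {
    float value;
    std::memcpy(&value, &u.storage, sizeof value);
    return value;
}

inline void storeInt(LoweredUnion &u, std::int32_t value) { u.storage = value; }
inline std::int32_t loadInt(const LoweredUnion &u) { return u.storage; }

// Usage: each member round-trips through the shared storage.
inline void exampleLoweredUnion() {
    LoweredUnion u{};
    storeFloat(u, 1.5f);
    assert(loadFloat(u) == 1.5f);
    storeInt(u, 42);
    assert(loadInt(u) == 42);
}
```

Note that every access goes through the address of the storage, which is the point raised later in the thread about such values needing a memory location.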

Anton Korobeynikov

Jul 20, 2010, 4:36:19 AM
to Neal N. Wang, llv...@cs.uiuc.edu
> used to make the code manipulating the union type "well typed". This
> approach seems work very well, is there really a need to keep union type in
> LLVM?
I think in its current state the unions should be removed from LLVM IR
in next release. It's pretty much unfinished and noone is willing to
work on them.

> Is there a front-end emitting union type?

Not that I'm aware of.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Chris Lattner

Jul 20, 2010, 11:34:42 AM
to Anton Korobeynikov, llvmdev@cs.uiuc.edu List, Neal N. Wang

On Jul 20, 2010, at 1:36 AM, Anton Korobeynikov wrote:

>> used to make the code manipulating the union type "well typed". This
>> approach seems work very well, is there really a need to keep union type in
>> LLVM?
> I think in its current state the unions should be removed from LLVM IR
> in next release. It's pretty much unfinished and noone is willing to
> work on them.

I agree.

-Chris

Talin

Jul 20, 2010, 5:46:45 PM
to Chris Lattner, Neal N. Wang, llvmdev@cs.uiuc.edu List
On Tue, Jul 20, 2010 at 8:34 AM, Chris Lattner <clat...@apple.com> wrote:

> On Jul 20, 2010, at 1:36 AM, Anton Korobeynikov wrote:
>
>>> used to make the code manipulating the union type "well typed". This
>>> approach seems work very well, is there really a need to keep union type in
>>> LLVM?
>> I think in its current state the unions should be removed from LLVM IR
>> in next release. It's pretty much unfinished and noone is willing to
>> work on them.
>
> I agree.

Unfortunately I wasn't able to take the union stuff much farther than I did. Partly that was because my LLVM-related work has been on hiatus for the last 4 months or so due to various issues going on in my personal life. But it was also partly because I had reached the limit of my knowledge in this area: I wasn't able to delve deeply enough into the code generation side of LLVM to really understand what needed to be done to support unions.

As far as converting a union into a C struct that is large enough to hold all possible types of the union, there are two minor problems associated with this approach:

1) For frontends that generate target-agnostic code, it is difficult to calculate how large this struct should be. (Which is larger, 3 int32s or two pointers? You don't know unless your frontend knows the size of a pointer.) In my case, I finally decided to abandon my goal of making my frontend completely target-neutral. While it's relatively easy to write a frontend that is 99% target-neutral with LLVM, that last 1% cannot be eliminated.

2) Extracting the values from the union requires pointer casting, which means that the union cannot be an SSA value - it has to have an address. This probably isn't a big issue in languages like C++ which use unions infrequently, but other languages which use algebraic type systems might suffer a loss of performance due to the need to store union types in memory.
 
> -Chris

--
-- Talin
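
Point 1 above can be made concrete with a few lines of arithmetic. The sketch below assumes 32-bit integers, takes the pointer size as a parameter and ignores alignment padding; the function name `unionStorageBytes` is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Bytes needed to hold either three 32-bit integers or two pointers,
// for a target whose pointers are `pointerBytes` wide.
inline std::size_t unionStorageBytes(std::size_t pointerBytes) {
    const std::size_t threeInts = 3 * 4;
    const std::size_t twoPointers = 2 * pointerBytes;
    return std::max(threeInts, twoPointers);
}

// Usage: the winning member changes with the target.
inline void exampleUnionStorage() {
    assert(unionStorageBytes(4) == 12);  // 32-bit target: the three ints are larger
    assert(unionStorageBytes(8) == 16);  // 64-bit target: the two pointers are larger
}
```

A front end that wants to emit a fixed-size stand-in struct has to pick one of these answers, which is where the target dependence comes from.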

Neal N. Wang

Jul 20, 2010, 10:51:35 PM
to Talin, llvmdev@cs.uiuc.edu List
On Tue, Jul 20, 2010 at 2:46 PM, Talin <vir...@gmail.com> wrote:
> 1) For frontends that generate target-agnostic code, it is difficult to calculate how large this struct should be. (Which is larger, 3 int32s or two pointers? You don't know unless your frontend knows the size of a pointer.) In my case, I finally decided to abandon my goal of making my frontend completely target-neutral. While it's relatively easy to write a frontend that is 99% target-neutral with LLVM, that last 1% cannot be eliminated.
 
This is indeed a problem if a front end or any pass has to compute the size of a type. For example, I sometimes need to find out the size of a type in my pass, and I then call TargetData.getTypeStorageSize() to get it. This practice introduces architecture-dependent LLVM code. IMHO, LLVM cannot avoid this problem anyway, unless such a function is removed or returns a ConstantExpr. LLVM probably has a function that returns a type size as a ConstantExpr; I'm just ignorant in this aspect.

Another thought: can you delay computing the maximum storage of a union type by using a max operator?
Your example can be represented as "struct { max([3xi32], [2xi8*],...) }"; this approach avoids deciding the size in front ends. But again, allowing TargetData.getTypeStorageSize() can compromise the architecture-neutrality goal.
 

> 2) Extracting the values from the union require pointer casting, which means that the union cannot be an SSA value - it has to have an address. This probably isn't a big issue in languages like C++ which use unions infrequently, but other languages which use algebraic type systems might suffer a loss of performance due to the need to store union types in memory.

Can mem2reg alleviate this problem?

Cheers,
Neal

Reid Kleckner

Jul 21, 2010, 12:50:20 AM
to Talin, llvmdev@cs.uiuc.edu List, Neal N. Wang
On Tue, Jul 20, 2010 at 2:46 PM, Talin <vir...@gmail.com> wrote:
> 1) For frontends that generate target-agnostic code, it is difficult to
> calculate how large this struct should be. (Which is larger, 3 int32s or two
> pointers? You don't know unless your frontend knows the size of a pointer.)
> In my case, I finally decided to abandon my goal of making my frontend
> completely target-neutral. While it's relatively easy to write a frontend
> that is 99% target-neutral with LLVM, that last 1% cannot be eliminated.

+1 for more features that make it easier to generate target-agnostic
IR, despite its difficulty.

Speaking of incomplete features, most LLVM frontends do not use the
va_arg intrinsics, but they have not been cut. Presumably they are
useful for the same reason.

Reid

Nick Lewycky

Jul 21, 2010, 3:46:06 AM
to Neal N. Wang, llvmdev@cs.uiuc.edu List

:-) It's ConstantExpr::getSizeOf(Ty).

You can then pass that into an alloca and allocate that number of bytes.

> Another thought is can you delay the computing of the maximum storage of
> a union type by using a max operator?

Sure, but that's annoying. The max(%X, %Y) becomes 'select i1 (icmp ugt
%X, %Y), %X, %Y', or in code:
Constant *SizeX = ConstantExpr::getSizeOf(Ty1);
Constant *SizeY = ConstantExpr::getSizeOf(Ty2);
Constant *GT = ConstantExpr::getICmp(ICmpInst::UGT, SizeX, SizeY);
Constant *Max = ConstantExpr::getSelect(GT, SizeX, SizeY);

> Your example can be represented as "struct { max([3xi32], [2xi8*],...)
> }", this approach will avoid deciding the size in front-ends. But again
> allowing TargetData.getTypeStorageSize() can compromise the
> architecture-neutrality goal.
>
>
> 2) Extracting the values from the union require pointer casting,
> which means that the union cannot be an SSA value - it has to have
> an address. This probably isn't a big issue in languages like C++
> which use unions infrequently, but other languages which use
> algebraic type systems might suffer a loss of performance due to the
> need to store union types in memory.
>
> Can mem2reg alleviate this problem?

If the memory is alloca'd then mem2reg should take care of it, yes. Note
that the constant expression needs to be resolved to a concrete number
at some point for this to take place, which in practise means that the
TargetData will need to be added and an instcombine run will need to
take place before mem2reg can do its work.

Nick
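
The compare-then-select form of max can be mirrored with plain integers, which makes it easy to see that the direction of the predicate matters. This is only an analogy for the constant expressions above; `selectUgt` and `selectUlt` are illustrative names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Compare with unsigned greater-than, then select: yields the maximum.
inline std::uint64_t selectUgt(std::uint64_t x, std::uint64_t y) {
    const bool cmp = x > y;   // icmp ugt %X, %Y
    return cmp ? x : y;       // select i1 %cmp, %X, %Y
}

// Same operand order with unsigned less-than: yields the minimum instead.
inline std::uint64_t selectUlt(std::uint64_t x, std::uint64_t y) {
    const bool cmp = x < y;   // icmp ult %X, %Y
    return cmp ? x : y;       // select i1 %cmp, %X, %Y
}

// Usage with the sizes from the earlier example (12 and 16 bytes).
inline void exampleSelect() {
    assert(selectUgt(12, 16) == 16);
    assert(selectUlt(12, 16) == 12);
}
```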

Rafael Espindola

Jul 21, 2010, 10:05:13 AM
to Reid Kleckner, Neal N. Wang, llvmdev@cs.uiuc.edu List
> Speaking of incomplete features, most LLVM frontends do not use the
> va_arg intrinsics, but they have not been cut.  Presumably they are
> useful for the same reason.

They are, but you still have the issue of va_list not being the same
type everywhere.

> Reid


Cheers,
--
Rafael Ávila de Espíndola

David A. Greene

Jul 28, 2010, 5:41:21 PM
to Reid Kleckner, Neal N. Wang, llvmdev@cs.uiuc.edu List
Reid Kleckner <reid.k...@gmail.com> writes:

> On Tue, Jul 20, 2010 at 2:46 PM, Talin <vir...@gmail.com> wrote:
>> 1) For frontends that generate target-agnostic code, it is difficult to
>> calculate how large this struct should be. (Which is larger, 3 int32s or two
>> pointers? You don't know unless your frontend knows the size of a pointer.)
>> In my case, I finally decided to abandon my goal of making my frontend
>> completely target-neutral. While it's relatively easy to write a frontend
>> that is 99% target-neutral with LLVM, that last 1% cannot be eliminated.
>
> +1 for more features that make it easier to generate target-agnostic
> IR, despite its difficulty.
>
> Speaking of incomplete features, most LLVM frontends do not use the
> va_arg intrinsics, but they have not been cut. Presumably they are
> useful for the same reason.

Our frontend uses va_arg. I have an implementation for x86_64 that I've
been meaning to send up (along with a bazillion other patches...).

-Dave

Chris Lattner

Aug 28, 2010, 12:10:43 AM
to Talin, Neal N. Wang, llvmdev@cs.uiuc.edu List
I removed unions from mainline in r112356.

-Chris

Erik de Castro Lopo

Sep 7, 2010, 9:22:14 AM
to llv...@cs.uiuc.edu
Chris Lattner wrote:

> I removed unions from mainline in r112356.

Sorry for reviving this old thread, but I think the removal of
unions is a real pity.

I use Haskell to generate LLVM code using David Terei's LLVM
code from the GHC compiler (the compiler I'm working on is also
written in Haskell). Once I've generated LLVM IR code I use llc
to generate object code.

I'm currently using llvm-2.7 and have been using unions, not being
aware that they are going to be removed. The use case is for
forcing field alignments in a packed struct to be correct for
32 and 64 bits. In particular I have a struct with an i32 tag
field followed by a pointer.

When generating 32 bit code the struct looks like:

<{ i32, pointer }>

and for 64 bit code:

<{ union { i32, i64 }, pointer }>

The nice thing about this is that in my LLVM code generator,
I have a function offsetOf which can get me the byte offset of
an element in a struct. In the case above,

offsetOf (1)

returns 4 when generating 32 bit code and 8 when generating 64
bit code.

If there's another way of guaranteeing struct alignment as well as
an easy way to get struct field offsets I'd like to hear of it.
Otherwise, I'd like to know what needs to be done to get unions
back in LLVM.

Cheers,
Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/
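
The offset arithmetic in this message can be sketched as follows. It is a guess at what a helper like offsetOf might compute, not Erik's actual Haskell code: in a packed struct a field's offset is the sum of the sizes of the fields before it, and a union is taken to be as large as its largest member. The names `packedOffsetOf` and `unionSize` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Byte offset of field `index` in a packed struct: with no padding,
// it is the sum of the sizes of all preceding fields.
inline std::size_t packedOffsetOf(const std::vector<std::size_t> &fieldSizes,
                                  std::size_t index) {
    std::size_t offset = 0;
    for (std::size_t i = 0; i < index; ++i) offset += fieldSizes[i];
    return offset;
}

// Size of a union: the size of its largest member (padding ignored).
// Precondition: `memberSizes` is not empty.
inline std::size_t unionSize(const std::vector<std::size_t> &memberSizes) {
    return *std::max_element(memberSizes.begin(), memberSizes.end());
}

// Usage: the pointer field keeps index 1 in both layouts.
inline void examplePackedOffsets() {
    // 32-bit layout: <{ i32, pointer }>
    assert(packedOffsetOf({4, 4}, 1) == 4);
    // 64-bit layout: <{ union { i32, i64 }, pointer }>
    assert(packedOffsetOf({unionSize({4, 8}), 8}, 1) == 8);
}
```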

Anton Korobeynikov

Sep 7, 2010, 10:36:24 AM
to llv...@cs.uiuc.edu
Hello, Erik

> Otherwise, I'd like to know what needs to be done to get unions
> back in LLVM.

Well, the answer is pretty easy: someone should "fix" them to be
supported throughout the whole set of libraries and become a
"maintainer".
Otherwise the feature, being unused, will quickly become broken.

--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

Renato Golin

Sep 7, 2010, 10:55:11 AM
to Anton Korobeynikov, llv...@cs.uiuc.edu
On 7 September 2010 15:36, Anton Korobeynikov <an...@korobeynikov.info> wrote:
> Otherwise the feature being unused will quickly became broken.

It was already broken for ages... :/

Even if you're not using the backends (or MC), having it in front-end
only will only confuse new users that will try to use it and hope it
just works (my case, a few months ago).

If there is nothing, you just work around it (by adding new features
to structs, if necessary) or re-create unions, depending on your
commitment to the union problem. Although having a union type would
be quite an improvement to IR readability, I really don't need it that
badly to write the whole back-end for it.

It's just a matter of priorities, unfortunately... :(

--
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

Talin

Sep 7, 2010, 11:16:42 AM
to Renato Golin, llv...@cs.uiuc.edu
Here's a suggestion - can we make the "union patch" (the inverse of the patch that removed unions) as a downloadable file so that people who are interested in finishing the work can do so?
--
-- Talin

Renato Golin

Sep 7, 2010, 11:38:33 AM
to Talin, llv...@cs.uiuc.edu
On 7 September 2010 16:16, Talin <vir...@gmail.com> wrote:
> Here's a suggestion - can we make the "union patch" (the inverse of the
> patch that removed unions) as a downloadable file so that people who are
> interested in finishing the work can do so?

The patch would degenerate quickly and become useless after a few
commits/releases.

Maybe high-level docs would be best, with the basic points like the
one on how to create a FunctionPass, but focused on how to create a
new Type. And a special section on the rationale about unions (and its
intrinsic problems with type sizes in front-end/back-end code
generation).

Chris Lattner

Sep 7, 2010, 11:40:20 AM
to Talin, llv...@cs.uiuc.edu
On Sep 7, 2010, at 8:16 AM, Talin wrote:

Here's a suggestion - can we make the "union patch" (the inverse of the patch that removed unions) as a downloadable file so that people who are interested in finishing the work can do so?

It already is.  I reverted it with one commit, so you can obtain that patch with 'svn diff'.

-Chris

Eli Friedman

Sep 7, 2010, 11:41:37 AM
to Talin, llv...@cs.uiuc.edu
On Tue, Sep 7, 2010 at 8:16 AM, Talin <vir...@gmail.com> wrote:
> Here's a suggestion - can we make the "union patch" (the inverse of the
> patch that removed unions) as a downloadable file so that people who are
> interested in finishing the work can do so?

Anyone who's really interested in working on it can just use "svn diff
-c 112356", and apply it with "patch -R".

-Eli

Erik de Castro Lopo

Sep 7, 2010, 5:24:15 PM
to llv...@cs.uiuc.edu
Eli Friedman wrote:

> On Tue, Sep 7, 2010 at 8:16 AM, Talin <vir...@gmail.com> wrote:
> > Here's a suggestion - can we make the "union patch" (the inverse of the
> > patch that removed unions) as a downloadable file so that people who are
> > interested in finishing the work can do so?
>
> Anyone who's really interested in working on it can just use "svn diff
> -c 112356", and apply it with "patch -R".

Well I tried that, the patch fails to reverse apply. Out of the
34 files touched by the patch, not a single hunk actually manages
to reverse apply.

Assuming I was to decide to embark on the effort of getting unions
back into LLVM:

a) What is required for them to be accepted back in?

b) What are the chances of getting them in the 2.8 release?

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Chris Lattner

Sep 7, 2010, 6:03:33 PM
to llv...@cs.uiuc.edu

On Sep 7, 2010, at 2:24 PM, Erik de Castro Lopo wrote:

> Eli Friedman wrote:
>
>> On Tue, Sep 7, 2010 at 8:16 AM, Talin <vir...@gmail.com> wrote:
>>> Here's a suggestion - can we make the "union patch" (the inverse of the
>>> patch that removed unions) as a downloadable file so that people who are
>>> interested in finishing the work can do so?
>>
>> Anyone who's really interested in working on it can just use "svn diff
>> -c 112356", and apply it with "patch -R".
>
> Well I tried that, the patch fails to reverse apply. Out of the
> 34 files touched by the patch, not a single hunk actually manages
> to reverse apply.
>
> Assuming I was to decide to embark on the effort of getting unions
> back into LLVM:
>
> a) What is required for them to be accepted back in?

It needs to work. When reverted, it was broken in almost all cases.

> b) What are the chances of getting them in the 2.8 release?

Zero.

-Chris

Erik de Castro Lopo

Sep 7, 2010, 7:31:52 PM
to Chris Lattner, llv...@cs.uiuc.edu
Chris Lattner wrote:

> > a) What is required for them to be accepted back in?
>
> It needs to work. When reverted, it was broken in almost all cases.

'It needs work' and 'it was broken' doesn't really give me an
idea of what specifically is required.

Specifically, what I am interested in is using unions within
packed structs to force alignment. Using unions like this was
the easiest and most reliable way of forcing specific alignment.
It made it really easy to calculate offsets in high level code
allowing me to completely ignore whether I was generating code
for 32 or 64 bits.

> > b) What are the chances of getting them in the 2.8 release?
>
> Zero.

So a feature, of which a subset was actually working (I know
this because I am using unions successfully in the compiler
I'm working on) in the 2.7 release and was documented on the
web site

http://llvm.org/releases/2.7/docs/LangRef.html#t_union

just gets yanked?

Are you really trying to tell me that anyone using LLVM in anger
needs to be running SVN HEAD and keep an eye on the mailing list
to make sure that features they use aren't going to get arbitrarily
yanked?

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Chris Lattner

Sep 7, 2010, 7:34:10 PM
to mle...@mega-nerd.com, llv...@cs.uiuc.edu
On Sep 7, 2010, at 4:31 PM, Erik de Castro Lopo wrote:
> Chris Lattner wrote:
>>> a) What is required for them to be accepted back in?
>>
>> It needs to work. When reverted, it was broken in almost all cases.
>
> 'It needs work' and 'it was broken' doesn't really give me an
> idea of what specifically is required.

There were numerous problems; basically any time someone tried to use it, it broke. I don't think codegen supported it at all, for example.

> Specifically, what I am interested in is using unions within
> packed structs to force alignment. Using unions like this was
> the easiest and most reliable way of forcing specific alignment.
> It made it really easy to calculate offsets in high level code
> allowing me to completely ignore whether I was generating code
> for 32 or 64 bits.

Lots of people agree it would be a useful feature; we are lacking a useful implementation :-)

>
>>> b) What are the chances of getting them in the 2.8 release?
>>
>> Zero.
>
> So a feature, of which a subset was actually working (I know
> this because I am using unions successfully in the compiler
> I'm working on) in the 2.7 release and was documented on the
> web site
>
> http://llvm.org/releases/2.7/docs/LangRef.html#t_union
>
> just gets yanked?

The 2.7 release notes:
http://llvm.org/releases/2.7/docs/ReleaseNotes.html

contained:

"LLVM 2.7 has pre-alpha support for unions in LLVM IR. Unfortunately, this support is not really usable in 2.7, so if you're interested in pushing it forward, please help contribute to LLVM mainline."

Several other features that were in 2.7, but unmaintained, also got removed.


2.8 has already branched for its release, so it is too late for new features. Bill's email made this very clear.

-Chris

Dan Gohman

Sep 7, 2010, 7:42:12 PM
to llv...@cs.uiuc.edu
On Sep 7, 2010, at 6:22 AM, Erik de Castro Lopo wrote:
>
> I'm currently use llvm-2.7 and have been using unions, not being
> aware that they are going to be removed. The use case is for
> forcing field alignments in a packed struct to be correct for
> 32 and 64 bits. In particular I have a struct with an i32 tag
> field followed by a pointer.
>
> When generating 32 bit code the struct looks like:
>
> <{ i32, pointer }>
>
> and for 64 bit code:
>
> <{ union { i32, i64 }, pointer }>
>
> The nice thing about this is that in my LLVM code generator,
> I have a function offsetOf which can get me the byte offset of
> and element in a struct. In the case above,
>
> offsetOf (1)
>
> returns 4 when generating 32 bit code and 8 when generating 64
> bit code.
>
> If there's another of guaranteeing struct alignment as well as
> and easy way to get struct field offsets I'd like hear of it.

If you want to make sure the pointer field is properly aligned, why not
just use a non-packed struct: { i32, pointer } ? Or if you really want
a packed struct, can you use <{ i32, i32, pointer }>, since you're
already emitting target-dependent IR anyway?

If you're computing the offset in order to use as a value
within the program, you can use ConstantExpr::getOffsetOf.
(If this isn't exposed in whatever bindings you're using,
that can be fixed.)

Dan
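
The difference between the two layouts Dan mentions can be observed with ordinary C++ structs. This is an analogy only: C++ struct layout is not LLVM struct layout, and `#pragma pack` is a compiler extension (supported by GCC, Clang and MSVC) used here to imitate a packed struct. The type names `Padded` and `Packed` are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Non-packed, like { i32, pointer }: the compiler inserts padding as
// needed so that `ptr` is naturally aligned on the current target.
struct Padded {
    std::int32_t tag;
    void *ptr;
};

// Packed, like <{ i32, pointer }>: no padding between fields.
#pragma pack(push, 1)
struct Packed {
    std::int32_t tag;
    void *ptr;
};
#pragma pack(pop)

// Usage: check where the pointer field lands in each layout.
inline void exampleStructOffsets() {
    // Holds on any target: the pointer sits at a multiple of its alignment.
    assert(offsetof(Padded, ptr) % alignof(void *) == 0);
    // Packed: the pointer directly follows the 4-byte tag.
    assert(offsetof(Packed, ptr) == sizeof(std::int32_t));
}
```

With the non-packed form the same source works for 32-bit and 64-bit targets without a union, at the cost of letting the compiler choose the padding.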

Erik de Castro Lopo

Sep 7, 2010, 8:35:23 PM
to llv...@cs.uiuc.edu
Dan Gohman wrote:

> If you want to make sure the pointer field is properly aligned, why not
> just use a non-packed struct: { i32, pointer } ? Or if you really want
> a packed struct, can you use <{ i32, i32, pointer }>, since you're
> already emitting target-dependent IR anyway?
>
> If you're computing the offset in order to use as a value
> within the program, you can use ConstantExpr::getOffsetOf.
> (If this isn't exposed in whatever bindings you're using,
> that can be fixed.)

The thing is, I am not using the LLVM library, I generate IR
code directly from Haskell (not using the library because it
makes bootstrapping the compiler I'm working on more difficult).

Secondly, my code, which currently works with llvm-2.7, assumes
packed structs and unions. Using

<{ i32, i32, pointer }>

won't work for me because the 32 bit version looks like:

<{ i32, pointer }>

and that means the pointer element will have different struct
indices depending on whether I'm compiling for 32 vs 64 bit
targets.

Another possible solution is using:

<{ i64, pointer }> ; 64 bit version
<{ i32, pointer }> ; 32 bit version

but then the first elements are of different types.

Generating IR code without the benefit of unions makes both
my generating code and the generated code far less readable. I'm going
to have to stick with version 2.7 and work on getting unions
back in 2.9.

Erik
--
----------------------------------------------------------------------
Erik de Castro Lopo
http://www.mega-nerd.com/

Chris Lattner

Sep 7, 2010, 8:43:20 PM
to llvmdev Dev
On Sep 7, 2010, at 5:35 PM, Erik de Castro Lopo wrote:
> Generating IR code without the benefit of unions, makes both
> me generating and generated code far less readable. I'm going
> to have to stick with version 2.7 and work on getting unions
> back in 2.9.

Great! I'm sure many people would really appreciate working union support in 2.9, thanks!

-Chris

Renato Golin

Sep 8, 2010, 4:47:53 AM
to mle...@mega-nerd.com, llv...@cs.uiuc.edu
On 8 September 2010 00:31, Erik de Castro Lopo <mle...@mega-nerd.com> wrote:
> 'It needs work' and 'it was broken' doesn't really give me an
> idea of what specifically is required.

Search for "union" in the list and you'll find lots of problems and
places to fix.

I remember that Constant doesn't support zero-initialize for unions,
so it was broken even before codegen. Also, there was no support at all
in the generic codegen, nor target specific (AFAIK), which is a major
factor for unions, since type sizes, alignment and ABIs play an important
role in how you'll represent unions in code. Also, given its dubious
status, I doubt there is anything related to that in MC.

I agree that the decision to remove it could be taken a bit less
lightly (maybe asking around first), but if you don't monitor the list
nor the change log (which said it would probably be removed), there
isn't much more to do.

If you want to support unions, you'll have to revert the commit,
extend support in IR (ex. implementing zero-initialize), and implement
the basic support in MC with at least one target (x86?). Even if you
don't use union types for generating code, people would come and think
there is support and report bugs on something that has zero
maintenance. In the end, if you're not willing to support "unions" as
a whole feature, it'll end up the same way as the previous
implementation.

I am one of the advocates that we must have unions and it's up to the
codegen to figure out (based on data layout + architecture + whatever
else) what to do with it. But I also have to be realist and accept the
fact that if I can't spare the time to implement it decently, I should
wait until someone does (I do hope that's you!). ;)

--
cheers,
--renato

http://systemcall.org/

Reclaim your digital rights, eliminate DRM, learn more at
http://www.defectivebydesign.org/what_is_drm

David Given

Sep 8, 2010, 7:03:18 AM
to llv...@cs.uiuc.edu
On 07/09/10 14:22, Erik de Castro Lopo wrote:
[...]

> When generating 32 bit code the struct looks like:
>
> <{ i32, pointer }>
>
> and for 64 bit code:
>
> <{ union { i32, i64 }, pointer }>

Surely LLVM will cause the first structure to be correctly aligned on
64-bit platforms by automatically inserting padding? Is explicit
alignment by the user really necessary?

--
┌─── dg@cowlark.com ───── http://www.cowlark.com ─────

│ life←{ ↑1 ⍵∨.^3 4=+/,¯1 0 1∘.⊖¯1 0 1∘.⌽⊂⍵ }
│ --- Conway's Game Of Life, in one line of APL


Jonas Maebe

Sep 8, 2010, 7:36:53 AM
to LLVM Developers Mailing List

On 08 Sep 2010, at 13:03, David Given wrote:

> On 07/09/10 14:22, Erik de Castro Lopo wrote:
> [...]
>> When generating 32 bit code the struct looks like:
>>
>> <{ i32, pointer }>
>>
>> and for 64 bit code:
>>
>> <{ union { i32, i64 }, pointer }>
>
> Surely LLVM will cause the first structure to be correctly aligned on
> 64-bit platforms by automatically inserting padding?

No, because it is declared as "packed" (the angle brackets).


Jonas

Erik de Castro Lopo

Sep 8, 2010, 7:49:00 AM
to llv...@cs.uiuc.edu
David Given wrote:

> On 07/09/10 14:22, Erik de Castro Lopo wrote:
> [...]
> > When generating 32 bit code the struct looks like:
> >
> > <{ i32, pointer }>
> >
> > and for 64 bit code:
> >
> > <{ union { i32, i64 }, pointer }>
>
> Surely LLVM will cause the first structure to be correctly aligned on
> 64-bit platforms by automatically inserting padding? Is explicit
> alignment by the user really necessary?

You missed the point. A struct defined with just braces like

{ i32, pointer }

may be padded. A struct defined with angle brackets and braces

<{ i32, pointer }>

is a packed struct, guaranteed not to have padding.

See:

http://llvm.org/docs/LangRef.html#t_pstruct

David Given

Sep 8, 2010, 8:18:15 AM
to llv...@cs.uiuc.edu
On 08/09/10 12:36, Jonas Maebe wrote:
[...]

>> Surely LLVM will cause the first structure to be correctly aligned on
>> 64-bit platforms by automatically inserting padding?
>
> No, because it is declared as "packed" (the angle brackets).

Ah, I wasn't aware of that. Ta.


Dan Gohman

Sep 8, 2010, 2:38:58 PM
to llvmdev Mailing List

On Sep 8, 2010, at 4:49 AM, Erik de Castro Lopo wrote:

> David Given wrote:
>
>> On 07/09/10 14:22, Erik de Castro Lopo wrote:
>> [...]
>>> When generating 32 bit code the struct looks like:
>>>
>>> <{ i32, pointer }>
>>>
>>> and for 64 bit code:
>>>
>>> <{ union { i32, i64 }, pointer }>
>>
>> Surely LLVM will cause the first structure to be correctly aligned on
>> 64-bit platforms by automatically inserting padding? Is explicit
>> alignment by the user really necessary?
>
> You missed the point. a struct defined with just parentheses like
>
> { i32, pointer }
>
> may be padded. A struct defined with angle brackets and parentheses
>
> <{ i32, pointer }>
>
> is a packed struct, guaranteed not to have padding.
>
> See:
>
> http://llvm.org/docs/LangRef.html#t_pstruct

From what you've described so far, regular non-packed structs
would be a much better approach than packed structs with unions,
even in LLVM versions which support unions. The output would be
more readable and you'd get better optimization.

I don't know how much work it would involve for you to change
whatever assumptions you have in your code about packed structs, but
if you're considering doing work in LLVM to re-introduce and finish
unions, it's something to consider.

Dan
