[llvm-dev] Possible bug with struct types and linking

26 views
Skip to first unread message

Tim Armstrong via llvm-dev

unread,
Mar 24, 2016, 12:20:37 PM3/24/16
to llvm...@lists.llvm.org
Hi All,

  I ran into what I think is a linker bug around structural uniquing of types when trying to make a long-overdue upgrade from LLVM 3.3 to LLVM 3.8.0. Verification can fail after linking together two well-formed modules due to mismatched pointer types.

What happens is that both of the types in the source module get mapped to the same type in the destination module. The old behaviour was to map each type to the type in the destination with matching name and structure. This means that the type signatures of functions called from the Dst module can change, causing verification failures, e.g. if a call instruction originally from Dst calls a function in Src with the wrong typed pointer.

Dst Module
========
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }

<forward declarations of IdentityTiny and IdentityBool>

<functions that call IdentityTiny and IdentityBool>

Src Module
========
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }

define i16 @IdentityTiny(%"struct.impala_udf::TinyIntVal"* nocapture readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

define i16 @IdentityBool(%"struct.impala_udf::BooleanVal"* nocapture readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::BooleanVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

Combined Module
==============
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }

<functions that call IdentityTiny and IdentityBool with TinyIntVal* and BooleanVal* args>

define i16 @IdentityTiny(%"struct.impala_udf::TinyIntVal"* nocapture readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

; Function signature changed =>
define i16 @IdentityBool(%"struct.impala_udf::TinyIntVal"* nocapture readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

This seems like a bug to me, I wouldn't expect this scenario to result in malformed modules but it's not really clear what the expected behaviour of the linker is in situations like this.

It seems like this hasn't been hit by the LLVM linker since it starts with an empty module and links in all non-empty modules, so all non-opaque struct types in the destination module are already structurally unique. In our case we are starting with a main module then adding in IRBuilder code and code from other modules.

I have a fix and I'm happy to put together a patch with regression tests, but I wanted to check that this would be considered useful, given that it doesn't seem to be a common use-case for LLVM and the planned work on typeless pointers will make it irrelevant.

- Tim

Rafael Espíndola

unread,
Mar 24, 2016, 1:08:31 PM3/24/16
to Tim Armstrong, llvm-dev
On 24 March 2016 at 12:18, Tim Armstrong via llvm-dev

I would be interested in seeing the fix and full testcase.

It has been a long time since I looked at the type merging, but I
would expect it to work for this case.

Cheers,
Rafael
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Tim Armstrong via llvm-dev

unread,
Mar 24, 2016, 1:37:16 PM3/24/16
to Rafael Espíndola, llvm-dev
Thanks for the quick response Rafael, that's what I hoped to hear - I'll see what I can do!

The original repro is ~150000 lines of IR so I'll see if I can reduce it :)

- Tim

Sean Silva via llvm-dev

unread,
Mar 24, 2016, 6:10:18 PM3/24/16
to Tim Armstrong, llvm-dev
On Thu, Mar 24, 2016 at 10:36 AM, Tim Armstrong via llvm-dev <llvm...@lists.llvm.org> wrote:
Thanks for the quick response Rafael, that's what I hoped to hear - I'll see what I can do!

The original repro is ~150000 lines of IR so I'll see if I can reduce it :)

Tim Armstrong via llvm-dev

unread,
Apr 1, 2016, 2:20:46 AM4/1/16
to Sean Silva, llvm-dev
I was able to get a relatively simple reproduction of the issue and posted it along with a fix: http://reviews.llvm.org/D18683

Unfortunately I had to resort to writing a unit test rather than using the normal test infrastructure since llvm-link links modules in a way that avoids the bug.
Reply all
Reply to author
Forward
0 new messages