[LLVMdev] How to prevent insertion of memcpy()

Dmitry Vyukov

May 29, 2012, 9:59:08 AM
to llvmdev
Hi,

I have the following program:

// test.c
#include <stdlib.h>
struct foo_t {
  int x[1024];
};
__thread struct foo_t g_foo;
void bar(struct foo_t* foo) {
  g_foo = *foo;
}
int main() {
  struct foo_t* f = (struct foo_t*)malloc(sizeof(struct foo_t));
  bar(f);
  return 0;
}

When I compile it with clang I see that it inserts memcpy() in function bar():

$ clang -v
clang version 3.2 (trunk 157390)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang test.c -g && objdump -dCS a.out

void bar(struct foo_t* foo) {
  4005b0:       55                      push   %rbp
  4005b1:       48 89 e5                mov    %rsp,%rbp
  4005b4:       48 83 ec 10             sub    $0x10,%rsp
  4005b8:       48 89 7d f8             mov    %rdi,-0x8(%rbp)
  g_foo = *foo;
  4005bc:       48 8b 7d f8             mov    -0x8(%rbp),%rdi
  4005c0:       64 48 8b 04 25 00 00    mov    %fs:0x0,%rax
  4005c7:       00 00 
  4005c9:       48 8d 80 00 f0 ff ff    lea    -0x1000(%rax),%rax
  4005d0:       ba 00 10 00 00          mov    $0x1000,%edx
  4005d5:       48 89 7d f0             mov    %rdi,-0x10(%rbp)
  4005d9:       48 89 c7                mov    %rax,%rdi
  4005dc:       48 8b 75 f0             mov    -0x10(%rbp),%rsi
  4005e0:       e8 c3 fe ff ff          callq  4004a8 <memcpy@plt>
}
  4005e5:       48 83 c4 10             add    $0x10,%rsp
  4005e9:       5d                      pop    %rbp
  4005ea:       c3                      retq   
  4005eb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

How do I disable that feature? I've tried -fno-builtin and/or -ffreestanding with no success.
TIA

Anton Korobeynikov

May 29, 2012, 12:46:55 PM
to Dmitry Vyukov, llvmdev
> How do I disable that feature? I've tried -fno-builtin and/or -ffreestanding
> with no success.
clang (as well as gcc) requires that a freestanding environment provide
memcpy, memmove, memset and memcmp.

PS: Consider emailing cfe-dev, not llvmdev.
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University

_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Dmitry Vyukov

May 29, 2012, 12:52:15 PM
to Anton Korobeynikov, llvmdev
Hi,

Thanks. I've emailed cfe-dev.
We absolutely need clang/llvm to not insert the calls into our code.

Duncan Sands

May 29, 2012, 1:10:05 PM
to llv...@cs.uiuc.edu
Hi Dmitry,

> We absolutely need clang/llvm to not insert the calls into our code.

why is that?

Ciao, Duncan.

Chandler Carruth

May 29, 2012, 1:16:06 PM
to Dmitry Vyukov, llvmdev
This really isn't possible.

The C++ standard essentially requires the compiler to insert calls to memcpy for certain code patterns.

What do you really need here? Clearly you have some way of handling when the user writes memcpy; what is different about Clang or LLVM inserting memcpy? 

Dmitry Vyukov

May 29, 2012, 1:28:58 PM
to Chandler Carruth, llvmdev
I need it for the ThreadSanitizer runtime. In particular
line 1238. But I had similar problems in other places.
Both memory access processing and signal handling are quite tricky; we can't allow recursion.

Chandler Carruth

May 29, 2012, 1:40:04 PM
to Dmitry Vyukov, llvmdev
The first thing to think about is that you *do* need to use -fno-builtin / -ffreestanding when compiling the runtime because it provides its own implementations of memcpy.

The second is that there is no way to write fully generic C++ code w/o inserting calls to memcpy. =/ If you are writing your memcpy implementation, you'll have to go to great lengths to use C constructs that are guaranteed to not cause this behavior, or to manually call an un-instrumented memcpy implementation. I don't know of any easy ways around this. 
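
As a rough illustration of "manually call an un-instrumented memcpy implementation", the struct assignment from the original test.c can be spelled out as a call to a copy routine that lives inside the runtime. This is only a sketch: the name internal_memcpy is an assumption, and the volatile-qualified pointers are one possible way to discourage the optimizer from turning the loop back into a memcpy call. Whether a given compiler version leaves the loop alone should be checked in the disassembly.

```c
#include <stddef.h>

struct foo_t {
  int x[1024];
};

/* Hypothetical runtime-private copy routine (the name is an assumption).
   Volatile accesses may not be merged by the compiler, so this loop is not
   a candidate for being rewritten into a library memcpy call. It is also
   slow: one byte per iteration. */
static void internal_memcpy(void *dst, const void *src, size_t n) {
  volatile unsigned char *d = (volatile unsigned char *)dst;
  const volatile unsigned char *s = (const volatile unsigned char *)src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
}

static __thread struct foo_t g_foo;

/* Same effect as `g_foo = *foo;`, but the copy is written out explicitly
   instead of being left to the compiler's aggregate-assignment lowering. */
void bar(const struct foo_t *foo) {
  internal_memcpy(&g_foo, foo, sizeof(g_foo));
}
```

A real runtime would most likely copy word-sized chunks rather than bytes; the byte loop just keeps the sketch short.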

Dmitry Vyukov

May 29, 2012, 1:46:14 PM
to Chandler Carruth, llvmdev
> The first thing to think about is that you *do* need to use -fno-builtin / -ffreestanding when compiling the runtime because it provides its own implementations of memcpy.

We have used both at various points, but they do not solve the problem. I think we use -fno-builtin now; I am not sure about -ffreestanding.

> The second is that there is no way to write fully generic C++ code w/o inserting calls to memcpy. =/ If you are writing your memcpy implementation, you'll have to go to great lengths to use C constructs that are guaranteed to not cause this behavior, or to manually call an un-instrumented memcpy implementation. I don't know of any easy ways around this.

What are these magic constructs? I had problems with both struct copies and for loops.

Chandler Carruth

May 29, 2012, 1:50:35 PM
to Dmitry Vyukov, llvmdev
Don't copy things by value ever. =/ It is really, *really* hard to do this. If at all possible, I would build your runtime against an un-instrumented memcpy (perhaps defined within the runtime), and then use aliases or other techniques to wrap the instrumented functions in the exported names necessary for use when intercepting memcpy calls from the instrumented program.
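
The alias idea might look roughly like the sketch below. All names are hypothetical, and the exported symbol is called demo_memcpy rather than memcpy so that the example does not clash with libc. The bookkeeping counter merely stands in for whatever the instrumented entry point would record. The alias attribute is a GCC/Clang extension that, as far as I know, requires a target with alias support such as ELF.

```c
#include <stddef.h>

/* Un-instrumented copy, used by the runtime itself (hypothetical name). */
void *rt_plain_memcpy(void *dst, const void *src, size_t n) {
  volatile unsigned char *d = (volatile unsigned char *)dst;
  const volatile unsigned char *s = (const volatile unsigned char *)src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Stand-in for the runtime's bookkeeping (an assumption for this sketch). */
static size_t rt_bytes_seen;

/* Instrumented entry point: record the access, then delegate to the
   un-instrumented copy so the runtime never re-enters itself. */
void *rt_instrumented_memcpy(void *dst, const void *src, size_t n) {
  rt_bytes_seen += n;
  return rt_plain_memcpy(dst, src, n);
}

/* Export the instrumented function under the name that the instrumented
   program calls. In a real runtime this would be the intercepted symbol. */
void *demo_memcpy(void *dst, const void *src, size_t n)
    __attribute__((alias("rt_instrumented_memcpy")));
```

Code inside the runtime calls rt_plain_memcpy directly, while the instrumented program only ever sees the exported alias.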

Dmitry Vyukov

May 29, 2012, 1:56:59 PM
to Chandler Carruth, llvmdev
> Don't copy things by value ever. =/ It is really, *really* hard to do this.

Do you mean 'don't do struct copies'? Are there other problems aside from implicit memcpy calls?

> If at all possible, I would build your runtime against an un-instrumented memcpy (perhaps defined within the runtime), and then use aliases or other techniques to wrap the instrumented functions in the exported names necessary for use when intercepting memcpy calls from the instrumented program.

I am not sure I understand it.
We can't afford function calls scattered at random places. It will cost 30% of performance or so.

Chandler Carruth

May 29, 2012, 2:02:16 PM
to Dmitry Vyukov, llvmdev
> Do you mean 'don't do struct copies'? Are there other problems aside from implicit memcpy calls?

Don't do copies outside of a restricted set of primitive types (sizeof(T) <= sizeof(T*) would be my rule of thumb, but there is no hard-and-fast rule here to avoid these problems).

> We can't afford function calls scattered at random places. It will cost 30% of performance or so.

These won't end up actually being function calls... Clang lowers them to 'memcpy', and LLVM will try to lower them to actual loads and stores where possible.

We should discuss these issues separately though:

1) Get the runtime working w/o worrying about memcpy being inserted or not by having a clear barrier between instrumented functions and non-instrumented functions, and making the non-instrumented ones available when compiling and linking the runtime, but not when compiling / linking the instrumented program.

2) Deal with any performance fallout of the thusly built runtime. We can fix the LLVM optimizers until they generate the optimal code. =]
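
The rule of thumb from this message (only copy by value when sizeof(T) <= sizeof(T*)) can be turned into a compile-time check. The macro and type names below are made up for illustration, and the check enforces only the rule of thumb; as noted in the message, it is not a guarantee that no memcpy call is ever emitted.

```c
#include <stddef.h>

/* Hypothetical helper: refuse to compile if T is larger than a pointer,
   i.e. if copying it by value falls outside the rule of thumb. */
#define RT_ASSERT_CHEAP_COPY(T) \
  _Static_assert(sizeof(T) <= sizeof(void *), \
                 "type is too large to copy by value in the runtime")

/* A small handle type that passes the check (name is an assumption). */
typedef struct {
  unsigned tid;
} rt_thread_id;

RT_ASSERT_CHEAP_COPY(rt_thread_id);

/* Returning the handle by value is a copy of a pointer-sized-or-smaller
   object, which is the kind of copy the rule of thumb allows. */
rt_thread_id rt_copy_id(const rt_thread_id *p) {
  return *p;
}
```

Applying the macro to the 4096-byte struct foo_t from the original test.c would fail to compile, which is the point: such a type would have to be copied through an explicit runtime routine instead.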

Jeffrey Yasskin

May 29, 2012, 2:11:37 PM
to Chandler Carruth, Dmitry Vyukov, llvmdev
There are some other platforms that absolutely can't tolerate function
calls. Do they have an attribute or pass to tell LLVM to inline any
functions it or clang inserts? Could Dmitry do the same thing?

Chandler Carruth

May 29, 2012, 2:14:17 PM
to Jeffrey Yasskin, Dmitry Vyukov, llvmdev
Yes, there are attributes which can be attached to the non-instrumented memcpy function, provided by the runtime and selected due to -ffreestanding, which will force inlining. __attribute__((always_inline)), __attribute__((flatten)). I suspect we don't correctly support the latter in Clang/LLVM, but that's clearly a missing feature we should fix. 
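
A minimal sketch of the always_inline variant, with hypothetical names throughout. The attribute goes on the runtime's own copy helper so that each use is expanded in place. Note that flatten is different: it is attached to the caller and asks for the calls inside that function to be inlined, and the message above suspects it was not yet properly supported in Clang/LLVM at the time.

```c
#include <stddef.h>

/* Hypothetical runtime-private copy. always_inline asks GCC/Clang to inline
   it at every call site, so no call to this helper remains in the output. */
static inline __attribute__((always_inline)) void *
rt_inline_memcpy(void *dst, const void *src, size_t n) {
  unsigned char *d = (unsigned char *)dst;
  const unsigned char *s = (const unsigned char *)src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Small fixed-size record (an assumption for this sketch). */
struct shadow_cell {
  unsigned long words[4];
};

/* With a constant size the inlined loop has a known trip count, which gives
   the optimizer a chance to reduce it to plain loads and stores. */
void rt_store_cell(struct shadow_cell *dst, const struct shadow_cell *src) {
  rt_inline_memcpy(dst, src, sizeof(*dst));
}
```

The attribute only removes the call to the helper itself. Whether the optimizer then rewrites the inlined loop into a memcpy call is a separate question and is worth checking in the generated code.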

Chandler Carruth

May 29, 2012, 2:15:55 PM
to Jeffrey Yasskin, Dmitry Vyukov, llvmdev
But to harp on it a bit because this is on 'llvmdev' and is of general interest: please don't use these to fix *performance* problems without first filing a bug against LLVM's optimizers for why it was necessary. In an ideal world these should only be used where there is a platform/ABI/debugging/etc contract that no function calls occur.

Dmitry Vyukov

May 30, 2012, 2:57:09 AM
to Chandler Carruth, llvmdev
Thanks, I will try this.