Segmentation fault with address sanitiser and lli

Richmond

Apr 12, 2021, 5:52:28 AM

I have written this simple program which is intended to fail with
segmentation fault (although it doesn't necessarily).

int main()
{
    int j ;
    long arr[2]; // declare an array of integers
    for (j = 0 ; j <= 100 ; j = j + 1 )
    {
        arr[j] = j ;
        printf("%10d",j);
    }
}

If I compile this to produce llvm assembler:

clang -S -emit-llvm count.c -fsanitize=address

Then interpret it:

lli count.ll
Stack dump:
0. Program arguments: lli count.ll
#0 0x00007fd939fdaf2f llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cef2f)
#1 0x00007fd939fd9460 llvm::sys::RunSignalHandlers() (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cd460)
#2 0x00007fd939fdb242 (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cf242)
#3 0x00007fd9395fd730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730)
Segmentation fault

Whereas:

clang -S -emit-llvm count.c

Produces a program which goes around in a loop.

I find this all puzzling. I am not sure why interpreted LLVM assembler
should ever produce a segmentation fault, particularly with the sanitiser,
unless there is a bug in lli or the sanitiser code?

Barry Schwarz

Apr 12, 2021, 1:43:32 PM

On Mon, 12 Apr 2021 10:52:16 +0100, Richmond <rich...@criptext.com>
wrote:

Once you cause undefined behavior (by accessing beyond the bounds of
your array), there is no expectation of any particular behavior nor is
there any expectation of consistency in that behavior. Anything that
can happen can happen.

--
Remove del for email

Richmond

Apr 12, 2021, 2:58:11 PM

I thought the address sanitiser was supposed to stop it.

Scott Lurndal

Apr 12, 2021, 3:02:55 PM

Richmond <rich...@criptext.com> writes:
>I have written this simple program which is intended to fail with
>segmentation fault (although it doesn't necessarily).
>
>int main()
>{
> int j ;
> long arr[2]; // declare an array of integers
> for (j = 0 ; j <= 100 ; j = j + 1 )
> {
> arr[j] = j ;
> printf("%10d",j);
> }
>}
>

Note that you've placed your 'arr' variable on the stack.

The operating system allocates a fairly large stack for the process, much
larger than the 128 bits required by your insufficiently sized 'arr'
variable.

Thus, it's likely that you'll be able to address up to
(4096 / 8) = 512 'arr' entries before receiving a SIGSEGV
(even more if the OS automatically adds to the stack when
you attempt to access the next page). Unix/Linux will allocate
up to RLIMIT_STACK (8Mbytes) before you'll see a stack overflow lead to
a segmentation violation.

$ ulimit -a
address space limit (Kibytes) (-M) unlimited
data size (Kibytes) (-d) unlimited
max memory size (Kibytes) (-m) unlimited
stack size (Kibytes) (-s) 8192
process size (Kibytes) (-v) unlimited

(It's quite likely that the printf() call in the loop is corrupting
the values in arr[3] through arr[99] every time it is called due to
the way that the stack activation records are laid out).

Richmond

Apr 13, 2021, 8:18:11 AM

If I compile it like this:

clang count.c -o count -fsanitize=address

I then get an immediate trapped error as expected:

=================================================================
==1489==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffd66efd930 at pc 0x0000004f43fc bp 0x7ffd66efd8f0 sp 0x7ffd66efd8e8
WRITE of size 8 at 0x7ffd66efd930 thread T0

So I would expect this:

clang -S -emit-llvm count.c -fsanitize=address

also to trap the error, especially if I run it like this:

lli count.ll

Because I am interpreting assembler, so there is no reason to seg
fault.

Stack dump:
0. Program arguments: lli count.ll

#0 0x00007ff499f7cf2f llvm::sys::PrintStackTrace(llvm::raw_ostream&) (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cef2f)
#1 0x00007ff499f7b460 llvm::sys::RunSignalHandlers() (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cd460)
#2 0x00007ff499f7d242 (/lib/x86_64-linux-gnu/libLLVM-7.so.1+0x9cf242)
#3 0x00007ff49959f730 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12730)
Segmentation fault

Malcolm McLean

Apr 13, 2021, 9:44:48 AM

On Monday, 12 April 2021 at 19:58:11 UTC+1, Richmond wrote:

> Barry Schwarz <schw...@delq.com> writes:
>
> > Once you cause undefined behavior (by accessing beyond the bounds of
> > your array), there is no expectation of any particular behavior nor is
> > there any expectation of consistency in that behavior. Anything that
> > can happen can happen.
> I thought the address sanitiser was supposed to stop it.
>

"Undefined behaviour" means two things. It means that the C standard imposes
no constraint on what the program does after it executes the instruction. And
it means that such a program is considered to be erroneous. However it doesn't
mean that the program exists in some sort of philosophical state of indefinition.
This has historically caused some misunderstanding on this newsgroup. Other
factors than the C standard are allowed to impose a behaviour. In fact that is partly
why we say "the behaviour is undefined" rather than "It writes a byte to address one
past the array". A good operating system will say "this means that the program will
exit with an error message". That's the best thing it can do, in the common situation
that no results are better than the wrong results.

Your address sanitiser should give a diagnostic for every illegal array access. If it
doesn't, it's not a very good address sanitiser. It could be bugged, or it could be that
it's hard to make it work in all circumstances with the compiler and OS, and it's
been shipped with known deficiencies.