GNU compiler g++ most extreme debugging

Frederick Virchanza Gotham

Jun 23, 2022, 7:46:34 AM

A week ago here in comp.lang.c++ I started a thread entitled "Can any tool catch this invalid memory access?".

One or two people said I should use valgrind, but valgrind is useless at detecting invalid access to static data or stack data. Valgrind is only effective at detecting invalid access to the heap.

Then I tried using g++ with the command line option "-fsanitize", and it works very well: it successfully flagged an invalid access to an array on the stack.

Just today I tried "-D_GLIBCXX_DEBUG", and I'm actually a little mesmerized at how effective it is. It catches *everything*, especially invalid access to STL containers and also classes such as string_view.

So right now here's how I'm building my program:

g++ -o precompiler precompiler.cpp -std=c++20 -ggdb3 -D_GLIBCXX_DEBUG -fsanitize=address,leak,undefined -fsanitize=pointer-compare -fsanitize=pointer-subtract -fstack-protector-all

This is working very well for me, but if anyone has any more suggestions for even more extreme debugging, I'm all ears.
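
A note on the command above: according to the GCC documentation, the checks behind "-fsanitize=pointer-compare" and "-fsanitize=pointer-subtract" are disabled at run time unless detect_invalid_pointer_pairs=2 is added to the ASAN_OPTIONS environment variable. A minimal sketch of baking that default into the program itself, assuming the AddressSanitizer runtime in use honours the __asan_default_options hook:

```cpp
// Sketch only: the AddressSanitizer runtime looks for this function at
// startup and treats the returned string as default ASAN_OPTIONS.
// Without -fsanitize=address it is just an ordinary, unused function.
extern "C" const char *__asan_default_options()
{
    // Assumption: switch on the run-time half of the
    // pointer-compare / pointer-subtract checks.
    return "detect_invalid_pointer_pairs=2";
}
```

Setting ASAN_OPTIONS=detect_invalid_pointer_pairs=2 in the environment before running the program should achieve the same thing.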

David Brown

Jun 23, 2022, 8:18:22 AM

Some problems can only be caught at run time, and you've got a good
selection of flags and tools for that.

But it is even better to catch problems at compile time. Make good use
of the compiler's static warning facilities. At a minimum, enable
"-Wall" and optimisation of at least "-O1". (With no optimisation, less
code analysis is done and fewer errors are spotted.) You should also
try "-Wextra", though some of the flags enabled there are controversial.

<https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html>


The new gcc "static analyzer" may also be helpful, depending on your code :

<https://gcc.gnu.org/onlinedocs/gcc/Static-Analyzer-Options.html>

Frederick Virchanza Gotham

Jun 23, 2022, 9:44:44 AM

On Thursday, June 23, 2022 at 12:46:34 PM UTC+1, I wrote:

> This is working very well for me, but if anyone has any more suggestions for even more extreme debugging, I'm all ears.


I'm still trying to find a debugger that will flag the error in the following code. I've written a function called "Increment_First_And_Print_Without_First" which takes an input such as the word "brush", and it prints out:

rush
crush

Inside the body of "Increment_First_And_Print_Without_First", I dereference an iterator to one-past-the-last, which is undefined behaviour. Here's how I build the following program:

g++ -o iterator_deref iterator_deref.cpp -std=c++20 -ggdb3 -D_GLIBCXX_DEBUG -fsanitize=address,leak,undefined -fsanitize=pointer-compare -fsanitize=pointer-subtract -fstack-protector-all

When I run the program, no error is flagged. No error is flagged in 'gdb' either. I haven't found any debugging tool that can find this error.

Here's the code:

#include <iostream>
#include <string>
#include <string_view>
#include <type_traits>

using namespace std;

void Increment_First_And_Print_Without_First(string &s)
{
    if ( s.size() < 2u ) throw -1;

    ++( s[0u] );

    cout << "The next line dereferences an end() pointer" << endl;
    // The next line has an invalid dereferencing of an iterator
    cout << string_view( &*(s.cbegin() + 1u), &*(s.cend()) ) << endl;
}

int main(void)
{
    cout << "string::const_iterator is "
         << (is_same_v< string::const_iterator, char const * > ? "just a raw pointer" : "NOT a simple pointer") << endl;

    cout << "string_view::const_iterator is "
         << (is_same_v< string_view::const_iterator, char const * > ? "just a raw pointer" : "NOT a simple pointer") << endl;

    string str("brush");

    Increment_First_And_Print_Without_First(str);

    cout << str << endl;
}

On my x86_64 Ubuntu PC here, the output I get is:

string::const_iterator is NOT a simple pointer
string_view::const_iterator is just a raw pointer
The next line dereferences an end() pointer
rush
crush
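
For comparison, the same output can be produced without ever forming "*s.cend()". A minimal sketch, using a hypothetical helper name "Without_First" and string_view::substr; it assumes the caller keeps the string alive for as long as the view is in use:

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: a view of everything after the first character.
std::string_view Without_First(const std::string &s)
{
    if (s.size() < 2u) throw std::invalid_argument("need at least two characters");

    // substr(1) stays inside [begin, end) and never dereferences end().
    return std::string_view(s).substr(1u);
}
```

Printing "Without_First(s)" after incrementing s[0] gives the "rush" line without touching the end iterator at all.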

Frederick Virchanza Gotham

Jun 23, 2022, 10:02:12 AM

On Thursday, June 23, 2022 at 2:44:44 PM UTC+1, Frederick Virchanza Gotham wrote:

> When I run the program, no error is flagged. No error is flagged in 'gdb' either. I haven't found any debugging tool that can find this error.


Actually here's a smaller simpler program:

#include <iostream>
#include <string>
#include <string_view>
#include <type_traits>

using namespace std;

int main(void)
{
    cout << "string::const_iterator is "
         << (is_same_v< string::const_iterator, char const * > ? "just a raw pointer" : "NOT a simple pointer") << endl;

    cout << "string_view::const_iterator is "
         << (is_same_v< string_view::const_iterator, char const * > ? "just a raw pointer" : "NOT a simple pointer") << endl;

    string s("brush");

    cout << string_view( &*(s.cbegin() + 1u), &*(s.cend() + 876u) ) << endl;
}


On my x86_64 Ubuntu Linux PC, this prints garbage but doesn't crash. I was hoping "-fsanitize" would catch the "+876" on the last line.

David Brown

Jun 23, 2022, 10:03:31 AM

On 23/06/2022 15:44, Frederick Virchanza Gotham wrote:
> On Thursday, June 23, 2022 at 12:46:34 PM UTC+1, I wrote:
>
>> This is working very well for me, but if anyone has any more suggestions for even more extreme debugging, I'm all ears.
>
>
> I'm still trying to find a debugger that will flag the error in the following code. I've written a function called "Increment_First_And_Print_Without_First" which takes an input such as the word "brush", and it prints out:
>
> rush
> crush
>
> Inside the body of "Increment_First_And_Print_Without_First", I dereference an iterator to one-past-the-last, which is undefined behaviour. Here's how I build the following program:
>
> g++ -o iterator_deref iterator_deref.cpp -std=c++20 -ggdb3 -D_GLIBCXX_DEBUG -fsanitize=address,leak,undefined -fsanitize=pointer-compare -fsanitize=pointer-subtract -fstack-protector-all
>
> When I run the program, no error is flagged. No error is flagged in 'gdb' either. I haven't found any debugging tool that can find this error.
>

You are dereferencing the iterator, then taking its address. In C,
"&*x" is treated exactly as "x" except for constraint checking and
making the result an lvalue. So it is defined behaviour, even if "x"
were a null pointer. I don't see anything matching that (section
6.5.3.2p3) in the C++ standards. But it is possible that the "&*" pair
gets removed early in the compilation process, and is not seen by any of
the sanitizers or checkers. After all, you are not really dereferencing
the iterator - you are taking the address of the dereference.

Frederick Virchanza Gotham

Jun 23, 2022, 10:47:46 AM

On Thursday, June 23, 2022 at 3:03:31 PM UTC+1, David Brown wrote:

> You are dereferencing the iterator, then taking its address.


The expression, "s.cend()", is of type string::const_iterator, which is not a simple pointer (or at least it's not a simple pointer on my g++ compiler).


> In C, "&*x" is treated exactly as "x" except for constraint checking and
> making the result an lvalue.


I think you mean R-value.

I'm only concerned with C++ right now. Is there anything in the C++ standard that says you can dereference an invalid pointer so long as you immediately take the address of the pointed-to thing? Is the following program valid in C++20?

int main(void)
{
    char *p = nullptr;

    &*p;
}

By the way "&*" has an observable effect when used on an array, for example:

Define an array: char buf[64];

The expression "buf" is an L-value of type "char[64]"

The expression "&*buf" is an R-value of type "char*"


> So it is defined behaviour, even if "x"
> were a null pointer. I don't see anything matching that (section
> 6.5.3.2p3) in the C++ standards. But it is possible that the "&*" pair
> gets removed early in the compilation process, and is not seen by any of
> the sanitizers or checkers. After all, you are not really dereferencing
> the iterator


Yes I am dereferencing the iterator. No doubt about it -- the iterator is getting dereferenced.


> you are taking the address of the dereference.


This also happens.

Juha Nieminen

Jun 23, 2022, 11:04:53 AM

Frederick Virchanza Gotham <cauldwel...@gmail.com> wrote:
> One or two people said I should use valgrind, but valgrind is useless at detecting invalid access to static data or stack data. Valgrind is only effective at detecting invalid access to the heap.

While calling valgrind "useless" is technically correct in this context,
to catch this kind of mistake, I think it's quite a strong word with quite
a negative connotation, which I feel is a bit unfair towards the program.
The program itself is great for debugging the things that it does support.
It can't catch (most) out-of-bounds accesses on the stack because that's
quite literally impossible (without the help of the compiler), but you
merely need to be aware of that.

I think it would be nicer to just say that "valgrind can't detect these
types of error", and not dismiss its usefulness in the things it has
actually been designed to test (ie. heap allocations and accesses).

> Just today I tried "-D_GLIBCXX_DEBUG", and I'm actually a little mesmerized at how effective it is. It catches *everything*, especially invalid access to STL containers and also classes such as string_view.

To be more precise, it catches (most) wrong uses of the standard library.
(That's what the "GLIBCXX" is referring to. It's a macro used in the standard
library implementation used by gcc.)

It won't catch errors that are not related to the standard library
utilities.

Juha Nieminen

Jun 23, 2022, 11:08:12 AM

Frederick Virchanza Gotham <cauldwel...@gmail.com> wrote:
> string s("brush");
>
> cout << string_view( &*(s.cbegin() + 1u), &*(s.cend() + 876u) ) << endl;
> }
>
> On my x86_64 Ubuntu Linux PC, this prints garbage but doesn't crash. I was hoping "-fsanitize" would catch the "+876" on the last line.

Have you tried to run the program through valgrind? Because it has been
designed precisely for that (you are accessing dynamically allocated
memory out of bounds).

David Brown

Jun 23, 2022, 12:26:24 PM

On 23/06/2022 16:47, Frederick Virchanza Gotham wrote:
> On Thursday, June 23, 2022 at 3:03:31 PM UTC+1, David Brown wrote:
>
>> You are dereferencing the iterator, then taking its address.
>
>
> The expression, "s.cend()", is of type string::const_iterator, which is not a simple pointer (or at least it's not a simple pointer on my g++ compiler).
>
>

Yes. Despite that, I suspect that the "&*" can be turned into
calculating an address, without actually dereferencing it. And if the
generated code does not dereference it, memory access checkers will have
a hard time spotting the problem.

>> In C, "&*x" is treated exactly as "x" except for constraint checking and
>> making the result an lvalue.
>
>
> I think you mean R-value.

I actually meant to write /not/ an lvalue, which is basically the same
thing in C (but "not an lvalue" is the phrase used by the standard).
Sorry for missing that!

>
> I'm only concerned with C++ right now. Is there anything in the C++ standard that says you can dereference an invalid pointer so long as you immediately take the address of the pointed-to thing? Is the following program valid in C++20?
>
> int main(void)
> {
> char *p = nullptr;
>
> &*p;
> }

I could not find anything to that effect, but the C++ standards are big
- and they change faster than I read them! Maybe someone else here can
give you a better answer.

>
> By the way "&*" has an observable effect when used on an array, for example:
>
> Define an array: char buf[64];
>
> The expression "buf" is an L-value of type "char[64]"
>

It is, but it does not survive as an lvalue in most expressions - in
most uses, it gets converted to a pointer to its first element (an
rvalue). That includes "&*buf", but also most other expressions.

> The expression "&*buf" is an R-value of type "char*"
>

Yes.

The exact passage from the C standard is :

"""
The unary & operator yields the address of its operand. If the operand
has type "type", the result has type "pointer to type". If the
operand is the result of a unary * operator, neither that operator nor
the & operator is evaluated and the result is as if both were omitted,
except that the constraints on the operators still apply and the result
is not an lvalue. Similarly, if the operand is the result of a []
operator, neither the & operator nor the unary * that is implied by the
[] is evaluated and the result is as if the & operator were removed and
the [] operator were changed to a + operator.
"""

As I say, I didn't find anything similar in the C++ standard. But it's
quite possible that the compiler does a similar "remove the &* then
correct the type" operation in an early code simplification pass,
assuming of course that this is okay with respect to any overloaded
operators involved. Later passes, sanitisers, and run-time checks would
then not see the dereference.

Frederick Gotham

Jun 23, 2022, 1:09:13 PM

On Thursday, 23 June 2022 at 17:26:24 UTC+1, David Brown wrote:
>
> The exact passage from the C standard is :


That would not make sense in C++ when unary operator* is overloaded
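
A small sketch of why: for a class type, "&*it" is a call to the user-provided operator* followed by taking the address of the result, and the language has no rule that cancels the pair (whether an optimiser later elides the call is a separate matter). The type "Counting_Iter" below is hypothetical and only counts calls:

```cpp
#include <cstddef>

// Hypothetical iterator-like type; it only records how often operator* runs.
struct Counting_Iter
{
    const char *p;
    static inline std::size_t derefs = 0u;   // inline variable, C++17 or later

    const char &operator*() const
    {
        ++derefs;        // a checking implementation could validate p here
        return *p;
    }
};

// "&*it" calls Counting_Iter::operator*, then takes the address of the result.
const char *Address_Via_Deref(Counting_Iter it)
{
    return &*it;
}
```

After one call to "Address_Via_Deref", the counter has gone up by one, so the dereference really is part of the expression.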

Frederick Gotham

Jun 23, 2022, 3:03:05 PM

When the command line option for debugging of the standard library is used, the code for string::const_iterator should abort if there's a bad addition or a bad dereference. I should contact the maintainers.

Frederick Virchanza Gotham

Jun 23, 2022, 5:08:28 PM

On Thursday, June 23, 2022 at 8:03:05 PM UTC+1, Frederick Gotham wrote:

> When the command line option for debugging of the standard library is used, the code for string::const_iterator should abort if there's a bad addition or a bad dereference. I should contact the maintainers.


Just now I posted to the mailing list for libstdc++, you can see my post here:

https://gcc.gnu.org/pipermail/libstdc++/2022-June/054316.html

Paavo Helde

Jun 26, 2022, 3:32:26 PM

MSVC19 detects the error at run time and aborts the program with a
dialog box:

Debug Assertion Failed!

Program: ...eTestVS2019\ConsoleTestVS2019\x64\Debug\ConsoleTestVS2019.exe
File: C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30133\include\xstring
Line: 1998

Expression: cannot seek string iterator after end

Paavo Helde

Jun 26, 2022, 3:46:22 PM

On 23.06.2022 17:03, David Brown wrote:
> On 23/06/2022 15:44, Frederick Virchanza Gotham wrote:

>
> You are dereferencing the iterator, then taking its address.  In C,
> "&*x" is treated exactly as "x" except for constraint checking and
> making the result an lvalue.  So it is defined behaviour, even if "x"
> were a null pointer.  I don't see anything matching that (section
> 6.5.3.2p3) in the C++ standards.

In C++, dereferencing an end pointer is UB, end of story. Presumably
this is to allow for debugging implementations like MSVC debug mode,
which reports for this code:

Expression: cannot dereference string iterator because it is out of
range (e.g. an end iterator)
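
The kind of check a debugging implementation can perform might look roughly like this. The class below is a hypothetical sketch, not MSVC's or libstdc++'s actual code, and it throws where a real debug runtime would report an assertion failure and abort:

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical checked iterator over a std::string (illustration only).
class Checked_String_Iter
{
    const std::string *owner_;   // the string being iterated
    std::size_t pos_;            // invariant: pos_ <= owner_->size()

public:
    Checked_String_Iter(const std::string &s, std::size_t pos)
        : owner_(&s), pos_(pos)
    {
        if (pos > s.size())
            throw std::out_of_range("cannot seek string iterator after end");
    }

    // Advancing past end() is rejected, e.g. "end + 876".
    Checked_String_Iter operator+(std::size_t n) const
    {
        if (n > owner_->size() - pos_)
            throw std::out_of_range("cannot seek string iterator after end");
        return Checked_String_Iter(*owner_, pos_ + n);
    }

    // Dereferencing end() is rejected, so "&*end" fails before '&' is applied.
    const char &operator*() const
    {
        if (pos_ >= owner_->size())
            throw std::out_of_range("cannot dereference string iterator because it is out of range");
        return (*owner_)[pos_];
    }
};
```

Because the check lives inside operator*, it fires even when the result is only used to take an address.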
