how to find error

Robert Magdziarz

Aug 23, 2021, 6:28:01 AM

I have an error in my program which occurs rarely (I see wrong values in the XML output file). I inserted a few assertions into the functions which (as I suppose) contain the error. Unfortunately, I failed to reproduce the error.
Q: How can I solve my problem and find the error? What strategy should I use?
I am not a very experienced programmer; intermediate, I would say.

Mark Bluemel

Aug 23, 2021, 6:50:54 AM

I usually sacrifice a black cockerel in a pentagram I've drawn in black chalk. The demon thus summoned can usually debug my code for me.

If that fails, I move on to sacrificing a goat. Fortunately I've never reached stage 3, as sacrificing a virgin can cause offence.

Seriously, you've given us virtually nothing to comment on, so I may as well be facetious. This link <https://stackoverflow.com/help/how-to-ask> should help you to ask a more well-formed question, which may get you a more helpful response. Some of the points are specific to Stack Exchange, but much of what is covered applies generally.

David Brown

Aug 23, 2021, 7:03:48 AM

Use as much static error checking as you can - if the compiler can spot
your errors, it's much easier to fix them. For gcc, use "-O2 -Wall
-Wextra" to start with. For other compilers, there are other warning
flags. There are also dedicated static checkers for a range of
use-cases and budgets.

Look into using sanitizers, which are run-time checks. They can be good
for finding subtle problems.

Put asserts or checks at places where you /know/ something has gone
wrong - such as checks when the output is generated. Don't rely on
guesses as to where the bug might be (but put checks there too). Get
some checks or assertions to hit at the output stage, then work
backwards to see when the error arose. Use a debugger too, with
breakpoints at these checks so that you can look more closely at the
state of the program.
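
A minimal sketch of such an output-stage check, assuming the values written to the XML file have a known plausible range. The names value_is_plausible, emit_checked and EMIT_CHECKED are hypothetical and not taken from the original program:

```c
#include <math.h>
#include <stdio.h>

/* Hypothetical plausibility rule: the value is finite and lies in [lo, hi]. */
static int value_is_plausible(double v, double lo, double hi)
{
    return isfinite(v) && v >= lo && v <= hi;
}

/* Writes <tag>value</tag> to out. If the value is implausible, the call
   site is reported on stderr first; that line is also a convenient place
   for a debugger breakpoint. Returns 0 if the check passed, else -1. */
static int emit_checked(FILE *out, const char *tag, double v,
                        double lo, double hi, const char *file, int line)
{
    int ok = value_is_plausible(v, lo, hi);
    if (!ok)
        fprintf(stderr, "%s:%d: suspicious <%s> value %g\n",
                file, line, tag, v);   /* breakpoint candidate */
    fprintf(out, "<%s>%g</%s>\n", tag, v, tag);
    return ok ? 0 : -1;
}

/* Records the call site automatically. */
#define EMIT_CHECKED(out, tag, v, lo, hi) \
    emit_checked((out), (tag), (v), (lo), (hi), __FILE__, __LINE__)
```

Once a check like this fires, the reported file and line give a starting point for working backwards through the code that produced the value.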

As Mark said, with no information about your program, tools, etc., no
one can give you concrete help.

Ben Bacarisse

Aug 23, 2021, 8:36:15 AM

Robert Magdziarz <robert.mag...@gmail.com> writes:

> I have an error in my program which occurs rarely (I see wrong values
> in XML output file). I inserted a few assertions into functions which
> (as I suppose) contain error. Unfortunately I failed to reproduce
> error.
> Q: How to solve my problem, how to find error? What strategy should I
> use?

Not much to go on. Can you post or link to the code?

But you give one hint -- the bug might come and go depending on things
like assertions that should have no effect. That could point to the bug
being the result of something undefined like a buffer overrun.
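
To make the buffer-overrun case concrete, here is a small sketch of a bounded string copy. The function name copy_bounded is hypothetical; it only illustrates the kind of check that keeps a write inside its buffer:

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst, which has room for dst_size bytes, and always
   NUL-terminates when dst_size > 0. Returns 0 if everything fit and -1 if
   src was truncated. An unchecked strcpy(dst, src) in the same place would
   write past the end of dst for long inputs: undefined behavior whose
   visible effect can change when unrelated code is added or removed. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (dst_size == 0)
        return -1;
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);   /* includes the terminating NUL */
    return 0;
}
```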

I second David's suggestions. First, crank up the warnings from the
compiler to the max. For gcc, I use

gcc -std=c11 -pedantic -D_ANSI_SOURCE -fno-common \
-fno-diagnostics-show-option -fsanitize=undefined -Wbad-function-cast \
-Wstrict-prototypes -Wall -Wextra -Wformat-nonliteral -Wcast-align \
-Winline -Wundef -Wcast-qual -Wshadow -Wconversion -ffloat-store

but I dare say there are more I could use these days. Second, although
the -fsanitize=undefined flag will do a lot of run-time checking, you
could also try running the program under a memory checker like valgrind.

Once you've done that, you need to use "traditional" debugging methods,
but since I am an outlier here about what the best of those methods are,
I'll say no more right now.

--
Ben.

Malcolm McLean

Aug 23, 2021, 8:57:50 AM

On Monday, 23 August 2021 at 11:28:01 UTC+1, Robert Magdziarz wrote:

It's very difficult to debug when you cannot reproduce the bug. I presume that the input which caused
the bug has been discarded.

Obviously you go over the code looking for errors, but you have tried that and failed.

So the next step is to reproduce the error. Try more inputs until the error reappears. Once you can switch
the error on and off by toggling values in the input, you are well on your way to finding the bug.
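
One way to "try more inputs" systematically is to generate them from a seed, so that any failing input can be regenerated later. The following is only a sketch under that assumption; find_failing_seed, check_fn and the placeholder check sum_within_bounds are hypothetical names, and the real check would call the code under suspicion:

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_INPUT 64

/* A check returns nonzero if the code under test behaved correctly. */
typedef int (*check_fn)(const int *input, size_t n);

/* Feeds pseudo-random inputs to `passes`, one per seed in 1..trials.
   Returns the first seed whose input fails the check, or 0 if none does.
   Re-running with the same seed on the same C library regenerates the
   same input, so a failure found once can be replayed. */
static unsigned find_failing_seed(check_fn passes, unsigned trials, size_t n_max)
{
    int input[MAX_INPUT];
    if (n_max > MAX_INPUT)
        n_max = MAX_INPUT;
    for (unsigned seed = 1; seed <= trials; seed++) {
        srand(seed);
        size_t n = (size_t)rand() % (n_max + 1);
        for (size_t i = 0; i < n; i++)
            input[i] = rand() % 2001 - 1000;   /* values in [-1000, 1000] */
        if (!passes(input, n)) {
            fprintf(stderr, "failure with seed %u, %zu values\n", seed, n);
            return seed;
        }
    }
    return 0;
}

/* Placeholder check (hypothetical): the sum of n values in [-1000, 1000]
   must lie within [-1000*n, 1000*n]. Replace with a check on real output. */
static int sum_within_bounds(const int *input, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += input[i];
    return sum >= -1000L * (long)n && sum <= 1000L * (long)n;
}
```

A call such as find_failing_seed(sum_within_bounds, 10000, 64) returns 0 when no trial failed; a nonzero result is the seed to replay.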

Lew Pitcher

Aug 23, 2021, 9:18:59 AM

This is less of a question for comp.lang.c, and more for (perhaps) comp.programming.
But, I'll give it a go.

My first thoughts would be to examine the faulty XML file, and determine what sort
of "wrong values" it contains. If you identify the "wrong values", and their
characteristics (both /expected/ and /actual/), you might be able to determine what
sort of operation (within the program) /makes/ them wrong.

For instance, if you /expect/ an integer numeric value, but the /actual/ value
contains decimal places, then you might suspect that the component of the program
that formats numbers has an issue.
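
As an illustration of classifying a "wrong value" in this way, the sketch below tests whether a piece of text taken from the output is a plain integer. The function name looks_like_integer is hypothetical, and how the text is extracted from the XML is left out:

```c
#include <errno.h>
#include <stdlib.h>

/* Returns 1 if text is a single base-10 integer that fits in a long, with
   an optional sign and nothing after the digits, else 0. Leading white
   space is accepted because strtol skips it. */
static int looks_like_integer(const char *text)
{
    char *end;
    if (text == NULL || *text == '\0')
        return 0;
    errno = 0;
    (void)strtol(text, &end, 10);
    return errno == 0 && end != text && *end == '\0';
}
```

A value such as "42.5" fails the test because strtol stops at the decimal point, which is the characteristic described above.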

From there, I would examine the code that externalizes those "wrong values", and
try to determine what would cause it to externalize the /right/ values in the
/wrong/ way.

If this leads nowhere, then I would trace the origin of those "wrong values" back
through the code, examining each component that manipulates them.

All this would take both bench-checking of code, trial runs with selected debug
data, and possibly trial runs under a debugging tool.

HTH
--
Lew Pitcher
"In Skills, We Trust"

Bart

Aug 23, 2021, 9:19:04 AM

Are those wrong values wrong on the same input? Eg. you repeatedly run
on the same input, but sometimes the output is wrong. Or is it
consistently wrong on the same input?

When the output is wrong, do you know what the input is, and what the
output should be for those values? Does the same input generate
different wrong results each time it goes wrong? (That would suggest
uninitialised data, or reading some out of bounds values.)

Do those inputs have anything in common? Can you extract the troublesome
inputs into their own small input, and try to reproduce the problem
within a smaller task? That would make it more practical to insert
debugging prints.

Is there anything special about the offsets within the XML output where
it goes wrong? (eg. 64K boundaries.)

Does it go wrong often enough that you can experiment with adding extra
info to the XML output to monitor what is being done? Even if it's just
what part of the program is generating that output.
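
A sketch of that kind of extra info, assuming the output is written with stdio and that an XML comment is acceptable in the file while debugging. emit_origin and EMIT_ORIGIN are hypothetical names:

```c
#include <stdio.h>

/* Writes an XML comment naming the code location that is about to produce
   output. Returns the fprintf result (negative on error). Assumes the file
   and function names contain no "--", which XML comments do not allow. */
static int emit_origin(FILE *out, const char *file, int line, const char *func)
{
    return fprintf(out, "<!-- written by %s:%d (%s) -->\n", file, line, func);
}

/* Fills in the current location automatically. */
#define EMIT_ORIGIN(out) emit_origin((out), __FILE__, __LINE__, __func__)
```

Placing EMIT_ORIGIN(out) just before each group of writes means a wrong value in the file sits directly below the name of the code that wrote it.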

What is the nature of the wrong output: incorrect values, garbage? If
garbage, what does it look like when dumped in binary?

But the main thing is that you have to be able to reproduce the problem
(preferably the same problem with the same input, with as small an input
as possible), and some effort should be expended in achieving that.

For example, if it very rarely goes wrong with the same input, set up a
script that runs the program large numbers of times, and compares the
output to a reference file.
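
The same idea can be sketched in C instead of a shell script. This assumes the program can be started with a single command line and writes its output to a known path; streams_identical and count_bad_runs are hypothetical names:

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if both streams hold identical bytes from their current
   positions to end of file, else 0. */
static int streams_identical(FILE *a, FILE *b)
{
    int ca, cb;
    do {
        ca = getc(a);
        cb = getc(b);
    } while (ca == cb && ca != EOF);
    return ca == cb;
}

/* Runs `command` `runs` times via system(); after each run compares the
   file at out_path with the file at ref_path. Returns the number of runs
   whose output differed from the reference or could not be opened. */
static int count_bad_runs(const char *command, const char *out_path,
                          const char *ref_path, int runs)
{
    int bad = 0;
    for (int i = 0; i < runs; i++) {
        int status = system(command);
        (void)status;   /* only the output file is judged here */
        FILE *out = fopen(out_path, "rb");
        FILE *ref = fopen(ref_path, "rb");
        if (out == NULL || ref == NULL || !streams_identical(out, ref)) {
            bad++;
            fprintf(stderr, "run %d: output differs from reference\n", i + 1);
        }
        if (out != NULL)
            fclose(out);
        if (ref != NULL)
            fclose(ref);
    }
    return bad;
}
```

Saving a copy of the output file whenever a run is counted as bad gives something concrete to examine afterwards.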

Kaz Kylheku

Aug 23, 2021, 11:49:54 AM

On 2021-08-23, Robert Magdziarz <robert.mag...@gmail.com> wrote:
> I have an error in my program which occurs rarely (I see wrong values
> in XML output file).

What output file?

> I inserted a few assertions into functions which
> (as I suppose) contain error. Unfortunately I failed to reproduce
> error.
> Q: How to solve my problem, how to find error? What strategy should I use?
> I am not very experienced programmer, intermediate I would say.

If the program is entirely deterministic (takes some non-real-time
inputs and calculates an output purely from those inputs) and the output
varies, then you have some undefined behavior in there:

- using an uninitialized object

- accessing memory through an invalid pointer, such as one incremented
out of bounds, or pointing to storage that has been freed and re-used.

- functions not declared in a header file, or not including
the header file, and calling them with wrong arguments.
(Use compiler options to detect calls or definitions without declarations).

- misuses of variadic function calling interfaces, like printf or scanf.

- in theory, any undefined behavior whatsoever.

(A program can appear deterministic in spite of these problems, until
ported to a different system, so you are lucky if you can reproduce the error!)

There are tools that can help, like Valgrind which exists on GNU/Linux
and a few other platforms. It will find uses of uninitialized memory
and some cases of invalid pointers in a hurry; you just run
"valgrind <your-program>" and then read the error diagnostics.
If the program is compiled with debug info, you get line number
information in there.

Before doing that, check your program for obvious problems. Are there
local variables you didn't initialize? How about structure members?
Any tricky pointer arithmetic going on?
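
For the structure-member case, a minimal sketch of one common remedy: give the whole object a known state before filling in individual members. The struct record and make_record below are hypothetical, not taken from the program in question:

```c
/* Hypothetical record of the kind that might end up in the XML file. */
struct record {
    int    id;
    double value;
    char   unit[8];
};

/* Returns a record with every member in a known state. The = {0}
   initializer zeroes all members, so a member that is never assigned
   later reads as 0 or "" rather than as an indeterminate value. */
static struct record make_record(int id, double value)
{
    struct record r = {0};
    r.id = id;
    r.value = value;
    /* unit is deliberately left as the empty string set by = {0} */
    return r;
}
```

Without the initializer, reading r.unit here would be a use of an uninitialized object, the first item in the list above.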

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Lew Pitcher

Aug 23, 2021, 2:57:23 PM

On Mon, 23 Aug 2021 03:50:48 -0700, Mark Bluemel wrote:

> On Monday, 23 August 2021 at 11:28:01 UTC+1, Robert Magdziarz wrote:
>> I have an error in my program which occurs rarely (I see wrong values
>> in XML output file). I inserted a few assertions into functions which
>> (as I suppose) contain error. Unfortunately I failed to reproduce error.
>> Q: How to solve my problem, how to find error? What strategy should I use?

[snip]

> This link <https://stackoverflow.com/help/how-to-ask> should help you to
> ask a more well-formed question,

I prefer
http://www.catb.org/~esr/faqs/smart-questions.html
as it gives a lot more detail, and is not specific to stackoverflow.

[snip]

James Kuyper

Aug 23, 2021, 3:28:28 PM

A call to assert() should have an effect only if it is triggered, in
which case you should see the assert message and the program should
terminate. If inserting an assert() does change the behavior without
getting triggered, that implies that some other part of your program has
behavior that is unspecified, and probably undefined. That,
unfortunately, doesn't help very much because there's such a huge
variety of different possible reasons why your code might have
unspecified behavior. That's pretty nearly 100% of what can be said
based upon what you've written above. As others have pointed out, you
really need to provide more information before anyone can help you with
this.

Now, you may be thinking "my program is way too big, I can't just post
the entire thing", and you're probably right about that. However, I can
give you some advice that will help you pare it down to a more
manageable size. It has the added advantage that following this process
often helps you figure out the problem by yourself.

You will be maintaining three copies of your program. The first is the
original version that demonstrated the problem, which won't be touched
by this process. The second version will be your working copy, which is
where you will make all of your changes. The third version is the most
recent version of the working copy that demonstrated the problem you're
investigating. If you're very familiar with a version control system of
some kind, it can be useful to use such a system to keep track of these
versions. Otherwise, just do it manually.

At the start, the second and third versions should be initialized by
copying from the original version.

The main part of this process is iterative.
First, remove some part of the program. It must be a part that should,
according to your understanding, have no effect on the ability of your
program to demonstrate the problem. Secondarily, it should be as large a
part of your program as possible. A good choice for the first iteration
would be to remove everything that's supposed to get executed only after
the part that's malfunctioning. You might need to re-write some of the
remaining code to make it work in some useful fashion after removing
that part. Keep notes on what part you removed at each step. If you're
using a version control system, good check-in comments can serve as your
notes.
The key point of removing that section is to determine whether or not
you were right about it having no effect. If you understood precisely
how your program actually works, you would already know how to fix it,
so you should expect to be incorrect about something. The next step,
therefore, is to test whether the modified version of your program
actually still demonstrates that problem.

If it does demonstrate the problem, your working version becomes the new
latest version that demonstrates the problem, and also becomes the
starting point for the next step through the iteration.
If it no longer demonstrates the problem, think very carefully about how
the change failed to have the effect you thought it should have. Often,
that's very important information that might lead you to figure out what
it was you were misunderstanding. If not, discard your changes, and go
back to the latest version that did demonstrate the problem, as the
starting point for your next step in the iteration. It's very important
to make sure that whatever you remove during the next iteration is
different from what you removed during this iteration. Often you can
simply remove less than you did during this iteration.

If you never figure out the problem for yourself, you'll eventually
reach the point where you can't think of anything more to remove. By
this time, it should be a lot shorter than the original version.
Therefore, it's time to give your complete but simplified program to
someone else to ask them for help, for example this newsgroup.

When you ask for that help, describe precisely the platform that you're
using, as well as the name of the compiler you're using, the version of
that compiler, and the options you selected when building it. Very
little code is perfectly portable, and that's particularly true of
whatever code you've written that's not doing what you think it should do.

If the program needs any inputs, document them. Explain precisely what
you expected the program to do - this might seem obvious, but problems
are often due to you having erroneous expectations. Then present
whatever evidence you have to show that it did something other than what
you expected.

Keith Thompson

Aug 23, 2021, 4:39:23 PM

James Kuyper <james...@alumni.caltech.edu> writes:
> On 8/23/21 6:27 AM, Robert Magdziarz wrote:
>> I have an error in my program which occurs rarely (I see wrong values
>> in XML output file). I inserted a few assertions into functions which
>> (as I suppose) contain error. Unfortunately I failed to reproduce
>> error.
>> Q: How to solve my problem, how to find error? What strategy should I use?
>> I am not very experienced programmer, intermediate I would say.
>
> A call to assert() should have an effect only if it is triggered, in
> which case you should see the assert message and the program should
> terminate. If inserting an assert() does change the behavior without
> getting triggered, that implies that some other part of your program has
> behavior that is unspecified, and probably undefined. That,
> unfortunately, doesn't help very much because there's such a huge
> variety of different possible reasons why your code might have
> unspecified behavior. That's pretty nearly 100% of what can be said
> based upon what you've written above. As others have pointed out, you
> really need to provide more information before anyone can help you with
> this.

Just a couple of comments on this excellent response:

Let me expand a little on that "should".

Defining the macro NDEBUG disables assertions. You can add `#define
NDEBUG` or, with many compilers, use an argument like `-DNDEBUG`. The
assert() macro expands either to code that prints an error message and
calls abort() if the macro NDEBUG is *not* defined, or to code that does
nothing if NDEBUG *is* defined. The argument to assert() is a scalar
expression.

If that scalar expression has side effects, then the behavior of your
program will change depending on whether NDEBUG is defined. There's
nothing illegal about it, but it's generally considered to be very poor
practice. *Unintentional* side effects in assert invocations are bad
news.
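
A small sketch of the difference, using a hypothetical helper bump() whose call has a side effect. The risky form is shown only in a comment:

```c
#include <assert.h>

/* Hypothetical helper: increments *counter and returns the new value. */
static int bump(int *counter)
{
    return ++*counter;
}

/* Risky form, shown only in this comment:
       assert(bump(counter) > 0);
   With NDEBUG defined the whole expression vanishes, and the counter is
   never incremented. The form below keeps the side effect outside the
   assertion, so the program state is the same in both build modes. */
static int bump_checked(int *counter)
{
    int n = bump(counter);
    assert(n > 0);
    return n;
}
```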

In rare cases, you might write something like:

assert(fputs("WARNING: Assertions are enabled, performance will be degraded\n", stderr) != EOF);

> Now, you may be thinking "my program is way too big, I can't just post
> the entire thing", and you're probably right about that. However, I can
> give you some advice that will help you pare it down to a more
> manageable size. It has the added advantage that following this process
> often helps you figure out the problem by yourself.
>
> You will be maintaining three copies of your program. The first is the
> original version that demonstrated the problem, which won't be touched
> by this process. The second version will be your working copy, which is
> where you will make all of your changes. The third version is the most
> recent version of the working copy that demonstrated the problem you're
> investigating. If you're very familiar with a version control system of
> some kind, it can be useful to use such a system to keep track of these
> versions. Otherwise, just do it manually.

And keep in mind that the second and third versions do *not* need to
perform the task required for your original program. Their purpose is
only to demonstrate the bad behavior you're trying to diagnose.

[snip]

> If the program needs any inputs, document them. Explain precisely what
> you expected the program to do - this might seem obvious, but problems
> are often due to you having erroneous expectations. Then present
> whatever evidence you have to show that it did something other than what
> you expected.

And if your program needs inputs, consider hardwiring those inputs into
your demonstration program. If your original program chokes when given
the input "foo bar", see if you can reproduce the problem by defining a
string literal "foo bar" in your program rather than reading it as
input. Anything you can do to make it easier for others to reproduce
the problem is likely to be helpful. But if that turns out to be too
difficult, don't worry about it -- as long as you clearly document what
input causes the problem.
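
A sketch of what hardwiring can look like. count_words is only a hypothetical stand-in for whatever code misbehaves, and trigger_input and run_reproduction are likewise made-up names:

```c
#include <ctype.h>
#include <stddef.h>

/* Stand-in for the code under suspicion (hypothetical): counts the
   whitespace-separated words in a line. */
static size_t count_words(const char *line)
{
    size_t words = 0;
    int in_word = 0;
    for (; *line != '\0'; line++) {
        if (isspace((unsigned char)*line)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            words++;
        }
    }
    return words;
}

/* The input that triggers the problem, embedded as a string literal so
   the demonstration needs no input file and no typing. */
static const char trigger_input[] = "foo bar";

/* Runs the suspect code on the embedded input. */
static size_t run_reproduction(void)
{
    return count_words(trigger_input);
}
```

Anyone who receives the demonstration program can then compile and run it without also having to recreate the input.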

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Lynn McGuire

Aug 23, 2021, 9:50:50 PM

I am assuming that debugging does not work or show the error.
Sometimes when debugging a random problem, I will restart the debugger a
hundred times to get the error one time. Persistence is a virtue.

Also, add print statements that write to the command line or into your XML file.

Lynn

Rosario19

Aug 24, 2021, 11:22:50 AM

I don't remember well, but I have done something like this:
identify the place in memory (the address) where the error
shows up in the debug build of the program,

then write, in the loop where the error appears:

if (*mem == errorNumber)
{
    // set a breakpoint in the debugger on this line,
    // inspect all the variables in the debug window,
    // and see where the error is just after it happens
}

Kaz Kylheku

Aug 25, 2021, 1:15:55 PM

This can be a valuable technique. In the implementation of the TXR
language, there is a similar thing. In the gc.c module:

#if CONFIG_EXTRA_DEBUGGING
val break_obj;
#endif

If extra debugging is turned on, then when you're debugging,
you can put the address of an interesting object into break_obj.

The implementation will call a function called breakpt() when
that object comes up in various situations in the garbage collector.
You can put a break point on that.
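
The general shape of such a hook can be sketched as follows. This is not the TXR code; watched_obj, break_hook, hook_hits and note_object are hypothetical names chosen for the illustration:

```c
#include <stddef.h>

/* Address of the object to watch; set it from the debugger or from code.
   NULL means nothing is being watched. */
static const void *watched_obj = NULL;

/* Number of times the hook fired, so the effect is visible even without
   a debugger attached. */
static unsigned long hook_hits = 0;

/* Set a debugger breakpoint on this function. In an optimized build the
   compiler may inline it, so an unoptimized debug build is assumed. */
static void break_hook(void)
{
    hook_hits++;
}

/* Call wherever an object is handled; fires the hook for the watched one. */
static void note_object(const void *obj)
{
    if (watched_obj != NULL && obj == watched_obj)
        break_hook();
}
```

With note_object() called at the interesting places, one breakpoint on break_hook stops the program only when the object of interest is involved.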