------- Original Message -------
John Graham-Cumming Aug 28 12:49PM +0200
> Of course every effort should be made to eliminate errors during
software development but a good simulator can detect fault conditions
that you had not thought about (the unknown unknowns, as an infamous US
politician once said).
Can you give an example of that? In general testing means verifying that
the results are expected and hence the output would be 'known'. It is
possible to do some crash detection by fuzzing inputs with random data,
but that's very different from these devices that generate NMEA strings.
In my view they are useful for testing the serial connection and little
else because the rest (string recognition, parsing, unit conversion) is
ripe for unit testing. That can be done entirely in software and in an
automated fashion.
John.
--------------------------------
Well, we once supported a Dutch software company that made a great
ballyhoo about how their software was fully tested before release by an
automated system. I even visited their development offices and saw such
a test in progress. Impressive, it has to be said...
But...
They were only checking for expected correct responses to known correct
inputs. I proved to them (in about 5 seconds flat!) that if you gave their
latest and greatest some invalid input (an extra space in a string input,
I seem to remember, in one instance), it would at best give you garbage
output, and at worst BSoD the PC! Not good for something that was intended
to control a multi-million-dollar high-power RF test facility!
(I was escorted out of the office shortly after that, while they sat down
and "had a meeting to discuss the issue".)
We don't support that software any more. (Thankfully! C++ code with
Dutch-language procedure/variable etc names, and comments. It's amazing
what Google Translate can handle; I was adding English translations to
the comments in their code, another black mark to my name...)
I'm probably preaching to the converted, but....
Take care with any software that takes input from something not of your
doing. Heck, even if it is your code creating the incoming data, check
it...
All input (especially wetware created) should be sanitised for gross
errors (missing fields, out of bounds values, string data instead of
numeric, and so on) before the main processing routines.
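
As a rough illustration of that kind of gross-error check, here is a
minimal sketch in C for one numeric text field. The function name
parse_bounded_double and the latitude-style limits in the usage note are
my own assumptions for the example, not taken from any particular
library or device.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: accept a text field only if it is a complete,
 * in-range number. On success the value is stored in *out. */
bool parse_bounded_double(const char *field, double min, double max,
                          double *out)
{
    char *end = NULL;
    double value;

    if (field == NULL || out == NULL || *field == '\0') {
        return false;                /* missing or empty field */
    }
    if (isspace((unsigned char)*field)) {
        return false;                /* stray leading space */
    }

    errno = 0;
    value = strtod(field, &end);

    if (end == field || *end != '\0') {
        return false;                /* not numeric, or trailing junk */
    }
    if (errno == ERANGE) {
        return false;                /* too large or too small to represent */
    }
    if (!(value >= min && value <= max)) {
        return false;                /* out of bounds (also rejects NaN) */
    }

    *out = value;
    return true;
}
```

A caller might use parse_bounded_double(field, -90.0, 90.0, &lat) and
simply drop the whole message if it returns false, so that nothing
unchecked reaches the main processing routines.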
Then whatever routine does the processing should have robust error
catching, much like the Try/Except constructs in Delphi.
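
C has no Try/Except, so the nearest habit I know of is to have every
processing routine return a status code and to leave its output in a
safe state on failure. A minimal sketch, where the names proc_status and
average_speed are invented purely for the illustration:

```c
#include <stddef.h>

typedef enum {
    PROC_OK = 0,
    PROC_BAD_ARG,
    PROC_DIV_ZERO
} proc_status;

/* Hypothetical processing step: average speed from distance and time.
 * The output is forced to a safe default before anything can fail. */
proc_status average_speed(double distance_m, double elapsed_s,
                          double *speed_out)
{
    if (speed_out == NULL) {
        return PROC_BAD_ARG;
    }
    *speed_out = 0.0;                       /* safe default */

    /* Written as !(x >= 0) so that NaN inputs are rejected too. */
    if (!(distance_m >= 0.0) || !(elapsed_s >= 0.0)) {
        return PROC_BAD_ARG;
    }
    if (elapsed_s == 0.0) {
        return PROC_DIV_ZERO;               /* would divide by zero */
    }

    *speed_out = distance_m / elapsed_s;
    return PROC_OK;
}
```

The point is only that the caller checks the returned status every time,
in the same way that an Except block would catch the failure.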
Don't forget the 'Else' clause at the end of an 'If' etc. Even if you
don't "need" it, code it, so that any data values passed downstream are
"sane", or so that a problem is indicated in a safe way for later code,
if that happens.
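
A small sketch of what I mean by the "unneeded" Else, in C. The enum and
the character codes are assumptions, loosely modelled on the fix-quality
field of a GGA sentence, so treat them as illustrative only:

```c
typedef enum {
    FIX_NONE,      /* receiver reports no fix */
    FIX_GPS,       /* plain GPS fix */
    FIX_DGPS,      /* differential fix */
    FIX_UNKNOWN    /* anything we did not expect */
} fix_quality;

/* Hypothetical decoder for a one-character fix-quality code. */
fix_quality decode_fix_quality(char code)
{
    fix_quality quality;

    if (code == '0') {
        quality = FIX_NONE;
    } else if (code == '1') {
        quality = FIX_GPS;
    } else if (code == '2') {
        quality = FIX_DGPS;
    } else {
        /* The final else: without it, quality would be left
         * uninitialised for any unexpected input. */
        quality = FIX_UNKNOWN;
    }
    return quality;
}
```

Later code can then test for FIX_UNKNOWN explicitly instead of acting on
whatever happened to be in memory.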
Preset/Clear any common scratchpad variables that are shared between
routines but are NOT used to transfer data from one routine to another.
Be consistent with that: whether you preset on entry or garbage collect
on exit, do it one way, and one way only, throughout your code.
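
To make the scratchpad point concrete, here is a minimal sketch using
the "preset on entry" convention. The buffer and both routines are
invented for the example; the only thing that matters is that each
routine clears the shared buffer before touching it.

```c
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 8

/* Shared working buffer: NOT used to pass data between routines. */
static char scratch[SCRATCH_SIZE];

/* Collects up to SCRATCH_SIZE - 1 digits from text, returns their value. */
long digits_value(const char *text)
{
    size_t n = 0;

    memset(scratch, 0, sizeof scratch);     /* preset on entry */
    if (text == NULL) {
        return 0;
    }
    for (; *text != '\0' && n < SCRATCH_SIZE - 1; ++text) {
        if (*text >= '0' && *text <= '9') {
            scratch[n++] = *text;
        }
    }
    /* The memset guarantees a terminator after the collected digits. */
    return strtol(scratch, NULL, 10);
}

/* Counts upper-case letters in text, up to SCRATCH_SIZE - 1 of them. */
size_t uppercase_count(const char *text)
{
    size_t n = 0;

    memset(scratch, 0, sizeof scratch);     /* same convention here */
    if (text == NULL) {
        return 0;
    }
    for (; *text != '\0' && n < SCRATCH_SIZE - 1; ++text) {
        if (*text >= 'A' && *text <= 'Z') {
            scratch[n++] = *text;
        }
    }
    return strlen(scratch);
}
```

Without the memset, calling digits_value("12345") and then
digits_value("a7") would leave the stale "2345" behind the new "7" in
the buffer, and the second call would return 72345 rather than 7.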
Take care of your string buffers too. No blind copying: always limit the
number of bytes copied to no more than the buffer size, leaving room for
any terminator. Obvious, but that's the single biggest cause of a lot of
C/C++ code woes (buffer overruns) when dealing with data from the great
unwashed outside world: GPS and other sensor data etc.
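
For what it's worth, a bounded copy can be as small as this. The name
copy_bounded is made up for the sketch; it never writes past dst_size
bytes, always terminates the destination, and tells the caller when the
input had to be truncated.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bounded string copy.
 * Returns true only if the whole of src fitted into dst. */
bool copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t i;

    if (dst == NULL || dst_size == 0) {
        return false;                /* nowhere to write */
    }
    if (src == NULL) {
        dst[0] = '\0';
        return false;
    }

    /* Copy at most dst_size - 1 characters, leaving room for the NUL. */
    for (i = 0; i + 1 < dst_size && src[i] != '\0'; ++i) {
        dst[i] = src[i];
    }
    dst[i] = '\0';

    return src[i] == '\0';           /* false means truncated */
}
```

Whether a truncated sentence is then discarded or processed is a
separate decision, but at least the buffer itself stays intact.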
Lastly, if you must use an automated software test scheme, after checking
it does what you expect with "good" data, throw some bad data at it and
see what falls over. Start with subtly bad data (extra embedded spaces,
empty data fields etc, such as would happen if a GPS loses lock
unexpectedly, for example), then progress to random garbage.
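
Here is a rough sketch of that sort of test in C, aimed at a simple NMEA
checksum check. The validator and the test helpers are my own invention
for the example, and it assumes the line terminator has already been
stripped. Note that a sentence with empty fields can still carry a
correct checksum, so field-level checks are a separate job.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Hypothetical routine under test: true if s looks like "$...*HH" and
 * HH equals the XOR of the characters between '$' and '*'. */
bool nmea_checksum_ok(const char *s)
{
    unsigned char sum = 0;
    int hi, lo;

    if (s == NULL || s[0] != '$') return false;
    ++s;
    while (*s != '\0' && *s != '*') {
        sum = (unsigned char)(sum ^ (unsigned char)*s);
        ++s;
    }
    if (*s != '*') return false;
    hi = hex_value(s[1]);
    if (hi < 0) return false;        /* also catches end of string */
    lo = hex_value(s[2]);
    if (lo < 0) return false;
    if (s[3] != '\0') return false;  /* trailing junk */
    return sum == (unsigned char)(hi * 16 + lo);
}

/* Subtly bad variations on the valid "$GPGLL,,,,,,V*06". */
static const char *const bad_sentences[] = {
    "", "$", "$GPGLL,,,,,,V", "$GPGLL,,,,,,V*", "$GPGLL,,,,,,V*0",
    "$GPGLL,,,,,,V*ZZ", "$GPGLL,,, ,,,V*06", "GPGLL,,,,,,V*06",
    "$GPGLL,,,,,,V*06xyz", "$GPGLL,,,,,,V*07"
};

/* Returns how many of the bad sentences were wrongly accepted. */
size_t count_bad_accepted(void)
{
    size_t i, accepted = 0;
    for (i = 0; i < sizeof bad_sentences / sizeof bad_sentences[0]; ++i) {
        if (nmea_checksum_ok(bad_sentences[i])) ++accepted;
    }
    return accepted;
}

/* Throws random garbage at the validator; mainly checks nothing falls
 * over. Returns the number of strings that happened to be accepted. */
size_t fuzz_checksum(unsigned seed, size_t rounds)
{
    unsigned char buf[32];
    size_t r, i, len, accepted = 0;

    srand(seed);
    for (r = 0; r < rounds; ++r) {
        len = (size_t)(rand() % 31);             /* 0..30 bytes */
        for (i = 0; i < len; ++i) {
            buf[i] = (unsigned char)(rand() % 255 + 1);  /* never NUL */
        }
        buf[len] = '\0';
        if (nmea_checksum_ok((const char *)buf)) ++accepted;
    }
    return accepted;
}
```

Running the fuzz loop under a memory checker or with compiler
sanitisers enabled makes it much more likely that an out-of-bounds read
shows up as a failure rather than passing quietly.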
Of course, it's for you to decide what your code does with bad data:
silently ignore it, or signal an error somehow.
Making any software robust usually takes much more time and effort than
making it do what you want in the first place. Sometimes more so for
simple code. But it pays off in the long term.
If you can, implement a watchdog type reset/recovery. If something bad
happens that locks up your microcontroller, at least it can reset itself
after some delay, and if the problem was transient, it can come back to
life. Not always as easy to do as it is to say, however...
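
A real watchdog is a hardware peripheral and the registers differ from
one microcontroller to the next, so I can only offer a host-side model
of the logic. Everything here (the soft_watchdog type, the three-tick
timeout, the reset flag standing in for an actual reset) is an
assumption made for the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define WDT_TIMEOUT_TICKS 3u

/* Software model of a watchdog; not a real device driver. */
typedef struct {
    uint32_t remaining;        /* ticks left before expiry */
    bool reset_requested;      /* stands in for a hardware reset */
} soft_watchdog;

void wdt_init(soft_watchdog *w)
{
    w->remaining = WDT_TIMEOUT_TICKS;
    w->reset_requested = false;
}

/* Called by the main loop each time it completes a healthy pass. */
void wdt_kick(soft_watchdog *w)
{
    w->remaining = WDT_TIMEOUT_TICKS;
}

/* Called from a periodic timer, independently of the main loop. */
void wdt_tick(soft_watchdog *w)
{
    if (w->remaining > 0) {
        --w->remaining;
    }
    if (w->remaining == 0) {
        w->reset_requested = true;   /* main loop stopped kicking */
    }
}
```

The usual advice is to kick from one place only, at the end of a
complete pass of the main loop, so that a loop stuck anywhere else lets
the timer run out.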
73.
DJB.