Ftn I/Os documentation best practices

Don Y

Jun 26, 2022, 3:36:00 PM

I add a boilerplate to each function definition that
declares constraints on inputs, expectations of outputs,
performance issues, etc. I use this to add invariants
to the code to detect/enforce these conditions.

But, there is nothing that ensures that I've done
this -- other than discipline.

I'm looking at ways to create an IDL that will allow
for more specific criteria to be included in the
declaration that could also drive the IDL compiler
to add suitable invariants as applicable.

[This makes RPC much more effective but can also
benefit traditional ftn invocations]

Any pointers to similar schemes? I've been looking
through CORBA et al. for hints but they seem to
focus on bigger machines (where there is more tolerance
over data types and more overhead expected).

David Brown

Jun 27, 2022, 3:36:27 AM

What programming language are you using? If your answer is "C", it's wrong.

If you are just putting these things in comments, then they will get out
of sync with the code. The best you can do is writing something like a
Python script that will read the C code and check for the pattern of
comments.

If you want something really useful, you need a programming language
that will let you write the contracts in the language itself - then they
can be checked and enforced. Ada, D, and Scala are examples. C++ has a
Boost.Contract library, and language support for contracts is due in
C++23 (last I heard - but it might be delayed again).

Grant Edwards

Jun 27, 2022, 11:18:12 AM

On 2022-06-27, David Brown <david...@hesbynett.no> wrote:
> On 26/06/2022 21:35, Don Y wrote:
>> I add a boilerplate to each function definition that
>> declares constraints on inputs, expectations of outputs,
>> performance issues, etc.

> What programming language are you using? If your answer is "C",
> it's wrong.
>
> If you are just putting these things in comments, then they will get out
> of sync with the code.

I'd have to agree. I've worked with many projects and third-party
libraries over the decades which had a big template of comments for
every function which described the input/output parameters, return
value, global variables used, and so on.

Often these templates generated documents by using something like
Doxygen.

And on _every_single_one_ of those projects and libraries, the
comments were wrong often enough that nobody who knew which way was up
paid any attention to them. If you wanted to know what the parameters
were for, what the function returned, and so on, you read the C code.

A lot of the time, even the numbers and names of the parameters
described in the template didn't match the code.

The auto-generated PDF documents and HTML web site looked nice, though.

--
Grant

David Brown

Jun 27, 2022, 1:52:51 PM

Accuracy of such in-code documentation varies, but there is generally no
way to check it automatically. That's one of the reasons it is better
to use constructs in the programming language, where possible, rather
than documentation and comments. For preconditions, postconditions and
invariants, you need a language that has support for contracts. For
other languages, usually the best you can do is careful choice of names
and types, along with assert statements.
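
As a rough illustration of that fallback in C (all names here are
made up for the example, not taken from any project in this thread):
a dedicated type carries the constraint in its name, and assert
statements check the precondition and postcondition at run time.

```c
#include <assert.h>

/* Hypothetical example type: the name documents the legal range. */
typedef struct {
    unsigned char value;   /* 1..12 */
} month_1_to_12;

/* Precondition:  1 <= m.value <= 12.
 * Postcondition: result is in 28..31. */
static int days_in_month(month_1_to_12 m, int is_leap_year)
{
    static const unsigned char days[12] =
        { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
    int result;

    assert(m.value >= 1 && m.value <= 12);    /* precondition */

    result = days[m.value - 1];
    if (m.value == 2 && is_leap_year)
        result = 29;

    assert(result >= 28 && result <= 31);     /* postcondition */
    return result;
}
```

Note that the checks disappear when the code is built with NDEBUG
defined, and nothing stops a caller from filling the struct with a
bad value; the type name only documents the intent.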

Still, Doxygen-like comments in code are usually better synchronised
with the code than external documentation!

Don Y

Jun 27, 2022, 5:34:40 PM

On 6/27/2022 8:18 AM, Grant Edwards wrote:
> On 2022-06-27, David Brown <david...@hesbynett.no> wrote:
>> On 26/06/2022 21:35, Don Y wrote:
>>> I add a boilerplate to each function definition that
>>> declares constraints on inputs, expectations of outputs,
>>> performance issues, etc.
>
>> What programming language are you using? If your answer is "C",
>> it's wrong.
>>
>> If you are just putting these things in comments, then they will get out
>> of sync with the code.
>
> I'd have to agree. I've worked with many projects and third-party
> libraries over the decades which had a big template of comments for
> every function which described the input/output parameters, return
> value, global variables used, and so on.

You perhaps missed the balance of my post:

"I use this to add invariants to the code to detect/enforce
these conditions."

...

"I'm looking at ways to create an IDL that will allow for
more specific criteria to be included in the declaration
that could also drive the IDL compiler to add suitable
invariants as applicable."

I.e., a "specification language" (I am currently using an enhanced
form of OCL) FROM WHICH the IDL compiler can create the code -- in
whatever language binding is selected AT COMPILE TIME.

So, if I say:
    month > 0
    AND
    month < 13
as constraints *in* the function's "prototype", then
the IDL compiler generates an invariant that throws a
"range error" OR panics (depending on IDL compiler switch)
AT RUN TIME if the function is invoked with the "month"
parameter not compliant with those constraints.
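
As an illustration only, a stub of that kind might look roughly like
this in a C binding. Everything here is hypothetical: set_month,
set_month_impl, the IDL_* status codes and the IDL_PANIC_ON_VIOLATION
switch are invented for the sketch and are not the output of any
real IDL compiler.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes returned by the generated stub. */
enum { IDL_OK = 0, IDL_RANGE_ERROR = -1 };

/* The hand-written function body; it never sees an out-of-range month. */
static int current_month = 1;
static void set_month_impl(int month) { current_month = month; }

/* Stub as it might be generated from the declared constraints
 *     month > 0 AND month < 13
 * Build with IDL_PANIC_ON_VIOLATION defined to panic instead of
 * returning a range error. */
static int set_month(int month)
{
    if (!(month > 0 && month < 13)) {
#ifdef IDL_PANIC_ON_VIOLATION
        fprintf(stderr, "panic: set_month: month=%d violates constraint\n",
                month);
        abort();
#else
        return IDL_RANGE_ERROR;
#endif
    }
    set_month_impl(month);
    return IDL_OK;
}
```

With the default build, set_month(13) returns IDL_RANGE_ERROR and
set_month_impl is never called, so the value never gets into the
function body.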

The OCL *documents* the calling constraints of the
function (and its return values) in a language neutral
manner. I.e., you could create an ASM binding for the
IDL compiler's output and the programmer would be
none the wiser.

The advantage of driving the code generator this way is
the "documentation" creates the code -- if you don't
*document* (declare) a constraint, then it isn't enforced.

It ensures the code and documentation agree and that
every bit of documentation has a corresponding bit of
code (but not necessarily the other way around)

> Often these templates generated documents by using something like
> Doxygen.
>
> And on _every_single_one_ of those projects and libraries, the
> comments were wrong often enough that nobody who knew which way was up
> paid any attention to them. If you wanted to know what the parameters
> were for, what the function returned, and so on, you read the C code.

You *always* read the code. The OCL declarations *are* effectively
code; the stub generated *will* reference "month" and not "moth"
or "monday" (or whatever). But, they are formally expressed in a
syntax defined by the "specification language" (~OCL in my case).

Invoking the exemplar with a month of "13" could possibly work
within the body of the function, as implemented -- perhaps treating
this as year++ with month=1 -- but the invariant won't let the
value *into* the function. Because the intent was *not* to invoke
the function with a bogus month value.

19A0 is not 2000!

The whole point is to encourage the developer to codify (in OCL)
the constraints on the code so that the IDL compiler can create
the actual instruction sequence (in the language bound to that set
of command line switches) to enforce those constraints.

*But*, you are still reliant on discipline; if the developer
doesn't declare those constraints, then the compiler can't create
any code to do this and simply is resigned to creating the code
to marshal arguments and pack the message for transport.

One can casually inspect the IDL files to see if there is an
abundance -- or a dearth -- of constraints without having to
parse countless source files. The IDL files *generate* the
"header" files so you can't skip that step.

Additionally, it can generate the server-side stubs (in whichever
language binding is appropriate *there*) to unpack and parse
the message, convert the arguments to whatever format is "native"
for the server (knowing that their values are "legitimized" by
the client-side stub) and hand them off to the server-side
function.

[similarly handling the return message]
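
A rough sketch of the pack/unpack halves, assuming a made-up wire
format of a 16-bit big-endian year followed by a one-byte month. The
function names and the message layout are hypothetical and chosen
only to show the shape of such stubs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: year (2 bytes, big-endian), month (1 byte). */
enum { DATE_MSG_LEN = 3 };

/* Client side: pack already-validated arguments into the message buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t pack_date(uint8_t *buf, size_t len, uint16_t year, uint8_t month)
{
    if (len < DATE_MSG_LEN)
        return 0;
    buf[0] = (uint8_t)(year >> 8);
    buf[1] = (uint8_t)(year & 0xFFu);
    buf[2] = month;
    return DATE_MSG_LEN;
}

/* Server side: unpack the message into native types.
 * Returns 0 on success, -1 if the message is too short. */
static int unpack_date(const uint8_t *buf, size_t len,
                       uint16_t *year, uint8_t *month)
{
    if (len < DATE_MSG_LEN)
        return -1;
    *year  = (uint16_t)(((uint16_t)buf[0] << 8) | buf[1]);
    *month = buf[2];
    return 0;
}
```

The server half here checks only the message length; it relies on
the client-side stub having already enforced the value constraints,
as described above.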

> A lot of the time, even the numbers and names of the parameters
> described in the template didn't match the code.
>
> The auto-generated PDF documents and HTML web site looked nice, though.

There's no point in generating "prose" from such a specification.
What are you going to do, pretty-print the generated stubs? Or,
the OCL-expressed constraints?

Stephen Pelc

Jun 28, 2022, 4:30:40 AM

On 27 Jun 2022 at 17:18:07 CEST, "Grant Edwards" <inv...@invalid.invalid>
wrote:

>> If you are just putting these things in comments, then they will get out
>> of sync with the code.
>
> I'd have to agree. I've worked with many projects and third-party
> libraries over the decades which had a big template of comments for
> every function which described the input/output parameters, return
> value, global variables used, and so on.
>
> Often these templates generated documents by using something like
> Doxygen.

For the last 20 years or so, virtually all our manuals have been created
by our own "literate programming" system called DocGen. DocGen is
optimised for Forth, but it would not be a big job to write a version for C.

DocGen diverges from Doxygen and friends in several ways. In
particular it does not need template blocks. If your C code is so bad
that another programmer cannot read the declaration, you need far
more help than DocGen or Doxygen can give you. The main entry
for a function follows the declaration

float someFunc( int how, double x, double y )
// *G The purpose of *\c{someFunc} is ...
// ** ...
{
...
}

The lines starting // *x are formal comments to be processed by
DocGen. The *X parts are formatting commands, and the *\<name>{}
parts are text macros.
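
To make the line format concrete, here is a small C sketch of how a
tool could pick such lines out of a source file. It is based only on
the two example lines above and assumes a one-character command
after "// *"; formal_comment is a hypothetical helper, not part of
DocGen.

```c
#include <stddef.h>
#include <string.h>

/* If 'line' starts with "// *" followed by a one-character command,
 * store the command in *cmd and return a pointer to the text after it;
 * otherwise return NULL. */
static const char *formal_comment(const char *line, char *cmd)
{
    static const char prefix[] = "// *";
    size_t n = sizeof prefix - 1;

    if (strncmp(line, prefix, n) != 0 || line[n] == '\0')
        return NULL;
    *cmd = line[n];
    line += n + 1;
    if (*line == ' ')
        line++;            /* skip one separating space, if present */
    return line;
}
```

Ordinary comments and code lines return NULL, so only the formal
comments would be passed on to the documentation stage.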

The ideas behind DocGen are that the code and the documentation
are never separated, and that the DocGen portion is not much larger
than the descriptive comments you should have in your code anyway.
Keeping the code in sync with the documentation is a matter of
company culture and management.

Whenever we receive third party code to include in our products,
we *always* DocGen it before release and we *always* find some
bugs. Overall, I estimate that writing the documentation alongside
the code costs about 10% extra, paid for by the reduction in bug level.

Stephen
--
Stephen Pelc, ste...@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads

David Brown

Jun 28, 2022, 7:48:09 AM

That is /exactly/ what you do with tools like Doxygen - it extracts
/interface/ information (function prototypes, type declarations, etc.),
strips it of implementation-specific details, merges the comments (which
should hopefully be in sync with the code), and generates clear,
readable, searchable, cross-referenced documentation.

You use tools like that precisely so that people using your library or
code do /not/ read the C code. You don't even have to read the header
files.

And if you are formalising your prototypes with some kind of interface
description language to include preconditions, postconditions and
invariants, then you want them included in the generated documentation.
Ideally, that's what people will read, rather than the IDL source code
or the generated C headers.

The key point of separation of interfaces and implementations is that
people using the code should /only/ use the documented interfaces, and
not rely on anything involved in the implementation. So make the
information about those interfaces clear and precise - such as good
quality generated documentation - and make it accurate - such as by
using an IDL.

Don Y

Jun 28, 2022, 8:49:55 AM

I do this by using a specific "paragraph tag" in FrameMaker documents
(e.g., "Code") and then have a simple utility that extracts all thusly
tagged paragraphs to create the "source file" -- which is then
compiled <however>.

[FM files are relatively easy to parse and the format has been
consistent for many releases; I wouldn't think of this sort of
approach with MSWord acting as "container"!]

It adds an extra step to the process (because the source doesn't exist
until extracted from the document).

But, it is ill-suited to producing "manuals" as the presentation
must be linear with the code; you can't tangle/weave to arrange
the code in a different order than the documentation.

OTOH, it is excellent for mixing multimedia with "code"; I can put
an illustration between "if" and "then". Or, a sound snippet to
indicate what a particular (audio) waveform -- expressed as an
array of floats -- *sounds* like adjacent to those constants.
This is particularly helpful with domain-specific constructs,
mechanisms and phenomena with which a generic programmer might
not have prior experience.

I document the "rationale" and "strategy" behind the code, elsewhere.
That can take the "30,000 ft view" of the code and usually needs
infrequent maintenance. E.g., why was Q12.4 format chosen? Show
me the error analysis behind that choice relative to other formats.
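
For readers unfamiliar with the notation, a minimal C sketch of
Q12.4, assuming it means a signed 16-bit value whose low 4 bits hold
the fraction (conventions differ on whether the sign bit is counted).
The type and function names are invented for the example.

```c
#include <stdint.h>

/* Assumed interpretation: Q12.4 stored in a signed 16-bit integer,
 * with the low 4 bits holding the fraction (resolution 1/16). */
typedef int16_t q12_4;

#define Q12_4_FRAC_BITS 4
#define Q12_4_ONE       (1 << Q12_4_FRAC_BITS)   /* 16 */

/* Convert with round-to-nearest; the caller must keep x within the
 * representable range (roughly -2048 to +2047.9). */
static q12_4 q12_4_from_double(double x)
{
    double scaled = x * Q12_4_ONE;
    return (q12_4)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Convert back to floating point; exact for every Q12.4 value. */
static double q12_4_to_double(q12_4 q)
{
    return (double)q / Q12_4_ONE;
}
```

Under that assumption the quantisation step is 1/16, and
round-to-nearest keeps the conversion error within 1/32, which is
the sort of figure an error analysis of the format would start from.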

Keeping modules short and supporting other non-text annotations
makes it relatively easy for folks to understand the specifics of
an implementation.

But, all of these techniques (yours included) rely on discipline.
There's nothing that mechanically verifies the code and comments
agree. Even semi-automatic mechanisms rely on the developer
having *created* them (e.g., #including an audio file that
was generated by extracting those floats and converting them
to audio). Too often, the "solution" is simply to remove
comments rather than ensuring they are maintained.

Sadly, my experience has been that folks aren't keen on keeping
docs and code in sync and the more documentation, the less it
tends to track the code. Especially for projects that "evolved"
instead of being "designed". (each refactor requiring a substantial
reframing of the commentary)

Stephen Pelc

Jun 28, 2022, 10:35:56 AM

On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blocked...@foo.invalid> wrote:
>> The ideas behind DocGen are that the code and the documentation
>> are never separated, and that the DocGen portion is not much larger
>> than the descriptive comments you should have in your code anyway.
>> Keeping the code in sync with the documentation is a matter of
>> company culture and management.

> Sadly, my experience has been that folks aren't keen on keeping
> docs and code in sync and the more documentation, the less it
> tends to track the code. Especially for projects that "evolved"
> instead of being "designed". (each refactor requiring a substantial
> reframing of the commentary)

As others have said it needs discipline. Discipline comes from
management. As the boss, I have made it quite clear that use
of DocGen is a requirement to work at the company. In turn
it is my job to ensure that people know how to use the tool.

Don Y

Jun 28, 2022, 2:33:58 PM

On 6/28/2022 7:35 AM, Stephen Pelc wrote:
> On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blocked...@foo.invalid> wrote:
>>> The ideas behind DocGen are that the code and the documentation
>>> are never separated, and that the DocGen portion is not much larger
>>> than the descriptive comments you should have in your code anyway.
>>> Keeping the code in sync with the documentation is a matter of
>>> company culture and management.
>
>> Sadly, my experience has been that folks aren't keen on keeping
>> docs and code in sync and the more documentation, the less it
>> tends to track the code. Especially for projects that "evolved"
>> instead of being "designed". (each refactor requiring a substantial
>> reframing of the commentary)
>
> As others have said it needs discipline. Discipline comes from
> management. As the boss, I have made it quite clear that use
> of DocGen is a requirement to work at the company. In turn
> it is my job to ensure that people know how to use the tool.

You can "legislate" the use of a tool or adherence to a standard.
But, these are subjective issues -- not like "derate all caps by
40%" (which can be independently, mathematically verified). You
rely on individual "employees" for their judgement as to the
effectiveness of their documentation. Likewise, the efficacy
of their test/validation efforts.

EVERY employer and client I've ever worked with has had formal
standards regarding code "style", documentation, testing, etc.
"The Boss" in these cases has ranged from accountants, to
mechanical engineers, to electrical engineers ("no longer
practicing"), to economists. I.e., they can mandate but aren't
qualified to evaluate the quality of the work performed.

You can have peers review each others' work. But, I've not seen
that improve the work of folks who just don't have the drive
to "do better". (And I can't remember anyone EVER being fired
for incompetence!)

The true test of this is handing the design to another party
(i.e., SELLING the design) and seeing how well the new owner
can come up to speed on the product. If you have staff available
"later" that can be consulted wrt their previous work on a
design, then folks need not completely rely on print documentation.

Stephen Pelc

Jun 29, 2022, 8:37:03 AM

On 28 Jun 2022 at 20:33:42 CEST, "Don Y" <blocked...@foo.invalid> wrote:

> On 6/28/2022 7:35 AM, Stephen Pelc wrote:
>> As others have said it needs discipline. Discipline comes from
>> management. As the boss, I have made it quite clear that use
>> of DocGen is a requirement to work at the company. In turn
>> it is my job to ensure that people know how to use the tool.
>
> You can "legislate" the use of a tool or adherence to a standard.
> But, these are subjective issues -- not like "derate all caps by
> 40%" (which can be independently, mathematically verified). You
> rely on individual "employees" for their judgement as to the
> effectiveness of their documentation. Likewise, the efficacy
> of their test/validation efforts.

Followed by lots more pointless whining.

Changing company culture is really hard, even for my own
company. I'm an electronics engineer by training, and I have
been writing software since 1967, and I still write production
code.

I may not have fired people directly for not being good enough,
but I have certainly strongly encouraged them to get another job.

Don Y

Jun 29, 2022, 9:40:06 AM

On 6/29/2022 5:36 AM, Stephen Pelc wrote:
> On 28 Jun 2022 at 20:33:42 CEST, "Don Y" <blocked...@foo.invalid> wrote:
>
>> On 6/28/2022 7:35 AM, Stephen Pelc wrote:
>>> As others have said it needs discipline. Discipline comes from
>>> management. As the boss, I have made it quite clear that use
>>> of DocGen is a requirement to work at the company. In turn
>>> it is my job to ensure that people know how to use the tool.
>>
>> You can "legislate" the use of a tool or adherence to a standard.
>> But, these are subjective issues -- not like "derate all caps by
>> 40%" (which can be independently, mathematically verified). You
>> rely on individual "employees" for their judgement as to the
>> effectiveness of their documentation. Likewise, the efficacy
>> of their test/validation efforts.
>
> Followed by lots more pointless whining.

First-hand examples of how "discipline" doesn't work, in practice.

If you've been "lucky", then "good for you". You're likely the
Exception and not the Rule. You've led a blessed existence. So,
likely aren't competent to comment on life with "less angelic"
employees.

Please let us know your progress on addressing world hunger...

> Changing company culture is really hard, even for my own
> company. I'm an electronics engineer by training, and I have
> been writing software since 1967, and I still write production
> code.

Now, imagine your products hosted code written by other people.
Outside of your organization. What sort of reach do you have
into THEIR corporate culture? Do you act as PHYSICAL gatekeeper
and prohibit "unblessed" code from being installed on your
products? Do *you* take on the job of creating every application
and hardware module that any of your users might conceivably want
(because you trust your own efforts, exclusively)?

How eager will you be for your customers to have their experiences
with YOUR product tainted by those other "components"? Will they
be sophisticated enough to know that the quality issues arise not
from YOUR portion of the work but from the efforts of others?
Will they be able to determine *which* others (so they can excise them)?

[Imagine the resolver in your PC being unreliably written by X.
Will the user recognize that the resolver's faults are the
reason behind the poor performance of the browser? Or, flaws
in the filesystem implementation the reason for application
failures/data loss? Or...]

I want mechanisms that make it easy for people to "do the right
thing", despite their inclination to do otherwise. I'm not
keen on waiting for them to "see the light". Nor do I have
the ability to coerce them to do so.

But, if I make The Right Path easier to follow than the "wrong"
ones, they are more likely to follow it out of laziness/self-interest.

> I may not have fired people directly for not being good enough,

Why not? Especially if it's YOUR company? Imagine the hesitance
to doing so when The Boss is just an employee of some corporate
entity -- not *his* name above the door.

> but I have certainly strongly encouraged them to get another job.

So, you "reworked" any work they did for you up until the time
of their departure? Or, did you just let it slide -- *into* your
products?