Hi Al,
On 8/12/2014 4:01 AM, alb wrote:
> Don Y <th...@is.not.me.com> wrote:
>
>> MSWord isn't good for anything -- even a one page memorandum! Wait
>> until you discover that you can't access documents created with
>> version N-2 or whatever!<frown>
>
> This is something I do not understand. As a company we are trying to improve
> our processes and workflow in order to get a better product in less time, and
> we pinpoint how important it is to check, verify, and validate each technical
> step throughout the project's lifetime. Besides that, we have to deliver
> documentation; indeed, a document is a component no less important than any
> other component in our system, a part of it without which you cannot:
>
> A. prove you are doing it right
> B. prove you are doing it in time
>
> Then why is there such a large gap in process development applied to
> documentation? How can stakeholders be at the mercy of tools like MS Word?
Because folks treat documentation as a "checkoff" item:
Have documentation? YES NO
Many people/organizations don't produce formal specifications, test
plans, test results, etc. They just have some "random" collection
of paper that they can point at when asked about "documentation"...
and, no one has much interest in checking to see if any of it is correct
or USEFUL!
> I personally found that in our customer set of applicable documents there was
> some degree of inconsistency (24 requirements out of 2019) which could have
> proven costly if found at a later stage in the development.
I have long adopted the approach of writing a User's Manual before
beginning a design. It forms an informal specification -- in terms
that a user can understand (instead of "stuffy tech-speak"). And,
it helps me think about a *unified* approach to a design -- rather than
letting the design "evolve" (which seems to be the design methodology
du jour!) in unconstrained directions.
When that is done, a customer/client *knows* what the end product
will be like and can critique each design decision that the document
presents "as fact". It lets hypothetical users ask, "How do I..."
and "What if..." questions -- all of which SHOULD be covered in the
document.
At that point, the actual *implementation* is a piece of cake! All
of the decisions have already been made -- no fear of coming up with
two DIFFERENT approaches to two *similar* problems in the user
interface because "you got a better idea when working on the second".
>> You will probably discover that you need a "Documentation Czar" to
>> "enforce" policy/consistency in your documents. In reality, this
>> individual will become the chief grunt -- responsible for FIXING
>> everyone else's screwups!
>
> We have 'Czars' [1] around, we call them PAEs (product assurance engineers)
> and they tell us what we have screwed up with respect to norms, standards,
> versions, ... But it's not enough to guarantee links are not broken.
Yup. First, they need to be able to stop "delivery" of the product
(in order for their efforts to have any "weight"). Second, they
need to be "nit-pickers" to ensure they have the requisite skills
to *catch* ALL discrepancies.
I'm "too small" to take on many of the projects that I do undertake.
And, too "lazy" (unwilling to put in extra effort that shouldn't be
necessary). So, I try to design mechanisms that amplify my efforts;
do more by doing less.
Tying the documents to the actual implementation is one such example.
The documents and the implementation are always in sync (if you
let the makefiles do their thing!). It's also a carrot for future
developers (maintainers) as the documents provide a friendlier way
of viewing and entering changes to the codebase.
For example, a recent document explains and tabulates the rules by
which I convert letters to sounds (part of a TTS). The document
organizes the rules in a nice, easy to read format. A piece of
code that I wrote extracts the rules from the document, rearranges
them to satisfy the optimizations that the run-time implements,
then encodes them for inclusion in the actual run-time.
A developer *could* insert himself into the middle of that process
if he chose to. I.e., take the encoded output from version X of
the ruleset and manually introduce changes to advance it to version
Y -- without updating the documentation. But, it's almost certain
that he will introduce a bug/typo in the process. And, will have
to manually revise the regression suite to cover his changes (another
opportunity for errors).
Expecting developers to be "lazy", the *intended* way of modifying
the rules -- by altering the documentation -- is so much easier (and
robust) that it is unlikely anyone will *try* to circumvent it!
> Relying on multiple eyes is not bad per se, but the quirk of it is that we
> tend to silently accept that if a mistake passed a review then it is not a
> mistake anymore -- until two different tables report different numbers and the
> developer does not know which to pick (nor does his/her manager).
Don't discount the fact that many people are not invested in the
process. And, others may not be familiar enough with the technology
to be *competent* to recognize a subtle mistake! Or, leery of
expressing their uncertainty ("Surely Bob would have commented on
this *if* it was a genuine mistake...").
One legacy letter-to-sound algorithm often implemented "wildcards"
to represent letter patterns of interest. E.g., "any number of
voiced consonants". But, the original implementation language was
SNOBOL. Folks recoding the algorithm (into C, most often) would
carelessly interpret the implementation as, literally, "any number
of voiced consonants". And, naively implement a greedy matching
algorithm (common in C):
    while (is_voiced(*pointer))    /* i.e., in {L, J, V, D, ...} */
        pointer++;
This *looks* correct. Until it is applied in particular contexts:
%D
(where % is the aforementioned wildcard). Obviously, "%D" should
match "coLD", "aDDed", etc. I.e., the % matches the first (and ONLY
the first) of these voiced consonants and the explicit 'D' matches
the immediately following 'D' -- even though it, too, is a voiced
consonant.
But, the above implementation will fail -- due to its greed!
This is a common latent bug in implementations of this particular
algorithm. Because the folks re-implementing it (in C) failed to
understand how the original SNOBOL implementation operated.
>> GUI's are dangerously seductive. They allow users to focus on the
>> *appearance* of a document instead of its semantic content. "OK,
>> that's in italics like it is supposed to be!" (No, italics are used
>> for several different types of tags in our organization... concentrate
>> on getting the TAG right, not the appearance of the text rendered for
>> that tag!)
>
> On top of that, tools like Reqtify, used to trace requirements, may analyze
> text based on some formatting styles which *may* look the same even if they
> *are* different, so the resulting document may have a set of requirements
> which are not 'captured' by the tool and go silently untraced until God knows
> when (typically at CDR, where you are ready to launch your flight
> production!).
Tools like MSWord appeal to folks who think "pretty printing" is a
goal. Or, who are tickled with the prospect of embedding a picture
or a scope trace in a document.
Lately, my documents have been interactive. E.g., the document I
am working on presently allows the user (i.e., reader) to explore
how various glottal waveform parameters affect the *sound* of the
spoken voice -- by adjusting them and *listening* to the resulting
pronunciation of (canned) words.
[You could spend paragraphs trying to explain these sound qualities
and never be certain the reader understands; but, give him an actual
sound sample to evaluate -- and contrast -- and your confidence in
his understanding goes up markedly!]
>> Beyond the tool(s), by far, the biggest problem will be training
>> folks for the proper mindset. To treat words, terms, phrases, etc.
>> as more than just collections of letters/glyphs. IME, this is a
>> lot tougher nut to crack! :<
>
> I totally agree with you. There's a tendency to deny the current problems we
> are facing (daily) and an innate inertia to refuse change, too often
> considered destabilizing. Some three years ago a revision control system was
> introduced (svn) and currently we are still facing 'acceptance' issues.
In my case, I opted for Perforce -- much to the chagrin of all who
advised me on the subject! A big part of that, IMO, was a desire
to operate in their own little isolated fiefdoms, detached from
The Organization. And, failing to perceive the needs of others
in that organization!
> Another issue I currently see in front of my long revolutionary journey is the
> transition phase. Even imagining a ready-to-deploy documentation system (which
> we do not have yet), we can only deploy it on newly starting projects, since old
> ones are already infected by the MS virus. Engineers who are working on
> several projects will then need to continue in two different ways... kinda
> confusing if not unsustainable.
Sorry, change is always painful. :( That's often why folks cling
desperately to old ways of doing things and "wetware systems" (in
which the "system" has been designed to fit in someone's braincase
early on -- and never revised when the constraints of that braincase
were exceeded!).
An exec at a Fortune 500 company ($10B/sales) once quizzed me on
the design of part numbering systems. I gave the typical reply:
"Numbers beginning with 1 for vegetables; 2 for fruits; 3 for meats;
4 for cereals; etc. The next digit could refine this further:
11 for leafy vegetables; 12 for legumes; etc."
He took a tomato out of his desk drawer: "Vegetable! 1XXXX"
"No, it's a *fruit*!" (how many folks trying to rely on this
"wetware" system would make a similar mistake?)
"Hmmm... What about berries? Strawberries, blueberries...?"
"... tomatoes, avocados, grapes..."
"Huh? Aren't those fruit?"
<knowing grin>
"And, where do we put *candy*? And vitamins? And..."
"All those oddball things can go in the 9's!"
I.e., systems that appear simplistic usually are... too simplistic!
The conversation ended with him arranging the items on his desk in
a haphazard order and "identifying" them in exaggerated fashion:
"1, 2, 3, 4, 5, 6... get the picture?"
"Then, how do you know what a 62347 is?"
"I type the part number into this computer and it tells me everything
I want to know about it! What it is, what it costs in materials,
labor, how many we have on hand, how many we have active orders for,
how many we sold last year, what time of year has the greatest demand,
where (geographically) that demand is located, etc. To *someone*
in this organization, each of those items are THE MOST significant
aspect of this product. If *that* person was designing a part numbering
system, he would choose to encode *that* data in the part number and
care little about the criteria *you* chose!!"
[I can't resist quoting Earl Sinclair:
"As you can see, I have separated all known dinosaur wisdom into
three categories: animal, vegetable, rocks."
"Water is the opposite of fire, which we have previously established
as a vegetable. What's the opposite of a vegetable? Fruit. So, water
is a fruit! Fruit is not a vegetable, so it has to be either an
animal or a rock. We know it's not an animal. Therefore, fruit is
a rock."]
> The system I have in mind is a hierarchy of units, where information is
> *clearly* defined in one single place. Tables (spreadsheets) and diagrams can
> live in a common directory, while ad-hoc ones can be in the specific document
> folder:
>
> main # handles the data set with Makefiles
> ├── common # dedicated to all common parts
> │ ├── acronyms # needless to explain
> │ ├── applicables # list of applicable documents included in all docs
> │ ├── diagrams # technical diagrams (dataflow, state machine diagrams, ...)
> │ ├── drawings # technical drawings (mechanics, pcbs, schematics)
> │ ├── intro # introduction, often present in several docs
> │ ├── references # reference documents, with issues and revisions
> │ └── tables # budget tables (power, resources, memory structure, ...)
> ├── lists # list of documents, components, units
> ├── notes # technical notes (potentially linked to change requests)
> ├── plans # development plans,
> └── specs # subsystem specs
> └── ABC-DEF-0120-R # with textual sources as well as tables and diagrams
> # unique to this subsystem
> ....
>
> I forgot to add an 'output' folder which will contain the whole set of
> documents produced by the latest build. In an ideal world LaTeX2HTML [2] could
> be used to export the whole set of docs in HTML and allow browsing document
> content with a browser, where hyperlinks are a much faster way to
> reach the information.
Why the need for specific folders/directories for each item type?
Sooner or later, you will end up with huge directories and shrinking
namespaces. Why not let things live where they "should" live -- just
ensure they are accessible from everywhere that they should be
accessed?
E.g., if I embed a particular object in a particular document...
then, at a later date, decide that the object can also be used
in some other document, I don't refactor the original document
to extract the object and move it to some "shared" location.
I just reference it where it was.
[I am becoming a huge fan of relational databases! Letting
objects reside in the DBMS instead of as files in a filesystem.
It makes it easier to see dependencies]
> Each directory shall have a makefile in order to handle word processing and
> bring up to date the current directory. Documents shall have a template (or
> class), or potentially a set of templates, in order to provide a uniform and
> coherent typeset throughout the project and also between projects.
>
> The structure (just a draft) shall not evolve too much, otherwise it becomes
> unclear where a piece of information belongs, but it shall take into
> account all needs (I just sketched some off the top of my head).
>
> Scripts and macros may facilitate the generation of chapters, especially
> tables starting from spreadsheets which are then included in a document. A
> Makefile would then be a perfect fit for handling dependencies.
Your goals are far more ambitious than mine -- I just want to keep
my docs synchronized with the objects they describe. I count on
the VCS to handle much of that "make" overhead, currently.
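The Makefile-per-directory idea Al sketches might look roughly like this
(every file name, and the csv2tex converter, are hypothetical stand-ins for
whatever scripts actually do the conversion):

```make
# Hypothetical sketch: rebuild a spec whenever its shared,
# spreadsheet-derived tables change.
COMMON = ../common
TABLES = $(COMMON)/tables/power.tex $(COMMON)/tables/memory.tex

# The document depends on its own source plus the shared tables
ABC-DEF-0120-R.pdf: ABC-DEF-0120-R.tex $(TABLES)
	pdflatex ABC-DEF-0120-R.tex

# Each table is regenerated from its source spreadsheet
$(COMMON)/tables/%.tex: $(COMMON)/tables/%.csv
	csv2tex $< > $@
```

Edit power.csv, type "make", and the table -- and every document that
includes it -- comes back up to date; nothing is pasted in by hand.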
> The main idea here is to remove two issues from the current situation:
>
> 1. information duplication
> 2. formatting
>
> While 1. can be addressed with a hierarchical structure, it is certain that
> people have to know where information is and what is considered to be sharable
> and what is unique to a document. As in programming, it is not always clear at
> the beginning which subset of your main program will end up in a library
> function (especially if you start coding without knowing where you need to
> end).
>
> Secondly, 2. shall not be a problem for the engineer who is inputting technical
> information. He should provide correct data, while data representation shall
> be done by the typesetter. Luckily there are a couple of guys in the room who
> have the skills to take care of 2.
You can also opt to not be concerned with "presentation" (depends on
where your docs will be consumed). E.g., web pages tend to be
content driven; PDFs are layout driven.
Also, consider what you will want *in* those documents. I think
(as evidenced from my current efforts) that documents will become
much more "active" than "dead tree products". So, you may find
that text's role decreases over time in favor of other media forms.
Good luck!