
(Fwd: *C&CD*) TEXT-ONLY DOCUMENTATION (*COMP.COMP*) (15)


Composition Digest (Robert Royar, Moderator)

Nov 20, 1993, 12:32:00 PM
Entry: 15
Date: Thu, 18 Nov 1993 00:39:41 +0000 (GMT)
From: agate!howland.reston.ans.net!spool.mu.edu!news.clark.edu!netnews.nwnet.net
!news.u.washington.edu!cac.washington.edu!prs...@ucbvax.Berkeley.EDU (Paul Reed
Smith)
Subject: TEXT-ONLY DOCUMENTATION (*COMP.COMP*) (15)
Message-id: <2ceg8d$l...@news.u.washington.edu>
Organization: UW Computing
Sender: compos01%ulkyvx...@mailhost.berkeley.edu
Reply-to: agate!howland.reston.ans.net!spool.mu.edu!news.clark.edu!netnews.nwnet
.net!news.u.washington.edu!cac.washington.edu!prs...@ucbvax.Berkeley.EDU (Paul
Reed Smith)
Lines: 17

Colleagues,

Do any of you have any guidelines for documentation prepared
for dissemination as ASCII files?

I'm thinking of how one gets along without bolding, direct underlining,
or other fancy formatting. Things like lines less than 80 characters
wide, creative use of capitalization, spacing, et cetera.

Do you have anything you can send me or any groups to which I should
direct this question? If so, drop me a line.

Thanks,

Paul
prs...@cac.washington.edu
__

------------------------------

David Tillyer

Nov 22, 1993, 4:37:11 PM
Paul Smith asks an interesting question about conventions for sending
stuff in ASCII. I've got students saving in ASCII to send to a
tutor and I'd like to have some suggestions for them...so, please
post replies to the list. Thanks, David Tillyer

Partl

Nov 24, 1993, 9:09:50 AM
Composition Digest (Robert Royar, Moderator) wrote:
> Date: Thu, 18 Nov 1993 00:39:41 +0000 (GMT)
> Subject: TEXT-ONLY DOCUMENTATION (*COMP.COMP*) (15)
> Sender: compos01%ulkyvx...@mailhost.berkeley.edu
> Reed Smith)

> Do any of you have any guidelines for documentation prepared
> for dissemination as ASCII files?
> I'm thinking of how one gets along without bolding, direct underlining,
> or other fancy formatting. Things like lines less than 80 characters
> wide, creative use of capitalization, spacing, et cetera.
> Do you have anything you can send me or any groups to which I should
> direct this question? If so, drop me a line.

I found something like that in the E-TEXT collection of the Project
Gutenberg (or in a similar collection, I don't recall very well)
when gopher'ing around the world. I am including it below, although
it is pretty long...

Dr. Hubert Partl Mail: pa...@mail.boku.ac.at
EDV-Zentrum, Universitaet fuer Bodenkultur Phone: (+43 1) 36 92 924 - 233
Nussdorfer Laende 11 Fax: (+43 1) 36 92 924 - 200
A-1190 Wien, Austria (-: Make laugh, not war! :-)

---------------- included file follows ----------------------------------

Elements of E-Text Style
Version 1.0
9 August 1993

This file should be named ESTYLE10.TXT or estyle10.txt.


Copyright (c) 1993 by John E. Goodwin. All Rights Reserved.

You may make and distribute verbatim copies of this work for non-
commercial purposes using any means, provided this copyright notice is
included in all such copies.

Contact: John E. Goodwin
P.O. Box 6022
St. Charles, IL 60174
jego...@delphi.com

[John Goodwin is available to consult, write, and teach courses on E-
text issues and Internetworking]


Abstract: This manual discusses how to use electronic text (E-text) as
a communications medium distinct from the print media. The manual is
written in a non-technical style, such as a humanist-of-little-brain
might enjoy reading.

o You can learn how to write effective E-text for personal, business,
and scholarly communication.

o It includes sections on preparing forms and texts for electronic
response and on writing effective and business-like E-mail letters.

o There is a brief section on Standard Generalized Markup Language, a
coding standard of interest to humanists.


Just to prove how non-technical it all is, here is an exceptional lapse
into technical jargon, in case you know what the Internet and FTP
archives are:

This work is a companion volume to _E-Mail 101_, available free as
ftp://mrcnext.cso.uiuc.edu:/etext/etext93/email025.txt.


<title> Elements of E-Text Style

=Preface= An Apology for E-Text

=Part I= Writing for an E-Text Audience

=Part II= Specific Differences of Style and Mechanics

=Part III= A Very Brief Style Manual

=Appendix A= Technical Details: Relationship to SGML and TEI

=Table I= Full Table of Contents (go to very end of this file)


<Preface>

This work grew out of my earlier course notes published under the
title _EMAIL 101_. It was originally projected to be a three chapter
section concerning the special needs of writers who wished their works
to be transportable by the electronic networks. The chapters were not
included in the original release as they existed only in outline form.

Over the course of Summer 1993 I gradually came to realize that E-text
was a communication medium in its own right, with its own needs and
conventions, its own strengths and weaknesses, and not merely the
bastard child of the print medium. Consequently, many questions of
style, long ago settled for print media and fixed into rules in style
manuals, needed to be re-examined in light of the new medium.

Since it seemed to me that no one had set out to treat the stylistic
considerations of writing E-text, at least at any length, I decided to
expand my three chapters into the present work. I set out to write down
systematically some observations I had made concerning the differences
between E-text and "ordinary" writing. I treat E-text as a legitimate
medium of expression, one that must be addressed on its own terms and
without unnecessary reference to how the words might look on paper or
how the work might be useful if printed out.

For reasons that I will discuss at length in the first part, only a
small fraction of E-text will ever see the light of print. While paper
may offer a better resolution image and a more perspicuous whole, E-text
excels at ease of production and portability. It can be copied simply,
transported great distances in seconds by electronic networks, and
stored on magnetic media--floppy disks, hard drives, and CD-ROMs-- that
are less bulky and cheaper than paper.

The extraordinary growth of E-mail in the past few years, from a medium
used by a few scientists and government officials to one accessible to
millions, often in a humanistic or business setting, demands that we
give the writing of E-text the attention it deserves. If you wish to
communicate effectively, you will have to master this new medium. It is
a necessary part of education--if only we knew what to teach!

Good writing is, in many respects, the same for any medium. And the
first thing any writer learns is that their** writing must fit both the
audience and the medium being used. We cannot pretend any longer that
we are writing for print or that our audience will be looking at
anything other than a computer screen.

** I deliberately use "their" as an ambiguous pronoun throughout.

Just as the print media differ among themselves depending on the
intended audience, expected lifetime of the text, and peculiarities of
the medium, so E-text differs from print.

This work is organized as follows:

In the first part we delineate the major differences between the print
media and E-text.

In the second part, we discuss specific issues such as techniques
for designing a visually appealing layout, or representing characters.

The third and final part is a brief style manual for writing E-text.
It is not offered as a set of prescriptions, but as an example of how
the principles in the second part can be realized in practice.

+ + +

In this introductory section, I would like to make a brief apology for
E-text. It is not usual, in discussing the print media, to begin a
manual on style with a defense of the worth of the medium; however, E-
text is so new that many persons will say "Why bother with it?". They
deserve an answer.

The most insidious objection to E-text is the claim that it is just
printed text before it has been printed out. In effect, this assumes
either that

(1) there is no difference between the needs of E-text and the needs
of print; or

(2) all text is printed out before being read.

The second premise is demonstrably false--most E-mail correspondence and
anything longer than about 25 pages obtained over a computer network
suffice as examples.

The first premise requires a more extended answer, since it is the
source of a great deal of confusion. In fact, the entire first part of
this work is devoted to refuting it. In this brief apology I will
answer two simpler objections: that E-text is so esoteric that it is of
no interest to ordinary persons; or that it is so commonplace as to be
beneath our consideration. I call these two objections the "Ham Radio"
and "Telephone" objections, respectively.


Not every communications medium is of interest to a large number of
persons. Take, for example, Amateur Radio. Using short-wave radio to
communicate requires a fair technical knowledge and special equipment.
Because of these two investments, neither the medium nor the skills
required to master it are common. This situation is very similar to
that of computers in the late '70's. Computers were not commonplace,
being owned mostly by hobbyists. Communication and distribution of
information was primitive, often by floppy disk passed hand to hand.
And the special programs required to create and read E-text--word
processors--were uncommon and required special skills.

On the other hand, some will object that E-text is now so commonplace
that it needs no consideration. You don't read style manuals about how
to talk on the Telephone, do you? Although some scholars may discuss how
telephone conversation differs from the ordinary face-to-face variety,
most of us use telephones un-self-consciously. E-text is like typing a
letter. Who cares?

Although the *mechanics* of talking on a telephone are trivial, the
social implications are not. One can point out, for example, that to
most people, their parents have become persons that they talk to on the
telephone and not persons that they work with every day and see face-to-
face. The social implications of this are enormous; the technology
trivial.

Similarly with E-text: while the mechanics are easily mastered and
perhaps of little interest, E-text together with global computer
networks make possible a form of community that didn't exist prior to
the medium. The sort of community that will form around E-text is
different from the kind of communities that are centered on the
telephone. Rather than family or casual friends, it is likely to be a
community that cares about a single issue or agenda.

These communities can range from complex communities like companies or
groups of scholars, to persons sharing a single, simple interest.
Already, in our society, we find that technology has allowed us to adopt
a pattern of individualism never seen in the world before. Most face-
to-face communication is with your immediate family, your co-workers,
and perhaps a few friends. These friends are not as likely to live next
to you as in a small town, and you see them less often.

E-text both carries this atomization to its extreme and simultaneously
offers a way out from its worst effects. It is possible, using the
medium, to form important relationships with persons you have never seen
or talked to--this is individual atomism in the extreme. At the same
time, E-text provides a communications medium that can go beyond. It
solves the problem, inherent in much of our society, of shallow
relationships with other humans.

These new, deep relationships can be business or scholarly, or just
old-fashioned friendship. Thus communicating well carries social
implications that go far deeper than talking well on the telephone. How
you write E-text may affect how you *appear* to potential friends,
clients, and one day perhaps even family.**

** It is only a matter of time before parents of college children
realize they can have a much closer relationship with their children for
the 10 dollars a month it costs to open an E-mail account.

Despite the unnaturalness compared to talking, in many ways E-text is
superior to the telephone as a way to "keep in touch". The telephone
requires that both persons be available simultaneously. Most
conversations are short and business-like, with marathon sessions being
reserved for close family and a few friends.

But it is not for writing the occasional personal note that one needs a
style manual. Unlike the telephone, E-mail has more serious uses--the
same uses that print media have. It is used for business, persuasion,
publication, and scholarship. E-mail may become as commonplace as
telephone, but it will not be approached with the same casualness.


Over the course of the past year or so I have seen collaborations of
individuals in many fields spring up. These collaborations at first
were of course among computer scientists. Then, in the last couple of
years the Scientists have caught on. There are signs that all academic
disciplines will soon have such collaborations. The cost in equipment
is low and the advantages great. Software for business "working groups"
is already in the marketplace.

Collaboration by E-mail--and a consequent reliance on E-text--may become
the dominant social model for certain kinds of collaboration: E.g.,
within a company or scholarly community--wherever the persons cannot
meet face to face.

There are many who say that E-text as we now know it--the typewriter-
like production of character-oriented terminals--will soon give way to a
new medium, multimedia. In this view, newer computers will spawn newer
media and the old ones will be forgotten. In five years, ten at the
most, E-text will be a thing of the past. Surely, the argument goes, we
should not invest time in perfecting a medium that is little better than
a fad.

Multimedia indeed shows great promise. I have no doubt that soon it
will be possible to mail graphic images, audio, and video clips along
with text. Printers will print not only black and white but color. And
visual formatting information like font, point size, and so on will be
sent alongside the basic text. Not only that, but these capabilities
will become part of every household, every phone system, cable system,
and cellular communications network. Personal computers will replace
telephones as the "communications center" of the household.

The vision of multimedia is one of old media--color magazines,
television, telephone, radio--being reborn in the new guise of
electronics. But what do you think will be a large component of each
and every multimedia message? Could it be that most of it will be
E-text? I think multimedia will turn out to be a lot like a letter to
home. We may send an occasional picture, or even an audio cassette, but
most of the communication will be in our writing.

Ultimately, writing is easier than taking photographs or editing video
clips--though not as easy as talking. It takes less time, less capital,
and less effort. Multimedia may be good for advertising, for writing
textbooks, and for fun; but for just plain communicating? If it
requires more thought or needs to reach more persons than a short
telephone call, it will be E-text. Multimedia will fill the niche of
four color magazines, coffee table art books, the biology textbooks, and
advertising.

Look around you, at your bookshelves, and notice how many have no
pictures. Think how many typed letters your office sends out compared
to the number of four-color brochures it creates. Most information is
disseminated by the cheapest possible means. Right now, electronic text
is that cheapest means. As more and more persons learn how to get it,
it will become the dominant medium. E-text is the black and white print
of the electronic age.

The uses of E-text are as diverse as the uses of print. The chief
innovation of the new medium is the fact that it places the capability
to publish in the hands of *anyone*. The capital required to spread
information or ideas has been reduced to a level any person, or at least
any community of persons, can afford.

The E-text revolution is that individuals are no longer dependent on
institutions or even businesses to create, share, and gather
information.** Every interest and splinter group, every church or
synagogue, every would-be author, student, or scholar can collaborate
with others, write, and share texts.

** They are still dependent on hardware, software, and
telecommunications.

As E-text becomes more and more acceptable, it will become the medium of
expression used by the masses. If you wish to reach them, you will have
to learn to write it effectively. Education--real education--has always
been a rather solitary effort. The right conditions seem to involve
access to a good library, a chance to talk with collaborators, write new
material, and have it discussed by the community of interested persons.
E-text can bring these necessary conditions for education out of the
university to the simplest home.


E-text is at the stage the European vernaculars were at the time of
the Renaissance. There were many doubters who pointed to the
established Latin tongue as the medium of communication. But, in time,
reality forced even the scholars to yield. A revolution was
accomplished in which masses of ordinary people could own books and even
on occasion produce them. The implications for society and learning
were staggering.

Like that earlier time, when print was new, there is now much innovation
and experimentation, and the wise practitioner will sift carefully the
techniques and suggestions offered both here and by others. In time we
shall have our Dantes, our Bacons, and our Shakespeares; the persons who
will show us how to make this new medium not only a utilitarian one but
a sublime one. For now, let us take those first hesitant steps down
that path.

<Part I> Writing for an E-Text Audience: Basic Problems

Writing for an E-text audience is very much like writing for a print
audience, but there are subtle differences. Nowadays, both works
destined for print and works aimed at the global networks are likely to
be created on a personal computer. The advantage of being able to make
incremental changes to a manuscript, and to create near print-quality
works with a laser printer--not to mention the advantages of spell-
checkers, automatic footnotes and the like--means that both kinds of
author will be using a computer. But one will be aiming for an
effective and attractive *printed* manuscript and the other will be
aiming to accomplish the same end on a computer screen.

The difference between E-text and print comes down to two factors:

(1) it is not currently possible to create a file that simultaneously
looks good in print and on the screen, yet is universally accepted by
all computer programs; and

(2) the least common denominator computer screen has a lower
resolution, a smaller viewing window, and a more limited repertoire of
visual effects than even a typewriter.

This Chapter will address three questions:

o Why write for an E-text audience at all?

o Is it possible to write for both audiences at the same time?

o How does writing for an E-text audience differ from writing for a
Print Audience?

In Part II we will explore the extent to which you can have it both
ways--strategies for getting as close as possible to the Holy Grail of
Electronic Communications, a file everyone can read that looks just as
good on the screen and in print.

Part III presents the mechanics of creating E-text in the form of a very
brief style manual. The issues of the previous two chapters are
summarized in a series of suggestions for creating your own effective
style.

The last Chapter of this part of the course discusses copyright issues
that affect the distribution of E-text. These issues, important as they
are for print media, become paramount concerns when copying your text is
as easy as pressing a button.

=Section 1.1= Why Write for an E-text Audience?

=Section 1.2= Is it Possible to Write E-Text and Print at the Same
Time?

=Section 1.3= Differences between E-Text and Print Media

=Section 1.4= Version Control


<Section 1.1> Why Write for an E-text Audience?

The opposing position is that computers are basically machines for
creating printed text. This position, contrary to the one taken here,
has a number of advantages:

(1) The resolution (appearance) of the final product is superior to
anything that can be created on the screen;

(2) Print media are easier to handle and browse;

(3) Paper is universally accepted and readable--no special hardware or
programs are required;

(4) The product is compatible with all the information-handling
systems that have been developed for paper (files, libraries,
catalogues, ... ); and

(5) The author's copyright is easier to maintain because the final
text is harder to copy.

These are overwhelming advantages. I call the resulting situation--
paper is the medium of storage and standard of communication while
computers and printers are just tools for creating paper--the "cellulose
interchange standard". It is well established; it works; and it is hard
to beat. It is the norm even in the computing world. The paper bias,
e.g. of word processors, is obvious.

Against this view is a reality of modern life: it is becoming very much
cheaper to store information in electronic form and comparatively more
expensive to store it as paper. Let's consider some facts:

o A 300 page book takes up a Megabyte of memory--around one 50 cent
floppy.

o CD-ROM storage lowers the cost to a few pennies for a book. A
single CD-ROM can store several hundred books.

o An 8 mm video tape can store several *thousand* books. This means
that the information in the Harvard University library system, one of
the world's largest (6 million volumes) would take up a few thousand
cassette tapes presently costing less than $100,000.
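
As a rough check on these figures, here is a minimal sketch in Python.
The one-megabyte book, the 6 million volumes, and the $100,000 ceiling
come from the list above. The capacity of 2,500 books per tape is an
assumption standing in for "several thousand", and the names used are
illustrative only.

```python
# Back-of-envelope check of the storage figures quoted above.
# From the text: about 1 megabyte per 300-page book, 6 million volumes,
# and a total tape cost under $100,000.
# ASSUMPTION (not from the text): one 8 mm tape holds about 2,500 books,
# a stand-in for "several thousand".

MB_PER_BOOK = 1
VOLUMES = 6_000_000
BOOKS_PER_TAPE = 2_500        # assumed value
BUDGET_DOLLARS = 100_000


def tapes_needed(volumes: int, books_per_tape: int) -> int:
    """Number of tapes, rounding up so a partly filled tape counts as one."""
    return -(-volumes // books_per_tape)


total_megabytes = VOLUMES * MB_PER_BOOK
tapes = tapes_needed(VOLUMES, BOOKS_PER_TAPE)
max_price_per_tape = BUDGET_DOLLARS / tapes

print(f"Total size: {total_megabytes:,} MB")
print(f"Tapes needed: {tapes:,}")
print(f"Implied price ceiling per tape: ${max_price_per_tape:.2f}")
```

Under that assumption the library fits on 2,400 tapes, which agrees
with "a few thousand"; a different capacity changes the count in
proportion.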

Given that, in addition, you can

revise electronic text easily,

make copies faster,

send it further in less time and at less expense,

store it more cheaply,

print it out,

send it as a fax,

and convert it to other formats,

it will soon be a commonplace that *even for documents that are designed
to be printed out and looked at on paper* the principal means of storing
and exchanging information will be in electronic format.

So let's get this straight: we are not discussing whether it is better
to store and exchange information on paper or electronically. The bulk
of information will soon be stored on magnetic media and exchanged
electronically. What we are discussing is whether it makes more sense
to prepare *electronic* documents that look good printed out or ones
that look good on a screen.

Paper will become (in fact already is) a luxury reserved for the cream
of the information crop--just as four color printing is reserved for Art
books, glossy magazines, and advertising, while most other printed
information is black and white. You will want the 1% of your
information you use most often in print form. You won't be *able* to
get or afford print versions of most information, any more than you can
afford to buy everything in hard cover or print every brochure in four
colors.

And is this so bad? Why not have a good print library *and* good
electronic text? My public library, a very good one, has around 100,000
volumes and cost several millions to build. If I want something they
don't have, I have to wait a week for interlibrary loan, or a xerox
copy, or a fax. In most cases I would be happier with an electronic
version I can look at *today*.

The point is that paper doesn't compete with E-text; E-text is probably
information you *would not have* in any other form. It's books you
wouldn't have because you can't afford 1000 books right now (but you can
afford a couple of CD ROMs); it's text you can view (and download) at
your public library that the library couldn't afford in paper; and it's
free stuff that can't be distributed for free any other way, because
paper just costs too much.

Do you have to write E-text? You do if you want your audience to
include the 25 million people with E-mail access (projected to be 75
million in three years). You do if you want your message to travel as
far as possible--even if it is intended to be printed at the other end.

So you will write electronic text that will never be printed because you
*have to*. That means you need to learn to write effective E-text,
because there really is no alternative. Fortunately, if you can write
well in the print medium, you can write well in E-text. We'll give you
a few tips in a moment, but first we need to dispel the notion that it
is possible to write for both media at the same time.


<Section 1.2> Is it Possible to Write for E-text and Print at the Same
Time?

Here we come to the claim I made above, that it is impossible to satisfy
all three of these criteria with a single file:

(1) the file can be read by any computer

(2) the file creates good looking print

(3) the file looks good on the screen

You can pick two out of three, but you can't get all three. This is
unfortunate, but it is also true.

The only kind of file that is *universally* accepted is the plain text
file, also called the common =ASCII= file. Actually, even this is an
overstatement. ASCII, the American Standard Code for Information
Interchange, is a very specific code for representing text. A
*fraction* of that code can be translated without difficulty to
virtually all computers, including the fifty-two letters of the English
language (both upper and lower case), the ten digits, and a handful of
punctuation marks.** But the rest of ASCII--some punctuation marks and
special "control" characters used by computers (including the common
tab!)--are off-limits if you want your message to have a truly global
reach.

** It is important to remember that some files that need the full
ASCII repertoire, e.g. the source code of computer programs, may not
travel well.
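
One way to act on this is to scan a file for characters outside the
portable fraction before sending it. The sketch below is in Python and
rests on an assumption: the text says only "a handful of punctuation
marks", so the punctuation list here is a guess made for illustration,
and the function name is hypothetical.

```python
import string

# ASSUMPTION: this punctuation list is illustrative only; the text says
# "a handful of punctuation marks" without naming them.
ASSUMED_SAFE_PUNCTUATION = ".,;:?!'\"()-"

# Letters, digits, the assumed punctuation, the space, and the line break.
PORTABLE = set(string.ascii_letters + string.digits
               + ASSUMED_SAFE_PUNCTUATION + " \n")


def find_nonportable(text: str) -> list[tuple[int, int, str]]:
    """Return (line number, column, character) for every character that
    falls outside the assumed portable set. Both numbers are 1-based."""
    problems = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch not in PORTABLE:
                problems.append((line_no, col, ch))
    return problems


sample = "Plain text travels well.\n\tA tab does not."
for line_no, col, ch in find_nonportable(sample):
    print(f"line {line_no}, column {col}: {ch!r}")
```

The sample reports the tab at the start of its second line. Widening
or narrowing the punctuation list is the obvious adjustment for a
particular audience.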

Anyway, if you want a file that looks good in print (criterion 2) and is
also a plain text file (criterion 1), you *have* to give up criterion 3
--the file will not look good on the screen. This is because creating
"laser quality" output with a "book quality" appearance requires printed
commands, called =markup=, to be interspersed with text. This can be
minimized, but the cost is a text without even the visual effects that
are possible with a typewriter--underlining, superscripts and
subscripts, and diacritic marks to name a few. If you want these
effects and others typical of book-quality printing--multiple fonts,
automatic footnotes, and so on--then the markup burden makes the file
unpleasant to read, i.e. not effective as E-text.

Finally, the question of how to get as close as possible to satisfying
all three criteria, and a discussion of formats and markup, are left to
the next chapter. There has been some success at creating files that look
good on the screen and in print--using so called WYSIWYG ("what you see
is what you get") word processors or using SGML ("Standard Generalized
Markup Language")--but these are not in universal use. I.e., you have
to give up criterion 1 to get numbers 2 and 3.

So right now the plain truth is you can have any two out of three but
not all three. Sorry.


<Section 1.3> Differences Between E-text and Print Media

Creating a manuscript on a computer is quite a different process from
the old fashioned method of revising a manuscript by (literally) cutting
and pasting typescript, of maintaining bibliographies, and of checking
spelling against a dictionary. Most of the peculiarities of using a
computer are true whether or not the output is meant to be the printed
page. Nevertheless, it helps to enumerate a few, since these
considerations apply in spades to producing E-text:


o Computer aided Research and Organization Methods

Note taking, creating bibliographies and databases, and gathering
information now involves all the techniques discussed in my course
notes, _EMAIL 101_.**

** Available free from:

ftp://mrcnext.cso.uiuc.edu:/etext/etext93/email025.txt.

o Rigidity of Format and Outlining

Word Processing programs enforce a formatting and outlining discipline
to a degree that would be unusual in the old style. Outlining
encourages a strict hierarchical style and the automated formatting
features make for a more rigorous observance of whatever conventions are
built into the program used.

Rigorous formatting is a virtual requirement for E-text, since it is
otherwise impossible for programs to tell where a chapter starts, say,
or what portions of the text are italicized.

Spellchecking programs are another example of (welcome?) rigidity of
style imposed by the new methods. Rigorous spelling is what enables the
SEARCH command to find all references to a given subject.

o Incomplete drafts are more likely to be circulated

The ease of making changes leads to a more collaborative style of
working in which draft after draft (not uncommonly 10 or 20) is
circulated to a large group for comments. Often documents are *never*
final, but are instead continuously revised. It is useful to compare
this process to the way computer programs are written:

First a trial version, or "alpha" version is circulated to a few
select individuals.

Next a beta version, mostly complete and supposedly correct, is given
wide circulation as a trial balloon.

Finally there is a succession of ever more refined upgrades ranging
from minor changes to major "releases".

It is a good guess that most working documents will be produced in a
similar way. In a way, this is similar to the print industry's
"editions" and "printings", except, like the manuscripts themselves,
there is more consciousness of the structure of the process when
computers are involved. Also, the cost of producing minor revisions is
less, so there is less fanfare for a new edition--and more trouble with
version control!

o Collaborative efforts are easier.

When the drafts we have been discussing are circulated by E-mail, the
working style discussed above becomes even more natural.

o Backup copies are necessary

Although one might take the precaution of xeroxing an important
manuscript, failing to make backups of works stored in magnetic media is
sheer folly for anything that takes longer than 15 minutes to write.
There is a whole new discipline of saving work frequently to disk,
copying it to backup floppies or tape, and so on.
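
The remark above about rigorous formatting, that a program cannot
otherwise tell where a chapter starts, can be made concrete with a
short Python sketch. It assumes the heading convention visible in this
file, a tag in angle brackets at the start of a line followed by a
title; the function name is hypothetical, and other documents may well
use another convention.

```python
import re

# Matches a line that opens with a tag in angle brackets followed by an
# optional title, e.g. "<Section 1.1> Why Write for an E-text Audience?"
HEADING = re.compile(r"^<([^<>]+)>\s*(.*)$")


def find_headings(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, tag, title) for each heading line found.
    Line numbers are 1-based; the title may be empty."""
    found = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            found.append((line_no, match.group(1), match.group(2).strip()))
    return found


sample = (
    "<Part I> Writing for an E-Text Audience\n"
    "\n"
    "Some body text.\n"
    "\n"
    "<Section 1.1> Why Write for an E-text Audience?\n"
)
for line_no, tag, title in find_headings(sample):
    print(f"{line_no:>4}  {tag}: {title}")
```

A heading typed with a stray leading space or a missing bracket is
silently skipped, which is exactly why the discipline matters.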


<Section 1.4> Version Control

The problem of multiple versions is a big one any time the revision
process is easy or frequent. Most computer systems keep track of the
date a file was last modified--so you can tell which of seven files.**
But even time stamps won't help if some files are exact copies of
others--as they should be if you are doing proper backups. It helps to
use version numbers like "3.1.5a" to distinguish the multiple copies.

** (three on various floppies, one in a directory "/project/old" and
two in directory "/project/new", is the most current)

As with any tree structure,** it is often good to use =dotted decimal=
notation: Version 5.18.2 means release 2 of minor revision 18 of major
revision 5. Version 0.1 is probably a rough draft.

** This concept is discussed in part III of my course, _EMAIL 101_.

You have to be careful: this notation can either represent successive
versions or divergent versions. For example, 1.4.3 can mean the third
minor change to version 1.4, which was the fourth major change to
version 1. This is the most common scheme. It provides an
odometer-like method of numbering the versions. It differs from an odometer in
that you are not forced to increment the next place when you get to the
tenth revision. As long as the revision path is a straight line, with
each version being derived from the version before it, this scheme will
work.

It gets into trouble if there are any branches in the revision path.
Suppose two versions, [a] and [b], are both derived from 1.2. Does
1.2.1 refer to [a] and 1.2.2 to [b]? This is a natural way to describe
*branching* versions, i.e. with a tree notation, but you can't use both
schemes simultaneously.
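
As a rough illustration of the straight-line scheme, here is a minimal
Python sketch that orders dotted decimal versions place by place. The
function names are my own invention, and the sketch assumes purely
numeric places, so a version such as "3.1.5a" would need extra
handling.

```python
def parse_version(text):
    """Split a dotted decimal version such as "5.18.2" into integers."""
    return tuple(int(part) for part in text.split("."))


def newest(versions):
    """Return the latest version, comparing place by place."""
    return max(versions, key=parse_version)


# Places are compared as whole numbers, so 1.10 follows 1.9;
# nothing forces a carry into the next place at the tenth revision.
print(newest(["1.4.3", "1.9", "1.10", "0.1"]))
```

Nothing in this comparison can tell successive versions from divergent
ones. That distinction lies in how you assign the numbers, not in the
numbers themselves.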

It's a good bet that Version Control software--programs that keep track
of multiple versions, store them as "deltas", or difference files, to
save space, and allow you to recover *any* past version or display
differences between versions--will become more common and integrated in
word processing software.
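
To give a feel for what a "delta" is, here is a small sketch using
Python's standard difflib module. It is only an illustration: the two
sample versions are made up, and real version control programs store
far more compact difference files than this line-by-line listing.

```python
import difflib

old = ["The quick brown fox\n", "jumps over the lazy dog.\n"]
new = ["The quick brown fox\n", "leaps over the lazy dog.\n"]

# The delta marks each line as kept, removed ("- "), or added ("+ ").
delta = list(difflib.ndiff(old, new))
print("".join(delta))

# Either version can be recovered from the delta alone.
recovered_old = list(difflib.restore(delta, 1))
recovered_new = list(difflib.restore(delta, 2))
```

The point is that one stored file plus a chain of such deltas is enough
to recover any past version.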

<Part II> Specific Differences of Style and Mechanics

This part enumerates some of the differences between E-text and print
media and discusses them in a general way. Actual recommended practice
is deferred until Part III, which takes the form of a conventional style
manual.

In the long run, the reader will find the material in this part more
valuable than the style manual. The manual, after all, is only one
possible concrete realization of the principles discussed here. It is
better to give thought to these principles in the context of your own
writing than to slavishly follow the manual.

=Section 2.1= Differences Traceable to Physical Media

=Section 2.2= Differences in Style

=Section 2.3= Differences in Process

=Section 2.4= Differences in Repertoire

=Section 2.5= Differences in Layout

=Section 2.6= Searching and Hypertext

=Section 2.7= Copyright Issues

=Section 2.8= The Parts of a Book

=Section 2.9= The General Theory of Markup (SGML)

=Section 2.10= Summary: Basic Tricks of the Trade


<Section 2.1> Differences Traceable to Physical Media

The basic differences between E-text and print can be traced to the
physical differences of the media, and to the fact that the needs of the
human reader and computer must coexist.**

(1) The human's need for visual relief within a 24 line frame.

(2) The computer's need for a rigid hierarchy and consistent
spelling.

(3) The limitations of the =character set= available.

(4) The limited possibilities of different renderings of the
characters, e.g. by font and placement on the page.

(5) A consequent dependence on =delimiters= and structure for
rendering.

Taken together, these factors account, in the first instance, for most
of the differences that there are between E-text and print. In this
Part we will primarily be working out the implications for representing
text and developing techniques for dealing with the limitations.

** I thank Michael Hart for pointing out this second requirement to
me.

The small viewing window of E-text--commonly 24 lines and often less--
has a number of consequences. Combined with the fact that moving around
within a document requires one of:

scrolling (moving a scroll bar with a mouse);

paging (hitting a single key, such as "return", repeatedly); or

searching (using special commands to find sequences of characters);

we can see why E-text is hard to navigate, or, as I say, E-text is less
perspicuous than print. I think this limitation of the medium is a
greater bar to its widespread acceptance than visual resolution.

This limited window and lack of perspicuity have a number of immediate
consequences for writing style:

(1) paragraphs must be short enough to present at least one break in
any given 24 line window.

For practical purposes paragraphs much over 10 lines are anathema. This
means that the flow of thought must be broken up on a finer scale than
is common in print--though not, perhaps, as radically as in newspapers.

(2) E-text is much more linear than print.

Signposts, such as enumeration and other cues, organization, and
arrangement in sequence are much more critical. The trick is to
structure your argument so that the mainline reader can read it in
sequence. Side-trips have a much higher penalty for the reader than in
print.

This statement, that E-text is more linear than print, seems to go
against the promise of "hypertext", i.e. documents in which you can skip
around to your heart's content. In fact, it is precisely because E-text
is so linear that hypertext is important. It makes navigating E-text
manageable.
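
The paragraph guideline above is easy to check mechanically. The
following Python sketch reports paragraphs that run past a chosen
number of lines. It assumes that a paragraph is simply a run of
non-blank lines, and the function name and the sample text are
hypothetical.

```python
def long_paragraphs(text, max_lines=10):
    """Return (starting line number, length) for paragraphs over max_lines.

    A paragraph is taken to be a run of non-blank lines.
    """
    found = []
    start, length = None, 0
    # A trailing blank line flushes the final paragraph.
    for number, line in enumerate(text.splitlines() + [""], start=1):
        if line.strip():
            if start is None:
                start = number
            length += 1
        else:
            if start is not None and length > max_lines:
                found.append((start, length))
            start, length = None, 0
    return found


sample = "short paragraph\n\n" + "a long line of text\n" * 12
print(long_paragraphs(sample))
```

Run over a draft, the result is a list of places where the flow of
thought probably needs another break.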

The high penalty for skipping around (or passing through long sections)
has a number of other implications:

(1) tables of contents should be distributed throughout the text, as a
sort of preview of the following section.

In effect, these tables become "hypertext menus", allowing the reader to
locate the appropriate section with a SEARCH command. This gives as
much aid as possible to the reader. However, if the text is long and
there are many logical levels, then the full table of contents should be
provided at the *end* of the document. (Not the beginning! We don't
want the reader to have to scroll past a very long table to begin
reading.)
The Table of Contents is discussed at length in =Section =.

(2) footnotes should be located immediately after the paragraph to
which they refer.

An E-text is logically a scroll. There is no such thing as a "page",
except as an arbitrary marker added to synchronize the E-text version
with a print version. Because of the small viewing window, the only
place you can put a footnote is after the paragraph. In effect, it
becomes a "small print" section with added detail.

(3) bulleted lists should be relatively short and should not turn into
full-fledged tables.

Instead, they should be broken up into sub-lists if possible, with no
more than ten items in one run. Tables and long lists should be placed
in appendices or separate files unless they are exceptionally compact or
unless viewing them is necessary to the flow of ideas.


<Section 2.2> Differences in Style

The most marked characteristic of E-text style is brevity. We have
already commented on brevity of paragraphs. The same can be said for
the overall work. 200k, or about 70 printed pages, is already quite
long. A larger work should probably be broken up into 100-150k
segments.

Two other stylistic characteristics are =hierarchy= and =rigidity=.
Given that computers, in popular culture, are often associated with
mindless authority or fascism, these are not promising characteristics
for the would-be writer of E-text. The words "hierarchy" and "rigidity"
are just convenient labels, however. We could use more complimentary
terms, such as "logical organization" and "consistency of style".

In any event, the hierarchy and rigidity apply to the formatting and not
the ideas expressed. Qualities such as brevity, an organizational
structure that helps the reader, and consistency in spelling, grammar,
punctuation, and layout, are generally accounted hallmarks of good
style. In fact, there is a trend in print media towards these
qualities, as well as towards shorter paragraphs, perhaps occasioned by
the widespread use of computers for preparing printed text.

It might be said that the style advocated is essentially that of
journalism and the classic pyramid scheme for writing newspaper
articles. This is true to a point. E-text, however, is much more
linear than a newspaper article. Above the article level the typical
newspaper is a jumble of many articles bundled together in a very large
package. The E-text equivalent of a newspaper will almost certainly be
a large number of separate files, indexed and arranged in a directory
hierarchy.** Long files purporting to be E-journals are very tedious to
read, precisely because they violate the brevity maxim.

** In fact this is the case with Usenet Newsgroups.


Another stylistic difference is repetition. Saying the same thing
in different contexts, even verbatim, is more acceptable in E-text than
in print. Since it is harder to navigate E-text, repetition** saves the
reader time looking up references. Material that is repeated in
several places is a good candidate for a footnote or "small print"
section.

** Repetition is a technique widely used in computer programming to
save the time needed to follow up a reference. In this context it is
called "in-line coding".

On top of the major stylistic differences, there are numerous minor
points of grammar and markup (punctuation) that are covered in Part III.
These are almost at the quirk level, and have little effect on style
=per se=, so we don't consider them here.


<Section 2.3> Differences in Process

Electronic text and printed text created on computers are prepared in a
different fashion from print. E-text typically passes through more
stages and is in a rougher form than print. This does not prove that
print is a superior medium because the product is more polished; rather,
the capital investment required to produce *any* edition is so high that
intermediate drafts are too expensive to circulate. E-text creation is
more collaborative and not punctuated by such monumental milestones as
"first draft to printer" or "second edition". The stages tend to be so
incremental as to blend into each other.

Thus, "publishing" an incomplete or rough draft is appropriate for E-
text. The medium seems to invite statements that "this section is under
construction". I call this the =cathedral model= of text production. A
premium is placed on the execution of one's art, collaboration among
successive "generations", and grand design, but the product itself is
never really finished. The stages of the E-text production process are
discussed at greater length in =Section 2.3=.

The pervasive sense of hierarchy in E-text affects the writing process.
You might think that the rigid hierarchy leads to a top-down process in
which each section is outlined in excruciating detail and the writing
fills in the gaps. In fact, the actual process is a combination of this
and a bottom-up one in which sections are created piecemeal and tacked
together as ideas emerge. The ideal working style, like that of
building a cathedral, works from both ends. There is both a grand
design (far more ambitious than what the author can produce at the
moment) and whole sections that are created of a piece. Unlike
cathedrals, the parts can be re-organized with ease after construction.


There is another respect in which the E-text production process
differs from its print analogue. In E-text, self-publishing is the
norm. The low capital investment required to create the text, in both
equipment and training, points to self-publishing as the most
economical distribution method. The traditional segmentation into
author, publisher, printer, distributor, follows the logic of the print
production process. E-text needs only an author and a distributor--the
distributor being a friendly archive site or bulletin board.

Print media can use this same simplified distribution scheme *if* it is
in electronic format. It is important to differentiate between
distributing E-text and distributing files that are intended to be
printed. The latter are likely to have special markup, commands, or
formatting codes. Often they are binary (i.e. not text) files.


Since E-text is easily copied, far more so than text locked up in
proprietary formats, it presents a problem for compensating the author.
There are four suggested compensation schemes:

(freeware model) no compensation--the text is either in the public
domain or copyrighted but with a license for free distribution

The advantage of this model is that the work gains the widest possible
distribution. Without fee, license, or undue copyright restriction, the
work travels wherever it is wanted.

(shareware model) distribution is unrestricted but there is a
licensing fee for use.

This is an elegant solution to the compensation problem. Its reliance
on the honor system has drawbacks, however.

(proprietary model) distribution is restricted by licensing and
copyright.

This is the common method for distributing commercial software. In
effect it assimilates E-text to print media by artificially taking away
the natural ease of copying E-text.

(patron model) the work is commissioned and paid for by a patron--a
university, government, or other buyer. Since the work is paid for by
the patron, distribution can be free or by any of the other methods.

In fact, the patron model is the common one, since royalties. Thus cries
that free distribution of E-text will destroy intellectual property
have little merit. In fact, except in the commercial world,
intellectual property has little market value and is almost always a
public, not a private, good.


<Section 2.4> Differences in Repertoire

In addition to physical, stylistic, and process differences, E-text has
a different repertoire of visual techniques--and consequently different
problems. The major problem is the limited number of characters.
Unlike even the typewriter, E-text is limited to letters, numbers, and a
few punctuation characters *in a single font*. Print-oriented word
processors eliminate these restrictions, of course, but they remain for
E-text.

In addition, the visual effects are more limited even than the
typewriter's. Super- and subscripting are not possible, and certain
layouts involving lots of vertical space are ill-advised.** Finally,
graphic images are presently hard to include with text--at best they are
separate files distributed with the text and viewed with difficulty--and
such visual effects as parallel columns, and tabular layout do not work
well. They are not very robust in the E-text environment.

** more on this below, in =Section 2.5=.

The solution to the character repertoire problem is to extend the
character set by a number of techniques:

o escape characters,

o delimiters, and

o tags.

An =escape character= is a rarely used character, such as the ampersand
or percent sign, that indicates the next character or characters is not
to be interpreted literally but as a symbol for some other character.
In effect, it acts as a sort of shift key to shift the character set.
Thus, "&e" might represent a Greek epsilon instead of an English [e].
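
A minimal Python sketch of this "shift key" idea follows. The escape
table is purely hypothetical (the text above fixes only the "&e"
example), and the function name is my own.

```python
# Hypothetical escape table: "&" shifts the next letter into Greek.
GREEK = {"a": "\u03b1", "b": "\u03b2", "e": "\u03b5"}


def decode(text, escape="&"):
    """Replace each escape sequence with the character it stands for."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == escape and i + 1 < len(text) and text[i + 1] in GREEK:
            out.append(GREEK[text[i + 1]])
            i += 2  # skip the escape character and the letter it shifts
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(decode("for every &e > 0"))
```

An ampersand followed by a letter that is not in the table is passed
through untouched, which is one simple way to keep ordinary uses of the
escape character readable.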

=Delimiters= are pairs of characters used to mark off text. The equals
signs I have been using in place of italics are delimiters. So are the
asterisks I use if I *really* want to emphasize something. Delimiters
are so-called because they serve to "delimit" the text they enclose.
This strategy, widely used in E-text, replaces *rendering* by
*delimiting*.
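
Because delimiters come in matched pairs, a program can pick out the
text they enclose. Here is a rough Python sketch for the two pairs used
in this manual. It is deliberately naive: it assumes a delimited span
stays on one line, and it would misread ordinary equals signs or
asterisks (as in "x = 1 and y = 2") as markup.

```python
import re

# The two delimiter pairs used in this manual: =italics= and *emphasis*.
DELIMITED = re.compile(r"=(?P<italic>[^=\n]+)=|\*(?P<emphasis>[^*\n]+)\*")


def find_delimited(text):
    """Return (kind, enclosed text) for each delimited span, in order."""
    return [(m.lastgroup, m.group(m.lastgroup))
            for m in DELIMITED.finditer(text)]


line = "The =cathedral model= is *never* really finished."
print(find_delimited(line))
```

A rendering program could use the same scan to substitute italics or
boldface where the output device supports them.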

A final technique is =tagging=. Tagging is discussed at length in
=Section 2.9= on markup. It extends the repertoire of delimiters by
combining delimiters and escape characters in a construct called a
<tag>. The tag is a logical unit that indicates an entity or logical
unit ("element") in the text.

The character repertoire problem becomes most acute when different fonts
or formulae are needed. Fonts are effectively handled by the techniques
discussed above, but formulas are a very sticky problem. Probably the
only solution is to realize that the notation we use for formulas grew
up in the handwritten environment. It has been brilliantly adapted to
print, but its adaptation to E-text is new and awkward. All we can do
is let notation for formulas occurring in E-text evolve *without
reference to their print analogues*. The solution is not to

(a) give up and wait for multimedia; or

(b) use print-oriented markup as an interim solution.

Programming languages have of necessity experimented with representing
mathematical formulas. As E-text communication becomes more common,
conventions *will* evolve that are elegant and empower, rather than
hinder, communication. Some suggestions (and they are only that) for
mathematical notation are contained in Part III.


<Section 2.5> Differences in Layout

Layout of text on the page is one of the major differences between E-
text and print media. Naturally, this consideration is dominated by the
small viewing window of E-text. In E-text, the paragraph, not the page,
is the fundamental frame of reference for the reader.

o footnotes, as mentioned above, should be placed at the foot of
their *paragraphs*;

o manifestations of hierarchy at the chapter level or above do not
need the differentiated rendering (special indentation, typefaces,
capitalization, and the like) that they have in print media.

Instead high-level headings are optimized for searching, using a
consistent numbering scheme such as dotted decimal (e.g. 3.5.2)--or else
replaced by breaking the document into separate files.


Vertical and horizontal space is less important visually, because
the reader is conceptually "closer" to the text and unable to appreciate
such effects as indentation and vertical spacing. In particular:

o paragraphs should not be indented except to mark structural
features such as:

list items;

sub-paragraphs;

"small print"; and

minor section breaks.

Minor section breaks should have a larger indent than list items, to
distinguish the two.

o vertical spaces beyond five blank lines or so are an annoyance.

o lines printed with deep indentation in print media, e.g. letter
signatures, date and place of writing, and run-on lines in poetry,
should use some other device to set them off.

o unlike print, the visual effect of a block of text carries less
weight in E-text. Consequently you should not go to great
efforts to block text by hand like this paragraph--your
efforts will be wasted in a proportional font anyway.

Just as pushed margins are to be avoided in E-text, so attempting to
line up blocks of text in list items should be avoided. While this
visual effect works well in print, it is actually harder to read in E-
text.

On the whole, vertical and horizontal spacing merges with formal markup
in E-text, so that it becomes just one more way of delimiting text. Its
role in creating visually pleasing forms is very muted in E-text. Since
the reader is so close to the "painting", the effect, which depends on a
certain distance, is lost. E-text is not a medium that lends itself to
impressionism.

Combining the "white space" role with the delimiting role is very much an
art. Functionality and minimalism are the main virtues of this art.
Mostly, it is a matter of being sensitive to the different needs of E-
text and print, and avoiding elaborate markup that mimics print
techniques that have little meaning for E-text.


Tables, multiple columns, and the like do not adapt well to E-text.
Although you might think that E-text is the medium =par excellence= for
tabular material, tables--perspicuous as they are in print--are very
difficult to navigate in E-text. They tend to be long *and* wide. This
is especially true of double-spaced tables common in typewritten text.
Also, E-text tables are difficult to transport and maintain, since
whitespace is the most unstable part of E-text. Various programs may
trim, condense, and reinterpret spaces, tabs, and returns.

A far better solution to viewing tables is to treat them as
spreadsheets. Spreadsheet programs, unlike word processors, are
optimized for viewing tables. I would rather have a table in Comma
Separated Value format that I can cut and paste into a spreadsheet
program than one formatted with spaces.**

** Admittedly, some, but not all, spreadsheet programs can handle
space formatting.
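
Here is a short sketch of writing and reading such a table with
Python's standard csv module. The rows are made-up sample data. Note
how a field that itself contains a comma is quoted automatically.

```python
import csv
import io

# Made-up sample rows; the first row holds the column names.
rows = [
    ["version", "location", "note"],
    ["1.2", "/project/old", "first draft, superseded"],
    ["1.3", "/project/new", "current"],
]

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
text = buffer.getvalue()
print(text)

# Reading the text back recovers exactly the same fields.
assert list(csv.reader(io.StringIO(text))) == rows
```

Since the fields are marked by commas rather than by alignment, the
table survives programs that trim or condense runs of spaces.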

If you are tempted to include a long table in E-text, try to observe the
following:

o Put tabular material in appendices or in a separate file so the
reader is not forced to traverse it.

Last ditch: tell the reader how to jump over it, if it absolutely must
interrupt the flow of text.

o Redesign the data structure of the table so that it is as narrow as
possible, e.g. by breaking it into several logical units--sub-tables--
that can be related by an =index= or =key= column.

o Tabular material should have field delimiters other than spaces.
Commas are something of a standard, as are tabs, if portability is not
an issue.

Very often, a table that looks good in print has to be redesigned
altogether for E-text. You should constantly ask yourself *why* the
table is effective.

Does it have to be a table at all or is it really a list in disguise?

Does the tabular arrangement make the right comparison?

What is the main relationship a user will look for in the table?

One example of an organizing principle useful in print but less so in E-
text is alphabetical ordering. An alphabetical list is very effective
in print because it aids searching. It is also effective in E-text that
has to be *modified by hand*. But it is not effective in E-text that is
meant to be searched, because it gives up the chance for an alternate
organization of the material.

Besides tables, layout effects such as =parallel columns= should be
avoided altogether in E-text. The likeliest result is that the text
will be corrupted and rendered unreadable by a program somewhere along
the line.

E-text has very different visual needs from print. These are strongly
reflected in layout design. Writing visually appealing E-text requires
a conscious effort to meet the needs of the E-text medium on its own
terms. Whitespace, markup, and structure are all handled differently in
this new medium.


<Section 2.6> Searching and Hypertext

We have already discussed the SEARCH capability of E-text on a number of
occasions. In the present section we tie some of these strands
together, the most important of which is that

The author must be constantly aware of the need of the reader to
navigate their text by searching.

This imperative leads to a pervasive tendency in E-text: all manner of
references, cross-references, and indexing are replaced by a single
concept, the =pointer=. The pointer is a sequence of characters that
allows the reader to find the reference. This reference may be in the
present file, somewhere else in the same computer system, or in print.
In print, pointers take the following forms:

o cross-references (e.g., See page 37. See also "Dinosaurs").

o glossary and index references

o bibliographic citations

o mailing and telephony addresses

o subject classifications and shelf locations

To these the electronic medium (including E-text) adds:

o network and other information retrieval references

o hypertext links and menus

In E-text, the mechanism of pointing is the same for all these
categories, and =consequently the syntax should be the same also=.
Merely mimicking print forms of expression, with their elaborate
formatting rules for footnotes and bibliographies, obscures the
underlying unity of the "pointer" notion. In print, the visual
differentiation cues the reader in to the process required to resolve
(look up) the reference.

In E-text, great efforts should be expended to make the lookup process
the same for all manner of references. The main practical distinction
is between internal references and external ones. Part III discusses
this topic in greater depth.
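
As one possible sketch of a uniform lookup for internal references,
the Python function below resolves a pointer by searching for the
matching heading. It assumes the conventions of this manual, namely
that a pointer is written =Section 2.6= and that the heading line
begins <Section 2.6>. The function name is hypothetical.

```python
def resolve(pointer, lines):
    """Return the 1-based line number of the heading a pointer names.

    A pointer written =Section 2.6= is taken to refer to the heading
    line that begins <Section 2.6>. None means the target is external
    or missing.
    """
    target = "<" + pointer.strip("=") + ">"
    for number, line in enumerate(lines, start=1):
        if line.startswith(target):
            return number
    return None


document = [
    "<Section 2.5> Differences in Layout",
    "more on this below, in =Section 2.6=.",
    "<Section 2.6> Searching and Hypertext",
]
print(resolve("=Section 2.6=", document))
```

A result of None is the natural place to fall back on whatever
procedure handles external references.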


<Section 2.7> Copyright issues

The most prominent characteristic of E-text, the ease with which it is
copied, leads to endless copyright headaches. Even the simplest E-text
is likely to sport a copyright, even if the author wishes to distribute
it for free, since otherwise who will know that it's for free?

Here we just present a few basic copyright concepts:

o Everything you "fix in a medium" (e.g., type into E-text) is
copyrighted, whether or not you have a notice, which merely announces
the fact that you have a copyright; or whether you have a registration,
which is legal evidence of your rights.

o *WHO* has the copyright is complicated. Usually the author; but it
could be their employer.

o Copyrights include the right to (1) copy, (2) distribute, (3)
display publicly, and (4) create derivative works. For other rights,
such as the right to sell these rights to other people and so on,
consult a legal manual.

o Copyrights, claimed or otherwise, remain in effect for a *long*
time. The "public domain" ends around 1917, with rare exceptions. If
you *place* your work in the public domain, that's another matter.

o A compromise between retaining all your copyrights and placing your
work in the public domain is a "freelore copyright"++ like this
manual's. You retain a copyright but let others copy and distribute
your work for free.

This is the preferred approach unless you think your work has commercial
value--or if you want to restrict distribution. It lets the work
circulate widely and, most importantly, gives permission to do so
without losing the work to the public domain.

You cannot use this method if you want others to be able to produce
"derivative works", however. For that, the Public Domain is your only
choice.**

** You could try to write an elaborate general public license, but
with few exceptions it is not worth it. Software source code and
educational curricula are likely exceptions to this rule.

++ The term "freelore copyright" is not a legal term. In programming
circles you will sometimes hear it called a GNU-like copyright, after
the GNU project, the first programming project to make extensive use of
a non-restrictive copyright for copyrighting software.


<Section 2.8> The Parts of a Book

In this section we take a brief tour of the typical book and make a few
observations along the way.

The front matter of an E-text differs somewhat from its print cousin,
the main virtue being brevity. No reader wants to scroll through page
after page of apparatus to get to text. In a book, it makes a great
deal of sense to put tables and reference material at both ends of the
book, these being the places one can find most easily. They are also
easy to reach in E-text, but most likely the reader wants to begin
reading quickly, so the front of the work, at least, is forbidden
territory.

An E-text should have the following frontmatter:

o cataloguing information (the title, author's name, preferred name
for the text file, subject classification, and how to get an electronic
copy, since the reader may, after all, be looking at a printout);

o an advertisement, abstract, or teaser to entice the reader;

o copyright information or terms of use (if too lengthy these should
be placed at the end with a pointer to them after the copyright
statement itself);

and THAT'S ALL. Do not ask your reader to scroll through more than
this. Tables of Contents and the like belong in appendices at the end
or in another file.

Whether or not you have an official Table of Contents or other indexing
material in an appendix, you should, at the beginning of each major
division, have a list of the contents of that section. You can think of
these as "menus". A merged version of all these local menus is needed
so the reader does not have to search through the entire document to get
an overview; neither should the reader have to scroll back and forth to
the beginning or end for help navigating. E-texts thus always have
*two* tables of contents.

The body of E-text is much like that of a print work, except for
comments pertaining to length, the pervasive sense of hierarchy (no more
than three local levels!), and the placement of footnotes after their
paragraphs.

The endmatter is likely to contain tables and bibliographies and
multiple indices, the *very last* of which is the Table of Contents.
The end of an E-text file is a very special place, because it is an easy
place to find; yet, unlike the front, few readers start there. It
should thus be the location of the most important navigation aid for the
document. Normally this is a full hierarchical list of the document's
contents with pointers back to the text.
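
If headings follow a consistent pattern, the merged table can be
generated rather than typed. The Python sketch below assumes headings
of the form used in this manual, <Part II> or <Section 2.1> at the
start of a line. The function name and the line-number pointers are my
own choices for illustration.

```python
import re

HEADING = re.compile(r"^<(Part|Section) ([^>]+)>\s*(.*)$")


def table_of_contents(lines):
    """Collect (label, title, line number) for every heading line."""
    entries = []
    for number, line in enumerate(lines, start=1):
        match = HEADING.match(line)
        if match:
            label = match.group(1) + " " + match.group(2)
            entries.append((label, match.group(3), number))
    return entries


sample = [
    "<Part II> Specific Differences of Style and Mechanics",
    "This part enumerates some of the differences ...",
    "<Section 2.1> Differences Traceable to Physical Media",
]
for label, title, number in table_of_contents(sample):
    # Sections are indented under their part to show the hierarchy.
    indent = "" if label.startswith("Part") else "   "
    print(f"{indent}<{label}> {title} (line {number})")
```

Since each entry repeats the heading text exactly, the reader can
reach the section with a SEARCH command even when the line numbers
have gone stale.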

With E-text, the notion of a Table of Contents and an Index is blurred.
In a book, the index really serves two purposes. It takes the place of
the SEARCH command in E-text, except that not every word is indexed
(barring the existence of a concordance, of course--a luxury in print).

It also serves as a schematic and *alternate* representation of how the
text might be organized. Most works could have been organized
profitably in more than one way. One way is fixed by the linear
organization of the text. The Index provides an alternate organization.

E-texts do not really need the first form of index. Computer programs
make their own search indices with lightning speed. In effect, you
have a concordance for every document. Alphabetic indices are of little
use. Not even hypertext programs can navigate them well. Either they
present a menu with 26 entries (too long!) or else you have to go
through two levels to get to your entry ("select A-G"). Even glossaries
are best arranged by topic and not alphabetically, since the
alphabetical order is irrelevant to the SEARCH command.**

** Not quite true: very long indices profit from "clustering", or
physical arrangement in search order.
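
To show how little work a concordance is for a program, here is a
minimal Python sketch that maps every word to the lines on which it
occurs. The function name and the three sample lines are illustrative
only, and "word" is crudely taken to mean a run of letters.

```python
import re
from collections import defaultdict


def concordance(lines):
    """Map each lower-cased word to the line numbers where it occurs."""
    index = defaultdict(list)
    for number, line in enumerate(lines, start=1):
        # A set keeps a word from being listed twice for the same line.
        for word in set(re.findall(r"[a-z]+", line.lower())):
            index[word].append(number)
    return index


text = [
    "An E-text is logically a scroll.",
    "There is no such thing as a page.",
    "The scroll has no page breaks.",
]
index = concordance(text)
print(index["scroll"])
print(index["page"])
```

The index comes out in whatever order the program finds convenient,
which is why alphabetical arrangement adds nothing for searching.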

Unless there is some reason that browsing topics in alphabetical order
might be interesting in itself, you shouldn't bother. Notice that this
is very different from electronic print, where the computer should
always be used to create an elaborate print index in the final print-
out.

The best way to think of an E-text index is as an alternate topical
organization of your work. It is especially useful if there are two (or
more) *hierarchical* ways to approach your subject. Your layout can
only show one way--echoed in your Table of Contents. The others have to
be represented by an index.


<Section 2.9> The General Theory of Markup (SGML)

The International Organization for Standardization (ISO) has developed
a very flexible standard for marking text, Standard Generalized Markup
Language (SGML, or ISO 8879). SGML has a very flexible syntax for
describing the logical structure of documents. Its drawback is that,
like markup languages that are intended for print media, the burden of
the markup makes the text unreadable. SGML goes a long way towards
creating a text that can look good on paper, on the screen, or to a
program. The problem is that SGML software is not widely available, so
although SGML files are portable and *potentially* useful, there is
little use for them as yet. A widely available

SGML Tags are extraneous material used to mark a section of text. Along
with delimiters, they comprise the markup added to a text. A program
that uses the marked up text has to recognize delimiters and find the
tags. Since we are more or less following SGML, the tags themselves are
delimited by angle brackets, like this:

<outline>

The word "outline" is the =generic identifier= (GI) of the tag. The
left angle bracket is the Start-Tag-Open delimiter (STAGO); the right
angle bracket is the Tag-Close delimiter (TAGC). Ending tags look like

</outline>

The sequence "</" is the End-Tag-Open delimiter (ETAGO). The end tag
ends with TAGC, just like the opening tag. Thus paired tags
themselves become a sort of delimiter, albeit at a higher level than the
delimiters they are built out of. They serve as a sort of named
parentheses to represent the "nesting" structure of the document.

Here is the hierarchical structure of a document represented as an
outline:

outline

   chapter 1

      section 1.1

      footnote 1

   chapter 2

      section 2.1

      footnote 2

      section 2.2

In parenthesis notation (a common mathematical device), the same
structure looks like this:

(outline (chapter 1 (section 1.1) (footnote 1) ) (chapter 2 (section
2.1) (footnote 2) (section 2.2) ) ).

Maybe this is a bit more clear:

(outline
   (chapter 1
      (section 1.1)
      (footnote 1)
   )
   (chapter 2
      (section 2.1)
      (footnote 2)
      (section 2.2)
   )
)

The parenthesis notation allows the tree structure of the outline, which
used to be represented only by the indentations, to be faithfully
represented even when the indentation is lost. I.e., we have a flexible
method of representing tree-structures in *running text*.
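
To show that the indentation really can be recovered from the
parentheses alone, here is a small Python sketch that parses the
running-text form and prints it back as an outline. The function names
are hypothetical, and the sketch does no error checking for unbalanced
parentheses.

```python
def parse(text):
    """Parse parenthesis notation into nested [label, child, ...] lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    root = [""]
    stack = [root]
    for token in tokens:
        if token == "(":
            node = [""]
            stack[-1].append(node)
            stack.append(node)
        elif token == ")":
            stack.pop()
        else:
            # Plain words are joined to form the label of the open node.
            node = stack[-1]
            node[0] = (node[0] + " " + token).strip()
    return root[1]


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["   " * depth + node[0]]
    for child in node[1:]:
        lines.extend(outline(child, depth + 1))
    return lines


source = ("(outline (chapter 1 (section 1.1) (footnote 1) ) "
          "(chapter 2 (section 2.1) (footnote 2) (section 2.2) ) )")
print("\n".join(outline(parse(source))))
```

The stack in parse() plays the role that the eye plays when it
matches each closing parenthesis to its opening one.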

The parentheses are delimiters whose purpose is to make clear the
nesting structure of the textual =elements=. If we think of SGML as
having "named parentheses" with <tag> being a left (opening) parenthesis
and </tag> being the matching closing parenthesis, we have:

<outline>
   <chapter 1>
      <section 1.1> Section 1.1 text ... </section>
      <footnote 1> Footnote 1 text ... </footnote>
   </chapter>
   <chapter 2>
      <section 2.1> Section 2.1 text ... </section>
      <footnote 2> Footnote 2 text ... </footnote>
      <section 2.2> Section 2.2 text ... </section>
   </chapter>
</outline>

Notice the exact match between parentheses above and tags. Each text
element is clearly delimited. The section and footnote numbers only
appear in the opening delimiter as =attributes=. They would be
redundant in the closing delimiters.
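
The match between the two notations can be made concrete with a small
program. What follows is only a sketch in Python, not anything defined
by SGML: the nested-tuple tree and the function names label, to_parens,
and to_tags are my own inventions for illustration, and the tags mimic
the simplified notation used above rather than full SGML attribute
syntax.

```python
# A node is (name, attribute, children).  All names here are
# hypothetical, chosen only to mirror the outline above.
OUTLINE = ("outline", "", [
    ("chapter", "1", [
        ("section", "1.1", []),
        ("footnote", "1", []),
    ]),
    ("chapter", "2", [
        ("section", "2.1", []),
        ("footnote", "2", []),
        ("section", "2.2", []),
    ]),
])


def label(name, attr):
    """Join a generic identifier and its attribute, if any."""
    return name + " " + attr if attr else name


def to_parens(node):
    """Render a node in parenthesis notation on one line."""
    name, attr, children = node
    parts = [label(name, attr)] + [to_parens(c) for c in children]
    return "(" + " ".join(parts) + ")"


def to_tags(node):
    """Render the same node with named opening and closing tags."""
    name, attr, children = node
    inner = " ".join(to_tags(c) for c in children)
    middle = " " + inner + " " if inner else ""
    # The attribute appears only in the opening tag.
    return "<" + label(name, attr) + ">" + middle + "</" + name + ">"


print(to_parens(OUTLINE))
print(to_tags(OUTLINE))
```

Both functions walk the same tree in the same order, so every opening
parenthesis in the first line of output corresponds to one opening tag
in the second.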

The reason for the funny names, STAGO, ETAGO, TAGC, and GI, is that SGML
actually has an =abstract syntax=. The delimiters "<", ">", and "</"
could be any symbols at all (within reason). The choice shown here is
called the =Reference Concrete Syntax=. It is a particular choice for
the abstract syntax of SGML. In practice, you will almost always see
the standard choice.

In addition to the tagging of elements, SGML has a very general facility
for including text and making the sort of references we discussed in
=Section 2.6=. An =entity reference= is meant to be replaced either
with a character or with the contents of a file. It starts with an
ampersand (and-sign) and ends with a semicolon. Thus &file1; means
include file1 here. And if you can't type an "e" with an acute accent
on your keyboard you can use &eacute; to get the same effect. Of course
your entities have to be defined as part of your document's =entity
set=. SGML provides a way to do this.
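
To show the idea of entity replacement, here is a rough Python sketch.
It is not how an SGML parser works; the dictionary ENTITIES, its two
made-up definitions, and the function name expand are all assumptions
for the sake of the example.

```python
import re

# A toy entity set.  The names and replacement values are
# assumptions for illustration, not a real SGML entity set.
ENTITIES = {
    "eacute": "e'",   # plain-text stand-in for an accented e
    "file1": "[contents of file1 would be included here]",
}

# An ampersand, a name, and a closing semicolon.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand(text, entities=ENTITIES):
    """Replace each &name; with its definition.

    Undefined references are left untouched so they stay visible.
    """
    def lookup(match):
        return entities.get(match.group(1), match.group(0))
    return ENTITY_REF.sub(lookup, text)


print(expand("caf&eacute; &file1; &unknown;"))
```

Leaving an undefined reference in place, rather than deleting it, is a
design choice: the reader can at least see that something was meant to
go there.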

SUMMARY: We have introduced the basic ideas of SGML: representing the
=logical structure= of textual =elements= using =tags= as delimiters;
the various parts of an opening and closing tag; entities for external
references and character substitution; and the notion of abstract vs.
concrete syntax. These ideas are useful in developing notations and
markup conventions.


<Section 2.10> Summary: Basic Tricks of the Trade

This part has covered a lot of ground. Creating E-text that is visually
pleasing and communicates effectively is an art. Some themes, driven by
the nature of the medium, recur over and over. I have summarized these
as a series of Tricks of the Trade:

TRICK 1: Replace visual rendering with delimiters and other markup,
but be sparing. The minimalist wins this game.

TRICK 2: Use a tree structure no more than three levels deep for the
basic hierarchy.

This trick goes hand in hand with the next:

TRICK 3: For more levels of hierarchy use data hiding techniques.

The point here is to remember that the reader has an unnaturally narrow
window on a very wide world. To avoid giving the reader the sense of
being helplessly lost, you *must* make an effort to keep the relevant
portion of reality small and easy to navigate.

TRICK 4: Use pointers to fill the roles of notes, cross-references,
bibliographic citations, hypertext links, etc.

Pointers are a recurring theme in computer science; they serve to unify
a whole series of concepts that are visually distinct in the print media.
They are used to implement hierarchy and to allow "nonlinearity" in the
text.

TRICK 5: Think less in terms of traditional categories like "Table of
Contents" or "Index" and more in terms of data structure.

This trick follows naturally from the observation that logical structure
and not its rendering in a particular system should be primary. This is
a prerequisite for communicating with readers using *you know not what*
software or device.

TRICK 6: Use escape characters and tags to extend character set and
delimiter repertoire respectively.

TRICK 7: Formatting, rigorous markup that looks like visual layout,
can meet the needs of humans and computers.

The trick is to rigorously use sequences of characters (especially
"white space" like carriage returns and spaces) to create what appears
to be visual formatting. This simultaneously satisfies the human and
the computer. This is a nice trick, but hard to carry too far.

In all things use moderation. Being too clever or too idiosyncratic
usually mars the effect for little gain. As always, the main trick is
to hide the effort that goes into the art, making the difficult look
easy.

<Part III> A Very Brief E-Text Style Manual

=Section 3.1= Backups and Saving Work

=Section 3.2= Compressed Files

=Section 3.3= Version Control

=Section 3.4= Use of Word Processing Features

=Section 3.5= Character Set and Font

=Section 3.6= Outlining and Hierarchies

=Section 3.7= Text Inclusions

=Section 3.8= Esoterica

This chapter is meant as a concrete example of the suggestions in the
previous two chapters, in the form of a "style manual". You should take
these guidelines as suggestions you may want to adopt, not as rigid
rules.


<Section 3.1> Backups and Saving Work

RULE 1.1 You should always keep two copies of any electronic text you
would mind not having one day, one on your hard disk and one on a
floppy. The floppy is far more likely to fail, so you should consider
keeping two floppies.

A common scheme if you don't work at home is to keep two backups, one at
home and one at work. Alternate which one you revise so that you will
always have the most recent one at home and the next most recent at
work. This is so that if a fire or other disaster destroys your work
records you still have the most recent copy.

If you work at home, make sure your two sets of backup disks are in
different places. That way an accident with a strong magnetic field
(found near motors, in telephones, in TV monitors, etc.)--or a spilled
cup of coffee--will not wipe out both copies.

RULE 1.2 (Archive copies) You should have *both* an archive copy of
each important "milestone" version *and* a set of backups. Backups are
usually snapshots of your system. If you delete a file from your hard
disk and then revise your backup, you will no longer have the file on
your backup disk! Even if you put your backup set aside from time to
time as an archive of "My System, December 1992", one day you will
decide to recycle those disks and lose your copy.

The Moral: you need both an archive copy of each important project and
a revolving set of backup disks. (I call mine "A" for archives and "B"
for Backups).

Checklist:

o original working copy on hard disk

o second copy on hard disk for really important files

o daily archive of important work, organized by project

o most recent revolving backup set at another location (weekly or
monthly; more often for critical files).

o second most recent backup set on site.

Remember the basics: *at least* one backup and don't put your eggs all
in one basket. If you think I sound paranoid about this backup stuff,
trust me. Do this or you will get burned one day. I know what I am
talking about.

RULE 1.3 (Exception to Backups) An exception can be made for E-text
that can be easily obtained over the network if you don't modify it and
*if* getting a replacement would not be burdensome. In effect, the
network is your backup copy. But beware that what is on the network
today may not be there forever.

You can also "forget" about backups (but not archive copies) if your
computer is on a local area network and you know for a fact that backups
are made over the network on a regular basis. Many businesses,
recognizing that most persons would rather risk losing a month's work
than spend five minutes backing it up, make systematic backups, often
using automatic systems that work at night, when the network is quiet.
That is nice, but remember that you can still lose nearly an entire
day's work if disaster strikes just before you go home for the day.

RULE 1.4 (Saving Your Work) Unless your word processor has an autosave
and recover feature, you should develop the habit of saving your work at
least every fifteen minutes and whenever you get up to leave your
workstation.


<Section 3.2> Compressed Files

It is possible to compress text files to around half their original
size. Of course, you have to uncompress them before reading, but in
effect you can double your hard disk size with *software*. File
compression is becoming a standard feature of many programs and systems.

Compression works because text has very regular patterns that can be
encoded more compactly than the standard encoding. Files that have more
random bit patterns--binary files like programs or graphic images--
seldom compress more than a few percent.
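
You can see the difference for yourself with a few lines of Python and
its standard zlib module. This is a sketch under stated assumptions:
the sample sentence is repeated fifty times, which flatters the result
(ordinary prose lands nearer the "around half" mentioned above), and
pseudo-random bytes stand in for a file with no regular patterns at
all. The name ratio is mine.

```python
import random
import zlib

# Ordinary prose, repeated to get a few kilobytes of text.
text = ("Compression works because text has very regular patterns "
        "that can be encoded more compactly. ") * 50
text_bytes = text.encode("ascii")

# Pseudo-random bytes of the same length stand in for a file
# with no regular patterns (an assumption, not a real program).
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(len(text_bytes)))


def ratio(data):
    """Compressed size divided by original size."""
    return len(zlib.compress(data)) / len(data)


print("text  ratio: %.2f" % ratio(text_bytes))
print("noise ratio: %.2f" % ratio(noise))
```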

RULE 2.1 Never compress any file except a text file.

I would be a bit leery of compression. It trades memory, which is
fairly cheap, for your time, which is expensive. Also, it complicates
the strategies your software has to use--what happens if your system
goes down in the middle of uncompressing an important file?

Compression is here to stay, but I recommend you follow this rule:

RULE 2.2 Only compress things you keep around for archival purposes--
old reports and projects, things you want at your finger tips but don't
use day-to-day.

To give some further guidelines, compression makes a lot of sense in
these cases:

o compressed files make good archive copies, at least if you are
keeping the file to feel safe and not because you need it regularly.

o compressed files are good for network transfers because, for text
files at least, they cut the time in half.

o more subtly, there is a limit to the amount of hard disk you can
safely use--you shouldn't use more than you can back up in 10 minutes a
week or half an hour a month.

If you make your own backups on floppies, that means that 80 Megabytes
is about tops. More than that and you have 100 backup disks (times two
sets!) to deal with. Probably you get sloppy. File compression means
you can get twice as much stuff on your disk without increasing the
backup burden, so you save both time and space.

SUMMARY The most important thing to remember about file compression is
that there is a trade-off between time and disk space. The fact that
you can get twice as much on your hard disk is traded against the fact
that it takes time to compress and uncompress files. Given that memory
is very cheap this is not always a good trade. The most likely outcome
is that by keeping too much useless stuff on disk you're setting
yourself up to waste a lot of time.


<Section 3.3> Version Control

Version control is important. It is easy to keep on top of *if* you
bother. If you don't, one day you will modify the second most recent
version of a long manuscript and then have to figure out the differences
between two variant documents and "merge" them into your next draft.
Then again, you could give up all the work you did and go back to the
old version. Or maybe you would like to follow this rule:

RULE 3.1 Versions should be numbered consecutively in "dotted decimal"
notation.

E.g., 2.4.1 means version 1 of sub-version 2.4 of main version 2.0. You
can add the version number to the heading of the file or make it part of
the file name. This is hardest to do in DOS, where filenames look like
DRAFT241.TXT meaning version 2.4.1.

Versions 0.1 up to, but not including, 1.0 are reserved for "drafts".
Version 1.0 is the first public release and Version 1.0.1 its first
minor revision.
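
If you keep many versions around, a few lines of Python can sort them
and build DOS-style file names. Treat this as a sketch: the function
names parse_version, is_draft, and dos_name are hypothetical, and
squeezing the digits together assumes each part of the version is a
single digit (otherwise 2.4.1 and 24.1 would collide).

```python
def parse_version(text):
    """Turn "2.4.1" into the tuple (2, 4, 1)."""
    return tuple(int(part) for part in text.split("."))


def is_draft(text):
    """Versions before 1.0 are drafts under the rule above."""
    return parse_version(text) < (1, 0)


def dos_name(stem, version, ext="TXT"):
    """Build an 8.3 file name such as DRAFT241.TXT.

    The stem is truncated so stem plus digits fit in 8 characters.
    """
    digits = "".join(str(n) for n in parse_version(version))
    room = 8 - len(digits)
    if room < 1:
        raise ValueError("version too long for an 8.3 file name")
    return (stem[:room] + digits + "." + ext).upper()


versions = ["1.0.1", "0.9", "2.4.1", "1.0", "2.4"]
print(sorted(versions, key=parse_version))
print(dos_name("draft", "2.4.1"))
```

Sorting on the parsed numbers rather than on the raw text matters once
a part reaches two digits: as text, "10.0" sorts before "9.0".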

RULE 3.2 In general, the primary version of your work should be in the
format your word-processing program considers to be "native". Plain
text files should be derived from this master copy.

Creating a plain text version and then re-importing it to the word
processor will often result in problems. The word processor's native
format (the one that understands all the nifty features) is proprietary;
i.e., unlike plain text, it is not directly portable to other systems.
Something is lost translating proprietary format to plain text and back
again. The most common problems are:

o A "hard" return is located at the end of every line, making editing
difficult because you constantly have to adjust the length of each line
by hand, or else use the "fill" command on each paragraph;

o Unusual "line wraps" result from incompatible line lengths;

o Lists of items that are supposed to be on separate lines are
compressed into paragraphs;

o Visual formatting like the spaces or tabs before an indented block
quote, vertical bars alongside paragraphs, and similar things are
scrambled;

o Structures requiring elaborate spacing or tabbing like outlines,
tables, or section headings are confused;

o Double and single spacing is mixed up;

o Special symbols and codes are no longer readable.

These problems cause severe version control headaches unless you follow
the "master copy" strategy.

SUMMARY Strategies for avoiding these problems are given in the next
section. In general, though, you can avoid them if you follow this
basic strategy:

(1) Always consider native format the "primary" version and plain
text the "derived" version.

(2) Never use any feature of your word processor that can't be easily
translated into the plain text version.

The next two sections concentrate on just which features you can use.


<Section 3.4> Use of Word Processing Features

From the standpoint of creating effective E-text it is extremely
important to understand the following concepts:

hard return : a control character that signals the end of a line of
text. The actual code, an ASCII character, varies from computer to
computer. This is a source of many formatting problems.

filling : many word processors are capable of adjusting the length of
lines automatically in a process called paragraph filling. This can
either be automatic or on command. In the older method, the line
"wraps" when you reach the end, but if you make editing changes you have
to select the "fill" command. Newer word processors constantly refill
the paragraph as you make changes, adjust margins, etc.

formatting codes and markup : in order to represent all the effects
you can create on paper using a text file, it is necessary to add
additional characters that control the formatting of the document--
italics and underlining, fonts, superscripts, and the like. These codes
can be typed letters like ".cl" or "</p>"; or they can be "invisible" on
the screen but nevertheless present in the underlying file.

bit-mapped vs. character-oriented screens : The screen is represented
in the computer memory as a series of black and white dots, called
pixels ("picture elements"). There are two kinds of screens, those that
can only represent characters and those that can draw any graphic shape
(including any screen font ever devised, any line or shapes, patterns,
and complex artistic images like photographs and computer drawings).
Character-oriented terminals only have

WYSIWYG : "What You See Is What You Get" is a strategy adopted by many
word processing systems that run on bit-mapped terminals. A single
underlying file can create

font size and rulers : fonts in a traditional character-oriented
screen are all the same size--usually 80 or 132 characters per line. In
a bit-mapped system the fonts can have any size. Some fonts are fixed
width, meaning that any character takes up the same width (and hence
there are the same number in every line); in proportional fonts the
characters have different widths. The line expands and contracts when
you change letters.


With these concepts in mind, we can discuss how to create E-text
that is meant to be read as E-text. The main problem is that not every
word processor can read files from every other word processor. The
least common denominator is the plain text file, or ASCII file. ASCII
means "American Standard Code for Information Interchange". The ASCII
code includes the characters commonly found on an American typewriter
keyboard plus some "control characters" representing actions like
"carriage return" or "horizontal tab". The issue of which characters
you can use is discussed in the next section.

When using a word processor you have to be careful because it is not
always obvious which features will come out well in plain ASCII. Word
processors compete on the basis of their wonderful features. Often,
however, the fancy features you paid for cannot be used in the real E-
text world. They are oriented towards producing pretty paper, but will
*confuse* other computers unless they are running identical software.

You will not be able to represent

o bolding

o italics

o underscores

o superscripts

o subscripts

o indenting and margins (except by spacing--not tabbing)

o soft returns

o multiple proportional fonts

o double columns

o special symbols or formulas

o included graphics and spreadsheets

and so on and so on. This means that you must forgo essentially
anything that needs a formatting code. In newer WYSIWYG word
processors, it may be hard to tell what is formatting and what isn't.
In general, you have to think like a typewriter. *sigh*

RULE 4.1 Change to a non-proportional font, preferably 10 point (elite)
or 12 point (pica) and 6 inch distance between margins. This works out
to 72 characters for elite or 60 for pica.

To be safe, lines should not exceed 72 characters; but in no event
should there be more than 80 characters without a hard return.

Actually, 60 characters per line (12-point non-proportional font with a
six inch ruler) is more portable, because it can be read both on
standard 80 character screens and with the default settings of most word
processors. If you use the 72 character line, some users may have to
select the whole text and convert it to Courier-10 to read it in a
WYSIWYG word processor without funny line wrapping. Not all users are
that sophisticated, so you are better off using a 60 character line
unless you have a special reason to go with 72. Also, short lines are
easier to read, as you will learn in any speed reading course.

RULE 4.2 Don't justify the text, but keep all text "ragged right", like
typescript.

RULE 4.3 Don't hyphenate words. Let the right column look uneven.

If someone is using a different screen width, your hyphenated word could
end up looking like "this in the mid-dle of the text". Also, SEARCH
commands choke on hyphenation. There is probably nothing you can do to
prevent your word processor from breaking words that have "real" hyphens
in them and happen to fall at the end of a line (remember this when
*you* have to search).

RULE 4.4 Start text flush with the left margin and don't add spaces to
create an indented effect. Do not use indentation, tab stops, or
spreadsheet-like tables to format your text.

If you want to include spreadsheet data, use Comma Separated Value (CSV)
text like this:

"January Actual","January Budget" <hard return>
23201.45,20000.00 <hard return>

You can cut and paste this into any spreadsheet program.
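
If the numbers already live in a program, you do not have to type the
quotes and commas by hand. Here is a minimal sketch using Python's
standard csv module; the variable names are mine, and ending each row
with a single newline character is an assumption about what your
system treats as a hard return.

```python
import csv
import io

rows = [
    ["January Actual", "January Budget"],
    [23201.45, 20000.00],
]

# QUOTE_NONNUMERIC puts quotes around text and leaves numbers bare.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC,
                    lineterminator="\n")
writer.writerows(rows)
print(buffer.getvalue(), end="")

# Reading it back: unquoted fields come back as numbers (floats).
reader = csv.reader(io.StringIO(buffer.getvalue()),
                    quoting=csv.QUOTE_NONNUMERIC)
table = list(reader)
```

Note that the program writes numbers in its own style, so trailing
zeros after the decimal point may not survive the trip.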

RULE 4.5 If you use the autofill feature to avoid having to type
return, make sure your word processing program has a feature that will
insert "hard" returns at the end of each line when you create your plain
text output file.

In Microsoft Word, this is the "Save as Text with Linebreaks" command.
If you use "Save as Text" you get returns at the end of every
*paragraph*, not every *line*. Someone with an old-fashioned text
editor--one that likes hard returns after every line--will see *very*
long lines (and probably truncate them to boot). You may have to
experiment to find the equivalent command in your system.
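
If your word processor has no such command, the same effect can be had
with a short program. The sketch below uses Python's standard textwrap
module; the function name hard_wrap is hypothetical, and it assumes
paragraphs are separated by one blank line, as recommended in this
manual.

```python
import textwrap

WIDTH = 60  # the more portable of the two line lengths above


def hard_wrap(text, width=WIDTH):
    """Re-fill each paragraph with a hard return ending every line.

    Paragraphs are assumed to be separated by one blank line.
    break_long_words and break_on_hyphens are switched off so
    that no word is ever split.
    """
    wrapper = textwrap.TextWrapper(width=width,
                                   break_long_words=False,
                                   break_on_hyphens=False)
    paragraphs = text.split("\n\n")
    filled = [wrapper.fill(" ".join(p.split())) for p in paragraphs]
    return "\n\n".join(filled) + "\n"


sample = ("This paragraph was saved with a return only at its very "
          "end, so an old-fashioned editor sees one long line."
          "\n\nA second paragraph.")
print(hard_wrap(sample), end="")
```

Because words are never split, a single "word" longer than the width
(a long file name, say) will still overflow its line; that seems the
lesser evil compared with breaking it.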

RULE 4.6 Don't use special characters like non-breaking spaces or
optional hyphens to dictate where line breaks occur. These features are
not portable.

RULE 4.7 Try to prevent your word processor from hyphenating words on
its own.

It's OK to break a word that has a "hard" hyphen at the hyphen. That
is, if a hyphen is a normal part of the word's spelling and the word
processor decides to break the word at the hyphen, don't worry. But you
should try to avoid breaking sentences that have dashes--like this--at
the double dash. Sometimes the word processor will break a double dash
in half.

RULE 4.8 Use single spacing with two hard returns between paragraphs.

Many WYSIWYG word processors allow single, double, or triple spacing
between lines. In the text file, however, there are not necessarily two
returns between paragraphs. Double-spaced text *is* much easier to
read on a screen, but it is hard to re-paragraph. The two returns
between each line tend to make word processors think that each line is a
paragraph.

In general, it is easier to

RULE 4.9 Keep paragraphs short, say around 10 lines.

Paragraph breaks form a visual guide for the eye. A book that has
paragraphs spanning whole pages is hard to read. Similarly, on a 24
line screen it may be difficult to read paragraphs longer than 20 lines.
Even if you naturally express yourself in paragraphs of 7 to 10
sentences, you should break your progression of thought into shorter
segments after writing it down, if you want to reach your audience. If
I see a paragraph that fills the whole screen, I tend to want to scroll
down and skip ahead.


<Section 3.5> Character Sets and Fonts

In order to be portable, a document must be coded as a text file. The
American Standard Code for Information Interchange (ASCII) represents
each character as a seven bit number. There are variant dialects of
ASCII, especially for languages other than English, but the variations
do not affect the subset we will be discussing. In particular, there
are many extensions of ASCII to eight bits, of which Latin-1 is the most
popular. These extensions are *not* portable, and hence not discussed
further.

In order to be as portable as possible, ASCII text, or plain text, must
observe a number of conventions:

RULE 5.1 Use only the 84 character subset of ASCII consisting of the
twenty-six letters of the English alphabet (both upper and lower case),
the ten digits, and twenty-two punctuation marks in Table 1 below. Do
*not* use the ten bad characters:

o dollar sign

o pound sign (number sign)

o at-sign

o caret (circumflex)

o tilde

o back quote

o backslash

o vertical bar, and

o curly brackets

These symbols do not translate well into character sets in other
countries.

TABLE 1. The 22 Legal Punctuation Marks

o comma

o period

o colon

o semicolon

o exclamation point

o percent sign

o ampersand

o asterisk

o parentheses

o hyphen

o underscore

o plus sign

o equals sign

o square brackets

o apostrophe

o double quote

o angle brackets (less-than and greater-than signs)

o question mark

o and (forward) slash.

More briefly:

!%&*()-_=+[];:'",.<>?/

The reason why these characters are fine and others aren't is obscure.**

** There is an international standard, ISO 646, that adapts ASCII to
non-English languages. Part of its character set, the "invariant
subset", is the same on all keyboards. There are also obscure problems
translating ASCII to its IBM mainframe equivalent, EBCDIC. Even the 22
legal characters are too many in some circles.

RULE 5.2 The only "white space" characters allowed are the space and
line endings (carriage returns). Horizontal tabs and other "control
characters" are not portable.

Actually you can use tabs for text that is not going to pass through
unusual or difficult conversions. For example, if you are sharing a
spreadsheet by E-mail you can probably exchange a tab-formatted file
rather than using the (safer) Comma Separated Value format.

Sticklers will point out that Rule 5.2 means we allow an 87 character
subset of ASCII.
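
Rules 5.1 and 5.2 are easy to check by machine. The following Python
sketch is one way to do it, not a standard tool; the names SAFE,
ALLOWED, and bad_characters are my own, and I am assuming the three
white-space characters meant by the sticklers are space, carriage
return, and line feed.

```python
import string

# The 22 legal punctuation marks from Table 1.
PUNCTUATION = "!%&*()-_=+[];:'\",.<>?/"
SAFE = set(string.ascii_letters + string.digits + PUNCTUATION)

# Assumed reading of Rule 5.2: space plus the two line-ending codes.
WHITE_SPACE = {" ", "\r", "\n"}
ALLOWED = SAFE | WHITE_SPACE


def bad_characters(text):
    """Return (line, column, character) for each disallowed one."""
    problems = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch not in ALLOWED:
                problems.append((lineno, col, ch))
    return problems


print(len(SAFE), len(ALLOWED))
print(bad_characters("Cost: $5\tor less\nmail me @ home"))
```

The first line of output confirms the counts of 84 and 87; the second
lists the dollar sign, the tab, and the at-sign with their positions.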


<Section 3.6> Outlining and Hierarchies

RULE 6.1 Impose a relatively rigid outline (hierarchy) on your
manuscript and reflect that hierarchy in a rigid formatting scheme for
the section and chapter headers.

E.g., this manual uses angle brackets *plus* two spaces *plus* a section
title. Each section is preceded by two blank lines, each part by three.
Sections are numbered in dotted-decimal form.

Conventions like these allow casual searching, or "navigation" of the
document. Unless you have a "hypertext" document that lets you skip
around easily, such guideposts are necessary.
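
A rigid header format also lets a program do the navigating. As an
illustration only, here is a Python sketch that pulls out headers
written in this manual's convention; the pattern HEADER and the
function find_headers are my own, and the pattern assumes a header
always starts in the first column.

```python
import re

# Matches headers written in this manual's own convention, e.g.
# "<Section 3.1>  Backups and Saving Work" or "<Part III>  ...".
HEADER = re.compile(r"^<(Part|Section) ([0-9A-Z.]+)>\s+(.+)$")


def find_headers(text):
    """Return (line number, kind, number, title) for each header."""
    found = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        match = HEADER.match(line)
        if match:
            kind, number, title = match.groups()
            found.append((lineno, kind, number, title.strip()))
    return found


sample = "\n".join([
    "<Part III>  A Very Brief E-Text Style Manual",
    "",
    "See <Section 3.1> for details.",
    "",
    "<Section 3.1>  Backups and Saving Work",
])
for entry in find_headers(sample):
    print(entry)
```

The mention of Section 3.1 in the middle of a sentence is skipped,
because only a tag flush with the left margin counts as a header. That
is the payoff of keeping the format rigid.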


In designing markup conventions, you should keep in mind that it is
more valuable to represent *logical* structure than to try to mimic the
*physical* appearance of a printed page. Thus,

o it is wasteful to use vertical space to try to mimic vertical
layout of a printed page, because the resulting effect looks
disconcerting on the screen. Use the number of blank lines to represent
the logical structure of the document instead.

o use flush left headers for the top levels, indent a couple of
spaces for lower levels. At the very lowest logical level, just skip an
extra line between paragraphs and don't bother with a separate title for
the header.

o try "tagging" important breaks with special characters like angle
brackets, a row of hyphens "----------", or a decorative break like
this:

+ + +

o don't overuse all caps in titles, especially for Section breaks.
They scream too loudly. You can't really mimic the print-based effect
of small caps on the screen.

o don't bother developing elaborately different formats for headers
that are seen infrequently (e.g. chapters, parts, or "books").
Concentrate instead on the sections, sub-sections and minor breaks. The
"wide-area" structure is more simply represented by dotted-decimal and
the low level structure by visual formatting.

Remember that the global appearance of the document is much less
important than it is for a book, since the user never sees the document
as a whole, only small local sections. In fact, the highest level
logical divisions are probably not visual at all--they are the breaks
between computer *files*, or even a directory hierarchy. And--ever more
commonly--hierarchies of *computers*! This leads us to the next rule.

In preparing your outline, remember the following rule:

RULE 6.2 Never nest a hierarchy or outline more than three levels deep
without hiding some of the structure.

There is a great deal of structure in the computer world.

Countries contain

domains contain

networks contain

companies contain

individual computers (nodes) contain

directories contain

subdirectories contain

documents contain

chapters contain

sections . . . (*whew* that was ten levels).

You *must* try to hide some of this structure from your reader. The
easy way to do this is to narrow in on the local focus and pretend that
what we're looking at right now is the *only* thing in the world. It is
impossible to read a document and simultaneously think of its place in the
wide world. Forget the tree structure of the whole network or computer
system; let the reader focus on the local tree-structure.

And, whatever you do, don't let the reader know they are more than three
levels deep.


<Section 3.7> Text Inclusions

We have already discussed basic formatting issues like paragraphing,
line length, and basic layout. This section concentrates on the myriad
details that bedevil the typist. We save most of the *really* technical
stuff like tables, foreign languages, and formulas, for =Section 3.9=.
In this section we discuss very common inclusions in text--


<Section 3.7.1> Alternate Fonts

RULE 7.1 Use markup to represent *logical* emphasis rather than
particular font effects.

Here are some typical reasons and traditional *print* renderings:

o emphasis (italics or underlining)

o strong emphasis (bolding, all caps)

o interior dialog (italics)

o editor's emphasis (italics)

o foreign language phrase (italics)

o book title (italics, underline)

o article title (quotation marks)

o new term, index term, glossary item (italics, quotes, underline)

As you see, italics are overused and the choices are not always
consistent. In order to make your meaning PERFECTLY CLEAR, it is best
to observe this rule:

RULE 7.2 Prefer delimiters for marking inclusions. Use different
delimiters for different purposes.

A delimiter is a character or pair of characters that is used fore and
aft to set off text.

It is not a bad idea to develop a set of guidelines for how to render
each sort of inclusion. Here is what I use:

o now *this* is emphasis (and *strong* emphasis)

o as I said, markup is part of =la vie=.

o or we can introduce a new "term" like this (See "taxonomy").

o book titles, like _Elements of Style_, are a snap.

I also have a series of conventions I use for special situations that
arise in scholarly text, such as multiple languages or included math
text.

By the way, avoid the effect that results when you try to
_mimic_print_media_by_underlining_in_this_fashion_. The result is
tedious and leads to long words that don't wrap well. In E-text, a pair
of underlines is just another delimiter, nothing more.
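
One benefit of using a different delimiter for each purpose is that a
program can tell the inclusions apart. Here is a rough Python sketch
based on the conventions I listed above; the category names and the
function find_inclusions are hypothetical, and the patterns are
deliberately naive (a line of arithmetic like 2 * 3 * 4 would fool
them).

```python
import re

# One pattern per delimiter.  The category names are my own
# labels for the conventions listed above.
PATTERNS = {
    "emphasis": re.compile(r"\*([^*\n]+)\*"),
    "phrase": re.compile(r"=([^=\n]+)="),
    "title": re.compile(r"_([^_\n]+)_"),
}


def find_inclusions(text):
    """Return a sorted list of (position, category, content)."""
    hits = []
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), category, match.group(1)))
    return sorted(hits)


line = "Markup is *not* hard; see _Elements of Style_ and =la vie=."
for hit in find_inclusions(line):
    print(hit)
```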

RULE 7.3 In E-text, always place punctuation *outside* delimiters.

Otherwise, the E-text looks "silly." Better: "silly".

In print, you put the punctuation after a quotation on the "inside."
This looks good in print but terrible on the screen. If your E-text is
destined for computer screens (and automated search programs) it is
better to put the punctuation on the "outside". If this disturbs you,
remember that in the last century the printers' rule (as I have seen in
many books,) was to put *commas* inside parentheses as well as inside
quotation marks. We are allowed to change these conventions from time
to time.


<Section 3.7.2> Quotations and Included Blocks of Text

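There are a number of ways to include quoted or included materials. One,
favored in print, is to push the margin of included text inwards, like
this:

        There are a number of ways to include quoted or included
        materials. One, favored in print, is to push the margin
        of included text inwards, like this.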

You should use this technique *very* sparingly. It requires a hard
return and hand spacing for each line. Reformatting (to shorten the
passage, say) is very difficult. WYSIWYG word processors let you shift
margins on a per-paragraph basis. This feature is not transportable so
you can't use it for E-text.

In E-mail correspondence you often see the convention that a right angle
bracket in the left column sets off correspondence. Often, this
continues to the point of inanity:

>>> I said I don't like the President's new policy
>> O yeah?
> Yeah.
O yeah?
> Not only that, you're an idiott
Well so are you. And you don't spell good either.

Our ability to reconstruct the whole train of correspondence is a poor
trade for legibility.

Another device to avoid is the frequent use of vertical bars alongside
text to indicate changes. Although most computer keyboards have a
"vbar" (usually a shift-backslash) this character does not travel well
and the visual effect is lost in some fonts or if the line length
changes.

An alternative to the vertical bar is to mark changed sections with
double brackets:

[[Our new improved widget has
a longer lifetime and
higher customer satisfaction
rating.]]

More elaborate schemes for marking changes are discussed in the section
on Editing and Marked Sections.

In summary, we have this rule:

RULE 7.4 Avoid block quotes and text with vertical lines to represent
additions or changes. Just use conventional quotation marks or a
special "delimiter" like double square brackets.


<Section 3.7.3> Lists

There are two basic kinds of list, ordered** and unordered. Unordered
lists often have "bullets" in front of the items.

** Also called enumerations.

RULE 7.5 Indent list items at least two spaces and make sure list items
are in separate "paragraphs", i.e. with a blank line between each item.

This prevents formatting problems that occur when the word processor
decides that a list is actually a paragraph and pours it, bullets and
all, into a rectangular shape.

RULE 7.6 Do not use a "hanging indent" for list items. Let subsequent
lines run to the left margin.

  o  This is an example of a list item
     that looks good in print but
     is hard to re-format in E-text.

  o  This second list item is more typical
of E-text. You can reformat it without deleting
lots of spaces at the beginning of each line.

Also, as mentioned in Part II, the visual effect of straight line
margins is less important in E-text. You don't gain all that much
visually by going for the pretty-but-hard-to-format look.


<Section 3.7.4> Cross References, Hypertext, and Embedding

References to other parts of the text should be set off so they can be
found. Cross-references are of several sorts, all related:

o Cross-References to other parts of the document: See Section 3.4,
See "UNIX" in glossary, Page 43.

These cross references are essentially pointers that urge you to leap
over the intervening text. This is easy in print media, where you have
all the pages in your hand. With a computer program you have to use the
comparatively clumsy method of manipulating the keyboard or mouse to
move around. With plain text, the only rational approach is to use the
"search" or "find" command of the word processor to locate the passage.
The art comes in guessing good "strings" (sequences of letters) to
"search" for.

o Hypertext references (outline overview, hypertext menu and
references)

Many word processors allow you to "navigate" a document by traversing an
outline overview. In what amounts to the same thing, "hypertext"
programs often implement the natural tree-structure of a document by a
series of menus representing the possible "branches" available at each
"node". This is the computer equivalent of the dime-store "interactive
adventure book" in which you get to choose the plot developments by
making choices like "If you want to rescue the damsel go to page 43; if
you want to kill the ogre, go to page 136."

o External File References

Here the point is that we can name other files and directories--and even
other computers, e.g. "rtfm.mit.edu:/pub/usenet" means subdirectory
"usenet" of directory "pub" on computer node "rtfm" at M.I.T.

o "Bibliographic Citations" of print media

The bibliographic citation, either as a hypertext link in the text
(footnote) or as a list of references (menu) is a subject of great
attention in print media, with all sorts of elaborate formatting rules.

o Embedded Figures and Included Files

Very often, word processors (and long computer programs) have a master
file that looks something like this:

include <Front.Matter>
include <Chapter.1>
include <Chapter.2>
include <Chapter.3>
include <Chapter.4>
include <Chapter.5>
include <Appendix.A>
include <Index>

This master document sews together a bunch of smaller files. In
advanced programs, you may be unaware that this process, called file
inclusion, or embedding, is taking place.

File inclusion is especially common as a solution to the following
problem: how do you include material that is "foreign" to textual
matter, say a graphic image or drawing? If you just cut and paste the
text, the program will mistake it for part of running text, often with
dire consequences. The solution is to keep the offending material in a
separate file and have only the file reference in the text itself. Then
all the word processing program has to do is

You can immediately see that all these are applications of a single
idea, the idea of a "pointer", or "reference". One part of the document
points to another. We are supposed to imagine--and the program is
supposed to make us think--that there is a bridge from one place to
another, or that the reference can be expanded so that we can enter into
the other file or location and get back again. Thus, the "See
Reference", hypertext link, or external file reference really amount to
the same thing.

The point is that in all cases what we need is some way to represent the
starting point (reference or pointer) and ending point (anchor point) of
the arrow. The World Wide Web uses Hypertext Markup Language (HTML).
An HTML cross-reference looks like this:


The corresponding target, or anchor, is marked this way:

One soon tires of making up unique names to allow each cross-reference
to mate with its anchor. It is more natural to use the document's
natural tree structure (perhaps represented by dotted decimal) for
anchor identification. Admittedly, this lends itself to dangling
references like "See page 25" when page 37 is the correct page for
version 2.3. Correcting these references is probably less work than
typing the ungainly syntax of an HTML cross-reference.

If we are not creating a source document to connect to the World Wide
Web, a simpler method is to delimit the reference with equals signs,
=See Index=, and the anchor point with angle brackets. This has an
added advantage if you are using equals signs to delimit italic text,
since glossary entries are often rendered in italics. You can see how
natural this is given the section marking scheme adopted here (See
=Section 26.6.4= below).

<=Index> This is the anchor point for the index reference made in the
above paragraph. The equals sign is optional. It just serves to mark
the tag <index> as an anchor point.

RULE 7.7 Delimit glossary entries, index entries, See references, and
so on with equals signs. Use a consistent notation, such as angle
brackets, to mark the anchor points.
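
One benefit of a consistent notation is that a short program can check
it. The following is only a sketch in Python, under my own assumptions:
references and anchors each fit on one line, names are compared without
regard to case, and the function name is hypothetical. Since equals
signs also delimit italics, some of what it reports will be ordinary
italic phrases rather than broken references.

```python
import re

# =Name= is a reference; <Name> or <=Name> is an anchor point.
REFERENCE = re.compile(r"=([^=\n<>]+)=")
ANCHOR = re.compile(r"<=?([^<>=\n]+)>")


def dangling_references(text):
    """Return the references in text that have no matching anchor."""
    anchors = {name.strip().lower() for name in ANCHOR.findall(text)}
    references = [name.strip() for name in REFERENCE.findall(text)]
    return [ref for ref in references if ref.lower() not in anchors]


sample = (
    "See =Section 3.7.4= and =Index=.\n"
    "<Section 3.7.4> Cross-References\n"
    "<=Index> This is the anchor point.\n"
    "See also =Glossary=.\n"
)
print(dangling_references(sample))
```

Here only the reference to the glossary is reported, since the sample
contains no anchor for it.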

The World Wide Web attempts to link documents with cross-references
(hypertext links) on a global scale. The notation developed for this
project is called a Uniform Resource Locator (URL) and looks very much
like this:

protocol://node:port/directory/file

E.g.:

ftp://ftp.ncsa.uiuc.edu:80/pub/education/README

news:comp.sys.mac

The "protocol" part has to do with the method of getting the document
(and thus implicitly with the classification scheme). The examples here
are File Transfer Protocol and Usenet News, two common document
retrieval systems. "ftp.ncsa.uiuc.edu" is a computer, "comp.sys.mac" is
a "newsgroup". "/pub/education/README" is a file in a directory called
"/pub/education"; and "80" is a "port number". These details only
concern the retriever, who may be just a computer program.
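
To show how a retrieving program might take such a locator apart, here
is a small sketch using Python's standard urllib.parse module. It
assumes the standard form of the notation, in which the port number
follows the node name and comes before the directory path.

```python
from urllib.parse import urlsplit

parts = urlsplit("ftp://ftp.ncsa.uiuc.edu:80/pub/education/README")

print(parts.scheme)    # the protocol
print(parts.hostname)  # the computer (node)
print(parts.port)      # the port number
print(parts.path)      # the directory and file
```

The human reader rarely needs these pieces separately, but the
retriever does.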

The URL notation is easily adapted to other hierarchical schemes used
outside the computing world, especially if the syntax rules are relaxed
a bit. Here are some ideas:

For Books:

dewey://stcharles.pub.lib:270.23.07:gilson:4 (St. Charles Public
Library, Dewey Decimal, Author Etienne Gilson, copy 4).

LoC://QA.22.4: (a Library of Congress citation)

ISBN://123-24-55

For a Journal Article:

journal://Time:1990.23.56-69

For the Phone System:

voice://1.708.840.8069 (A voice number)

fax://1.708.840.8069 (A FAX number)

internet://jgoodwin:adcalc.fnal.gov (E-mail address)

postal://Box.6022:St.Charles:IL:60174 (Surface mail)

Or something like that.

RULE 7.8 Use Uniform Resource Locators (URLs) for worldwide computer
file references. Campaign for their extension to other obvious (paper
and telephonic) information sources.


<Section 3.7.5> Editing and Marked Sections

RULE 7.9 Indicate short deletions [and additions] with square brackets.
If you need to tell them apart add a plus or minus sign in front.
Indicate the version of the change by a version number (single number or
dotted decimal) after the sign.

This regulation shall apply to each +1[and every] taxpayer -2[,
except members of this legislature].

We can thus reconstruct the history of this text:

Version 0: This regulation shall apply to each taxpayer, except
members of this legislature.

Revision 1: This regulation shall apply to each and every
taxpayer, except members of this legislature.

Revision 2: This regulation shall apply to each and every
taxpayer.
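
The reconstruction above is mechanical enough to hand to a program.
What follows is a sketch in Python, not a finished tool. It assumes
single (not nested) brackets, a sign and version number directly in
front of each bracket, and version numbers that are plain or dotted
decimals; the function names are mine.

```python
import re

# A sign, a version number (plain or dotted decimal), then [text].
MARK = re.compile(r"([+-])(\d+(?:\.\d+)*)\[([^\]]*)\]")


def parse_version(label):
    """Turn "2" or "2.3" into a tuple of integers for comparison."""
    return tuple(int(part) for part in label.split("."))


def text_at(marked, version):
    """Return the marked text as it stood at the given version."""
    wanted = parse_version(version)

    def resolve(match):
        sign, label, body = match.groups()
        changed = parse_version(label)
        if sign == "+":
            # An addition is present from its version onward.
            return body if wanted >= changed else ""
        # A deletion is present only before its version.
        return body if wanted < changed else ""

    plain = MARK.sub(resolve, marked)
    plain = re.sub(r"\s+", " ", plain)             # close up leftover gaps
    return re.sub(r" ([,.;:])", r"\1", plain).strip()


marked = ("This regulation shall apply to each +1[and every] taxpayer "
          "-2[, except members of this legislature].")
for version in ("0", "1", "2"):
    print(version, text_at(marked, version))
```

Text that contains "innocent" brackets of its own would need the
double-bracket form and a correspondingly stricter pattern.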

This principle can be extended to whole sections of text except that it
is better to use double square brackets since the text itself may
contain "innocent" brackets.

-2.3[[ . . . ]] means that this section is omitted in Version 2.3.

This notation soon becomes wearisome after multiple and intricate
revisions. Jim Warren has devised a visual format that collates
multiple versions in tabular or outline form:

012
This regulation shall apply to each
and every
taxpayer
, except members of this legislature
.

RULE 7.10 For complicated additions and deletions, such as those found
in legal matter, use Warren format.

Here are three examples of the formats we have been discussing:


[[include example here]]


One final rule:

RULE 7.11 Don't put spaces between the dots of an ellipsis. Instead,
leave one blank space before and after: ( ... ).

Word processors do not necessarily recognize ellipses as a single
"thing". The gracious effect of spacing created by a typewriter seems
lost on a computer screen.


<Section 3.7.6> General Style and Conventions

This section is about rules that are conventional to almost all typing.
A brief list is included here for completeness:

RULE 7.12 Add two spaces after each major break (period or question
mark, colon, etc.) and one space after minor pauses (comma, semicolon).

An exception is made for periods that are part of an abbreviation or
initials of a name, where the rule is:

RULE 7.13 Allow one space after each initial in a name but not between
initials of an abbreviation: J. E. Goodwin, St. Charles, Ill., U.S.A.

RULE 7.14 Represent a double dash with two hyphens and do not put
spaces on either side of the dash--like this.

RULE 7.15 Certain Latin abbreviations do not have internal spaces, nor
are they in italics: i.e., e.g., etc.

<Section 3.8> Esoterica

99% of all ordinary E-text written in English does not need this
Section. But the issues discussed here greatly affect certain kinds of
text:

1. Texts requiring traditional scholarly adjuncts such as citations,
cross-references, indexing, bibliographies, glossaries, critical
apparatus, and figures;

2. Scientific and mathematical texts that use formulas extensively;

3. Statistical text with frequent use of numbers, uncertainties (plus
or minus), scientific notation, and tabular material. Such text occurs
commonly in the physical and social sciences, e.g. reports of
experiments;

4. Texts in one language that discuss another (language textbooks,
grammars, dictionaries, commentaries, many works in the humanities).


<Section 3.8.1> Inclusions in Languages Other than English

In English, where diacritical marks are rare, foreign languages are. It
is important to distinguish between =transcription= and
=transliteration=. In transcription, an attempt is made to render the
word as nearly as possible using the English alphabet, with or without
diacritics. Precision . Transliteration is an attempt to represent the
*spelling* of the word in the non-English alphabet. Great effort is
made, in designing the transliteration system, to make the
transliteration reversible, so that the exact original text can be
recovered by a knowledgeable human or program.

These two possible approaches to including non-English text lead to two
different rules, depending on intent:

RULE 8.1 Set off foreign phrases with the same delimiters used in place
of italics (usually equals signs).

RULE 8.2 Use special delimiters (for example plus signs or asterisks)
to signal special notations used for *transliteration*.

No attempt is made to distinguish three uses of "equals-italics"--
foreign language italics, cross-reference signal, and miscellaneous
italics. As in print, these can usually be distinguished by context.


Beyond representing foreign phrases exactly, one might want an
informal notation for representing the diacritic marks that do
occasionally occur in English. Using these is probably pedantic in
ordinary E-text, but from time to time they may be useful, e.g. in
grammatical discussions:

RULE 8.3 In ordinary English texts it is not usual to use diacritical
marks, even when the English word technically has them, such as:
fac?ade, ro=le, coo%rdinate, blesse!d.

If absolutely necessary, we recommend:

acute accent: ne/e

grave accent: blesse!d

circumflex accent, tilde, or macron: ro=le, nolo= contendere

diaeresis or umlaut: coo%rdinate

cedilla: fac?ade

The choice of symbols is based on portability (which excludes, for
example, a tilde or circumflex). Also, the notation is just a little
ugly to discourage its overuse.
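
For a reader whose display can show accented letters, the notation is
easy to decode by program. This is a sketch in Python under stated
assumptions: it is applied only to words known to use the notation
(otherwise ordinary question marks and slashes would be misread), and
the equals sign, which the scheme shares among circumflex, tilde, and
macron, is arbitrarily rendered as a macron. The function name is mine.

```python
import re
import unicodedata

# Each mark follows the letter it modifies.
MARKS = {
    "/": "\u0301",  # acute accent
    "!": "\u0300",  # grave accent
    "=": "\u0304",  # macron (assumed; also stands for circumflex, tilde)
    "%": "\u0308",  # diaeresis or umlaut
    "?": "\u0327",  # cedilla
}


def decode_word(word):
    """Replace letter-plus-mark pairs with the accented letter."""
    combined = re.sub(
        r"([A-Za-z])([/!=%?])",
        lambda m: m.group(1) + MARKS[m.group(2)],
        word,
    )
    # NFC folds letter + combining mark into one character where possible.
    return unicodedata.normalize("NFC", combined)


for word in ("fac?ade", "ne/e", "coo%rdinate", "blesse!d"):
    print(word, "->", decode_word(word))
```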


E-texts that discuss foreign languages present special problems.
Here are some suggestions:

1. The basic convention is that the primary language is unmarked, and
the secondary language delimited by asterisks: *E pluribus unum*, or by
equals signs =E pluribus unum=.

The choice of delimiter used requires some thought. In Latin, asterisks
should be used so that equals signs can be used to represent macrons:

Ve=ni=, vi=di=, vi=ci=.

Unless there are considerations like these, the asterisk is chosen for
the most frequent use in the text (usually italics-for-emphasis) because
it is less obtrusive and most conventional.

Since such texts do not usually contain quotations, double quotes may be
used to represent translations or definitions:

=E pluribus unum= means "from many, one."

In printing, both the foreign text and the translations are often
rendered in a different style. If italics are needed for other
purposes, they should be delimited by asterisks:

=E pluribus unum= is *so* Eighteenth Century.

2. If the text contains a selection of many different languages,
special delimiters are used to segregate languages that use the Latin
alphabet from others. In this case no effort is made to choose one
secondary language as "the" secondary language. Instead, the delimiters
are used to mark alphabets that differ visually from the Latin alphabet.

= = Languages using the Latin Alphabet, other than the primary
language (in effect "language italics").

* * Greek

+ + Hebrew

/ / International Phonetic Alphabet

Other delimiters can be constructed =ad hoc=, such as &&[ ... ]&& or
+/ ... /, (* ... *) and so on.

Just a reminder: the recommendations here are strictly for informal use
in the context of "flat" ASCII files, e.g. for casual communication, or
as character-oriented output from a program that uses a proprietary
format or SGML for internal use. Any substantial work with multiple
languages is probably worth the effort to use something other than E-
text for the *underlying* representation. In particular, scholars
should consider the Text Encoding Initiative's recommendation.

Even with an elaborate underlying markup system, however, the problem
remains of how to render the foreign language text, perhaps a text that
does not even use the Latin alphabet, on a character-oriented screen.


<Section 3.8.2> Footnotes, Cross-References, and Bibliographic
Citations

There are two issues here: how to write the citation and where to put
it. As to the first issue, citation schemes that work well in print are
often cumbersome in E-text. The answer to the second issue is:

RULE 8.4 Place footnotes at the foot of the paragraph, or else gather
them in an appendix at the end of the work.

Another common place to put notes, at the end of a Chapter, should be
avoided since it is a relatively hard place to find, compared to the end
of the file.

The inclusion of footnotes in the body of the text with special
delimiters, as is done by many word processors, is a concession to
print-oriented production of text. It places the footnote where the
*program* wants it. From the standpoint of the reader, there may as
well not be a footnote at all!

RULE 8.5 The footnote mark should be as unobtrusive and short as
possible: usually ** or ++, [34], or [Wells85].

. . . as discussed in the paper by Wells.[Wells85] Another . . .

. . . again makes this point in Ref.[36], where the bias . . .

. . . See the Nicomachean Ethics+[NE,1150a]. . . .


Footnotes with a single asterisk could be confused with an "emphasis"
delimiter. Putting asterisks in brackets, [*], seems long-winded.

RULE 8.6 Footnote sequencing should not continue across physical files.
Use dotted decimal notation to refer to "long-range" footnotes: [2.15]
means footnote 15 in chapter 2.


Designing a good bibliographic citation scheme for E-text means
breaking away from print models. Long dashes and hanging indents are
useless in E-text. Also, most readers, if they read notes at all, will
synchronize two windows so that notes can be read in one and the text in
another. *Therefore* it is better to make your annotated bibliography
follow chapter organization than to make it alphabetical or
chronological.

In general, it is a good idea to gather bibliographic references in one
place and *not* put them in footnotes, as is common in print. This is
because many of the citations will be URLs (see =Section 3.7.4=), which
mar the appearance of the text.**

** This assumes the E-text is not being prepared for linkage to the
World Wide Web! In this context, our discussion applies more to the
output of a WWW server than to its input.


<Section 3.8.3> Formulas and Statistical Text

There is a great deal of scope for developing new mathematical notations
that work well with E-text. I can only make a few recommendations and
observations here.

RULE 8.7 Use square brackets to set off "math italics", especially
variable names embedded in ordinary text. Omit the brackets for
displayed equations.

This rule is necessary to make variables stand out. Human eyes that are
used to picking out subtle font differences find it hard to read text
that refers to variables like a where a is the unknown. To repeat, [a],
where [a] is the unknown.

RULE 8.8 Separate displayed material by one blank line before and
after, and indent consistently (five spaces recommended).

Here is a well-known example:

E = m c[2]


E[2] = p[2]c[2] + m[2]c[4]

where [E] is the total energy, [m] is the rest mass, and [c] is the
speed of light in a vacuum.


Scientific notation is a travesty in type. One commonly sees such
attempts as 1e12, 2.005+/-.01, or 2 x 10 5. We recommend quoting
numbers in the following fashion:

1.0E+12, 2.005(10), and 2.E+5.

To my eye, at least, the following rules are useful:

RULE 8.9A. Always use a sign after the "E" in exponential notation;

RULE 8.9B. Always express the decimal in floating point numbers and
precede a decimal point by a zero, i.e. 0.05, not .05.

RULE 8.9C Represent symmetric tolerances in parentheses after the base
number.

A little care here is considerate of the reader and helpful for
subsequent typesetting.
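
Programs that print numbers can be taught the same manners. Here is a
sketch in Python with two helper functions of my own naming. Two
assumptions are built in: the helper always prints at least one digit
after the decimal point (so it gives 2.0E+5 rather than 2.E+5), and the
figure in parentheses is read as the tolerance in units of the last
decimal place quoted, so that 2.005(10) means 2.005 plus or minus
0.010.

```python
def exponential(value, digits=1):
    """Rules 8.9A and 8.9B: an explicit sign after the E, and a digit
    on each side of the decimal point."""
    mantissa, exponent = f"{value:.{digits}E}".split("E")
    # int() drops the leading zero Python pads exponents with;
    # the +d format puts the sign back.
    return f"{mantissa}E{int(exponent):+d}"


def with_tolerance(value, tolerance, decimals):
    """Rule 8.9C: symmetric tolerance in parentheses after the number."""
    last_digits = round(tolerance * 10**decimals)
    return f"{value:.{decimals}f}({last_digits})"


print(exponential(1e12))
print(exponential(0.05))
print(with_tolerance(2.005, 0.01, 3))
```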

RULE 8.10 In running text, superscripts and subscripts could be
represented the same way as footnotes in the main guidelines, viz.

2+[20] = 4+[10],

although the FORTRAN notation 2**20 = 4**10 is more perspicuous.

RULE 8.11 Subscripts and superscripts that represent labels rather than
powers are conveniently handled like array subscripts:

a(1,3) = b(2,4) instead of a+[1]-[3] = b+[2]-[4].

The array indices might use square brackets instead of parentheses.

RULE 8.12 For the mixed case of subscripts for labeling and
superscripts for powers, we recommend:

a1[2] = a2[2] or a1**2 = a2**2 or a(1)[2] = a(2)[2].

The first approach is better suited for long formulas with many powers:

(x+y)[3] := x[3]+3x[2]y+3xy[2]+y[3]

(x+y)**3 := x**3 + 3*x**2*y + 3*x*y**2 + y**3.
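
A side benefit of the ** notation is that several programming
languages accept it as typed, so a formula can be checked before it is
sent. A brief sketch in Python follows; the function name is mine, and
the expansion used is the standard binomial one, with its coefficients
of 3 on the middle terms.

```python
def cube_expanded(x, y):
    """The expansion of (x+y)**3, written term by term."""
    return x**3 + 3*x**2*y + 3*x*y**2 + y**3


# The identity from Rule 8.10, and the expansion at a sample point.
assert 2**20 == 4**10
assert cube_expanded(2, 5) == (2 + 5)**3
print("both formulas check")
```

A check at one or two sample points proves nothing in general, but it
catches most typing slips.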

RULE 8.13. Complex expressions like summations and integrals can be
handled informally as follows:

(1/n)*sum(i=0,n; x(i)[2]) or int(x=0,infty;x[-2]).

RULE 8.14 Matrices, tables, and outlines are handled in a consistent
fashion.

7, 18, 19
-43, 72, 930.1
-1.1, 18, 100
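
A matrix typed this way, one row per line with commas between entries,
is also trivial for a program to read back. A sketch in Python; the
function name is mine, and it assumes every entry is a plain number.

```python
def parse_matrix(block):
    """Read a comma-separated block of text into a list of rows."""
    return [
        [float(entry) for entry in line.split(",")]
        for line in block.strip().splitlines()
    ]


matrix = parse_matrix("""
7, 18, 19
-43, 72, 930.1
-1.1, 18, 100
""")
print(matrix[1][2])
```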

Whereas in print vectors and matrices are represented by boldface
letters, in E-text it is probably best to adopt Paul Dirac's bra-ket**
notation, first developed for Quantum Mechanics. Here, the vector "v"
is represented as [v>. This notation is well-developed and *can* be
typed in E-text.

** The name comes from the following construction: <bra] c [ket>.
The vector is called a "ket", the dual vector a "bra", and [c] is the
operator matrix.


<Section 3.8.4> Verse, Drama, and Liturgy

RULE 8.15 Each line is a separate paragraph. There should be two hard
returns between lines and three between stanzas.

Alternatively, two returns may mark stanzas, with lines beyond the first
indented by white space (one space recommended). Three returns can mark
longer segments. Only one of these two methods should be used in any
one work.

RULE 8.16 Do not try to mimic vertical or horizontal spacing of a
printed source, unless the visual effect of the poem is the main
concern.

RULE 8.17 Run-on lines (say past 80 characters) can be represented by a
slash (/) at the beginning of the line.

RULE 8.18 An asterisk, *, is used to mark caesura, pause, or breathing
mark.

This should be preceded and followed by a space (or return) to prevent
its confusion with a footnote or emphasis delimiter.

RULE 8.19 Use asterisks to delimit stage directions or rubrics.

RULE 8.20 Use special delimiters to mark speakers, roles, or questions
and answers. Follow these with two spaces.

This helps the reader skip from part to part. Ampersands and periods
make unobtrusive delimiters. Brackets are visually more striking:

&Ham. To be or not to be.

&Pol. That is the question.

*or this*

&V. The Lord be with you.

&R. And with thy spirit.

*or this*

&Q.1.5 What is LINUX?

&A. LINUX is a small, free UNIX-like operating system for 386
computers.
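
Delimiters of this kind help programs as well as readers. Here is a
sketch in Python that splits a line into speaker and speech, assuming
the ampersand convention shown above, a tag with no spaces in it, and
a speech that fits on one line. The function name is mine.

```python
import re

# An ampersand, a tag with no spaces, then the speech itself.
SPEECH = re.compile(r"^&(\S+)\s+(.*)$")


def parse_speech(line):
    """Return (speaker, speech), or None for lines with no speaker tag."""
    match = SPEECH.match(line.strip())
    if match is None:
        return None
    # Drop the period that closes the tag, as in "Ham." or "A."
    return match.group(1).rstrip("."), match.group(2)


print(parse_speech("&Ham.  To be or not to be."))
print(parse_speech("&Q.1.5  What is LINUX?"))
```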


<Section 3.9> Electronic Forms and Tests

E-text is often used as a medium for distributing forms, tests, and
other items to be filled out and returned. Often, these forms mimic
paper counterparts at the expense of their purpose--to be easy to fill
out and return. Here are some rules:

RULE 9.1 Avoid the multiple column format common on paper forms.

As soon as you start to fill out the form, the columns don't line up.

RULE 9.2 Skip a line between questions.

This avoids the dread re-formatting problem.

RULE 9.3 Place a left open bracket wherever an answer is required, but
not a right closing one at the end.

In order to fill in a checkbox, you have to position the cursor exactly
in the middle of the box, delete a character and type an "x". It is
easier to position a cursor at the end of the line and start typing
right away.

RULE 9.4 Avoid checkboxes. Ask for a one-character typed answer
instead.

RULE 9.5 Leave four hard returns (three blank lines) between "short
answer" questions.

The responder begins typing at the beginning of the second blank line.

RULE 9.6 Do not use spaces or underscores to show blanks; use periods
or hyphens instead. Put them on the line *below* the response area (so
the responder doesn't have to erase them and lose count!).

Your state or province: [
--

Your zip or postal code: [
-----

This cues the responder as to desired length of the response. Blanks
are invisible, except in certain word processors, and underscores are
often run together, so you can't count them easily.

This sort of form is easy to fill out:

Your city of residence (20 characters max): [Chicago, Illinois
--------------------.
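
Forms built on Rules 9.3 and 9.6 are as easy to generate and to read
back by program as they are to fill out. The sketch below is in
Python; both function names are mine, and it assumes the length cue
starts at the left margin of the line below the question.

```python
def form_field(prompt, width):
    """One question: an open bracket for the answer, with a row of
    hyphens on the line below to cue the desired length."""
    return f"{prompt}: [\n{'-' * width}"


def read_answer(line):
    """Return whatever the responder typed after the open bracket."""
    _, _, answer = line.partition("[")
    return answer.strip()


print(form_field("Your zip or postal code", 5))
print(read_answer("Your city of residence (20 characters max): "
                  "[Chicago, Illinois"))
```

With no closing bracket to look for, the answer is simply the rest of
the line.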


<Section 3.10> The E-Mail Business Letter

The paramount rule in writing an effective E-mail business letter is
brevity.

RULE 10.1 In general, you should omit as much of the traditional
apparatus of the business letter as you can,

since the mailing system may well add lots of unwanted detail. An
effective letter can be as short as:

From: jegoodwin
To: anotheruser
Subject: E-mail
<blank line>
This is what I have to say. =John=

RULE 10.2 Always begin E-mail with a single blank line.

This is to allow some visual separation from the mail header.

RULE 10.3 For short (one paragraph) messages, use only the paragraph
and your name, in-line with the last sentence.

Since brevity is the rule, anything beyond a one-paragraph note should
be carefully trimmed. The model below is about the *maximum* you can do
and still have a brief effective letter. Feel free to omit anything
unnecessary.

At most, an E-mail letter will have the following parts:

1. Mail Header

Do not add a letterhead** or mailing address. The mail system will add
enough garbage as it is. Your info goes at the end of the letter.

** An exception is in resumes and advertisements, where catching the
reader's attention is of paramount importance. There, lots of whitespace
and visually arresting designs are welcome. The effect wears off
quickly, however, so think twice before adding eye-catching effects to
all your E-Mail.

2. Greeting

This is optional. "<skip one line>Dear Sir or Madam<Skip another line>"
(if you don't know the sex of the person you are writing--very
frequently the case, with E-mail), or "Dear George" or simply "George--"

3. Body

This follows the principles in the rest of this manual. Remember:
flush left.

4. Closing and Signature

The closing is optional. "<skip line>Your Name<skip line>" is fine.
If you want one, don't indent it a half page, as is customary in print.

Suggested formal closings are "Sincerely", "[Best] Regards", and
"Thanks". I generally avoid "Thanks in advance", since it implies that
either you aren't thankful if the person doesn't respond (which is
ungracious); or you don't plan to thank them if they do (which is
churlish).

You may use special delimiters to mark your signature, but keep these
light and tasteful. I sign =John Goodwin=. Other persons use two
slashes before their name or add a plus (for clergy), etc., etc. This
is more distinctive than a signature file.

5. Contact information

Since the reader is most likely to contact you just after reading the
letter, put info here.

RULE 10.4 Keep contact information short, probably only your E-mail
address and phone number (two of each, at most).

RULE 10.5 Use the international style for phone numbers: e.g.,
+1 708 840 8069 (work).

Note: "+1" is the Country Code for the U.S.A.

RULE 10.6 Never, NEVER, include a character-drawing or funny quote in a
signature file.

////
[oo] <-- This is me!!! "Remember O Man that Dust Thou Art"
----

Many persons use a "dot-signature" file that is automatically appended
to all their E-mail. The effect is almost invariably puerile and
tasteless. If you include it twice you can add "incompetent" to the
list.

Here is how it looks all together:

To: blah blah
From: blah blah
Subj: blah blah
<--blank line required
Dear Sir or Madam: <-- or Dear George, or Dear Ms. Smith

[Body of text]

[Body of text]

[Last paragraph]

Sincerely, <--optional close

=Your name= <--use signature delimiters for visual effect

[Your Contact information]


<Section 3.11> The Final Rule

And lest the reader forget,

RULE B. All Rules are Made to Be Broken.

Rules summarize experience and judgment. In this manual I have tried to
reflect my own judgment as to what is appropriate, functional, and
aesthetically pleasing. I have not always succeeded. If I have spurred
the reader to consider their own style and refine it for their own
purposes, I will have achieved my whole aim in writing this manual. Above
all, remember, dear reader,

Question Authority. It's wrong.


+ + +

<Appendix A> Technical Details: Relationship to SGML and TEI

Many of the concerns addressed in this manual are common to participants
in the Text Encoding Initiative (TEI) and other users of the ISO
standard, Standard Generalized Markup Language (SGML, see =Section
2.9=). I would like to emphasize, for their benefit, that this manual
describes a *presentation format* and not an encoding format. It is
perfectly possible to create an SGML- or TEI-compliant file that uses
the format discussed in this manual as a visual output format.

There are very distinct advantages to having a visually appealing,
informal, character-oriented format, like the one advocated here, in
which the logical structure (i.e. markup) is still present, but not
visually intrusive. SGML-compliant systems may well produce such a flat
file at the request of a user, or the screen output may be cut from the
program's display window and pasted into such a file. This style manual
has tried to describe design principles that will make the resulting
flat file useful and appealing to read.

Naturally, there are many uses for such a format outside SGML systems as
well; and a certain uniformity, or at least attention to design
principles, can only help make the texts created more useful.

The advantages of SGML or TEI encoding will only come about if word
processors that hide the markup process from the casual user become
commonplace and interoperable. Probably, a low-end freeware editing
system will have to be created.** Until that time, welcome or not, flat
ASCII is not only a visual format, but an interim interchange standard
as well.

** Such a system is being created for the LINUX operating system.

Once again: this is not a new encoding or input format, nor is it
primarily intended as an interchange standard; it is a suggested format
for visual *output* that happens to be maximally transportable at the
present moment.

+ + +

<Table I> Table of Contents

=Part I= Writing for an E-text Audience

=Section 1.1= Why Write for an E-text Audience?

=Section 1.2= Is it Possible to Write E-Text and Print at the Same
Time?

=Section 1.3= Differences between E-Text and Print Media

=Section 1.4= Version Control


=Part II= Specific Differences of Style and Mechanics

=Section 2.1= Differences Traceable to Physical Media

=Section 2.2= Differences in Style

=Section 2.3= Differences in Process

=Section 2.4= Differences in Repertoire

=Section 2.5= Differences in Layout

=Section 2.6= Searching and Hypertext

=Section 2.7= Copyright Issues

=Section 2.8= The Parts of a Book

=Section 2.9= The General Theory of Markup (SGML)

=Section 2.10= Summary: Basic Tricks of the Trade


=Part III= A Very Brief E-Text Style Manual

=Section 3.1= Backups and Saving Work

=Section 3.2= Compressed Files

=Section 3.3= Version Control

=Section 3.4= Use of Word Processing Features

=Section 3.5= Character Set and Font

=Section 3.6= Outlining and Hierarchies

=Section 3.7= Text Inclusions

=Section 3.7.1= Alternate Fonts

=Section 3.7.2= Quotations and Included Blocks of Text

=Section 3.7.3= Lists

=Section 3.7.4= Cross-References, Hypertext, and Embedding

=Section 3.7.5= Editing and Marked Sections

=Section 3.7.6= General Style and Conventions

=Section 3.8= Esoterica

=Section 3.8.1= Inclusions in Languages Other than English

=Section 3.8.2= Footnotes, Cross-References, and Bibliographic
Citations

=Section 3.8.3= Formulas and Statistical Text

=Section 3.8.4= Verse, Drama, and Liturgy

=Section 3.9= Electronic Forms and Tests

=Section 3.10= The E-Mail Business Letter

=Section 3.11= The Final Rule


+ + +

(end of _Elements of E-Text Style_)

Partl

Nov 26, 1993, 8:33:52 AM

May I just tell you that I found something called "EText Style"
(file etext10.sty) somewhere with Gopher. You might re-find it
with a Veronica search for "etext". It contains many interesting
rules for this type of text, including the difference between
reading on paper and reading on a computer screen.

--

Marcia Halio-Peoples

Nov 26, 1993, 12:31:22 PM

Dr. Partl,
I love your motto: Make laugh, not war!
Yes, indeed!
Marcia Halio