FAQs: A Suggested Minimal Digest Format

Chris Lewis

Oct 29, 2002, 1:00:02 AM
Archive-name: faqs/minimal-digest-format
Posting-frequency: every 20 days
Last-modified: Wed Jan 25 23:54:34 EST 1995

FAQs: A Suggested Minimal Digest Format
Chris Lewis
cle...@ferret.ocunix.on.ca


The latest edition of this FAQ can always be retrieved from:

ftp://rtfm.mit.edu/pub/usenet/news.answers/faqs/minimal-digest-format

Changes: URLs are now documented in RFC1630.

------------------------------

Subject: 1. Introduction and Intent

The intent of this FAQ is to provide current and future FAQ maintainers
with a simple description of a minimal format for FAQs. This minimal
format is a simplification of the RFC1153 digest format. It is sufficient
to be compatible with common newsreader digest handling functionality,
with current practice, and with Thomas Fine's "FAQ digest format to HTML"
converter, which allows more sophisticated viewing on HTML-aware systems
such as Mosaic or WWW. There are other more sophisticated formats that
you can use, but this is the simplest one that is compatible with a wide
range of software that understands digest format.

This format is entirely optional. But it is designed to give you the
biggest "bang per buck" in terms of existing software compatibility and
minimum effort. If you believe that your FAQ can benefit from more
sophisticated formats, by all means use them. As such, this FAQ can be
simply considered a guide on how to take advantage of some basic digest
capabilities in end-user viewing software.

Rather than confuse the issue by documenting all of the variation
allowed by existing practice and software, this FAQ documents a single
variant. However, it can be extended by reviewing the documentation
for Thomas Fine's FAQ to HTML converter:

<http://www.cis.ohio-state.edu/hypertext/faq/usenet/faq-format/top.html>

This FAQ is written entirely in the minimal digest format, and can be
used as an example. You can skip from one section to the next
by pressing ^G in many newsreaders, such as rn, trn and strn.

This FAQ describes only how FAQ sections should be delimited, and
a couple of suggestions for meta-references to such things as FTP
or WWW repositories in formats that other tools support.

Note to reader software implementors: you should not take this format
as gospel. Instead, use it as a guide to one minimal format among many
more sophisticated ones. You should really be reading RFC1153 and
Thomas Fine's material, and consulting news.answers to see how FAQs
are formatted in real life. See "Newsreader/Converter Specifics"
for descriptions of how some newsreaders work with digest-like documents.

------------------------------

Subject: 2. Table of Contents

    1. Introduction and Intent
    2. Table of Contents
    3. What Should the Overall FAQ Look Like?
    4. What's a Section, and How is it Formatted?
    5. What is the Table of Contents Format?
    6. What are External Meta References,
       and What is Their Format?
    7. Where Do I need to Look for Other Information?
    8. Newsreader/Converter Specifics

------------------------------

Subject: 3. What Should the Overall FAQ Look Like?

Most FAQs lend themselves to a format like:

    <news headers>
    <news.answers required headers if the FAQ is registered>
    <title and author>
    <section>
    <section>
    <section>
    <section>

While FAQs aren't always lists of questions and answers, they usually
have "sections" of text -- whether they be sets of lists, individual
Q&A's, groups of Q&A, textual sections, whatever. The digest format
is all about how these sections should be delimited for automatic parsing.

Note that this FAQ doesn't attempt to explain the news headers and
news.answers subheaders. For this, you should really consult the
FAQs on how to create news.answers postings. It's worth noting a
few things here. You should use Expires/Supersedes to manage the
deletion of previous copies of your FAQ. It is also a very good idea
to use References: lines to link the parts of multi-part FAQs together
so that Usenet newsreaders keep them together.

------------------------------

Subject: 4. What's a Section, and How is it Formatted?

A "section" is merely a block of text. In many FAQs they are simply
the introduction paragraph, the table of contents, and each question
and answer. Through the use of digest format, most newsreaders can
skip from section to section using the convention presented here, and
more sophisticated packages can hypertext them.

A "section" consists of:

    <blank line>
    <string of 30 hyphens>
    <blank line>
    Subject: <subject line>
    <additional optional RFC822-like headers>
    <blank line>
    <text>

Note that the string of hyphens and "Subject:" must start in column one.
"Subject:" has one space or tab between it and the subject line. If you have
to put "Subject:" in and don't want it interpreted as a section header, just
make sure that it isn't in column one (just like above). If your subject
line is too wide to fit in 80 columns, you can continue it onto the next
lines, with whitespace at the beginning of the following lines. Example:

    Subject: this is a long........
        subject line

The subject can be any arbitrary string of text. You may wish to use
a numbering scheme, for it makes it easier for your readers to "grep"
down to the precise section they want.

You can place additional RFC822-like headers after the Subject, such as
"From:", "Date:" etc. Again, these headers should start in column
one. There should be no blank lines in the entire set of headers
in a section.

The text is free format ASCII and may be formatted any way you wish.
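
The section layout described above can be sketched as a small parser.
This is only an illustration, not the code of any existing newsreader:
the names split_sections and SEPARATOR are made up for this example, and
it assumes separators of exactly 30 hyphens as suggested in this section.

```python
import re

# Assumption: a separator is a line of exactly 30 hyphens starting in column one.
SEPARATOR = re.compile(r"^-{30}\s*$")


def _starts_section(lines, i):
    """True if line i is a separator framed by blank lines and followed by Subject:."""
    return (
        SEPARATOR.match(lines[i]) is not None
        and i >= 1
        and lines[i - 1].strip() == ""
        and i + 2 < len(lines)
        and lines[i + 1].strip() == ""
        and lines[i + 2].startswith("Subject:")
    )


def split_sections(text):
    """Return a list of (subject, other_headers, body) tuples."""
    lines = text.splitlines()
    starts = [i for i in range(len(lines)) if _starts_section(lines, i)]
    sections = []
    for n, start in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(lines)
        headers = []
        j = start + 2
        # Headers run up to the first blank line; a line that begins with
        # whitespace continues the previous header.
        while j < end and lines[j].strip() != "":
            if lines[j][0] in " \t" and headers:
                headers[-1] += " " + lines[j].strip()
            else:
                headers.append(lines[j].rstrip())
            j += 1
        subject = headers[0][len("Subject:"):].strip()
        body = "\n".join(lines[j + 1:end]).strip("\n")
        sections.append((subject, headers[1:], body))
    return sections
```

Anything before the first separator (news headers, title and author) is
ignored by this sketch, and an indented "Subject:" inside the text is
never treated as a header.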

Current FAQ maintainers take note: if you're already using a consistent
format for your FAQ, converting to this format will often require only
one or two global edit commands.
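
As an illustration of such a global edit, here is a minimal sketch for a
purely hypothetical legacy convention in which every section starts with
a line of the form "*** <title>" in column one. The convention and the
name to_digest are assumptions made for this example; substitute
whatever marker your own FAQ uses.

```python
import re

# Hypothetical legacy heading: "*** <title>" in column one, not on the first line.
LEGACY_HEADING = re.compile(r"\n+\*\*\* (.+)\n+")

# Replacement: blank line, 30 hyphens, blank line, Subject: line, blank line.
DIGEST_HEADING = "\n\n" + "-" * 30 + "\n\nSubject: " + r"\1" + "\n\n"


def to_digest(text):
    """Rewrite every legacy heading as a minimal digest section header."""
    return LEGACY_HEADING.sub(DIGEST_HEADING, text)
```

The same substitution can of course be done with a single command in
most editors; the point is only that one pattern usually suffices.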

------------------------------

Subject: 5. What is the Table of Contents Format?

The Table of Contents simply consists of the subject lines from the rest
of the FAQ, excluding "Subject:", and preferably indented. The subject lines
should be exact copies of the section headers.

This is only a suggestion. There is no existing software that parses this
data. The intent of using exactly the same strings as the subjects is
so that users can use search mechanisms to find specific sections. If the
subject line is too long to fit in a table of contents line, it is suggested
that you truncate it at a convenient point - the search will still work.
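
Since the Table of Contents is just the subject lines again, it can be
generated rather than typed. The following is a rough sketch under
stated assumptions: the function names, the default indent and the
default width are all invented for this example, and truncation simply
cuts at the width rather than at a "convenient point".

```python
def subjects(text):
    """Collect subjects from 'Subject:' lines in column one, joining continuations."""
    lines = text.splitlines()
    found = []
    for i, line in enumerate(lines):
        if not line.startswith("Subject:"):
            continue
        subject = line[len("Subject:"):].strip()
        k = i + 1
        # Continuation lines begin with whitespace and are not blank.
        while k < len(lines) and lines[k][:1] in (" ", "\t") and lines[k].strip():
            subject += " " + lines[k].strip()
            k += 1
        found.append(subject)
    return found


def table_of_contents(text, indent="    ", width=72):
    """Return indented table-of-contents lines, truncated to fit the width."""
    room = width - len(indent)
    return [indent + s[:room].rstrip() for s in subjects(text)]
```

Because a truncated entry is still a prefix of the real subject, a
reader searching for it will still land on the right section.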

------------------------------

Subject: 6. What are External Meta References,
        and What is Their Format?

Many of the more sophisticated viewers can "jump" from one FAQ to the
next, retrieve data via FTP, or send email simply by "pointing at"
properly formatted "tags" in your FAQ. This FAQ recommends "URL"
("Uniform Resource Locator") format tags. See Section 7 for a
reference.

If your FAQ refers to an FTP-able file, use this format:

ftp://<inet>/<str>/<str>

Where "<inet>" is the Internet domain name of the server, and
"<str>/<str>" is the path to the file. If you want to refer to a
directory, leave a trailing "/".

This string can be anywhere in the document, inline with text or whatever.

Similarly, for HTML (Hypertext Markup Language) documents,
use http://<inet>/<str>/<str>

For clarity, it's best to surround the URL with angle brackets to make
it easier to parse. This FAQ uses this convention, i.e.:

<ftp://ftp.uunet.ca/distrib/chris_lewis/hp2pbm/>
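
To show why the angle brackets help, here is a minimal sketch of how a
tool might pick such tags out of running text. It is an assumption-laden
example, not the behaviour of any particular viewer: the names
ANGLE_URL, find_urls and is_directory are invented, and only the ftp and
http schemes mentioned in this section are recognised.

```python
import re

# A URL tag: "<", a scheme of ftp or http, then anything up to the closing ">"
# that contains no whitespace and no further angle brackets.
ANGLE_URL = re.compile(r"<((?:ftp|http)://[^<>\s]+)>")


def find_urls(text):
    """Return the URLs found inside angle brackets, in order of appearance."""
    return ANGLE_URL.findall(text)


def is_directory(url):
    """A trailing '/' marks a directory rather than a file."""
    return url.endswith("/")
```

With the brackets in place, trailing punctuation such as a comma or a
full stop after the tag never ends up inside the URL.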

One difficulty with URLs is that they're often quite long. Do not
break them in the middle, or they won't work. If a URL is too long to
fit on the current line, start a new line with the URL. Even if that
looks rather ugly, it's better than a URL that doesn't work or that
wraps beyond the 80th column.
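
If you fill your paragraphs with a program, the advice above amounts to
"never split a word that is a URL". A minimal sketch using Python's
standard textwrap module follows; the function name wrap_keeping_urls
and the default width of 72 are assumptions made for this example.

```python
import textwrap


def wrap_keeping_urls(paragraph, width=72):
    """Fill a paragraph without ever splitting a long word such as a URL.

    break_long_words=False puts an over-long word on a line of its own,
    and break_on_hyphens=False stops host names such as "ohio-state"
    from being split at the hyphen.
    """
    return textwrap.wrap(
        paragraph,
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )
```

A URL longer than the width still overflows its line, but it stays in
one piece, which is the lesser evil described above.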

------------------------------

Subject: 7. Where Do I need to Look for Other Information?

[These seemed relevant, but I need descriptions!]
<http://www.cis.ohio-state.edu/hypertext/usenet/faq-format/www/faq.html>,
<http://www.cis.ohio-state.edu/hypertext/faq/usenet/faq-format/top.html>
<http://www.cis.ohio-state.edu/hypertext/faq/usenet/technical-notes/faq.html>

John E. Goodwin's <JEGO...@delphi.com> "Elements of E-Text Style",

Note that the specification of URLs is now to be found in RFC1630:

"Universal Resource Identifiers in WWW" [Jun 94]
by Tim Berners-Lee <ti...@info.cern.ch>
URL <ftp://ds.internic.net/rfc/rfc1630.txt>
<ftp://ftp.isi.edu/in-notes/rfc1630.txt>
<http://info.cern.ch/hypertext/WWW/Addressing/URL/URL_Overview.html>

------------------------------

Subject: 8. Newsreader/Converter Specifics

Rn, trn, and strn "^G" functionality skips to the next occurrence of
"Subject:" in column one.

GNUs has two "digest" parsers. One insists on full RFC1153 compliance
(main Subject: line "digest" tokens etc.), and the other skips to lines
with (at least 8?) hyphens starting in column 1.

Tin has no digest functionality at present, though tin's author indicates
willingness to add it in a way compatible with this format. This author
suggests either the "^Subject:" or "^-*" approach.
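
The two triggers mentioned so far can be sketched as follows. This is
not code from rn, GNUS or tin, only an illustration of the idea; all
names are invented here. Note that "^-*" read literally as a regular
expression matches every line, so the sketch requires a minimum run of
hyphens, with the default of 8 taken from the uncertain figure quoted
above for GNUS.

```python
import re

# Trigger 1: "Subject:" in column one.
SUBJECT_TRIGGER = re.compile(r"^Subject:")


def hyphen_trigger(min_hyphens=8):
    """Trigger 2: a run of at least min_hyphens hyphens starting in column one."""
    return re.compile(r"^-{%d,}" % min_hyphens)


def next_boundary(lines, current, trigger):
    """Return the index of the first line after `current` that matches, or None."""
    for i in range(current + 1, len(lines)):
        if trigger.match(lines[i]):
            return i
    return None
```

With the "Subject:" trigger the reader lands on the section title
itself; with the hyphen trigger it lands two lines earlier, on the
separator.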

Nn triggers on Subject: plus From:, which is often not applicable
to FAQs. Nn "explodes" FAQs with both Subject: and From: subheaders
into individual articles. Most nn users with whom this author has
discussed this do not want FAQs to behave this way, which is why this
format doesn't require "From:" lines.
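
A maintainer who wants to know in advance whether a section would be
split this way can check its subheaders. The sketch below is based
solely on the description above, not on nn's source code, and the
function names are invented for this example.

```python
def header_names(header_lines):
    """Return the lower-cased names of the headers in a section's header block."""
    names = set()
    for line in header_lines:
        if line[:1] in (" ", "\t"):
            continue  # continuation of the previous header
        name, colon, _ = line.partition(":")
        if colon:
            names.add(name.strip().lower())
    return names


def nn_would_explode(header_lines):
    """True if the section carries both Subject: and From: subheaders."""
    return {"subject", "from"} <= header_names(header_lines)
```

Sections written in the minimal format carry only "Subject:", so under
this description they are left alone.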

Thomas Fine's FAQ to HTML conversion system uses a scoring system to
measure compliance with the:

    <blank line>
    <line of hyphens>
    <blank line>
    Subject: <subject>

format. See the following for more detail:

<http://www.cis.ohio-state.edu/hypertext/faq/usenet/faq-format/top.html>

I would appreciate detail on digest/FAQ parsing in other newsreaders and
conversion systems.
