Newsgroups: comp.text.sgml
From: Lars Marius Garshol <lar...@ifi.uio.no>
Date: 1998/06/17
Subject: Re: Question from a newcomer

* Ron Barnhart
|
| Also, what is Extended Backus-Naur Notation.

Backus-Naur notation (more commonly known as BNF or Backus-Naur Form)
is a formal mathematical way to describe a language. It was developed
by John Backus, and refined by Peter Naur, to describe the syntax of
the Algol 60 programming language.

BNF works by means of so-called production rules. I.e., you start
with a symbol and are then given various alternatives for what you can
replace that symbol with. Let's say you start with the symbol S and
have these production rules:

( := means that the left-hand side is to be replaced with one of the
alternatives on the right-hand side. | divides the alternatives.
I've used @ to mean nothing, i.e. the empty string, here.)

S  := - FN |
      FN
FN := DL |
      DL . DL
DL := D @ |
      D DL
D  := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

In this language, 3.14 and -5 are valid sentences, while 3..14 is not.
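To make the replacement idea concrete, here is a minimal Python sketch
of a recognizer that follows the four rules one function per symbol.
The function names (parse_s, parse_fn, parse_dl, parse_d, is_sentence)
are my own, chosen for illustration; each function returns the position
just after the text it matched, or None if the rule does not apply.

```python
DIGITS = "0123456789"

def parse_d(text, pos):
    """D := 0 | 1 | ... | 9. Match exactly one digit."""
    if pos < len(text) and text[pos] in DIGITS:
        return pos + 1
    return None

def parse_dl(text, pos):
    """DL := D @ | D DL. One digit, then nothing or another DL."""
    after_d = parse_d(text, pos)
    if after_d is None:
        return None
    # Try the recursive alternative (D DL) first ...
    rest = parse_dl(text, after_d)
    # ... and fall back to the alternative that ends here (D @).
    return rest if rest is not None else after_d

def parse_fn(text, pos):
    """FN := DL | DL . DL."""
    after_dl = parse_dl(text, pos)
    if after_dl is None:
        return None
    if after_dl < len(text) and text[after_dl] == ".":
        after_fraction = parse_dl(text, after_dl + 1)
        if after_fraction is not None:
            return after_fraction
    return after_dl

def parse_s(text, pos=0):
    """S := - FN | FN."""
    if pos < len(text) and text[pos] == "-":
        return parse_fn(text, pos + 1)
    return parse_fn(text, pos)

def is_sentence(text):
    """A string is a sentence if S matches it from start to end."""
    return parse_s(text) == len(text)

for candidate in ["3.14", "-5", "3..14"]:
    print(candidate, is_sentence(candidate))
```

Note how parse_dl calls itself, just as DL refers to itself in the
grammar.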

In DL I have to use recursion to express the fact that there can be
any number of Ds (DL = decimal list, D = decimal). This is a bit
awkward and makes the BNF harder to read. Extended BNF solves this
by adding three operators:

 ?  means that the symbol (or group of symbols in parentheses)
    in front of it can be skipped
 *  means that something can be repeated any number of times
    (and possibly skipped)
 +  means that something can be repeated any number of times,
    but must be present at least once

So in extended BNF the above grammar can be written as:

S  := (-)? D+ (. D+)?
D  := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

which is rather nicer. :)
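The ?, * and + operators carry the same meaning in regular expressions,
so this particular grammar happens to translate almost symbol for
symbol into one. A small Python sketch, assuming the standard re
module; the names NUMBER and matches_ebnf are mine. (This only works
because the grammar has no real nesting; most grammars cannot be
turned into a regular expression like this.)

```python
import re

# S := (-)? D+ (. D+)?   with   D := 0 | 1 | ... | 9
# [0-9] plays the role of D; the dot is escaped because an
# unescaped . matches any character in a regular expression.
NUMBER = re.compile(r"(-)?[0-9]+(\.[0-9]+)?")

def matches_ebnf(text):
    """True if the whole string is a sentence of the grammar."""
    return NUMBER.fullmatch(text) is not None

for candidate in ["3.14", "-5", "3..14"]:
    print(candidate, matches_ebnf(candidate))
```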

BNF and EBNF are used to formally define the grammar of a language, so
that there is no disagreement or ambiguity as to what is allowed and
what is not. They are precise enough that there is a lot of
mathematical theory around these kinds of grammars, and one can
mechanically construct a parser for a language given a BNF grammar
for it. (There are some kinds of grammars for which this isn't
possible, but they can usually be transformed manually into ones that
can be used.)

Some people have in fact done that with the BNF grammar in the XML
specification to create XML parsers. You still have to do some work to
tell the parser-compiler what to do when it sees different constructs,
but it really helps a lot in constructing a parser.
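As an illustration of what "telling the parser what to do" means, here
is a hand-written Python sketch for the small number grammar
S := (-)? D+ (. D+)? used earlier. It is not the output of any real
parser generator, and the names parse_digits and parse_number are
hypothetical. The point is only that each construct in the grammar
gets a bit of code attached that says what to do with the text it
matched; here the actions build an exact Fraction.

```python
from fractions import Fraction

DIGITS = "0123456789"

def parse_digits(text, pos):
    """D+ : consume one or more digits, return (digits, new position)."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end == pos:
        raise ValueError(f"digit expected at position {pos}")
    return text[pos:end], end

def parse_number(text):
    """S := (-)? D+ (. D+)? ; returns the value as a Fraction."""
    pos = 0
    sign = 1
    if text.startswith("-"):
        # Action for the optional minus sign: remember to negate.
        sign, pos = -1, 1
    whole, pos = parse_digits(text, pos)
    # Action for the first D+: it is the integer part.
    value = Fraction(int(whole))
    if pos < len(text) and text[pos] == ".":
        fraction, pos = parse_digits(text, pos + 1)
        # Action for the second D+: scale by the number of digits.
        value += Fraction(int(fraction), 10 ** len(fraction))
    if pos != len(text):
        raise ValueError(f"unexpected character at position {pos}")
    return sign * value

print(parse_number("3.14"))
print(parse_number("-5"))
```

A parser generator would produce the matching code from the grammar
for you; the parts marked "Action" are what you would still have to
write yourself.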

| As far as I can tell, it looks like a DTD for DTD's (in other words,
| it describes in some cryptic format how a DTD is supposed to be
| formatted).

If you mean the syntax of a DTD you are entirely right, although it
can conceivably be used to describe the syntax of any language (again
with some restrictions: BNF isn't powerful enough for all languages).

| Is that an accurate description, and if so, what is the point of it?
| Will I have to use this type of notation to create a DTD or is it
| just background information?

I'd guess that it's there to tell you what DTDs are allowed to look
like. Syntax diagrams (so-called railroad diagrams) would probably
have been much easier to read for people not trained in computer
science, but also a lot more work to create.

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there."     -- Edwin Brock