Summary: a side-by-side translation of the BNF section of the HTTP specification, to make the RFC easier to read


靳雄飞

Nov 3, 2009, 6:26:48 AM
to 读S计划 - Java Web 方向
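Backus-Naur Form (BNF) is an important abstract notation, commonly used to describe language syntax, data formats, and the like. The conventions of BNF are fairly simple, but the HTTP specification extends standard BNF in several ways. Knowing BNF is all but essential for reading and understanding specifications of this kind.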

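This post walks through section 2.1 of the HTTP specification (RFC 2616): each passage of the original is followed by a paraphrase, to help with study.
Note that the paraphrase is free rather than literal and includes the author's own notes, so it may not match the original sentence for sentence.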

2.1 Augmented BNF

All of the mechanisms specified in this document are described in
both prose and an augmented Backus-Naur Form (BNF) similar to that
used by RFC 822 [9]. Implementors will need to be familiar with the
notation in order to understand this specification. The augmented BNF
includes the following constructs:
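This document describes the mechanisms of HTTP in two ways: in free-form prose, and in an augmented Backus-Naur Form similar to the one used by RFC 822. To understand the specification, readers need to be familiar with BNF notation. The augmented BNF includes the following constructs: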

name = definition
The name of a rule is simply the name itself (without any
enclosing "<" and ">") and is separated from its definition by the
equal "=" character. White space is only significant in that
indentation of continuation lines is used to indicate a rule
definition that spans more than one line. Certain basic rules are
in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle
brackets are used within definitions whenever their presence will
facilitate discerning the use of rule names.
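A rule consists of a name and a definition, separated by an equals sign "=". When a rule definition spans more than one line, the continuation lines must be indented with white space; otherwise white space is not significant. Certain basic rules, such as SP, LWS, HT, CRLF, DIGIT, and ALPHA, have uppercase names. Where a definition refers to another rule by name, the name may be wrapped in angle brackets "<" and ">" to make it easier to recognize, but note that the angle brackets are not part of the name.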

"literal"
Quotation marks surround literal text. Unless stated otherwise,
the text is case-insensitive.
Text enclosed in quotation marks is literal text. It is case-insensitive unless stated otherwise.
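
As a small illustration of the case-insensitivity rule, here is a minimal Python sketch. The function name `matches_literal` is made up for this example, and it assumes plain ASCII input, as in HTTP header text.

```python
def matches_literal(literal: str, text: str) -> bool:
    """Compare text against a quoted BNF literal, ignoring case.

    Hypothetical helper: assumes ASCII input, so lower() is sufficient.
    """
    return literal.lower() == text.lower()


print(matches_literal("yes", "YES"))  # True
print(matches_literal("yes", "no"))   # False
```

A rule that is marked as case-sensitive in the specification would need a plain `==` comparison instead.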

rule1 | rule2
Elements separated by a bar ("|") are alternatives, e.g., "yes |
no" will accept yes or no.
Elements separated by a bar "|" mean "choose exactly one"; the bar can be read as "or". For example, "yes | no" accepts yes or no.

(rule1 rule2)
Elements enclosed in parentheses are treated as a single element.
Thus, "(elem (foo | bar) elem)" allows the token sequences "elem
foo elem" and "elem bar elem".
Rules enclosed in parentheses are treated as a single element. Thus "(elem (foo | bar) elem)" can produce two token sequences: "elem foo elem" and "elem bar elem".
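
Alternatives and grouping map fairly directly onto regular expressions. The sketch below is one possible translation of "(elem (foo | bar) elem)" using Python's `re` module. The names `PATTERN` and `accepts` are invented for this example, and it assumes the tokens are separated by exactly one space, which is stricter than the implied white space the specification allows.

```python
import re

# Hypothetical regex for the BNF "(elem (foo | bar) elem)".
# Assumption: tokens are separated by exactly one space.
# IGNORECASE reflects the rule that quoted literals ignore case.
PATTERN = re.compile(r"elem (foo|bar) elem", re.IGNORECASE)


def accepts(text: str) -> bool:
    """Return True if the whole text matches the grouped rule."""
    return PATTERN.fullmatch(text) is not None


print(accepts("elem foo elem"))  # True
print(accepts("elem bar elem"))  # True
print(accepts("elem baz elem"))  # False
```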

*rule
The character "*" preceding an element indicates repetition. The
full form is "<n>*<m>element" indicating at least <n> and at most
<m> occurrences of element. Default values are 0 and infinity so
that "*(element)" allows any number, including zero; "1*element"
requires at least one; and "1*2element" allows one or two.
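An asterisk "*" before an element means the element may repeat. The full form is "<n>*<m>element", meaning the element occurs at least n times and at most m times; n defaults to 0 and m defaults to infinity. So "*(element)" allows any number of occurrences, including zero; "1*element" requires at least one; and "1*2element" allows only one or two.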
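
The "<n>*<m>element" form corresponds closely to the `{n,m}` quantifier in regular expressions. The following is a minimal sketch of that correspondence in Python. The helper `repeat` is hypothetical, and it assumes the element is already given as a regex fragment.

```python
import re
from typing import Optional


def repeat(element: str, n: int = 0, m: Optional[int] = None) -> str:
    """Build a regex for the BNF form <n>*<m>element.

    Hypothetical helper: `element` is a regex fragment.
    n defaults to 0 and m=None stands for "infinity", as in the RFC.
    """
    upper = "" if m is None else str(m)
    return f"(?:{element}){{{n},{upper}}}"


print(repeat("a", 1, 2))  # (?:a){1,2}

# "*(element)" allows zero occurrences; "1*element" does not.
print(re.fullmatch(repeat("a"), "") is not None)     # True
print(re.fullmatch(repeat("a", 1), "") is not None)  # False
```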

[rule]
Square brackets enclose optional elements; "[foo bar]" is
equivalent to "*1(foo bar)".
Elements in square brackets are optional. "[foo bar]" is equivalent to "*1(foo bar)", that is, it may occur zero times or once.
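
The equivalence between "[foo bar]" and "*1(foo bar)" can be checked with two regular expressions that should accept exactly the same inputs. This is only a sketch: the function names are invented, and "foo bar" is treated as fixed text with a single space.

```python
import re

# "[foo bar]" written as an optional group.
OPTIONAL = re.compile(r"(?:foo bar)?")
# "*1(foo bar)" written as "at least 0, at most 1".
AT_MOST_ONE = re.compile(r"(?:foo bar){0,1}")


def accepts_optional(text: str) -> bool:
    return OPTIONAL.fullmatch(text) is not None


def accepts_at_most_one(text: str) -> bool:
    return AT_MOST_ONE.fullmatch(text) is not None


for sample in ["", "foo bar", "foo barfoo bar"]:
    # Both forms give the same verdict for every sample.
    print(repr(sample), accepts_optional(sample), accepts_at_most_one(sample))
```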

N rule
Specific repetition: "<n>(element)" is equivalent to
"<n>*<n>(element)"; that is, exactly <n> occurrences of (element).
Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three
alphabetic characters.
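A number N placed before an element means the element must occur exactly N times, so "<n>(element)" is equivalent to "<n>*<n>(element)". For example, "2DIGIT" is a two-digit number and "3ALPHA" is a string of three letters.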
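
The two examples, 2DIGIT and 3ALPHA, can be sketched as fixed-count regular expressions. The function names are made up, and the character classes assume that DIGIT means the ASCII digits 0-9 and ALPHA means the ASCII letters A-Z and a-z.

```python
import re

# Assumption: DIGIT is ASCII 0-9, ALPHA is ASCII A-Z and a-z.
TWO_DIGIT = re.compile(r"[0-9]{2}")       # 2DIGIT
THREE_ALPHA = re.compile(r"[A-Za-z]{3}")  # 3ALPHA


def is_2digit(text: str) -> bool:
    return TWO_DIGIT.fullmatch(text) is not None


def is_3alpha(text: str) -> bool:
    return THREE_ALPHA.fullmatch(text) is not None


print(is_2digit("07"))   # True
print(is_2digit("7"))    # False
print(is_3alpha("GMT"))  # True
```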

#rule
A construct "#" is defined, similar to "*", for defining lists of
elements. The full form is "<n>#<m>element" indicating at least
<n> and at most <m> elements, each separated by one or more commas
(",") and OPTIONAL linear white space (LWS). This makes the usual
form of lists very easy; a rule such as
( *LWS element *( *LWS "," *LWS element ))
can be shown as
1#element
Wherever this construct is used, null elements are allowed, but do
not contribute to the count of elements present. That is,
"(element), , (element) " is permitted, but counts as only two
elements. Therefore, where at least one element is required, at
least one non-null element MUST be present. Default values are 0
and infinity so that "#element" allows any number, including zero;
"1#element" requires at least one; and "1#2element" allows one or
two.
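"#" works like "*", except that "#" is used to define lists. The full form "<n>#<m>element" means at least n and at most m elements, with adjacent elements separated by one or more commas ","; linear white space (LWS) may appear between elements but is not required. This makes lists easy to express. Without "#", the rule
1#element
would have to be written as
( *LWS element *( *LWS "," *LWS element ))
The two are equivalent, both describing a list of at least one element, but the latter is far more complicated.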

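Null elements are allowed in a list (that is, two consecutive commas, as in ", ,"), but they do not count toward the list's length. For example, "(element), , (element) " is legal, but its length is 2. So if a rule requires at least one element, the list must contain at least one non-null element. n defaults to 0 and m defaults to infinity, so "#element" is a list of any length, including an empty list; "1#element" is a list of at least one element; and "1#2element" is a list of one or two elements.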
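
To make the list rule concrete, here is a minimal Python sketch of a parser for "<n>#<m>element". The function `parse_list` and its parameters are hypothetical. It assumes that elements never contain commas themselves (so quoted strings are not handled) and that LWS is limited to spaces, tabs, CR and LF.

```python
from typing import List, Optional


def parse_list(value: str, minimum: int = 0,
               maximum: Optional[int] = None) -> List[str]:
    """Parse a comma-separated list in the style of <n>#<m>element.

    Hypothetical helper. Null elements (nothing but white space
    between two commas) are dropped and do not count toward the length.
    """
    parts = [part.strip(" \t\r\n") for part in value.split(",")]
    items = [part for part in parts if part]  # drop null elements
    if len(items) < minimum:
        raise ValueError(f"expected at least {minimum} element(s)")
    if maximum is not None and len(items) > maximum:
        raise ValueError(f"expected at most {maximum} element(s)")
    return items


print(parse_list("gzip, , deflate"))  # ['gzip', 'deflate']
print(parse_list(""))                 # []
```

With `minimum=1`, an input such as " , " raises `ValueError`, because it contains only null elements.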

; comment
A semi-colon, set off some distance to the right of rule text,
starts a comment that continues to the end of line. This is a
simple way of including useful notes in parallel with the
specifications.
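A semicolon placed to the right of the rule text (set off by white space) starts a comment that runs to the end of the line. This is a convenient way to attach extra notes to a rule.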
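
A tool that reads these rules would need to strip comments without being fooled by a semicolon inside a quoted literal such as ";". The sketch below shows one way to do that. The function `strip_comment` is hypothetical, and it simplifies in two ways: it does not require white space before the semicolon, and it ignores semicolons that might appear inside angle-bracket prose.

```python
def strip_comment(line: str) -> str:
    """Remove a trailing "; comment" from one line of a BNF rule.

    Hypothetical helper: a semicolon inside double quotes is treated
    as literal text, not as the start of a comment.
    """
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes
        elif char == ";" and not in_quotes:
            return line[:index].rstrip()
    return line.rstrip()


print(strip_comment('flag = "yes" | "no"    ; a note'))  # flag = "yes" | "no"
print(strip_comment('params = *( ";" param )'))          # params = *( ";" param )
```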

implied *LWS
The grammar described by this specification is word-based. Except
where noted otherwise, linear white space (LWS) can be included
between any two adjacent words (token or quoted-string), and
between adjacent words and separators, without changing the
interpretation of a field. At least one delimiter (LWS and/or
separators) MUST exist between any two tokens (for the definition
of "token" below), since they would otherwise be interpreted as a
single token.
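The grammar described in this specification is word-based. Unless noted otherwise, linear white space (LWS) may appear between any two adjacent words, or between a word and a separator, and the amount of white space does not change the meaning. Two tokens ("token" is defined elsewhere in the specification) must have at least one white space character or separator between them; otherwise they would be interpreted as a single token.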
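
The effect of delimiters can be seen with a small tokenizer sketch. The character class below is an assumption brought in from section 2.2 of RFC 2616, where a token is any run of printable ASCII characters other than the separators; it is not part of the passage above. The names `TOKEN` and `tokens` are invented for this example, and quoted strings are not handled.

```python
import re
from typing import List

# Assumption (RFC 2616 section 2.2): token characters are printable
# ASCII except the separators ( ) < > @ , ; : \ " / [ ] ? = { } SP HT.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def tokens(text: str) -> List[str]:
    """Return the tokens in text, split on LWS and separators."""
    return TOKEN.findall(text)


print(tokens("foo bar"))           # ['foo', 'bar']
print(tokens("foobar"))            # ['foobar']
print(tokens("text/html; q=0.8"))  # ['text', 'html', 'q', '0.8']
```

Without a delimiter, "foo" and "bar" are read as the single token "foobar", which is the case the last sentence of the passage warns about.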
