
Syntactical definition of English


Val Kartchner

Oct 11, 1988, 9:49:58 PM

Does someone out there have a syntactical definition of English? I
would like to build English language parsers for various purposes
including adventure game authoring systems.
Thanks in advance,
-=:[ VAL ]:=-
--
---- /\ ----------------------------------------------------------------
/\/\ . /\ | Val Kartchner {UT@WSC} | 'vi' must go, this
/ \/ \/\/ \ | #include <disclaimer.h> | is non-negotiable.
===/ U i n T e c h \===!ihnp4!utah-cs!utah-gr!uplherc!sp7040!obie!val=====

s...@hpclskh.hp.com

Oct 19, 1988, 11:51:22 AM
If you get one...post it. Should be an AMAZING grammar to see, if not good for
a few laughs.

Tim Budd

Oct 19, 1988, 12:10:45 PM
You may remember that Context Free Languages were discovered by a
Linguist, Noam Chomsky, not a computer scientist. At the time (mid
1950's), there was great hope that a CFL, or at worst a CSL (context
sensitive language) could be found that would describe English, and
other such grammars developed for other natural languages.
Such efforts more or less met with utter and complete defeat in the late
50's and 60's. Indeed so much so that some people working in understanding
English (such as the folks at Yale), almost totally abandoned any
notion of syntax, and proceeded with just a semantic analysis of
utterances. So I fear your quest will be a futile one; the best you
can hope for is a grammar for a rather stilted and minimal subset of
English.

<sentence> ::= <subject> <verb> <object>
<subject> ::= I | teachers | policemen | the mob
<verb> ::= eat | love | detest
<object> ::= mice | chocolate | teachers | little children
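
A grammar this small can simply be enumerated. What follows is a minimal Python sketch, not anything from the original post: it assumes the BNF above is stored as a dict of alternatives, and the names GRAMMAR, expand, and is_sentence are illustrative only.

```python
from itertools import product

# The toy grammar from the post: nonterminal -> list of alternatives,
# each alternative being a list of symbols.
GRAMMAR = {
    "sentence": [["subject", "verb", "object"]],
    "subject": [["I"], ["teachers"], ["policemen"], ["the mob"]],
    "verb": [["eat"], ["love"], ["detest"]],
    "object": [["mice"], ["chocolate"], ["teachers"], ["little children"]],
}

def expand(symbol):
    """Return every terminal string derivable from symbol.

    Only safe for finite (non-recursive) grammars like this one.
    """
    if symbol not in GRAMMAR:  # anything without a rule is a terminal
        return [symbol]
    results = []
    for alternative in GRAMMAR[symbol]:
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append(" ".join(combo))
    return results

SENTENCES = expand("sentence")

def is_sentence(text):
    """True if text, with whitespace normalized, is in the language."""
    return " ".join(text.split()) in set(SENTENCES)
```

The whole language is 4 x 3 x 4 = 48 sentences, which is roughly what "stilted and minimal" means in practice.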

Ralph Hyre

Oct 19, 1988, 4:15:24 PM
In article <69...@orstcs.CS.ORST.EDU> bu...@mist.UUCP (Tim Budd) writes:
>Linguist, Noam Chomsky, not a computer scientist.
>At the time (mid 1950's), there was great hope that a CFL, or at worst a CSL
>(context sensitive language) could be found that would describe English, and
>other such grammars developed for other natural languages.
Even stilted English would be enough for me. I just want to talk to my
Unix system in a more conversational manner; I hate having the keystrokes
'ls -al' burned into my brain, wasting those valuable neural pathways.

>Such efforts more or less met with utter and complete defeat in the late
>50's and 60's.

Interesting that some of the technology lived on in the educational system:
(i.e., my school system)
'phonics' (the name given to my 3rd grade language class), where we learned
S -> N V, and more elaborate sentence diagramming in 7th grade:
S -> NP VP, NP -> prep N, N -> cat,dog, prep -> about, above & 50 others.

Then, in college, I learned about REAL linguistics and affix hopping and
such.
--
- Ralph W. Hyre, Jr.
Internet: ral...@ius3.cs.cmu.edu Phone:(412) CMU-BUGS
Amateur Packet Radio: N3FGW@W2XO, or c/o W3VC, CMU Radio Club, Pittsburgh, PA
"You can do what you want with my computer, but leave me alone!8-)"

Greg Lee

Oct 19, 1988, 11:34:08 PM
From article <69...@orstcs.CS.ORST.EDU>, by bu...@mist.cs.orst.edu (Tim Budd):

" You may remember that Context Free Languages were discovered by a
" Linguist, Noam Chomsky, not a computer scientist. At the time (mid
" 1950's), there was great hope that a CFL, or at worst a CSL (context
" sensitive language) could be found that would describe English, and
" other such grammars developed for other natural languages.
" Such efforts more or less met with utter and complete defeat in the late
" 50's and 60's. Indeed so much so that some people working in understanding

Context free phrase structure grammar lives! It's the basis of the
best current theory of syntax, GPSG -- Generalized Phrase Structure
Grammar.
Greg, l...@uhccux.uhcc.hawaii.edu

Rob Bernardo

Oct 20, 1988, 12:55:04 AM
In article <33...@pt.cs.cmu.edu> ral...@ius3.ius.cs.cmu.edu (Ralph Hyre) writes:
+even stilted English would be enough for me. I just want to talk to my
+Unix system in a more conversational manner; I hate having the keystrokes
+'ls -al' burned into my brain, wasting those valuable neural pathways.

Let's see, if your UNIX system understood conversational English only,
you'd have to say:

Give me a long listing of everything in the directory.
--
Rob Bernardo, Pacific Bell UNIX/C Reusable Code Library
Email: ...![backbone]!pacbell!rob OR r...@PacBell.COM
Office: (415) 823-2417 Room 4E750A, San Ramon Valley Administrative Center
Residence: (415) 827-4301 R Bar JB, Concord, California

Clay M Bond

Oct 20, 1988, 6:09:37 AM
Tim Budd:

>You may remember that Context Free Languages were discovered by a
>Linguist, Noam Chomsky, not a computer scientist.

No, I don't, actually. I'm not quite sure what you mean here.
They certainly weren't "discovered" though if this is supposed
to mean that Nim first proposed that natural language could be
generated with a CFG then it makes more sense (though that, too
is wrong. Harris, not Nim.)


>1950's), there was great hope that a CFL, or at worst a CSL (context
>sensitive language) could be found that would describe English, and

You mean CF/SG, don't you? If language X can be generated by a CFG,
then language X is a CFL; a CFL is not going to describe English.


>Such efforts more or less met with utter and complete defeat in the late
>50's and 60's.

No argument.


>Indeed so much so that some people working in understanding
>English (such as the folks at Yale), almost totally abandoned any
>notion of syntax, and proceeded with just a semantic analysis of
>utterances.

I fail to see what the difference is, assuming the semantic analyses
used are mathematical possible-worlds models which have nothing to
do with reality, much less language. You're manipulating symbols.
How is manipulating semantic symbols different from manipulating
syntactic ones, save that the former is more challenging since it's
more obvious that symbol systems don't work.

What? This construction doesn't fit the rule? Write another
rule/feature, of course!

The plight of the semanticist is no less futile than the syntactician.

--
<<<<<<<<<<<<***<<<<<<<<<<<<***<<<<<<***>>>>>>***>>>>>>>>>>>>***>>>>>>>>>>>>
<< Clay Bond, IU Department of Leather er uh, Linguistics >>
<< ARPA: bo...@iuvax.cs.indiana.edu AKA: Le Nouveau Marquis de Sade >>
<<<<<<<<<<<<***<<<<<<<<<<<<***<<<<<<***>>>>>>***>>>>>>>>>>>>***>>>>>>>>>>>>

Clay M Bond

Oct 20, 1988, 7:35:10 AM
Ralph Hyre:

>Unix system in a more conversational manner; I hate having the keystrokes

>'ls -al' burned into my brain, wasting those valuable neural pathways.

You might want to write an alias in your .login file to give it a
rest. Suggestions ... alias trash ls -al, alias junk ls -al, alias GOP
ls -al, the name of your current least-favorite person ... and not only
do you give your synapses a rest, but you take out some frustration as
well.

For a while I had alias noam rm ... and after a week or so of deleting
files, I felt better.

Kevin S. Van Horn

Oct 20, 1988, 1:56:42 PM
In article <69...@orstcs.CS.ORST.EDU> bu...@mist.UUCP (Tim Budd) writes:
>Such efforts more or less met with utter and complete defeat in the late
>50's and 60's. Indeed so much so that some people working in understanding
>English (such as the folks at Yale), almost totally abandoned any
>notion of syntax, and proceeded with just a semantic analysis of
>utterances. So I fear your quest will be a futile one; the best you
>can hope for is a grammar for a rather stilted and minimal subset of
>English.

I think that Fred Thompson, of the Caltech C.S. Dep't., would not
entirely agree with this statement. His work is in natural-language
interfaces and, though recognizing its limits, he has managed to do
quite a bit using a syntax-based approach. The person who originally
asked about this may want to write Dr. Thompson, at Caltech 256-80,
Pasadena, CA 91125.

Kevin S. Van Horn

dave_lawrence

Oct 20, 1988, 4:24:30 PM
r...@pbhyf.PacBell.COM (Rob Bernardo) writes:
>ral...@ius3.ius.cs.cmu.edu (Ralph Hyre) writes:
>+even stilted English would be enough for me. I just want to talk to my
>+Unix system in a more conversational manner; I hate having the keystrokes
>+'ls -al' burned into my brain, wasting those valuable neural pathways.
>
>Let's see, if your UNIX system understood conversational English only,
>you'd have to say:
>
> Give me a long listing of everything in the directory.

or, more accurately, you would have to tell it
Give me a long listing (permissions, groups and all that good stuff)
of every file in the -current- directory.
(unless you had a parser that understood implied words ...)

Wouldn't you just love to write the parser that could correctly handle,
in the English (not -American- (personal pet peeve) |:-) language the
equivalent of the following...

alias news-dates grep 'Date:' /usenet/spool/\$1/* | sed 's/.*:.*: \(.*\)/\1/' | sed 's/^. / &/' | sort | sort -f -M +1 | sed 's/\(.* \)..:.*$/\1/' | uniq -c

Well, it might not look quite as bad, but I wouldn't say it to mum at
Christmas dinner ....

Cheerio,
Dave
--
g l o r i o u s   e x i s t e n c e
EMAIL: ta...@rpitsmts.bitnet, tale%mts.rpi.edu@rpitsgw, ta...@pawl.rpi.edu

Dave Jones

Oct 20, 1988, 5:09:34 PM
I recall having seen a hardback book of a few hundred pages, filled
with English language productions, nothing else. That was about ten
years ago. I don't remember the name of it. It wouldn't help much
in writing shells, I fear, but it might be interesting to look at again.


Dave J.

Steven Ryan

Oct 20, 1988, 7:27:33 PM
>You may remember that Context Free Languages were discovered by a
>Linguist, Noam Chomsky, not a computer scientist. At the time (mid
>......

Eh?

I think somebody forgot Type 0 = Turing Machine.

Anyway, check out Appendix ?B of Terry Winograd's book, some or other,
Part I: Syntax.

No, nobody has a complete, formal syntax/semantics of any natural language,
but you said you wanted it for a game? This kind of stuff covers most cases.
For what it doesn't, just respond

Eh? I'm sorry, I don't understand; could you repeat that using
a simpler sentence?
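
For an adventure game, the "cover most cases, punt on the rest" approach might look like the Python sketch below. Everything in it is hypothetical: the vocabulary (VERBS, NOUNS, NOISE), the verb-plus-optional-noun pattern, and the function names are assumptions made for illustration, not taken from Winograd or from this thread.

```python
# Hypothetical vocabulary for a toy adventure game.
VERBS = {"take", "drop", "open", "look"}
NOUNS = {"lamp", "door", "key"}
NOISE = {"the", "a", "an", "at", "please"}  # words we silently ignore

FALLBACK = ("Eh? I'm sorry, I don't understand; could you repeat that "
            "using a simpler sentence?")

def parse_command(text):
    """Parse '<verb> [<noun>]' after dropping noise words.

    Returns (verb, noun), where noun may be None, or None when the
    input does not fit the pattern.
    """
    words = [w for w in text.lower().split() if w not in NOISE]
    if not words or words[0] not in VERBS:
        return None
    if len(words) == 1:
        return (words[0], None)
    if len(words) == 2 and words[1] in NOUNS:
        return (words[0], words[1])
    return None

def respond(text):
    """Acknowledge a parsed command, or fall back to the stock reply."""
    parsed = parse_command(text)
    if parsed is None:
        return FALLBACK
    verb, noun = parsed
    return f"OK: {verb} {noun}" if noun else f"OK: {verb}"
```

Anything outside the pattern, such as "Time flies like an arrow", gets the fallback message rather than a wrong guess.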

Rick Wojcik

Oct 21, 1988, 3:20:12 PM
In article <25...@uhccux.uhcc.hawaii.edu> l...@uhccux.uhcc.hawaii.edu (Greg Lee) writes:
>Context free phrase structure grammar lives! It's the basis of the
>best current theory of syntax, GPSG -- Generalized Phrase Structure
>Grammar.
> Greg, l...@uhccux.uhcc.hawaii.edu

Greg, I would be interested in knowing the criteria by which you judge one
'current theory' of syntax to be better than the others. Why is GPSG
better than HPSG, in your opinion? Than LFG? (Don't bother with GB. I
don't want to stir up trouble. :-)
--
Rick Wojcik csnet: rwo...@boeing.com
uucp: uw-beaver!ssc-vax!bcsaic!rwojcik

Jay Kim

Oct 22, 1988, 8:12:14 AM
> <<<<<<<<<<<<***<<<<<<<<<<<<***<<<<<<***>>>>>>***>>>>>>>>>>>>***>>>>>>>>>>>>
Clay Bond wrote:

> a CFL is not going to describe English.

Could you give us convincing evidence for this?
If you are going to bring up the 50's argument based on long-distance
dependencies, I would recommend that you first read Gerald Gazdar (1982) Phrase
structure grammar. In Pauline Jacobson and Geoffrey K. Pullum (eds),
The Nature of Syntactic Representation. Dordrecht: D. Reidel, 131-186.

Greg Lee

Oct 23, 1988, 6:26:13 AM
From article <83...@bcsaic.UUCP>, by rwo...@bcsaic.UUCP (Rick Wojcik):

" Greg, I would be interested in knowing the criteria by which you judge one
" 'current theory' of syntax to be better than the others. Why is GPSG
" better than HPSG, in your opinion? Than LFG? (Don't bother with GB. I
" don't want to stir up trouble. :-)

Actually, it's only context free phrase structure grammar I'm prepared
to defend, not GPSG specifically. The nice thing about GPSG is that
a GPSG description abbreviates a finite number of CF phrase structure
rules, and so describes a context free language. If and to the extent
the other theories you mentioned allow a similar interpretation, I
love them, too. But I don't know whether they do.

I should admit that I find much of the current literature in syntax
difficult to understand, since though it purports to be about syntactic
theory, it seems really only to concern conciseness or convenience of
description. This includes GPSG, the book, by Gazdar, Klein, Pullum,
and Sag.

To what I said in reply to Walter Rolandi, I'd like to add something
about the local nature of lexical subcategorization, again, following
Gazdar. Subcategorization of items with respect to sister constituents
is straightforward in a context free phrase structure grammar, and
this is the only, or at least the predominant, kind of subcategorization
found in natural language. However, I'm not sure it's possible to
make this out as a prediction of CFPSG without an appeal to simplicity,
since one can also describe certain non-local subcategorizations.

Greg, l...@uhccux.uhcc.hawaii.edu

do...@hcx2.ssd.harris.com

Oct 24, 1988, 9:50:00 AM

>/* Written 8:12 am Oct 22, 1988 by jk...@uhccux.uhcc.hawaii.edu */
>/* End of text */

Context-free languages have enough trouble adequately describing
programming languages. Sure, they can do a half-decent job on the written
syntax as it appears in the file. But to use syntactical productions to
recognize things such as various data types in expressions, or even worse,
checking that the number of parameters agrees between a caller and a callee
is either too exhaustive to be useful or just simply beyond a CFL. Hey, if
a context-free grammer can't recognize the regular expression

        a^x b^y c^z b^y a^x
        (note: this requires a pushdown machine with multiple stacks,
        more power than an automata equivalent to a CFL can be)

how the hell is it going to handle English, or Spanish, or whatever?
Remember, we must check proper pluralization, subject-verb agreement, all
that good stuff. For programming languages, the CFL describes the written
syntax and the semantic actions fill in the context-sensitive features
we need. My wild guess is that our minds use a context-sensitive grammar
with hundreds of thousands of semantic checks to fill in where the CSG
is inadequate for our needs.


Doug Scofield do...@ssd.harris.com
Harris Computer Systems {uunet,mit-eddie,novavax}!hcx1!dougs
Ft. Lauderdale, FL voice: (305) 973 5340

David Keppel

Oct 25, 1988, 12:39:30 PM
> somebody writes;
>>[ english grammar? ]

In article <44600003@hcx2> do...@hcx2.SSD.HARRIS.COM writes:
> [...] But to use syntactical productions to recognize things such as


> various data types in expressions, or even worse, checking that the
> number of parameters agrees between a caller and a callee is either
> too exhaustive to be useful or just simply beyond a CFL.

Attribute grammars are a current research topic. It is possible
(although "too exhaustive") to write an attribute grammar that
recognizes (semantically) Ada. It runs to some thousand pages (whew!).

Here's another "goodie": somebody fed the statement "Time flies like
an arrow" into a computer and the computer said:

* This is an analogy; time is a thing that moves in a way (flying)
that is similar to the way that an arrow moves.
* Definition: "time flies" are some species that have characteristics
much like those of an arrow.
* Command: [go get a stopwatch and] time flies the same way that you
would time an arrow.
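
The three readings can be reproduced with a very small ambiguous grammar. The Python sketch below counts parse trees by memoized recursion; the grammar, the lexicon, and all names in it are assumptions chosen to give exactly these readings, not a reconstruction of whatever program produced the output above.

```python
from functools import lru_cache

# Hypothetical grammar: S is a statement (NP VP) or a command (bare VP).
GRAMMAR = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["N"], ["N", "N"], ["Det", "N"]],
    "VP": [["V"], ["V", "NP"], ["V", "PP"], ["V", "NP", "PP"]],
    "PP": [["P", "NP"]],
}

# Hypothetical lexicon: each word may belong to several categories.
LEXICON = {
    "time": {"N", "V"},
    "flies": {"N", "V"},
    "like": {"P", "V"},
    "an": {"Det"},
    "arrow": {"N"},
}

def count_parses(words, start="S"):
    """Count distinct parse trees of words rooted at start."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Number of ways symbol can derive words[i:j].
        total = 0
        if j - i == 1 and symbol in LEXICON.get(words[i], ()):
            total += 1
        for rhs in GRAMMAR.get(symbol, ()):
            total += count_seq(tuple(rhs), i, j)
        return total

    @lru_cache(maxsize=None)
    def count_seq(rhs, i, j):
        # Number of ways the symbol sequence rhs can derive words[i:j];
        # every symbol must cover at least one word.
        if not rhs:
            return 1 if i == j else 0
        first, rest = rhs[0], rhs[1:]
        return sum(count(first, i, k) * count_seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return count(start, 0, len(words))
```

With this grammar, count_parses("time flies like an arrow".split()) comes to 3: "time" as subject, "time flies" as a compound noun, and "time" as an imperative verb.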

If you think that's fun, the Lojban people enumerate something like
20 different ways to understand the phrase "pretty little girl's
school". Lojban is a synthetic language related to Loglan that is
designed to be unambiguous and machine-parseable; there *are* parsers
for Lojban, so quick, everybody run out and learn Lojban so we can
have "synthetic-language query systems" :-)

;-D on ( Eh? I don't grok, Mike ) Pardo
--
pa...@cs.washington.edu
{rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo

Mark Buda

Oct 25, 1988, 4:09:15 PM
In article <960...@hpclskh.HP.COM> s...@hpclskh.HP.COM writes:
>If you get one...post it. Should be an AMAZING grammar to see, if not good for
>a few laughs.

<utterance> ::= <word>*

Everything else is semantics, of course.
--
Mark Buda / Smart UUCP: her...@shockeye.uucp / Phone(work):(717)299-5189
Dumb UUCP: ...rutgers!bpa!vu-vlsi!devon!shockeye!hermit
Entropy will get you in the end.
"A little suction does wonders." - Gary Collins

wsm...@m.cs.uiuc.edu

Oct 25, 1988, 9:31:00 PM

>
>       a^x b^y c^z b^y a^x
>       (note: this requires a pushdown machine with multiple stacks,
>       more power than an automata equivalent to a CFL can be)
>

Don't you mean:
        a^x b^y c^z a^x b^y

instead? (Technically, what you give is not a regular expression, either.)

The language you describe is generated by this context free grammar:

R -> a R a | S ;
S -> b S b | T ;
T -> T c | ;
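
A recursive recognizer that follows these three productions directly is short. This is a Python sketch under the assumption that the grammar is read as posted, i.e. for the nested pattern a^x b^y c^z b^y a^x; the function names are made up to mirror the nonterminals.

```python
def match_T(s):
    # T -> T c | (empty): zero or more c's
    return all(ch == "c" for ch in s)

def match_S(s):
    # S -> b S b | T: peel one b off each end, else fall through to T.
    # The choice is deterministic because T never starts with b.
    if len(s) >= 2 and s[0] == "b" and s[-1] == "b":
        return match_S(s[1:-1])
    return match_T(s)

def match_R(s):
    # R -> a R a | S: peel one a off each end, else fall through to S.
    if len(s) >= 2 and s[0] == "a" and s[-1] == "a":
        return match_R(s[1:-1])
    return match_S(s)
```

The nesting is what makes a single stack sufficient: match_R("aabcccbaa") succeeds, while the crossed string "abcab" is rejected.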

Bill Smith
wsm...@cs.uiuc.edu
uiucdcs!wsmith

Gordon V. Cormack

Oct 25, 1988, 10:53:17 PM
In article <44600003@hcx2>, do...@hcx2.SSD.HARRIS.COM writes:
> is either too exhaustive to be useful or just simply beyond a CFL. Hey, if
> a context-free grammer can't recognize the regular expression
>
>       a^x b^y c^z b^y a^x
>       (note: this requires a pushdown machine with multiple stacks,
>       more power than an automata equivalent to a CFL can be)
>


1. the expression above is not regular
2. the expression above is easily expressed as a CFG:

A -> B
A -> a A a
B -> C
B -> b B b
C ->
C -> C c

3. two stacks suffice for most recognition problems
4. grammer [sic] is misspelled
5. automata is plural
6. why is everybody picking on this guy so much? All he asked
for was a CFG for English. If I asked for a CFG for Pascal,
would you hassle me about all the Pascal constructs that aren't
context-free?
7. The UNIX command "style" contains a yacc grammar for English.
A paper is included in the supplementary UNIX documentation
describing "style", but the source is not supplied with the
BSD distribution.
--
Gordon V. Cormack CS Dept, University of Waterloo, Canada N2L 3G1
gvcormack@waterloo { .CSNET or .CDN or .EDU }
gvco...@uwaterloo.CA
gvcormack@water { UUCP or BITNET }

Greg Lee

Oct 27, 1988, 2:44:38 PM
From article <44600003@hcx2>, by do...@hcx2.SSD.HARRIS.COM:
" ...

" is either too exhaustive to be useful or just simply beyond a CFL. Hey, if
" a context-free grammer can't recognize the regular expression
"
"       a^x b^y c^z b^y a^x
"       (note: this requires a pushdown machine with multiple stacks,
"       more power than an automata equivalent to a CFL can be)
"
" how the hell is it going to handle English, or Spanish, or whatever?

Supposing a^x b^y c^z a^x b^y was meant, then the answer is it's going to the
hell handle them if they don't the hell have such constructions.
Whether one does find such constructions in natural language is
debatable -- there is discussion in the linguistic literature going
back about a decade. At least, it seems clear that they are not
common.

" Remember, we must check proper pluralization, subject-verb agreement, all
" that good stuff.

Since natural languages have grammatical agreement with respect to only
a finite (and rather small) number of categories, and since the
strings that separate agreeing items can be characterized by a finite
number of strings of category symbols, agreement does not pose a problem
in principle.
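
One way to picture this: a rule with agreement features is shorthand for a finite set of plain context-free rules, one per combination of feature values. The Python sketch below assumes a made-up feature inventory (FEATURES) and a single schema (SCHEMA); it shows only the expansion idea and is not GPSG's actual formalism.

```python
from itertools import product

# Hypothetical feature inventory: a small, finite set of agreement categories.
FEATURES = {"num": ["sg", "pl"], "per": ["1", "2", "3"]}

# One rule schema: the subject NP and the VP must carry the same features.
SCHEMA = ("S", ["NP", "VP"])

def expand_schema(schema, features):
    """Turn one feature-annotated schema into plain context-free rules."""
    lhs, rhs = schema
    names = sorted(features)  # fixed order so the tags are reproducible
    rules = []
    for values in product(*(features[n] for n in names)):
        tag = "[" + ",".join(values) + "]"
        # Every right-hand symbol gets the same tag, which enforces agreement.
        rules.append((lhs, [sym + tag for sym in rhs]))
    return rules

RULES = expand_schema(SCHEMA, FEATURES)
```

Two numbers times three persons gives six ordinary rules such as S -> NP[sg,1] VP[sg,1], so the grammar grows but stays context free.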

" For programming languages, the CFL describes the written
" syntax and the semantic actions fill in the context-sensitive features
" we need.

And so it may be for natural languages.

" My wild guess is that our minds use a context-sensitive grammar
" with hundreds of thousands of semantic checks to fill in where the CSG
" is inadequate for our needs.

The proposal that natural languages are context free is also a guess,
at this point, but I think it's fair to say it's an educated guess.
There is some evidence against the proposal, but in my opinion this
evidence is rather marginal. Other linguists have other opinions.

Greg, l...@uhccux.uhcc.hawaii.edu

do...@hcx2.ssd.harris.com

Oct 28, 1988, 11:46:00 AM

>
>       a^x b^y c^z b^y a^x
>       (note: this requires a pushdown machine with multiple stacks,
>       more power than an
>       automata equivalent to a CFL can be)
>       ^^^^^^^^ oops. should be automaton.
a most relevant mistake.

Yeah, I know I made a few typos with this expression.
It should have been


        a^x b^y c^z a^x b^y,    x,y,z >= 0

More than one stack in an automaton means that it is not equivalent
to a CFL. It doesn't matter if there are only two. Two is too many.
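
For anyone who wants to try strings against the corrected pattern a^x b^y c^z a^x b^y, here is a brute-force membership test in Python. It is only a checker that tries every x and y, not an automaton, and the name is_cross_serial is an invented label.

```python
def is_cross_serial(s):
    """True if s == a^x b^y c^z a^x b^y for some x, y, z >= 0."""
    n = len(s)
    for x in range(n // 2 + 1):
        for y in range((n - 2 * x) // 2 + 1):
            z = n - 2 * x - 2 * y  # whatever length is left goes to the c's
            if s == "a" * x + "b" * y + "c" * z + "a" * x + "b" * y:
                return True
    return False
```

The counts of a's and b's have to be matched in crossed order, first a with a and then b with b, which is the part a single stack cannot do.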


Doug Scofield do...@ssd.harris.com
Harris Computer Systems {uunet,mit-eddie,novavax}!hcx1!dougs
Ft. Lauderdale, FL voice: (305) 973 5340

[These are my mistakes _only_]
