[lojban] Re: "zo da bu" should not be valid (was Re: Re: My parser, SI, SA, and ZOI)


Robin Lee Powell

May 13, 2004, 2:25:37 PM
to lojba...@lojban.org
On Wed, May 12, 2004 at 11:30:29PM -0700, Robin Lee Powell wrote:
> zo takes a single Lojban word. bu takes a single Lojban word. si
> takes a single word, or an arbitrary string of non-Lojban text.
>
> This is how they are defined.

zei as well; "le lojbo zo irk zei seljmaji vreji" is invalid, but
"le lojbo irk zei seljmaji vreji" is fine.

-Robin

--
http://www.digitalkingdom.org/~rlpowell/ *** I'm a *male* Robin.
"Many philosophical problems are caused by such things as the simple
inability to shut up." -- David Stove, liberally paraphrased.
http://www.lojban.org/ *** loi pimlu na srana .i ti rokci morsi

rlpo...@digitalkingdom.org

May 10, 2004, 4:20:30 PM
to jco...@reutershealth.com
On Mon, May 10, 2004 at 04:04:03PM -0400, jco...@reutershealth.com wrote:
> However, I would not want fu zei bar to become a single token,

Just for the record, you know that grammar.300 currently does exactly
that, right?

> It wouldn't break my heart if "zei zei ..." runs were always illegal,
> though: that is, if zei could not act upon zei.

Nor mine.

Robin Lee Powell

May 6, 2004, 8:51:12 PM
to lojba...@lojban.org
On Thu, May 06, 2004 at 06:20:33PM -0400, Nora LeChevalier wrote:
[long explanation of what my parser does snipped]
> I think I've seen someone use "si" as the delimiter. This majorly
> complicates things, no?

Not at all. My parser has no problems with this. I just tested this on

"zoi si I love zoi! si"

and

"zoi si I love zoi! si si Really! si"

Both of which do exactly what I said they would do.

> Also, from a making-sense point of view, I prefer "si" after the
> closing delimiter to delete the entire zoi phrase (back to and
> including the zoi). To say that "The first SI after the close of a
> ZOI clause erases the closing delimiter..." would make one think the
> next thing said is part of the inside of the ZOI; so you would never
> be able to get back to the ZOI.

Yes, I understand your point completely. I'd love to hear other people
chime in on this point. The problem is that SI is only supposed to
erase one previous word, so we're moving in to the realm of "not
justifiable under current standards".

> I had this trouble with the current version, too, by the way. There
> is a precedent for this. When applying a UI, if it's after something
> like a le broda ku, it applies to the whole le ... ku construct.

<nod>

Robin Lee Powell

May 7, 2004, 5:16:19 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 04:58:57PM -0400, jco...@reutershealth.com
wrote:
> Robin Lee Powell scripsit:
>
> > b. If the Lojban word "zo" (selma'o ZO) is identified, treat the
> > following Lojban word as a token labelled 'any_word_698', instead of
> > lexing it by its normal grammatical function.
> >
> > So "zoi da de da" is turned into four tokens, "zoi da anything_699
> > da" and "zo da" is turned into the single token "any_word_698".
>
> I read this instead as saying that "zo da" becomes "zo any_word_698",
> and that's what the official parser does. I agree with your reading
> of zoi. While we're at it, "lo'u ... le'u" becomes "lo'u any_words_697
> le'u".

OK. So given all that, what are *your* expectations for how SI *should*
act on these constructs (regardless of how grammar.300 defines them)?
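
For readers following along, the token reading quoted above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name prelex and the labels "delimiter", "word", "LOhU" and "LEhU" are made up for the example, words are split on whitespace, and pauses are ignored. Only any_word_698, anything_699 and any_words_697 come from the thread.

```python
def prelex(text):
    """Label words roughly as the quoted preprocessor notes describe (sketch)."""
    words = text.split()
    tokens = []
    i = 0
    while i < len(words):
        word = words[i]
        i += 1
        if word == "zo" and i < len(words):
            # zo stays a token of its own; only the next word is relabelled.
            tokens.append(("ZO", word))
            tokens.append(("any_word_698", words[i]))
            i += 1
        elif word == "zoi" and i < len(words):
            delim = words[i]
            i += 1
            body = []
            while i < len(words) and words[i] != delim:
                body.append(words[i])
                i += 1
            # The whole body is one token; the delimiters are not swallowed.
            tokens.append(("ZOI", word))
            tokens.append(("delimiter", delim))
            tokens.append(("anything_699", " ".join(body)))
            if i < len(words):
                tokens.append(("delimiter", delim))
                i += 1
        elif word == "lo'u":
            body = []
            while i < len(words) and words[i] != "le'u":
                body.append(words[i])
                i += 1
            tokens.append(("LOhU", word))
            tokens.append(("any_words_697", " ".join(body)))
            if i < len(words):
                tokens.append(("LEhU", words[i]))
                i += 1
        else:
            tokens.append(("word", word))
    return tokens


for label, text in prelex("zoi da de da"):
    print(label, repr(text))
```

With this reading "zo da" yields two tokens and "zoi da de da" yields four, which is the count that matters for any token-by-token SI.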

Robin Lee Powell

May 13, 2004, 2:49:42 AM
to lojba...@lojban.org
On Wed, May 12, 2004 at 11:30:29PM -0700, Robin Lee Powell wrote:
> I've spent a good portion of my free time since you posted this
> thinking about this issue. It turns out that the answer is "Yes".
>
> There's a simple reason for this: it's the only solution that fits the
> current cmavo definitions.

>
> zo takes a single Lojban word. bu takes a single Lojban word. si
> takes a single word, or an arbitrary string of non-Lojban text.
>
> This is how they are defined.

There's something else that's worth mentioning: I've made the general
meta-decision that a string of si keeps eating text; it doesn't get
re-evaluated until the string of si is done. This is enshrined in canon
in the form of the 4-si-for-zoi rule.

That was probably incomprehensible. Here's an example:

zoi gy broda gy si si gy -- what happens to the second si?

There are two possible answers: "It gets re-evaluated", so we are left
with:

zoi gy broda si gy

because si has no meaning inside of a zoi clause, and by erasing the
zoi closer we opened up a zoi clause, which then eats the si.

The other answer is: "It keeps eating text", so we are left with:

zoi gy gy

I've taken the latter position.

The reason is simple: the former position is untenable. Completely. It
makes it totally impossible to use si to erase zo, zoi, and possibly
lo'u. Try it, you'll see. "zo da si si si si ...", for example: if
interpreted in the former fashion, every odd si erases the word before
it (which except for the first means it's erasing a si that got caught
by zo) and every even si gets caught by zo.
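
To make the "keeps eating" position concrete, here is a minimal Python sketch. It is not the PEG grammar itself, just an illustration under assumptions: the name resolve_si and the unit kinds are invented for the example, words are split on whitespace, and pauses and morphology are ignored. A closed zoi clause is modelled as four units (zoi, delimiter, body, delimiter), and a run of si pops one unit per si before anything is looked at again.

```python
def resolve_si(text):
    """Apply si erasure with the 'a run of si keeps eating' policy (sketch)."""
    words = text.split()
    stack = []  # (kind, word) pairs; kind records how the unit was read
    i = 0
    while i < len(words):
        top = stack[-1][0] if stack else None
        if top == "zo":
            # Whatever follows zo is quoted, even if it is "si" or "zoi".
            stack.append(("quoted", words[i]))
            i += 1
        elif top == "zoi":
            # The word right after zoi is the opening delimiter.
            stack.append(("open", words[i]))
            i += 1
        elif top in ("open", "body"):
            # Unclosed zoi clause: collect raw text up to the delimiter.
            delim = stack[-1][1] if top == "open" else stack[-2][1]
            j = i
            while j < len(words) and words[j] != delim:
                j += 1
            raw = words[i:j]
            if raw:
                if top == "body":
                    # Reopened clause: extend the existing body.
                    raw.insert(0, stack.pop()[1])
                stack.append(("body", " ".join(raw)))
            if j < len(words):
                stack.append(("close", delim))
            i = j + 1
        elif words[i] == "si":
            # The whole run of si erases units; nothing is re-read until
            # the run is over.
            while i < len(words) and words[i] == "si":
                if stack:
                    stack.pop()
                i += 1
        else:
            kind = words[i] if words[i] in ("zo", "zoi") else "word"
            stack.append((kind, words[i]))
            i += 1
    return " ".join(word for _, word in stack)


print(resolve_si("zoi gy broda gy si si gy"))
print(resolve_si("mi zo da si si klama"))
```

Under these assumptions the first call gives "zoi gy gy" and the second gives "mi klama". A re-evaluating version would instead hand the second si back to the reopened zoi or zo, which is the loop described above.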

I'm mentioning all this partly on general principles and partly because
I just had to add a special case for "ZO SI SI" to the SI handling
rules in my PEG grammar. Given how SI and ZO are defined, the grammar's
default behaviour is (you guessed it) to re-evaluate in place, leading to
the infinite repetition shown above.

This amused me.[1]

There were already such rules in place for the five cases of ZOI and
SI; the lack of such a rule for ZO goes back to my misunderstanding ZO
and having it and its argument turn into a single word from the POV of
the rest of the grammar.

-Robin

[1]: The fact that my grammar trying to do silly things with infinite,
non-escapable repetition was amusing enough to cause me to write
this giant e-mail is almost without question a sign of deep mental
pathology, or something.

Jorge Llambías

May 10, 2004, 3:37:18 PM
to loj...@yahoogroups.com

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> My parser handles it exactly the same way in "mi broda lo zei zei da"
> and "mi broda zei zei da".

How come those are not "mi broda lo-zei-zei da" and
"mi broda-zei-zei da"?

> "zei bu" you seem to be correct on, and that follows from their relative
> priorities. I've just told BU to not work on ZEI, ever, to avoid that
> special case.

When you have "zei zei" you have to decide which one acts as glue
and which one as lujvo component. Why would you take the second one
as glue?

> > When zo fights with these words directly, it always wins:
> > {zo si}, {zo bu}, {zo zei}, so I don't see any reason for it not to
> > win when it fights with them over a third word. If {zo da} can be a
> > single word for {bu} and for {zei} to grab,
>
> Is that exactly the question that we're discussing? As far as I can
> tell, zo da is *never* considered a single word.

In the current grammar, that's correct.

> The official parser doesn't accept "zo da bu", nor "zo da zei broda", so
> I'm not sure where you get the idea that "{zo da} can be a single word
> for {bu} and for {zei} to grab" ?

If "da zei de" can be a single word for bu, why can't
"zo da" be a single word for bu?

Do you prefer to leave {zo a bu} as broken instead of giving it
one of the two obvious possible meanings?

mu'o mi'e xorxes






Jorge Llambías

May 7, 2004, 7:13:31 PM
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > Why not restrict it to {ybu} only, then, since it is also silly with
> > the other words. You could actually be hesitating, and then you have
> > to start erasing y's, which is very silly.
>
> What about 'zo y'?

{zo y y y nalselmorjyvalsi} should be the same as {zo nalselmorjyvalsi},
you shouldn't be forced into {zo y y y si si si si zo nalselmorjyvalsi},
and that's assuming you actually count your y's when you hesitate...

To actually quote the noise you'd need {zoi ly y y y ly}.

I would even advocate abolishing {ybu}, but that's probably out of
the question.
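
A rough Python sketch of the "ignore y except in ybu" idea discussed here may help. It rests on several assumptions: the function names are invented, words are split on whitespace, "y" and ".y." are both treated as the hesitation word, and a zoi clause is copied verbatim from its delimiter through the closing delimiter so that quoted noise survives. It illustrates the proposal only and is not how any existing parser treats y.

```python
def is_hesitation(word):
    # Treat "y" and ".y." as the same hesitation word.
    return word.strip(".") == "y"


def drop_hesitations(text):
    """Remove hesitation y everywhere except before bu or inside zoi (sketch)."""
    words = text.split()
    kept = []
    i = 0
    while i < len(words):
        word = words[i]
        if is_hesitation(word):
            # Keep y only when it is the operand of bu (the letteral ybu).
            if i + 1 < len(words) and words[i + 1] == "bu":
                kept.append(word)
            i += 1
        elif word == "zoi":
            kept.append(word)
            i += 1
            # Hesitation while thinking of a delimiter is skipped.
            while i < len(words) and is_hesitation(words[i]):
                i += 1
            if i == len(words):
                break
            delim = words[i]
            kept.append(delim)
            i += 1
            # Copy the protected text verbatim through the closing delimiter.
            while i < len(words):
                kept.append(words[i])
                i += 1
                if kept[-1] == delim:
                    break
        else:
            kept.append(word)
            i += 1
    return " ".join(kept)


print(drop_hesitations("zo y y y nalselmorjyvalsi"))
print(drop_hesitations("zoi ly y y y ly"))
```

In this sketch the first call reduces to "zo nalselmorjyvalsi", while the zoi quotation keeps its three y's.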

Robin Lee Powell

May 7, 2004, 4:48:51 PM
to loj...@yahoogroups.com
On Fri, May 07, 2004 at 01:40:23PM -0700, Robin Lee Powell wrote:
> So "zoi da de da" is turned into four tokens, "zoi da anything_699 da"
> and "zo da" is turned into the single token "any_word_698".

As xorxes pointed out, that second one is simply not correct; "zo da"
is turned into "zo any_word_698", which means zo handling in my parser
is, in one respect or another, wrong.

-Robin

Bob LeChevalier

May 10, 2004, 4:22:54 PM
to loj...@yahoogroups.com
At 10:21 AM 5/10/04 -0700, you wrote:
> > In answer to the question in this thread, I believe that the text
> > comment in the body of the grammar after the rule defining LohU 436
> > addresses the intent for interactions between si and zo and zoi.
>
>Thanks for finding that, Bob. The next comment is general bitchiness
>and not specifically directed at you.
>
><bitchy>
>
><sarcasm>
>
>Oh, my gosh, why didn't *I* think to look there?
>
></sarcasm>
>
>Apparently, pre-parser instructions are scattered about the landside in
>that file, instead of being confined to the section labelled as such.
>Oh well.
>
></bitchy>

Actually, it is perfectly fine to bitch at me, since I put it there. The
stuff embedded in the grammar dates from before I attempted, at a MUCH later
date, to add preparser directions for Cowan and others who might be writing
parsers, and is intended to be explanatory of the rules rather than
parser-writing directions. It was thus intended to be used by the
grammar/textbook writer (at the time I thought it would be me, so it was
mostly a note to myself, which it turns out I haven't forgotten after 15 years).

I'm sorry, if it helps. I didn't foresee in 1988-9 how the YACC grammar
might be used in 2004, and at the time it was being used by a dozen people
who regularly discussed such things on the phone. There was then no EBNF,
nor any instructions in the beginning, so anyone using the grammar at all
was pretty familiar with the whole thing.

>Here's the text in question:
>
> It may be seen that any of the ZO/ZOI/LOhU trio of quotation markers
> may contain the powerful metalinguistic erasers. Since these
> quotations are not parsed internally, these operators are ignored
> within the quote. To erase a ZO, then, two SI's are needed after
> giving a quoted word of any type. ZOI takes four SI's, with the
> ENTIRE BODY OF THE QUOTE treated as a single 'word' since it is one
> selma'o. Thus one for the quote body, two for the single word
> delimiters, and one for the ZOI. In LOhU, the entire body is treated
> as a single word, so three SI's can erase it.
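
The arithmetic in that comment can be spelled out with a tiny Python sketch. The unit lists below just restate the quoted text (quote body counted as one 'word'); the names QUOTE_UNITS, si_needed and erase are made up for the illustration and do not appear in the grammar.

```python
# Units each quotation occupies once its body is treated as a single 'word'.
QUOTE_UNITS = {
    "ZO": ["zo", "<quoted word>"],
    "ZOI": ["zoi", "<delimiter>", "<quote body>", "<delimiter>"],
    "LOhU": ["lo'u", "<quote body>", "le'u"],
}


def si_needed(selmaho):
    """Number of SI that erase the whole quotation: one per unit."""
    return len(QUOTE_UNITS[selmaho])


def erase(selmaho, n_si):
    """Units left after n_si erasures, removing one unit per SI from the right."""
    units = QUOTE_UNITS[selmaho]
    return units[: max(0, len(units) - n_si)]


for name in QUOTE_UNITS:
    print(name, si_needed(name))
```

This gives two for ZO, four for ZOI and three for LOhU, matching the comment.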

lojbab


--
lojbab loj...@lojban.org
Bob LeChevalier, Founder, The Logical Language Group
(Opinions are my own; I do not speak for the organization.)
Artificial language Loglan/Lojban: http://www.lojban.org



Jorge Llambías

May 10, 2004, 5:12:04 PM
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > When you have "zei zei" you have to decide which one acts as glue and
> > which one as lujvo component. Why would you take the second one as
> > glue?
>
> See above. This may not be the 'right' thing, of course.

Actually, your approach makes more sense, because zei is more
likely to appear as a modifier than as the main component of
a lujvo. That means {zei zei lujvo} would work for "zei-lujvo".
The current grammar (and the official parser) has it the other
way around though.

Robin Lee Powell

May 7, 2004, 4:47:25 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 01:38:19PM -0700, Jorge Llambías wrote:
> So both words turn what follows into special tokens, but remain
> themselves as separate tokens.

Oh, I'm sorry, you are correct. I thought both 'zo' and the following
token were eaten, but it's just the following token.

> That's not good. It causes a lot of problems.

Yeah, no kidding. I don't do token replacement of any kind, of course,
as I have no preprocessor.

> All of the following should give error if grammar .300 is followed to
> the letter:
>
> {zo da si de}
>
> si will erase the previous token, 'any_word_698', and then
> ZO followed by KOhA should give an error.

Correct. We can't test this because the official parser doesn't have si
handling.

> {zo da zei de}
>
> zei will join 'any_word_698' and KOhA and turn everything
> into BRIVLA, but then ZO followed by BRIVLA should cause
> an error.

No. This fails, but not for that reason. ZO processing happens long
before ZEI, so ZEI tries to bind any_word_698 and KOhA, and any_word_698 is
not allowed as an argument to ZEI in grammar.300.

It does, in fact, fail in the official parser.

> {zo da bu}
>
> bu will turn 'any_word_698' into BY, but then ZO followed
> by BY should cause an error.

Again, ZO happens first, but BU can't handle any_word_698, so it fails.

It does, in fact, fail in the official parser.

> Similar things will happen with zoi:
>
> {zoi gy sth gy si}
>
> will give an error because si will swallow the 'any_word_698'
> token

No; anything_699 (which is what you meant) does *NOT* swallow the
delimiter, so si will swallow gy, and you end up with
"zoi gy anything_699", which won't parse.

Again, not testable.

> I think that the Right Thing is to treat {zo <word>} and {zoi <word>
> <anything> <word>} as single tokens of selmaho KOhA.

Why KOhA?

> In any case, I don't see any justification for treating {zo} in one
> way and {zoi} in another.

Neither do I, as zo doesn't actually work the way I thought it did.

<sigh>

Jorge Llambías

May 10, 2004, 1:47:30 PM
to loj...@yahoogroups.com

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> Unfortunately, this doesn't help with things like "zo broda zei broda",
> but I'm inclined to say that that's illegal because ZO acts first,
> leaving "<zo-quote> zei broda", and a quote is not a single word.

It is clear that {zo broda zei broda} is illegal with the current
grammar, though not exactly for the reason you say. {zei} grabs the
previous _token_ and binds it to the next to deliver a brivla, so
you end up with ZO BRIVLA which is illegal. Only ZO any_word_698
is valid. {zo broda si brode} fails for the same reason.
{si} erases any_word_698 leaving ZO BRIVLA, which is illegal.
I think that if you fix one you should fix the other as well.

{da zei de bu} is legal: first {da zei de} delivers BRIVLA, and
then {BRIVLA BU} delivers BY.

> Opinions on that issue welcome, but it seems pretty clear that ZEI only
> acts on single words, and quotes are not single words.

ZEI acts on single tokens, doesn't it?

Jorge Llambías

May 7, 2004, 5:12:43 PM
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > I think that the Right Thing is to treat {zo <word>} and {zoi <word>
> > <anything> <word>} as single tokens of selmaho KOhA.
>
> Why KOhA?

What else? They behave as KOhAs in all other respects. This would
only affect their behaviour vis-a-vis {si}, {bu} and {zei}.
(Also {ba'e}, but the effect here is almost irrelevant.)

{zo zo}, {zo zoi}, {zoi zoi ... zoi} and {zoi zo ... zo} would
remain valid because they act first.

Robin Lee Powell

May 7, 2004, 4:40:23 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 12:31:28PM -0700, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > > What does {da zo si si} do?
> >
> > It *should* result in just 'da', because zo is defined as turning
> > itself and the next argument into a single word. zoi is *not* so
> > defined.
>
> Is there a justification for that difference?

Huh.

I'm sorry, I thought that that difference was explicitly stated
somewhere, but I can't find it. The Red Book doesn't seem to say
whether or not "zo da" is treated as a single word at all.

So unless I'm missing something, all we have to go on is grammar.300,
which says:

a. If the Lojban word "zoi" (selma'o ZOI) is identified, take the
following Lojban word (which should be end delimited with a pause for
separation from the following non-Lojban text) as an opening delimiter.
Treat all text following that delimiter, until that delimiter recurs
*after a pause*, as grammatically a single token (labelled
'anything_699' in this grammar). There is no need for processing within
this text except as necessary to find the closing delimiter.

b. If the Lojban word "zo" (selma'o ZO) is identified, treat the
following Lojban word as a token labelled 'any_word_698', instead of
lexing it by its normal grammatical function.

So "zoi da de da" is turned into four tokens, "zoi da anything_699 da"
and "zo da" is turned into the single token "any_word_698".

It is this behaviour that I am trying to emulate, without being a slave
to YACC restrictions (which, for example, make it so that "zoi da weeble
da si si si si" works, but no lesser number of "si" after the zoi has
any effect).

Robin Lee Powell

May 10, 2004, 1:55:16 PM
to lojba...@lojban.org
On Mon, May 10, 2004 at 10:47:30AM -0700, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > Unfortunately, this doesn't help with things like "zo broda zei
> > broda", but I'm inclined to say that that's illegal because ZO acts
> > first, leaving "<zo-quote> zei broda", and a quote is not a single
> > word.
>
> It is clear that {zo broda zei broda} is illegal with the current
> grammar, though not exactly for the reason you say.

Of course. Again, though, I'm not aiming for bug-for-bug compatibility
with the current grammar.

> {zei} grabs the previous _token_

In the YACC grammar, yes, but that's not the word's definition:

zei ZEI lujvo glue
joins preceding and following words into a lujvo

> and binds it to the next to deliver a
> brivla, so you end up with ZO BRIVLA which is illegal. Only ZO
> any_word_698 is valid. {zo broda si brode} fails for the same reason.
> {si} erases any_word_698 leaving ZO BRIVLA, which is illegal. I think
> that if you fix one you should fix the other as well.

OK, so what would you say that "zo broda zei broda" is, then? I
certainly don't know. If I had to pick one, I would say that it's
"(zo broda) zei broda"; i.e. "'broda' type of broda".

> {da zei de bu} is legal: first {da zei de} delivers BRIVLA, and then
> {BRIVLA BU} delivers BY.

Good point, but not terribly helpful for the ZO + ZEI case that I can
see.

> > Opinions on that issue welcome, but it seems pretty clear that ZEI
> > only acts on single words, and quotes are not single words.
>
> ZEI acts on single tokens, doesn't it?

If one is using the concept of tokens, I suppose it does, yes.

Robin Lee Powell

May 10, 2004, 3:54:36 PM
to lojba...@lojban.org
On Mon, May 10, 2004 at 12:45:33PM -0700, Robin Lee Powell wrote:

> On Mon, May 10, 2004 at 12:37:18PM -0700, Jorge Llambías wrote:
> > Do you prefer to leave {zo a bu} as broken instead of giving it one
> > of the two obvious possible meanings?
>
> Not particularly.

Let me expand on that: What I *want* is a solution that:

1. Fits the current cmavo definitions

2. Won't be monstrously confusing to the user.

It seems to me that "zo da si de" == "de" is confusing.

It furthermore seems to me that *all* solutions to "zo da bu" are
confusing. Same for "da zei da bu".

So I'm rather stuck.

Robin Lee Powell

May 7, 2004, 5:35:50 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 02:28:11PM -0700, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > While I'm at it, if ZO+word and ZOI+clause are treated as one word,
> > then "zo .y. si co valsi" is invalid, right?
>
> I was going to ask what you do with {y}.

Oh, don't ask that. :-)

Seriously, it's *very* complicated. Y is utterly ignored, except
where it's not, where special cases are used to catch it.

Basically, Y is caught anywhere that *any* Lojban word would normally be
valid, so it can be used in ZEI, BU, ZOI, and ZO. Probably some others.
As an *extra* special case, it cannot have BAhE applied to it, because
that would just be silly.

> I would propose that {y} be totally ignored, so {zoi y gy ... gy}
> should be valid. Then you couldn't quote it with {zo}, you'd have to
> say {zoi ly y ly}... But then {.y.bu} would require special
> treatment... It's really weird that you can't hesitate after certain
> words, especially after zoi, which is a place where I expect
> hesitation, while you think of an appropriate delimiter.

Yeah, it's complicated.

> In any case, responding to your question, yes {zo ca si co valsi}
> would be invalid.

Because it's equivalent to "co valsi", correct?

> It is invalid now, according to grammar .300, but for a different
> reason.

Right.

Robin Lee Powell

May 5, 2004, 8:57:31 PM
to lojba...@lojban.org
On Wed, May 05, 2004 at 05:34:09PM -0700, Robin Lee Powell wrote:
> * SA and SI now interact in a more obvious fashion. For example, "le
> broda brode brodi .y. sa le si la broda brode brodi" is equivalent
> to "la broda brode brodi". Just using "sa" would not work because
> "le" and "la" are in different selma'o.

It's worth noting that jbofihe does the same thing.

jco...@reutershealth.com

May 10, 2004, 4:04:03 PM
to rlpo...@digitalkingdom.org, loj...@yahoogroups.com
Robin Lee Powell scripsit:

> > If "da zei de" can be a single word for bu, why can't "zo da" be a
> > single word for bu?
>

> As I haven't the slightest idea why ZEI was handled in a way different
> from every other preprocessor token, I don't have a good answer for
> that.

Probably because zei was invented long after the other magic preprocessor
words, and nobody was thinking about these corner cases at the time.

However, I would not want fu zei bar to become a single token, because
then "si" would dispose of all of it, which would be awkward for
na'e zei a zei bu zei na'e zei by zei livga terbilma
("non-A, non-B hepatitis", now usually called "hepatitis C").

It wouldn't break my heart if "zei zei ..." runs were always illegal, though:
that is, if zei could not act upon zei.

--
John Cowan <jco...@reutershealth.com>
http://www.reutershealth.com
http://www.ccil.org/~cowan
The Imperials are decadent, 300 pound free-range chickens (except they
have teeth, arms instead of wings and dinosaurlike tails). --Elyse Grasso



Nora LeChevalier

May 6, 2004, 6:20:33 PM
to lojba...@lojban.org
At 05:34 PM 5/5/04 -0700, Robin wrote:
>Now that I can parse ZOI, it was necessary to figure out what the
>interactions with it and SI and SA would be. Along the way, I added
>some features to the interaction between SI and SA. This is what I've
>currently got on my parser page, and I'd very much like people's
>opinions.
>
>In particular, if there is a direct contradiction with grammar.300 that is
>*not* obviously based on YACC limitations, or anything that seems strongly
>counter-intuitive, please let me know.
>
>The main page for my parser is
>http://www.digitalkingdom.org/~rlpowell/hobbies/lojban/grammar/
>
>
> * Groups of "si" and everything up to a "sa" are both erased at the
>   beginning of a string. This may or may not be justifiable according
>   to grammar.300; no-one's really sure. This means that sentences like
>   "si si si" and "sa" are legal, as well as sentences like
>   "le broda sa .i mi cusku".
>
> * SA and SI now interact in a more obvious fashion. For example,
>   "le broda brode brodi .y. sa le si la broda brode brodi" is
>   equivalent to "la broda brode brodi". Just using "sa" would not work
>   because "le" and "la" are in different selma'o.
>
> * Interactions between ZOI, SI, and SA are much richer. The goal is to
>   achieve something more like what a user would 'expect', given the
>   basic definitions of those words. Details:
>
>   + The first SI after the close of a ZOI clause erases the closing
>     delimiter, allowing one to add to the protected text.
>     "zoi gy weeble gy si bob gy" is equivalent to
>     "zoi gy weeble bob gy".
>
>   + Two consecutive SI after the close of a ZOI erase the non-Lojban
>     text itself; while it would theoretically be possible to have
>     consecutive SI after the close of a ZOI erase individual words
>     inside the ZOI protected text, this is a bad idea because (for
>     example) breaking up a bird call into words makes very little
>     sense.
>
>     So, for example, "zoi gy da da da gy si si de gy" is equivalent
>     to "zoi gy de gy".
>
>   + The interaction of these two features leads to a somewhat
>     strange, but very minor, side effect: It is impossible to add to
>     the protected text inside a zoi clause (i.e. using a single SI
>     after the closing delimiter) any text that starts with "si"
>     (unless it then goes on to be something that looks like a Lojban
>     brivla or cmene), because it will be interpreted as two SI,
>     causing erasure of the entire protected text.
>
>   + Three consecutive SI after the close of a ZOI erase everything
>     but the ZOI itself, so that, for example,
>     "zoi gy da da da gy si si si dy weeble dy" is equivalent to
>     "zoi dy weeble dy".
>
>   + Four consecutive SI after the close of a ZOI erase the entire ZOI
>     clause, including the ZOI.
>
>   + Because of the SA and SI interaction enhancements, the fast way to
>     delete an accidental ZOI is to close the delimiter and say
>     "sa zoi si", and then continue on. For example,
>     "broda zoi gy da da da da gy sa zoi si da" is equivalent to
>     "broda da".
>
>
>-Robin

I think I've seen someone use "si" as the delimiter. This majorly
complicates things, no?

Also, from a making-sense point of view, I prefer "si" after the closing
delimiter to delete the entire zoi phrase (back to and including the
zoi). To say that "The first SI after the close of a ZOI clause erases the
closing delimiter..." would make one think the next thing said is part of
the inside of the ZOI; so you would never be able to get back to the
ZOI. I had this trouble with the current version, too, by the way. There
is a precedent for this. When applying a UI, if it's after something like
a le broda ku, it applies to the whole le ... ku construct.


--
mi'e noras no...@cox.net
Nora LeChevalier

Robin Lee Powell

May 7, 2004, 4:52:13 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 01:50:00PM -0700, Jorge Llambías wrote:
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > It is this behaviour that I am trying to emulate, without being a
> > slave to YACC restrictions (which, for example, make it so that "zoi
> > da weeble da si si si si" works, but no lesser number of "si" after
> > the zoi has any effect).
>
> It's a good thing to get rid of that effect, and your solution for
> {si} is not bad, but I think Nora's suggestion is more parsimonious
> and extendable to similar problems that occur with {bu} and {zei}.

So you think that "zoi gy weebles are the best! gy zei klama" should, in
fact, be treated as a lujvo?

Robin Lee Powell

May 7, 2004, 5:10:51 PM
to lojba...@lojban.org
On Fri, May 07, 2004 at 09:36:09AM +0200, Raphaël Poss wrote:
> Besides, we have a design flaw if "si" is supposed to step back into
> the ZOI construct "word per word" : what is a word outside Lojban ?
> How "si" is going to affect the construct if we cannot break it down
> into words in the lojban sense of the term ?

My current design, as I stated at the beginning of this thread, is to
treat the non-Lojban text as one word for "si" purposes, for that exact
reason.

Jorge Llambías

May 10, 2004, 5:20:18 PM
to lojba...@lojban.org

--- jco...@reutershealth.com wrote:
> However, I would not want fu zei bar to become a single token, because
> then "si" would dispose of all of it, which would be awkward for
> na'e zei a zei bu zei na'e zei by zei livga terbilma
> ("non-A, non-B hepatitis", now usually called "hepatitis C").

fu zei bar does become a single token currently, but this happens
after si has already acted, so the above is not a problem. The only
effect is that bu in {fu zei bar bu} sees a single token and
creates a BY.

> It wouldn't break my heart if "zei zei ..." runs were always illegal, though:
> that is, if zei could not act upon zei.

I think Robin's approach for this is the Right Thing: the second
zei is the glue, so {zei zei broda} is as well behaved as any
other lujvo. In a string of more than one {zei}, the evens have
to be glue and the odd ones glued rather than the other way around
as is currently the case.

Jorge Llambías

May 7, 2004, 11:01:35 AM
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> On Thu, May 06, 2004 at 06:20:33PM -0400, Nora LeChevalier wrote:
> > Also, from a making-sense point of view, I prefer "si" after the
> > closing delimiter to delete the entire zoi phrase (back to and
> > including the zoi). To say that "The first SI after the close of a
> > ZOI clause erases the closing delimiter..." would make one think the
> > next thing said is part of the inside of the ZOI; so you would never
> > be able to get back to the ZOI.
>
> Yes, I understand your point completely. I'd love to hear other people
> chime in on this point. The problem is that SI is only supposed to
> erase one previous word, so we're moving in to the realm of "not
> justifiable under current standards".

I tend to agree with Nora. "Word" is not a very clearly defined word
at this level anyway. Consider:

{zoi gy Is this one word? gy bu}

{bu} is supposed to turn the previous word into a lerfu.

{zoi gy one word? gy zei zoi gy One word? gy}

{zei} is supposed to make two words into a lujvo.

So, if {bu} and {zei} take {zoi gy ... gy} to be a single word,
{si} could just as well do the same thing.

What does {da zo si si} do?

mu'o mi'e xorxes

Robin Lee Powell

May 7, 2004, 7:00:29 PM
to loj...@yahoogroups.com
On Fri, May 07, 2004 at 03:56:42PM -0700, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > Basically, Y is caught anywhere that *any* Lojban word would
> > normally be valid, so it can be used in ZEI, BU, ZOI, and ZO.
> > Probably some others. As an *extra* special case, it cannot have
> > BAhE applied to it, because that would just be silly.
>
> Why not restrict it to {ybu} only, then, since it is also silly with
> the other words. You could actually be hesitating, and then you have
> to start erasing y's, which is very silly.

What about 'zo y'?

-Robin

Robin Lee Powell

May 7, 2004, 2:55:09 PM
to loj...@yahoogroups.com
On Fri, May 07, 2004 at 08:01:35AM -0700, Jorge Llambías wrote:
>
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > On Thu, May 06, 2004 at 06:20:33PM -0400, Nora LeChevalier wrote:
> > > Also, from a making-sense point of view, I prefer "si" after the
> > > closing delimiter to delete the entire zoi phrase (back to and
> > > including the zoi). To say that "The first SI after the close of
> > > a ZOI clause erases the closing delimiter..." would make one think
> > > the next thing said is part of the inside of the ZOI; so you would
> > > never be able to get back to the ZOI.
> >
> > Yes, I understand your point completely. I'd love to hear other
> > people chime in on this point. The problem is that SI is only
> > supposed to erase one previous word, so we're moving into the realm
> > of "not justifiable under current standards".
>
> I tend to agree with Nora. "Word" is not a very clearly defined word
> at this level anyway. Consider:
>
> {zoi gy Is this one word? gy bu}

Fails in my parser.

> {bu} is supposed to turn the previous word into a lerfu.

Which it can't do, because all the previous words are otherwise engaged:
zoi happens before bu.

> {zoi gy one word? gy zei zoi gy One word? gy}
>
> {zei} is supposed to make two words into a lujvo.

This fails in my parser for the same reason.

> So, if {bu} and {zei} take {zoi gy ... gy} to be a single word,

They don't, in my parser or the official one.

They seem to in jbofihe; I have no idea why.

> {si} could just as well do the same thing.
>
> What does {da zo si si} do?

It *should* result in just 'da', because zo is defined as turning itself
and the next argument into a single word. zoi is *not* so defined.

Despite this, jbofihe chokes on that example. My parser has code
specifically for it, but it wasn't actually working properly (it was
parsing that example, but not correctly). It now works; thanks.

Partial parse tree: sumti6=( KOhA=( da zo si si ) free=() )

Raphaël Poss

May 7, 2004, 3:36:09 AM5/7/04
to lojba...@lojban.org

Robin Lee Powell <rlpo...@digitalkingdom.org> writes:

>> Also, from a making-sense point of view, I prefer "si" after the
>> closing delimiter to delete the entire zoi phrase (back to and
>> including the zoi). To say that "The first SI after the close of a
>> ZOI clause erases the closing delimiter..." would make one think the
>> next thing said is part of the inside of the ZOI; so you would never
>> be able to get back to the ZOI.
>
> Yes, I understand your point completely. I'd love to hear other people
> chime in on this point. The problem is that SI is only supposed to
> erase one previous word, so we're moving into the realm of "not
> justifiable under current standards".

Besides, we have a design flaw if "si" is supposed to step back into
the ZOI construct "word by word": what is a word outside Lojban?
How is "si" going to affect the construct if we cannot break it down
into words in the Lojban sense of the term?

--
. . . _ - --------\
: Raphaël Poss JID Elr...@jabber.dk · ICQ 1757157 |
| EPITA CSI 2003 · http://raphael.poss.name · GnuPG fp ...3b72e72b :
\------ - _ . . '


Robin Lee Powell

May 7, 2004, 5:23:31 PM5/7/04
to lojba...@lojban.org
On Fri, May 07, 2004 at 05:19:46PM -0400, jco...@reutershealth.com wrote:
> On a related note:
>
> Robin, are you now allowing "zoi fu I am a blind alley fu si si This
> is right fu" as grammatical?

Yes.

sumti6=( ZOI=( zoi ) zoiClause=( fu I am a blind alley fu si si This is right fu ) free=() )

> What about "zoi fu The beginning and fu si the end fu"?

sumti6=( ZOI=( zoi ) zoiClause=( fu The beginning and fu si the end fu ) free=() )

For technical reasons I hope to eventually remedy, exactly *what* my
parser is doing with the SI handling isn't being shown, but trust me
that it's doing what you'd expect.

> I'd guess that your preparser can't cope with either,

No, but I stopped using that several days ago in favour of *very* simple
Rats!-specific semantic tests in the PEG grammar.

> but what do you think is the Ideal Right Thing?

That depends whether one treats a complete ZOI clause as one word, four
words, or some other number. If one treats it as four words, I think
that both of those should be legal.

Robin Lee Powell

May 10, 2004, 3:45:33 PM5/10/04
to loj...@yahoogroups.com
On Mon, May 10, 2004 at 12:37:18PM -0700, Jorge Llambías wrote:
> --- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > My parser handles it exactly the same way in "mi broda lo zei zei
> > da" and "mi broda zei zei da".
>
> How come those are not "mi broda lo-zei-zei da" and "mi broda-zei-zei
> da"?

Prioritization. ZEI clauses are actually fairly far down the tanru-unit
list in the BNF, which is what I started with.

> > "zei bu" you seem to be correct on, and that follows from their
> > relative priorities. I've just told BU to not work on ZEI, ever, to
> > avoid that special case.
>
> When you have "zei zei" you have to decide which one acts as glue and
> which one as lujvo component. Why would you take the second one as
> glue?

See above. This may not be the 'right' thing, of course.

> > > When zo fights with these words directly, it always wins: {zo si},
> > > {zo bu}, {zo zei}, so I don't see any reason for it not to win
> > > when it fights with them over a third word. If {zo da} can be a
> > > single word for {bu} and for {zei} to grab,
> >
> > Is that exactly the question that we're discussing? As far as I can
> > tell, zo da is *never* considered a single word.
>
> In the current grammar, that's correct.
>
> > The official parser doesn't accept "zo da bu", nor "zo da zei
> > broda", so I'm not sure where you get the idea that "{zo da} can be
> > a single word for {bu} and for {zei} to grab" ?
>
> If "da zei de" can be a single word for bu, why can't "zo da" be a
> single word for bu?

As I haven't the slightest idea why ZEI was handled in a way different
from every other preprocessor token, I don't have a good answer for
that.

> Do you prefer to leave {zo a bu} as broken instead of giving it one of
> the two obvious possible meanings?

Not particularly.

Bob LeChevalier

May 10, 2004, 7:50:27 AM5/10/04
to loj...@yahoogroups.com

Lojban grammar was DESIGNED to be a slave to YACC restrictions, that being
a working definition of LALR1 for purposes of language design. That it
isn't a correct definition is irrelevant.

In answer to the question in this thread, I believe that the text comment
in the body of the grammar after the rule defining LohU 436 addresses the
intent for interactions between si and zo and zoi.

lojbab

--
lojbab loj...@lojban.org
Bob LeChevalier, Founder, The Logical Language Group
(Opinions are my own; I do not speak for the organization.)
Artificial language Loglan/Lojban: http://www.lojban.org

Jorge Llambías

May 7, 2004, 3:31:28 PM5/7/04
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > What does {da zo si si} do?
>
> It *should* result in just 'da', because zo is defined as turning itself
> and the next argument into a single word. zoi is *not* so defined.

Is there a justification for that difference?

mu'o mi'e xorxes

Jorge Llambías

May 10, 2004, 3:06:31 PM5/10/04
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> On Mon, May 10, 2004 at 11:42:08AM -0700, Jorge Llambías wrote:
> > {zei zei da} at the beginning of text?
> In other words, it creates a lujvo that means "zei type-of da".
>
> Which is what I think a human would expect.

Yes, except that it is a funny lujvo that can only appear at
the beginning of a text. I suppose {zei bu} is also a BY?
Again, it can only appear at the beginning of a text.

> > In {zo da si de}, {zo da bu}, {zo da zei de}, we have zo and something
> > else fighting over the same word, one pulling from the left and the
> > other from the right. We just have to define which one has priority,
> > and the other one should act on what remains.
>
> I don't see any reason to over-ride grammar.300 on that point: ZO has
> higher priority. But then we're back to whether or not SI eats more
> than one word, which grammar.300 says it does not.

When zo fights with these words directly, it always wins:
{zo si}, {zo bu}, {zo zei}, so I don't see any reason for it
not to win when it fights with them over a third word.
If {zo da} can be a single word for {bu} and for {zei} to grab,
it can also be a word for {si}, why not? Call it a single
quoted word.

It is also possible to decide that when there is an intervening
word, zo loses: so {zo (da si de)}. But then why not
{zo (da bu)} and {zo (da zei de)}? I prefer zo to consistently
have priority: {(zo da) si de}, {(zo da) bu} and {(zo da) zei de}.

Robin Lee Powell

May 5, 2004, 8:34:09 PM5/5/04
to loj...@yahoogroups.com

Nora LeChevalier

May 7, 2004, 6:22:31 PM5/7/04
to lojba...@lojban.org
At 05:51 PM 5/6/04 -0700, Robin wrote:
>On Thu, May 06, 2004 at 06:20:33PM -0400, Nora LeChevalier wrote:
>[long explanation of what my parser does snipped]

> > I think I've seen someone use "si" as the delimiter. This majorly
> > complicates things, no?
>
>Not at all. My parser has no problems with this. I just tested this on
>
>"zoi si I love zoi! si"
>
>and
>
>"zoi si I love zoi! si si Really! si"
>
>Both of which do exactly what I said they would do.

[snip]
What happens if you "si" some more (to try to get rid of the "zoi")?

"zoi si I love zoi! si si [erases the end-delimiter] si [erases the
internals] si [new end-delimiter? or deletes the start-delimiter?]"

In other words, at what point does "si" again start having normal usage?

Jorge Llambías

May 7, 2004, 4:38:19 PM5/7/04
to lojba...@lojban.org

--- Robin Lee Powell <rlpo...@digitalkingdom.org> wrote:
> > What does {da zo si si} do?
>
> It *should* result in just 'da', because zo is defined as turning itself
> and the next argument into a single word. zoi is *not* so defined.

I checked grammar .300, and there seems to be no such distinction
between zo and zoi:

a. If the Lojban word ``zoi'' (selma'o ZOI) is identified, take the
following Lojban word (which should be end delimited with a pause for
separation from the following non-Lojban text) as an opening delimiter.
Treat all text following that delimiter, until that delimiter recurs
*after a pause*, as grammatically a single token (labelled 'anything_699'
in this grammar). There is no need for processing within this text
except as necessary to find the closing delimiter.

b. If the Lojban word ``zo'' (selma'o ZO) is identified, treat the
following Lojban word as a token labelled 'any_word_698', instead of
lexing it by its normal grammatical function.

So both words turn what follows into special tokens, but remain
themselves as separate tokens.
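
A minimal sketch of rules (a) and (b) as quoted above, in Python. It is only an illustration: it splits on whitespace, ignores the pause requirement, and the function name lex_quotes and the catch-all label "word" are hypothetical. Labelling the ZOI delimiters as any_word_698 follows the reading given in this message, not a check against the grammar itself.

```python
def lex_quotes(text):
    """Label tokens roughly as rules (a) and (b) describe (whitespace-split words)."""
    words = text.split()
    tokens = []
    i = 0
    while i < len(words):
        w = words[i]
        if w == "zo" and i + 1 < len(words):
            # Rule (b): the word after zo is any_word_698, whatever it is.
            tokens.append(("ZO", w))
            tokens.append(("any_word_698", words[i + 1]))
            i += 2
        elif w == "zoi" and i + 1 < len(words):
            # Rule (a): the next word is the opening delimiter; everything up
            # to its recurrence becomes a single anything_699 token.
            delim = words[i + 1]
            tokens.append(("ZOI", w))
            tokens.append(("any_word_698", delim))
            j = i + 2
            while j < len(words) and words[j] != delim:
                j += 1
            tokens.append(("anything_699", " ".join(words[i + 2:j])))
            if j < len(words):
                tokens.append(("any_word_698", delim))
            i = j + 1
        else:
            # Assumption: everything else is left unclassified here.
            tokens.append(("word", w))
            i += 1
    return tokens
```

Note that zo and zoi stay as tokens of their own in the output, which is the point being made.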

That's not good. It causes a lot of problems. All of the
following should give an error if grammar .300 is followed
to the letter:

{zo da si de}

si will erase the previous token, 'any_word_698', and then
ZO followed by KOhA should give an error.

{zo da zei de}

zei will join 'any_word_698' and KOhA and turn everything
into BRIVLA, but then ZO followed by BRIVLA should cause
an error.

{zo da bu}

bu will turn 'any_word_698' into BY, but then ZO followed
by BY should cause an error.

Similar things will happen with zoi:

{zoi gy sth gy si}

will give an error because si will swallow the 'any_word_698'
token and what's left: ZOI any_word_698 anything_699 will be
followed by something else and give an error. The only way to
recover is to add three more {si}'s to remove everything.
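
The one-token-per-si reading can be sketched in Python as follows. The token labels are written by hand, the "SI" label and the function name apply_si are assumptions made for the illustration, and nothing here is taken from the official parser.

```python
def apply_si(tokens):
    """Each SI token erases exactly one preceding token, if there is one."""
    out = []
    for label, text in tokens:
        if label == "SI":
            if out:
                out.pop()
        else:
            out.append((label, text))
    return out


# {zo da si de}: si removes any_word_698, leaving ZO directly before KOhA.
zo_case = [("ZO", "zo"), ("any_word_698", "da"), ("SI", "si"), ("KOhA", "de")]

# {zoi gy sth gy si}: one si removes only the closing delimiter.
zoi_case = [("ZOI", "zoi"), ("any_word_698", "gy"), ("anything_699", "sth"),
            ("any_word_698", "gy"), ("SI", "si")]

print(apply_si(zo_case))
print(apply_si(zoi_case))
print(apply_si(zoi_case + [("SI", "si")] * 3))
```

Under these assumptions the first call leaves ZO followed by KOhA, the second leaves an unterminated ZOI sequence, and only the third, with four si in total, clears everything.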

I think that the Right Thing is to treat {zo <word>} and
{zoi <word> <anything> <word>} as single tokens of selmaho
KOhA.
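
A rough Python sketch of that single-token proposal, for comparison. It groups {zo <word>} and {zoi <word> <anything> <word>} before si is applied; the names group_quotes and erase_with_si and the labels "QUOTE" and "WORD" are hypothetical, and pauses and every other magic word are ignored.

```python
def group_quotes(words):
    """Fold zo/zoi quotations into single QUOTE tokens."""
    tokens = []
    i = 0
    while i < len(words):
        if words[i] == "zo" and i + 1 < len(words):
            # zo plus the quoted word become one token.
            tokens.append(("QUOTE", " ".join(words[i:i + 2])))
            i += 2
        elif words[i] == "zoi" and i + 1 < len(words):
            # zoi, both delimiters and the quoted text become one token.
            delim = words[i + 1]
            j = i + 2
            while j < len(words) and words[j] != delim:
                j += 1
            tokens.append(("QUOTE", " ".join(words[i:j + 1])))
            i = j + 1
        else:
            tokens.append(("WORD", words[i]))
            i += 1
    return tokens


def erase_with_si(text):
    """Apply si after grouping, so one si removes a whole quotation."""
    out = []
    for label, tok in group_quotes(text.split()):
        if label == "WORD" and tok == "si":
            if out:
                out.pop()
        else:
            out.append(tok)
    return out
```

With this grouping, erase_with_si("zo da si de") leaves only de, erase_with_si("da zo si si") leaves only da, and a single si after a closed zoi clause removes the whole clause.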

In any case, I don't see any justification for treating
{zo} in one way and {zoi} in another.

mu'o mi'e xorxes