Incredible!

Mark L. Vines

Oct 21, 1995, 9:59:12 AM

to LOJ...@cuvmb.cc.columbia.edu

la and spuda mi di'e
> > I admire the rafsi principle, which is elegantly & beautifully
> > expressed in Lojban.
>
> Even allowing for intersubjective variability, I find such
> sentiments incredible.

At first, reading this comment of And's, I assumed that And had not read
my posting with adequate care.

I thought of making a dismissive reply. "I too would find such sentiments
incredible," I thought I'd say, "if I were unaware of the original
context, in which I was presenting a _complaint_ about rafsi in Lojban.

"Forms of politeness vary," I thought I'd continue. "Perhaps the custom
of 'starting every complaint with a compliment' is unknown to you."

You see, my primary intent had been to complain about something I was
calling the rafsi-cmavo anomaly:

> Unfortunately, out of the 392 rafsi which have phonologically
> identical twin cmavo, about 295 (by my current estimate) are
> apparently unrelated in both meaning and derivation. For example,
> fe'i is the rafsi for female but the cmavo for divided by; ti'u is
> the rafsi for daughter but the cmavo for a time stamp.

However, as I worked on the phrasing of the dismissive reply I was
planning, I realized that my opening compliment would have to be defended
-- even tho it was just the prelude to a complaint. So I began looking
yet again at rafsi in Lojban, hoping to bolster that defense. As I
looked, the truth gradually became clear.

And was right & I was wrong. Horribly wrong. As wrong as I've been
about anything in years.

The rafsi suffer from several very serious defects & problems. Here is a
partial list, with only a few examples of each:

1
* cmavo & rafsi identical in form
but unrelated in meaning:
da'a (all except), da'a => damba (fight);
mo'i (space motion), mo'i => morji (remember).

(That was the only problem I'd recognized before, & I'd underestimated
its seriousness. It's really a form of homophone ambiguity -- & Lojban
is supposed to be free of such ambiguity.)

2
* cmavo & rafsi identical in meaning
but different in form:
da'a (all except), daz => da'a (all except);
mo'i (space motion), mov => mo'i (space motion).

(As you can see, the same forms & meanings are sometimes involved in both
type 1 & type 2 defects. What a mess! You'd think that, if the cmavo
da'a really needed a rafsi, it could have one identical to itself, but
no! It has daz instead, while the rafsi da'a belongs to the gismu damba.
Logical? Not!)

3
* gismu whose meanings ought to be "affixable"
but which lack rafsi:
matra (motor); vidni (video); risna (heart).

(There are, by my current estimate, some 238 gismu which lack rafsi; but
not all of them have meanings which ought to be "affixable.")

4
* gismu whose meanings need not be "affixable"
but which do have rafsi:
smo => smoka (sock); bik => bikla (whip);
rig => rigni (disgusting).

5
* gismu whose meanings ought to be "suffixable"
but which have only non-suffixable CVC rafsi:
meb => mebri (brow); run => rutni (artifact);
tab => tabno (carbon).

6
* rafsi with parallel forms but which are derived
from dissimilar valsi:
cme => cmene (name); zme => guzme (melon);
gu'a => gunka (work); bu'a => bruna (brother).

7
* gismu with parallel forms but which are contracted
into dissimilar rafsi:
cabna (now) <= cab; zabna (favorable) <= zan, za'a;
senci (sneeze) <= sec; denci (tooth) <= den, de'i.

In fact, I found so many rafsi problems that I now favor scrapping the
whole rafsi-lujvo system & starting over. At the very least, we should
discard those 295 rafsi, each of which is identical in form to some cmavo
but unrelated to it in meaning. Would you folks agree with that?

Also, I owe And a word of thanks for disillusioning me with regard to the
"elegance" & "beauty" of rafsi in Lojban. How, you may ask, did I ever
acquire that illusion in the first place? As it happens, I _have_
reconstructed the answer to that question, & I'm willing to discuss it if
anyone is interested. But my personal mental life must remain a
secondary concern here. More importantly:

Can the rafsi defects & problems still be corrected?

If they are not corrected, can Lojban serve all its various functions?

No doubt these questions have been raised before. (Believe me, I'm aware
of my newbie status -- all the more so after making such a big mistake!)
But have these questions been answered before? If so, what are the
answers, & are you folks satisfied with them?

co'o mi'e mark,l

Logical Language Group

Oct 21, 1995, 2:33:12 PM

to MarkL...@eworld.com, loj...@cuvmb.cc.columbia.edu

>Can the rafsi defects & problems still be corrected?

In a word, no. Even if it were accepted that your concerns were valid,
any drastic design changes are not permitted at this point. Indeed, even
minor changes are difficult, except in some very small areas. We are
very close to the 5 year baseline, wherein the language will be frozen
as to prescription for 5 years or more (informal and experimental usages
will perhaps be developed, but they will by intention not be officially
recognized during a baseline).

A far more minor effort at making the rafsi attune to usage went through
2 years ago, and it took the commitment that the rafsi list would be
baselined to get support for that. Several people were involved in that
effort, and concerns such as yours were discussed as part of the debate.

>If they are not corrected, can Lojban serve all its various functions?

In short, your issues have no effect on Lojban's ability to serve its
functions. The only issue is one of learnability, and the tradeoff that
was made 2 years ago was that the problems in learnability of "illogical"
rafsi were small, and in any case no worse than the cost of relearning
for people who already had used and learned some of the original rafsi.

As to whether people are satisfied by the status quo - in general the
"old-timers" are satisfied. New people tend to be dissatisfied when they
run into something that seems illogical. Sometimes the concerns are valid;
in many cases it is a problem that we have not considered, or considered
only cursorily. But for the most part, such problems are in semantics and
not in the basic language design, which is sound.

As to whether the language design is "elegant", that appears to be a subjective
decision. I think that the rafsi system IS elegant, because I value
compromises, and am also aware of the flaws with many alternatives.
But explaining the "whys" of each design decision to each new person isn't
really practical. Someday, we'll have to write a book saying why each
design element came to be the way it was, and people could decide for
themselves.

But even to do this, someone considering the elegance of the design has to
bear in mind that all decisions must be made in context. In any design effort
you make some basic decisions first, and these will constrain all other
decisions. The fine points of assigning rafsi to meanings are a relatively
"small" decision, that does NOT affect the fundamental principles of the
language design, and hence must be constrained by those fundamental
principles. Within those constraints, I contend that the current tradeoff
is quite elegant, and furthermore, close to optimal.

Not everyone agrees with me on this. But no one has proposed any alternatives
that even come close to meeting the many criteria that are applicable, and
even if they seemed to do so, we would be unlikely to consider the idea
seriously because of the lateness in the design phase. In general, I have to
admit that in anything dealing with fundamental design questions, the first
reaction to a newcomer's proposal for change, or even a complaint, is to
dismiss it on the assumption that the new person doesn't have enough knowledge
of the language to even understand all the unwritten considerations that go
into the design. I'm giving your idea a more serious response than most, primarily
because today I am avoiding some other work that I should be doing instead,
and a lot of not-so-newcomers know that this is something that I should NOT be
doing. In general, the people trying to produce the books no longer have
time to give serious consideration to any design ideas, except those that
come up in the course of book-writing, and those that are raised insistently,
and survive debate on Lojban List to gain support from at least one of the
not-so-newcomers who has been more actively involved in the Lojban List
discussion.

All this weasel-wording aside, I'm going to respond to your specific ideas.

>The rafsi suffer from several very serious defects & problems. Here is a
>partial list, with only a few examples of each:

>* cmavo & rafsi identical in form
> but unrelated in meaning:
> da'a (all except), da'a => damba (fight);
> mo'i (space motion), mo'i => morji (remember).
>
>(That was the only problem I'd recognized before, & I'd underestimated
>its seriousness. It's really a form of homophone ambiguity -- & Lojban
>is supposed to be free of such ambiguity.)

This is not an ambiguity at all, much less one of homophones. That is
because cmavo are words, and rafsi are not. The cmavo cannot be used
as combining forms into lujvo, and the rafsi cannot be used EXCEPT as
combining forms. The morphology of the language allows unambiguous
resolution of combining forms from separate words, such that it has NO
CARE AT ALL about the semantic definition of the words, and only the
slightest consideration of the YACC grammar (and that is because a couple
of cmavo serve metalinguistically to override some of the basic rules:
specifically the non-Lojban text quotes la'o and zoi - within those quotes
words need not follow the Lojban morphology and therefore those constructs
must be identified and filtered out before other grammar processing).

Perhaps it would be better to understand that the rafsi for "fight" is
NOT "da'a", but rather "-da'a" or "-da'a-" or "-da'a".

>* cmavo & rafsi identical in meaning
> but different in form:
> da'a (all except), daz => da'a (all except);
> mo'i (space motion), mov => mo'i (space motion).
>
>(As you can see, the same forms & meanings are sometimes involved in both
>type 1 & type 2 defects. What a mess! You'd think that, if the cmavo
>da'a really needed a rafsi, it could have one identical to itself, but
>no! It has daz instead, while the rafsi da'a belongs to the gismu damba.
> Logical? Not!)

Actually quite logical based on considerations of language. That someone
MIGHT want to make a lujvo involving "all-except" justifies giving it
a rafsi. But with few exceptions, the cmavo that COULD be used in lujvo
are in fact seldom used. Meanwhile, as you point out in other points,
there are plenty of words that ARE usable in lujvo, and of them, at least one
- damba - is permitted under the rafsi selection rules to have the rafsi
form "da'a". So the tradeoff on this particular rafsi comes down to: will
"damba" be desired for use in lujvo more or less often than the cmavo "da'a"?
Since only one of the two can have that particular rafsi, and neither has
a logical (much less rule-permitted) basis for having any other word-final
short-rafsi form, then only one of the two concepts will get such a rafsi.

In actual debate and usage, damba WAS used more, so damba got the rafsi.

>* gismu whose meanings ought to be "affixable"
> but which lack rafsi:
> matra (motor); vidni (video); risna (heart).
>
>(There are, by my current estimate, some 238 gismu which lack rafsi; but
>not all of them have meanings which ought to be "affixable.")

At this point, we see one fundamental gap in your understanding. EVERY
gismu has at least one word-final rafsi, and at least one non-word-final
rafsi. These are not listed in tables because they are implicit in the
gismu word-form themselves. The rafsi for "matra" are "matr-y-" and "-matra"
respectively.

> 4
>* gismu whose meanings need not be "affixable"
> but which do have rafsi:
> smo => smoka (sock); bik => bikla (whip);
> rig => rigni (disgusting).

And this point at least makes me suspicious of another understanding that is
missing. On what basis are you saying that these gismu have meanings that
need not be affixable, whereas the ones in point 3 "ought" to be affixable?
That you can think of a lujvo for one, and cannot think of a lujvo for
the other is not a very strong criterion - you are a single individual, and
an English native speaker. Perhaps someone from a different language
background or culture would feel just the opposite of you and would not
see any reason for a lujvo for risna, and plenty of reason for bikla.

NO analysis of word meanings can provide a culture-free basis for deciding
which of two words needs a rafsi more than another. The only practical
basis is to look at actual usage and proposed usages and see which forms
competing for a single rafsi ARE most useful. Thus, under the current design,
the only short forms permissible to "vidni" are "vid-", "vin-", and "-vi'i".
You would thus have to argue that vidni would be substantially more usefully
assigned to one of these rafsi than the current holder of those assignments.
The existing rafsi assignments were made on the basis of actual usage and
proposed usages, and even 2 years ago it would have taken SEVERAL more
lujvo based on vidni to get it to displace one of the others.

> 5
>* gismu whose meanings ought to be "suffixable"
> but which have only non-suffixable CVC rafsi:
> meb => mebri (brow); run => rutni (artifact);
> tab => tabno (carbon).

This is dealt with by the same argument. But I am particularly curious
about one. In what way is "mebri" particularly appropriate to have a
final-position rafsi, while smoka does not justify any rafsi at all?
(Notwithstanding the fact that mebri does have the non-shortened "suffix"
form "-mebri".)

> 6
>* rafsi with parallel forms but which are derived
> from dissimilar valsi:
> cme => cmene (name); zme => guzme (melon);
> gu'a => gunka (work); bu'a => bruna (brother).
>
> 7
>* gismu with parallel forms but which are contracted
> into dissimilar rafsi:
> cabna (now) <= cab; zabna (favorable) <= zan, za'a;
> senci (sneeze) <= sec; denci (tooth) <= den, de'i.

You are taking a very narrow view of "similar" and "dissimilar".
Your cases in 7 differ only by one letter in the initial position. Fine.
But how about "carna" and "cabra", which similarly differ only by one letter
from "cabna"? Do you consider them to be just as "parallel"?

But it turns out that one of these - "cabra" has a CCV form possible -
"bra", while the other two do not. Thus among those three, they really
AREN'T exactly parallel in structure.

By the standards of Lojban rafsi-making, the words in question DO have similar
structures as we define similar.

Specifically a CVC form rafsi coming from a gismu of form
C1V1C2C3V2 can use C1V1C2 or C1V1C3
and one of form
C1C2V1C3V2 can use C1V1C3 or C2V1C3

Similar constraints apply to CCV and CVV form rafsi. It happens that in
6 you tried mixing CCVCV gismu with CVCCV gismu, and of course at the
surface ALL such gismu will appear to be of "dissimilar" forms. Why should
either of the two forms take precedence for assignment of a CVC rafsi?
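
The two CVC patterns above can be sketched in a few lines of Python. This is only an illustration, not the official rafsi-assignment procedure: the function names are made up here, and for the CCVCV shape it assumes the candidates are C1V1C3 and C2V1C3, i.e. that letters keep their original order. It lists which CVC forms a gismu could compete for, not which one it was actually assigned.

```python
VOWELS = set("aeiou")


def gismu_shape(gismu: str) -> str:
    """Return the consonant/vowel pattern, e.g. 'CVCCV' for 'cabna'."""
    return "".join("V" if ch in VOWELS else "C" for ch in gismu)


def cvc_candidates(gismu: str) -> list[str]:
    """List the CVC forms a five-letter gismu could claim under the two patterns."""
    shape = gismu_shape(gismu)
    if shape == "CVCCV":
        c1, v1, c2, c3, _v2 = gismu
        return [c1 + v1 + c2, c1 + v1 + c3]
    if shape == "CCVCV":
        c1, c2, v1, c3, _v2 = gismu
        return [c1 + v1 + c3, c2 + v1 + c3]
    raise ValueError(f"{gismu!r} is not of shape CVCCV or CCVCV")


for word in ("cabna", "zabna", "senci", "denci", "vidni", "cmene"):
    print(word, gismu_shape(word), cvc_candidates(word))
```

On these inputs the candidates include the forms quoted in the thread (cab, zan, sec, den, vid, vin), which is consistent with the point that the "parallel" gismu draw from parallel candidate sets and differ only in which candidate each one won.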

>In fact, I found so many rafsi problems that I now favor scrapping the
>whole rafsi-lujvo system & starting over. At the very least, we should
>discard those 295 rafsi, each of which is identical in form to some cmavo
>but unrelated to it in meaning. Would you folks agree with that?

Given your example of problems that you find with the system, it seems likely
that what you object to is NOT the rafsi system itself, but the methods of
rafsi assignment - a far less basic issue (albeit one not open to change
these days).

I strongly suspect that the lack of elegance and beauty that And sees in the
rafsi system is far more basic than the level that you are seeing "ugly".
I'll let him say what his problems are, if he wishes, but having debated with
him before on the issue, I suspect that our differences are based more on
different aesthetic assumptions, and are hence subjective.

lojbab

Logical Language Group

Oct 23, 1995, 3:44:25 AM

to ucl...@ucl.ac.uk, loj...@cuvmb.cc.columbia.edu

The Loglan morphology was redesigned as late as 1979-82 as the Great
Morphological Revolution, which is when unique assignment of rafsi was
introduced and the current system of making lujvo was invented.
A LOT of alternatives were considered at that point, and I have yet to
see a single idea that meets all criteria of the current language - you
always seem to need to drop at least one requirement as "unimportant".
The one that came closest was Nora's idea of reserving a specific letter for
ends-of-words, but we considered that as a joke even when we proposed it
- it just sounds too weird.

Usually proposals either assume that lujvo will be longer than their tanru
by sticking some kind of glue in, that cmavo do not have to have a separate
word-space from gismu and lujvo (and rafsi), etc. None of these have seemed
to be all that much nicer for what they give up. But then I LIKE the current
system.

So no, it has not been frozen for 25 years, just 13.

If you want a constantly evolving language, look at JCB's TLI Loglan.
It changes more rapidly than Lojban and yet still hasn't caught up with us.
I haven't figured out how this is possible.

lojbab

ucleaar

Oct 23, 1995, 2:46:01 AM

to loj...@cuvmb.cc.columbia.edu

Lojbab:

> I strongly suspect that the lack of elegance and beauty that And sees
> in the rafsi system is far more basic than the level that you are
> seeing "ugly". I'll let him say what his problems are, if he wishes,
> but having debated with him before on the issue, I suspect that our
> differences are based more on different aesthetic assumptions, and are
> hence subjective.

I think that if we could totally redesign the morphology and the
phonology and the phonological forms of vocabulary items then 95%
(arbitrary figure) of the current complexity could be got rid of,
and there could be other advantages, e.g. greater brevity. The
improvement would be massive. But so would the necessary relearning.

But if in the history of Lojban there has been an opportunity for
such a rationalization of the morphology, it surely has not been
within the last 25 years. According to my understanding, when
Lojban split from Institute Loglan, the goal was essentially
to clone Loglan, not to improve it. So Lojban has always been
constrained by decisions in its prior design history; there's
never been a point when one could resolve to scrap all the
morphology and start again from scratch, however massive the
improvements of doing so would be. I therefore agree with
Lojbab that "Within those constraints, I contend that the current tradeoff
is quite elegant, and furthermore, close to optimal".

Lojban is like London, which most people find much less beautiful
than Paris. But Paris is developed by means of centralized power
that rides roughshod over the populace in building its grands
projets and carving boulevards through slums and so on and so forth.
Architects such as Richard Rogers have grand plans for the Parisification
of London, but nobody has both the will and the power to force them
through.

With hindsight, I guess it would have made more sense to develop
two languages, one a speakable form of predicate logic, which could
surely have been fully baselined many many years ago, and the other a
language constantly evolving towards ideals of rationality, elegance
and suchlike.
---
And

jo...@phyast.pitt.edu

Oct 23, 1995, 5:32:22 PM

to loj...@cuvmb.cc.columbia.edu

> A LOT of alternatives were considered at that point, and I have yet to
> see a single idea that meets all criteria of the current language - you
> always seem to need to drop at least one requirement as "unimportant".

Here is one possible idea:

Define as rafsi all syllables of the form CCV, CCVN, and KVN, where
CC is any of the 48 permissible initials, N is one of l,m,n,r and K
is any of the other 13 Lojban consonants.

There are 240+960+260 = 1460 such syllables, enough to cover all gismu.

(If those were not enough, the 3380 KVKN are available. They just don't
make very nice lujvo.)
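
The counts above can be checked with a few multiplications. The sketch below just restates the figures given in the proposal (48 initial pairs, 5 vowels, 4 finals l/m/n/r, 13 other consonants); the variable names are made up for illustration.

```python
INITIAL_PAIRS = 48   # permissible initial consonant pairs (CC), as stated above
VOWELS = 5           # a, e, i, o, u
N_FINALS = 4         # N is one of l, m, n, r
K_CONSONANTS = 13    # K is any of the other consonants

ccv = INITIAL_PAIRS * VOWELS                                # CCV syllables
ccvn = INITIAL_PAIRS * VOWELS * N_FINALS                    # CCVN syllables
kvn = K_CONSONANTS * VOWELS * N_FINALS                      # KVN syllables
kvkn = K_CONSONANTS * VOWELS * K_CONSONANTS * N_FINALS      # fallback KVKN forms

print(ccv, ccvn, kvn, ccv + ccvn + kvn)   # 240 960 260 1460
print(kvkn)                               # 3380
```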

Make gismu from those by adding any CV at the end of the CCV and KVN
forms, or a single vowel at the end of the CCVN forms. (Not necessarily
the same vowel or CV for all. The choice may be arbitrary or follow some
rule, classifying words in some way.)

All the gismu so obtained are morphologically like the ones we have.
The rafsi for each is unique and automatically obtainable from the
gismu, and what is even better, no additional 'r's or 'y's are ever
needed, and the rafsi are always one-syllable, except when in final
position. Lujvo are trivially decomposed because each syllable
always corresponds to a different rafsi.

What criterion would this idea have failed to meet?

Jorge

Logical Language Group

Oct 24, 1995, 10:04:36 AM

to jo...@phyast.pitt.edu, loj...@cuvmb.cc.columbia.edu

>Make gismu from those by adding any CV at the end of the CCV and KVN
>forms, or a single vowel at the end of the CCVN forms. (Not necessarily
>the same vowel or CV for all. The choice may be arbitrary or follow some
>rule, classifying words in some way.)
>
>All the gismu so obtained are morphologically like the ones we have.
>The rafsi for each is unique and automatically obtainable from the
>gismu, and what is even better, no additional 'r's or 'y's are ever
>needed, and the rafsi are always one-syllable, except when in final
>position. Lujvo are trivially decomposed because each syllable
>always corresponds to a different rafsi.
>
>What criterion would this idea have failed to meet?

Well, the obvious one seems to be that the gismu space is so constrained
that assignment of gismu would have to be nearly random.

It is NOT clear that the word-recognition scores algorithm is that effective
for Lojban gismu making, but I think that there is considerable likelihood
that the assignments are better than random.

A consequence also is that Lojban words have an uneven phoneme frequency,
and the frequencies of the phonemes are not unlike the frequencies of
natural languages the words were built from. A few people have noticed that
although Lojban words look strange, as a text/phoneme string the language
sounds natural. It is unclear whether a flat distribution would have this
trait.

I can't remember how large the current gismu space is, but it is well over 20K
if my memory is worth anything. Even including in your less-good rafsi
forms would lead to only 5K in the gismu space.

I also think that design-wise we would not have found only 1400 gismu as
an upper limit to be too constraining. When we started designing the
language we had only 1000 gismu, and this grew to the current 1300.
It may be baselined for the foreseeable future, but I don't think that there
was any evidence back in 1987 that the number of gismu would stop just at this
particular point. Indeed some of us figured we would end up close to 2000,
based on observations that that number seems to commonly occur as a count
of roots, basic words, etc. in various natural languages. It may even happen
eventually that Lojban will get that high, though not for a lot of years.

It wasn't until the first gismu list baselining in 1989 or 1990 (can't remember
which year) that the consensus settled towards fewer rather than more gismu,
and by that time the morphology was pretty much set in concrete since we
baselined it first (for obvious reasons - you don't want the rules for what
constitutes a word to change after you have started making words).

lojbab

ucleaar

Oct 25, 1995, 5:25:05 PM

to loj...@cuvmb.cc.columbia.edu

> The Loglan morphology was redesigned as late as 1979-82 as the Great
> Morphological Revolution, which is when unique assignment of rafsi was
> introduced and the current system of making lujvo was invented.

It's surprising that none of them could better the eventual system.
Jorge's idea is clearly superior. Further, perhaps some of the
requirements really were unimportant, or at least not so important
that the system had to be as cumbersome and complicated and
antimnemonic as it has ended up being.

> The one that came closest was Nora's idea of reserving a specific
> letter for ends-of-words, but we considered that as a joke even when we
> proposed it - it just sounds too weird.

Rick Harrison outlined something like that on Conlang once - he
devised a system where a syllable could be word-final iff it belonged
to a specified list of permitted final syllables - or something along
those lines.

> Usually proposals either assume that lujvo will be longer than their
> tanru by sticking some kind of glue in, that cmavo do not have to have
> a separate word-space from gismu and lujvo (and rafsi), etc. None of
> these have seemed to be all that much nicer for what they give up. But
> then I LIKE the current system.

Here's mine, in brief.

There are 2 kinds of syllable, C(@) and CV. @ is schwa and can be
omitted between certain consonant pairs. Cmavo are all of form CV
or CVCV or CVCVCV, etc. Gismu are all of form C(@)CV (with 17 C
and 5 V, that gives 1445 possible gismu; Lojban actually has 7 V
phonemes and 22 C phonemes, so that gives 2904 possible gismu).
Lujvo are of form C(@)CV-C(@)-C(@)CV-C(@)-C(@)CV (e.g. "kkakkka",
"stendra"), where -C(@)- is "glue", but allows many distinct lujvo
to be made from the same gismu.

The result is greater simplicity and brevity than the present system
offers.
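
As a quick check of the gismu counts quoted above: a C(@)CV gismu has two consonant slots and one vowel slot, so the space is C * C * V. The sketch below (the function name is made up here) reproduces 1445 for 17 consonants and 5 vowels. Note that 2904 comes out of 22 consonants with 6 vowels; with 7 vowels the same formula gives 3388. That is only an observation about the arithmetic, not a claim about which inventory was intended.

```python
def cacv_space(consonants: int, vowels: int) -> int:
    """Number of C(@)CV forms: two consonant slots, one vowel slot."""
    return consonants * consonants * vowels


print(cacv_space(17, 5))   # 1445
print(cacv_space(22, 6))   # 2904
print(cacv_space(22, 7))   # 3388
```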

> So no it has not been frozen for 25 years, just 13.

What a shame, then, that the opportunity was wasted, and that 13
years ago was too early for discussion lists like this one.

> If you want a constantly evolving language, look at JCB's TLI Loglan.
> It changes more rapidly than Lojban and yet still hasn't caught up
> with us. I haven't figured out how this is possible.

It depends on who works on it, I guess. But can I believe you when
you say Loglan is backward? All I know about it is its orthography,
and as you know I find it much prettier than Lojban's. Oh, but I
also know that Lojban list is a million times interestinger than
Loglanists.

---
And

Chris Bogart

Oct 27, 1995, 3:01:19 AM

to loj...@cuvmb.bitnet

>la .and. cusku di'e

>> There are 2 kinds of syllable, C(@) and CV. @ is schwa and can be
>> omitted between certain consonant pairs. Cmavo are all of form CV
>> or CVCV or CVCVCV, etc. Gismu are all of form C(@)CV (with 17 C
>> and 5 V, that gives 1445 possible gismu; Lojban actually has 7 V
>> phonemes and 22 C phonemes, so that gives 2904 possible gismu).

Guaspi does something similar -- its gismu are monosyllabic but can have
several consonants at the beginning and end; also it counts some liquids and
things as vowels, I think. The use of tones for syntax probably allows some
overlap between cmavo and gismu -- not sure about that though.

I don't know how the numbers work out for guaspi, but in your scheme you've
got the gismu space packed pretty tightly. (1300 gismu out of 1445 or 2904
possible words) I wonder how far you can go with that before the lack of
redundancy makes it hard to speak the language in a noisy room or over the
phone. The cmavo are already like that, but I only actually use and
understand about half of them (less?) so it's hard to know if they'll cause
that problem yet.

What if we constructed the gismu CCVCV/CVCCV as they are now, generated from
the six source languages, but constrained in such a way as to make the CCV
portion unique for each gismu? That CCV part would then be the only rafsi
for that gismu. I don't think that is possible without using your nalmelbi
[pe'i .u'u] consonant cluster scheme, though. Timothy Miller's Ferengi
language takes the approach of jamming consonants together willy-nilly with
schwas as needed, and it's not a pretty sight (sorry, Timothy!) Could we
eke out 1300 by allowing a few more consonant combinations, 3-letter combos
("str") and adding syllabic liquids and sibilants? Throwing "th" and "dh"
into the alphabet? Umlauts? Tones? Clicks? Farts and armpit noises?
*Anything* but random consonant clusters! :-)

[pe'i morphology changes are worth discussing, but I don't actually advocate
any change. se'i the language is sufficient as is -- za'a we're only
suggesting changes because the jai za'o schedule is giving us too much time
on our hands to be perfectionists.]

co...@digitalkingdom.org

Oct 26, 1995, 12:39:09 PM

to Lojban List

jo...@phyast.pitt.edu

Oct 24, 1995, 12:29:38 PM

to loj...@cuvmb.cc.columbia.edu

> >What criterion would this idea have failed to meet?
>
> Well, the obvious one seems to be that the gismu space is so constrained
> that assignment of gismu would have to be nearly random.

Not really, I'm sure you could still get a very high correlation with
the Chinese and English words, which are mostly monosyllables. I doubt
it would be much more random than the current assignment.

You can also add the rafsi KVV, with any CC to form the gismu KVCCV. The
only drawback is that these rafsi need the -r- glue sometimes, but that is
not too bad. "y" is still never needed, and the rafsi are still unique
for a given gismu.

> It is NOT clear that the word-recognition scores algorithm is that effective
> for Lojban gismu making, but I think that there is considerable likelihood
> that the assignments are better than random.

There wouldn't be a significant change in this respect.

> A consequence also is that Lojban words have an uneven phoneme frequency,
> and the frequencies of the phonemes are not unlike the frequencies of
> natural languages the words were built from.

That would still be the case, since for most gismu you would be adding
an arbitrary CC or KV.

> A few people have noticed that
> although Lojban words look strange, as a text/phoneme string the language
> sounds natural. It is unclear whether a flat distribution would have this
> trait.

My proposal wouldn't have to have a flat distribution.

> I can't remember how large the current gismu space is, but it is well over 20K
> if my memory is worth anything. Even including in your less-good rafsi
> forms would lead to only 5K in the gismu space.

Current gismu space: 48*5*17*5 + 17*5*164*5 = 90100
Proposed gismu space: 46*5*17*5 + 13*5*164*5 = 72850

It is the same space, minus the gismu starting with l,m,n,r.
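
The two totals can be reproduced as follows. This is a sketch under stated assumptions: the list of 48 initial pairs is supplied here from the standard Lojban initial-cluster table and does not appear in this thread, and the figure 164 for medial consonant pairs is simply taken from the formula above. Dropping the pairs that start with l, m, n or r (only "ml" and "mr") leaves the 46 used in the second line.

```python
# Assumed list of the 48 permissible initial consonant pairs (not from the thread).
INITIALS = (
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv"
).split()

CONSONANTS = 17      # all consonants
K_CONSONANTS = 13    # consonants other than l, m, n, r
VOWELS = 5
MEDIAL_PAIRS = 164   # taken as given from the formula above

# Initial pairs that survive once nothing may start with l, m, n, r.
kept = [pair for pair in INITIALS if pair[0] not in "lmnr"]

# CCVCV forms plus CVCCV forms.
current = (len(INITIALS) * VOWELS * CONSONANTS * VOWELS
           + CONSONANTS * VOWELS * MEDIAL_PAIRS * VOWELS)
proposed = (len(kept) * VOWELS * CONSONANTS * VOWELS
            + K_CONSONANTS * VOWELS * MEDIAL_PAIRS * VOWELS)

print(len(INITIALS), len(kept))   # 48 46
print(current, proposed)          # 90100 72850
```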

> I also think that design-wise we would not have found only 1400 gismu as
> an upper limit to be too constraining. When we started designing the
> language we had only 1000 gismu, and this grew to the current 1300.

Ok, with all the additions, there are 5167 possible rafsi, considering all
the forms CCV, CCVN, KVN, KVKN, KVV. Even if you get up to 2000 there is
plenty of redundancy.
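
The figure 5167 can be reproduced under two assumptions that are not spelled out above: that the count uses 46 initial pairs (those not starting with l, m, n, r) rather than 48, and that VV covers 29 vowel pairs (the four diphthongs ai, au, ei, oi plus the 25 V'V combinations). The names below are illustrative only.

```python
INITIAL_PAIRS = 46    # assumption: pairs starting with m are excluded
VOWELS = 5
N_FINALS = 4          # l, m, n, r
K_CONSONANTS = 13
VV_PAIRS = 4 + VOWELS * VOWELS   # assumption: ai, au, ei, oi plus every V'V

forms = {
    "CCV": INITIAL_PAIRS * VOWELS,
    "CCVN": INITIAL_PAIRS * VOWELS * N_FINALS,
    "KVN": K_CONSONANTS * VOWELS * N_FINALS,
    "KVKN": K_CONSONANTS * VOWELS * K_CONSONANTS * N_FINALS,
    "KVV": K_CONSONANTS * VV_PAIRS,
}
total = sum(forms.values())

print(forms)
print(total)   # 5167
```

With 48 initial pairs instead, the same sum would not come to 5167, which is why the 46 is assumed here.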

> It may be baselined for the foreseeable future, but I don't think that there
> was any evidence back in 1987 that the number of gismu would stop just at this
> particular point. Indeed some of us figured we would end up close to 2000,
> based on observations that that number seems to commonly occur as a count
> of roots, basic words, etc. in various natural languages. It may even happen
> eventually that Lojban will get that high, though not for a lot of years.

That's not a problem.

> It wasn't until the first gismu list baselining in 1989 or 1990 (can't remember
> which year) that the consensus settled towards fewer rather than more gismu,
> and by that time the morphology was pretty much set in concrete since we
> baselined it first (for obvious reasons - you don't want the rules for what
> constitutes a word to change after you have started making words).

I think that this was really the problem all along. Even when the gismu were
originally made, the idea was to reproduce what JCB had done. I can't believe
that a simpler morphology can't be found if working from scratch.

I just thought of another possibility: make every gismu identical to
its combining form, all of them of form BAL, where:

B: b, c, d, f, g, j, k, p, s, t, v, x, z, bl, br, cf, ck, cl, cm, ..., zv.
(total: 13+46=59)
A: a, e, i, o, u, ai, au, ei, oi, a'a, a'e, ..., u'u.
(total: 34)
L: l, m, n, r
(total: 4)

That gives a space of 8024 from which to select the 1500 or so gismu,
so there is no reason for the distribution of morphemes to be flat. The
most common roots would be monosyllabic and the less common could have
two syllables.
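
The size of the BAL space can be checked by enumerating it. In this sketch the 46 clusters are filled in from the standard Lojban initial-cluster table (minus "ml" and "mr"), since the list above is abbreviated with "..."; that list, like the variable names, is an assumption and not part of the proposal.

```python
from itertools import product

SINGLE = "b c d f g j k p s t v x z".split()          # 13 single consonants
CLUSTERS = (                                          # assumed 46 initial pairs
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv"
).split()
VOWELS = "aeiou"

B = SINGLE + CLUSTERS
A = (list(VOWELS)
     + ["ai", "au", "ei", "oi"]
     + [first + "'" + second for first in VOWELS for second in VOWELS])
L = list("lmnr")

# Every B + A + L string; a set is used so duplicates would be caught.
forms = {b + a + l for b, a, l in product(B, A, L)}

print(len(B), len(A), len(L))   # 59 34 4
print(len(forms))               # 8024
print("xun" in forms)           # True
```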

This would even allow us to forget about the stress rule. To separate
lujvo from tanru all that would be needed is a separator like {co}
for tanru: For example, if {xun} was "red" and {zda} was "house",
then {xunzda} would be red-house, and the tanru could be {zda co xun}
or {xunvau zda} (or something else if {vau} can't be used like that).
Even with the additional separator, tanru would not be longer than
they are now.

That is just one possibility, I'm sure there are millions more that
would allow lujvo to be made in a simple manner (which is the biggest
problem with the current system).

Obviously it can't be changed at this point, but I don't think that
what we have is the best that we could have given a reasonable set of
criteria. It may be the best only if we take the list of gismu as a
given, but that is a historical constraint, so And was right about the
25 years.