What actually are the rules of word formation?


TR NS

Apr 19, 2015, 10:18:06 AM
to lojban-b...@googlegroups.com
When I first learned about Loglan/Lojban I thought all words were of the form `CVCCV` or `CCVCV`, and compound words (lujvo) were always `(CVC|CCV)...CV`. I knew names had more flexibility. Later I learned that borrowed words did too. Even later I learned that compound words also had more flexible rules, e.g. hyphen letters. And finally, though it had been staring me in the face, it dawned on me that even simple words like `brivla` did not fit the original formations.

After reading a fair bit, I find myself more confused than ever. Is there a single set of definitive rules defining what is and what isn't a legal Lojban word?

Jorge Llambías

Apr 19, 2015, 3:13:55 PM
to lojban-b...@googlegroups.com
On Sun, Apr 19, 2015 at 11:18 AM, TR NS <tran...@gmail.com> wrote:
> When I first learned about Loglan/Lojban I thought all words were of the form `CVCCV` or `CCVCV`, and compound words (lujvo) were always `(CVC|CCV)...CV`. I knew names had more flexibility. Later I learned that borrowed words did too. Even later I learned that compound words also had more flexible rules, e.g. hyphen letters. And finally, though it had been staring me in the face, it dawned on me that even simple words like `brivla` did not fit the original formations.

And you forgot cmavo. 

> After reading a fair bit, I find myself more confused than ever. Is there a single set of definitive rules defining what is and what isn't a legal Lojban word?

Except for a couple of minor details that only really affect fu'ivla, yes. The rules for gismu and lujvo have been definitive for decades. The most complicated part of the word formation rules is the part for lujvo. The additional rules needed for fu'ivla are relatively easy, but they are built on top of the lujvo rules, because whether something is a fu'ivla depends on whether it is a lujvo, and lujvo have priority. Since gismu, cmavo and cmevla are trivial, and fu'ivla depend on lujvo, you should probably start by familiarizing yourself with the lujvo-making rules, which you can find in CLL: http://lojban.github.io/cll/4/11/

mu'o mi'e xorxes

Pierre Abbat

Apr 19, 2015, 4:38:08 PM
to lojban-b...@googlegroups.com
I started learning Lojban after the rafsi reallocation. Any lujvo that was
valid when I started is still valid, including "brivla". The changes (there
was one a few months ago, stating that a syllable cannot contain CGV, where C
is a consonant, G is a glide, and V is a vowel) affect only fu'ivla.

Pierre
--
sei do'anai mi'a djuno puze'e noroi nalselganse srera

mezohe

Apr 19, 2015, 6:01:48 PM
to lojban-b...@googlegroups.com
de'i li 2015-04-19 ti'u li 16:18 li'ai TR NS di'e cusku:
> When I first learned about loglan/lojban I thought all words were of the
> form `CVCCV` or `CCVCV`, and compound words (lujvo) were always
> `(CVC|CCV)...CV`.

That seems correct for original, pre-rafsi Loglan. Words with other
shapes were invented later.

> I knew names had more flexibility. Later I learned
> that borrowed words did too. Even later I learned that compound words
> too had more flexible rules, e.g. hyphen letters. And finally in the end,
> staring me in the face, it dawned on me that even simple words like
> `brivla` did not fit the original formations.

Par for the course. Most of the learning materials leave the morphology
unexplained, and even the CLL leaves some questions open.

> After reading a fair bit, I find myself more confused than ever. Is
> there a single set of definitive rules defining what is and what isn't a
> legal Lojban word?

Each parser has its own set of word formation rules. Most words in use
are recognized by all parsers but there are edge cases. I'll try to lay
out how camxes does it, since the existing documents explain it a little
disjointedly.

------ MEZOHE'S CONDENSED CLL 2.0 MORPHOLOGY CHAPTER COUNTERFEIT ------

=== Phonemes ===

At the most basic level, an utterance is made of phonemes. Here are the
main classes of phonemes (there are subclasses as seen later):

- consonants {zunsna}:
bdgjvz (voiced), cfkpstx (unvoiced), lmnr (syllabic)
- glides {karmlisna}: i u
- h {me'o .y'y}: '
- word break (glottal stop) {depybu'i}: .
- vowels {karsna}: a e i o u
- diphthongs: au ai ei oi
- y {me'o .ybu}: y

The comma {me'o slaka bu} isn't a phoneme, but is used to separate
syllables for clarity. Removing it has no effect.

i and u are vowels, unless a vowel or diphthong follows, in which case
they are glides. Glide-diphthong pairs win over glide-vowel pairs, which
win over diphthongs.

At this level, strings of consonants follow these rules:
- consonants can be next to consonants, word breaks, vowels,
diphthongs, and y
- no consonant can be followed by itself
- voiced consonants can't be next to voiceless ones, and vice versa
- sibilants (cjsz) can't be next to each other
- x can't be next to c or k
- the substrings mz, nts, ntc, ndz, ndj are not allowed
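
The consonant rules above can be sketched as a small Python check. This is only an illustration, not how camxes does it: the name `valid_consonant_string` is made up here, and the function assumes its input is a run of consonant letters and nothing else.

```python
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
FORBIDDEN_SUBSTRINGS = ("mz", "nts", "ntc", "ndz", "ndj")


def valid_consonant_string(cluster: str) -> bool:
    """Check a run of consonants against the phoneme-level rules above.

    Assumes `cluster` contains consonant letters only (hypothetical helper).
    """
    # Forbidden substrings can span more than two letters, so test them first.
    if any(bad in cluster for bad in FORBIDDEN_SUBSTRINGS):
        return False
    for a, b in zip(cluster, cluster[1:]):
        if a == b:  # no consonant can be followed by itself
            return False
        if (a in VOICED and b in UNVOICED) or (a in UNVOICED and b in VOICED):
            return False  # voiced next to unvoiced
        if a in SIBILANTS and b in SIBILANTS:
            return False  # two sibilants in a row
        if {a, b} in ({"x", "c"}, {"x", "k"}):
            return False  # x next to c or k
    return True
```

The syllabic consonants l, m, n, r are neither voiced nor unvoiced for this purpose, so a pair like "rp" passes while "bp" does not.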

Glides must follow a word break, vowel, diphthong, or y, and be followed
by a vowel, diphthong, or y. i as a glide can't follow a diphthong
ending in i, and u as a glide can't follow the diphthong au.

h can't be next to a consonant, glide, or glottal stop.

Vowels, diphthongs, and y can be next to consonants, glides, h, and word
breaks.

=== Syllables ===

These are the shapes syllables {slaka} can have:

* Vowel syllable
- a word break, a glide, or up to three consonants
- then a vowel or a diphthong
- then optionally a consonant
- e.g. .a, spa, pan, blaif, stra

* h-syllable
- the letter '
- then a vowel or diphthong
- then optionally a consonant
- e.g. 'u, 'ei, 'am

* y-syllable
- a word break, a glide, or up to three consonants
- then the letter y
- e.g. by, .y, gry, zbly

* hy-syllable
- the string "'y"

* consonantal syllable {zunsnaslaka}
- a consonant
- then a syllabic consonant
- e.g. fl, sm, rn

When a syllable starts with more than one consonant, the rules for these
clusters {zunsnagri} are more restrictive than the general ones above.
These are the permissible initial doubles, stolen with love from CLL:

pl pr fl fr
bl br vl vr

cp cf ct ck cm cn cl cr
jb jv jd jg jm
sp sf st sk sm sn sl sr
zb zv zd zg zm

tc tr ts kl kr
dj dr dz gl gr

ml mr xl xr

And the permissible initial triples:

cfr cfl sfr sfl jvr jvl zvr zvl
cpr cpl spr spl jbr jbl zbr zbl
ckr ckl skr skl jgr jgl zgr zgl
ctr str jdr zdr
cmr cml smr sml jmr jml zmr zml

When segmenting text into syllables, when a consonant could possibly
either start a syllable or end one, it's always taken to start one. In
other words, onsets are greedy, codas are lazy.
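
As a rough sketch of "onsets are greedy, codas are lazy", the two tables above can be loaded into sets and used to split a consonant cluster between two syllables. The names `valid_onset` and `split_cluster` are invented for this example, and the code does not check the coda (which may hold at most one consonant).

```python
INITIAL_PAIRS = set("""
pl pr fl fr bl br vl vr
cp cf ct ck cm cn cl cr
jb jv jd jg jm
sp sf st sk sm sn sl sr
zb zv zd zg zm
tc tr ts kl kr
dj dr dz gl gr
ml mr xl xr
""".split())

INITIAL_TRIPLES = set("""
cfr cfl sfr sfl jvr jvl zvr zvl
cpr cpl spr spl jbr jbl zbr zbl
ckr ckl skr skl jgr jgl zgr zgl
ctr str jdr zdr
cmr cml smr sml jmr jml zmr zml
""".split())


def valid_onset(cluster: str) -> bool:
    """True if `cluster` (consonants only) may start a syllable."""
    if len(cluster) <= 1:
        return True
    if len(cluster) == 2:
        return cluster in INITIAL_PAIRS
    if len(cluster) == 3:
        return cluster in INITIAL_TRIPLES
    return False


def split_cluster(cluster: str) -> tuple[str, str]:
    """Split a consonant cluster into (coda, onset), making the onset as long as possible."""
    for coda_len in range(len(cluster)):
        if valid_onset(cluster[coda_len:]):
            return cluster[:coda_len], cluster[coda_len:]
    return cluster, ""
```

For example, `split_cluster("st")` keeps both consonants in the onset, as in pa,stu, while `split_cluster("dl")` has to leave "d" behind as a coda, as in ved,li.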

=== Words ===

Words can be cmavo, cmevla, or brivla. cmavo and brivla are made of
syllables, while cmevla are free strings of phonemes.

cmavo are composed of:

- one vowel- or y-syllable, with at most one initial consonant and no
final consonant
- optionally followed by any number of h- or hy-syllables without any
final consonants

Examples: .a, ba, bai, ba'i, ba'ai, by, by'i, ia, iai, iy, ua'ai'y

There are two exceptions: "ybu", also spelled "y.bu", is a single cmavo
despite the medial consonant and word break, and "y" surrounded by word
breaks and not followed by "bu" is a word break itself, not a cmavo.

cmavo can be stressed on any syllable.
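
The cmavo shape described above is simple enough to approximate with a regular expression. This is a sketch under a few assumptions: the leading word break is treated as optional in writing, the "ybu" and lone-"y" exceptions are not handled, and the name `looks_like_cmavo` is made up for the example.

```python
import re

CONSONANT = "[bcdfgjklmnprstvxz]"
GLIDE = "[iu]"
# A nucleus is a diphthong, a single vowel, or y.
NUCLEUS = "(?:au|ai|ei|oi|[aeiou]|y)"

# Optional word break, optional single consonant or glide, one nucleus,
# then any number of ' + nucleus syllables with no final consonant.
CMAVO_SHAPE = re.compile(rf"\.?(?:{CONSONANT}|{GLIDE})?{NUCLEUS}(?:'{NUCLEUS})*")


def looks_like_cmavo(word: str) -> bool:
    """True if `word` has the cmavo shape described above (shape only)."""
    return CMAVO_SHAPE.fullmatch(word) is not None
```

All of the examples listed above (.a, ba, bai, ba'i, ba'ai, by, by'i, ia, iai, iy, ua'ai'y) match this pattern, while strings such as "bab" or "brivla" do not.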

cmevla are arbitrary strings of phonemes, following phoneme but not
syllable restrictions, starting with a word break, containing no word
breaks, and ending with a consonant followed by a word break. They can
be stressed on any vowel, diphthong, or syllabic consonant.

A brivla is composed of any number of initial rafsi followed by a final
rafsi. It must begin with a vowel syllable, end with a vowel- or
h-syllable, and have at least two syllables. It may not be a slinkuhi,
and may not start with a sequence of cmavo that yields a valid word when
removed. Stress (marked here with a grave accent) is on the second-last
vowel- or h-syllable.

A final rafsi is:

- a zihevla:
- a vowel syllable
- followed by any number of vowel, h-, or consonantal syllables
- followed by a vowel- or h-syllable with no final consonant
- is not a gismu or sequence of more than one rafsi
- e.g. cpi,kù,ku àl,ga fì,pr,koi glàu,ka sprà,'e
- or a gismu:
- a CV vowel syllable followed by a CCV one
- or a CVC one then a CV one
- or a CCV one then a CV one
- e.g. pà,stu vèd,li tsà,ni
- or a short final rafsi:
- a CVV or CCV vowel syllable, e.g. xau, cpa
- or a CV vowel syllable followed by a 'V h-syllable,
e.g. fà'i

An initial rafsi is any one of these:

- a gismu followed by the syllable "'y"
e.g. fasnu'y
- a gismu with its final vowel replaced with y
e.g. fasny
- a zihevla followed by the syllable "'y"
e.g. sorpeka'y
- a CV vowel syllable followed by a Cy y-syllable
e.g. fa,ky
- a short y-less rafsi, unless the following rafsi is a zihevla rafsi:
- a vowel syllable of the form CVV, CVVr, CVC, or CCV
- or a CV syllable followed by a 'V or 'Vr syllable
e.g. gau gaur gas jbu li,'a li,'ar
- a short y-less rafsi followed by a short final rafsi followed by "'y"
e.g. cau,cni,'y ri,'ar,ju,'o,'y mul,fau,'y, jbo,jbe,'y
- a zihevla that ends in a vowel syllable with its final vowel replaced
with y, unless the result breaks up into a string of any other rafsi
e.g. ka,'or,ty a,sny

If a CVVr or CV'Vr rafsi is followed by a rafsi beginning with "r", and
only then, the final "r" of the first rafsi is replaced with an "n".
If a rafsi ending in "y" is followed by a rafsi beginning with a vowel,
and only then, an "'" is prepended to the second rafsi. In other
situations where sticking two rafsi together violates phoneme or
syllable rules, the left rafsi needs to be replaced with one ending with
"y".

A brivla consisting of just a zihevla is called a zihevla, one
consisting of just a gismu is a gismu, and all others are called lujvo.
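
The first two gluing rules above (r becoming n before r, and ' inserted after y before a vowel) can be sketched in a few lines. This is a toy under stated assumptions: `join_rafsi` is a made-up name, the shape test for CVVr / CV'Vr rafsi does not check that VV is a real diphthong, and the third rule (switching to a y-final rafsi when the result breaks phoneme or syllable rules) is not implemented.

```python
import re

VOWELS = "aeiou"
# Rough shape test for CVVr and CV'Vr rafsi (does not validate the diphthong).
R_FINAL_RAFSI = re.compile(r"[bcdfgjklmnprstvxz][aeiou]'?[aeiou]r")


def join_rafsi(left: str, right: str) -> str:
    """Glue two non-empty rafsi together, applying the r->n and y-' adjustments."""
    if right.startswith("r") and R_FINAL_RAFSI.fullmatch(left):
        left = left[:-1] + "n"  # avoid "rr" by using n instead
    if left.endswith("y") and right[0] in VOWELS:
        right = "'" + right  # keep y and the vowel apart
    return left + right
```

So `join_rafsi("gaur", "tcini")` leaves the pieces untouched, while a right-hand rafsi beginning with "r" or a vowel triggers one of the two adjustments.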

A slinkuhi {valslinku'i} is a [consonant followed by a brivla that up to
its first y-syllable, or if no y-syllables, in its entirety, is composed
of non-zihevla rafsi] that itself can't be broken up into a string of rafsi.
e.g. _p_rà,'i _s_pòr,te _z_bla,zdà,vro _c_nar,jy,fra,gà,ri

Other non-words also behave like slinkuhi, in that prepending a cmavo
makes them a word, but these arise from rules other than the one named
slinkuhi.
e.g. cpa cpau cpra cprau (brivla must have 2+ syllables)
cl,pàr,nu (brivla must start with a vowel syllable)

A tosmabru {valrtosmabru} is a sequence of cmavo followed by a brivla.
tosmabru can be coerced into being brivla by adding a consonant at the
end of the last syllable of the first cmavo.

e.g. gau,tcì,ni -> gau tcini; cmavo + gismu
gaur,tcì,ni -> gaurtcini; a single lujvo
.a,'u,nain,mo -> .a'u nainmo; cmavo + zi'evla
.a,'ur,nain,mo -> .a'urnainmo; a single zihevla
boi,kèi,foi -> boi kèi foi; three cmavo
boir,kèi,foi -> boirkeifoi; a single lujvo

=== Word breaks, glottal stops ===

All word breaks may be pronounced as glottal stops, and some word breaks
have to. Glottal stops are required before and after all cmevla, as well
as before all words starting with a vowel or "y". They are also required
after certain cmavo:

- When pronouncing two words together would break a phonotactic rule,
they need to be separated with a glottal stop.
e.g. "au" "uàn,mo" -> {.au .uanmo}

- Each pair of cmavo of the form CV Cy followed by either a brivla or a
cmavo of the form CVV or CV'V needs a glottal stop between the last
and second-last word.
e.g. "ca" "vy" "càr,vi" -> {ca vy. carvi} /Sa.vy?.'Sar.vi/
(/Sa.vy.'Sar.vi/ would be {cavycarvi}, a lujvo)

- Every stressed cmavo followed by a brivla starting with a consonant
cluster needs a glottal stop after the cmavo.
e.g. "bà" "sna,jù,'i" -> {bà. snaju'i} /'ba?.sna.'Zu.hi/
(/'ba.sna.'Zu.hi/ would be {basna jù'i}, a gismu and a cmavo)

=== Parser peculiarities ===

jbofihe, popular before camxes came along, has different rules than camxes.

* Vowel syllables

- They may start with any number of consonants, and the rule for
initial triples doesn't exist. The only restriction is that all
pairs in the initial cluster need to be valid initial pairs.
e.g. {stsmla'u} is a word

- They may end with up to two consonants, not just one.
e.g. {bongnanba} is a word

- Syllables beginning with glides are their own type, and if not
preceded by a glottal stop, they continue the word like an
h-syllable.
e.g. {.aierne} is one word, not two,
{.ia} always starts with a glottal stop

- Syllables beginning with vowels don't require a word boundary
before them.
e.g. {sincrboa} is a word, {.joan.} is a word

(Or, more accurately, jbofihe has no notion of syllables in the sense
that camxes does, but even under jbofihe practically no one would use
words that violated these modified syllable rules)

* cmevla

Dotside doesn't apply: the beginning of cmevla can also be delimited by
some cmavo, namely {la}, {lai}, {la'i}, or {doi}. If one of these cmavo
precedes a cmevla, no initial glottal stop is required. cmevla can't
contain any of these cmavo. For example {la .larfin.} parses as three
words, "la" "la" "rfin"

* brivla

zihevla as final rafsi, rafsi beginning with vowels, and rafsi ending in
"'y" do not exist.
e.g. {bardykentauru}, {.algyro'i}, {sorpeka'ykla} aren't words

rafsi with CVCy shape are illegal if the corresponding CVC rafsi is
legal in the situation.
e.g. {jbobanyjvo} isn't a word, only {jbobanjvo} is

rafsi with CVVr or CV'Vr shape are only recognized as rafsi if using the
corresponding CVV or CV'V rafsi would result in tosmabru.
e.g. {lerpi'oci'arci'e} is a zihevla,
{lerpi'oci'aci'e} is a lujvo,
{ci'arci'e} is a lujvo

All brivla must have a consonant cluster within the first five letters
after ' and y are removed. {ko'oinde} is not a word.
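
A minimal sketch of this last jbofihe check, assuming the whole cluster has to fall inside the first five letters and that the input contains no commas or dots (`has_early_cluster` is a name invented for the example):

```python
CONSONANTS = set("bcdfgjklmnprstvxz")


def has_early_cluster(word: str) -> bool:
    """True if two adjacent consonants occur within the first five letters,
    counting letters after ' and y have been removed."""
    letters = word.replace("'", "").replace("y", "")[:5]
    return any(a in CONSONANTS and b in CONSONANTS
               for a, b in zip(letters, letters[1:]))
```

With {ko'oinde}, the remaining letters start "kooin", so the "nd" cluster begins too late and the check fails.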

----------------------------------------------------------------------

I hope that I didn't overlook too many rules and that the text is fairly
understandable. Do tell if something is wrong or unclear.

mu'o do

selpa'i

Apr 19, 2015, 6:24:38 PM
to lojban-b...@googlegroups.com
la cirko cu cusku di'e
> ------ MEZOHE'S CONDENSED CLL 2.0 MORPHOLOGY CHAPTER COUNTERFEIT ------
>
> [snip]
>
> I hope that I didn't overlook too many rules and that the text is fairly
> understandable. Do tell if something is wrong or unclear.

*Applause*

A very nice summary, and surely very time-consuming to put together.
Reading through it, it's no wonder so many people are confused or
overwhelmed by Lojban's morphology: it's extremely complex (too complex?
pau nai ru'e). It's easy to "forget" how complicated it is when one is
already so familiar with all of its hundreds of rules.

mi'e la selpa'i mu'o

Jorge Llambías

Apr 19, 2015, 7:52:29 PM
to lojban-b...@googlegroups.com
On Sun, Apr 19, 2015 at 5:19 PM, mezohe <wow...@gmail.com> wrote:

> ------ MEZOHE'S CONDENSED CLL 2.0 MORPHOLOGY CHAPTER COUNTERFEIT ------

Very impressive. And more succinct than I expected it would have to be, given all the complexity.  
A few minor comments:

> i and u are vowels, unless a vowel or diphthong follows, in which case they are glides. Glide-diphthong pairs win over glide-vowel pairs, which win over diphthongs.

Not sure about "Glide-diphthong pairs win over glide-vowel pairs". "iaua" is "ia,ua" not "iau,a" so the glide-vowel wins in this case. Except that "au" can't really be a diphthong here because a diphthong can't be directly followed by a vowel, but I'm not sure if that's supposed to be understood at this stage, since that rule hasn't been mentioned yet.

 
> i as a glide can't follow a diphthong ending in i, and u as a glide can't follow the diphthong au.

This rule hasn't been implemented yet. Currently it's just that a glide can't follow a diphthong.
 
> h can't be next to a consonant, glide, or glottal stop.

Nor can it start or end a word. Alternatively: "h must be between one vowel/diphthong and another". 

> * y-syllable
>   - a word break, a glide, or up to three consonants
>   - then the letter y
>   - e.g. by, .y, gry, zbly
>
> * hy-syllable
>   - the string "'y"
 
y-syllables and hy-syllables could in principle end in a consonant too, although this is never actually realized in brivla. But there is a proposal to make the r-hyphen of type-3 fu'ivla "-yr-", which would introduce y-syllables with codas in fu'ivla. (And eliminate the need for -l- as a variation of the type-3 r-hyphen.)

> A final rafsi is:
>
> - a zihevla:
>   - a vowel syllable
>   - followed by any number of vowel, h-, or consonantal syllables
>   - followed by a vowel- or h-syllable with no final consonant
>   - is not a gismu or sequence of more than one rafsi
>   - e.g. cpi,kù,ku  àl,ga  fì,pr,koi  glàu,ka  sprà,'e
> - or a gismu:
> ...

I think the definition of zihevla also needs to say that it may not start with a sequence of cmavo that yields a valid word when removed, otherwise "fa'i" would count as a zi'evla, even if it would never make it as a brivla.

> A slinkuhi {valslinku'i} is a [consonant followed by a brivla that up to its first y-syllable, or if no y-syllables, in its entirety, is composed of non-zihevla rafsi] that itself can't be broken up into a string of rafsi.
>   e.g. _p_rà,'i  _s_pòr,te  _z_bla,zdà,vro  _c_nar,jy,fra,gà,ri

The part following the consonant doesn't have to be brivla, just a string of one or more rafsi, as your first example shows. "pra'i" or "pra'ira'i" are slinku'i even though the consonant is not followed by a brivla.

> - Each pair of cmavo of the form CV Cy followed by either a brivla or a
>   cmavo of the form CVV or CV'V needs a glottal stop between the last
>   and second-last word.
>     e.g. "ca" "vy" "càr,vi" -> {ca vy. carvi} /Sa.vy?.'Sar.vi/
>          (/Sa.vy.'Sar.vi/ would be {cavycarvi}, a lujvo)

or alternatively between the first and second word: "ca.vycarvi" should also work.

Very good description of the morphology!

Stela Selckiku

Apr 20, 2015, 2:29:42 AM
to lojban-b...@googlegroups.com
I feel like it should be noted that while it's useful to understand this exact algorithm, processing that entire algorithm afresh is of course not how words are produced or understood at conversational speed in normal sentences. The way the rules are enacted at speed is (as with many things in language) through broad, fast pattern recognition. The rules produce numerous patterns, each of which gradually becomes increasingly familiar, then obvious, then reflexive.

For instance, one of the particular patterns implied by the full set of rules is that you can put together any two CCV rafsi to make a CCVCCV lujvo like {brivla}. Without understanding all of the implications and edge cases of the whole set of rules, you can simply take this example and start producing that shape of lujvo with whatever CCV rafsi you know. How about CVV and CV'V rafsi then? Those work there too: CCVCV'V, CCVCVV, CVVCCV, CV'VCCV, all good. The only complication is that with CVV + CVV you'll need an -r- hyphen (or very occasionally an -n-), which makes sense if you compare with how CVV CVV is two cmavo. Knowing a few shapes like that gives you a zillion combinations, so you can start making lujvo all day long. Then, when you learn which CCC are OK, you can feel confident making CVCCCV or CVCyCCV shaped lujvo too, etc.

Similarly you don't need to know every zi'evla shape in the universe to start making zi'evla. Every word you know is an example of which forms are allowed. If you learn the word {sorpeka} (bus) then you've also learned that you can make any word of that shape, like IDK, {bamnosa} or {morpuki}. I don't need to compare {morpuki} to some complicated algorithm-- I can compare it to {sorpeka} and see that it's the same shape so it's allowed and it's in that category.
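
Comparing a new word against a known one by shape is easy to mechanize. Here is a minimal sketch (the helper name `shape` is made up, and it only looks at letter classes, so it says nothing about whether the consonant clusters involved are actually permitted):

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
VOWELS = set("aeiou")


def shape(word: str) -> str:
    """Map each consonant to C and each vowel to V; keep ' and y as they are."""
    out = []
    for ch in word:
        if ch in CONSONANTS:
            out.append("C")
        elif ch in VOWELS:
            out.append("V")
        else:
            out.append(ch)  # ', y, and anything else pass through unchanged
    return "".join(out)
```

With this, `shape("morpuki") == shape("sorpeka")` holds, which is the same comparison described above, and `shape("brivla")` comes out as the CCVCCV pattern.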

The algorithms are just a clear collectively understood way of thinking about unusual or unused forms, unexplored edge cases. The normal way you speak Lojban is by knowing words and their categories and making other words shaped the same shapes as the words you already know. It's generally pretty easy and fun!

mu'o mi'e la stela selckiku

mezohe

Apr 20, 2015, 2:46:36 AM
to lojban-b...@googlegroups.com
de'i li 2015-04-20 ti'u li 01:52 la .xorxes. di'e cusku:
> A few minor comments:

Thanks for the corrections. I've found a few more points which I'll
handle below.

> i as a glide can't follow a diphthong ending in i, and u as a glide
> can't follow the diphthong au.
>
>
> This rule hasn't been implemented yet. Currently it's just that a glide
> can't follow a diphthong.

It is in the experimental grammar though, and seems to be the only
difference in morphology between official and experimental.

- diphthong = (a i / a u / e i / o i) !nucleus !glide
+ diphthong = (a u !u / (a i / e i / o i) !i) !nucleus

Was there any opposition to having this in the official grammar, other
than opposition to glides in general not needing a glottal stop? The
change doesn't affect any existing valid parses (or does it?)

de'i li 2015-04-19 ti'u li 22:19 la mezohe di'e cusku:
> An initial rafsi is any one of these:
> [...]
> - a zihevla that ends in a vowel syllable with its final vowel replaced
> with y, unless the result breaks up into a string of any other rafsi
> e.g. ka,'or,ty a,sny

Firstly the final syllable can only have a vowel, not a diphthong, and
secondly, in the current grammar, it can't start with a glide, only a
consonant. (Is there a reason for that second restriction?)

> Other non-words also behave like slinkuhi, in that prepending a cmavo
> makes them a word, but these arise from rules other than the one named
> slinkuhi.

A cmavo containing no "y", that is.

> A tosmabru {valrtosmabru} is a sequence of cmavo followed by a brivla.
> tosmabru can be coerced into being brivla by adding a consonant at the
> end of the last syllable of the first cmavo.

Again, the cmavo can't contain "y"

(jbofihe comparison section)
> - Syllables beginning with glides are their own type, and if not
> preceded by a glottal stop, they continue the word like an
> h-syllable.
> e.g. {.aierne} is one word, not two,
> {.ia} always starts with a glottal stop

This ignores words like {biardo}, which camxes used to accept, and
{bliardo}, which I don't remember it ever doing. More accurately said,
what camxes sees as a glide-vowel pair, jbofihe sees as a diphthong, and
together with the next rule diphthong-glide strings continue to be allowed.

> - Syllables beginning with vowels don't require a word boundary
> before them.
(with vowels or diphthongs)
> e.g. {sincrboa} is a word, {.joan.} is a word

{bliardo}, {bauiardo}, {.aierne}, {malrfiasko} are all words.

Also, strings like {fiasko} and {mrafia} are not words in jbofihe,
though consonant-glide-happy camxes did accept them, since they are seen
as lujvo-shaped but with illegal rafsi.

MorphemeAddict

Apr 20, 2015, 4:23:12 PM
to lojban-b...@googlegroups.com
Very well done!

As I was reading through it, I wanted more examples, but looking back for a spot that lacks one, I can't find any.

stevo




Jorge Llambías

Apr 20, 2015, 7:18:08 PM
to lojban-b...@googlegroups.com
On Mon, Apr 20, 2015 at 3:46 AM, mezohe <wow...@gmail.com> wrote:
> de'i li 2015-04-20 ti'u li 01:52 la .xorxes. di'e cusku:
>> This rule hasn't been implemented yet. Currently it's just that a glide
>> can't follow a diphthong.
>
> It is in the experimental grammar though, and seems to be the only
> difference in morphology between official and experimental.
>
> - diphthong = (a i / a u / e i / o i) !nucleus !glide
> + diphthong = (a u !u / (a i / e i / o i) !i) !nucleus
>
> Was there any opposition to having this in the official grammar, other
> than opposition to glides in general not needing a glottal stop? The
> change doesn't affect any existing valid parses (or does it?)

It's just a slightly more complicated rule. OK, I'll change it to  

 diphthong = (a i !i / a u !u / e i !i / o i !i) !nucleus

just because I don't like rules with indented brackets.
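
To see what the revised rule does, here is a hand-written Python transcription of it. It is a sketch, not camxes itself: the `vowel` and `nucleus` helper rules are assumptions about how the surrounding grammar is defined (they are not quoted in this thread), and the functions work on plain strings rather than on real parser state.

```python
VOWELS = "aeiou"


def nucleus(s: str, i: int) -> bool:
    """Assumed helper rule: nucleus <- vowel / diphthong / y !nucleus"""
    if i >= len(s):
        return False
    return vowel(s, i) or diphthong(s, i) or (s[i] == "y" and not nucleus(s, i + 1))


def vowel(s: str, i: int) -> bool:
    """Assumed helper rule: vowel <- (a / e / i / o / u) !nucleus"""
    return i < len(s) and s[i] in VOWELS and not nucleus(s, i + 1)


def diphthong(s: str, i: int) -> bool:
    """diphthong <- (a i !i / a u !u / e i !i / o i !i) !nucleus"""
    pair = s[i:i + 2]
    if pair not in ("ai", "au", "ei", "oi"):
        return False
    if s[i + 2:i + 3] == pair[1]:
        return False  # ai/ei/oi may not be followed by i, au not by u
    return not nucleus(s, i + 2)
```

Under these assumptions, the "au" in "iaua" is not a diphthong because a vowel follows it directly, "ai" in "aiia" is rejected by the new `!i` check, and "ai" in "aiua" is accepted because the following "u" is a glide rather than a nucleus.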


> de'i li 2015-04-19 ti'u li 22:19 la mezohe di'e cusku:
>> An initial rafsi is any one of these:
>> [...]
>> - a zihevla that ends in a vowel syllable with its final vowel replaced
>>    with y, unless the result breaks up into a string of any other rafsi
>>      e.g. ka,'or,ty  a,sny
>
> Firstly the final syllable can only have a vowel, not a diphthong, and secondly, in the current grammar, it can't start with a glide, only a consonant. (Is there a reason for that second restriction?)

I don't remember, the grammar of glides has been unstable, so maybe there was a reason at some point. I think "iy" and "uy" used to be "reserved for future use". I'll change it to:

stressed-fuhivla-rafsi <- fuhivla-head stressed-syllable !h onset y 

fuhivla-rafsi <- &unstressed-syllable fuhivla-head !h onset y h?

selpa'i

Apr 21, 2015, 1:17:38 PM
to lojban-b...@googlegroups.com
la .xorxes. cu cusku di'e
> I don't remember, the grammar of glides has been unstable, so maybe
> there was a reason at some point. I think "iy" and "uy" used to be
> "reserved for future use".

And we've managed to find a use for them, namely as lerfu for the
semi-vowels (which I like to write as ĭ and ŭ).

(see: http://jbovlaste.lojban.org/dict/uy and
http://jbovlaste.lojban.org/dict/iy )

mezohe

Apr 22, 2015, 5:01:00 AM
to lojban-b...@googlegroups.com
Another bug. Following these rules to the letter, words like
{gastro}/{neslau}/{ladzanja} would be split as
{ga,stro}/{ne,slau}/{la,dzan,ja} and classed as zihevla, not lujvo,
which in turn makes {cazgastro} and so on tosmabru. It seems like the
easiest fix is to say that rafsi are syllable-shaped strings, not
necessarily syllables.

de'i li 2015-04-19 ti'u li 22:19 la mezohe di'e cusku:
> When segmenting text into syllables, when a consonant could possibly
> either start a syllable or end one, it's always taken to start one. In
> other words, onsets are greedy, codas are lazy.
[...]
>
> A final rafsi is:
>
> - a zihevla:
> - a vowel syllable
> - followed by any number of vowel, h-, or consonantal syllables
> - followed by a vowel- or h-syllable with no final consonant
> - is not a gismu or sequence of more than one rafsi
> - e.g. cpi,kù,ku àl,ga fì,pr,koi glàu,ka sprà,'e
[...]
> - or a short final rafsi:
> - a CVV or CCV vowel syllable, e.g. xau, cpa
> - or a CV vowel syllable followed by a 'V h-syllable,
> e.g. fà'i
>
> An initial rafsi is any one of these:
[...]

TR NS

Apr 22, 2015, 9:10:25 PM
to lojban-b...@googlegroups.com
Thank you! That's very helpful, albeit *scary*!

If I find the time I might try to create a diagram out of it -- if at all possible.


TR NS

Apr 22, 2015, 9:28:21 PM
to lojban-b...@googlegroups.com


On Monday, April 20, 2015 at 2:29:42 AM UTC-4, la stela selckiku wrote:
> I feel like it should be noted that while it's useful to understand this exact algorithm, processing that entire algorithm afresh is of course not how words are produced or understood at conversational speed in normal sentences. The way the rules are enacted at speed is (as with many things in language) through broad, fast pattern recognition. The rules produce numerous patterns, each of which gradually becomes increasingly familiar, then obvious, then reflexive.


That makes sense. But it sounds a lot like what I would expect learning a *natural language*, not so much from a *logical* one.

TR NS

Apr 22, 2015, 9:32:21 PM
to lojban-b...@googlegroups.com


On Sunday, April 19, 2015 at 6:01:48 PM UTC-4, cirko wrote:

> - a zihevla:
>    - a vowel syllable
>    - followed by any number of vowel, h-, or consonantal syllables
>    - followed by a vowel- or h-syllable with no final consonant
>    - is not a gismu or sequence of more than one rafsi
>    - e.g. cpi,kù,ku  àl,ga  fì,pr,koi  glàu,ka  sprà,'e

So the only way to identify a zi'evla is by stress? e.g. `cpiku ku` vs. `cpikuku`

Pierre Abbat

Apr 23, 2015, 2:02:34 AM
to lojban-b...@googlegroups.com
You tell that a word is in greater brivla space (a term from my program
valfendi) because it ends in a vowel and its second consonant is either next
to another consonant or separated by "y" from a consonant. If you have stress
but not word divisions (i.e. you're hearing speech), the word division of a
brivla is such that the next-to-last syllable (not counting syllables whose
nucleus is "y" or a consonant) is stressed and the last syllable ends in a
vowel. You tell a zi'evla from a gismu because it is not a gismu or sequence
of more than one rafsi. (A single rafsi either doesn't have two consonants or
doesn't have two vowels.)

"cpíkuku" is a gismu and a cmavo: "cpiku ku". "cpikúku" is a zi'evla.
"dadysabodre" is in greater brivla space but is not a word. Since it has "y"
in it, it's a lujvo if it's a brivla; but it cannot be broken into rafsi.
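
The "greater brivla space" test described above can be sketched as follows. This is not code from valfendi; the function name is invented, and the sketch takes the description literally: the word ends in a vowel, and its second consonant is either adjacent to another consonant or separated from one by "y".

```python
CONSONANTS = "bcdfgjklmnprstvxz"
VOWELS = "aeiou"


def in_greater_brivla_space(word: str) -> bool:
    """Literal reading of the test above (hypothetical helper, not valfendi)."""
    if not word or word[-1] not in VOWELS:
        return False
    positions = [i for i, ch in enumerate(word) if ch in CONSONANTS]
    if len(positions) < 2:
        return False
    k = positions[1]  # index of the second consonant

    def is_consonant(i: int) -> bool:
        return 0 <= i < len(word) and word[i] in CONSONANTS

    def is_y(i: int) -> bool:
        return 0 <= i < len(word) and word[i] == "y"

    adjacent = is_consonant(k - 1) or is_consonant(k + 1)
    across_y = (is_y(k - 1) and is_consonant(k - 2)) or (is_y(k + 1) and is_consonant(k + 2))
    return adjacent or across_y
```

Both "cpikuku" and "dadysabodre" pass this test, which matches the examples above; whether the string is then an actual word still depends on stress and on whether it breaks into rafsi.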

Pierre
--
Jews use a lunisolar calendar; Muslims use a solely lunar calendar.

Jorge Llambías

Apr 23, 2015, 4:31:02 PM
to lojban-b...@googlegroups.com
On Thu, Apr 23, 2015 at 3:02 AM, Pierre Abbat <ph...@bezitopo.org> wrote:
> "dadysabodre" is in greater brivla space but is not a word. Since it has "y"
> in it, it's a lujvo if it's a brivla; but it cannot be broken into rafsi.
 
camxes will accept "dadysabodre" for "da dy sa bodre".

But "dadrysabodre" and "dradysabodre" are non-words.