camxes and syllabification in zi'evla


Alex Burka

Oct 23, 2014, 12:49:59 PM
to bpfk...@googlegroups.com
coi lai .bypyfyk.

I've been looking at the syllabification done by the camxes parser when it parses cmevla and zi'evla, and one detail surprised me.

camxes doesn't compute syllables for lujvo (it computes the component rafsi instead). I was considering how to compute the syllables for a lujvo, either by turning it into a zi'evla (by prepending valsr-) or into a cmevla (by appending -s). zo {mu'erskepre} ("physicist") is my subject. I changed the output in the PEG (but not the actual rules) a little to show me the syllables[1].

valsrmu'erskepre: val-sr-mu-'er-SKE-pr-e
mu'erskepres: mu-'er-ske-pres

The advantage of lujvo=>zi'evla is that you see where camxes thinks the stress is -- but I didn't expect that consonantal_syllable "pr". Is that really how you would pronounce that zi'evla? How many syllables does it have?

mu'o mi'e la durkavore

Jorge Llambías

Oct 23, 2014, 5:01:20 PM
to bpfk...@googlegroups.com
On Thu, Oct 23, 2014 at 1:49 PM, Alex Burka <dur...@gmail.com> wrote:

> valsrmu'erskepre: val-sr-mu-'er-SKE-pr-e
> mu'erskepres: mu-'er-ske-pres
>
> The advantage of lujvo=>zi'evla is that you see where camxes thinks the stress is -- but I didn't expect that consonantal_syllable "pr". Is that really how you would pronounce that zi'evla? How many syllables does it have?

That's a bug in the morphology, well found! I've changed the consonantal-syllable rule to:

  consonantal-syllable <- consonant syllabic !nucleus (consonant &spaces)?

Hopefully that should fix it.

mu'o mi'e xorxes

mukti

Oct 24, 2014, 6:39:57 AM
to bpfk...@googlegroups.com
On Thursday, October 23, 2014 6:01:20 PM UTC-3, xorxes wrote:
> That's a bug in the morphology, well found! I've changed the consonantal-syllable rule to:
>   consonantal-syllable <- consonant syllabic !nucleus (consonant &spaces)?

Using this rule change, I reverified the current contents of jbovlaste and the camxes test corpus. The following words in jbovlaste are no longer considered morphologically valid according to this rule:

cmevla:
klivlynd
smanyjinkytoldu'evir

fu'ivla:
artmozaiko
asnrlatna
asnrtarbi
badnrgrute
baknrzebu
bilmrtuberkulosi
catnrpepiskopo
ciblrmoru
cipnrdodo
cipnrdromai
cipnrfalko
cipnrfasani
cipnrkanario
cipnrkorvo
cipnrkuku
cipnrlaridei
cipnrlori
cipnrpaseru
cipnrpika
cipnrsagitariidai
cipnrsikonia
cipnrstrutio
cipnrxirundo
cipnrxuazine
cirlnrokforte
cirlnxiogluto
cirlrbri
cirlrceda
cirlrfeta
cirlrgorgonzola
cirlrkamumberti
cirlrmozarela
cirlrpanira
cirlrparmaregio
cirlrpreste
cirlrstilto
cirlrxalumi
cirlrxauda
dasrngeko
datnrzbaselpla
finprsinxnatfidai
finprsinxnatfinai
fipnrpetoikti
fipnrprotopteru
fiprntosfenu
gurnrbulguru
gurnrtefi
guzmrkukurbita
jatnrpapa
jinmrberilo
jinmrplati
jinmrtitani
jinmrtuli
jinmrxafni
jisrnxananase
jivnlragbi
jivnrfarzu'e
juknrfalangida
kamrngogolo
klaktno
koblrsinapi
kulnrfarsi
kulnrnorge
kulnrnorgo
kulnrsfe'enska
kulnrtai
kulnrturkie
kulnrturko
kulnrxirani
latmrbizanto
mabrnfuru
mastla
matnrmiristika
maxrnspelta
mivrlge
mudrnsia
mustlei
navnlrado
navnrkripto
navnrxeno
nimrnlatifolia
nimrnlimone
nimrnxaurantifolia
pipnrpiano
postmo
purmrderi
ranmrdrakono
rartni
ricrlbizi
ricrnsia
runtngasnrproni
sodnlrubidi
sodnrcesi
sodnrfransi
sodnrkali
sodnrlito
sparknipofia
srasrnrupia
tabrntromba
tabrnvuvuzela
venzla
vibnrbarpinji
xagrnklarineto
xagrnsaksofono
xaslrkianga
xipfne
xubrnre'u
xubrnrumeksa

zei-lujvo:
pipnrpiano zei konceto

Additionally, the following words from the camxes test corpus, which includes examples from CLL and text from "Alice", no longer parse:

bangrtlingana
banrtlingana
banrtlinganu
bilmrmautisma
bilmrdisleksia
bongnanba
cipnrparota
cipnrpisitako
cipnrxakuila
cirlrkotadja
danlrxelefanta
danlrkoralo
datnrselecti
gudjrati
ingmeme
kulnrmerka
kulnrperu
kulnrsu,omi
kulnrtcosena
mablrbastarda
mabrnmustela
natmrnorge
xukmrkokeina
xukmrkokeini

If these changes are approved, I will update jbovlaste, reclassifying the words as obsolete. I will also mark as unparseable the sentences in the test corpus which contain the problematic words.

mi'e la mukti mu'o

Jorge Llambías

Oct 24, 2014, 7:42:42 AM
to bpfk...@googlegroups.com
On Fri, Oct 24, 2014 at 7:39 AM, mukti <shun...@gmail.com> wrote:
> On Thursday, October 23, 2014 6:01:20 PM UTC-3, xorxes wrote:
>> That's a bug in the morphology, well found! I've changed the consonantal-syllable rule to:
>>   consonantal-syllable <- consonant syllabic !nucleus (consonant &spaces)?
>
> If these changes are approved, I will update jbovlaste, reclassifying the words as obsolete. I will also mark as unparseable the sentences in the test corpus which contain the problematic words.

No, most of those words are fine. An additional rule change is probably needed, most likely involving the codas. I'll check it later.

Jorge Llambías

Oct 24, 2014, 5:44:58 PM
to bpfk...@googlegroups.com
On Fri, Oct 24, 2014 at 7:39 AM, mukti <shun...@gmail.com> wrote:
> On Thursday, October 23, 2014 6:01:20 PM UTC-3, xorxes wrote:
>> That's a bug in the morphology, well found! I've changed the consonantal-syllable rule to:
>>   consonantal-syllable <- consonant syllabic !nucleus (consonant &spaces)?
>
> Using this rule change, I reverified the current contents of jbovlaste and the camxes test corpus. The following words in jbovlaste are no longer considered morphologically valid according to this rule:

OK, I think I may have figured it out. The correct change was to:

   consonantal-syllable <- consonant syllabic &(consonantal-syllable / !nucleus onset) (consonant &spaces)?

The only words that should remain in your list would be these, because they don't consist of valid syllables:

artmozaiko
klaktno
mastla
mustlei
postmo
rartni
venzla
xipfne
 
bangrtlingana
banrtlingana
banrtlinganu 
bongnanba 
gudjrati
ingmeme

Hopefully I got it right this time. Could you check again? Thanks!
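
For anyone who wants to check words by hand, here is a rough Python sketch of how the revised lookahead behaves. It is not camxes and makes strong simplifying assumptions: `!nucleus onset` is reduced to "the next letter is a consonant", glides are ignored, `&spaces` is approximated as "end of the string", and the function name `match_consonantal_syllable` is made up for illustration.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
SYLLABICS = set("lmnr")

def match_consonantal_syllable(word, i):
    """Return the index just past a consonantal syllable starting at i, or None.

    Simplified reading of:
      consonant syllabic &(consonantal-syllable / !nucleus onset) (consonant &spaces)?
    """
    if i + 1 >= len(word):
        return None
    if word[i] not in CONSONANTS or word[i + 1] not in SYLLABICS:
        return None
    j = i + 2
    # Lookahead: either another consonantal syllable follows, or something
    # that is not a nucleus but can start an onset (assumed: any consonant).
    if match_consonantal_syllable(word, j) is None:
        if j >= len(word) or word[j] not in CONSONANTS:
            return None
    # Optional word-final consonant (stand-in for "consonant &spaces").
    if j + 1 == len(word) and word[j] in CONSONANTS:
        return j + 1
    return j

print(match_consonantal_syllable("srmu", 0))   # 2: "sr" before "mu" is accepted
print(match_consonantal_syllable("pre", 0))    # None: "pr" directly before a vowel
print(match_consonantal_syllable("tno", 0))    # None: as in *{klaktno}
print(match_consonantal_syllable("tngas", 0))  # 2: as in {runtngasnrproni}
```

The point of the lookahead is only that a consonantal syllable may not be followed directly by a vowel; the real grammar additionally checks that the following onset is a permissible one.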

Jorge Llambías

Oct 24, 2014, 6:01:31 PM
to bpfk...@googlegroups.com
On Fri, Oct 24, 2014 at 6:44 PM, Jorge Llambías <jjlla...@gmail.com> wrote:

> The only words that should remain in your list would be these, because they don't consist of valid syllables:
>
> artmozaiko
> klaktno
> mastla
> mustlei
> postmo
> rartni
> venzla
> xipfne
>
> bangrtlingana
> banrtlingana
> banrtlinganu
> bongnanba
> gudjrati
> ingmeme

BTW, the standard way to "fix" these (if one wanted to keep the consonantal syllable) would be to add x- before the onset-less syllable, so we'd have:

artmxozaiko
klaktnxo
mastlxa
mustlxei
postmxo
rartnxi
venzlxa
xipfnxe
bangrtlxingana
banrtlxingana
banrtlxinganu
bongnxanba
gudjrxati
ingmxeme
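
The x- insertion above can be reproduced mechanically for this particular list. The following is only a heuristic sketch in Python, with a hypothetical helper name `add_x_buffer`: it looks for a consonant plus syllabic (l, m, n, r) that is preceded by a consonant and followed directly by a vowel, and puts "x" before that vowel. It knows nothing about permissible initials, so it is only meaningful for words already known not to parse (it would wrongly turn {mastra} into "mastrxa").

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
SYLLABICS = set("lmnr")
VOWELS = set("aeiouy")

def add_x_buffer(word):
    """Insert 'x' after the first consonant+syllabic pair that follows a
    consonant and is immediately followed by a vowel (heuristic only)."""
    for i in range(1, len(word) - 2):
        if (word[i - 1] in CONSONANTS
                and word[i] in CONSONANTS
                and word[i + 1] in SYLLABICS
                and word[i + 2] in VOWELS):
            return word[:i + 2] + "x" + word[i + 2:]
    return word  # nothing to fix under this heuristic

for w in ["artmozaiko", "klaktno", "mastla", "bangrtlingana", "gudjrati"]:
    print(w, "->", add_x_buffer(w))
# artmozaiko -> artmxozaiko
# klaktno -> klaktnxo
# mastla -> mastlxa
# bangrtlingana -> bangrtlxingana
# gudjrati -> gudjrxati
```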

Alex Burka

Oct 24, 2014, 6:11:14 PM
to bpfk...@googlegroups.com
Heh, I _thought_ *{bongnanba} seemed inordinately hard to pronounce. I applied this change and ran mukti's list through. The remaining non-parsing words are:

klivlynd
smanyjinkytoldu'evir

artmozaiko
finprsinxnatfidai
finprsinxnatfinai
klaktno
mastla
mustlei
postmo
rartni
runtngasnrproni
sparknipofia
venzla
xipfne

bangrtlingana
banrtlingana
banrtlinganu
bongnanba
gudjrati
ingmeme

So just two that you weren't expecting, nice! I didn't check any words not in mukti's list, though.

As for fixing, I'd expect that many of these words weren't expecting a consonantal syllable (*{klivlynd}, *{postmo}, *{artmozaiko}), so the fix might be to add a syllable: {artamozaiko} or {artnmozaiko}. I can't see how to make "Cleveland" without adding a pause to get two cmevla {.kliv.lynd.}.

mu'o mi'e la durkavore

Jorge Llambías

Oct 24, 2014, 6:50:07 PM
to bpfk...@googlegroups.com
On Fri, Oct 24, 2014 at 7:11 PM, Alex Burka <dur...@gmail.com> wrote:
> Heh, I _thought_ *{bongnanba} seemed inordinately hard to pronounce. I applied this change and ran mukti's list through. The remaining non-parsing words are:
>
> klivlynd
> smanyjinkytoldu'evir

I don't see why these two wouldn't parse: kliv,lynd and sma,ny,jin,ky,tol,du,'e,vir.
 
> runtngasnrproni

This one too should parse: run,tn,gas,nr,pro,ni

> So just two that you weren't expecting, nice! I didn't check any words not in mukti's list, though.

Maybe we need a full check given the runtngasnrproni case. 

Alex Burka

Oct 24, 2014, 8:41:18 PM
to bpfk...@googlegroups.com
Sorry, all three appear to be my bugs. mukti is re-running the full check just in case.

Jorge Llambías

Oct 24, 2014, 9:29:24 PM
to bpfk...@googlegroups.com
On Fri, Oct 24, 2014 at 9:41 PM, Alex Burka <dur...@gmail.com> wrote:
> Sorry, all three appear to be my bugs. mukti is re-running the full check just in case.

Great!   BTW, I should correct myself about "klivlynd". If I'm not mistaken camxes would say "kli,vlynd" rather than "kliv,lynd", because it prefers open syllables when there's a choice. Not that that makes a lot of difference for anything.

Gleki Arxokuna

Oct 25, 2014, 2:52:20 AM
to bpfk...@googlegroups.com
2014-10-25 2:01 GMT+04:00 Jorge Llambías <jjlla...@gmail.com>:

> On Fri, Oct 24, 2014 at 6:44 PM, Jorge Llambías <jjlla...@gmail.com> wrote:
>
>> The only words that should remain in your list would be these, because they don't consist of valid syllables:
>>
>> artmozaiko
>> klaktno
>> mastla
>> mustlei
>> postmo
>> rartni
>> venzla
>> xipfne
>>
>> bangrtlingana
>> banrtlingana
>> banrtlinganu
>> bongnanba
>> gudjrati
>> ingmeme


From a usability point of view: {postmxo} is infinitely harder to pronounce than {postmo}.
 
> BTW, the standard way to "fix" these (if one wanted to keep the consonantal syllable) would be to add x- before the onset-less syllable, so we'd have:
>
> artmxozaiko
> klaktnxo
> mastlxa
> mustlxei
> postmxo
> rartnxi
> venzlxa
> xipfnxe
> bangrtlxingana
> banrtlxingana
> banrtlxinganu
> bongnxanba
> gudjrxati
> ingmxeme
>
> mu'o mi'e xorxes

mukti

Oct 25, 2014, 4:52:03 AM
to bpfk...@googlegroups.com
On Friday, October 24, 2014 6:44:58 PM UTC-3, xorxes wrote:
> OK, I think I may have figured it out. The correct change was to:
>    consonantal-syllable <- consonant syllabic &(consonantal-syllable / !nucleus onset) (consonant &spaces)?
> Hopefully I got it right this time. Could you check again?

With this change, the non-parsing words are ...

jbovlaste cmevla:
klivlynd
smanyjinkytoldu'evir

jbovlaste fu'ivla:
artmozaiko
finprsinxnatfidai
finprsinxnatfinai
klaktno
mastla
mustlei
postmo
rartni
sparknipofia
venzla
xipfne

non-jbovlaste (test corpus):
bangrtlingana
banrtlingana
banrtlinganu
bongnanba
gudjrati
ingmeme

... which is to say xorxes' prediction, as amended by durka -- and with the exception of {runtngasnrproni}, which is accepted as a fu'ivla -- is correct.

{klivlynd} and {smanyjinkytoldu'evir} seem to be failing due to faulty slinku'i detection, which durka has patched -- I'll apply his patch and run the batch one more time.

Jorge Llambías

Oct 25, 2014, 7:22:41 AM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 3:52 AM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:

> From a usability point of view: {postmxo} is infinitely harder to pronounce than {postmo}.

If it was up to me, I would disallow consonantal syllables altogether, and change the rule for type-3 fu'ivla to CVC(C)yrC...V.

Consonantal syllables don't occur in the core lojban words cmavo/gismu/lujvo; they were introduced for type-3 fu'ivla. But the designers should have made do with what they already had instead of adding new complications.

Jorge Llambías

Oct 25, 2014, 7:27:24 AM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 5:52 AM, mukti <shun...@gmail.com> wrote:

> {klivlynd} and {smanyjinkytoldu'evir} seem to be failing due to faulty slinku'i detection, which durka has patched -- I'll apply his patch and run the batch one more time.

Shouldn't the test for cmevla be done before the test for fu'ivla? Why would these words ever be tested for slinku'i?

John Cowan

Oct 25, 2014, 11:36:27 AM
to bpfk...@googlegroups.com
Jorge Llambías scripsit:

> artmxozaiko
> kaktnxo
> mastlxa

etc. etc.

These words are far worse than their x-less equivalents. Better to reformulate
the grammar to allow art-mo-zai-ko.

--
John Cowan http://www.ccil.org/~cowan co...@ccil.org
Income tax, if I may be pardoned for saying so, is a tax on income.
--Lord Macnaghten (1901)

Gleki Arxokuna

Oct 25, 2014, 11:41:34 AM
to bpfk...@googlegroups.com
2014-10-25 19:36 GMT+04:00 John Cowan <co...@mercury.ccil.org>:
> Jorge Llambías scripsit:
>
>> artmxozaiko
>> klaktnxo
>> mastlxa
>
> etc. etc.
>
> These words are far worse than their x-less equivalents. Better to reformulate
> the grammar to allow art-mo-zai-ko.

.ie
I am fine with removing this word in particular.
However, in general I don't see any problems with pronouncing {mastla}.

Why is {mastra} a fine brivla while {mastla} should not be (morphological classes apart)?



Jorge Llambías

Oct 25, 2014, 11:46:07 AM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 12:36 PM, John Cowan <co...@mercury.ccil.org> wrote:
> Jorge Llambías scripsit:
>
>> artmxozaiko
>> klaktnxo
>> mastlxa
>
> etc. etc.
>
> These words are far worse than their x-less equivalents. Better to reformulate
> the grammar to allow art-mo-zai-ko.

I see that as a feature, not a problem. The syllable "-art-" may look fine to English speakers, but it's unlojbanic: lojban only allows single-consonant codas. "ar,tm,xo,zai,ko" is horrible, of course; consonantal syllables should be avoided. Why not make it "larmozaiko" instead?

Jorge Llambías

Oct 25, 2014, 11:48:23 AM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 12:41 PM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:

> However, in general I don't see any problems with pronouncing {mastla}.
>
> Why {mastra} is a fine brivla but {mastla} should not be (morphological classes apart)?

Because "tr" is a valid initial, while "tl" isn't. I wouldn't have a problem with making "tl" a valid initial, whereas I'm not happy with making "st" a valid coda.

John Cowan

Oct 25, 2014, 11:52:49 AM
to bpfk...@googlegroups.com
Gleki Arxokuna scripsit:

> Why {mastra} is a fine brivla but {mastla} should not be (morphological
> classes apart)?

Well, "tl" is not a valid syllable onset, and planet-wide, languages prefer
simpler codas than onsets. (I say "planet-wide" because Klingon is an
exception.)

Now there is nothing difficult about "mast-la", but our parsers are set up
to accept two consonants in the coda only if they have to. I think that's
an unnecessary limitation, as if "mast-la" were simpler than "mast-xla".

--
One of the oil men in heaven started a rumor of a gusher down in hell. All
the other oil men left in a hurry for hell. As he gets to thinking about
the rumor he had started he says to himself there might be something in
it after all. So he leaves for hell in a hurry. --Carl Sandburg

Gleki Arxokuna

Oct 25, 2014, 11:52:55 AM
to bpfk...@googlegroups.com
Any existing words are of no interest in this thread, which is about morphology, not about the lexicon or filling holes in it.


Jorge Llambías

Oct 25, 2014, 12:02:43 PM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 12:52 PM, John Cowan <co...@mercury.ccil.org> wrote:

> Now there is nothing difficult about "mast-la", but our parsers are set up
> to accept two consonants in the coda only if they have to.

camxes won't accept two consonants in the coda at all. 

cmevla aside, that is. In cmevla pretty much anything goes, even something like .ktktkptkp. is a valid cmevla now. But it's not a jbocmevla. jbocmevla has to consist of valid syllables, and the only exception for codas in cmevla is at the end of the word and it must have syllabic+consonant form.
 
> I think that's
> an unnecessary limitation, as if "mast-la" were simpler than "mast-xla".

camxes accepts neither. It does accept mas-tl-xa, because of the syllabic "tl".

And Rosta

Oct 25, 2014, 3:07:49 PM
to bpfk...@googlegroups.com
This discussion seems if not nonsensical then at least not very sensical.

What is the evidence that Lojban allows syllables other than CV?

I'm not saying there isn't any, but it's not obvious what it is. The (regrettable) existence of minimal pairs /Cia/:/Ciia/ seems to imply CGV syllables, and the (regrettable) existence of minimal pairs /aiC/:/aiiC/ seems to imply CVG syllables. But is there any other sort of syllable?

These morphophonological constraints on Lojban words seem to involve something other than syllables -- specifically a morphophonological entity rather than (as the syllable is) a phonological one. Once you get out of phonology and into morphophonology, notions of naturalness and crosslinguistic tendencies become less pertinent. So the morphophonological rules can be as weird and wacky as you like; tho without good reason, why would you want them to be?

But if Lojban syllabification is essentially CV (or simplex onset + simplex nucleus), give or take any complications with glides, why bother with a bunch of morphophonological constraints? Is it just because they are codified in CLL (albeit in erroneously phonological terms) and therefore cannot be abandoned? Would abandoning morphophonological constraints invalidate existing words or text?

I should add, btw, that even if you wanted to push the IMO untenable analysis of the buffer vowel being anaptyctic, it would be even more implausible to argue that anaptyxis occurs within a syllabic constituent, so either way the buffer vowel diagnoses syllable structure.

--And.


Alex Burka

Oct 25, 2014, 3:12:42 PM
to bpfk...@googlegroups.com
I don't see how you could replace all consideration of consonant clusters with CV + explicit buffer vowels. Wouldn't you lose the ability to specify which clusters are valid, which ones are valid initially, etc?

mu'o mi'e la durka

And Rosta

Oct 25, 2014, 3:26:04 PM
to bpfk...@googlegroups.com
Alex Burka, On 25/10/2014 20:12:
> I don't see how you could replace all consideration of consonant clusters with CV + explicit buffer vowels. Wouldn't you lose the ability to specify which clusters are valid, which ones are valid initially, etc?

You're quite right: these constraints on 'consonant clusters' cannot plausibly be phonological and must rather be morphophonological. The distribution of buffer vowels proves that phonological structure is essentially CV. While in principle, phonological constraints on Cs in /C%C/ sequences are not implausible, the particular constraints Lojban wants to impose are. Therefore, any constraints on 'clusters' are more plausibly morphophonological. Morphophonological rules can do whatever you like.

--And.


Jorge Llambías

Oct 25, 2014, 3:35:23 PM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 4:26 PM, And Rosta <and....@gmail.com> wrote:
> Alex Burka, On 25/10/2014 20:12:
>> I don't see how you could replace all consideration of consonant clusters with CV + explicit buffer vowels. Wouldn't you lose the ability to specify which clusters are valid, which ones are valid initially, etc?
>
> You're quite right: these constraints on 'consonant clusters' cannot plausibly be phonological and must rather be morphophonological. The distribution of buffer vowels proves that phonological structure is essentially CV. While in principle, phonological constraints on Cs in /C%C/ sequences are not implausible, the particular constraints Lojban wants to impose are. Therefore, any constraints on 'clusters' are more plausibly morphophonological. Morphophonological rules can do whatever you like.

Can we talk about "morphophonological syllables"? If yes, then assume this discussion is basically about morphophonological syllables rather than phonological ones. We need to identify these (at least to some extent) in order to know how to break up a stream of phonemes into words. 

Of course the morphophonological rules could have been much simpler than what they are, that cannot be disputed.

Jorge Llambías

Oct 25, 2014, 3:51:52 PM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 4:07 PM, And Rosta <and....@gmail.com> wrote:

> But if Lojban syllabification is essentially CV (or simplex onset + simplex nucleus), give or take any complications with glides, why bother with a bunch of morphophonological constraints? Is it just because they are codified in CLL (albeit in erroneously phonological terms) and therefore cannot be abandoned? Would abandoning morphophonological constraints invalidate existing words or text?

Abandoning the constraints against double consonants, voiced-unvoiced clusters, sibilant-sibilant clusters, x-c/k clusters and mz would not invalidate anything, as far as I can tell. The distinction between valid and invalid "onsets" is mostly what is needed to delimit words.

Jorge Llambías

Oct 25, 2014, 4:34:51 PM
to bpfk...@googlegroups.com
On Sat, Oct 25, 2014 at 4:35 PM, Jorge Llambías <jjlla...@gmail.com> wrote:

> Can we talk about "morphophonological syllables"? If yes, then assume this discussion is basically about morphophonological syllables rather than phonological ones.

Actually, that's not quite true. We do need to identify valid onsets in order to determine words, but this discussion wasn't really about onsets. 

I would have to agree that if the buffer vowel was real, all this discussion would be mostly nonsensical. So I would say that the buffer vowel is basically a myth. None of the phonological constraints make much sense if there was a buffer vowel. I don't think I've ever heard anyone speak lojban with a buffer vowel, and it would probably sound very confusing. Without a buffer vowel, it does make sense to limit the amount of consonant clustering that can occur. If there was a buffer vowel, the morphophonological syllable could still be onset-nucleus-coda as now, but with the coda allowed to contain as many consonants as you wanted. That's not how my dialect of lojban works though.

And Rosta

Oct 26, 2014, 4:48:56 AM
to bpfk...@googlegroups.com
Jorge Llambías, On 25/10/2014 21:34:
> On Sat, Oct 25, 2014 at 4:35 PM, Jorge Llambías <jjlla...@gmail.com> wrote:
>
> Can we talk about "morphophonological syllables"? If yes, then assume this discussion is basically about morphophonological syllables rather than phonological ones.

"Morphological syllables" (maybe renamed to something a little less susceptible to confusion) would be fine. My questions would then be what the rules are and why. The norms of language don't constrain morphophonological rules much, so they can be as weird and wacky as necessary. The rule you give below, CVC*, seems pretty straightforward.

> Actually, that's not quite true. We do need to identify valid onsets
> in order to determine words, but this discussion wasn't really about
> onsets.

The question about onsets being whether CGV is a valid onset?

But "morphological onsets" are needed too, aren't they. E.g. /patrAma/ is two words /pa trAma/ whereas /partAma/ is one word, because of the rules for morpho-onsets.

> I would have to agree that if the buffer vowel was real, all this
> discussion would be mostly nonsensical. So I would say that the
> buffer vowel is basically a myth. None of the phonological
> constraints make much sense if there was a buffer vowel.

Well, today's morphophonology is yesterday's phonology (e.g. the vowel alternation in _sane--sanity_), so it makes sense diachronically but not synchronically. But for Lojban you don't look for diachronic explanations. (In Lojban too the actual explanation is of course quasi-diachronic, in that the complex constraints on 'clusters' were likely invented before the buffer vowel.)

> I don't think I've ever heard anyone speak lojban with a buffer
> vowel, and it would probably sound very confusing.

FWIW I used to use one (with erroneous allophony) in certain environments (but treated it as metrically invisible), in particular in environments z_C, m_C (to avoid confusion with mbC) and, for obstruent C, /C_./.

> Without a buffer vowel, it does make sense to limit the amount of
> consonant clustering that can occur. If there was a buffer vowel,
> the morphophonological syllable could still be onset-nucleus-coda as
> now, but with the coda allowed to contain as many consonants as you
> wanted. That's not how my dialect of lojban works though.

In what way is it not how your dialect of Lojban works? It would categorize as valid some words that you categorize as invalid? Or would it insert word-boundaries differently? The latter seems more significant an objection than the former.

So anyway, do you advocate abolishing the buffer vowel? An alternative would be to insist that every licit phonological string has both a CV syllabification (with buffer vowels) and a resyllabification without buffer vowels. That alternative strikes me as needlessly complex, but as still preferable to abolishing the buffer vowel.

--And.


Gleki Arxokuna

Oct 26, 2014, 6:33:55 AM
to bpfk...@googlegroups.com
2014-10-25 23:51 GMT+04:00 Jorge Llambías <jjlla...@gmail.com>:


> On Sat, Oct 25, 2014 at 4:07 PM, And Rosta <and....@gmail.com> wrote:
>
>> But if Lojban syllabification is essentially CV (or simplex onset + simplex nucleus), give or take any complications with glides, why bother with a bunch of morphophonological constraints? Is it just because they are codified in CLL (albeit in erroneously phonological terms) and therefore cannot be abandoned? Would abandoning morphophonological constraints invalidate existing words or text?
>
> Abandoning the constraints against double consonants, voiced-unvoiced clusters, sibilant-sibilant clusters, x-c/k clusters and mz would not invalidate anything, as far as I can tell.
For those clusters, parsers could implement an autocorrection that automatically inserts {y} between the two consonants.
 
> The distinction between valid and invalid "onsets" is mostly what is needed to delimit words.
>
> mu'o mi'e xorxes
 


And Rosta

Oct 26, 2014, 8:43:48 AM
to bpfk...@googlegroups.com
And Rosta, On 26/10/2014 08:48:
> Jorge Llambías, On 25/10/2014 21:34:
>> Actually, that's not quite true. We do need to identify valid onsets
>> in order to determine words, but this discussion wasn't really about
>> onsets.
>
> The question about onsets being whether CGV is a valid onset?
>
> But "morphological onsets" are needed too, aren't they. E.g.
> /patrAma/ is two words /pa trAma/ whereas /partAma/ is one word,
> because of the rules for morpho-onsets.

Sorry, I see now that in another message you had indeed said (in effect) that word segmentation depends primarily on morpho-onsets, i.e. permissible word-initial /#C%C/ sequences.

If morpho-codas were unconstrained, and morpho-onsets were as per CLL (modulo any semivowel complications), what effects would that have on current lexis and usage?

--And.

Jorge Llambías

Oct 26, 2014, 8:52:25 AM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 5:48 AM, And Rosta <and....@gmail.com> wrote:

> "Morphological syllables" (maybe renamed to something a little less susceptible to confusion) would be fine. My questions would then be what the rules are and why. The norms of language don't constrain morphophonological rules much, so they can be as weird and wacky as necessary. The rule you give below, CVC*, seems pretty straightforward.

Here's the theory of morphological syllables in a nutshell. There are two kinds of syllables: vocalic and consonantal. Most syllables are vocalic. There are a total of 24,840 (or 25,020, depending on how the dot and apostrophe are counted) possible vocalic syllables, and 64 possible consonantal syllables.

The consonantal syllables are all of the form CR where C is any of the 17 consonants bcdfgjklmnprstvxz and R is any of lmnr different from C.
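
As a quick check of the figure 64, the CR pairs can be enumerated directly. This is just a throwaway Python sketch; the variable names are mine.

```python
CONSONANTS = "bcdfgjklmnprstvxz"  # the 17 consonants
SYLLABICS = "lmnr"                # the possible values of R

# Every C followed by every R, skipping the four doubled pairs ll, mm, nn, rr.
consonantal_syllables = [c + r for c in CONSONANTS for r in SYLLABICS if c != r]

print(len(consonantal_syllables))  # 64
print(consonantal_syllables[:6])   # ['bl', 'bm', 'bn', 'br', 'cl', 'cm']
```

That is 17 * 4 = 68 pairs minus the 4 where C and R coincide.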

The vocalic syllables consist of all possible combinations of (morphological) onset-nucleus-coda.

There are ten valid (morphological) nuclei: a, e, i, o, u, ai, ei, oi, au, y 

There are 138 or 139 valid (morphological) onsets: the dot/apostrophe (which are in complementary distribution, so they can be counted as either one onset or two), the two glides i/u, the 34 controversial CG, the 17 single consonants C, the 48 permissible initials CC listed in CLL, and 36 permissible initials CCC built from the permissible CC. 80 of the 84 CC(C) onsets fall within the pattern [csjz][ptkfxbdgvmn][lr] (with each position optional). The remaining 4 are tc, ts, dj, dz.

There are 18 valid (morphological) codas: the 17 consonants and the empty coda.

That gives 138*10*18 = 24,840  or  139*10*18 = 25,020 possible vocalic syllables. 
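
The counts above can be reproduced with a short Python sketch. It is only an illustration and not part of camxes: it assumes the 48 permissible initial pairs as listed in CLL (typed in by hand), builds the CCC onsets by requiring both overlapping pairs to be permissible while leaving out the affricates, and reads the pattern [csjz][ptkfxbdgvmn][lr] as having every slot optional. All names are made up for the example.

```python
import re

CONSONANTS = "bcdfgjklmnprstvxz"            # the 17 consonants
SYLLABICS = "lmnr"
# The 48 permissible initial pairs from CLL (entered by hand, an assumption).
INITIAL_PAIRS = (
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv"
).split()
AFFRICATES = {"tc", "ts", "dj", "dz"}

# Consonantal syllables: CR with R in lmnr and R different from C.
consonantal = [c + r for c in CONSONANTS for r in SYLLABICS if c != r]

# CCC onsets: both overlapping pairs are permissible, affricates excluded.
plain_pairs = [p for p in INITIAL_PAIRS if p not in AFFRICATES]
ccc = [a + b[1] for a in plain_pairs for b in plain_pairs if a[1] == b[0]]

# The pattern from the post, with every slot treated as optional.
pattern = re.compile(r"[csjz]?[ptkfxbdgvmn]?[lr]?")
outside_pattern = [x for x in INITIAL_PAIRS + ccc if not pattern.fullmatch(x)]

def count_onsets(dot_and_apostrophe_separate):
    """Add up the onset classes listed in the post."""
    breathing = 2 if dot_and_apostrophe_separate else 1
    glides = 2                                # i, u
    cg = len(CONSONANTS) * glides             # the contested CG onsets
    singles = len(CONSONANTS)
    return breathing + glides + cg + singles + len(INITIAL_PAIRS) + len(ccc)

NUCLEI = ["a", "e", "i", "o", "u", "ai", "ei", "oi", "au", "y"]
CODAS = [""] + list(CONSONANTS)               # empty coda plus 17 consonants

for separate in (False, True):
    onsets = count_onsets(separate)
    print(onsets, onsets * len(NUCLEI) * len(CODAS))
print(len(consonantal), len(ccc), sorted(outside_pattern))
```

The only hand-entered data is the list of initial pairs; everything else is derived, so a mistake in that list would show up as a mismatch with the totals quoted above.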

All words (except for cmevla) consist of a sequence of valid syllables. 

There are also some constraints on which syllables can be adjacent: the final consonant of a syllable and the first consonant of the next syllable can't be the same, they can't have different voicedness, they can't both be sibilants, if one is x the other can't be c or k, if the first is m the other can't be z. Also, a syllable that ends with n can't be followed by one that starts with an affricate (tc, ts, dj, dz). Some of these constraints sound completely arbitrary, and they are. In addition, some combinations are disallowed only because they give the same result as some other combination, e.g. tav+la = ta+vla. It doesn't make any difference if we say tav+la is disallowed, or if we say it's equivalent to ta+vla.
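
The adjacency constraints listed above can be written down as a small predicate. This is a hedged Python sketch, not the camxes rules themselves: the function name and its two-argument interface are invented for illustration, and it assumes (following CLL) that l, m, n, r are exempt from the voicing rule.

```python
VOICED = set("bdgjvz")
UNVOICED = set("cfkpstx")
SIBILANTS = set("cjsz")
AFFRICATES = ("tc", "ts", "dj", "dz")

def junction_ok(coda, next_onset):
    """Check the consonant junction between two adjacent syllables.

    coda is the final consonant of the first syllable ("" if it has none);
    next_onset is the onset of the following syllable.
    """
    if not coda or not next_onset:
        return True
    first = next_onset[0]
    if coda == first:
        return False                          # same consonant twice
    if (coda in VOICED and first in UNVOICED) or (coda in UNVOICED and first in VOICED):
        return False                          # mixed voicing (l, m, n, r exempt)
    if coda in SIBILANTS and first in SIBILANTS:
        return False                          # two sibilants
    if {coda, first} in ({"c", "x"}, {"k", "x"}):
        return False                          # x next to c or k
    if coda == "m" and first == "z":
        return False                          # mz
    if coda == "n" and next_onset.startswith(AFFRICATES):
        return False                          # n before an affricate
    return True

print(junction_ok("v", "l"))    # tav+la
print(junction_ok("m", "z"))    # as in .emzis.
print(junction_ok("n", "tc"))
```

The sketch does not model the tav+la = ta+vla equivalence; it only says whether a given junction breaks one of the listed constraints.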

Now, not every valid combination of valid syllables will result in a string of valid words. Here's where the rafsi madness comes into play. Syllables with a "y" nucleus in particular are very restricted in how they will combine, and most of them can never occur in any valid (non-cmevla) word. Consonantal syllables are also somewhat restricted in that they can't appear until a vocalic syllable has appeared (except again in cmevla). There are also sequences of valid syllables (called "slinku'i"), which cannot be a word or a sequence of words.

I think that's basically it, although I may be forgetting some detail or other.

>> Actually, that's not quite true. We do need to identify valid onsets
>> in order to determine words, but this discussion wasn't really about
>> onsets.
>
> The question about onsets being whether CGV is a valid onset?
>
> But "morphological onsets" are needed too, aren't they. E.g. /patrAma/ is two words /pa trAma/ whereas /partAma/ is one word, because of the rules for morpho-onsets.

Yes. By "this discussion" I meant the one that started this thread, which was about a bug in the PEG morphology that allowed onset-less syllables after consonantal syllables. My intention when writing the morphology PEG was that every syllable should have a non-empty onset. Because of the bug we had words like mas-tl-a and pos-tm-o.
 

> Well, today's morphophonology is yesterday's phonology (e.g. the vowel alternation in _sane--sanity_), so it makes sense diachronically but not synchronically. But for Lojban you don't look for diachronic explanations. (In Lojban too the actual explanation is of course quasi-diachronic, in that the complex constraints on 'clusters' were likely invented before the buffer vowel.)

Yes, and not only that. Since lujvo came after cmavo and gismu, fu'ivla came after lujvo, and cmevla were probably there all along but in a parallel universe of their own, the rules encompassing them all constitute a complex patchwork which is hard to put together into a seamless whole.  

>> Without a buffer vowel, it does make sense to limit the amount of
>> consonant clustering that can occur. If there was a buffer vowel,
>> the morphophonological syllable could still be onset-nucleus-coda as
>> now, but with the coda allowed to contain as many consonants as you
>> wanted. That's not how my dialect of lojban works though.
>
> In what way is it not how your dialect of Lojban works? It would categorize as valid some words that you categorize as invalid? Or would it insert word-boundaries differently? The latter seems more significant an objection than the former.

Just the former. I would not want to categorize "poktpftcu" for example as a valid word.

> So anyway, do you advocate abolishing the buffer vowel? An alternative would be to insist that every licit phonological string has both a CV syllabification (with buffer vowels) and a resyllabification without buffer vowels. That alternative strikes me as needlessly complex, but as still preferable to abolishing the buffer vowel.

I don't mind it appearing sporadically at the phonological level, I just don't want it at the phonemic level because I think it hinders more than helps.

Jorge Llambías

Oct 26, 2014, 9:11:52 AM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 9:43 AM, And Rosta <and....@gmail.com> wrote:

> If morpho-codas were unconstrained, and morpho-onsets were as per CLL (modulo any semivowel complications), what effects would that have on current lexis and usage?

I don't think it would invalidate anything, it would just open the door to more ugly fu'ivla. The restriction on codas is based solely on aesthetics and analogy. Lujvo never have codas heavier than a single consonant, and it seems like a good thing to maintain that restriction for fu'ivla, which are supposed to be as close as possible to core lojban words without interfering with their decomposability.

And Rosta

Oct 26, 2014, 9:32:02 AM
to bpfk...@googlegroups.com
Jorge Llambías, On 26/10/2014 12:52:
> On Sun, Oct 26, 2014 at 5:48 AM, And Rosta <and....@gmail.com> wrote:
>
> Here's the theory of morphological syllables in a nutshell.

A very precise specification of the what, but not so clear on the why. Are there rationales other than "Because CLL says so"? And if that is the only rationale, and the rules could be drastically simplified without invalidating any existing lexis, why not simplify the rules? (Given that anyway the CLL rules are plainly not complete and fully specified.)

> There are 18 valid (morphological) codas: the 17 consonants and the empty coda.

I guess it's at least 17, because each of the 17 Cs can occur word-medially before another C that it can't occur word-initially before?

Does CLL forbid CC codas? I guess this would be in fu'ivla. So /artsta/ is not a valid fu'ivla, say?

> All words (except for cmevla) consist of a sequence of valid syllables.

Is this from CLL?

If the constraints apply only to words of certain classes, then the constraints are almost certainly morphophonological and not phonological in nature.

> Without a buffer vowel, it does make sense to limit the amount of
> consonant clustering that can occur. If there was a buffer vowel,
> the morphophonological syllable could still be onset-nucleus-coda as
> now, but with the coda allowed to contain as many consonants as you
> wanted. That's not how my dialect of lojban works though.
>
>
> In what way is it not how your dialect of Lojban works? It would categorize as valid some words that you categorize as invalid? Or would it insert word-boundaries differently? The latter seems more significant an objection than the former.
>
> Just the former. I would not want to categorize "poktpftcu" for
> example as a valid word.

But is that for any reason other than habit?

> So anyway, do you advocate abolishing the buffer vowel? An alternative would be to insist that every licit phonological string has both a CV syllabification (with buffer vowels) and a resyllabification without buffer vowels. That alternative strikes me as needlessly complex, but as still preferable to abolishing the buffer vowel.
>
> I don't mind it appearing sporadically at the phonological level, I
> just don't want it at the phonemic level because I think it hinders
> more than helps.

A distinction between "the phonological level" and "the phonemic level" looks rather spurious to me.

--And.

Jorge Llambías

Oct 26, 2014, 9:37:10 AM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 9:52 AM, Jorge Llambías <jjlla...@gmail.com> wrote:

> There are also some constraints on which syllables can be adjacent: the final consonant of a syllable and the first consonant of the next syllable can't be the same, they can't have different voicedness, they can't both be sibilants, if one is x the other can't be c or k, if the first is m the other can't be z. Also, a syllable that ends with n can't be followed by one that starts with an affricate (tc, ts, dj, dz). Some of these constraints sound completely arbitrary, and they are. In addition, some combinations are disallowed only because they give the same result as some other combination, e.g. tav+la = ta+vla. It doesn't make any difference if we say tav+la is disallowed, or if we say it's equivalent to ta+vla.

I forgot to list one more constraint: a syllable that ends with a diphthong cannot be followed by one that begins with a glide. This constraint has been contested. A possible revision would be to disallow only the glide with the corresponding semivowel, so "tai-ian" would still be disallowed but "tai-uan" would be ok.
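
The difference between the current diphthong-before-glide rule and the possible revision can be sketched in a few lines of Python. This is an illustration only: the function names and the `revised` flag are invented here, and syllables are passed in already split by hand.

```python
DIPHTHONGS = ("ai", "ei", "oi", "au")
VOWELS = "aeiouy"

def starts_with_glide(syllable):
    # A glide is i or u immediately followed by a vowel, e.g. "ian", "uan".
    return len(syllable) > 1 and syllable[0] in "iu" and syllable[1] in VOWELS

def glide_junction_ok(syllable, following, revised=False):
    """Apply the diphthong-before-glide constraint to two adjacent syllables."""
    if not (syllable.endswith(DIPHTHONGS) and starts_with_glide(following)):
        return True                           # the constraint does not apply
    if not revised:
        return False                          # current rule: always disallowed
    # Revision: only a glide matching the diphthong's final semivowel clashes.
    return syllable[-1] != following[0]

for pair in (("tai", "ian"), ("tai", "uan")):
    print(pair, glide_junction_ok(*pair), glide_junction_ok(*pair, revised=True))
```

Under the current rule both pairs are rejected; with `revised=True` only "tai-ian" is.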

Jorge Llambías

Oct 26, 2014, 9:58:45 AM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 10:32 AM, And Rosta <and....@gmail.com> wrote:

> A very precise specification of the what, but not so clear on the why. Are there rationales other than "Because CLL says so"? And if that is the only rationale, and the rules could be drastically simplified without invalidating any existing lexis, why not simplify the rules? (Given that anyway the CLL rules are plainly not complete and fully specified.)

CLL doesn't go into much detail about syllables, but yes, it's based on CLL rules, with some modifications here and there. The underlying idea is that cmavo/gismu/lujvo are the core words of the language, and so fu'ivla should sound as much like them as possible.
 
>> There are 18 valid (morphological) codas: the 17 consonants and the empty coda.
>
> I guess it's at least 17, because each of the 17 Cs can occur word-medially before another C that it can't occur word-initially before?

I don't understand the question. 

> Does CLL forbid CC codas? I guess this would be in fu'ivla. So /artsta/ is not a valid fu'ivla, say?

CLL doesn't forbid them, and it may even have some examples with CC codas. CLL would accept "artsta", with syllables art-sta, but camxes will reject it.

 
>> All words (except for cmevla) consist of a sequence of valid syllables.
>
> Is this from CLL?

No, CLL doesn't quite put it that way.
 

> If the constraints apply only to words of certain classes, then the constraints are almost certainly morphophonological and not phonological in nature.
 
Originally camxes applied the constraints to cmevla as well, but now cmevla are divided into "jbocme" which consist of regular syllables, and "zifcme" in which pretty much anything goes. Although not anything at all: .prtkfmgjdglgl. is a valid cmevla, but .emzis. is still forbidden. It's crazy, and it's not my preference, but that's how it is now.

 
>> Just the former. I would not want to categorize "poktpftcu" for
>> example as a valid word.
>
> But is that for any reason other than habit?

Probably not. It just doesn't look like lojban to me. 

Jorge Llambías

Oct 26, 2014, 10:59:11 AM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 7:33 AM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:

> 2014-10-25 23:51 GMT+04:00 Jorge Llambías <jjlla...@gmail.com>:
>
>> Abandoning the constraints against double consonants, voiced-unvoiced clusters, sibilant-sibilant clusters, x-c/k clusters and mz would not invalidate anything, as far as I can tell.
>
> For them an autocorrection might be implemented in parsers that would automatically insert {y} between them.

That would not quite work, because for example the new potential fu'ivla .imza would be turned into .imyza, which is three words: .i my za

And Rosta

Oct 26, 2014, 11:55:44 AM
to bpfk...@googlegroups.com
Jorge Llambías, On 26/10/2014 13:58:
> On Sun, Oct 26, 2014 at 10:32 AM, And Rosta <and....@gmail.com> wrote:
> There are 18 valid (morphological) codas: the 17 consonants and the empty coda.
>
>
> I guess it's at least 17, because each of the 17 Cs can occur word-medially before another C that it can't occur word-initially before?
>
>
> I don't understand the question.

Is it the case that each of the 17 Cs can occur at the start of a cluster that is licit in the middle of a word but not at the start of a word?

Unrelatedly, what's the basis for allowing CCC onsets? Does CLL expressly licence CCC in fu'ivla? The licitness of CCC onsets can't be derived from gismu, lujvo, cmavo. And does CLL forbid CCCC onsets in fu'ivla? (I'm being lazy, here, aren't I. If that's annoying, do remonstrate with me and I'll search online before asking.)

> Just the former. I would not want to categorize "poktpftcu" for
> example as a valid word.
>
> But is that for any reason other than habit?
>
> Probably not. It just doesn't look like lojban to me.

If you're describing rather than defining the language, then that's okay, I suppose.

--And.

Jorge Llambías

Oct 26, 2014, 12:34:10 PM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 12:55 PM, And Rosta <and....@gmail.com> wrote:
> Jorge Llambías, On 26/10/2014 13:58:
>> On Sun, Oct 26, 2014 at 10:32 AM, And Rosta <and....@gmail.com> wrote:
>> There are 18 valid (morphological) codas: the 17 consonants and the empty coda.
>>
>> I guess it's at least 17, because each of the 17 Cs can occur word-medially before another C that it can't occur word-initially before?
>>
>> I don't understand the question.
>
> Is it the case that each of the 17 Cs can occur at the start of a cluster that is licit in the middle of a word but not at the start of a word?

Ah, ok. Yes.

In addition to the 17 single consonants, consonantal syllables also have that property, but they are not counted as part of the coda of the preceding syllable, they form a syllable on their own.
 
> Unrelatedly, what's the basis for allowing CCC onsets? Does CLL expressly licence CCC in fu'ivla? The licitness of CCC onsets can't be derived from gismu, lujvo, cmavo. And does CLL forbid CCCC onsets in fu'ivla? (I'm being lazy, here, aren't I. If that's annoying, do remonstrate with me and I'll search online before asking.)

Yes, CLL allows the 36 CCC onsets that camxes allows, plus a whole lot more (an infinite number in fact). CLL allows as initials any combination of consonants in which each CC pair is a valid initial, and the existence of the four affricate initials tc, ts, dj, dz means that indefinitely long onsets are allowed: For example tstststststststststststststsati would be a valid fu'ivla for CLL.

camxes keeps the CLL rule but excluding the affricates from the combinations, and that automatically restricts it to the 36 CCC and nothing else.
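
To make the contrast concrete, here is a hedged Python sketch of the two readings of the onset rule. It is not taken from camxes: the function names are invented, and the list of 48 permissible initial pairs is typed in by hand from CLL.

```python
from itertools import product

CONSONANTS = "bcdfgjklmnprstvxz"
# The 48 permissible initial pairs from CLL (entered by hand, an assumption).
INITIAL_PAIRS = set(
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv".split()
)
AFFRICATES = {"tc", "ts", "dj", "dz"}

def adjacent_pairs(cluster):
    return [cluster[i:i + 2] for i in range(len(cluster) - 1)]

def cll_onset_ok(cluster):
    """CLL reading: every adjacent consonant pair is a permissible initial."""
    return len(cluster) >= 2 and all(p in INITIAL_PAIRS for p in adjacent_pairs(cluster))

def camxes_onset_ok(cluster):
    """camxes-style reading: affricates are fine alone but never chain."""
    if len(cluster) == 2:
        return cluster in INITIAL_PAIRS
    chainable = INITIAL_PAIRS - AFFRICATES
    return len(cluster) > 2 and all(p in chainable for p in adjacent_pairs(cluster))

word = "tstststststststststststststsati"
onset = word[:-3]                              # strip the final "ati"
print(cll_onset_ok(onset), camxes_onset_ok(onset))

# Count how many clusters of length 3 and 4 survive the camxes-style rule.
for n in (3, 4):
    hits = ["".join(t) for t in product(CONSONANTS, repeat=n)
            if camxes_onset_ok("".join(t))]
    print(n, len(hits))
```

The exhaustive loop at the end is there to check that, once affricates cannot chain, no onset longer than three consonants remains.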

fu'ivla do allow a couple of things not present in lujvo, but not a lot more: CCC initials and consonantal syllables. 

There is a progression where each class of words introduces something new: we start with vanilla CV (17 consonants and 5 vowels). Then we have:

(1a) gismu introduce CCV and CVC syllables, with the 48 valid CC onsets and the rules for which CVC can be followed by which CV. 
(1b) cmavo introduce four new onsets (the dot and the apostrophe, the i/u glides) and five new nuclei (the four diphthongs and y), plus the rule that dot can only be initial and apostrophe can only be medial (and possibly CG initials, although not in official cmavo).
(2) lujvo introduce CVVC and hVC syllables, required to prevent CVV and CVhV initial rafsi from falling off, and I think that's about it.
(3) fu'ivla introduce CCC initials and consonantal syllables.
(4) cmevla introduce dot-consonant and consonant-dot, and currently a whole lot of other things.

I would prefer that fu'ivla did not introduce consonantal syllables. They were added to make type-3 fu'ivla possible, with their CrC clusters, but that could be easily managed with CyrC instead, only requiring a coda for y syllables (which are not currently used outside of cmevla).

The initial CCC are almost there already in lujvo, because we have lujvo like mispre that can easily be syllabified as mi-spre. So even though they don't occur word-initially, they are already present in the middle of words.

Gleki Arxokuna

Oct 26, 2014, 12:51:41 PM
to bpfk...@googlegroups.com
Why? What's wrong with it?
 


Jorge Llambías

Oct 26, 2014, 12:54:58 PM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 1:51 PM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:

> 2014-10-25 19:48 GMT+04:00 Jorge Llambías <jjlla...@gmail.com>:
>
>> On Sat, Oct 25, 2014 at 12:41 PM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:
>>
>>> However, in general I don't see any problems with pronouncing {mastla}.
>>>
>>> Why {mastra} is a fine brivla but {mastla} should not be (morphological classes apart)?
>>
>> Because "tr" is a valid initial, while "tl" isn't. I wouldn't have a problem with making "tl" a valid initial, whereas I'm not happy with making "st" a valid coda.
>
> Why? What's wrong with it?

It's too heavy. Core lojban words (cmavo/gismu/lujvo) don't have double-consonant codas.

John Cowan

Oct 26, 2014, 1:49:08 PM
to bpfk...@googlegroups.com
And Rosta scripsit:

> A very precise specification of the what, but not so clear on the
> why. Are there rationales other than "Because CLL says so"? And
> if that is the only rationale, and the rules could be drastically
> simplified without invalidating any existing lexis, why not simplify
> the rules? (Given that anyway the CLL rules are plainly not complete
> and fully specified.)

My intention when writing CLL was to specify certain word forms as
valid by construction, but not to say what else might be valid, much
less what was not valid. I now think that was a mistake, and CLL should
have prescribed the word forms of fu'ivla and cmevla much more narrowly.
The intention was to allow forms that were as close as possible (but no
closer) to the highly varied natural-language sources.

I also believe that the attempt to prescribe allophony, even unconditioned
allophony, was also a mistake. We should have said "Six vowels, 17
consonants, these are the normative forms, how you talk is up to you as
long as your interlocutors understand you."

Gleki Arxokuna

Oct 26, 2014, 2:30:53 PM
to bpfk...@googlegroups.com
Okay then. But then consonantal syllables are equally bad. Either it's Japanese CVCV intensive style or Serbian CCCC style but not somewhat Serbian, somewhat Japanese in different places.



Jorge Llambías

Oct 26, 2014, 3:21:59 PM
to bpfk...@googlegroups.com
On Sun, Oct 26, 2014 at 3:30 PM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:

> Okay then. But then consonantal syllables are equally bad. Either it's Japanese CVCV intensive style or Serbian CCCC style but not somewhat Serbian, somewhat Japanese in different places.

It's somewhere in between: We could say there are 13 types of syllable: CV, CCV and CVC from gismu, plus CVV from cmavo, plus CVVC from lujvo, plus CCCV, CCVC, CCVV, CCCVC, CCCVV, CCVVC, CCCVVC from fu'ivla. Plus the unlucky thirteenth: CR consonantal syllables from type-3 fu'ivla.
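
The thirteen shapes can be generated rather than listed, which makes it easy to see that nothing is missing. A minimal Python sketch, assuming the same simplification as the list above (plain C onsets only, with VV standing for a diphthong); the variable names are invented:

```python
from itertools import product

ONSETS = ["C", "CC", "CCC"]
NUCLEI = ["V", "VV"]            # VV stands for a diphthong
CODAS = ["", "C"]

vocalic = [o + n + c for o, n, c in product(ONSETS, NUCLEI, CODAS)]
shapes = vocalic + ["CR"]       # CR: the consonantal syllable
print(len(shapes), shapes)
```

Three onsets times two nuclei times two codas gives the twelve vocalic shapes, and CR makes thirteen.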

And Rosta

Oct 26, 2014, 4:27:07 PM
to bpfk...@googlegroups.com
John Cowan, On 26/10/2014 17:49:
> I also believe that the attempt to prescribe allophony, even unconditioned
> allophony, was also a mistake.

In practice, yes, tho not necessarily in principle.

> We should have said "Six vowels, 17 consonants, these are the
> normative forms, how you talk is up to you as long as your
> interlocutors understand you."

Really six vowels rather than seven? What about the buffer vowel. The failure to assign a normative form to it was a mistake, I think. (FWIW I'd have given /y/ the normative value [y] and /%/ the normative value [@], swapping the values when the phonemes are adjacent to /i, u/.)

--And.


John Cowan

Oct 26, 2014, 7:15:19 PM
to bpfk...@googlegroups.com
And Rosta scripsit:

> Really six vowels rather than seven? What about the buffer vowel.

That's one of the things that should have been left out. In Loglan,
there were two dialects: in one, /y/ was realized as [@] and no buffer
was permitted; in the other, /y/ was realized as [j@] and all other
instances of [@] were epenthetic. JCB actually required that in the
second dialect [@] be inserted at *all* points; however, no one ever
did that as far as anyone knows.

> (FWIW I'd have given /y/ the normative value [y] and /%/ the normative
> value [@], swapping the values when the phonemes are adjacent to
> /i, u/.)

Vocalic consonants were first introduced, I think, to serve the role of /y/.
In post-split Loglan they came to be written doubled and to be treated
morphologically as consonant pairs (thus "ammata" is a valid brivla
whereas "amata" falls apart into "a ma ta").

But I wouldn't be happy with a front rounded vowel, given how rare they are
(as phonemes) in the world's languages.

John Cowan

Nov 8, 2014, 8:09:45 PM
to bpfk...@googlegroups.com
Jorge Llambías scripsit:

> Here's the theory of morphological syllables in a nutshell.

Thanks for the excellent summary.

> The vocalic syllables consist of all possible combinations of
> (morphological) onset-nucleus-coda.

This is not quite true. In particular, "y" can only occur in lujvo (and
cmevla), and in lujvo at least such syllables can only have the form Cy,
no clusters, no coda. That cuts back the count of syllables from 25,020
(counting "." and "'" as different) to 139*9*18 + 17*1*1 = 22,535 vocalic
syllables. Indeed, with a different theory of syllabication in which
Cy is always divided between coda C and "y" in a syllable by itself,
the count drops to 22,519 vocalic syllables.
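
The arithmetic here is easy to re-derive by evaluating the expressions exactly as written. A small Python check follows; the variable names are invented, and reading the second theory as "one lone y syllable instead of 17 Cy syllables" is an assumption.

```python
ONSETS = 139      # counting "." and "'" separately
NUCLEI = 10       # a e i o u ai ei oi au y
CODAS = 18        # 17 consonants plus the empty coda

all_combinations = ONSETS * NUCLEI * CODAS
without_y = ONSETS * (NUCLEI - 1) * CODAS     # drop every syllable with a y nucleus
with_cy_only = without_y + 17 * 1 * 1         # add back the 17 bare Cy syllables
with_lone_y = without_y + 1                   # alternative: "y" is a syllable by itself

print(all_combinations, without_y, with_cy_only, with_lone_y)
```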

> I don't mind [the buffer vowel] appearing sporadically at the
> phonological level, I just don't want it at the phonemic level because
> I think it hinders more than helps.

Emphatic agreement, substituting "phonetic level" for "phonological level".

John Cowan

Nov 8, 2014, 10:33:50 PM
to bpfk...@googlegroups.com
Jorge Llambías scripsit:

> Yes, CLL allows the 36 CCC onsets that camxes allows, plus a whole
> lot more (an infinite number in fact). CLL allows as initials
> any combination of consonants in which each CC pair is a valid
> initial, and the existence of the four affricate initials tc, ts,
> dj, dz means that indefinitely long onsets are allowed: For example
> tstststststststststststststsati would be a valid fu'ivla for CLL.

That was obviously ill-thought-out on my part. The whole idea of CCC...
initials was added at the last minute, and should have been left out.

> camxes keeps the CLL rule but excluding the affricates from the
> combinations, and that automatically restricts it to the 36 CCC and
> nothing else.

I like this.

> (3) fu'ivla introduce CCC initials and consonantal syllables.

However, they also forbid "y" in any position: see CLL 4.7, rule 4. The
following note "Note that consonant triples or larger clusters that are
not at the beginning of a fu'ivla can be quite flexible, as long as all
consonant pairs are permissible. There is no need to restrict fu'ivla
clusters to permissible initial pairs except at the beginning." should
be ignored: we do not want "cmacrpspspspspspspspspspspspspsa".

> (4) cmevla introduce dot-consonant and consonant-dot, and currently a
> whole lot of other things.

I think restricting cmevla to standard syllables, vocalic and
consonantal, is basically rich enough. Indeed, it may be too flexible
in allowing V-V hiatus: "koreas." is made of valid syllables, but is
IMO unlojbanic.

Horrors like "mxysptlk." simply should not be allowed as cmevla (see
<https://en.wikipedia.org/wiki/Mister_Mxyzptlk>.) However, we may
want to allow CyC and some coda clusters, so that "miks. .iespitlyk."
(closer to the actual English pronunciation) is permitted.

> I would prefer that fu'ivla did not introduce consonantal
> syllables. They were added to make type-3 fu'ivla possible, with their
> CrC clusters, but that could be easily managed with CyrC instead, only
> requiring a coda for y syllables (which are not currently used outside
> of cmevla).

Violates rule 4. Consonantal syllables suck, but I think we are stuck
with them.

Alex Burka

Nov 8, 2014, 10:54:41 PM
to bpfk...@googlegroups.com

>> (4) cmevla introduce dot-consonant and consonant-dot, and currently a
>> whole lot of other things.
>
> I think restricting cmevla to standard syllables, vocalic and
> consonantal, is basically rich enough. Indeed, it may be too flexible
> in allowing V-V hiatus: "koreas." is made of valid syllables, but is
> IMO unlojbanic.

By my reading, camxes won't allow a nucleus to follow a vowel/diphthong under any circumstances, so *{koreas} is not allowed, even as a zifcme (and I agree it's naljbo).

>> I would prefer that fu'ivla did not introduce consonantal
>> syllables. They were added to make type-3 fu'ivla possible, with their
>> CrC clusters, but that could be easily managed with CyrC instead, only
>> requiring a coda for y syllables (which are not currently used outside
>> of cmevla).
>
> Violates rule 4. Consonantal syllables suck, but I think we are stuck
> with them.

Sorry, what's rule 4? I'm guessing it has to do with backwards compatibility?


mu'o mi'e la durkavore

John Cowan

Nov 9, 2014, 12:54:03 AM
to bpfk...@googlegroups.com
Alex Burka scripsit:

> By my reading, camxes won't allow a nucleus to follow a vowel/diphthong
> under any circumstances, so *{koreas} is not allowed, even as a zifcme
> (and I agree it's naljbo).

Good. At one time, the fu'ivla for "Korean language" was "bangrkore,a".

> > Violates rule 4. Consonantal syllables suck, but I think we are
> > stuck with them.
>
> Sorry, what's rule 4? I'm guessing it has to do with backwards
> compatibility?

See my previous posting. Rule 4 for fu'ivla in CLL Section 4.7 says
that "y" is forbidden in fu'ivla, so "bangrkore'a" can't be changed to
"bangyrkore'a". In practice, it wouldn't be a problem to pronounce
the former like the latter.

Indeed, if that was the *only* kind of "y" in fu'ivla (that is, allowing
the syllable types Cyl, Cym, Cyn, Cyr), I could live with it.

And Rosta

Nov 9, 2014, 7:20:24 AM
to bpfk...@googlegroups.com


On 9 Nov 2014 01:09, "John Cowan" <co...@mercury.ccil.org> wrote:
>
> Jorge Llambías scripsit:


> > I don't mind [the buffer vowel] appearing sporadically at the
> > phonological level, I just don't want it at the phonemic level because
> > I think it hinders more than helps.
>
> Emphatic agreement, substituting "phonetic level" for "phonological level".

I think you're both wrong, and that rather you should be saying that you don't mind it at the phonological level but don't want it at the morphophonological level -- a position that I would then agree with.

The phonemic status of the buffer vowel is evidenced by its realizational specification. It might better be thought of as the realization of an empty nucleus.

I suppose that there is indeed also a rather extreme and unnatural analysis in which the buffer vowel is part of one of the nonprevocalic allophones of each consonant phoneme, and that its lack of realizational overlap with vowel phonemes is coincidental.

--And.

Jorge Llambías

Nov 9, 2014, 8:11:30 AM
to bpfk...@googlegroups.com
On Sat, Nov 8, 2014 at 10:09 PM, John Cowan <co...@mercury.ccil.org> wrote:
> Jorge Llambías scripsit:
>
>> The vocalic syllables consist of all possible combinations of
>> (morphological) onset-nucleus-coda.
>
> This is not quite true. In particular, "y" can only occur in lujvo (and
> cmevla), and in lujvo at least such syllables can only have the form Cy,
> no clusters, no coda. That cuts back the count of syllables from 25,020
> (counting "." and "'" as different) to 139*9*18 + 17*1*1 = 22,535 vocalic
> syllables.

True, but camxes allows CCCyC syllables in lojbanic cmevla, so I counted them as valid syllables.

Also CCy and CCCy can appear in fu'ivla rafsi. 

Jorge Llambías

Nov 9, 2014, 8:31:24 AM
to bpfk...@googlegroups.com
On Sun, Nov 9, 2014 at 2:54 AM, John Cowan <co...@mercury.ccil.org> wrote:
> Rule 4 for fu'ivla in CLL Section 4.7 says
> that "y" is forbidden in fu'ivla, so "bangrkore'a" can't be changed to
> "bangyrkore'a". In practice, it wouldn't be a problem to pronounce
> the former like the latter.
>
> Indeed, if that was the *only* kind of "y" in fu'ivla (that is, allowing
> the syllable types Cyl, Cym, Cyn, Cyr), I could live with it.

That was the idea, yes. In fact only Cyr and Cyn. The -l- hyphen would now never be needed, since the final consonant of the rafsi becomes irrelevant for the choice of hyphen and so -yr- works for all cases except when the fu'ivla proper begins with r. 

Type-3 fu'ivla become thus something like pseudo-lujvo, which is what they are anyway.

John Cowan

Nov 9, 2014, 10:37:51 AM
to bpfk...@googlegroups.com
Jorge Llambías scripsit:

> That was the idea, yes. In fact only Cyr and Cyn. The -l- hyphen would now
> never be needed, since the final consonant of the rafsi becomes irrelevant
> for the choice of hyphen and so -yr- works for all cases except when the
> fu'ivla proper begins with r.

I was actually thinking of something more radical: replace all syllabic
consonants with yC throughout the language. Ivan Derzhanski argued for
this long ago, and I now think he was correct: they reduce readability
and make things more awkward. Syllabic consonants were introduced by
JCB, and I think we should have discarded them for Lojban; Loglan wound
up being more dependent on them, and writing them double.

> Type-3 fu'ivla become thus something like pseudo-lujvo, which is what they
> are anyway.

Indeed.

Jorge Llambías

Nov 9, 2014, 10:47:12 AM
to bpfk...@googlegroups.com
On Sun, Nov 9, 2014 at 12:37 PM, John Cowan <co...@mercury.ccil.org> wrote:
> Jorge Llambías scripsit:
>
>> That was the idea, yes. In fact only Cyr and Cyn. The -l- hyphen would now
>> never be needed, since the final consonant of the rafsi becomes irrelevant
>> for the choice of hyphen and so -yr- works for all cases except when the
>> fu'ivla proper begins with r.
>
> I was actually thinking of something more radical: replace all syllabic
> consonants with yC throughout the language.

OK, the effect would be the same, since outside of cmevla the only syllabic consonants that could survive would be the type-3 fu'ivla hyphens (the rest would have to go because fu'ivla can't have y). In cmevla camxes already allows y-syllables with codas. 

Gleki Arxokuna

Nov 9, 2014, 11:48:51 AM
to bpfk...@googlegroups.com
2014-11-09 18:37 GMT+03:00 John Cowan <co...@mercury.ccil.org>:
> Jorge Llambías scripsit:
>
>> That was the idea, yes. In fact only Cyr and Cyn. The -l- hyphen would now
>> never be needed, since the final consonant of the rafsi becomes irrelevant
>> for the choice of hyphen and so -yr- works for all cases except when the
>> fu'ivla proper begins with r.
>
> I was actually thinking of something more radical: replace all syllabic
> consonants with yC throughout the language. Ivan Derzhanski argued for
> this long ago, and I now think he was correct: they reduce readability
> and make things more awkward. Syllabic consonants were introduced by
> JCB, and I think we should have discarded them for Lojban; Loglan wound
> up being more dependent on them, and writing them double.

If this change can be easily explained (e.g. "replace all syllabic consonants with {yr}"), then it's okay. Existing texts will be assumed to be synonymous with their amended versions, just as {klamygau} is a full synonym of {klagau} (apart from the pronunciation and the spelling).


> > Type-3 fu'ivla become thus something like pseudo-lujvo, which is what they
> > are anyway.
>
> Indeed.
>
> --
> John Cowan          http://www.ccil.org/~cowan        co...@ccil.org
> Uneasy lies the head that wears the Editor's hat! --Eddie Foirbeis Climo

Jorge Llambías

Nov 9, 2014, 12:08:06 PM11/9/14
to bpfk...@googlegroups.com
On Sun, Nov 9, 2014 at 1:48 PM, Gleki Arxokuna <gleki.is...@gmail.com> wrote:
> 2014-11-09 18:37 GMT+03:00 John Cowan <co...@mercury.ccil.org>:
>
> > I was actually thinking of something more radical: replace all syllabic
> > consonants with yC throughout the language. Ivan Derzhanski argued for
> > this long ago, and I now think he was correct: they reduce readability
> > and make things more awkward. Syllabic consonants were introduced by
> > JCB, and I think we should have discarded them for Lojban; Loglan wound
> > up being more dependent on them, and writing them double.
>
> If this change can be easily explained (e.g. "replace all syllabic consonants with {yr}"), then it's okay. Existing texts will be assumed to be synonymous with their amended versions, just as {klamygau} is a full synonym of {klagau} (apart from the pronunciation and the spelling).

Doing that won't work in general for consonantal syllables anywhere other than the type-3 hyphen. For example, the currently valid fu'ivla "mutcmle" (mut,cm,le) would turn into the lujvo "mutcymle". Such fu'ivla are rare, but I'm sure there are already a few in jbovlaste. Examples like this, where the result is an actual possible word, can only happen with Cm consonantal syllables, because "m" is the only syllabic consonant that can start a CC onset.
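To make the "mutcmle" case concrete, here is a rough Python sketch of the proposed rewrite. It is not part of camxes: the function names are made up for illustration, the input is assumed to be already syllabified with commas (as in "mut,cm,le"), and the list of initial pairs is the usual 48 permissible initials. It only performs the yC substitution and checks which syllabic consonants can start a CC onset; it does not test whether the result is a valid lujvo.

```python
# Hypothetical helper names; a sketch of the "syllabic consonant -> yC" rewrite.
VOWELS = set("aeiouy")
SYLLABIC = set("lmnr")

# The 48 permissible initial consonant pairs (assumed list, not read from camxes).
INITIAL_PAIRS = (
    "bl br cf ck cl cm cn cp cr ct dj dr dz fl fr gl gr "
    "jb jd jg jm jv kl kr ml mr pl pr sf sk sl sm sn sp sr st "
    "tc tr ts vl vr xl xr zb zd zg zm zv"
).split()


def is_consonantal(syllable):
    """A syllable with no vowel letter is treated as a consonantal syllable."""
    return not any(ch in VOWELS for ch in syllable)


def replace_syllabic(syllabified):
    """Insert y before the syllabic consonant of each consonantal syllable."""
    out = []
    for syl in syllabified.split(","):
        if is_consonantal(syl):
            for i, ch in enumerate(syl):
                # Skip position 0: that is the onset consonant of the syllable.
                if i > 0 and ch in SYLLABIC:
                    syl = syl[:i] + "y" + syl[i:]
                    break
        out.append(syl)
    return "".join(out)


def syllabic_onset_starters():
    """Syllabic consonants that can be the first member of an initial pair."""
    return {pair[0] for pair in INITIAL_PAIRS if pair[0] in SYLLABIC}


print(replace_syllabic("mut,cm,le"))
print(sorted(syllabic_onset_starters()))
```

With the pair list above, only "ml" and "mr" begin with a syllabic consonant, which is why the collision with an existing word shape is limited to Cm syllables.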

Gleki Arxokuna

Nov 9, 2014, 12:44:26 PM11/9/14
to bpfk...@googlegroups.com
This won't matter if there are none in the corpus. In fact, the jbofi'e/camxes discrepancy has already invalidated a lot of fu'ivla.
