jbovlaste updated with camxes-morphology

Riley Martinez-Lynch

Jun 24, 2014, 9:16:26 AM
to loj...@googlegroups.com

coi jbopre

jbovlaste has been updated to apply camxes morphology when new words are entered. The new morphological classifier, "vlatai.py", is part of the camxes-py Python parser and replaces "vlatai", which is bundled with the jbofihe parser.

vlatai.py adds two types: "bu-letterals" (previously classified as "cmavo" or "cmavo cluster") and "zei-lujvo" (previously classified as "lujvo"). These new types are subject to camxes parser rules: Invalid constructs such as {bu bu} and {zei zei lujvo} are rejected.
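
To make the two new types concrete, here is a toy sketch in Python. It is not vlatai.py's actual logic, which follows the full camxes grammar; the function name and the sample words are made up for illustration, and anything the sketch does not recognize simply comes back as None, including constructs the real classifier may well accept.

```python
MAGIC_WORDS = {"bu", "zei"}

def classify_compound(words):
    """Toy classification of a word sequence; NOT vlatai.py's real logic."""
    # Positions 0, 2, 4, ... are where an ordinary word is expected.
    if any(w in MAGIC_WORDS for w in words[::2]):
        # A magic word sits in an ordinary-word slot,
        # as in {bu bu} or {zei zei lujvo}.
        return None
    if len(words) == 2 and words[1] == "bu":
        return "bu-letteral"
    if (len(words) >= 3 and len(words) % 2 == 1
            and all(w == "zei" for w in words[1::2])):
        return "zei-lujvo"
    return None

for sample in (["a", "bu"], ["broda", "zei", "brode"],
               ["bu", "bu"], ["zei", "zei", "lujvo"]):
    print(" ".join(sample), "->", classify_compound(sample))
```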

Other "magic words" such as {zo} and {zoi} are not currently supported in combination with {bu} and {zei}. This is an oversight rather than a design choice, so please feel free to file a bug report if you find this is needed.

The 21,940 valsi currently registered in jbovlaste were verified with the new classifier: 21,829 reported no change, 10 were reclassified as bu-letterals, 26 were reclassified as zei-lujvo, 1 was reclassified from fu'ivla to lujvo, and 74 valsi were marked as "obsolete": cmevla (22), fu'ivla (51) and zei-lujvo (1). 
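
As a quick sanity check on these figures, the counts can be tallied in a few lines of Python. The variable names are illustrative only; the numbers are the ones quoted above.

```python
# Counts as reported in this post.
total_valsi = 21_940
outcomes = {
    "no change": 21_829,
    "reclassified as bu-letteral": 10,
    "reclassified as zei-lujvo": 26,
    "reclassified from fu'ivla to lujvo": 1,
    "marked obsolete": 74,
}
obsolete_by_type = {"cmevla": 22, "fu'ivla": 51, "zei-lujvo": 1}

# Both breakdowns should add up to the totals they describe.
assert sum(outcomes.values()) == total_valsi
assert sum(obsolete_by_type.values()) == outcomes["marked obsolete"]

for label, count in outcomes.items():
    print(f"{label}: {count} ({count / total_valsi:.2%})")
```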

Details of the reclassified words can be found here:

https://github.com/lojban/jbovlaste/issues/47

https://github.com/lojban/jbovlaste/issues/39

https://github.com/lojban/jbovlaste/issues/40

https://github.com/lojban/jbovlaste/issues/43

https://github.com/lojban/jbovlaste/issues/44

The new "obsolete" valsi types are currently treated like the "experimental" types in XML and PDF exports: They are marked with a warning.

la gleki raised the issue that some words (e.g. {relmast}) which don't conform to this version of camxes ought in fact to be valid. xorxes noted that only older versions of the camxes/BPFK morphology prohibit such words.

I checked {relmast} against the Java/Rats! version of camxes which is linked on the "Issues With The Lojban Formal Grammar" page: It was not accepted. It was also not accepted by camxes.js or either the standard or experimental ilmentufa grammars. I also checked python-camxes, but it uses the same version of the Java jar that was described above.

I built a new camxes Java/Rats! jar using the latest morphology on the tiki, and I can confirm that according to this version of the grammar, {relmast} is valid. However, it's not clear whether such a jar is currently distributed anywhere.

Based on all of this, my inclination is to update camxes-py as soon as possible to use the newest BPFK morphology (where "newest" may mean n years old). However, if I do this, it will no longer be in sync with most other implementations of camxes currently distributed. Thoughts, anyone?

Thanks to rlpowell and tene for their assistance in getting the new software installed.

mi'e la mukti mu'o

Jonathan Jones

Jun 24, 2014, 11:23:29 AM
to loj...@googlegroups.com
On Tue, Jun 24, 2014 at 7:16 AM, Riley Martinez-Lynch <shun...@gmail.com> wrote:
> Based on all of this, my inclination is to update camxes-py as soon as possible to use the newest BPFK morphology (where "newest" may mean n years old). However, if I do this, it will no longer be in sync with most other implementations of camxes currently distributed. Thoughts, anyone?

Update it. It is more important that jbovlaste be consistent with the official morphology than that it be consistent with outdated distributions of a parser.

That said, at least http://camxes.lojban.org/camxes/ should also be updated to be consistent with the PEG. Unfortunately, I've no idea who maintains /any/ of the distributions, let alone how many or where they are.

--
You received this message because you are subscribed to the Google Groups "lojban" group.
To unsubscribe from this group and stop receiving emails from it, send an email to lojban+un...@googlegroups.com.
To post to this group, send email to loj...@googlegroups.com.
Visit this group at http://groups.google.com/group/lojban.
For more options, visit https://groups.google.com/d/optout.



--
mu'o mi'e .aionys.

.i.e'ucai ko cmima lo pilno be denpa bu .i doi.luk. mi patfu do zo'o
(Come to the Dot Side! Luke, I am your father. :D )

la durka

Jun 24, 2014, 1:35:16 PM
to loj...@googlegroups.com
FYI, this broke vlasisku's import. I've fixed it in the latest revision at github.com/lojban/vlasisku (and my Vlasisku instance is running with an updated export from yesterday).

As for camxes.lojban.org, I believe it is updated, but I could be wrong. For instance, it rejects {bliardo} but accepts {bliiardo} and {bolrbliardo}. And that leads to my question -- how is {bolrbliardo} legal but {bliardo} illegal? What is the difference, besides the prefix?

mi'e la durka mu'o

Gleki Arxokuna

Jun 24, 2014, 1:55:01 PM
to loj...@googlegroups.com
compare to {ibliardo} which is legal


Alex Burka

Jun 24, 2014, 2:15:19 PM
to loj...@googlegroups.com
Hmm, okay. So is it that {ibliardo} and {bolrbliardo} are pronounced with a syllabic L, like {.ibl,iardo}/{bol,rbl,iardo}, as opposed to {bli,iardo} with more of a BL cluster?

mi'e la durka mu'o

Jorge Llambías

Jun 24, 2014, 4:29:09 PM
to loj...@googlegroups.com
On Tue, Jun 24, 2014 at 3:15 PM, Alex Burka <dur...@gmail.com> wrote:
> Hmm, okay. So is it that {ibliardo} and {bolrbliardo} are pronounced with a syllabic L, like {.ibl,iardo}/{bol,rbl,iardo}, as opposed to {bli,iardo} with more of a BL cluster?

The canonical syllabifications would be ".i,bl,iar,do", "bo,lr,bl,iar,do" and "bli,iar,do". Consonantal syllables are always exactly two consonants long, with the second consonant being one of l/m/n/r, so there are only 64 possible consonantal syllables. A brivla cannot start with a consonantal syllable.
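
A minimal Python sketch of that count and of the word-initial restriction follows. It assumes the 64 comes from pairing each of the 17 consonants with a different one of l/m/n/r (so "ll", "mm", "nn" and "rr" are left out). The function name is made up for illustration, and "bl,iar,do" is only a reading of {bliardo} by analogy with the forms above, not parser output.

```python
# The 17 Lojban consonants and the four that can be syllabic.
CONSONANTS = "bcdfgjklmnprstvxz"
SYLLABIC = "lmnr"

# Assumption: the two consonants must differ (no "ll", "mm", "nn", "rr").
CONSONANTAL_SYLLABLES = {
    first + second
    for first in CONSONANTS
    for second in SYLLABIC
    if first != second
}

def starts_with_consonantal_syllable(syllabified):
    """Take a comma-separated syllabification such as 'bl,iar,do'."""
    first = syllabified.lstrip(".").split(",")[0]
    return first in CONSONANTAL_SYLLABLES

print(len(CONSONANTAL_SYLLABLES))  # 64
for form in (".i,bl,iar,do", "bo,lr,bl,iar,do", "bli,iar,do", "bl,iar,do"):
    print(form, starts_with_consonantal_syllable(form))
```

Only the last form begins with a consonantal syllable, which is the situation the rule excludes.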

mu'o mi'e xorxes

Pierre Abbat

Jun 25, 2014, 2:25:23 PM
to loj...@googlegroups.com
On Tuesday, June 24, 2014 06:16:26 Riley Martinez-Lynch wrote:
> Details of the reclassified words can be found here:
>
> https://github.com/lojban/jbovlaste/issues/47
>
> https://github.com/lojban/jbovlaste/issues/39
>
> https://github.com/lojban/jbovlaste/issues/40

All words consisting of "spar" + the scientific name of a plant, which are
supposed to be words for the plant, such as "spararkti", are wrong. The
correct form is "spatr" + the scientific name (unless phonotactics force a
different interfix), and the letter immediately after the interfix must be a
consonant.
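
The default pattern described above can be sketched as a simple check in Python. This is only an illustration: the function name is hypothetical, "spatrbanana" is a made-up string rather than a real valsi, and the sketch ignores the case where phonotactics force a different interfix.

```python
CONSONANTS = set("bcdfgjklmnprstvxz")
DEFAULT_INTERFIX = "spatr"

def follows_default_plant_pattern(word):
    """True if word is 'spatr' plus a stem that starts with a consonant."""
    if not word.startswith(DEFAULT_INTERFIX):
        return False
    stem = word[len(DEFAULT_INTERFIX):]
    return bool(stem) and stem[0] in CONSONANTS

for word in ("spararkti", "spatrarkti", "spatrbanana"):
    print(word, follows_default_plant_pattern(word))
```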

ckoala -> fasxolarto (This was actually my original form, before Nick inserted
the 'k' which makes it invalid under camxes.)
kriofla -> kriiofla
tarksako -> tarsako or traksako. Which do you prefer?
trueno -> tro'ena
xarjrngiri, xarjrnjiri -> ? ("ngiri" and "njiri" are in two Bantu languages.)
aierne -> ai'erne

On Tuesday, June 24, 2014 10:35:16 la durka wrote:
> As for camxes.lojban.org, I believe it is updated, but I could be wrong.
> For instance, it rejects {bliardo} but accepts {bliiardo} and
> {bolrbliardo}. And that leads to my question -- how is {bolrbliardo} legal
> but {bliardo} illegal? What is the difference, besides the prefix?

I had a similar question about "mitxondrio". It is syllabized "mit,xon,dr,io".
A consonantal syllable cannot be first.

Pierre
--
Don't buy a French car in Holland. It may be a citroen.
