A lexical mask for letters?

eric.laporte

Sep 25, 2015, 11:45:04 AM
to unitex-...@googlegroups.com
Dear all,
Denis Maurel suggests there could be a lexical mask for letters, <LETTRE>, which would be equivalent to <MOT><<^.$>>.
His motivation is that <LETTRE> is simpler than <MOT><<^.$>> for beginners. On the other hand, with this addition, teachers would lose a pedagogical opportunity to teach morphological filters to beginners.
Opinions from users, teachers and developers would be helpful in deciding on this option.
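
For readers unfamiliar with morphological filters, here is a rough Python sketch of the equivalence. It is not Unitex code: it assumes, for illustration only, that <MOT> accepts any token made solely of letters and that the filter <<^.$>> is an ordinary regular expression applied to the token. The helper names `matches_mot`, `matches_filter` and `matches_lettre` are hypothetical.

```python
import re


def matches_mot(token: str) -> bool:
    """Assumed behaviour of <MOT>: the token consists only of letters."""
    return token.isalpha()


def matches_filter(token: str, pattern: str) -> bool:
    """Assumed behaviour of a morphological filter <<pattern>>: regex search on the token."""
    return re.search(pattern, token) is not None


def matches_lettre(token: str) -> bool:
    """Hypothetical <LETTRE>: same result as <MOT><<^.$>>, i.e. a one-letter word."""
    return matches_mot(token) and matches_filter(token, r"^.$")


# Try a few tokens: a single letter, a longer word, a digit, a punctuation mark.
for tok in ["a", "ab", "7", ","]:
    print(repr(tok), matches_lettre(tok))
```

Under these assumptions only the one-letter token is accepted, which is what the shorter mask would express directly.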
Thanks,
Eric Laporte

Denis Maurel

Sep 25, 2015, 12:07:19 PM
to eric.laporte, Unitex-GramLab


Dear all,

The suggested modification is motivated by the morphological mode, where <MOT> ("mot" is French for "word") is the mask for a letter! This is unhelpful in courses for students.

If this idea is accepted, there will be two equivalent masks in morphological mode, <MOT> and <LETTER>, so existing graphs will remain correct.

Best regards,

Denis Maurel


____________________________________
Professor Denis Maurel
Université François Rabelais Tours
LI (Computer Science Research Laboratory)
EPU-DI
64 avenue Jean-Portalis
37200 Tours
France
Phone: 33-2.47.36.14.35
Fax: 33-2.47.36.14.22
mailto:denis....@univ-tours.fr

http://www.univ-tours.fr/maurel

http://www.li.univ-tours.fr
http://tln.li.univ-tours.fr/



--
You received this message because you are subscribed to the Google Groups "Unitex-GramLab" group.
To unsubscribe from this group and stop receiving emails from it, send an email to unitex-gramla...@googlegroups.com.
To post to this group, send email to unitex-...@googlegroups.com.
Visit this group at http://groups.google.com/group/unitex-gramlab.
To view this discussion on the web visit https://groups.google.com/d/msgid/unitex-gramlab/f525e1a6-6f55-4791-b5b3-7b832a5550d1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Gilles Vollant

Sep 25, 2015, 6:32:33 PM
to denis....@univ-tours.fr, eric.laporte, Unitex-GramLab

 

Are the other tags in French or English?

It seems we have some in French and others in English…

 

Regards

Gilles Vollant

 

From: unitex-...@googlegroups.com [mailto:unitex-...@googlegroups.com] On behalf of Denis Maurel
Sent: Friday, September 25, 2015 18:07
To: eric.laporte
Cc: Unitex-GramLab
Subject: Re: [Unitex-GramLab] A lexical mask for letters?

Oto Vale

Sep 25, 2015, 6:52:17 PM
to Gilles Vollant, denis....@univ-tours.fr, eric.laporte, Unitex-GramLab
Hello,

I suggest <LTR>.
It's simple, and transparent for English, French, Portuguese, Spanish, Italian...

Have a good weekend

Oto

Cédrick Fairon

Sep 26, 2015, 6:28:16 AM
to eric.laporte, Unitex-GramLab
I am rather in favor of new tags as long as they do not have a negative impact on computing cost or code complexity. I guess there would be no problem on that side.

The tag <LETTRE> is almost as long as <MOT><<^.$>>, so I would prefer something like <LTR> (in the same spirit as <PNC>).

That was my two cents ;-)

C

Denis Maurel

Sep 28, 2015, 3:58:48 AM
to Gilles Vollant, eric.laporte, Unitex-GramLab


Dear Gilles,

I agree with you. It is surprising, with an English interface, to see <MOT> and not <WORD>...
Maybe it is possible to duplicate all these tags in English?

eric.laporte

Sep 28, 2015, 6:06:56 AM
to Unitex-GramLab
Dear all,

I misunderstood Denis' proposal: the lexical mask he suggests, <LETTRE> (or <LTR>, as Oto recommends), would be functional only in the morphological mode. In this mode, <MOT> currently matches letters only (manual, section 6.4.2), so the new mask would work exactly like <MOT>.
Sorry for the confusion.
I like Denis' suggestion because graphs in morphological mode would be more readable with <LTR> than with <MOT>, for all users, not only for beginners. As Denis says, <MOT> should remain functional for backward compatibility.
My only doubt is about the coexistence of different lexical masks with the same function. In general, this tends to reduce readability because it unnecessarily increases the number of codes: to be able to read graphs written by others, users have to know more lexical masks than necessary. To avoid this, I suggest the manual should deprecate the use of <MOT> in the morphological mode and recommend that users adopt <LTR> in their new graphs.
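
To make the deprecation idea concrete, here is a small Python sketch. It is purely illustrative and not taken from the Unitex sources: the function `check_mask`, the mode names and the warning text are all assumptions. It shows how a graph checker could keep <MOT> working in morphological mode while nudging authors towards <LTR>.

```python
import warnings

# Hypothetical mode names; Unitex itself may represent modes differently.
NORMAL, MORPHOLOGICAL = "normal", "morphological"


def check_mask(mask: str, mode: str) -> str:
    """Return the recommended spelling of a mask, warning on a discouraged one.

    Assumption from this thread: in morphological mode <MOT> and <LTR>
    would match exactly the same thing (single letters).
    """
    if mode == MORPHOLOGICAL and mask == "<MOT>":
        warnings.warn(
            "<MOT> in morphological mode is deprecated; prefer <LTR>",
            DeprecationWarning,
            stacklevel=2,
        )
        return "<LTR>"  # same assumed behaviour, more readable name
    return mask  # every other mask is left untouched
```

Old graphs would still be accepted under this scheme; only the recommendation shown to the author changes.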

Now, should <LTR> be functional outside the morphological mode too? That is another question: I am less enthusiastic about this option, because <MOT><<^.$>> already exists for matching isolated letters.

Yet another question is whether <MOT> should be progressively replaced by <WORD>, which would be more easily remembered by more users. <MOT> is a remnant of the 1990s. Why not?

Best,

Eric

Alexis Neme

Sep 28, 2015, 12:32:28 PM
to Unitex-GramLab
Hello 

Just a note about the mnemonic name <LTR>.

Choosing <LTR> as the mnemonic for <LETTER> is an unlucky choice, since it is ambiguous with Left-To-Right <LTR>, as opposed to Right-To-Left <RTL>. Both are used in HTML to indicate script direction: dir="rtl" or dir="ltr".

Think about Right-To-Left scripts such as Hebrew and all the languages written in Arabic script: at least a dozen official languages, and many dozens if we count dialects.

Cheers,

Alexis

Denis Maurel

Sep 29, 2015, 2:28:28 AM
to Alexis Neme, Unitex-GramLab


Dear Alexis,

Thanks! If <LTR> is ambiguous for some languages, we will choose <LETTER>.

I propose adding, with the same meaning as <MOT>:
<WORD> in normal mode
<LETTER> in morphological mode.

What about <PRE>?

Best regards,

Denis Maurel



Denis Maurel

Sep 29, 2015, 10:50:16 AM
to denis maurel, Unitex-GramLab

Dear All,

Is it possible to replace <PRE> with <FIRST>?

Oto Vale

Sep 29, 2015, 3:03:09 PM
to denis....@univ-tours.fr, Unitex-GramLab
Hello,

If there is "no negative impact on computing cost or code complexity" (as Cédrick Fairon put it), it would be interesting to perform a 'défrancisation' (keeping the French codes in parallel, if possible).


We have  <TOKEN>

It could be:
      ==> <LETTER>  (as Eric Laporte suggests)
<MOT> ==> <WORD>

<MAJ> ==> <UPPERCASE>
<MIN> ==> <LOWERCASE>
<PRE> ==> <FIRST>

Other tags like <DIC> seem transparent


[]s

Oto

Gilles Vollant

Sep 29, 2015, 4:57:19 PM
to Oto Vale, denis....@univ-tours.fr, Unitex-GramLab

 

Of course, everyone agrees that we will not break compatibility with existing graphs.

I think adding a second code in English will not add computing cost (and needs only a very minor code modification).

I suggest we also “defrancise” the codes.

Do you prefer UPPERCASE or a shorter UCASE?

 

From: unitex-...@googlegroups.com [mailto:unitex-...@googlegroups.com] On behalf of Oto Vale
Sent: Tuesday, September 29, 2015 21:03
To: denis....@univ-tours.fr
Cc: Unitex-GramLab
Subject: Re: [Unitex-GramLab] Re: A lexical mask for letters?

Gilles Vollant

Sep 30, 2015, 4:25:02 AM
to Oto Vale, denis....@univ-tours.fr, Unitex-GramLab

How can we finally choose?

I suggest this:

We remove LETTRE, introduced last week (no real compatibility issue after only one week), and replace it with LETTER.

We add WORD as an alias for MOT.

We add FIRST as an alias for PRE.
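
As a rough idea of why this should be a very minor code change, here is a Python sketch of alias resolution. It is not the Unitex implementation: the table `ALIASES` and the function `canonical_mask` are hypothetical names, and only the two aliases listed above are included.

```python
# Hypothetical alias table: new English name -> existing French mask.
ALIASES = {
    "WORD": "MOT",
    "FIRST": "PRE",
}


def canonical_mask(mask: str) -> str:
    """Map <WORD> to <MOT> and <FIRST> to <PRE>; leave any other mask unchanged."""
    if mask.startswith("<") and mask.endswith(">"):
        name = mask[1:-1]
        return "<" + ALIASES.get(name, name) + ">"
    return mask


print(canonical_mask("<WORD>"))  # <MOT>
print(canonical_mask("<MOT>"))   # <MOT>
```

Because the old names pass through unchanged, existing graphs keep working, and a further alias would be one more entry in the table.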

 

For MAJ and MIN, we must choose between UPPERCASE, LOWERCASE and UCASE, LCASE. Do you prefer short or explicit?

Who presses the “GO” button?

 

Regards

Gilles Vollant

 


Daniel Stein

Sep 30, 2015, 4:56:20 AM
to Unitex-GramLab
I agree with the changes summed up in Gilles' mail, and I prefer ucase over uppercase, as I think complex grammars will be more readable (although I think that MAJ and MIN are also transparent in English, as majuscule and minuscule are English words as well).

Kind regards

Daniel



--
Daniel Stein
Friedenstraße 35
34121 Kassel

Denis Maurel

Sep 30, 2015, 8:37:22 AM
to Unitex-GramLab


Dear Oto and Daniel,

May I suggest shortening the masks to <UPPER> and <LOWER>?