Suggestions for Unitex and Gramlab


eric.laporte

Oct 25, 2013, 4:08:00 AM
Here are suggestions made during the Unitex-Gramlab workshop (October 10-11, 2013).
Users are invited to vote for the most useful suggestions by clicking on "Mark as best answer".

eric.laporte

Oct 24, 2013, 10:25:13 AM
to unitex-...@googlegroups.com
Precedence between paths in local grammar graphs

There are presently two ways of selecting among competing matches in local grammar graphs: (i) weights and (ii) the longest match/shortest match option. When both are used, the longest-match option takes precedence over weights, i.e. weights are taken into account only to decide between maximal-length matches. This behaviour should remain for backward compatibility, but it would be useful to have an option that inverts this precedence, so that match length is taken into account only to decide between maximal-weight matches.
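
To make the two precedence orders concrete, here is a minimal Python sketch. It is not Unitex code: the `Match` record, the `select` function and the `weight_first` flag are hypothetical names introduced only for illustration, and it assumes that a higher weight is preferred and that all candidates start at the same position.

```python
from typing import NamedTuple


class Match(NamedTuple):
    """Hypothetical record for one candidate match (not a Unitex structure)."""
    start: int
    end: int      # exclusive end position
    weight: int   # assumption: higher weight is preferred
    output: str


def select(matches, weight_first=False):
    """Pick one match among competing matches.

    weight_first=False: length decides, weight only breaks ties (current behaviour
    as described in the post).
    weight_first=True: weight decides, length only breaks ties (suggested option).
    """
    if weight_first:
        key = lambda m: (m.weight, m.end - m.start)
    else:
        key = lambda m: (m.end - m.start, m.weight)
    return max(matches, key=key)


candidates = [
    Match(0, 3, weight=5, output="A"),
    Match(0, 5, weight=1, output="B"),
    Match(0, 5, weight=2, output="C"),
]
print(select(candidates).output)                     # C
print(select(candidates, weight_first=True).output)  # A
```

With length first, the two longest matches compete and the weight picks "C"; with weight first, the shorter but heavier match "A" wins.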

Elsa Tolone and Éric Laporte on behalf of Gaëlle Recourcé

eric.laporte

Oct 24, 2013, 10:25:43 AM
to unitex-...@googlegroups.com
Inclusion of Cassys into Gramlab

It would be useful to be able to use Cassys in Gramlab. This implies extending the Gramlab interface.

eric.laporte

Oct 24, 2013, 10:26:13 AM
to unitex-...@googlegroups.com
More statistical measures in Unitex

Some statistical measures commonly used in corpus linguistics are not available in Unitex and would be useful.

Elsa Tolone and Éric Laporte on behalf of Antonio Balvet

eric.laporte

Oct 24, 2013, 10:27:28 AM
to unitex-...@googlegroups.com
Processing XML documents

It would be very useful to be able to process the tag-free text of XML documents, with a preprocessing that strips off XML tags, and a postprocessing that inserts them back. Sébastien Paumier has already implemented the construction of offset files which allow for mapping positions to the original text.
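
The strip-and-reinsert idea can be sketched as follows in Python. This is only an illustration under simplifying assumptions: the regular expression does not handle comments, CDATA sections or `>` inside attribute values, the function names are hypothetical, and the offset list is not the format of the offset files mentioned above. It also assumes the processing step leaves the length of the plain text unchanged; otherwise the offsets would have to be remapped.

```python
import re

TAG = re.compile(r"<[^>]+>")  # simplistic tag pattern, for illustration only


def strip_tags(xml):
    """Return (plain_text, tags); tags is a list of (offset_in_plain_text, tag)."""
    pieces, tags = [], []
    last = 0      # position in xml just after the previous tag
    length = 0    # length of the plain text collected so far
    for m in TAG.finditer(xml):
        chunk = xml[last:m.start()]
        pieces.append(chunk)
        length += len(chunk)
        tags.append((length, m.group()))
        last = m.end()
    pieces.append(xml[last:])
    return "".join(pieces), tags


def restore_tags(plain, tags):
    """Insert the recorded tags back at their offsets in the plain text."""
    out, last = [], 0
    for offset, tag in tags:
        out.append(plain[last:offset])
        out.append(tag)
        last = offset
    out.append(plain[last:])
    return "".join(out)


xml = "<p>Hello <b>world</b>!</p>"
plain, tags = strip_tags(xml)
print(plain)                             # Hello world!
print(restore_tags(plain, tags) == xml)  # True
```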

eric.laporte

Oct 24, 2013, 10:27:48 AM
to unitex-...@googlegroups.com
Extension of XAlign

Several extensions and modifications of the XAlign aligner would be useful:
- bilingual concordances with both texts side by side;
- manual editing of the segmentation;
- the place where the XAlign directory is created is unusual with respect to Unitex conventions and makes it inaccessible to student users in some teaching environments.
Dusko Vitas and Cvetana Krstev's group have implemented solutions outside Unitex for these problems and similar ones.

eric.laporte

Oct 24, 2013, 10:28:08 AM
to unitex-...@googlegroups.com
DELA-to-LMF conversion

It would be useful to be able to convert the DELA format of dictionaries to an LMF format.

eric.laporte

Oct 24, 2013, 10:28:26 AM
to unitex-...@googlegroups.com
Inclusion of statistical measures into Gramlab

Statistical measures would be useful in Gramlab.

Éric Laporte

eric.laporte

Oct 24, 2013, 10:28:43 AM
to unitex-...@googlegroups.com
Inclusion into Gramlab of FST-Text search

The Locate pattern tool of Unitex offers an alternative to Paumier's (2003) quick search: a search in the FST-Text. With this option, Gramlab could process agglutinative languages. In most agglutinative languages, morphemes are concatenated without delimitation, and the result of lexical analysis is encoded in the FST-Text.

Éric Laporte

eric.laporte

Oct 24, 2013, 10:29:39 AM
to unitex-...@googlegroups.com
Joint processing of several documents without UIMA

Most NLP tools can process several documents jointly. This allows for comparing features of documents (vocabulary, etc.) and for generating multidocument concordances. It would be useful to do so without either resorting to a UIMA chain or merging documents into a single file. This problem has been solved in special cases by Claude Devis, Laurent Kevers and Bastien Kindt.

Elsa Tolone and Éric Laporte on behalf of Tita Kyriacopoulou

eric.laporte

Oct 24, 2013, 10:29:57 AM
to unitex-...@googlegroups.com
Generation of a log file to report on the application of a graph to a corpus

It would be useful to be able to generate a log file reporting all the steps in the exploration of the graphs by the Locate pattern tool. Such a log file would help to debug a local grammar, and to detect parts which slow down the application. It would be especially useful with local grammars with many graphs.

Elsa Tolone and Éric Laporte on behalf of Gilles Vollant

eric.laporte

Oct 24, 2013, 10:30:14 AM
to unitex-...@googlegroups.com
Inclusion of the Gramlab debug mode into Unitex

The present debug mode of Gramlab, implemented by Sébastien Paumier, would be useful in Unitex too.

eric.laporte

Oct 24, 2013, 10:30:31 AM
to unitex-...@googlegroups.com
Suppression of logging messages in command-line environment

It would be useful to control whether or not the system outputs logging messages in a command-line environment, through a silent/verbose option.

Elsa Tolone and Éric Laporte on behalf of Cédrick Fairon

eric.laporte

Oct 24, 2013, 10:40:45 AM
XML well-formedness check

It would be useful to check the well-formedness of an XML document being generated by transducers. Now we can only check it with an external validator once the document is ready. An integrated check would point to graphs that introduce a violation of XML syntax.
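
For comparison, the external check described here can be as small as the Python sketch below, which uses the standard `xml.etree.ElementTree` parser. The function name is hypothetical, and the sketch reports only a position in the finished document; mapping that position back to the graph that produced the offending output is exactly the part an integrated check would have to add.

```python
import xml.etree.ElementTree as ET


def well_formedness_error(xml_text):
    """Return None if xml_text is well-formed, else the (line, column) of the first error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None


print(well_formedness_error("<a><b>x</b></a>") is None)  # True
print(well_formedness_error("<a><b>x</a>") is None)      # False
```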

Elsa Tolone and Éric Laporte on behalf of Sylvain Surcin

eric.laporte

Oct 24, 2013, 10:31:05 AM
to unitex-...@googlegroups.com
Packaging graph libraries in Gramlab without Maven

In Gramlab, a graph library can be exported and imported with Maven (http://maven.apache.org/). An alternative procedure would be useful to users who have not installed Maven. This would involve packaging a directory of graphs with an indication of the main graph among them.

Elsa Tolone and Éric Laporte on behalf of Hubert Naets

eric.laporte

Oct 24, 2013, 10:31:20 AM
to unitex-...@googlegroups.com
Packaging graph libraries in Unitex

It would be useful to be able to package a directory of graphs, with an indication of the main graph among them, so that the package can be easily exported and imported.
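
One possible shape for such a package, sketched in Python: a zip archive holding the .grf files plus a small manifest naming the main graph. Everything here is an assumption made for illustration (the `manifest.json` file name, its keys and the function names); neither Unitex nor Gramlab defines this format.

```python
import json
import zipfile
from pathlib import Path

MANIFEST = "manifest.json"  # hypothetical manifest name, not a Unitex convention


def package_graphs(graph_dir, main_graph, archive):
    """Zip every .grf file under graph_dir and record which one is the main graph."""
    graph_dir = Path(graph_dir)
    graphs = sorted(p.relative_to(graph_dir).as_posix() for p in graph_dir.rglob("*.grf"))
    if main_graph not in graphs:
        raise ValueError(f"main graph {main_graph!r} not found in {graph_dir}")
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(MANIFEST, json.dumps({"main": main_graph, "graphs": graphs}))
        for name in graphs:
            zf.write(graph_dir / name, arcname=name)


def read_main_graph(archive):
    """Return the relative path of the main graph recorded in the package."""
    with zipfile.ZipFile(archive) as zf:
        return json.loads(zf.read(MANIFEST))["main"]
```

An importer would unpack the archive into the user's graph directory and open the graph returned by `read_main_graph`.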

eric.laporte

Oct 24, 2013, 10:31:37 AM
to unitex-...@googlegroups.com
Unitex-Gramlab convergence

Unitex and Gramlab have much functionality and software in common. It is in the community's interest that as much of this functionality and software as possible be maintained in common.
The Unitex interface is adapted to linguistic research and individual projects, whereas the Gramlab interface is adapted to intensive collaboration and coexistence of several projects in parallel.
It would be useful to advance towards convergence of the Unitex and Gramlab software, maintaining distinct interfaces.

Elsa Tolone and Éric Laporte on behalf of Tita Kyriacopoulou

eric.laporte

Oct 24, 2013, 10:31:59 AM
to unitex-...@googlegroups.com
Word (type) list

Unitex displays the list of (different) tokens. It would be useful to be able to opt for the list of (different) words, excluding figures and non-verbal symbols.
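
As a rough illustration of the filter being asked for, here is a Python sketch that keeps only purely alphabetic token types. It assumes the token list is already available as plain strings; the function name is hypothetical, and `str.isalpha()` is only one possible definition of "word" (it rejects tokens containing digits, punctuation, apostrophes or hyphens).

```python
def word_types(tokens):
    """Return the sorted list of distinct tokens made up of letters only."""
    return sorted({t for t in tokens if t.isalpha()})


tokens = ["Le", "chat", ",", "2", "chats", "Le", "."]
print(word_types(tokens))  # ['Le', 'chat', 'chats']
```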

Elsa Tolone and Éric Laporte on behalf of Rosa Cetro

eric.laporte

Oct 24, 2013, 10:32:19 AM
to unitex-...@googlegroups.com
Lock on graphs or dictionaries for collaborative work in Gramlab

It would be useful to be able to temporarily lock a resource (graph or dictionary) until you commit your changes, so that changes made in parallel by different users do not result in conflicts. We use this convention for changes in Lexicon-Grammars.

Éric Laporte on behalf of Elsa Tolone

Cédrick Fairon

Oct 24, 2013, 1:42:29 PM
to unitex-...@googlegroups.com
Well, this one has already been implemented by Gilles Vollant in the meantime.
Documentation will shortly be added to the Unitex Manual.

Usage: SelectOutput [OPTIONS]

OPTIONS:
 -o [on/off]/--output=[on/off]: enable (on) or disable (off) standard output
 -e [on/off]/--error=[on/off]: enable (on) or disable (off) error output


Example:
UnitexToolLogger.exe { SelectOutput -o off -e off } { Normalize absinthe-win-2.0.4.txt }

Only one intermediate level is available: keeping error output while suppressing normal output.

The ability to fine-tune several verbosity levels would be even better, but it requires work across the whole Unitex source code.

Nebojsa Vasiljevic

Nov 10, 2013, 6:43:23 PM
to unitex-...@googlegroups.com
My suggestion is to achieve Unitex-GramLab convergence in the following three phases:

1. Migrate both projects into a single project hosted on one of the well-known open-source software hosting facilities (Google Code, GitHub, etc.). Since Google Code supports SVN and we already use Google Groups, it could be the right choice.

2. Implement the possibility to open GramLab IDE projects in the Unitex Classic GUI, and to open classic language folders as projects in the GramLab IDE.

3. Gradually modify parts of the Unitex Classic GUI and the GramLab IDE to make those parts of the code reusable in both front-ends.

In the end, the Classic GUI and the GramLab IDE would each become a relatively thin layer over shared GUI components and a GUI-independent class library.

Regards,
Nebojša Vasiljević

eric.laporte

Nov 17, 2013, 9:15:28 AM
to unitex-...@googlegroups.com
When you save the current graph under a new name (menu FSGraph > Save as), the Unitex interface displays by default another name, recently typed as a new name for a graph. I would prefer the interface to display by default the current name of the current graph, because the new name is likely to be a variant of the current name.
Eric Laporte

Denis Maurel

Nov 18, 2013, 4:38:37 AM
to eric.laporte, unitex-...@googlegroups.com


I agree with Eric: it is a good idea. Another in the same vein: closing the text with a click on the button at the top right...


Best regards,

Denis Maurel


____________________________________
Professor Denis Maurel
Université François Rabelais Tours
LI (Computer Science Research Laboratory)
EPU-DI
64 avenue Jean-Portalis
37200 Tours
France
Phone: 33-2.47.36.14.35
Fax: 33-2.47.36.14.22
mailto:denis....@univ-tours.fr

http://www.univ-tours.fr/maurel

http://www.li.univ-tours.fr
http://tln.li.univ-tours.fr/




Nebojsa Vasiljevic

Nov 23, 2013, 1:24:08 PM
to unitex-...@googlegroups.com, eric.laporte, denis....@univ-tours.fr
Dear Eric, Denis (and others),

I've fixed those two issues plus one similar, so:
- "Save as" action for graphs now offers proper default value (current file name)
- Text frame now has a close button in the title bar
-  Closing the "Do you really want to quit ?" popup window now means "NO" 

Like before, Unitex.jar and the source code patch are in the attachment. Please, try it and if it works fine, I could commit it in the Unitex source code repository.

Also, I would like to emphasize that issues of this kind should be recorded and managed through an issue tracking system.
As a matter of fact, Unitex has such a tracking system (http://gforgeigm.univ-mlv.fr/tracker/?group_id=33), but it is not used.
IMO, we should either start to use it or migrate the project to Google Code (or some other software project hosting platform) and then start to use issue/feature tracking.

Regards,
Nebojša Vasiljević
Unitex.jar
Unitex-Java-src-2013-11-23-vasiljevic.patch

Denis Maurel

Nov 24, 2013, 3:42:48 AM
to Nebojsa Vasiljevic, unitex-...@googlegroups.com, eric.laporte


Hi Nebojsa,
I tried it. It works very well.
Thank you very much


Best regards,

Denis Maurel



Eric Laporte

Nov 24, 2013, 4:12:12 AM
to unitex-...@googlegroups.com
Dear Nebojsa,

Thanks for your version. However, I am wondering how many users are in favour of these changes.
I suggest we await feedback from more users.

<<
- "Save as" action for graphs now offers proper default value (current file name)
>>
OK for me, but only Denis, you and I have expressed any opinion about this change.


<<
- Text frame now has a close button in the title bar
>>
Shouldn't this close button come with a confirmation window? If you close the text unintentionally, you may have to remember parameters to reopen it in the same state.


<<
-  Closing the "Do you really want to quit ?" popup window now means "NO"
>>
I am neutral about this. I am OK if this is standard behaviour on closing a confirmation window by clicking on the close button.

Best,

Eric Laporte
-- 
Éric Laporte
Institut Gaspard-Monge - Université Paris-Est Marne-la-Vallée
Bâtiment Copernic - Bureau 4B090
5, bd Descartes
77454 - Marne-la-Vallée CEDEX 2
Tel. +33 1 6095 7552

Nebojsa Vasiljevic

Nov 24, 2013, 9:57:30 AM
to unitex-...@googlegroups.com
Dear Eric,

While we are waiting to see opinions of other users, I would like to answer the questions you asked:

Q: Shouldn't this close button come with a confirmation window? If you close the text unintentionally, you may have to remember parameters to reopen it in the same state.

A: As I understand it, if you close the text frame and open the same text afterwards, nothing will be lost. If this is right, then we don't need a confirmation window. Judging by other products (MS Office, web browsers, etc.), the standard behavior is to ask for confirmation only if you could lose some changes or if you are attempting to close more than one file/page with a single click.

Q: I am OK if this is standard behaviour on closing a confirmation window by clicking on the close button.

A: The standard behavior is that closing a confirmation window by clicking on the close button means "Cancel". In this particular case the operation is "Quit Unitex", so a click on the close button of the confirmation window should mean "cancel the quit operation". Currently, if I close the confirmation window, Unitex quits, and this is definitely not what I expect. Also, I have a habit of just closing a confirmation window when I don't want to quit, since I'm too lazy to read the question carefully to figure out whether I should click "yes" or "no".

Regards,
Nebojša Vasiljević

eric.laporte

Nov 24, 2013, 2:28:58 PM
to unitex-...@googlegroups.com
Dear Nebojša,


On Sunday, 24 November 2013 15:57:30 UTC+1, Nebojsa Vasiljevic wrote:
<<
A: As I understand it, if you close the text frame and open the same text afterwards, nothing will be lost. If this is right, then we don't need a confirmation window.
>>
If you close the text frame unintentionally, no files are lost. However, in order to reopen the text in the same state, you need to realise, for example, that you have to open the .snt version of the file and not the .txt, and then accept the values of all parameters as Unitex suggests them. This is obvious to most professional and experienced users, but I think it is unfriendly to new users to ask them to realise all this. Unitex users are not only computer scientists: many are linguists, and I know from experience that new users are not necessarily very comfortable with computers. In addition, closing the text and opening another is not a frequent operation for linguists using Unitex: a large part of their activity consists of working on graphs or a dictionary on the basis of a single corpus. Therefore, the chance of an unintentional click on the close button is not so slim. This is why I still think a confirmation window would be a good thing with this close button.
Unitex is not very useful without its linguist users.

Best,

Nebojsa Vasiljevic

Nov 24, 2013, 5:58:30 PM
to unitex-...@googlegroups.com
Dear Eric,

I understand. In the attachment you can find new Unitex.jar with the confirmation window.

BTW, if we find that the text open operation is unfriendly for inexperienced users, this is an issue anyway. Even for me, it is hard to answer the question: "What is the difference between Text->Open and Text->Open Tagged Text?". Both options can open both .snt and .txt files. Also, you end up with a .snt file opened in any case, etc.
IMO, Text->Open should just open a .snt text (with a proper filter on the file extension) and other options should do other things.

Regards,
Nebojša
Unitex.jar

eric.laporte

Nov 28, 2013, 10:32:07 AM
to unitex-...@googlegroups.com
One of my master's students suggested that during graph editing (FSGraph), when the cursor is in a special edit mode because you have clicked on the third set of icons in the graph toolbar, a right-click should automatically return the cursor to its normal state. I thought it a good idea.

Eric Laporte


Nebojsa Vasiljevic

Dec 1, 2013, 4:06:43 AM
to unitex-...@googlegroups.com
Dear colleagues,

After some testing, I've found that the new close confirmation window pops up in some cases where it should not. To fix it properly, significant changes should be made in both the Unitex-Java and GramLab IDE sources: we should systematically distinguish between cancelable and non-cancelable frame closing.

Issues with cancelable/non-cancelable frame closing already exist in Unitex/Gramlab. For instance, in the Gramlab IDE there is always a "cancel" button in the "Do you want to save ?" popup window for the graph editor, and when you are closing a project and have an open graph, selecting "cancel" in the "Do you want to save ?" window will not cancel the project closing. So there is a general closing cancellation issue in Unitex/Gramlab.

Since the other two fixes seem OK (default file name in the "Save as" action for graphs and cancellation of the "Do you really want to quit ?" popup window), I am going to commit those two fixes, and postpone the text frame closing improvements until the general closing cancellation issue has been resolved.

Regards,
Nebojsa Vasiljevic

Nebojsa Vasiljevic

Dec 1, 2013, 12:58:26 PM
to unitex-...@googlegroups.com
Dear Eric (and others),

I have implemented this feature (reset cursor to normal on right click) and also set default file filter to "*.grf" in graph open dialog. You can download Unitex.jar and source code patch from:


Regards,
Nebojša Vasiljević

eric.laporte

Dec 3, 2013, 4:44:12 AM
to unitex-...@googlegroups.com
Thanks for this Nebojša.
Observing my students, I found another reason why the close button of the text window, if it is reintroduced in the Unitex interface, should have a confirmation window. When you study a concordance, clicking on an occurrence brings the text window to the front, and this may hide part of the concordance window. An inexperienced user may believe that the click created the text window, and wish to delete it to see the concordance window again. However, deleting the text window will close the text, and therefore close the concordance, word-list and token-list windows. This is likely to interrupt the user's study and cause frustration. I agree a "Do you really want to quit ?" popup window fixes the problem, with perhaps a message to warn that "this will also close any concordance and word-list window relative to the text".
Best,

Eric Laporte

eric.laporte

Dec 3, 2013, 4:50:58 AM
to unitex-...@googlegroups.com
Nebojša,


On Saturday, 23 November 2013 19:24:08 UTC+1, Nebojsa Vasiljevic wrote:
<<
Also, I would like to emphasize that issues of this kind should be recorded and managed through an issue tracking system.
As a matter of fact, Unitex has such a tracking system (http://gforgeigm.univ-mlv.fr/tracker/?group_id=33), but it is not used.
IMO, we should either start to use it or migrate the project to Google Code (or some other software project hosting platform) and then start to use issue/feature tracking.
>>
I like the Unitex-Gramlab forum because talks are public and users can give their opinion. Would it be the case with the tracking system?
Best,
Eric Laporte

eric.laporte

Dec 3, 2013, 5:09:47 AM
to unitex-...@googlegroups.com
Thanks for this Nebojša.
I tested the Unitex.jar and I think it improves graph editing.
Best,

Eric Laporte

Nebojsa Vasiljevic

Dec 4, 2013, 5:02:17 PM
to unitex-...@googlegroups.com
Eric,

Your question: I like the Unitex-Gramlab forum because talks are public and users can give their opinion. Would it be the case with the tracking system?

Short answer: Yes.

Of course, there are different issue tracking systems, but I suppose we will select an appropriate one. Actually, we should select an appropriate software project hosting service, and issue tracking is just one feature of this kind of service. Generally, all issues and comments on issues are public, and anyone can post a new issue (but needs to be logged in). An issue tracking system may look like a forum, but it is a database of issues, and an open issue will not become a forgotten post.

Here is an issue list of a project:

And here is an issue with comments:

Those examples are from GitHub, but we can also use Google Code (the GramLab IDE already uses it).

Regards,
Nebojša Vasiljević