How to apply dictionary-graphs in Gramlab IDE?

Fanny Grandry

Jun 23, 2017, 12:53:54 PM
to unitex-...@googlegroups.com
Is there a way to specify a dictionary-graph in Gramlab IDE? I looked for one but could not find it.

I tried directly editing the content of the project.versionable_config file in my project, but each time I restart, my IDE is very slow and my project.versionable_config is overwritten with the default one. 
Strangely, dictionaries to be applied during the preprocessing are still visible in the interface at the first restart, and the .fst2 that I have just added seems to be properly applied to my corpus. 
Please advise.

Best regards,

Fanny Grandry

eric.laporte

Jun 26, 2017, 7:26:31 AM
to Unitex-GramLab
Dear Fanny,
In the zone in the left part of the GramLab screen, between the corpus zone and the console zone:
1. Check "Do preprocessing" and click "preprocessing".
2. In the dialog box named "Configuring preprocessing of project <name of project>", scroll down to the "Dictionaries" zone and click "Set..." below the list of dictionaries. A "Configuring project <name of project>" dialog box opens.
3. If there is a dictionary-graph in the Dela directory of the project, its name appears in the list: check it.
4. Click Ok on both dialog boxes.
5. Check "Do preprocessing" and click Go.

The lexical entries described in the dictionary-graph will now appear in the word lists if they occur in the text. I checked this with version 3.2 alpha dated March 23, 2017.
Best,
Eric

Fanny Grandry

Jun 28, 2017, 5:44:49 AM
to Unitex-GramLab
Thanks for the quick response, it worked. 
I must have forgotten to compress my dictionary-graph, which made it unavailable in the preprocessing configuration window.

I have another question. I can't manage to use this dictionary-graph in replace mode. Even though I added the -rbl suffix, the compilation keeps failing with this message:

Since I can no longer find the Unitex console, I don't know what the problem is. I'm using Unitex-GramLab 3.1, rev 4314.

I would like to add an item to my word list that is a subsequence of what is matched by my dictionary-graph. My graph is complex so using left and right contexts in merge mode is not very convenient.

If you could give me an example that works it would be really helpful.

Best 

Fanny

eric.laporte

Jun 28, 2017, 11:51:06 AM
to unitex-...@googlegroups.com
Dear Fanny,
The example I used was the Dnum.grf dictionary-graph (and subgraphs, all with filenames beginning with Dnum) distributed with Unitex. You can find them in French/Dela in the Unitex system directory. I adapted the main graph so it can be used in replace mode and I named it with the suffixes -r, -rl and -rbl (attached graph). In all 3 cases, GramLab successfully applied it to a text. I checked this with version 3.2 alpha dated June 9, 2017.
Your dictionary-graph matches a sequence, and produces a lexical entry for a subsequence only, right? I have never done that and I am not sure dictionary-graphs can be used that way.
Best regards,
Eric Laporte
Dnum-rbl.grf

Fanny Grandry

Jul 6, 2017, 6:38:20 AM
to Unitex-GramLab
Thanks and sorry for the delayed response.

I know what my problem was. The new entry was added but eventually discarded, because it did not occur in the corpus I gave as input.

I made a new attempt. I tried generating several entries, hoping the one that is in my corpus would be kept, but the process just stops after the first try. 
"Ignoring line because the inflected form does not appear in the text:+Al,Alphonse.N\+Diminutive"
Is there a way for this dictionary-graph to work? (please see attached grf with context in comments)
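For anyone hitting the same "Ignoring line" message: a rough way to predict which generated entries will be dropped is to check each inflected form against the corpus beforehand. The Python sketch below is not part of Unitex. It assumes DELAF-style lines of the form `inflected,lemma.codes` with backslash escapes, and it uses a simple case-sensitive whole-word search rather than Unitex's own tokenization, so results may differ on edge cases. The function names and the sample entries are hypothetical.

```python
import re


def split_unescaped(s, sep):
    """Split s at the first sep that is not preceded by a backslash."""
    i = 0
    while i < len(s):
        if s[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if s[i] == sep:
            return s[:i], s[i + 1:]
        i += 1
    raise ValueError(f"no unescaped {sep!r} in {s!r}")


def unescape(s):
    """Remove backslash escapes, e.g. '3\\,5' becomes '3,5'."""
    return re.sub(r"\\(.)", r"\1", s)


def parse_entry(line):
    """Return (inflected, lemma, codes) for a DELAF-style line."""
    inflected, rest = split_unescaped(line, ",")
    lemma, codes = split_unescaped(rest, ".")
    inflected = unescape(inflected)
    # An empty lemma means the lemma is identical to the inflected form.
    lemma = unescape(lemma) or inflected
    return inflected, lemma, codes


def split_by_presence(entries, text):
    """Separate entries whose inflected form occurs in text as a whole word."""
    kept, dropped = [], []
    for line in entries:
        inflected, _, _ = parse_entry(line)
        pattern = r"(?<!\w)" + re.escape(inflected) + r"(?!\w)"
        (kept if re.search(pattern, text) else dropped).append(line)
    return kept, dropped


if __name__ == "__main__":
    # Hypothetical entries and corpus, for illustration only.
    entries = ["Al,Alphonse.N+Diminutive", "Alph,Alphonse.N+Diminutive"]
    kept, dropped = split_by_presence(entries, "Al went home.")
    print("kept:", kept)
    print("dropped:", dropped)
```

Entries that end up in `dropped` are the ones I would expect to trigger the message above, since their inflected form never occurs in the text.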

Best,

Fanny

addDiminutives-rba.grf

eric.laporte

Jul 6, 2017, 8:36:02 AM
to Unitex-GramLab
Dear Fanny,
My hypothesis in my last post was wrong: a dictionary-graph can match a sequence and produce a lexical entry for a subsequence only, without any problem.
The problem in your dictionary-graph is that the 'all matches' option does not work. I reproduced the same problem with test-Fanny-rba (attached graph): only the 3-letter matches are processed.
In order to work around this bug, you can make distinct graphs for 2-letter, 3-letter... and 5-letter abbreviations, as in test-Fanny-3-rba and test-Fanny-4-rba (attached graphs): with these dictionary-graphs, all the matches are processed.
I made my tests with version 3.2 alpha dated 27 June 2017.
The fact that the 'all matches' option does not work in spite of the 'a' at the end of the name of the graph is a bug. Can you report it as an issue on GitHub?
Best,
Eric
test-Fanny-3-rba.grf
test-Fanny-rba.grf
test-Fanny-4-rba.grf

Fanny Grandry

Jul 6, 2017, 12:10:57 PM
to Unitex-GramLab
Thanks for the trick. 

Best,

Fanny

Cristian Martinez

Jul 10, 2017, 7:09:46 AM
to Unitex-GramLab
Thanks again, Fanny, for your bug report and for the helpful example provided.

This problem was fixed in version 3.2.49-alpha of UnitexToolLogger, released on July 10, 2017 [*]. For more info and comments, check out the discussion on GitHub: https://github.com/UnitexGramLab/unitex-core/issues/36.

[*] To check which version of UnitexToolLogger is in use, open the IDE and click on Info > About