Loading Dictionaries in the interface.


Alexis Neme

Jul 18, 2018, 2:26:58 AM
to Unitex-GramLab
I am loading my dictionary with the Unitex IDE,
using the Java interface: File Edition > Open > Dictionaries

The inflected dictionary has 6 million lines (a 340 MB flat file in UTF-16). I get the same error message with a DELAS of 75,000 lines (a 5 MB flat file in UTF-16).



Error message: "This file is too large to be displayed. Use a wordprocessor to view it."



16 GB of RAM is available on any laptop nowadays!
Are there any other reasons for such a limitation?



Tks

Alexis 

Ait cheikh Anas

Jul 18, 2018, 11:45:31 AM
to unitex-...@googlegroups.com
The maximum size of a dictionary that Unitex can support is 3 MegaBytes.

Alexis Neme

Jul 18, 2018, 11:56:11 AM
to denis....@univ-tours.fr, Cristian Martínez, unitex-...@googlegroups.com
Hi Denis,

Tks for your reply.
>>> This is just an interface limitation.
If this limitation is not intrinsic to Java or the JRE virtual machine, and is not imposed by Oracle for technical reasons in the JRE/VM,
it is a wrong justification.

I deal with millions of entries: in my case, 76,000 lemmas and 6 million lines.

And with Notepad++, we will lose many functionalities specific to our DELAS/DELAF formats.
Certainly we can do it with regular expressions, but that is less intuitive, and such regular expressions are difficult for a linguist to handle.
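
To illustrate the regular-expression route mentioned above, here is a minimal Python sketch that streams a large dictionary line by line instead of loading it whole, and filters entries by lemma or grammatical code. It assumes a simplified DELAF line shape, `inflected,lemma.CODES:INFLECTION`, with an empty lemma meaning "same as the inflected form"; escaped commas and dots are not handled. The names `Entry`, `parse_line` and `filter_entries` are hypothetical helpers written for this example, not part of Unitex.

```python
import re
from typing import Iterator, NamedTuple, Optional, TextIO


class Entry(NamedTuple):
    inflected: str
    lemma: str
    codes: str       # grammatical/semantic codes, e.g. "V" or "N+z1"
    inflection: str  # inflectional codes after ":", possibly empty


# Assumed simplified line shape: inflected,lemma.CODES[:INFLECTION][/comment]
# Escaped commas or dots inside forms are NOT handled by this pattern.
DELAF_LINE = re.compile(r"^([^,]+),([^.]*)\.([^:/]+)(?::([^/]*))?")


def parse_line(line: str) -> Optional[Entry]:
    """Parse one dictionary line; return None if it does not match."""
    m = DELAF_LINE.match(line.strip())
    if m is None:
        return None
    inflected, lemma, codes, inflection = m.groups()
    # Assumption: an empty lemma stands for the inflected form itself.
    return Entry(inflected, lemma or inflected, codes, inflection or "")


def filter_entries(
    lines: TextIO, lemma: Optional[str] = None, code: Optional[str] = None
) -> Iterator[Entry]:
    """Yield matching entries one at a time, so memory use stays small."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if lemma is not None and entry.lemma != lemma:
            continue
        if code is not None and code not in entry.codes.split("+"):
            continue
        yield entry
```

A possible use, assuming a UTF-16 file with the hypothetical name `delaf.dic`, would be `with open("delaf.dic", encoding="utf-16") as f:` followed by a loop over `filter_entries(f, code="V")`. Because the file is read lazily, its size on disk does not matter much here, although this remains far less convenient than a dedicated dictionary view.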


Thanks,
Regards


 


Bests

----------

Alexis Neme

Computational Linguistics Scientist - Arabic NLP Expert
FR-PT-EN-AR (DE, Tagalog)

http://tasrif.univ-mlv.fr/About.html

UPEM - LIGM - Laboratoire d'Informatique Gaspard-Monge 



On Wed, Jul 18, 2018 at 10:40 AM, Denis Maurel <mau...@univ-tours.fr> wrote:


Dear Alexis

This is just an interface limitation. You can use the usual command on your dictionary (sort, compile, etc.).
To see the dictionary, you have to use an editor, for instance notepad++...


Best regards,

Denis Maurel


____________________________________
Professor Denis Maurel
Université de Tours
Lifat (Computer Science Research Laboratory)
EPU-DI
64 avenue Jean-Portalis
37200 Tours
France
Phone: 33-2.47.36.14.35
Fax: 33-2.47.36.14.22
mailto:denis.maurel@univ-tours.fr

http://www.univ-tours.fr/maurel

http://www.li.univ-tours.fr
http://tln.li.univ-tours.fr/

----- On 18 Jul 18, at 8:26, Alexis Neme <alexi...@gmail.com> wrote:

