Portuguese (Brazil) resource update

Oto Vale

Aug 22, 2015, 5:14:07 PM
to Unitex-GramLab
Hello,

The Unitex Portuguese (Brazil) resources have been extended and updated to follow the Orthographic Agreement of 1990.
My students at Universidade Federal de São Carlos (UFSCar) and I have extended and updated the dictionary of simple words to 75,015 entries (67,942 unique lemmas and 7,900 new entries), which inflect into 10,957,820 forms.
We also provide the corresponding inflectional transducers and a text in the new spelling ("Senhora", by José de Alencar).
The new version is available online with the Unitex 3.1 beta at www.unitexgramlab.org or on the Unitex-PB project webpage.
This work was supported by the CNPq undergraduate fellowship program, the NILC team, and Dicionário Informal.

You can read more about this work in:

Calcia, N. P., Kucinskas, A. B., Muniz, M., Nunes, M. G. V., and Vale, O. A. (2014). Révision et adaptation des dictionnaires et graphes de flexion d'Unitex-PB à la nouvelle orthographe du portugais. 3rd UNITEX/GramLab Workshop, October 9-10, 2014, Université de Tours.


Jorge Baptista and I have evaluated this new version. The results will be presented at the 10th Brazilian Symposium in Information and Human Language Technology and will be available soon.

Oto Araujo Vale
Universidade Federal de São Carlos