Fwd: Release of AUKBC Tamil Part-of-Speech Corpus and Part-of-Speech tagger Engine

Shrinivasan T

May 31, 2016, 9:42:54 PM
to gbin...@yahoogroups.com, freetamil...@googlegroups.com
---------- Forwarded message ----------
From: <so...@au-kbc.org>
Date: 1 Jun 2016 02:30
Subject: Release of AUKBC Tamil Part-of-Speech Corpus and Part-of-Speech tagger Engine
To: <cl...@googlegroups.com>, <il...@googlegroups.com>, <il...@googlegroups.com>, <indow...@googlegroups.com>, <tva_kanitam...@googlegroups.com>, <anuva...@googlegroups.com>
Cc: <so...@au-kbc.org>

Dear All,
We are pleased to announce the release of AUKBC Tamil Part-of-Speech
Corpus and Part-of-Speech tagger Engine.

It was released on 24th May 2016 at the recently concluded 3rd Workshop on
Indian Language Data: Resources and Evaluation (WILDRE 3), co-located with
the 10th edition of the Language Resources and Evaluation Conference
(LREC 2016) in Slovenia, by Nicoletta Calzolari, CNR, Istituto di
Linguistica Computazionale “Antonio Zampolli”, Pisa, Italy.

This Tamil corpus (515K tokens) is the largest manually annotated POS-tagged
corpus available for Indian languages. The corpus text is the famous
20th-century Tamil novel "Ponniyin Selvan", written by Kalki Krishnamoorthy.

The corpus is annotated with the BIS Tagset, a hierarchical tagset
approved by the Bureau of Indian Standards and the Tamil Virtual Academy.

Corpus Statistics:
Total number of sentences: 50,876
Total number of words: 5,15,283 (i.e. 515,283)

POS Tagger:

The POS Tagger engine is released under the GNU GPL version 3.0 license.

The corpus and the engine can be downloaded from:
http://au-kbc.org/nlp/corpusrelease.html

for CLRG team@AU-KBC
sobha

Dr. Sobha L (Lalitha Devi)
CLRG, AU-KBC Research Centre,
MIT, Anna University, Chennai
www.au-kbc.org/nlp/


