HASOC 2021 - Hate Speech shared task 1 - Data available - Register now

Gautam Kishore Shahi

Jul 28, 2021, 5:36:07 PM
Hello,

We are happy to announce the release of the dataset for HASOC 2021. Please find more details below.

------------------------------------------------------------------------
Hate Speech and Offensive Content Identification  (HASOC)
Task 1 -  English and Indo-Aryan Languages
Shared task at FIRE 2021,  13 - 17 December, Virtual Event: http://fire.irsi.res.in
Website: https://hasocfire.github.io/
After registration, you will receive the password for the datasets: https://forms.gle/RDwsJdKTQNLVZp668
Datasets in English, Hindi and Marathi
Task on conversational Hate Speech including contextual information (Mixed Script)
---------------------------------------------------------------------------------------------------------------------

Also consider the other HASOC tasks for Dravidian languages, Arabic and Urdu: http://fire.irsi.res.in/fire/2021/hasoc

Task 1 Description:
--------------------------

HASOC provides a forum and a data challenge for multilingual research on the identification of problematic content.
This year, we again offer English, Marathi and Hindi, with thousands of annotated tweets from Twitter altogether. Participants in this year’s shared task can choose to participate in one or both of the subtasks.
Participants can also look at the data of HASOC 2019 and 2020: https://hasocfire.github.io/

Sub-task A: Identifying Hate, offensive and profane content

Sub-task A focuses on hate speech and offensive language identification and is offered for English, Marathi and Hindi. It is a coarse-grained binary classification task in which participating systems are required to classify tweets into two classes: Hate and Offensive (HOF) and Non-Hate and Offensive (NOT).
  • (NOT) Non-Hate-Offensive - This post does not contain any hate speech, profane or offensive content.
  • (HOF) Hate and Offensive - This post contains hate, offensive or profane content.

Sub-task B: Discrimination between Hate, profane and offensive posts

This sub-task is a fine-grained classification offered for English, Marathi and Hindi. Hate speech and offensive posts from sub-task A are further classified into three categories.
  • (HATE) Hate speech - Posts under this class contain hate speech content.
  • (OFFN) Offensive - Posts under this class contain offensive content.
  • (PRFN) Profane - Posts under this class contain profane words.
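
For participants setting up a pipeline, the relation between the two label sets described above can be sketched as follows. This is only an illustrative sketch: the class and function names are hypothetical, and it assumes that a post without a fine-grained label is represented as None. How the released files actually encode this is not stated in this announcement, so please check the dataset itself.

```python
from enum import Enum
from typing import Optional


class Coarse(Enum):
    """Sub-task A labels."""
    NOT = "NOT"  # Non-Hate-Offensive
    HOF = "HOF"  # Hate and Offensive


class Fine(Enum):
    """Sub-task B labels, only meaningful for HOF posts."""
    HATE = "HATE"  # hate speech
    OFFN = "OFFN"  # offensive
    PRFN = "PRFN"  # profane


def coarse_from_fine(fine: Optional[Fine]) -> Coarse:
    """Any fine-grained label implies HOF; no fine-grained label implies NOT.

    Assumption: None stands for "no Sub-task B label".
    """
    return Coarse.NOT if fine is None else Coarse.HOF


def is_consistent(coarse: Coarse, fine: Optional[Fine]) -> bool:
    """Check that a (Sub-task A, Sub-task B) label pair does not contradict itself."""
    return coarse_from_fine(fine) is coarse
```

A check like is_consistent(Coarse.HOF, Fine.PRFN) can be used to validate system output before submission, since a post predicted as NOT in Sub-task A should not carry a HATE, OFFN or PRFN label in Sub-task B.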

-----------
Timeline main track
------------  

22 July                    Task announcement, training data fully available
1 August                   Release of training data
16 August                  Registration deadline (see the link to the form on the website)
20 August                  Release of test data
27 August                  Run submission
22 September               Paper submission (EasyChair)
22 October                 Notification and reviews
27 October                 Revised system description paper submission
13-17 December             FIRE takes place as a virtual event, hosted in India
December                   Accepted participant papers appear at CEUR-WS

Please check the website for registration and further details: https://hasocfire.github.io/

----------------
Organisers
----------------
Thomas Mandl :- University of Hildesheim, Germany
Sandip Modha :- DA-IICT & LDRP-ITR, Gandhinagar, India
Gautam Kishore Shahi :- University of Duisburg-Essen, Germany
Durgesh Nandini :- University of Bamberg, Germany
Johannes Schäfer :- University of Hildesheim, Germany
Amit Kumar Jaiswal :- University of Bedfordshire, UK
Prasenjit Majumder :- DA-IICT, Gandhinagar, India
Tharindu Ranasinghe :- University of Wolverhampton, UK
Marcos Zampieri :- Rochester Institute of Technology, USA