HASOC 2021 - Hate Speech shared task 1 - Data available - Register now

Gautam Kishore Shahi

Jul 28, 2021, 5:36:07 PM

Hello,

We are happy to announce the release of the dataset for HASOC 2021. Please find more details below.

------------------------------------------------------------------------

Hate Speech and Offensive Content Identification (HASOC)
Task 1 - English and Indo-Aryan Languages

Shared task at FIRE 2021, 13 - 17 December, Virtual Event: http://fire.irsi.res.in

Website: https://hasocfire.github.io/
After registration, you will receive the password for the datasets: https://forms.gle/RDwsJdKTQNLVZp668

Datasets in English, Hindi and Marathi
Task on conversational Hate Speech including contextual information (Mixed Script)

---------------------------------------------------------------------------------------------------------------------

Also consider the other HASOC tasks for Dravidian Languages, Arabic and Urdu: http://fire.irsi.res.in/fire/2021/hasoc

Task 1 Description:

--------------------------

HASOC provides a forum and a data challenge for multilingual research on the identification of problematic content.
This year, we again offer English, Marathi and Hindi, with thousands of annotated tweets from Twitter altogether. Participants in this year’s shared task can choose to participate in one or two of the subtasks.

Participants can also look at the data of HASOC 2019 and 2020: https://hasocfire.github.io/

Sub-task A: Identifying Hate, offensive and profane content

Sub-task A focuses on hate speech and offensive language identification and is offered for English, Marathi and Hindi. It is a coarse-grained binary classification task in which participating systems are required to classify tweets into two classes: Hate and Offensive (HOF) and Non-Hate and Offensive (NOT).

(NOT) Non-Hate-Offensive - This post does not contain any hate speech, profane or offensive content.
(HOF) Hate and Offensive - This post contains hate, offensive, or profane content.

Sub-task B: Discrimination between Hate, profane and offensive posts
This sub-task is a fine-grained classification offered for English, Marathi and Hindi. Hate speech and offensive posts from Sub-task A are further classified into three categories.

(HATE) Hate speech:- Posts under this class contain Hate speech content.
(OFFN) Offensive:- Posts under this class contain offensive content.
(PRFN) Profane:- These posts contain profane words.
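
The relation between the two label sets can be sketched in a few lines of Python. This is only an illustration of the hierarchy described above: the function names and the use of None for posts without a fine-grained label are assumptions, not the format of the released data files.

```python
from typing import Optional

# Coarse-grained labels for Sub-task A.
SUBTASK_A_LABELS = {"HOF", "NOT"}

# Fine-grained labels for Sub-task B; each one refines an HOF post.
SUBTASK_B_LABELS = {"HATE", "OFFN", "PRFN"}


def coarse_label(fine_label: Optional[str]) -> str:
    """Map a Sub-task B label (None = no fine-grained label) to Sub-task A."""
    if fine_label is None:
        return "NOT"
    if fine_label in SUBTASK_B_LABELS:
        return "HOF"
    raise ValueError(f"unknown Sub-task B label: {fine_label!r}")


def is_consistent(label_a: str, label_b: Optional[str]) -> bool:
    """Check that a (Sub-task A, Sub-task B) label pair fits the hierarchy."""
    if label_a not in SUBTASK_A_LABELS:
        return False
    if label_a == "NOT":
        # Assumption: NOT posts carry no fine-grained label.
        return label_b is None
    return label_b in SUBTASK_B_LABELS
```

A check like is_consistent could be run over system output before submission to catch, for example, a post predicted as NOT in Sub-task A but as HATE in Sub-task B.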

-----------

Timeline main track
------------

22nd July Task announcement, training data fully available

1 August Release of Training data

16 August Registration deadline (see the link to the form on the website)

20 August Release of Test data

27 August Run submission

22 September Paper submission (EasyChair)

22 October Notification and Reviews

27 October Revised system description paper submission

13-17 December FIRE takes place virtually (India)

December Accepted participant papers appear at CEUR WS

Please check the website for registration and further details: https://hasocfire.github.io/

----------------
Organisers
----------------
Thomas Mandl :- University of Hildesheim, Germany
Sandip Modha :- DA-IICT & LDRP-ITR, Gandhinagar, India
Gautam Kishore Shahi :- University of Duisburg-Essen, Germany
Durgesh Nandini :- University of Bamberg, Germany
Johannes Schäfer :- University of Hildesheim, Germany
Amit Kumar Jaiswal :- University of Bedfordshire, UK
Prasenjit Majumder :- DA-IICT, Gandhinagar, India
Tharindu Ranasinghe :- University of Wolverhampton, UK
Marcos Zampieri :- Rochester Institute of Technology, USA
