Dear HASOC Organisers,
My name is Mihir Makwana and I am an MSc Artificial Intelligence student at Kingston University, London. I am writing to request access to the HASOC code-mixed conversational datasets for my master's dissertation on hate speech and offensive content detection in code-mixed Hindi-English social media text.
My project examines transformer-based classification of offensive content in code-mixed posts, including how the surrounding conversational context affects model performance. For this I would like to use the ICHCL (Identification of Conversational Hate-Speech in Code-Mixed Languages) dataset from HASOC 2022, along with the conversational Subtask 2 data from HASOC 2021. If the HASOC 2023 ICHCL and hate-span data are also available, I would be glad to include them.
I confirm that the data would be used only for academic research. I will follow the terms of use, not redistribute the datasets, and cite the relevant HASOC publications in my dissertation and any resulting work.
Could you let me know the steps to obtain access, including any password required or data usage agreement I need to sign? I am happy to provide proof of my student status or have my supervisor confirm the request if that helps.
My details:
Name: Mihir Makwana
Programme: MSc Artificial Intelligence, Kingston University London
Supervisor: Dr. Gordon Hunter
University email: k255...@kingston.ac.uk
Thank you for your time.
Kind regards,
Mihir Makwana