Re: Launch of the Global Contentious Politics (GLOCON) Automated Dataset

Erdem Yoruk

Jun 27, 2024, 9:37:41 AM

to Erdem Yoruk

***APOLOGIES FOR CROSS-POSTING***

Dear Colleagues

I would like to share with you the design decisions and user manual for the Global Contentious Politics (GLOCON) dataset: https://arxiv.org/abs/2405.18613

Please also let us know if you are interested in downloading this dataset of social movements, protests, and conflict. You can find the details regarding GLOCON in my previous email below.

Best regards and have a great summer

erdem

Erdem Yörük

Professor, Dept of Sociology, Koç Univ

Director, Center for Computational Social Sciences, Koç Univ

Associate Member, Dept of Social Policy and Intervention, Univ of Oxford

Affiliated Faculty, Ford Institute for Human Security, Univ of Pittsburgh

PI, Politus ERC Project

PI, Emerging Welfare ERC Project

PI, Social ComQuant H2020 Project

Coordinator, MA Program in Computational Social Sciences

Now Out (Open Access): The Politics of the Welfare State in Turkey: How Social

Movements and Elite Competition Created a Welfare State (Univ of Mich Press, 2022)

(İletişim Yayınları, 2022, Turkish translation: Türkiye’de Refah Devleti ve Siyaset)

On Thu, Nov 9, 2023 at 22:19, Erdem Yoruk <ery...@ku.edu.tr> wrote:

Dear Colleagues
I am pleased to announce the launch of the Global Contentious Politics Dataset, a comprehensive and fully automatically curated resource on social movements, protests, and conflict, made available through our project website. This dataset is the result of an extensive research effort funded by the European Research Council (ERC) and leverages the latest advancements in Artificial Intelligence, Natural Language Processing, and Machine Learning to analyze and catalog instances of contentious politics in Argentina, Brazil, India, South Africa, and Turkey, with additional countries to follow in the future. Please find here the YouTube video of the project.

The Global Contentious Politics Dataset (GLOCON) stands as the inaugural multicountry protest event repository tailored for the Global South, harnessing local news sources through automated data processing. It catalogs an array of contentious political events ranging from protests and rallies to strikes, confrontations, and episodes of political turbulence. Developed during the Emerging Markets Welfare (EMW) Project, GLOCON was originally designed to examine the interplay between contentious politics and social welfare schemes in the Global South, but we hope the dataset will be useful for a broader academic community, including scholars of social movements, conflict, and computational social science. The dataset, pioneering in its multilingual and fully automated collection, spans from the 1990s to the present and includes event specifics on timing, location, participants, and organizers. For India and South Africa in particular, it distinguishes between rural and urban events and between violent and non-violent events; these features are accessible through the interactive Dashboard.

As of 2023, GLOCON houses data on 621,290 events derived from local news documents. The dataset is updated annually, with aspirations to expand its global reach. Data from India and South Africa are harvested in English, from Argentina in Spanish, and from Brazil in Portuguese, while Turkish-language sources are manually processed for Turkey and include data on the Kurdish ethnic conflict. The GLOCON Dashboard allows users to visualize event data through geolocation, temporal, and categorical filters, offering a dynamic tool for researchers. Further insights into the dataset’s methodology, definitions of protest event features, and the creation of training data can be found in the comprehensive Annotation Manual. For raw data access, users are directed to the Download section.

The supervised machine learning algorithm, modeled on human annotative precision, utilizes a 'Gold Standard Corpus' (GSC) – a meticulously double-annotated dataset of 17,000+ documents that shapes the accuracy of the automated system. This annotation process, carried out by skilled social science graduate students at Koç University and the University of São Paulo under expert guidance, ensures the consistency and quality of the GLOCON database. We believe that this GSC will be an important source for computational social scientists.

We invite you to explore the dataset and utilize it in your research and teaching. To facilitate its use, we have provided comprehensive documentation and user guides. Furthermore, we encourage feedback and collaboration, as we see this launch not as an end but as the beginning of an ongoing dialogue within the academic community. Should you have any inquiries or require further assistance, please do not hesitate to contact us through the contact form available on our website. We look forward to your contributions in advancing our collective understanding of contentious politics.

Thank you for your attention, and we eagerly await the insights that your engagement with the Global Contentious Politics Dataset will undoubtedly bring.
With my best wishes

On behalf of the GLOCON Team
Dr. Erdem Yörük
--
Erdem Yörük
Director, Center for Computational Social Sciences, Koç Univ
Associate Professor, Dept of Sociology, Koç Univ
Associate Member, Dept of Social Policy and Intervention, Univ of Oxford
PI, Politus ERC Project
PI, Emerging Welfare ERC Project
PI, Social ComQuant H2020 Project
Coordinator, MA Program in Computational Social Sciences

Now Out (Open Access): The Politics of the Welfare State in Turkey: How Social
Movements and Elite Competition Created a Welfare State (Univ of Mich Press, 2022) (İletişim Yayınları, 2022, Turkish translation: Türkiye’de Refah Devleti ve Siyaset)
