Arabic Commentary


Nouar Aldahoul

Jun 23, 2026, 6:36:45 AM
to sig...@googlegroups.com

Salam, everyone.

We are looking for Arabic commentary data for previous FIFA World Cup matches (2022, 2018, 2014, 2010, 2006, 2002). Please let us know if you know how we can obtain it. We are open to both free and paid options.

Thanks in advance.

Regards, 

Nouar

MD.RAFIUL BISWAS

Jun 23, 2026, 7:09:43 AM
to Nouar Aldahoul, sig...@googlegroups.com
Dear Nouar,
I have a Twitter/X dataset on FIFA 2022 that was collected during the event. There is also a pre-event (FIFA 2022) dataset.
However, the dataset is raw and was collected using only the keyword "FIFA".
Let me know if this dataset can help you.
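
Since a keyword-only crawl will mix languages, a first pass over such a raw dump could keep only posts written mostly in Arabic script. The snippet below is a minimal sketch of that idea, not a description of how this dataset is stored: the function names and the 0.5 threshold are assumptions, and it only checks the basic Arabic Unicode block (U+0600 to U+06FF).

```python
def arabic_ratio(text: str) -> float:
    """Share of alphabetic characters that fall in the basic Arabic block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return arabic / len(letters)


def keep_arabic(posts, threshold=0.5):
    """Keep posts whose letters are at least `threshold` Arabic (assumed cutoff)."""
    return [p for p in posts if arabic_ratio(p) >= threshold]
```

A script check like this says nothing about whether a post is actually match commentary, so it would only be a pre-filter before any manual or model-based selection.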
 
Best Regards
Md. Rafiul Biswas


--
You received this message because you are subscribed to the Google Groups "SIGARAB: Special Interest Group on Arabic Natural Language Processing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sigarab+u...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/sigarab/CAE7AZ0sHZMZW7XPL%3DcaiW%2BNrn%2BbJXeakEpGYdLWTJkvES%2BrRcQ%40mail.gmail.com.

El-Haj, Mo

Jun 23, 2026, 7:46:58 AM
to Nouar Aldahoul, sig...@googlegroups.com

Hi Nouar,

It might be difficult to acquire such data given the strict broadcast copyright restrictions, but there is no harm in contacting Al Jazeera / beIN Sports directly, as I believe they have held the MENA rights to FIFA World Cups since 2002.

Yallashoot also has a Hugging Face dataset of Arabic football commentary. It is not World Cup-specific, but it could still work for a study on language use and similar questions.

There is also the FIFA World Cup 1930–2022 dataset, which contains information about teams, players, tournaments, and results. It could provide useful context if you manage to link it to the Yallashoot commentary.
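
As a rough illustration of what such linking might involve, the sketch below joins commentary records to match metadata on the match date plus the unordered pair of team names. Everything about the schema is hypothetical: the `Match` fields and the `date`, `team_a`, and `team_b` keys are placeholders, not the actual columns of either dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    # Hypothetical metadata fields; the real dataset's schema may differ.
    match_date: date
    home: str
    away: str
    score: str


def normalize(team: str) -> str:
    """Lowercase and collapse whitespace so 'Qatar ' and 'qatar' compare equal."""
    return " ".join(team.lower().split())


def match_key(match_date, team_a, team_b):
    """Order-independent key: the date plus the unordered pair of team names."""
    return (match_date, frozenset({normalize(team_a), normalize(team_b)}))


def link(commentary_rows, matches):
    """Pair each commentary row with its match, or None when no match is found."""
    index = {match_key(m.match_date, m.home, m.away): m for m in matches}
    return [
        (row, index.get(match_key(row["date"], row["team_a"], row["team_b"])))
        for row in commentary_rows
    ]
```

In practice the commentary would likely name teams in Arabic, so `normalize` would need a mapping from Arabic to English team names before the keys could line up.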

We have previously used similar football datasets when supporting Burnley Football Club in the UK, where the data helped inform player recruitment and transfer-related decisions.


Best of luck with your research

Mo


————


Dr Mo El-Haj
Director of NLP @ VinUniversity 
Reader (Associate Professor) in NLP
CECS, VinUniversity, Vietnam 
SCC, Lancaster University, UK

