Salam, everyone.
We are looking for Arabic commentary data for previous FIFA World Cup matches (2022, 2018, 2014, 2010, 2006, 2002). If anyone knows how we can obtain it, please let us know. We are open to both free and paid options.
Thanks in advance.
Regards,
Nouar
--
You received this message because you are subscribed to the Google Groups "SIGARAB: Special Interest Group on Arabic Natural Language Processing" group.
Hi Nouar,
It might be difficult to acquire such data given the strict broadcast copyright restrictions, but there is no harm in contacting Al Jazeera / beIN Sports directly, as I believe they have held the MENA rights to FIFA World Cups since 2002.
Yallashoot also has a HuggingFace dataset of Arabic football commentary. It is not World Cup-specific, but it could still work for a study on language use, etc.
There is also the FIFA World Cup 1930–2022 dataset, which contains information about teams, players, tournaments, and results. It could provide useful context if you manage to link it to the Yallashoot commentary.
We have previously used similar football datasets when supporting Burnley Football Club in the UK, where the data helped inform player recruitment and transfer-related decisions.