Resource name:
PARSEME Corpus of Verbal Multiword Expressions (version 1.0)
Type of resource:
manually annotated corpus and the associated tools
Languages:
Bulgarian, Czech, Farsi, French, German, Modern Greek, Hebrew, Hungarian, Italian, Lithuanian, Maltese, Polish, Romanian, Brazilian Portuguese, Slovenian, Spanish, Swedish, and Turkish
Size:
275 thousand sentences, 5.5 million tokens, 54 thousand annotated VMWEs
see also per-language statistics
Format:
parsemetsv, inspired by the CoNLL-U format
Annotation schema:
follows the universal annotation guidelines, elaborated for 21 languages
Features:
for most languages, aligned companion files with morphological and/or syntactic data in the CoNLL-U format are available
License:
various flavours of the Creative Commons license, see license per language
Download link:
http://hdl.handle.net/11372/LRT-2282
Publisher:
PARSEME network
Authors:
Agata Savary (France), Carlos Ramisch (France), Silvio Ricardo Cordeiro (France, Brazil), Federico Sangati (Italy), Veronika Vincze (Hungary), Behrang QasemiZadeh (Germany), Marie Candito (France), Fabienne Cap (Sweden), Voula Giouli (Greece), Ivelina Stoyanova (Bulgaria), Antoine Doucet (France), Kübra Adalı (Turkey), Verginica Barbu Mititelu (Romania), Eduard Bejček (Czech Republic), Ismail El Maarouf (UK), Gülşen Eryiğit (Turkey), Luke Galea (Malta), Yaakov Ha-Cohen Kerner (Israel), Jolanta Kovalevskaitė (Lithuania), Simon Krek (Slovenia), Chaya Liebeskind (Israel), Johanna Monti (Italy), Carla Parra Escartín (Spain), Lonneke van der Plas (Malta), Cristina Aceta (Spain), Itziar Aduriz (Spain), Jean-Yves Antoine (France), Greta Attard (Malta), Kirsty Azzopardi (Malta), Loic Boizou (Lithuania), Janice Bonnici (Malta), Mert Boz (Turkey), Ieva Bumbulienė (Lithuania), Jael Busuttil (Malta), Valeria Caruso (Italy), Manuela Cherchi (Italy), Matthieu Constant (France), Monika Czerepowicka (Poland), Anna De Santis (Italy), Tsvetana Dimitrova (Bulgaria), Tutkum Dinç (Turkey), Hevi Elyovich (Israel), Ray Fabri (Malta), Alison Farrugia (Malta), Jamie Findlay (UK), Aggeliki Fotopoulou (Greece), Vassiliki Foufi (Greece), Sara Anne Galea (Malta), Polona Gantar (Slovenia), Albert Gatt (Malta), Anabelle Gatt (Malta), Carlos Herrero (Spain), Uxoa Iñurrieta (Spain), Glorianna Jagfeld (Germany), Milena Hnátková (Czech Republic), Mihaela Ionescu (Romania), Natalia Klyueva (Czech Republic), Svetla Koeva (Bulgaria), Viktória Kovács (Hungary), Taja Kuzman (Slovenia), Svetlozara Leseva (Bulgaria), Sevi Louisou (Greece), Teresa Lynn (UK), Ruth Malka (Israel), Héctor Martínez Alonso (Spain), John McCrae (UK), Helena de Medeiros Caseli (Brazil), Ayşenur Miral (Turkey), Amanda Muscat (Malta), Joakim Nivre (Sweden), Michael Oakes (UK), Mihaela Onofrei (Romania), Yannick Parmentier (France), Caroline Pasquer (France), Maria Pia di Buono (Italy), Belem Priego Sanchez (Spain), Annalisa Raffone (Italy), Renata Ramisch (Brazil), Erika Rimkutė (Lithuania), Monica-Mihaela Rizea (Romania), Katalin Simkó (Hungary), Michael Spagnol (Malta), Valentina Stefanova (Bulgaria), Sara Stymne (Sweden), Umut Sulubacak (Turkey), Nicole Tabone (Malta), Marc Tanti (Malta), Maria Todorova (Bulgaria), Zdenka Urešová (Czech Republic), Aline Villavicencio (Brazil), Leonardo Zilio (Brazil)
------------------------------------------------
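Reading the parsemetsv format:
The record above only names the format, so the following is a minimal sketch rather than the official tooling. It assumes each token line holds four tab-separated columns (token rank, surface form, an `nsp` flag for "no space after", and VMWE codes such as `1:LVC` on the first token of an expression and a bare `1` on its later tokens, with `;` separating multiple codes), that `_` marks an empty field, and that sentences are separated by blank lines. The function names `read_parsemetsv` and `extract_vmwes` and the English sample sentence are hypothetical, invented for illustration; check the column layout against the files you download.

```python
from collections import defaultdict


def read_parsemetsv(lines):
    """Yield sentences as lists of (rank, form, no_space_after, mwe_field).

    Assumed layout: 4 tab-separated columns, blank line between sentences.
    """
    sentence = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():          # blank line closes the current sentence
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):      # skip comment lines, if any
            continue
        cols = line.split("\t")
        cols += ["_"] * (4 - len(cols))   # pad short lines with empty fields
        rank, form, nsp, mwe = cols[:4]
        sentence.append((rank, form, nsp == "nsp", mwe))
    if sentence:                      # file may not end with a blank line
        yield sentence


def extract_vmwes(sentence):
    """Group token forms by VMWE id; return {id: (category, [forms])}."""
    forms_by_id = defaultdict(list)
    category_by_id = {}
    for _rank, form, _nsp, mwe in sentence:
        if mwe in ("_", ""):
            continue
        for code in mwe.split(";"):   # a token may belong to several VMWEs
            mwe_id, _, category = code.partition(":")
            forms_by_id[mwe_id].append(form)
            if category:              # category is assumed to appear once per VMWE
                category_by_id[mwe_id] = category
    return {i: (category_by_id.get(i), forms) for i, forms in forms_by_id.items()}


# Invented sample sentence (not taken from the corpus).
sample = (
    "1\tShe\t_\t_\n"
    "2\ttook\t_\t1:LVC\n"
    "3\ta\t_\t_\n"
    "4\tdecision\tnsp\t1\n"
    "5\t.\t_\t_\n"
    "\n"
)

sentences = list(read_parsemetsv(sample.splitlines()))
vmwes = extract_vmwes(sentences[0])
print(vmwes)
```

For real data, pass an open file handle instead of `sample.splitlines()`, for example `read_parsemetsv(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded per-language files.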