COLING (Barcelona, Spain), December 13th, 2020
Second call for papers (UPDATED)
This joint workshop addresses two domains – multiword expressions and (electronic) lexicons – with partly overlapping communities and research interests, but divergent practices and terminologies.
Multiword expressions (MWEs) are word combinations, such as by and large, hot dog, pay a visit or pull one's leg, which exhibit lexical, syntactic, semantic, pragmatic or statistical idiosyncrasies. MWEs encompass closely related linguistic objects: idioms, compounds, light-verb constructions, rhetorical figures, institutionalised phrases and collocations. Because of their unpredictable behaviour, notably their non-compositional semantics, MWEs pose problems in linguistic modelling (e.g. treebank annotation, grammar engineering), NLP pipelines (notably when orchestrated with parsing), and end-user applications (e.g. information extraction). Modelling and processing of MWEs have been the topic of the MWE workshop, organised over the past years by the MWE section of SIGLEX.
Because MWE-hood is a largely lexical phenomenon, appropriately built electronic MWE lexicons turn out to be quite important for NLP. Their conception raises, among others, the issues of lemmatisation and of standardised representation of the morphological, syntactic and semantic properties of MWEs. Large, standardised, multilingual and possibly interconnected NLP-oriented MWE lexicons prove indispensable for NLP tasks such as MWE identification, which is critically sensitive to unseen data. But the development of such lexicons is challenging and calls for tools which would leverage, on the one hand, MWEs encoded in pre-existing NLP-unaware lexicons and, on the other hand, automatic MWE discovery in large non-annotated corpora.
In order to pave the way towards a better understanding of these issues, and to foster convergence and scientific innovation, the MWE and ELEXIS (European Union's Horizon 2020 research grant 731015) communities put forward a joint event and call for papers on research related (but not limited) to:
Joint topics on MWEs and e-lexicons:
Extracting and enriching MWE lists from traditional human-readable lexicons for NLP use
Formats for NLP-applicable MWE lexicons
Interlinking MWE lexicons with other language resources
Using MWE lexicons in NLP tasks (identification, parsing, translation, ...)
MWE discovery in the service of lexicography
Multiword terms in specialised lexicons
Representing semantic properties of MWEs in lexicons
Paving the way towards encoding lexical idiosyncrasies in constructions
MWE-specific topics:
Computationally-applicable theoretical work on MWEs and constructions in psycholinguistics, corpus linguistics and formal grammars
MWE and construction annotation in corpora and treebanks
Processing of MWEs and constructions in syntactic and semantic frameworks (e.g. CCG, CxG, HPSG, LFG, TAG, UD), and in end-user applications (e.g. information extraction, machine translation and summarisation)
Original discovery and identification methods for MWEs and constructions
MWEs and constructions in language acquisition and in non-standard language (e.g. tweets, forums, spontaneous speech)
Evaluation of annotation and processing techniques for MWEs and constructions
Retrospective comparative analyses from the PARSEME shared tasks on automatic identification of MWEs
We also intend to perpetuate the earlier convergence with the Construction Grammar and WordNet communities (see the LAW-MWE-CxG 2018 and MWE-WN 2019 workshops). Therefore, we extend the traditional MWE scope to grammatical constructions and we include WordNets in the scope of e-lexicons.
The workshop features two tracks:
A regular research track, where the submissions must be substantially original.
A shared task track, with submissions consisting of system description papers.
Submissions to the regular research track should follow one of two formats:
Long papers (9 content pages + references): Long papers should report on solid and finished research including new experimental results, resources and/or techniques.
Short papers (4 content pages + references): Short papers should report on small experiments, focused contributions, ongoing research, negative results and/or philosophical discussion.
The PC chairs will decide whether each selected paper is presented orally or as a poster. The workshop proceedings make no distinction between the two. There is no limit on the number of reference pages. Reviewing will be double-blind. Papers available as preprints can also be submitted, provided that they fulfil the conditions defined by the ACL Policies for Submission, Review and Citation.
All papers should be submitted via the workshop's START space https://www.softconf.com/coling2020/MWE-LEX/
Please follow the guidelines and use the COLING 2020 style files available at https://coling2020.org/pages/submission. Please choose the appropriate track (research/shared task) and for research papers the submission modality (long/short).
MWE-LEX 2020 will host edition 1.2 of the PARSEME shared task on semi-supervised identification of MWEs. This is a follow-up to editions 1.0 (2017) and 1.1 (2018). Edition 1.2 features (a) improved and extended corpora annotated with MWEs, (b) complementary unannotated corpora for unsupervised MWE discovery, and (c) evaluation focusing on unseen MWEs. Following the synergy with ELEXIS, our aim is to foster the development of unsupervised methods for MWE lexicon induction, which in turn can be used for identification. Authors may submit system description papers to the shared task track. Details are available at http://multiword.sf.net/sharedtask2020
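To illustrate what an evaluation focusing on unseen MWEs might involve, here is a minimal Python sketch. It rests on an assumption that this call does not state: a test MWE counts as "unseen" when its multiset of lemmas never occurs as an annotated MWE in the training data. The function names and the toy data are hypothetical; the authoritative definition is given on the shared task page.

```python
from collections import Counter

def lemma_key(lemmas):
    """Order-insensitive key for one MWE: the multiset of its lowercased lemmas."""
    return frozenset(Counter(lemma.lower() for lemma in lemmas).items())

def split_seen_unseen(train_mwes, test_mwes):
    """Partition test MWEs by whether their lemma multiset occurs in training.

    ASSUMPTION: "unseen" means "lemma multiset absent from the training
    annotations"; this is an illustrative reading, not the official one.
    """
    seen_keys = {lemma_key(mwe) for mwe in train_mwes}
    seen, unseen = [], []
    for mwe in test_mwes:
        (seen if lemma_key(mwe) in seen_keys else unseen).append(mwe)
    return seen, unseen

# Toy data (hypothetical), each MWE given as a list of lemmas.
train = [["pay", "visit"], ["pull", "leg"]]
test = [["visit", "pay"], ["hot", "dog"]]

seen, unseen = split_seen_unseen(train, test)
print(unseen)  # [['hot', 'dog']]
```

Keying on the lemma multiset instead of the surface string means that inflected or reordered occurrences of a training MWE still count as seen, so only genuinely new expressions end up in the unseen set.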
All deadlines are at 23:59 UTC-12 (anywhere in the world).
September 2, 2020: Workshop papers due date (short papers, long papers, system description papers)
October 16, 2020: Notification of acceptance
November 1, 2020: Camera-ready papers due
December 13, 2020: Workshop co-located with COLING 2020 in Barcelona
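Because the deadlines above are expressed in UTC-12, authors may want to check the corresponding instant in UTC or in their own time zone. The following is a small sketch using only the Python standard library; the helper name `deadline_in_utc` is hypothetical and introduced here purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere in the world" is modelled as a fixed offset of UTC-12.
AOE = timezone(timedelta(hours=-12), name="UTC-12")

def deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return the instant 23:59 UTC-12 on the given date, expressed in UTC."""
    local_deadline = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local_deadline.astimezone(timezone.utc)

# Workshop papers due date: September 2, 2020.
paper_deadline = deadline_in_utc(2020, 9, 2)
print(paper_deadline.isoformat())  # 2020-09-03T11:59:00+00:00
```

In other words, a deadline of 23:59 UTC-12 on a given date falls at 11:59 UTC on the following day.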
Research track, MWE-specific topics:
Stella Markantonatou, Institute for Language and Speech Processing, R.C. "Athena" (Greece)
Jelena Mitrović, University of Passau (Germany)
Research track, MWE-LEX topics:
John McCrae, National University of Ireland Galway (Ireland)
Carole Tiberius, Dutch Language Institute in Leiden (Netherlands)
Shared task track:
Petya Osenova, University of Sofia and Bulgarian Academy of Sciences (Bulgaria)
Agata Savary, University of Tours (France)
For any inquiries regarding the workshop please send an email to mwele...@gmail.com.