language family working groups


Ryan Cotterell

Jul 21, 2020, 1:28:42 PM
to unim...@googlegroups.com
Hello,

Beyond the steering committee, the first major organizational change in UniMorph is the creation of working groups. The working groups will be based on language families. We created five to start with. (We will create more once our Google groups quota is removed :P)
  1. Romance (https://groups.google.com/g/unimorph-romance)
  2. Uralic (https://groups.google.com/g/unimorph-uralic)
  3. Semitic (https://groups.google.com/g/unimorph-semitic)
  4. Slavic (https://groups.google.com/g/unimorph-slavic)
  5. Turkic (https://groups.google.com/g/unimorph-turkic)
Please join any of the above groups. We are trying to figure out the proper publication model to encourage people to help clean up the data and make it cross-lingually compatible. A likely solution is for each family-specific working group to submit a paper to SIGMORPHON 2021 or LREC 2022 describing the annotation challenges they faced. We may also jointly write a larger TACL paper that describes the state of the art in morphological transfer learning.

We created a UniMorph drive here: https://drive.google.com/drive/folders/1EaPFMq1kLMjDqRBKfDTE3USvRL9ZGdUZ?usp=sharing. Each family has its own folder.

Let's look at the Romance family to get an idea of what we are trying to accomplish. We currently have data for 25 Romance languages here (https://docs.google.com/spreadsheets/d/11uHXL1aS8SErTqs-Z5pbgPkm-67Y_aF7oA-_-sI4l3E/edit?usp=sharing). Many of them are listed on the UniMorph website (https://unimorph.github.io/), where you can download the data. Much of the data, however, has yet to be converted. For Romance, the TODO items include the following. We need to catalog the data we have, record where it came from, and unify it with other sources. We also need to address Romance-specific issues, e.g., how to annotate clitics. Moreover, since the primary goal is to unify the annotation schemes, we need to take stock of the existing ones. Mans would also like cognates annotated where possible to aid transfer learning. We have noted all of these issues here: https://docs.google.com/document/d/1cRK1jbibsXcbEqsNIPpatJYyxqw2Ppnd9xbTSQ1tft4/edit?usp=sharing

@David Yarowsky wishes for this to be part of the new UniMorph website (http://unimorph.ethz.ch/), but that will take some time to get set up. For now, we just have the Google infrastructure. 

Cheers,
--Ryan



James Tauber

Jul 21, 2020, 1:36:08 PM
to unimorph
Is there any interest in / need for a cross-cutting group focused on historical rather than modern languages?

James


Ryan Cotterell

Jul 21, 2020, 1:39:46 PM
to James Tauber, unimorph
We could create such a group if you would like. It's really about where the community interests lie. 

Ryan Cotterell

Jul 21, 2020, 1:57:49 PM
to unimorph
I've updated the permissions on the working group Google groups. Let me know if you have any issues joining.

James Tauber

Jul 21, 2020, 1:59:46 PM
to Ryan Cotterell, unimorph
Yeah, that's what I'm trying to gauge.

James

Ali Salehi

Jul 21, 2020, 3:26:29 PM
to Ryan Cotterell, unim...@googlegroups.com

Thanks for organizing this. Would it be possible to have a group on Iranian languages?

Anil Singh

Jul 21, 2020, 4:16:41 PM
to Ali Salehi, Ryan Cotterell, unim...@googlegroups.com
There could be a group on Indic languages, perhaps with subgroups for the different language families within Indic.

Yustinus Ghanggo Ate

Jul 21, 2020, 9:25:07 PM
to unimorph
Hi Ryan,

It would be great to have a working group for the Austronesian language family.

-Y