Fwd: LTI Colloquium - 4/19/19 - One of Our Own

Emily Ahn

Apr 16, 2019, 12:03:12 PM
to pitt...@googlegroups.com, Han, Na-Rae
Join us to hear from David Mortensen this Friday for Colloquium!

---------- Forwarded message ---------
From: Tessa Samuelson <tes...@andrew.cmu.edu>
Date: Tue, Apr 16, 2019 at 11:18 AM
Subject: LTI Colloquium - 4/19/19 - One of Our Own
To: <lti-s...@cs.cmu.edu>

Dear everyone,

Unfortunately, the booked colloquium speaker this week, Leiba Wehbe, has had to cancel. 

Fortunately, one of our own, a familiar face, has stepped forward to claim the coveted spot of colloquium speaker.

Who: David R. Mortensen
When: Friday, 4/19/19, 2:30-3:50 pm
Where: Doherty Hall 2315

Hmong Elaborate Expressions: Constructional and Distributional-Semantic Perspectives


Fluent Hmong speech and writing are full of elaborate expressions, idioms like tuav riam tuav phom ‘wield knife wield gun; wield weapons’ and poob ntsej poob muag ‘lose ear lose eye; lose face.’ This talk argues that these expressions are interesting in two different ways. First, they are simultaneously based upon a general pattern of coordination and on specific coordinate compounds (words like ntsej-muag ‘ear-eye; face’). Second, the words that can occur in these coordinate compounds are predictable and follow a single, general pattern. They seem to sometimes be composed of synonyms (as in quaj-nyiav ‘cry-cry; cry’), sometimes antonyms (as in hnub-hmo ‘day-night; day and night; all the time’), and sometimes representative members of some class (as in xyoob-ntoo ‘bamboo-tree; woody plants’). However, this talk hypothesizes that they are always distributionally similar. That is, they are words that occur in similar sets of contexts.

The hypothesis is tested against a 13-million-word corpus of Hmong newsgroup text. I show that a classifier based on the cosine similarity between the second and fourth word (in Word2vec embeddings) better predicts when a four-gram is an elaborate expression than a strong rule-based baseline. This finding has implications for other kinds of coordination (whether in Hmong or in other languages).
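
For readers who want a concrete picture of the idea, here is a minimal sketch of a similarity-based check on a four-gram. It is not the speaker's actual classifier: the fixed threshold, the function names, and the tiny embedding table are all assumptions made up for illustration, and real vectors would come from a Word2vec model trained on the corpus. The third four-gram in the usage lines is a constructed negative example, not an attested Hmong phrase.

```python
import math

# Hypothetical toy vectors, invented for illustration only.
# Real vectors would come from a Word2vec model trained on the corpus.
TOY_EMBEDDINGS = {
    "riam": [0.9, 0.1, 0.2],   # 'knife'
    "phom": [0.8, 0.2, 0.1],   # 'gun'
    "ntsej": [0.1, 0.9, 0.3],  # 'ear'
    "muag": [0.2, 0.8, 0.4],   # 'eye'
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def is_elaborate_expression(four_gram, embeddings, threshold=0.8):
    """Guess whether a four-gram is an elaborate expression.

    Assumption: a simple fixed threshold on the similarity between the
    second and fourth words stands in for whatever classifier the talk uses.
    """
    words = four_gram.split()
    if len(words) != 4:
        raise ValueError("expected exactly four words")
    second, fourth = words[1], words[3]
    if second not in embeddings or fourth not in embeddings:
        return False  # no vector available, so make no positive claim
    similarity = cosine_similarity(embeddings[second], embeddings[fourth])
    return similarity >= threshold


print(is_elaborate_expression("tuav riam tuav phom", TOY_EMBEDDINGS))
print(is_elaborate_expression("poob ntsej poob muag", TOY_EMBEDDINGS))
print(is_elaborate_expression("tuav riam tuav muag", TOY_EMBEDDINGS))
```

With these made-up vectors, the first two four-grams pass the threshold and the constructed third one does not; in practice the threshold would have to be tuned on labeled data.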


David R. Mortensen's origins are unknown. He is currently a Systems Scientist at LTI.  Prior to coming to Carnegie Mellon University, he was an Assistant Professor in the Linguistics Department at the University of Pittsburgh. He earned an MA and PhD in Linguistics at the University of California, Berkeley. He has diverse research interests. His work is multilingual and features a special interest in low-resource languages, especially languages of South and Southeast Asia. He specializes in computational phonology, morphology, and data resource development. He is currently working on research projects involving morphological disambiguation, historical linguistics, and distributed representations of linguistic units.


tl;dr: this week's speaker is our own David Mortensen, who is giving a talk focused on the complexities of Hmong speech.

Here's a link to learn more about the Hmong people:

There will still be refreshments afterwards :)

Tessa G. Samuelson

Language Technologies Institute
Carnegie Mellon University
6719 Gates Hillman Center
David R. Mortensen, Poster.pdf
Updated LTI Colloquium Speakers Spring 2019.pdf