When: 6:10 pm, Thursday, March 11th
Where: 7th Floor Interschool Lab, Schapiro Engineering Research Lab (by
my office, 722)
Directions:
http://www.cs.columbia.edu/resources/directions
Speaker: Regina Barzilay, MIT
Welcome back one of our own graduates!
Probabilistic Approaches for Modeling Text Structure and
Their Application to Text-to-Text Generation
Regina Barzilay (MIT)
Text-to-text generation aims to produce a coherent text by
extracting, combining and rewriting information given in input
texts. Its applications include summarization, answer fusion in
question answering, and text simplification. In this
talk, I will present models of document structure that can be
effectively used to guide content selection in text-to-text
generation. First, I will focus on unsupervised learning of
domain-specific content models. These models capture the topics
addressed in a text, and the order in which these topics appear.
I will present an effective method for learning these models from
unannotated domain-specific documents, utilizing hierarchical
Bayesian methods. Next, I will present a method for assessing the
coherence of a generated text. The key premise of our work is
that the distribution of entities in coherent texts exhibits
certain regularities. Our algorithms can learn these patterns from
raw texts, without recourse to manual annotation or a predefined
knowledge base. Finally, I will show how these models can be
effectively integrated into text-to-text applications such as
automatic generation of Wikipedia articles.