Hello!
We would like to inform you that next Friday, 7 February, two TALP seminars are scheduled, given by Audi Primadhanty and Pranava Swaroop Madhyastha, in room S208 of the Omega Building at the UPC Campus Nord. These seminars will be a rehearsal for the defence of their Research Plan and Thesis Proposal.
The seminars will start at 12:00.
Here are the details of the seminars:
| Title | Probabilistic Inference for Weakly-Supervised Entity-Relation Extraction, and Learning Word Embeddings for Language Modelling |
|---|---|
| Speakers | Audi Primadhanty and Pranava Swaroop Madhyastha |
| Venue | Omega-S208, Campus Nord, UPC |
| Date | 7 February 2014 |
| Time | 12:00 to 14:00 |

Abstract: Probabilistic Inference for Weakly-Supervised Entity-Relation Extraction

We investigate the task of extracting entities and relations from text documents given only a few examples of the desired entities and relations. The task is relevant for information extraction in new, open domains where annotated corpora are scarce or expensive to obtain. We begin with the task of named entity classification by proposing a probabilistic generative model that uses hidden states. The purpose of the hidden states is to capture commonalities of the contexts in which entities of different types appear. Our hope is that this model will be more robust when it comes to recognizing unseen entities.

Abstract: Learning Word Embeddings for Language Modelling

In Natural Language Processing, state-of-the-art systems for tasks such as parsing, semantic role labeling, and word-sense disambiguation make use of lexical features. Most of these systems are trained on annotated corpora, which are used to gather statistics about each lexical item and its linguistic relations. However, even in large annotated corpora, it is unlikely that each lexical item will be observed in the context of all its possible relations. In this setting, one would like to exploit a notion of word similarity and assume that similar words have similar behaviour.

The focus of this thesis proposal is to formulate statistical models that improve performance on linguistic prediction tasks by making use of distributional word space representations. In particular, we are interested in designing computationally efficient and robust learning algorithms for lexical embeddings that combine supervised training with unsupervised training on a large text corpus to induce a distributional representation. We present preliminary experiments that assess the usefulness of the proposed approach and serve as a proof of concept.

We remind you that all the information on upcoming and past seminars (including the slides) can be found on the TALP seminars website.
See you soon!
Pranava and Xavi
Hello!
We remind you that next Friday, 7 February, two TALP seminars are scheduled, given by Audi Primadhanty and Pranava Swaroop Madhyastha, in room S208 of the Omega Building at the UPC Campus Nord. These seminars will be a rehearsal for the defence of their Research Plan and Thesis Proposal.
The seminars will start at 12:00.
Here are the details of the seminars:
| Title | Probabilistic Inference for Weakly-Supervised Entity-Relation Extraction, and Learning Word Embeddings for Language Modelling |
|---|---|
| Speakers | Audi Primadhanty and Pranava Swaroop Madhyastha |
| Venue | Omega-S208, Campus Nord, UPC |
| Date | 7 February 2014 |
| Time | 12:00 to 14:00 |

Abstract: Probabilistic Inference for Weakly-Supervised Entity-Relation Extraction

We investigate the task of extracting entities and relations from text documents given only a few examples of the desired entities and relations. The task is relevant for information extraction in new, open domains where annotated corpora are scarce or expensive to obtain. We begin with the task of named entity classification by proposing a probabilistic generative model that uses hidden states. The purpose of the hidden states is to capture commonalities of the contexts in which entities of different types appear. Our hope is that this model will be more robust when it comes to recognizing unseen entities.

Our aim is to further extend such techniques to extract relations in any domain, for specific target entities and relations in a large unlabeled corpus, requiring only a few examples for each entity and relation type.

Abstract: Learning Word Embeddings for Language Modelling

In Natural Language Processing, state-of-the-art systems for tasks such as parsing, semantic role labeling, and word-sense disambiguation make use of lexical features. Most of these systems are trained on annotated corpora, which are used to gather statistics about each lexical item and its linguistic relations. However, even in large annotated corpora, it is unlikely that each lexical item will be observed in the context of all its possible relations. In this setting, one would like to exploit a notion of word similarity and assume that similar words have similar behaviour.

The focus of this thesis proposal is to formulate statistical models that improve performance on linguistic prediction tasks by making use of distributional word space representations. In particular, we are interested in designing computationally efficient and robust learning algorithms for lexical embeddings that combine supervised training with unsupervised training on a large text corpus to induce a distributional representation. We present preliminary experiments that assess the usefulness of the proposed approach and serve as a proof of concept.