Tuesday, September 8: Parth Gupta - Query expansion for mixed-script information retrieval

Xavier Lluís

Sep 2, 2015, 5:19:07 AM
to semina...@googlegroups.com
Next Tuesday, September 8th, Parth Gupta from UPV will give a talk about his recent research.


Title: Query expansion for mixed-script information retrieval
Speaker: Parth Gupta
Venue: Omega-S208, Campus Nord - UPC
Date: Tuesday, September 8, 2015
Time: 12:30h - Presentation
Abstract: For many languages that use non-Roman indigenous scripts (e.g., Arabic, Greek and Indic languages), one can often find a large amount of user-generated transliterated content on the Web in the Roman script. Such content creates a monolingual or multilingual space with more than one script, which is referred to as the Mixed-Script space. IR in the mixed-script space is challenging because queries written in either the native or the Roman script need to be matched to documents written in both scripts. Moreover, transliterated content features extensive spelling variations. This talk introduces the concept of Mixed-Script IR and, through an analysis of Bing search engine query logs, estimates the prevalence of the problem and thereby establishes its importance. The talk will also cover a principled, deep-learning-based solution to the term modelling challenge, in which mixed-script terms are modelled jointly through a deep autoencoder.
Bio: Parth Gupta (http://www.dsic.upv.es/~pgupta/) is a PhD student at the Technical University of Valencia (UPV), Spain, and a researcher at the PRHLT Research Center. His research lies at the intersection of Information Retrieval, Natural Language Processing and Machine Learning. Most recently, he has been working on deep learning architectures that learn abstract representations of text and terms across languages and scripts. The deep-autoencoder-based approach proposed in his PhD work won the Transliterated Search shared task at FIRE 2013, organised by Microsoft Research. He is currently an Applied Scientist Intern at Microsoft Bing, working with the query formulation team in London. He has published in well-regarded conferences and journals such as SIGIR, COLING, ECIR, Knowledge-Based Systems and Neurocomputing. Previously, he worked as a Research Intern at FBK Research Center (Trento, Italy), the Institute for Infocomm Research (Singapore) and Microsoft Research (India). He is also associated with Xapian, the open-source search engine library, to which he contributed the learning-to-rank module as a Google Summer of Code student (GSoC 2011) and for which he later served as a mentor (GSoC 2012, 2014).