Raghavendra Udupa | Researchain

Publication

Featured research published by Raghavendra Udupa.

international joint conference on artificial intelligence | 2011

Learning hash functions for cross-view similarity search

Shaishav Kumar; Raghavendra Udupa

Many applications in Multilingual and Multimodal Information Access involve searching large databases of high-dimensional data objects with multiple (conditionally independent) views. In this work we consider the problem of learning hash functions for similarity search across the views for such applications. We propose a principled method for learning a hash function for each view given a set of multiview training data objects. The hash functions map similar objects to similar codes across the views, thus enabling cross-view similarity search. We present results from an extensive empirical study of the proposed approach which demonstrate its effectiveness on Japanese language People Search and Multilingual People Search problems.

meeting of the association for computational linguistics | 2009

MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora

Raghavendra Udupa; K. Saravanan; A. Kumaran; Jagadeesh Jagarlamudi

In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news articles with similar news content are rich in Named Entity Transliteration Equivalents (NETEs). Our mining algorithm, MINT, uses a cross-language document similarity model to align multilingual news articles and then mines NETEs from the aligned articles using a transliteration similarity model. We show that our approach is highly effective on 6 different comparable corpora between English and 4 languages from 3 different language families. Furthermore, it performs substantially better than a state-of-the-art competitor.

conference on information and knowledge management | 2008

Mining named entity transliteration equivalents from comparable corpora

Raghavendra Udupa; K. Saravanan; A. Kumaran; Jagadeesh Jagarlamudi

Named Entities (NEs) form a significant fraction of query terms in Information Retrieval (IR) systems and their retrieval has been shown to correlate highly with the IR system performance. NEs are even more important in Cross Language Information Retrieval (CLIR), where they are also a significant component of query terms. In recent times, the large quantity and perpetual availability of news corpora in many of the world’s languages simultaneously has spurred interest in a promising alternative to NE translation or transliteration, namely the mining of Named Entity Transliteration Equivalents (NETEs) from such news corpora (Klementiev and Roth, 2006; Tao et al., 2006). Formally, comparable news corpora are time-aligned news stories in a pair of languages, over a reasonably long duration. NETEs mined from comparable news corpora could be valuable in many tasks such as CLIR and MT, to effectively complement bilingual dictionaries and machine transliteration systems. This opportunity is precisely what we address in our work. We introduce a novel method, called MINT (MIning Named-entity Transliteration equivalents), with the following innovations for effective mining of NETEs from comparable corpora: MINT relies on few linguistic resources, requiring a Named Entity Recognizer (NER) in only one language; hence NETEs from even a resource-poor language may be mined, when paired with a language where an NER is available.

european conference on information retrieval | 2010

On improving pseudo-relevance feedback using pseudo-irrelevant documents

Karthik Raman; Raghavendra Udupa; Pushpak Bhattacharyya; Abhijit Bhole

Pseudo-Relevance Feedback (PRF) assumes that the top-ranking n documents of the initial retrieval are relevant and extracts expansion terms from them. In this work, we introduce the notion of pseudo-irrelevant documents, i.e. high-scoring documents outside of top n that are highly unlikely to be relevant. We show how pseudo-irrelevant documents can be used to extract better expansion terms from the top-ranking n documents: good expansion terms are those which discriminate the top-ranking n documents from the pseudo-irrelevant documents. Our approach gives substantial improvements in retrieval performance over Model-based Feedback on several test collections.
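The idea of picking expansion terms that discriminate the top-ranking n documents from pseudo-irrelevant ones can be illustrated with a small sketch. This is not the authors' method: the abstract does not say how pseudo-irrelevant documents are identified or how terms are scored. Here the function name `expansion_terms`, the choice of the last `m` documents of the ranked list as the pseudo-irrelevant set, and the smoothed log-odds score are all assumptions made for illustration.

```python
import math
from collections import Counter


def expansion_terms(ranked_docs, n=2, m=2, k=3):
    """Rank candidate expansion terms by how well they separate the
    top-n documents from a pseudo-irrelevant set.

    ranked_docs: list of token lists, best-scoring document first.
    Assumption: the pseudo-irrelevant set is simply the last m documents
    of the ranked list; the paper's actual selection rule is not given
    in the abstract.
    """
    top = ranked_docs[:n]
    pseudo_irrelevant = ranked_docs[-m:]

    # Document frequency of each term within each set.
    df_top = Counter(t for doc in top for t in set(doc))
    df_irr = Counter(t for doc in pseudo_irrelevant for t in set(doc))

    def score(term):
        # Smoothed log-odds ratio (an illustrative choice): high when the
        # term is common in the top-n set and rare in the other set.
        p_top = (df_top[term] + 0.5) / (len(top) + 1.0)
        p_irr = (df_irr[term] + 0.5) / (len(pseudo_irrelevant) + 1.0)
        return math.log(p_top / p_irr)

    # Only terms seen in the top-n documents are candidates.
    ranked = sorted(df_top, key=lambda t: (-score(t), t))
    return ranked[:k]


docs = [
    ["jaguar", "car", "engine"],
    ["jaguar", "car", "speed"],
    ["jaguar", "cat", "jungle"],
    ["jaguar", "cat", "prey"],
]
print(expansion_terms(docs, n=2, m=2, k=3))
```

In this toy ranking the query term "jaguar" appears in both sets, so it scores zero and is ranked below terms that occur only in the top-n documents.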

european conference on information retrieval | 2010

Transliteration equivalence using canonical correlation analysis

Raghavendra Udupa; Mitesh M. Khapra

We address the problem of Transliteration Equivalence, i.e. determining whether a pair of words in two different languages (e.g. Auden, ऑडन) are name transliterations or not. This problem is at the heart of Mining Name Transliterations (MINT) from various sources of multilingual text data including parallel, comparable, and non-comparable corpora and multilingual news streams. MINT is useful in several cross-language tasks including Cross-Language Information Retrieval (CLIR), Machine Translation (MT), and Cross-Language Named Entity Retrieval. We propose a novel approach to Transliteration Equivalence using language-neutral representations of names. The key idea is to consider name transliterations in two languages as two views of the same semantic object and compute a low-dimensional common feature space using Canonical Correlation Analysis (CCA). Similarity of the names in the common feature space forms the basis for classifying a pair of names as transliterations. We show that our approach outperforms state-of-the-art baselines in the CLIR task for Hindi-English (3 collections) and Tamil-English (2 collections).

international conference on the theory of information retrieval | 2009

A term is known by the company it keeps: On Selecting a Good Expansion Set in Pseudo-Relevance Feedback

Raghavendra Udupa; Abhijit Bhole; Pushpak Bhattacharyya

It is well known that pseudo-relevance feedback (PRF) improves the retrieval performance of Information Retrieval (IR) systems in general. However, a recent study by Cao et al. [3] has shown that a non-negligible fraction of expansion terms used by PRF algorithms are harmful to the retrieval. In other words, a PRF algorithm would be better off if it were to use only a subset of the feedback terms. The challenge then is to find a good expansion set from the set of all candidate expansion terms. A natural approach to the problem is to make a term independence assumption and use one or more term selection criteria or a statistical classifier to identify good expansion terms independently of each other. In this work, we challenge this approach and show empirically that a feedback term is neither good nor bad in itself in general; the behavior of a term depends very much on other expansion terms. Our finding implies that a good expansion set cannot be found by making a term independence assumption in general. As a principled solution to the problem, we propose spectral partitioning of expansion terms using a specific term-term interaction matrix. We demonstrate on several test collections that expansion terms can be partitioned into two sets and the best of the two sets gives substantial improvements in retrieval performance over model-based feedback.
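As a rough illustration of spectral partitioning into two sets, the sketch below splits items by the sign of the Fiedler vector of a graph Laplacian. The abstract does not define the paper's term-term interaction matrix, so the matrix `w` here is a made-up symmetric, non-negative example, and the function name `spectral_bipartition` and the power-iteration solver are assumptions chosen to keep the example dependency-free. Choosing the better of the two resulting sets, as the abstract describes, is not shown.

```python
def spectral_bipartition(w, iterations=200):
    """Split indices 0..n-1 into two sets using the sign of the Fiedler
    vector of the Laplacian L = D - W.

    w: symmetric, non-negative square matrix (list of lists).
    The Fiedler vector is approximated by power iteration on
    (shift * I - L), restricted to vectors orthogonal to the constant one.
    """
    n = len(w)
    degree = [sum(row) for row in w]
    # 2 * max degree is an upper bound on the largest Laplacian eigenvalue.
    shift = 2.0 * max(degree)

    # Deterministic, centred start vector (orthogonal to the constant vector).
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iterations):
        # y = (shift * I - D + W) x
        y = [
            (shift - degree[i]) * x[i] + sum(w[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
        mean = sum(y) / n
        y = [v - mean for v in y]  # project out the constant vector
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]

    left = [i for i in range(n) if x[i] >= 0]
    right = [i for i in range(n) if x[i] < 0]
    return left, right


# Hypothetical interaction matrix: terms 0 and 1 interact strongly,
# as do terms 2 and 3, with only weak links between the two pairs.
w = [
    [0.0, 1.0, 0.01, 0.0],
    [1.0, 0.0, 0.0, 0.01],
    [0.01, 0.0, 0.0, 1.0],
    [0.0, 0.01, 1.0, 0.0],
]
print(spectral_bipartition(w))
```

Which of the two sets is returned first depends on the sign of the converged vector, so callers should not rely on the order.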

languages and compilers for parallel computing | 2006

Optimal bitwise register allocation using integer linear programming

Rajkishore Barik; Christian Grothoff; Rahul Gupta; Vinayaka Pandit; Raghavendra Udupa

This paper addresses the problem of optimal global register allocation. The register allocation problem is expressed as an integer linear programming problem and solved optimally. The model is more flexible than previous graph-coloring based methods and thus allows for register allocations with significantly fewer moves and spills. The formulation can also model complex architectural features, such as bit-wise access to registers. With bit-wise access to registers, multiple subword temporaries can be stored in a single register and accessed efficiently, resulting in a register allocation problem that cannot be addressed effectively with simple graph coloring. The paper describes techniques that can help reduce the problem size of the ILP formulation, making the algorithm feasible in practice. Preliminary empirical results from an implementation prototype are reported.

Lecture Notes in Computer Science | 2001

Fast and Accurate Fingerprint Verification

Raghavendra Udupa; Gaurav Garg; Pramod Kumar Sharma

Speed and accuracy are the prerequisites of a biometric authentication system. However, most fingerprint verification methods compromise speed for accuracy or vice versa. In this paper we propose a novel fingerprint verification algorithm as a solution to this problem. The algorithm is inspired by a basic Computer Vision approach to model-based recognition of objects: alignment. We pose the problem of fingerprint matching as one of matching the corresponding feature sets. We propose a novel transformation consistency checking scheme to make verification accurate. We employ an early elimination strategy to eliminate inconsistent transformations and thereby achieve significant speed-up. Further speed-up is obtained by sampling based on geometric nearness. Our algorithm is simple, intuitive, easy to implement even on the simplest hardware and does not make any assumption on the availability of singularities like core and delta in the fingerprint. We report our results on three representative fingerprint databases.

component-based software engineering | 2005

Reusable dialog component framework for rapid voice application development

Rahul P. Akolkar; Tanveer A. Faruquie; Juan M. Huerta; Pankaj Kankar; Nitendra Rajput; Thiruvilwamalai V. Raman; Raghavendra Udupa; Abhishek Verma

Voice application development requires specialized speech-related skills besides general programming ability. Encapsulating the speech-specific behavior and complexities in prepackaged, configurable User Interface (UI) components will ease and expedite voice application development. These components can be used across applications and are called Reusable Dialog Components (RDCs). In this paper we propose a programming model and a framework for developing reusable dialog components. Our framework facilitates the development of voice applications via the encapsulation of interaction mechanisms, the encapsulation of best-of-breed practices (i.e. grammars, prompts, and configuration parameters), a modular design, and pluggable dialog management strategies. The framework extends the standard J2EE/JSP based programming model to make it suitable for voice applications.

ieee international conference on high performance computing data and analytics | 2000

Register Efficient Mergesorting

Abhiram G. Ranade; Sonal Kothari; Raghavendra Udupa

We present a register efficient implementation of Mergesort which we call FAME (Finite Automaton MErgesort). FAME is an m-way Mergesort. The m streams are merged by organizing comparison tournaments among the elements at the heads of the streams. The winners of the tournament form the output stream. Many ideas are used to increase efficiency. First, the heads of the streams are maintained in the register file. Second, the tournaments are evaluated incrementally, i.e. after one winner is output the next tournament uses the results of the comparisons performed in the preceding tournaments and thus minimizes work. Third, to minimize register movement, the state of the tournament is encoded as a finite automaton. We experimented with 8-way and 4-way FAME on an Ultrasparc and a DEC Alpha and found that these algorithms were better than cache-cognizant Quicksort algorithms on the same machines.
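The m-way merge described above can be sketched at a high level as follows. This is only an illustration of merging m sorted streams by repeatedly selecting the smallest head element; it uses a binary heap from Python's `heapq` module as the "tournament" and does not reproduce FAME's register-resident stream heads or its finite-automaton encoding of tournament state. The function names `m_way_merge` and `m_way_mergesort` are assumptions.

```python
import heapq


def m_way_merge(streams):
    """Merge sorted lists by repeatedly taking the smallest head element."""
    # Each heap entry is (head value, stream index, position in stream).
    heap = [(s[0], i, 0) for i, s in enumerate(streams) if s]
    heapq.heapify(heap)
    out = []
    while heap:
        value, i, pos = heapq.heappop(heap)  # winner of this round
        out.append(value)
        if pos + 1 < len(streams[i]):
            # The winner's stream advances; the heap keeps the outcome of
            # earlier comparisons among the other heads.
            heapq.heappush(heap, (streams[i][pos + 1], i, pos + 1))
    return out


def m_way_mergesort(items, m=4):
    """Sort by splitting into at most m runs, sorting each, then merging."""
    if m < 2:
        raise ValueError("m must be at least 2")
    if len(items) <= 1:
        return list(items)
    size = -(-len(items) // m)  # ceiling division
    runs = [m_way_mergesort(items[k:k + size], m)
            for k in range(0, len(items), size)]
    return m_way_merge(runs)


print(m_way_mergesort([5, 2, 9, 1, 5, 6, 0, 3], m=4))
```

The heap plays the role of the incremental tournament only loosely: after a winner is output, just the path affected by the new head is re-compared, rather than all m heads.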
