Publication


Featured research published by Nikesh Garera.


international joint conference on natural language processing | 2009

Modeling Latent Biographic Attributes in Conversational Genres

Nikesh Garera; David Yarowsky

This paper presents and evaluates several original techniques for the latent classification of biographic attributes such as gender, age and native language, in diverse genres (conversation transcripts, email) and languages (Arabic, English). First, we present a novel partner-sensitive model for extracting biographic attributes in conversations, given the differences in lexical usage and discourse style such as observed between same-gender and mixed-gender conversations. Then, we explore a rich variety of novel sociolinguistic and discourse-based features, including mean utterance length, passive/active usage, percentage domination of the conversation, speaking rate and filler word usage. Cumulatively up to 20% error reduction is achieved relative to the standard Boulis and Ostendorf (2005) algorithm for classifying individual conversations on Switchboard, and accuracy for gender detection on the Switchboard corpus (aggregate) and Gulf Arabic corpus exceeds 95%.


conference on computational natural language learning | 2009

Improving Translation Lexicon Induction from Monolingual Corpora via Dependency Contexts and Part-of-Speech Equivalences

Nikesh Garera; Chris Callison-Burch; David Yarowsky

This paper presents novel improvements to the induction of translation lexicons from monolingual corpora using multilingual dependency parses. We introduce a dependency-based context model that incorporates long-range dependencies, variable context sizes, and reordering. It provides a 16% relative improvement over the baseline approach that uses a fixed context window of adjacent words. Its Top 10 accuracy for noun translation is higher than that of a statistical translation model trained on a Spanish-English parallel corpus containing 100,000 sentence pairs. We generalize the evaluation to other word-types, and show that the performance can be increased to 18% relative by preserving part-of-speech equivalencies during translation.


conference on information and knowledge management | 2007

The role of documents vs. queries in extracting class attributes from text

Marius Pasca; Benjamin Van Durme; Nikesh Garera

Challenging the implicit reliance on document collections, this paper discusses the pros and cons of using query logs rather than document collections, as self-contained sources of data in textual information extraction. The differences are quantified as part of a large-scale study on extracting prominent attributes or quantifiable properties of classes (e.g., top speed, price and fuel consumption for CarModel) from unstructured text. In a head-to-head qualitative comparison, a lightweight extraction method produces class attributes that are 45% more accurate on average, when acquired from query logs rather than Web documents.


meeting of the association for computational linguistics | 2009

Structural, Transitive and Latent Models for Biographic Fact Extraction

Nikesh Garera; David Yarowsky

This paper presents six novel approaches to biographic fact extraction that model structural, transitive and latent properties of biographical data. The ensemble of these proposed models substantially outperforms standard pattern-based biographic fact extraction methods and performance is further improved by modeling inter-attribute correlations and distributions over functions of attributes, achieving an average extraction accuracy of 80% over seven types of biographic attributes.


meeting of the association for computational linguistics | 2007

JHU1 : An Unsupervised Approach to Person Name Disambiguation using Web Snippets

Delip Rao; Nikesh Garera; David Yarowsky

This paper presents an approach to person name disambiguation using K-means clustering on rich-feature-enhanced document vectors, augmented with additional web-extracted snippets surrounding the polysemous names to facilitate term bridging. This yields a significant F-measure improvement on the shared task training data set. The paper also illustrates the significant divergence between the properties of the training and test data in this shared task, substantially skewing results. Our system optimized on F0.2 rather than F0.5 would have achieved top performance in the shared task.


conference on computational natural language learning | 2006

Resolving and Generating Definite Anaphora by Modeling Hypernymy using Unlabeled Corpora

Nikesh Garera; David Yarowsky

We demonstrate an original and successful approach for both resolving and generating definite anaphora. We propose and evaluate unsupervised models for extracting hypernym relations by mining cooccurrence data of definite NPs and potential antecedents in an unlabeled corpus. The algorithm outperforms a standard WordNet-based approach to resolving and generating definite anaphora. It also substantially outperforms recent related work using pattern-based extraction of such hypernym relations for coreference resolution.


meeting of the association for computational linguistics | 2009

Arabic cross-document coreference detection

Asad B. Sayeed; Tamer Elsayed; Nikesh Garera; David Alexander; Tan Xu; Douglas W. Oard; David Yarowsky; Christine D. Piatko

We describe a set of techniques for Arabic cross-document coreference resolution. We compare a baseline system of exact mention string-matching to ones that include local mention context information as well as information from an existing machine translation system. It turns out that the machine translation-based technique outperforms the baseline, but local entity context similarity does not. This helps to point the way for future cross-document coreference work in languages with few existing resources for the task.


international conference on tools with artificial intelligence | 2006

A Briefing Tool that Learns Individual Report-Writing Behavior

Mohit Kumar; Nikesh Garera; Alexander I. Rudnicky

We describe a briefing system that learns to predict the contents of reports generated by users who create periodic (weekly) reports as part of their normal activity. We address the question of whether data derived from the implicit supervision provided by end-users is robust enough to support not only model parameter tuning but also a form of feature discovery. The system was evaluated under realistic conditions, by collecting data in a project-based university course where student group leaders were tasked with preparing weekly reports for the benefit of the instructors, using the material from individual student reports.


national conference on artificial intelligence | 2009

Cross-Document Coreference Resolution: A Key Technology for Learning by Reading

James Mayfield; David Alexander; Bonnie J. Dorr; Jason Eisner; Tamer Elsayed; Tim Finin; Marjorie Freedman; Nikesh Garera; Paul McNamee; Saif M. Mohammad; Douglas W. Oard; Christine D. Piatko; Asad B. Sayeed; Zareen Syed; Ralph M. Weischedel; Tan Xu; David Yarowsky


text analysis conference | 2009

HLTCOE Approaches to Knowledge Base Population at TAC 2009

Paul McNamee; Mark Dredze; Adam Gerber; Nikesh Garera; Tim Finin; James Mayfield; Christine D. Piatko; Delip Rao; David Yarowsky; Markus Dreyer

Collaboration


Dive into Nikesh Garera's collaborations.

Top Co-Authors

David Yarowsky

Johns Hopkins University

Delip Rao

Johns Hopkins University

Paul McNamee

Johns Hopkins University

James Mayfield

Johns Hopkins University Applied Physics Laboratory

Tim Finin

University of Maryland
