Owen Rambow
Columbia University
Publication
Featured research published by Owen Rambow.
meeting of the association for computational linguistics | 2005
Nizar Habash; Owen Rambow
We present an approach to using a morphological analyzer for tokenizing and morphologically tagging (including part-of-speech tagging) Arabic words in one process. We learn classifiers for individual morphological features, as well as ways of using these classifiers to choose among entries from the output of the analyzer. We obtain accuracy rates on all tasks in the high nineties.
international conference on computational linguistics | 2000
Srinivas Bangalore; Owen Rambow
Previous stochastic approaches to generation do not include a tree-based representation of syntax. While this may be adequate or even advantageous for some applications, other applications profit from using as much syntactic knowledge as is available, leaving to a stochastic model only those issues that are not determined by the grammar. We present initial results showing that a tree-based model derived from a tree-annotated corpus improves on a tree model derived from an unannotated corpus, and that a tree-based stochastic model with a hand-crafted grammar outperforms both.
international conference on natural language generation | 2000
Srinivas Bangalore; Owen Rambow; Steve Whittaker
Certain generation applications may profit from the use of stochastic methods. In developing stochastic methods, it is crucial to be able to quickly assess the relative merits of different approaches or models. In this paper, we present several types of intrinsic (system internal) metrics which we have used for baseline quantitative assessment. This quantitative assessment should then be augmented to a fuller evaluation that examines qualitative aspects. To this end, we describe an experiment that tests correlation between the quantitative metrics and human qualitative judgment. The experiment confirms that intrinsic metrics cannot replace human evaluation, but some correlate significantly with human judgments of quality and understandability and can be used for evaluation during development.
meeting of the association for computational linguistics | 2008
Ryan M. Roth; Owen Rambow; Nizar Habash; Mona T. Diab; Cynthia Rudin
We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that, for all tasks we consider, both modeling the lexeme explicitly and retuning the weights of individual classifiers for the specific task improve performance.
international conference on software engineering | 2001
Scott P. Overmyer; Benoit Lavoie; Owen Rambow
Despite the advantages that object technology can provide to the software development community and its customers, the fundamental problems associated with identifying objects, their attributes, and methods remain: it is a largely manual process driven by heuristics that analysts acquire through experience. While a number of methods exist for requirements development and specification, very few tools exist to assist analysts in making the transition from textual descriptions to other notations for object-oriented analysis and other conceptual models. We describe a methodology and a prototype tool, Linguistic Assistant for Domain Analysis (LIDA), which provide linguistic assistance in the model development process. We first present our methodology for conceptual modeling through linguistic analysis. We give an overview of LIDA's functionality and present its technical design and the functionality of its components. We also provide a comparison of LIDA's functionality with that of other research prototypes. Finally, we present an example of how LIDA is used in a conceptual modeling task.
meeting of the association for computational linguistics | 1995
Owen Rambow; K. Vijay-Shanker; David J. Weir
D-Tree Grammars (DTG) are designed to share some of the advantages of Tree Adjoining Grammar (TAG) while overcoming some of its limitations. DTG involve two composition operations called subsertion and sister-adjunction. The most distinctive feature of DTG is that, unlike TAG, there is complete uniformity in the way that the two DTG operations relate lexical items: subsertion always corresponds to complementation and sister-adjunction to modification. Furthermore, DTG, unlike TAG, can provide a uniform analysis for wh-movement in English and Kashmiri, despite the fact that the wh-element in Kashmiri appears in sentence-second position, and not sentence-initial position as in English.
meeting of the association for computational linguistics | 2006
Nizar Habash; Owen Rambow
We present MAGEAD, a morphological analyzer and generator for the Arabic language family. Our work is novel in that it explicitly addresses the need for processing the morphology of the dialects. MAGEAD performs an on-line analysis to or generation from a root+pattern+features representation, has separate phonological and orthographic representations, and allows for combining morphemes from different dialects. We present a detailed evaluation of MAGEAD.
north american chapter of the association for computational linguistics | 2001
Marilyn A. Walker; Owen Rambow; Monica Rogati
Sentence planning is a set of inter-related but distinct tasks, one of which is sentence scoping, i.e. the choice of syntactic structure for elementary speech acts and the decision of how to combine them into one or more sentences. In this paper, we present SPoT, a sentence planner, and a new methodology for automatically training SPoT on the basis of feedback provided by human judges. We reconceptualize the task into two distinct phases. First, a very simple, randomized sentence-plan-generator (SPG) generates a potentially large list of possible sentence plans for a given text-plan input. Second, the sentence-plan-ranker (SPR) ranks the list of output sentence plans, and then selects the top-ranked plan. The SPR uses ranking rules automatically learned from training data. We show that the trained SPR learns to select a sentence plan whose rating on average is only 5% worse than the top human-ranked sentence plan.
linguistic annotation workshop | 2009
Rajesh Bhatt; Bhuvana Narasimhan; Martha Palmer; Owen Rambow; Dipti Misra Sharma; Fei Xia
This paper describes the simultaneous development of dependency structure and phrase structure treebanks for Hindi and Urdu, as well as a PropBank. The dependency structure and the PropBank are manually annotated, and then the phrase structure treebank is produced automatically. To ensure successful conversion, the development of the guidelines for all three representations is carefully coordinated.
north american chapter of the association for computational linguistics | 2007
Nizar Habash; Owen Rambow
We present a diacritization system for written Arabic which is based on a lexical resource. It combines a tagger and a lexeme language model. It improves on the best results reported in the literature.