Yves Schabes

Mitsubishi Electric Research Laboratories

Publication

Featured research published by Yves Schabes.

Journal of Logic Programming | 1995

Principles and implementation of deductive parsing

Stuart M. Shieber; Yves Schabes; Fernando Pereira

We present a system for generating parsers based directly on the metaphor of parsing as deduction. Parsing algorithms can be represented directly as deduction systems, and a single deduction engine can interpret such deduction systems so as to implement the corresponding parser. The method generalizes easily to parsers for augmented phrase structure formalisms, such as definite-clause grammars and other logic grammar formalisms, and has been used for rapid prototyping of parsing algorithms for a variety of formalisms including variants of tree-adjoining grammars, categorial grammars, and lexicalized context-free grammars.
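The parsing-as-deduction idea can be illustrated with a minimal sketch. It is not the system described in the paper: it assumes a grammar in Chomsky normal form, uses CKY-style items `(A, i, j)`, and the names `cky_recognize`, `lexical`, and `binary` are hypothetical. An agenda holds newly derived items, and a chart holds items whose consequences have already been explored.

```python
from collections import deque

def cky_recognize(words, lexical, binary, start="S"):
    """Agenda-driven deduction over CKY items (A, i, j).

    lexical: dict mapping a word to the set of nonterminals A with A -> word
    binary:  dict mapping (B, C) to the set of nonterminals A with A -> B C
    (Both encodings are assumptions made for this sketch.)
    """
    n = len(words)
    chart = set()
    # Axioms: one item per lexical rule that matches an input word.
    agenda = deque((a, i, i + 1)
                   for i, w in enumerate(words)
                   for a in lexical.get(w, ()))
    while agenda:
        item = agenda.popleft()
        if item in chart:
            continue
        chart.add(item)
        b, i, j = item
        for (c, j2, k) in list(chart):
            # Inference rule: (B, i, j), (C, j, k) |- (A, i, k) if A -> B C.
            if j2 == j:
                for a in binary.get((b, c), ()):
                    agenda.append((a, i, k))
            # The new item may also be the right-hand antecedent.
            if k == i:
                for a in binary.get((c, b), ()):
                    agenda.append((a, j2, j))
    # Goal item: the start symbol spanning the whole input.
    return (start, 0, n) in chart
```

Swapping in a different set of axioms, inference rules, and goal item would give a different parsing algorithm while the agenda loop stays the same, which is the point the abstract makes about a single deduction engine.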

human language technology | 1992

Inside-outside reestimation from partially bracketed corpora

Fernando Pereira; Yves Schabes

The inside-outside algorithm for inferring the parameters of a stochastic context-free grammar is extended to take advantage of constituent information in a partially parsed corpus. Experiments on formal and natural language parsed corpora show that the new algorithm can achieve faster convergence and better modelling of hierarchical structure than the original one. In particular, over 90% of the constituents in the most likely analyses of a test set are compatible with test set constituents for a grammar trained on a corpus of 700 hand-parsed part-of-speech strings for ATIS sentences.
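As a rough illustration of how bracketing can constrain the inside pass, the sketch below computes the inside probability of a string under a grammar in Chomsky normal form and skips any span that crosses a bracket of the partial bracketing. The function names, the dictionary encodings of the rules, and the fencepost convention for spans are assumptions of this sketch, not details taken from the paper.

```python
from collections import defaultdict

def crosses(span, brackets):
    """True if span (i, j) overlaps some bracket (a, b) without nesting."""
    i, j = span
    return any(i < a < j < b or a < i < b < j for a, b in brackets)

def inside_probability(tags, lexical, binary, brackets=(), start="S"):
    """Inside probability of `tags`, restricted to bracket-compatible spans.

    lexical: dict (A, tag) -> probability of A -> tag
    binary:  dict (A, B, C) -> probability of A -> B C
    brackets: iterable of (a, b) fencepost pairs covering tags[a:b]
    """
    n = len(tags)
    beta = defaultdict(float)
    # Width-1 spans: lexical rules only; such spans can never cross a bracket.
    for i, t in enumerate(tags):
        for (a, tag), p in lexical.items():
            if tag == t:
                beta[a, i, i + 1] += p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            if crosses((i, k), brackets):
                continue  # incompatible span keeps inside probability 0
            for j in range(i + 1, k):
                for (a, b, c), p in binary.items():
                    beta[a, i, k] += p * beta[b, i, j] * beta[c, j, k]
    return beta[start, 0, n]
```

With an empty bracketing the function reduces to the ordinary inside computation; each added bracket removes the analyses that are inconsistent with it.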

international conference on computational linguistics | 1990

Synchronous tree-adjoining grammars

Stuart M. Shieber; Yves Schabes

The unique properties of tree-adjoining grammars (TAG) present a challenge for the application of TAGs beyond the limited confines of syntax, for instance, to the task of semantic interpretation or automatic translation of natural language. We present a variant of TAGs, called synchronous TAGs, which characterize correspondences between languages. The formalism's intended usage is to relate expressions of natural languages to their associated semantics represented in a logical form language, or to their translates in another natural language; in summary, we intend it to allow TAGs to be used beyond their role in syntax proper. We discuss the application of synchronous TAGs to concrete examples, mentioning primarily in passing some computational issues that arise in its interpretation.

international conference on computational linguistics | 1992

Stochastic lexicalized tree-adjoining grammars

Yves Schabes

The notion of stochastic lexicalized tree-adjoining grammar (SLTAG) is formally defined. The parameters of a SLTAG correspond to the probability of combining two structures, each one associated with a word. The characteristics of SLTAG are unique and novel since it is lexically sensitive (as N-gram models or Hidden Markov Models) and yet hierarchical (as stochastic context-free grammars). Then, two basic algorithms for SLTAG are introduced: an algorithm for computing the probability of a sentence generated by a SLTAG and an inside-outside-like iterative algorithm for estimating the parameters of a SLTAG given a training corpus. Finally, we show how SLTAG makes it possible to define a lexicalized version of stochastic context-free grammars and we report preliminary experiments showing some of the advantages of SLTAG over stochastic context-free grammars.

meeting of the association for computational linguistics | 1996

Combining Trigram-Based and Feature-Based Methods for Context-Sensitive Spelling Correction

Andrew R. Golding; Yves Schabes

This paper addresses the problem of correcting spelling errors that result in valid, though unintended words (such as peace and piece, or quiet and quite) and also the problem of correcting particular word usage errors (such as amount and number, or among and between). Such corrections require contextual information and are not handled by conventional spelling programs such as Unix spell. First, we introduce a method called Trigrams that uses part-of-speech trigrams to encode the context. This method uses a small number of parameters compared to previous methods based on word trigrams. However, it is effectively unable to distinguish among words that have the same part of speech. For this case, an alternative feature-based method called Bayes performs better; but Bayes is less effective than Trigrams when the distinction among words depends on syntactic constraints. A hybrid method called Tribayes is then introduced that combines the best of the previous two methods. The improvement in performance of Tribayes over its components is verified experimentally. Tribayes is also compared with the grammar checker in Microsoft Word, and is found to have substantially higher performance.

international conference on computational linguistics | 1992

A freely available wide coverage morphological analyzer for English

Daniel Karp; Yves Schabes; Martin Zaidel; Dania Egedi

This paper presents a morphological lexicon for English that handles more than 317,000 inflected forms derived from over 90,000 stems. The lexicon is available in two formats. The first can be used by an implementation of a two-level processor for morphological analysis (Karttunen and Wittenburg, 1983; Antworth, 1990). The second, derived from the first one for efficiency reasons, consists of a disk-based database using a UNIX hash table facility (Seltzer and Yigit, 1991). We also built an X Window tool to facilitate the maintenance and browsing of the lexicon. The package is ready to be integrated into a natural language application such as a parser through hooks written in Lisp and C. To our knowledge, this package is the only available free English morphological analyzer with very wide coverage.

meeting of the association for computational linguistics | 1988

An Earley-Type Parsing Algorithm for Tree Adjoining Grammars

Yves Schabes; Aravind K. Joshi

We will describe an Earley-type parser for Tree Adjoining Grammars (TAGs). Although a CKY-type parser for TAGs has been developed earlier (Vijay-Shanker and Joshi, 1985), this is the first practical parser for TAGs because as is well known for CFGs, the average behavior of Earley-type parsers is superior to that of CKY-type parsers. The core of the algorithm is described. Then we discuss modifications of the parsing algorithm that can parse extensions of TAGs such as constraints on adjunction, substitution, and feature structures for TAGs. We show how with the use of substitution in TAGs the system is able to parse directly CFGs and TAGs. The system parses unification formalisms that have a CFG skeleton and also those with a TAG skeleton. Thus it also allows us to embed the essential aspects of PATR-II.

conference of the european chapter of the association for computational linguistics | 1993

Parsing the Wall Street Journal with the inside-outside algorithm

Yves Schabes; Michal Roth; Randy B. Osborne

We report grammar inference experiments on partially parsed sentences taken from the Wall Street Journal corpus using the inside-outside algorithm for stochastic context-free grammars. The initial grammar for the inference process makes no assumption of the kinds of structures and their distributions. The inferred grammar is evaluated by its predicting power and by comparing the bracketing of held out sentences imposed by the inferred grammar with the partial bracketings of these sentences given in the corpus. Using part-of-speech tags as the only source of lexical information, high bracketing accuracy is achieved even with a small subset of the available training material (1045 sentences): 94.4% for test sentences shorter than 10 words and 90.2% for sentences shorter than 15 words.
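One plausible reading of the bracketing comparison is a crossing-brackets measure: the fraction of constituents proposed by the grammar that do not cross any bracket in the corpus annotation. The sketch below follows that reading; the function name, the span encoding as fencepost pairs, and the treatment of an empty prediction are assumptions, and the paper's exact metric may differ.

```python
def bracketing_accuracy(predicted, gold):
    """Fraction of predicted spans that cross no gold span.

    predicted, gold: iterables of (start, end) fencepost pairs.
    Two spans cross when they overlap but neither contains the other.
    """
    predicted = list(predicted)
    gold = list(gold)
    if not predicted:
        return 1.0  # assumption: nothing predicted means nothing crossed

    def is_compatible(span):
        i, j = span
        return not any(i < a < j < b or a < i < b < j for a, b in gold)

    compatible = sum(1 for span in predicted if is_compatible(span))
    return compatible / len(predicted)
```

Because the corpus bracketing is only partial, a measure of this kind rewards consistency with the annotated brackets without penalizing extra constituents that the annotation leaves unspecified.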

international conference on computational linguistics | 1992

Structure sharing in lexicalized tree-adjoining grammars

K. Vijay-Shanker; Yves Schabes

We present a scheme for efficiently representing a lexicalized tree-adjoining grammar (LTAG). The proposed representational scheme allows for structure-sharing between lexical entries and the trees associated with the lexical items. A compact organization is achieved by organizing the lexicon in a hierarchical fashion and using inheritance as well as by using lexical and syntactic rules. While different organizations (Flickinger, 1987; Pollard and Sag, 1987; Shieber, 1986) of the lexicon have been proposed, in the scheme we propose, the inheritance hierarchy not only provides structure-sharing of lexical information but also of the associated elementary trees of extended domain of locality. Furthermore, the lexical and syntactic rules can be used to derive new elementary trees from the default structures specified in the hierarchical lexicon. In the envisaged scheme, the use of a hierarchical lexicon and of lexical and syntactic rules for lexicalized tree-adjoining grammars will capture important linguistic generalizations and also allow for a space-efficient representation of the grammar. This will allow for easy maintenance and facilitate updates to the grammar.

meeting of the association for computational linguistics | 1991

Polynomial Time and Space Shift-Reduce Parsing of Arbitrary Context-free Grammars

Yves Schabes

We introduce an algorithm for designing a predictive left-to-right shift-reduce non-deterministic push-down machine corresponding to an arbitrary unrestricted context-free grammar and an algorithm for efficiently driving this machine in pseudo-parallel. The performance of the resulting parser is formally proven to be superior to Earley's parser (1970). The technique employed consists in constructing before run-time a parsing table that encodes a non-deterministic machine in which the predictive behavior has been compiled out. At run time, the machine is driven in pseudo-parallel with the help of a chart. The recognizer behaves in the worst case in O(|G|^2 n^3)-time and O(|G| n^2)-space. However, in practice it is always superior to Earley's parser since the prediction steps have been compiled before run-time. Finally, we explain how other more efficient variants of the basic parser can be obtained by determinizing portions of the basic non-deterministic push-down machine while still using the same pseudo-parallel driver.
