Esther Levin
Bell Labs
Publication
Featured research published by Esther Levin.
IEEE Transactions on Speech and Audio Processing | 2000
Esther Levin; Roberto Pieraccini; Wieland Eckert
We propose a quantitative model for dialog systems that can be used for learning the dialog strategy. We claim that the problem of dialog design can be formalized as an optimization problem with an objective function reflecting different dialog dimensions relevant for a given application. We also show that any dialog system can be formally described as a sequential decision process in terms of its state space, action set, and strategy. With additional assumptions about the state transition probabilities and cost assignment, a dialog system can be mapped to a stochastic model known as a Markov decision process (MDP). A variety of data-driven algorithms for finding the optimal strategy (i.e., the one that optimizes the criterion) is available within the MDP framework, based on reinforcement learning. For an effective use of the available training data we propose a combination of supervised and reinforcement learning: the supervised learning is used to estimate a model of the user, i.e., the MDP parameters that quantify the user's behavior. Then a reinforcement learning algorithm is used to estimate the optimal strategy while the system interacts with the simulated user. This approach is tested for learning the strategy in an air travel information system (ATIS) task. The experimental results we present in this paper show that it is indeed possible to find a simple criterion, a state space representation, and a simulated user parameterization in order to automatically learn a relatively complex dialog behavior, similar to one that was heuristically designed by several research groups.
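The two-stage idea in this abstract (estimate a user model from data, then learn a strategy against the simulated user) can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-slot task, the `ask`/`close` action set, the cost values, the made-up logged corpus, and the choice of tabular Q-learning as the reinforcement learning algorithm.

```python
import random

N_SLOTS = 2                 # hypothetical task: two slots to fill
ACTIONS = ("ask", "close")  # ask for the next slot, or end the dialog
TURN_COST = 1.0             # assumed cost per system turn
MISSING_COST = 10.0         # assumed cost per unfilled slot at closing


def estimate_user_model(logged_answers):
    """Supervised step: relative-frequency estimate of P(user fills a slot | ask)."""
    return sum(logged_answers) / len(logged_answers)


def simulate_turn(state, action, p_answer, rng):
    """Simulated user. Returns (cost, next_state); next_state is None when the dialog ends."""
    if action == "close":
        return TURN_COST + MISSING_COST * (N_SLOTS - state), None
    if state < N_SLOTS and rng.random() < p_answer:
        state += 1  # the user supplied the requested slot
    return TURN_COST, state


def learn_strategy(p_answer, episodes=5000, alpha=0.1, epsilon=0.2, seed=0):
    """Reinforcement step: tabular Q-learning on expected cost (lower is better)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_SLOTS + 1) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # cap on dialog length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                action = min(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            cost, nxt = simulate_turn(state, action, p_answer, rng)
            future = 0.0 if nxt is None else min(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            if nxt is None:
                break
            state = nxt
    # Greedy strategy: the lowest-cost action for each number of filled slots.
    return {s: min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_SLOTS + 1)}


logged = [True] * 8 + [False] * 2  # made-up corpus: 8 of 10 questions answered
p_hat = estimate_user_model(logged)
strategy = learn_strategy(p_hat)
print(p_hat, strategy)
```

With these assumed costs, asking is much cheaper than closing with missing slots, so the learned strategy should keep asking until both slots are filled and only then close the dialog.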
Neural Computation | 1994
Vladimir Vapnik; Esther Levin; Yann Le Cun
A method for measuring the capacity of learning machines is described. The method is based on fitting a theoretically derived function to empirical measurements of the maximal difference between the error rates on two separate data sets of varying sizes. Experimental measurements of the capacity of various types of linear classifiers are presented.
Proceedings of the IEEE | 1990
Esther Levin; Naftali Tishby; Sara A. Solla
A general statistical description of the problem of learning from examples is presented. Learning in layered networks is posed as a search in the network parameter space for a network that minimizes an additive error function of statistically independent examples. By imposing the equivalence of the minimum error and the maximum likelihood criteria for training the network, the Gibbs distribution on the ensemble of networks with a fixed architecture is derived. The probability of correct prediction of a novel example can be expressed using the ensemble, serving as a measure of the network's generalization ability. The entropy of the prediction distribution is shown to be a consistent measure of the network's performance. The proposed formalism is applied to the problems of selecting an optimal architecture and the prediction of learning curves.
IEEE Automatic Speech Recognition and Understanding Workshop | 1997
Wieland Eckert; Esther Levin; Roberto Pieraccini
Automatic speech dialogue systems are becoming common. In order to assess their performance, a large sample of real dialogues has to be collected and evaluated. This process is expensive, labor intensive, and prone to errors. To alleviate this situation we propose a user simulation to conduct dialogues with the system under investigation. Using stochastic modeling of real users we can both debug and evaluate a speech dialogue system while it is still in the lab, thus substantially reducing the amount of field testing with real users.
International Conference on Acoustics, Speech, and Signal Processing | 1992
Esther Levin; Roberto Pieraccini
The authors extend the dynamic time warping (DTW) algorithm, widely used in automatic speech recognition (ASR), to a dynamic plane warping (DPW) algorithm, for application in the field of optical character recognition (OCR) or similar applications. Although direct application of the optimality principle reduces the computational complexity somewhat, the DPW (or image alignment) problem is exponential in the dimensions of the image. It is shown that by applying constraints to the image alignment problem, e.g., limiting the class of possible distortions, one can reduce the computational complexity dramatically, and find the optimal solution to the constrained problem in linear time. A statistical model, the planar hidden Markov model (PHMM), describing statistical properties of images is proposed. The PHMM approach was evaluated using a set of isolated handwritten digits. An overall digit recognition accuracy of 95% was achieved. It is expected that the advantage of this approach will be even more significant for harder tasks, such as cursive-writing recognition and spotting.
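For readers unfamiliar with the starting point of this abstract, the following is a minimal sketch of standard one-dimensional DTW, the algorithm the authors extend. It is not the DPW or PHMM method itself. The function name `dtw_distance`, the absolute-difference local cost, and the unconstrained step pattern are assumptions chosen for illustration.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    d[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            # Optimality principle: extend the cheapest of the three predecessors.
            d[i][j] = local + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# A sequence and a time-stretched copy of it align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

The table has one cell per pair of positions, so the one-dimensional problem is quadratic in the sequence lengths. The abstract's point is that the analogous alignment of two-dimensional images is exponential unless the class of allowed distortions is constrained.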
International Conference on Acoustics, Speech, and Signal Processing | 1998
Esther Levin; Roberto Pieraccini; Wieland Eckert
We introduce a stochastic model for dialogue systems based on the Markov decision process. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem, and solved by a variety of methods, including the reinforcement learning approach. The advantages of this new paradigm include objective evaluation of dialogue systems and their automatic design and adaptation. We show some preliminary results on learning a dialogue strategy for an air travel information system.
International Conference on Acoustics, Speech, and Signal Processing | 1990
Esther Levin
Neural networks are used to model nonlinear and time-varying systems. The proposed model attempts to cope with the time variability of such systems by adding an undetermined control input which modulates the mapping implemented by the network. The network architecture proposed, the hidden control neural network (HCNN), combines nonlinear prediction of conventional neural networks with hidden Markov modeling. This network is trained using an algorithm that is based on back-propagation and segmentation algorithms for estimating the unknown control together with the network's parameters. The HCNN approach is evaluated on multispeaker recognition of connected digits, yielding a word accuracy of 99.3%.
IEEE Automatic Speech Recognition and Understanding Workshop | 1997
Esther Levin; Roberto Pieraccini; Wieland Eckert
We introduce a stochastic model for dialogue systems based on the Markov decision process. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem, and solved by a variety of methods, including the reinforcement learning approach. The advantages of this new paradigm include objective evaluation of dialogue systems and their automatic design and adaptation. We show some preliminary results on learning a dialogue strategy for an air travel information system.
International Conference on Acoustics, Speech, and Signal Processing | 1992
Roberto Pieraccini; Evelyne Tzoukermann; Zakhar Gorelov; Jean-Luc Gauvain; Esther Levin; Chin-Hui Lee; Jay G. Wilpon
An understanding system, designed for both speech and text input, has been implemented based on statistical representation of task specific semantic knowledge. The core of the system is the conceptual decoder, which extracts the words and their association to the conceptual structure of the task directly from the acoustic signal. The conceptual information, which is also used to clarify the English sentences, is encoded following a statistical paradigm. A template generator and an SQL (structured query language) translator process the sentence and produce SQL code for querying a relational database. Results of the system on the official DARPA test are given.
International Conference on Acoustics, Speech, and Signal Processing | 1993
Oscar E. Agazzi; Shyh-shiaw Kuo; Esther Levin; Roberto Pieraccini
An algorithm for connected text recognition using enhanced planar hidden Markov models (PHMMs) is presented. The algorithm automatically segments text into characters (even if they are highly blurred and touching) as an integral part of the recognition process, thus jointly optimizing segmentation and recognition. Performance is enhanced by the use of state length models, transition probabilities among characters (bigrams), and grammars. Experiments are presented using: (1) a simulated database of over 24000 highly degraded images of city names and (2) a database of 6000 images rejected by a high-performance commercial OCR (optical character recognition) machine with 99.5% accuracy. Measured performance on the first database is 99.65% for the most degraded images when a grammar is used, and 98.76% in the second database. Traditional OCR algorithms would fail drastically on these images.