Hiyan Alshawi | Researchain


Publication

Featured research published by Hiyan Alshawi.

Finite State Methods and Natural Language Processing | 2000

Learning dependency translation models as collections of finite-state head transducers

Hiyan Alshawi; Shona Douglas; Srinivas Bangalore

The paper defines weighted head transducers, finite-state machines that perform middle-out string transduction. These transducers are strictly more expressive than the special case of standard left-to-right finite-state transducers. Dependency transduction models are then defined as collections of weighted head transducers that are applied hierarchically. A dynamic programming search algorithm is described for finding the optimal transduction of an input string with respect to a dependency transduction model. A method for automatically training a dependency transduction model from a set of input-output example strings is presented. The method first searches for hierarchical alignments of the training examples guided by correlation statistics, and then constructs the transitions of head transducers that are consistent with these alignments. Experimental results are given for applying the training method to translation from English to Spanish and Japanese.

Meeting of the Association for Computational Linguistics | 1998

Automatic Acquisition of Hierarchical Transduction Models for Machine Translation

Hiyan Alshawi; Srinivas Bangalore; Shona Douglas

We describe a method for the fully automatic learning of hierarchical finite-state translation models. The input to the method is transcribed speech utterances and their corresponding human translations, and the output is a set of head transducers, i.e. statistical lexical head-outward transducers. A word-alignment function and a head-ranking function are first obtained, and then counts are generated for hypothesized state transitions of head transducers whose lexical translations and word order changes are consistent with the alignment. The method has been applied to create an English-Spanish translation model for a speech translation application, with word accuracy of over 75% as measured by a string-distance comparison to three reference translations.
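The abstract above does not spell out the exact string-distance metric. As a rough sketch only, assume that word accuracy is one minus the word-level edit distance divided by the reference length, and that the best score over the available references is reported. Both are assumptions, not details taken from the paper, and the names `edit_distance` and `word_accuracy` are illustrative.

```python
def edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete hypothesis word h
                            curr[j - 1] + 1,      # insert reference word r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_accuracy(hypothesis, references):
    """Best accuracy over all references, floored at 0.0 (assumed definition)."""
    hyp = hypothesis.split()
    best = 0.0
    for reference in references:
        ref = reference.split()
        if not ref:
            continue  # skip empty references to avoid division by zero
        accuracy = 1.0 - edit_distance(hyp, ref) / len(ref)
        best = max(best, accuracy)
    return best
```

Calling `word_accuracy("quiero un vuelo a boston", refs)` with a list of three reference strings scores the hypothesis against each reference and keeps the highest value; the example sentence is invented for illustration.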

North American Chapter of the Association for Computational Linguistics | 2003

Effective utterance classification with unsupervised phonotactic models

Hiyan Alshawi

This paper describes a method for utterance classification that does not require manual transcription of training data. The method combines domain-independent acoustic models with off-the-shelf classifiers to give utterance classification performance that is surprisingly close to what can be achieved using conventional word-trigram recognition requiring manual transcription. In our method, unsupervised training is first used to train a phone n-gram model for a particular domain; the output of recognition with this model is then passed to a phone-string classifier. The classification accuracy of the method is evaluated on three different spoken language system domains.
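To make the second stage concrete, here is a minimal sketch of a phone-string classifier, assuming the recognizer has already produced space-separated phone strings. The abstract only says that off-the-shelf classifiers are used, so the multinomial naive Bayes model below is a stand-in chosen for brevity, not the classifier from the paper. The class name, the phone symbols, and the labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def phone_ngrams(phones, n_max=3):
    """All phone n-grams of order 1..n_max from a list of phone symbols."""
    return [tuple(phones[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(phones) - n + 1)]


class NaiveBayesPhoneClassifier:
    """Multinomial naive Bayes over phone n-gram counts (illustrative stand-in)."""

    def __init__(self, n_max=3, alpha=1.0):
        self.n_max = n_max
        self.alpha = alpha  # add-alpha smoothing for unseen n-grams
        self.class_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, phone_strings, labels):
        for phones, label in zip(phone_strings, labels):
            feats = phone_ngrams(phones.split(), self.n_max)
            self.class_counts[label] += 1
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, phones):
        feats = phone_ngrams(phones.split(), self.n_max)
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(count / total)  # class prior
            for feat in feats:
                score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A classifier trained with `fit(["b ih l", "f ih k s"], ["billing", "repair"])` can then label a new recognizer output with `predict("b ih l z")`. In the method described above the phone strings would come from recognition with the unsupervised phone n-gram model; that stage is not shown here.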

International Conference on Acoustics, Speech, and Signal Processing | 2002

Combining prior knowledge and boosting for call classification in spoken language dialogue

Marie Rochery; Robert E. Schapire; Mazin G. Rahim; Narendra K. Gupta; Giuseppe Riccardi; Srinivas Bangalore; Hiyan Alshawi; Shona Douglas

Data collection and annotation are major bottlenecks in rapid development of accurate syntactic and semantic models for natural-language dialogue systems. In this paper we show how human knowledge can be used when designing a language understanding system in a manner that would alleviate the dependence on large sets of data. In particular, we extend BoosTexter, a member of the boosting family of algorithms, to combine and balance hand-crafted rules with the statistics of available data. Experiments on two voice-enabled applications for customer care and help desk are presented.

Philosophical Transactions of the Royal Society A | 2000

Learning dependency transduction models from unannotated examples

Hiyan Alshawi; Shona Douglas

We present a method for constructing a statistical machine translation system automatically from unannotated examples in a manner consistent with the principles of dependency grammar. The method involves learning a generative statistical model of paired dependency derivations of source and target sentences. Such a dependency transduction model consists of collections of weighted head transducers. Head transducers are finite-state machines with different formal properties from ‘standard’ finite-state transducers. When applied to machine translation, the acquired head transducers are applied ‘middle out’, efficiently converting source head words and dependents directly into their counterparts in the target language. We present experimental results on the accuracy of our models for English–Spanish and English–Japanese translation, the training examples being pairs of transcribed spontaneous utterances and their translations. A hierarchical decomposition of bi-language strings emerges from our training process; this decomposition may or may not correspond to familiar linguistic phrase structure. However, no explicit semantic representations are involved, suggesting an approach to language processing in which natural language itself is the semantic representation.

Meeting of the Association for Computational Linguistics | 1997

A Comparison of Head Transducers and Transfer for a Limited Domain Translation Application

Hiyan Alshawi; Adam L. Buchsbaum; Fei Xia

We compare the effectiveness of two related machine translation models applied to the same limited-domain task. One is a transfer model with monolingual head automata for analysis and generation; the other is a direct transduction model based on bilingual head transducers. We conclude that the head transducer model is more effective according to measures of accuracy, computational requirements, model size, and development effort.

International Conference on Acoustics, Speech, and Signal Processing | 1997

State-transition cost functions and an application to language translation

Hiyan Alshawi; Adam L. Buchsbaum

We define a general method for ranking the solutions of a search process by associating costs with equivalence classes of state transitions of the process. We show how the method accommodates models based on probabilistic, discriminative, and distance cost functions, including assignment of costs to unseen events. By applying the method to our machine translation prototype, we are able to experiment with different cost functions and training procedures, including an unsupervised procedure for training the numerical parameters of our English-Chinese translation model. Results from these experiments show that the choice of cost function leads to significant differences in translation quality.
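The idea of attaching costs to equivalence classes of transitions can be sketched as follows. This is a simplified illustration of the probabilistic case only, assuming the cost of a class is its negative log relative frequency within a context and that unseen classes receive a fixed default cost. The function names, the `classify` mapping, and the value of `unseen_cost` are assumptions made for this sketch, not details from the paper.

```python
import math
from collections import Counter


def train_probabilistic_costs(observed_transitions, classify):
    """Cost per (context, event) class = -log of its relative frequency in that context."""
    class_counts = Counter()
    context_counts = Counter()
    for transition in observed_transitions:
        context, event = classify(transition)
        class_counts[(context, event)] += 1
        context_counts[context] += 1
    return {key: -math.log(count / context_counts[key[0]])
            for key, count in class_counts.items()}


def solution_cost(transitions, costs, classify, unseen_cost=10.0):
    """Sum the class costs of a solution's transitions; unseen classes get unseen_cost."""
    return sum(costs.get(classify(t), unseen_cost) for t in transitions)


def rank_solutions(solutions, costs, classify, unseen_cost=10.0):
    """Order candidate solutions from lowest to highest total cost."""
    return sorted(solutions,
                  key=lambda s: solution_cost(s, costs, classify, unseen_cost))


def by_word_pair(transition):
    """Hypothetical equivalence class: ignore the state, keep (source, target)."""
    _state, source, target = transition
    return (source, target)
```

Here a transition is represented as an invented `(state, source_word, target_word)` triple, and `by_word_pair` groups transitions that differ only in state. Swapping in a different `classify` function or a different cost table changes the ranking without touching the search code, which is the kind of experimentation the abstract describes.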

Archive | 2003

Using Direct Variant Transduction for Rapid Development of Natural Spoken Interfaces

Hiyan Alshawi; Shona Douglas

Current speech-enabled services tend to fall into two categories: highly tuned systems requiring a large effort by specialized developers, or constrained systems that are developed rapidly but do not allow users to speak naturally. In this paper we present a new approach to language understanding aimed at bridging the gap between these extremes. This approach (direct variant transduction) relies on specifying an application with examples and on classification and pattern-matching techniques. It addresses two bottlenecks in the development of an interface with natural spoken language: coping with language variation and linking natural language to appropriate actions in the application back-end. Dialog control can be specified declaratively or be delegated to arbitrary functions computed by the underlying application. We describe the method and provide experimental results on varying the number of examples used to build a particular application.

Meeting of the Association for Computational Linguistics | 2002

Speech Translation Performance of Statistical Dependency Transduction and Semantic Similarity Transduction

Hiyan Alshawi; Shona Douglas

In this paper we compare the performance of two methods for speech translation. One is a statistical dependency transduction model using head transducers; the other is a case-based transduction model involving a lexical similarity measure. Examples of translated utterance transcriptions are used in training both models, though the case-based model also uses semantic labels classifying the source utterances. The main conclusion is that while the two methods provide similar translation accuracy under the experimental conditions and accuracy metric used, the statistical dependency transduction method is significantly faster at computing translations.

Archive | 2002