Oscar N. Garcia | Researchain

Publication

Featured research published by Oscar N. Garcia.

IEEE Transactions on Speech and Audio Processing | 1995

The challenge of spoken language systems: Research directions for the nineties

Ron Cole; L. Hirschman; L. Atlas; M. Beckman; Alan W. Biermann; M. Bush; Mark A. Clements; L. Cohen; Oscar N. Garcia; B. Hanson; Hynek Hermansky; S. Levinson; Kathleen R. McKeown; Nelson Morgan; David G. Novick; Mari Ostendorf; Sharon L. Oviatt; Patti Price; Harvey F. Silverman; J. Spiitz; Alex Waibel; Cliff Weinstein; Stephen A. Zahorian; Victor W. Zue

A spoken language system combines speech recognition, natural language processing and human interface technology. It functions by recognizing the person's words, interpreting the sequence of words to obtain a meaning in terms of the application, and providing an appropriate response back to the user. Potential applications of spoken language systems range from simple tasks, such as retrieving information from an existing database (traffic reports, airline schedules), to interactive problem solving tasks involving complex planning and reasoning (travel planning, traffic routing), to support for multilingual interactions. We examine eight key areas in which basic research is needed to produce spoken language systems: (1) robust speech recognition; (2) automatic training and adaptation; (3) spontaneous speech; (4) dialogue models; (5) natural language response generation; (6) speech synthesis and speech generation; (7) multilingual systems; and (8) interactive multimodal systems. In each area, we identify key research challenges, the infrastructure needed to support research, and the expected benefits. We conclude by reviewing the need for multidisciplinary research, for development of shared corpora and related resources, for computational support and for rapid communication among researchers. The successful development of this technology will increase accessibility of computers to a wide range of users, will facilitate multinational communication and trade, and will create new research specialties and jobs in this rapidly expanding area.

IEEE Transactions on Multimedia | 2005

Speech-driven facial animation with realistic dynamics

Ricardo Gutierrez-Osuna; P. Kakumanu; Anna Esposito; Oscar N. Garcia; Adriana Bojórquez; José Luis Castillo; Isaac Rudomin

This work presents an integral system capable of generating animations with realistic dynamics, including the individualized nuances, of three-dimensional (3-D) human faces driven by speech acoustics. The system is capable of capturing short phenomena in the orofacial dynamics of a given speaker by tracking the 3-D location of various MPEG-4 facial points through stereovision. A perceptual transformation of the speech spectral envelope and prosodic cues are combined into an acoustic feature vector to predict 3-D orofacial dynamics by means of a nearest-neighbor algorithm. The Karhunen-Loève transformation is used to identify the principal components of orofacial motion, decoupling perceptually natural components from experimental noise. We also present a highly optimized MPEG-4 compliant player capable of generating audio-synchronized animations at 60 frames/s. The player is based on a pseudo-muscle model augmented with a nonpenetrable ellipsoidal structure to approximate the skull and the jaw. This structure adds a sense of volume that provides more realistic dynamics than existing simplified pseudo-muscle-based approaches, yet it is simple enough to work at the desired frame rate. Experimental results on an audiovisual database of compact TIMIT sentences are presented to illustrate the performance of the complete system.
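The nearest-neighbor prediction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a plain 1-nearest-neighbor lookup with Euclidean distance; the abstract does not specify the distance measure or the number of neighbors, and the function name and toy vectors below are hypothetical, not taken from the paper.

```python
import math

def nearest_neighbor_predict(query, train_audio, train_visual):
    """Return the visual vector paired with the training audio vector
    closest to `query` (Euclidean distance, 1-nearest-neighbor).

    Assumption: train_audio[i] and train_visual[i] were recorded together.
    """
    best = min(range(len(train_audio)),
               key=lambda i: math.dist(query, train_audio[i]))
    return train_visual[best]

# Hypothetical toy data: 2-D acoustic features paired with a 1-D facial
# point coordinate. Real feature vectors would be much larger.
train_audio = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
train_visual = [(10.0,), (20.0,), (30.0,)]

predicted = nearest_neighbor_predict((0.9, 1.2), train_audio, train_visual)
print(predicted)
```

A lookup like this needs no model fitting, which may be one reason a nearest-neighbor approach suits speaker-specific data; the cost is a search over the stored examples at prediction time.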

Asilomar Conference on Signals, Systems and Computers | 1994

Continuous optical automatic speech recognition by lipreading

Alan J. Goldschen; Oscar N. Garcia; Eric D. Petajan

We describe a continuous optical automatic speech recognizer (OASR) that uses optical information from the oral-cavity shadow of a speaker. The system achieves a 25.3 percent recognition on sentences having a perplexity of 150 without using any syntactic, semantic, acoustic, or contextual guides. We introduce 13, mostly dynamic, oral-cavity features used for optical recognition, present phones that appear optically similar (visemes) for our speaker, and present the recognition results for our hidden Markov models (HMMs) using visemes, trisemes, and generalized trisemes. We conclude that future research is warranted for optical recognition, especially when combined with other input modalities.

Archive | 1972

Error-correcting codes in computer arithmetic.

James L. Massey; Oscar N. Garcia

This chapter is intended to summarize the most important results which have been obtained in the theory of coding for the correction and detection of errors in computer arithmetic. The rapid growth in the size and speed of digital computers has placed stringent reliability demands on the arithmetic unit. Attempts to satisfy these demands have generally followed one of three directions: (1) Attempts to improve the reliability of the components used in the construction of the arithmetic unit, (2) attempts to improve reliability by incorporating hardware redundancy so that the result of a computation is unaffected by the failure of one or more of the replicated units which form the arithmetic unit, or so that the failure of one or more of the replicated units can be detected and the faulty units replaced, and (3) attempts to incorporate redundancy into the numbers themselves which are being processed so that erroneous results can be corrected or detected. This third approach, which is the subject of this chapter, tacitly assumes that it is possible to build the “decoder” which corrects or detects erroneous results much more reliably than the arithmetic unit which it monitors, so that the decoder can be considered error-free for practical purposes.
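As a small illustration of the third approach (redundancy carried in the numbers themselves), the sketch below uses an AN code, a well-known arithmetic code in which a number N is stored as the product A·N. The check constant A = 3 and the function names are assumptions chosen for illustration; the abstract itself does not single out a particular code.

```python
A = 3  # illustrative check constant (hypothetical choice)

def encode(n):
    """Store n redundantly as the multiple A * n."""
    return A * n

def is_valid(word):
    """A result is consistent only if it is still a multiple of A."""
    return word % A == 0

def decode(word):
    """Recover n, refusing words that fail the divisibility check."""
    if not is_valid(word):
        raise ValueError("arithmetic error detected")
    return word // A

# Addition preserves the code: A*x + A*y == A*(x + y).
total = encode(5) + encode(7)
print(decode(total))      # the sum of the original numbers

# A single-bit fault adds or subtracts a power of two, which is never a
# multiple of 3, so the check catches it.
faulty = total + 2**4
print(is_valid(faulty))
```

The check here is a single remainder operation, far simpler than the adder it monitors, which fits the chapter's assumption that the decoder can be built much more reliably than the arithmetic unit.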

Archive | 1996

Rationale for Phoneme-Viseme Mapping and Feature Selection in Visual Speech Recognition

Alan J. Goldschen; Oscar N. Garcia; Eric D. Petajan

We describe a methodology to automatically identify visemes and to determine important oral-cavity features for a speaker-dependent, optical continuous speech recognizer. A viseme, as defined by Fisher (1968), represents phones that contain optically similar sequences of oral-cavity movements. Large vocabulary, continuous acoustic speech recognizers that use Hidden Markov Models (HMMs) require accurate phone models (Lee 1989). Similarly, an optical recognizer requires accurate viseme models (Goldschen 1993). Since no universal agreement exists on a subjective viseme definition, we provide an empirical viseme definition using HMMs. We train a set of phone HMMs using optical information, and then cluster similar phone HMMs to form viseme HMMs. We compare our algorithmic phone-to-viseme mapping with the mappings from human speechreading experts. We start, however, by describing the oral-cavity feature selection process to determine features that characterize the movements of the oral cavity during speech. The feature selection process uses a correlation matrix, principal component analysis, and speechreading heuristics to reduce the number of oral-cavity features from 35 to 13. Our analysis concludes that the dynamic oral-cavity features offer great potential for machine speechreading and for the teaching of human speechreading.
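The correlation-matrix part of the feature selection described above can be sketched as a greedy pruning pass: a feature is kept only if it is not strongly correlated with a feature already kept. This is a simplified stand-in, not the authors' procedure, which also uses principal component analysis and speechreading heuristics. The threshold, the feature names, and the sample values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences.
    Assumes neither sequence is constant (nonzero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def prune_correlated(features, threshold=0.95):
    """Greedily keep features whose |correlation| with every
    already-kept feature stays below `threshold`."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-frame measurements of three oral-cavity features.
features = {
    "width":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "area":   [2.0, 4.0, 6.0, 8.0, 10.0],   # exactly 2 * width: redundant
    "height": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = prune_correlated(features)
print(selected)
```

The result depends on the order in which features are visited, so a real selection procedure would need a principled ordering or a further criterion to choose among correlated features.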

IEEE Transactions on Information Theory | 1971

Cyclic and multiresidue codes for arithmetic operations

Thammavarapu R. N. Rao; Oscar N. Garcia

In this paper, the cyclic nature of AN codes is defined after a brief summary of previous work in this area is given. New results are shown in the determination of the range for single-error-correcting AN codes when $A$ is the product of two odd primes $p_1$ and $p_2$, given the orders of 2 modulo $p_1$ and modulo $p_2$. The second part of the paper treats a more practical class of arithmetic codes known as separate codes. A generalized separate code, called a multiresidue code, is one in which a number $N$ is represented as \begin{equation} [N, |N|_{m_1}, |N|_{m_2}, \cdots, |N|_{m_k}] \end{equation} where the $m_i$ are pairwise relatively prime integers. For each AN code, where $A$ is composite, a multiresidue code can be derived having error-correction properties analogous to those of the AN code. Under certain natural constraints, multiresidue codes of large distance and large range (i.e., large values of $N$) can be implemented. This leads to possible realization of practical single and/or multiple-error-correcting arithmetic units.
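The multiresidue representation defined in this abstract can be sketched in a few lines. The example below encodes a number together with its residues, adds two codewords component by component, and recomputes the residues to detect a fault. The moduli 7 and 15 and all function names are illustrative assumptions; the sketch shows error detection only and does not reproduce the paper's error-correction results.

```python
from itertools import combinations
from math import gcd

MODULI = (7, 15)  # illustrative choice; must be pairwise relatively prime
assert all(gcd(a, b) == 1 for a, b in combinations(MODULI, 2))

def encode(n, moduli=MODULI):
    """Represent n as [n, |n|_m1, ..., |n|_mk]."""
    return [n] + [n % m for m in moduli]

def add(x, y, moduli=MODULI):
    """Add codewords: the number and each residue check are processed
    separately, which is what makes this a 'separate' code."""
    residues = [(a + b) % m for a, b, m in zip(x[1:], y[1:], moduli)]
    return [x[0] + y[0]] + residues

def syndrome(word, moduli=MODULI):
    """Difference, per modulus, between the residue of the number part
    and the stored check. All zeros means the word is consistent."""
    return [(word[0] - r) % m for r, m in zip(word[1:], moduli)]

result = add(encode(100), encode(23))
print(result, syndrome(result))

# Inject a single-bit error (+2**3) into the number part only.
faulty = [result[0] + 8] + result[1:]
print(syndrome(faulty))
```

Because the checks are computed independently of the main adder, a fault in the number part leaves the residues intact, and the nonzero syndrome carries information about the error value modulo each $m_i$.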

IEEE Transactions on Education | 2003

Crossing the interdisciplinary barrier: a baccalaureate computer science option in bioinformatics

Travis E. Doom; Michael L. Raymer; Dan E. Krane; Oscar N. Garcia

Bioinformatics is a new and rapidly evolving discipline that has emerged from the fields of experimental molecular biology and biochemistry, and from the artificial intelligence, database, pattern recognition, and algorithms disciplines of computer science. Largely because of the inherently interdisciplinary nature of bioinformatics research, academia has been slow to respond to strong industry and government demands for trained scientists to develop and apply novel bioinformatic techniques to the rapidly growing freely available repositories of genetic and proteomic data. While some institutions are responding to this demand by establishing graduate programs in bioinformatics, the entrance barriers for these programs are high, largely because of the significant amount of prerequisite knowledge in the disparate fields of biochemistry and computer science required for sophisticated new approaches to the analysis and interpretation of bioinformatics data. The authors present an undergraduate-level bioinformatics curriculum in computer science designed for the baccalaureate student. This program is designed to be tailored easily to the needs and resources of a variety of institutions.

Workshop on Perceptive User Interfaces | 2001

Speech driven facial animation

P. Kakumanu; Ricardo Gutierrez-Osuna; Anna Esposito; Robert K. Bryll; A. Ardeshir Goshtasby; Oscar N. Garcia

The results reported in this article are an integral part of a larger project aimed at achieving perceptually realistic animations, including the individualized nuances, of three-dimensional human faces driven by speech. The audiovisual system that has been developed for learning the spatio-temporal relationship between speech acoustics and facial animation is described, including video and speech processing, pattern analysis, and MPEG-4 compliant facial animation for a given speaker. In particular, we propose a perceptual transformation of the speech spectral envelope, which is shown to capture the dynamics of articulatory movements. An efficient nearest-neighbor algorithm is used to predict novel articulatory trajectories from the speech dynamics. The results are very promising and suggest a new way to approach the modeling of synthetic lip motion of a given speaker driven by his/her speech. This would also provide clues toward a more general cross-speaker realistic animation.

International Symposium on Multiple-Valued Logic | 1990

A six-valued logic for representing incomplete knowledge

Oscar N. Garcia; Massoud Moussavi

A novel six-valued logic useful in representing incomplete knowledge is introduced. A practical advantage of this logic is that it allows a system to reason progressively about what it will or will not know (or what can or cannot happen) as time advances and further knowledge is acquired from the external world. Applications of this approach to deductive question-answering systems, as well as to decision-making and planning under time constraints, are investigated. A rule-based inference model based on the six-valued logic has been built for this purpose. The results of this research indicate that an extension of the classical definition of modus ponens based on designated truth values would be a useful rule of inference.

Technical Symposium on Computer Science Education | 2002

A proposed undergraduate bioinformatics curriculum for computer scientists

Travis E. Doom; Michael L. Raymer; Dan E. Krane; Oscar N. Garcia

Bioinformatics is a new and rapidly evolving discipline that has emerged from the fields of experimental molecular biology and biochemistry, and from the artificial intelligence, database, and algorithms disciplines of computer science. Largely because of the inherently interdisciplinary nature of bioinformatics research, academia has been slow to respond to strong industry and government demands for trained scientists to develop and apply novel bioinformatics techniques to the rapidly growing, freely available repositories of genetic and proteomic data. While some institutions are responding to this demand by establishing graduate programs in bioinformatics, the entrance barriers for these programs are high, largely due to the significant amount of prerequisite knowledge in the disparate fields of biochemistry and computer science required to author sophisticated new approaches to the analysis of bioinformatics data. We present a proposal for an undergraduate-level bioinformatics curriculum in computer science that lowers these barriers.