Publication


Featured research published by Sharon Oviatt.


Communications of the ACM | 2000

Perceptual user interfaces: multimodal interfaces that process what comes naturally

Sharon Oviatt; Philip R. Cohen

… more transparent experience than ever before. Our voice, hands, and entire body, once augmented by sensors such as microphones and cameras, are becoming the ultimate transparent and mobile multimodal input devices. The area of multimodal systems has expanded rapidly during the past five years. Since Bolt’s [1] original “Put That There” concept demonstration, which processed speech and manual pointing during object manipulation, significant achievements have been made in developing more general multimodal systems. State-of-the-art multimodal speech and gesture systems now process complex gestural input other than pointing, and new systems have been extended to process different mode combinations, the most noteworthy being speech and pen input [9] and speech and lip movements [10]. As a foundation for advancing new multimodal systems, proactive empirical work has generated predictive information on human-computer multimodal interaction, which is being used to …


human factors in computing systems | 2005

Individual differences in multimodal integration patterns: what are they and why do they exist?

Sharon Oviatt; Rebecca Lunsford; Rachel Coulston

Techniques for information fusion are at the heart of multimodal system design. To develop new user-adaptive approaches for multimodal fusion, the present research investigated the stability and underlying cause of major individual differences that have been documented between users in their multimodal integration pattern. Longitudinal data were collected from 25 adults as they interacted with a map system over six weeks. Analyses of 1,100 multimodal constructions revealed that everyone had a dominant integration pattern, either simultaneous or sequential, which was 95-96% consistent and remained stable over time. In addition, coherent behavioral and linguistic differences were identified between these two groups. Whereas performance speed was comparable, sequential integrators made only half as many errors and excelled during new or complex tasks. Sequential integrators also had more precise articulation (e.g., fewer disfluencies), although their speech rate was no slower. Finally, sequential integrators more often adopted terse and direct command-style language, with a smaller and less varied vocabulary, which appeared focused on achieving error-free communication. These distinct interaction patterns are interpreted as deriving from fundamental differences in reflective-impulsive cognitive style. Implications of these findings are discussed for the design of adaptive multimodal systems with substantially improved performance characteristics.
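
The simultaneous versus sequential distinction described above lends itself to a simple temporal rule: a speech-pen construction is simultaneous if the two input signals overlap in time, and sequential if one ends before the other begins. The sketch below is only a minimal illustration of that idea under assumed data structures; it is not the authors' analysis code, and the majority-vote notion of "dominant pattern" is an assumption for illustration.

```python
# Illustrative sketch: classify a user's dominant multimodal integration
# pattern (simultaneous vs. sequential) from speech/pen timestamps.
# The Construction record and the majority-vote rule are assumptions.
from dataclasses import dataclass
from collections import Counter

@dataclass
class Construction:
    speech_start: float  # seconds
    speech_end: float
    pen_start: float
    pen_end: float

def integration_type(c: Construction) -> str:
    """'SIM' if the speech and pen intervals overlap in time, else 'SEQ'."""
    overlaps = c.speech_start < c.pen_end and c.pen_start < c.speech_end
    return "SIM" if overlaps else "SEQ"

def dominant_pattern(constructions: list[Construction]) -> tuple[str, float]:
    """Return the user's dominant pattern and its consistency (0-1)."""
    counts = Counter(integration_type(c) for c in constructions)
    pattern, n = counts.most_common(1)[0]
    return pattern, n / len(constructions)

# Example: a user who mostly writes first, then speaks (sequential integrator).
data = [Construction(2.0, 3.5, 0.0, 1.2), Construction(7.1, 8.0, 5.0, 6.3),
        Construction(10.0, 11.0, 10.2, 10.8)]
print(dominant_pattern(data))  # ('SEQ', 0.666...)
```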


Advances in Computers | 2002

Breaking the Robustness Barrier: Recent Progress on the Design of Robust Multimodal Systems

Sharon Oviatt

Cumulative evidence now clarifies that a well-designed multimodal system that fuses two or more information sources can be an effective means of reducing recognition uncertainty. Performance advantages have been demonstrated for different modality combinations (speech and pen, speech and lip movements), for varied tasks (map-based simulation, speaker identification), and in different environments (noisy, quiet). Perhaps most importantly, the error suppression achievable with a multimodal system, compared with a unimodal spoken language one, can be in excess of 40%. Recent studies also have revealed that a multimodal system can perform in a more stable way than a unimodal one across varied real-world users (accented versus native speakers) and usage contexts (mobile versus stationary use). This chapter reviews these recent demonstrations of multimodal system robustness, distills general design strategies for optimizing robustness, and discusses future directions in the design of advanced multimodal systems. Finally, implications are discussed for the successful commercialization of promising but error-prone recognition-based technologies during the next decade.


IEEE MultiMedia | 1996

User-centered modeling for spoken language and multimodal interfaces

Sharon Oviatt

By modeling difficult sources of linguistic variability in speech and language, we can design interfaces that transparently guide human input to match system processing capabilities. Such work will yield more user-centered and robust interfaces for next-generation spoken language and multimodal systems.


conference on applied natural language processing | 1997

QuickSet: Multimodal Interaction for Simulation Set-up and Control

Philip R. Cohen; Michael Johnston; David McGee; Sharon Oviatt; Jay Pittman; Ira Smith; Liang Chen; Josh Clow

This paper presents a novel multimodal system applied to the setup and control of distributed interactive simulations. We have developed the QuickSet prototype, a pen/voice system running on a hand-held PC, communicating through a distributed agent architecture to NRaD's LeatherNet system, a distributed interactive training simulator built for the US Marine Corps (USMC). The paper briefly describes the system and illustrates its use in multimodal simulation setup.


IEEE Computer Graphics and Applications | 1999

Multimodal interaction for 2D and 3D environments [virtual reality]

Philip R. Cohen; David McGee; Sharon Oviatt; Lizhong Wu; Josh Clow; Rob King; Simon J. Julier; Lawrence J. Rosenblum

The allure of immersive technologies is undeniable. Unfortunately, the user's ability to interact with these environments lags behind the impressive visuals. In particular, it's difficult to navigate in unknown visual landscapes, find entities, access information, and select entities using six-degrees-of-freedom (6-DOF) devices. We believe multimodal interaction, specifically speech and gesture, will make a major difference in the usability of such environments.


meeting of the association for computational linguistics | 1998

Confirmation in Multimodal Systems

David McGee; Philip R. Cohen; Sharon Oviatt

Systems that attempt to understand natural human input make mistakes, as even humans do. However, humans avoid misunderstandings by confirming doubtful input. Multimodal systems, those that combine simultaneous input from more than one modality (for example, speech and gesture), have historically been designed so that they either request confirmation of speech, their primary modality, or request no confirmation at all. Instead, we experimented with delaying confirmation until after the speech and gesture were combined into a complete multimodal command. In controlled experiments, subjects achieved more commands per minute at a lower error rate when the system delayed confirmation than when they confirmed only speech. In addition, this style of late confirmation meets the user's expectation that confirmed commands should be executable.
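
As a rough illustration of the late-confirmation strategy described above (an assumed structure, not the paper's implementation): the speech and gesture hypotheses are fused into one multimodal command first, and confirmation is requested for that fused command rather than for the speech channel alone.

```python
# Minimal sketch of late confirmation: fuse speech and gesture, then confirm
# the combined command. The Hypothesis type and the toy fusion rule are
# hypothetical stand-ins, not the architecture evaluated in the paper.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    content: str
    confidence: float  # recognizer score in [0, 1]

def fuse(speech: Hypothesis, gesture: Hypothesis) -> Hypothesis:
    """Toy fusion: bind the spoken command to the gestured location."""
    return Hypothesis(f"{speech.content} at {gesture.content}",
                      speech.confidence * gesture.confidence)

def confirm(command: Hypothesis) -> bool:
    """Late confirmation: the user sees the complete multimodal command."""
    answer = input(f"Confirm: '{command.content}'? [y/n] ")
    return answer.strip().lower().startswith("y")

speech = Hypothesis("create a medical company", 0.85)
gesture = Hypothesis("<pen point on map>", 0.90)
command = fuse(speech, gesture)
if confirm(command):
    print("Executing:", command.content)
```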


IEEE Transactions on Neural Networks | 2002

From members to teams to committee: a robust approach to gestural and multimodal recognition

Lizhong Wu; Sharon Oviatt; Philip R. Cohen

When building a complex pattern recognizer with high-dimensional input features, a number of selection uncertainties arise. Traditional approaches to resolving these uncertainties typically rely either on the researcher's intuition or on performance evaluation on validation data, both of which result in poor generalization and robustness on test data. This paper describes a novel recognition technique called members to teams to committee (MTC), which is designed to reduce modeling uncertainty. In particular, the MTC posterior estimator is based on a coordinated set of divide-and-conquer estimators that derive from a three-tiered architectural structure corresponding to individual members, teams, and the overall committee. Basically, the MTC recognition decision is determined by the whole empirical posterior distribution, rather than a single estimate. This paper describes the application of the MTC technique to handwritten gesture recognition and multimodal system integration and presents a comprehensive analysis of the characteristics and advantages of the MTC approach.
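
The members-teams-committee structure reads as an ensemble of ensembles: individual member recognizers produce posterior estimates, teams combine subsets of members, and the committee combines the teams into the final decision. The sketch below illustrates that three-tier combination with plain averaging; the stand-in member models, team sizes, and weighting are assumptions for illustration, not the paper's estimator.

```python
# Illustrative three-tier "members -> teams -> committee" combination of
# posterior estimates, using simple averaging at each tier. The member
# models, team groupings, and weights here are assumptions only.
import numpy as np

rng = np.random.default_rng(0)
n_classes = 4

def member_posterior(x: np.ndarray) -> np.ndarray:
    """Stand-in for one trained member recognizer: returns P(class | x)."""
    logits = rng.normal(size=n_classes) + x[:n_classes]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def team_posterior(x: np.ndarray, n_members: int = 3) -> np.ndarray:
    """A team averages the posterior estimates of its members."""
    return np.mean([member_posterior(x) for _ in range(n_members)], axis=0)

def committee_posterior(x: np.ndarray, n_teams: int = 5) -> np.ndarray:
    """The committee combines team-level estimates into the final posterior."""
    return np.mean([team_posterior(x) for _ in range(n_teams)], axis=0)

x = rng.normal(size=8)          # hypothetical gesture feature vector
posterior = committee_posterior(x)
print("final posterior:", np.round(posterior, 3))
print("decision:", int(np.argmax(posterior)))
```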


ubiquitous computing | 2001

Designing robust multimodal systems for universal access

Sharon Oviatt

Multimodal interfaces are being developed that permit our highly skilled and coordinated communicative behavior to control system interactions in a more transparent and flexible interface experience than ever before. As applications become more complex, a single modality alone does not permit varied users to interact effectively across different tasks and usage environments [11]. However, a flexible multimodal interface offers people the choice to use a combination of modalities, or to switch to a better-suited modality, depending on the specifics of their abilities, the task, and the usage conditions. This paper will begin by summarizing some of the primary advantages of multimodal interfaces. In particular, it will discuss the inherent flexibility of multimodal interfaces, which is a key feature that makes them suitable for universal access and mobile computing. It also will discuss the role of multimodal architectures in improving the robustness and performance stability of recognition-based systems. Data will be reviewed from two recent studies in which a multimodal architecture suppressed errors and stabilized system performance for accented nonnative speakers and during mobile use. The paper will conclude by discussing the implications of this research for designing multimodal interfaces for the elderly, as well as the need for future work in this area.


Proceedings of the 1st International Workshop on Multimodal Learning Analytics | 2012

Multimodal prediction of expertise and leadership in learning groups

Stefan Scherer; Nadir Weibel; Louis-Philippe Morency; Sharon Oviatt

In this study, we investigate low-level predictors from audio and writing modalities for the separation and identification of socially dominant leaders and experts within a study group. We use a multimodal dataset of situated computer-assisted group learning tasks: groups of three high-school students solve a number of mathematical problems in two separate sessions. In order to automatically identify the socially dominant student and the expert in the group, we analyze a number of prosodic and voice quality features as well as writing-based features. In this preliminary study we identify a number of promising acoustic and writing predictors for the disambiguation of leaders, experts, and other students. We believe that this exploratory study reveals key opportunities for future analysis of multimodal learning analytics based on a combination of audio and writing signals.
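
As a loose illustration of the kind of low-level audio features involved (a sketch under assumed inputs, not the study's feature set): simple prosodic proxies such as speaking energy and pause ratio can be computed directly from a waveform, and a standard classifier would then be trained on them to separate dominant speakers from others.

```python
# Sketch of simple prosodic proxies from a mono waveform; the features,
# thresholds, and toy signal below are illustrative assumptions, not the
# feature set or data used in the study.
import numpy as np

def prosodic_features(wave: np.ndarray, sr: int, frame_ms: int = 25) -> np.ndarray:
    """Return [mean frame energy, energy variability, pause ratio]."""
    frame = int(sr * frame_ms / 1000)
    n = len(wave) // frame
    frames = wave[: n * frame].reshape(n, frame)
    energy = np.sqrt((frames ** 2).mean(axis=1))           # RMS per frame
    pause_ratio = float((energy < 0.1 * energy.max()).mean())
    return np.array([energy.mean(), energy.std(), pause_ratio])

# Toy example: 2 seconds of "speech" (noise) at 16 kHz with a half-second pause.
sr = 16_000
rng = np.random.default_rng(1)
wave = rng.normal(scale=0.2, size=2 * sr)
wave[sr // 2 : sr] = 0.0
print(prosodic_features(wave, sr))
```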

Collaboration


Dive into Sharon Oviatt's collaboration.

Top Co-Authors

Nadir Weibel

University of California
