Sumedha Kshirsagar | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Sumedha Kshirsagar is active.

Explore More

Publication

Featured researches published by Sumedha Kshirsagar.

Computer Animation and Virtual Worlds | 2004

Generic personality and emotion simulation for conversational agents

Arjan Egges; Sumedha Kshirsagar; Nadia Magnenat-Thalmann

This paper describes a generic model for personality, mood and emotion simulation for conversational virtual humans. We present a generic model for updating the parameters related to emotional behaviour, as well as a linear implementation of the generic update mechanisms. We explore how existing theories for appraisal can be integrated into the framework. Then we describe a prototype system that uses the described models in combination with a dialogue system and a talking head with synchronized speech and facial expressions. Copyright

international conference on knowledge-based and intelligent information and engineering systems | 2003

A model for personality and emotion simulation

Arjan Egges; Sumedha Kshirsagar; Nadia Magnenat-Thalmann

This paper describes a generic model for personality, mood and emotion simulation for conversational virtual humans. We present a generic model for describing and updating the parameters related to emotional behaviour. Also, this paper explores how existing theories for appraisal can be integrated into the framework. Finally we describe a prototype system that uses the described models in combination with a dialogue system and a talking head with synchronised speech and facial expressions.

Computer Graphics Forum | 2003

Visyllable Based Speech Animation

Sumedha Kshirsagar; Nadia Magnenat-Thalmann

Visemes are visual counterpart of phonemes. Traditionally, the speech animation of 3D synthetic faces involvesextraction of visemes from input speech followed by the application of co‐articulation rules to generate realisticanimation. In this paper, we take a novel approach for speech animation — using visyllables, the visual counterpartof syllables. The approach results into a concatenative visyllable based speech animation system. The key contributionof this paper lies in two main areas. Firstly, we define a set of visyllable units for spoken English along withthe associated phonological rules for valid syllables. Based on these rules, we have implemented a syllabificationalgorithm that allows segmentation of a given phoneme stream into syllables and subsequently visyllables. Secondly,we have recorded the database of visyllables using a facial motion capture system. The recorded visyllableunits are post‐processed semi‐automatically to ensure continuity at the vowel boundaries of the visyllables. We defineeach visyllable in terms of the Facial Movement Parameters (FMP). The FMPs are obtained as a result of thestatistical analysis of the facial motion capture data. The FMPs allow a compact representation of the visyllables.Further, the FMPs also facilitate the formulation of rules for boundary matching and smoothing after concatenatingthe visyllables units. Ours is the first visyllable based speech animation system. The proposed technique iseasy to implement, effective for real‐time as well as non real‐time applications and results into realistic speechanimation.

EGVE '02 Proceedings of the workshop on Virtual environments 2002 | 2002

Avatar Markup Language

Sumedha Kshirsagar; Nadia Magnenat-Thalmann; Anthony Guye-Vuillème; Daniel Thalmann; Kaveh Kamyab; Ebrahim Mamdani

Synchronization of speech, facial expressions and body gestures is one of the most critical problems in realistic avatar animation in virtual environments. In this paper, we address this problem by proposing a new high-level animation language to describe avatar animation. The Avatar Markup Language (AML), based on XML, encapsulates the Text to Speech, Facial Animation and Body Animation in a unified manner with appropriate synchronization. We use low-level animation parameters, defined by the MPEG-4 standard, to demonstrate the use of the AML. However, the AML itself is independent of any low-level parameters as such. AML can be effectively used by intelligent software agents to control their 3D graphical representations in the virtual environments. With the help of the associated tools, AML also facilitates to create and share 3D avatar animations quickly and easily. We also discuss how the language has been developed and used within the SoNG project framework. The tools developed to use AML in a real-time animation system incorporating intelligent agents and 3D avatars are also discussed subsequently.

adaptive agents and multi-agents systems | 2002

Virtual humans personified

Sumedha Kshirsagar; Nadia Magnenat-Thalmann

The focus of the virtual human research has recently shifted from modeling and animation towards imparting personalities to virtual humans. The aim is to create virtual humans that can interact spontaneously using a natural language, emotions and gestures. In this paper we present a system that allows the design of personality for emotional virtual human. We adopt the Five Factor Model (FFM) of personality from psychology studies. To realize the model, we use Bayesian Belief Network. We introduce a layered approach for modeling personality, moods and emotions. In order to demonstrate a virtual human with emotional personality, we explain how the model can be integrated with an appropriate dialogue system.

international conference on multimedia and expo | 2000

Lip synchronization using linear predictive analysis

Sumedha Kshirsagar; Nadia Magnenat-Thalmann

Linear predictive analysis is a widely used technique for speech analysis and encoding. The authors discuss the issues involved in its application to phoneme extraction and lip synchronization. The LP analysis results in a set of reflection coefficients that are closely related to the vocal tract shape. Since the vocal tract shape can be correlated with the phoneme being spoken, LP analysis can be directly applied to phoneme extraction. We use neural networks to train and classify the reflection coefficients into a set of vowels. In addition, average energy is used to take care of vowel-vowel and vowel-consonant transitions, whereas the zero crossing information is used to detect the presence of fricatives. We directly apply the extracted phoneme information to our synthetic 3D face model. The proposed method is fast, easy to implement, and adequate for real time speech animation. As the method does not rely on language structure or speech recognition, it is language independent. Moreover, the method is speaker independent. It can be applied to lip synchronization for entertainment applications and avatar animation in virtual environments.

IEEE Computer Graphics and Applications | 2003

Emerging Web graphics standards and technologies

Rynson W. H. Lau; Frederick W. B. Li; Tosiyasu L. Kunii; Baining Guo; Bo Zhang; Nadia Magnenat-Thalmann; Sumedha Kshirsagar; Daniel Thalmann; Mario Gutiérrez

Migrating computer graphics to the Web poses several problems, but with new standards and technology advances, graphics applications can balance latency and bandwidth constraints with image quality. The paper discusses Web graphics standards, distributed virtual environments and virtual humans.

International Journal of Imaging Systems and Technology | 2003

Synthetic faces: Analysis and applications

Sumedha Kshirsagar; Stephane Garchery; Gael Sannier; Nadia Magnenat-Thalmann

Facial animation has been a topic of intensive research for more than three decades. Still, designing realistic facial animations remains to be a challenging task. Several models and tools have been developed so far to automate the design of faces and facial animations synchronized with speech, emotions, and gestures. In this article, we take a brief overview of the existing parameterized facial animation systems. We then turn our attention to facial expression analysis, which we believe is the key to improving realism in animated faces. We report the results of our research regarding the analysis of the facial motion capture data. We use an optical tracking system that extracts the 3D positions of markers attached at specific feature point locations. We capture the movements of these face markers for a talking person. We then form a vector space representation by using the principal component analysis of this data. We call this space “expression and viseme space.” As a result, we propose a new parameter space for sculpting facial expressions for synthetic faces. Such a representation not only offers insight into improving realism of animated faces, but also gives a new way of generating convincing speech animation and blending between several expressions. Expressive facial animation finds a variety of applications ranging from virtual environments to entertainment and games. With the advances in Internet technology, the development of online sales assistants, Web navigation aides and Web‐based interactive tutors is promising than ever before. We overview the recent advances in the field of facial animation on the Web, with a detailed look at the requirements for Web‐based facial animation systems and various applications.

ieee virtual reality conference | 2001

Personalized face and speech communication over the Internet

Sumedha Kshirsagar; Chris Joslin; Won-Sook Lee; Nadia Magnenat-Thalmann

We present our system for personalized face and speech communication over the Internet. The overall system consists of three parts: the cloning of real human faces to use as the representative avatars; the Networked Virtual Environment System performing the basic task of network and device management; and the speech system which includes a text-to-speech engine and a real time phoneme extraction engine from natural speech. The combination of these three elements provides a system to allow real humans, represented by their virtual counterparts, to communicate with each other even when they are geographically remote. In addition to this, all elements present use MPEG-4 as a common communication and animation standard and were designed and tested on the Windows operating system (OS). The paper presents the main aim of the work, the methodology and the resulting communication system.

Archive | 2001