Publication


Featured research published by Sanjay A. Patil.


Speech Communication | 2010

The physiological microphone (PMIC): A competitive alternative for speaker assessment in stress detection and speaker verification

Sanjay A. Patil; John H. L. Hansen

Some interactive speech system scenarios require the user to perform tasks that constrain speech production, thereby causing speaker variability and reduced speech performance. In noisy stressful scenarios, even if noise could be completely eliminated, the production variability brought on by stress, including the Lombard effect, has a more pronounced impact on speech system performance. Thus, in this study we focus on the use of a silent speech interface (PMIC), with a corresponding experimental assessment to illustrate its utility in the tasks of stress detection and speaker verification. The study compares the suitability of the PMIC with that of a close-talk microphone (CTM) and reports that the PMIC performs as well as or better than the CTM for a number of test conditions. The PMIC reflects both stress-related information and speaker-dependent information to a far greater extent than the CTM. For stress detection (reported in % accuracy), the PMIC performs at least on par with, or about 2% better than, the CTM-based system. For a speaker verification application, the PMIC outperforms the CTM for all matched stress conditions. In terms of %EER, the PMIC achieves 0.91%, 0.45%, and 1.42%, compared with 1.69%, 1.49%, and 1.80% for the CTM. This indicates that the PMIC reflects speaker-dependent information. Another advantage of the PMIC is its ability to record the user's physiological traits/state. Our experiments illustrate that the PMIC can be an attractive alternative for stress detection as well as speaker verification tasks, with the added advantage of recording physiological information, in situations where the use of a CTM may hinder operations (deep sea divers, fire-fighters in rescue operations, etc.).
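For readers unfamiliar with the %EER figures quoted in this abstract, the following is a minimal sketch of how an equal error rate is commonly approximated from verification scores, and how a relative reduction between two error rates can be computed. It is not the authors' evaluation code. The function names, the threshold sweep, and the toy scores are all assumptions made for illustration; only the three %EER pairs are taken from the abstract.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER (as a fraction) from two lists of scores.

    Assumption: higher scores mean "more likely the claimed speaker".
    """
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: impostor trials scoring at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error rate, as a fraction of the baseline."""
    return (baseline - improved) / baseline


# Toy scores, invented purely for illustration (not data from the paper).
genuine_scores = [0.9, 0.8, 0.7, 0.6, 0.35]
impostor_scores = [0.1, 0.2, 0.3, 0.4, 0.65]
toy_eer = equal_error_rate(genuine_scores, impostor_scores)

# %EER pairs reported in the abstract, ordered as (CTM, PMIC).
reported = [(1.69, 0.91), (1.49, 0.45), (1.80, 1.42)]
reductions = [relative_reduction(ctm, pmic) for ctm, pmic in reported]
```

On the toy scores this sweep settles at an EER of 0.2. Applied to the reported pairs, the relative reductions come out to roughly 46%, 70%, and 21%, which is one way to read the size of the gap between the two sensors.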


IEEE Aerospace Conference | 2007

UT-Scope: Speech under Lombard Effect and Cognitive Stress

Ayako Ikeno; Vaishnavi Varadarajan; Sanjay A. Patil; John H.L. Hansen

This paper presents the UT-SCOPE database, together with an automatic and perceptual evaluation of Lombard speech in in-set speaker recognition. The speech used for the analysis forms a part of the UT-SCOPE database and consists of sentences from the well-known TIMIT corpus, spoken in the presence of highway, large crowd, and pink noise. First, the deterioration of the EER of an in-set speaker identification system trained on neutral speech and tested with Lombard speech is illustrated. A clear demarcation between the effect of noise and the Lombard effect on noise is also given by testing with noisy Lombard speech. The effect of test-token duration on system performance under the Lombard condition is addressed. We also report results from in-set speaker recognition tasks performed by human subjects in comparison to the system performance. Overall observations suggest that a deeper understanding of the cognitive factors involved in perceptual speaker ID offers meaningful insights for further development of automated systems.


International Conference on Vehicular Electronics and Safety | 2009

Enhancing in-vehicle safety via contact sensor for stress detection

Sanjay A. Patil; John H. L. Hansen

The number of vehicles on the road as well as the human drive time is increasing significantly. Many drivers increasingly attempt to multi-task while driving, including eating, drinking, and entertainment control. A relatively new domain has emerged over the last 5 years focused on increased technology in the vehicle, based on GPS navigation systems, traffic and weather warning systems, advanced music/entertainment systems (e.g., MP3, iPod, etc.), voice/cell-phone communications, email access, and more recently text messaging. Such technology has significantly increased the cognitive stress load and/or distraction levels of drivers, yet little research has focused on this challenge. For communications alone, more than 60% of cell-phone calls originated from the vehicle in 1997 [1]. The focus of the current study is to formulate and evaluate a cognitive stress detection scheme under a cell-phone-like scenario while driving. The current study also proposes the use of a contact sensor for stress detection. Using a contact sensor, the system accuracy can be as high as 86.52% under a matched-driver scenario. We propose three proto-designs that incorporate contact sensors to improve both driver safety and overall in-vehicle safety. Another advantage of using a contact sensor is its ability to also capture the driver's physiology (such as heartbeat and breathing patterns).


International Conference on Acoustics, Speech, and Signal Processing | 2010

Speech under physical stress: A production-based framework

Sanjay A. Patil; Abhijeet Sangwan; John H. L. Hansen

This paper examines the impact of physical stress on speech. The methodology adopted here identifies inter-utterance breathing (IUB) patterns as a key intermediate variable while studying the relationship between physical stress and speech. Additionally, this work connects high-level prosodic changes in the speech signal (energy, pitch, and duration) to the corresponding breathing patterns. Our results demonstrate the diversity of breathing and articulation patterns that speakers employ in order to compensate for the increased body oxygen demand. Here, we identify the normalized value of breathing energy rate (proportional to minute volume), acquired from a conventional as well as a physiological microphone, as a reliable and accurate estimator of physical stress. We also show that the prosodic patterns (pitch, energy, and duration) of high-level speech structure show good correlation with the normalized breathing energy rate. In this manner, the study establishes the interconnection between temporal speech structure and physical stress through breathing.


International Conference on Acoustics, Speech, and Signal Processing | 2010

Towards more intelligible physiological microphone speech: A probabilistic transformation approach

Seyed Omid Sadjadi; Sanjay A. Patil; John H. L. Hansen

The non-acoustic physiological microphone (PMIC) has been shown to be useful for speech systems under adverse noisy conditions. However, the signal is not true speech to the listener and therefore appears muffled and metallic, with variations in the speaker-dependent structure. This study presents a probabilistic transformation approach to improve the perceptual quality and intelligibility of PMIC speech, not only by mapping the non-acoustic signal into the conventional speech production space, but also by minimizing distortions arising from the alternative pickup location. Performance of the proposed approach is assessed based on five distinct objective metrics. The results indicate that incorporating the probabilistic transformation yields significant improvement in overall PMIC speech quality and intelligibility. This technique, along with the PMIC, can thus find applications in noise-robust human-to-human speech communication.


Lecture Notes in Computer Science | 2007

Speech Under Stress: Analysis, Modeling and Recognition

John H. L. Hansen; Sanjay A. Patil


Conference of the International Speech Communication Association | 2008

Detection of Speech Under Physical Stress: Model Development, Sensor Selection, and Feature Fusion

Sanjay A. Patil; John H. L. Hansen


Alternate sensor based speech systems for speaker assessment and robust human communication | 2010

Alternate sensor based speech systems for speaker assessment and robust human communication

John H. L. Hansen; Sanjay A. Patil


Conference of the International Speech Communication Association | 2010

Quality conversion of non-acoustic signals for facilitating human-to-human speech communication under harsh acoustic conditions

Seyed Omid Sadjadi; Sanjay A. Patil; John H. L. Hansen


Proceedings of the 2nd Workshop on Child, Computer and Interaction | 2009

Assessing the stress/neutral speech environment in adult/child interactions for applications in child language development

Sanjay A. Patil; John H. L. Hansen; Gill Gilkerson; Sharmi Gray; Doungxin Xu

Collaboration


Top Co-Authors

John H. L. Hansen
University of Texas at Dallas

Seyed Omid Sadjadi
University of Texas at Dallas

Abhijeet Sangwan
University of Texas at Dallas

Ayako Ikeno
University of Texas at Dallas

Vaishnavi Varadarajan
University of Texas at Dallas

Woo-Il Kim
Incheon National University