Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Kailash Patil is active.

Publication


Featured research published by Kailash Patil.


PLOS Computational Biology | 2012

Music in Our Ears: The Biological Bases of Musical Timbre Perception

Kailash Patil; Daniel Pressnitzer; Shihab A. Shamma; Mounya Elhilali

Timbre is the attribute of sound that allows humans and other animals to distinguish among different sound sources. Studies based on psychophysical judgments of musical timbre, ecological analyses of sounds' physical characteristics, as well as machine learning approaches have all suggested that timbre is a multifaceted attribute that invokes both spectral and temporal sound features. Here, we explored the neural underpinnings of musical timbre. We used a neuro-computational framework based on spectro-temporal receptive fields, recorded from over a thousand neurons in the mammalian primary auditory cortex as well as from simulated cortical neurons, augmented with a nonlinear classifier. The model was able to perform robust instrument classification irrespective of pitch and playing style, with an accuracy of 98.7%. Using the same front end, the model was also able to reproduce perceptual distance judgments between timbres as perceived by human listeners. The study demonstrates that joint spectro-temporal features, such as those observed in the mammalian primary auditory cortex, are critical to providing the sufficiently rich representation necessary to account for perceptual judgments of timbre by human listeners, as well as recognition of musical instruments.
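
The flavor of such a front end can be approximated in a few lines. Below is a minimal sketch (not the authors' code) of STRF-style analysis: a bank of 2D Gabor filters, each tuned to one temporal modulation rate and one spectral modulation scale, convolved with a log-spectrogram; all rates, scales, kernel sizes, and grid spacings are illustrative assumptions.

    import numpy as np
    from scipy.signal import fftconvolve

    def gabor_strf(rate_hz, scale_cyc_oct, dt=0.01, doct=0.125, size=(64, 32)):
        # 2D Gabor kernel tuned to one temporal rate (Hz) and one spectral
        # scale (cycles/octave); dt and doct are the spectrogram grid spacing.
        t = (np.arange(size[0]) - size[0] // 2) * dt
        f = (np.arange(size[1]) - size[1] // 2) * doct
        T, F = np.meshgrid(t, f, indexing="ij")
        envelope = np.exp(-(T / (2 * t.std())) ** 2 - (F / (2 * f.std())) ** 2)
        carrier = np.cos(2 * np.pi * (rate_hz * T + scale_cyc_oct * F))
        return envelope * carrier

    def strf_features(log_spec, rates=(2, 4, 8, 16), scales=(0.25, 0.5, 1, 2)):
        # Filter a (time x frequency) log-spectrogram with the Gabor bank and
        # pool each rectified output into one number per (rate, scale) channel.
        feats = []
        for rate in rates:
            for scale in scales:
                out = fftconvolve(log_spec, gabor_strf(rate, scale), mode="same")
                feats.append(np.abs(out).mean())
        return np.array(feats)

The pooled magnitudes form one feature vector per sound, which a nonlinear classifier of the kind the abstract mentions could then consume.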


IEEE Transactions on Audio, Speech, and Language Processing | 2013

A Multistream Feature Framework Based on Bandpass Modulation Filtering for Robust Speech Recognition

Sridhar Krishna Nemala; Kailash Patil; Mounya Elhilali

There is strong neurophysiological evidence suggesting that processing of speech signals in the brain happens along parallel paths that encode complementary information in the signal. These parallel streams are organized around a duality of slow vs. fast: coarse signal dynamics appear to be processed separately from rapidly changing modulations, in both the spectral and temporal dimensions. We adapt this duality in a multistream framework for robust speaker-independent phoneme recognition. The scheme presented here centers on a multi-path bandpass modulation analysis of speech sounds, with each stream covering an entire range of temporal and spectral modulations. By performing bandpass operations along the spectral and temporal dimensions, the proposed scheme avoids the classic feature-explosion problem of previous multistream approaches while maintaining the advantage of parallelism and localized feature analysis. The proposed architecture results in substantial improvements over standard and state-of-the-art feature schemes for phoneme recognition, particularly in the presence of nonstationary noise, reverberation, and channel distortions.
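
As a rough illustration of the bandpass idea along the temporal dimension (a sketch under assumed band edges and frame rate, not the paper's implementation), each stream below is a copy of the log-spectrogram bandpass-filtered along time in a different temporal-modulation band:

    from scipy.signal import butter, filtfilt

    def modulation_streams(log_spec, frame_rate=100.0,
                           bands=((0.5, 4.0), (4.0, 16.0), (16.0, 32.0))):
        # One filtered copy of the (time x frequency) log-spectrogram per
        # temporal-modulation band, so slow and fast dynamics are handled
        # in parallel streams rather than in one monolithic feature vector.
        streams = []
        for lo, hi in bands:
            b, a = butter(2, [lo, hi], btype="bandpass", fs=frame_rate)
            streams.append(filtfilt(b, a, log_spec, axis=0))
        return streams

A complementary pass with filters along the frequency axis would cover spectral modulations; each stream then feeds its own recognizer, keeping the per-stream feature dimensionality low.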


International Conference of the IEEE Engineering in Medicine and Biology Society | 2012

A multiresolution analysis for detection of abnormal lung sounds

Dimitra Emmanouilidou; Kailash Patil; James E. West; Mounya Elhilali

Automated analysis and detection of abnormal lung sound patterns have great potential for improving access to standardized diagnosis of pulmonary diseases, especially in low-resource settings. In the current study, we develop signal processing tools for the analysis of paediatric auscultations recorded under non-ideal, noisy conditions. The proposed model is based on a biomimetic multi-resolution analysis of the spectro-temporal modulation details in lung sounds. The methodology provides a detailed description of joint spectral and temporal variations in the signal and proves to be more robust than frequency-based techniques in distinguishing crackles and wheezes from normal breathing sounds.
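
One crude, hypothetical indicator consistent with this description (a sketch, not the authors' detector): crackles are brief broadband transients and so concentrate energy at fast temporal modulations, while wheezes are sustained tones that concentrate energy at slow ones, so a fast-to-slow modulation energy ratio separates the two regimes. The split frequency below is an assumption.

    import numpy as np

    def modulation_energy_ratio(log_spec, frame_rate=100.0, split_hz=10.0):
        # Ratio of fast (> split_hz) to slow temporal-modulation energy in a
        # (time x frequency) log-spectrogram: high values suggest transient
        # events such as crackles, low values sustained tones such as wheezes.
        spectrum = np.abs(np.fft.rfft(log_spec, axis=0)) ** 2
        mod_freqs = np.fft.rfftfreq(log_spec.shape[0], d=1.0 / frame_rate)
        fast = spectrum[mod_freqs > split_hz].sum()
        slow = spectrum[(mod_freqs > 0) & (mod_freqs <= split_hz)].sum()
        return fast / (slow + 1e-12)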


International Conference on Acoustics, Speech, and Signal Processing | 2013

Task-driven attentional mechanisms for auditory scene recognition

Kailash Patil; Mounya Elhilali

How do humans attend to and pick out relevant auditory objects amongst all other sounds in the environment? Based on neurophysiological findings, we propose two task-oriented attentional mechanisms, acting as Bayesian priors, that operate at two separate levels of processing: a sensory mapping stage and an object representation stage. The sensory stage is modeled as a high-dimensional mapping that captures the spectrotemporal nuances and cues of auditory objects. The object representation stage then captures the statistical distribution of the different classes of acoustic scenes. This scheme shows a relative improvement in performance of 81% over a baseline system.
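
The role of such priors can be made concrete with a toy example (a sketch; the scene names and numbers are invented, not from the paper): task knowledge reweights the acoustic log-likelihoods before the decision, so an ambiguous scene is resolved in favor of what the task expects.

    import numpy as np

    def classify_scene(log_likelihoods, task_prior):
        # log_likelihoods: {scene: log p(features | scene)} from the object
        # representation stage; task_prior: {scene: p(scene)} from the task.
        log_post = {c: ll + np.log(task_prior[c])
                    for c, ll in log_likelihoods.items()}
        return max(log_post, key=log_post.get)

    # The prior flips an otherwise "office" decision toward "street":
    print(classify_scene({"street": -10.2, "office": -9.8},
                         {"street": 0.7, "office": 0.3}))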


EURASIP Journal on Audio, Speech, and Music Processing | 2015

Biomimetic spectro-temporal features for music instrument recognition in isolated notes and solo phrases

Kailash Patil; Mounya Elhilali

The identity of musical instruments is reflected in the acoustic attributes of the musical notes played on them. Recently, it has been argued that these characteristics of musical identity (or timbre) are best captured through an analysis that encompasses both the time and frequency domains, with a focus on the modulations, or changes, in the signal in the spectrotemporal space. This representation mimics the spectrotemporal receptive field (STRF) analysis believed to underlie processing in the central mammalian auditory system, particularly at the level of primary auditory cortex. How well this STRF representation captures the timbral identity of musical instruments in continuous solo recordings remains unclear. The current work investigates the applicability of the STRF feature space for instrument recognition in solo musical phrases and explores the best approaches to leveraging knowledge from isolated musical notes for instrument recognition in solo recordings. The study presents an approach for parsing solo performances into their individual note constituents and adapting back-end classifiers based on support vector machines to generalize instrument recognition to off-the-shelf, commercially available solo music.
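
A minimal sketch of such a back end (random stand-in features and assumed shapes, not the paper's pipeline): an SVM trained on per-note feature vectors, applied to the notes parsed from a solo phrase, with a majority vote giving the phrase-level label.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X_notes = rng.normal(size=(200, 16))      # stand-in features of isolated notes
    y_notes = rng.integers(0, 3, size=200)    # instrument labels 0..2

    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
    clf.fit(X_notes, y_notes)

    phrase = rng.normal(size=(12, 16))        # notes parsed from one solo phrase
    votes = clf.predict(phrase)
    instrument = np.bincount(votes).argmax()  # phrase label by majority vote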


International Journal of Speech Technology | 2013

Recognizing the message and the messenger: biomimetic spectral analysis for robust speech and speaker recognition

Sridhar Krishna Nemala; Kailash Patil; Mounya Elhilali

Humans are quite adept at communicating in the presence of noise. However, most speech processing systems, such as automatic speech and speaker recognition systems, suffer a significant drop in performance when speech signals are corrupted by unseen background distortions. The proposed work explores the use of a biologically motivated multi-resolution spectral analysis for speech representation. This approach focuses on the information-rich spectral attributes of speech and presents an intricate yet computationally efficient analysis of the speech signal through a careful choice of model parameters. Further, the approach takes advantage of an information-theoretic analysis of the message-dominant and speaker-dominant regions in the speech signal, and defines feature representations to address two diverse tasks: speech recognition and speaker recognition. The proposed analysis surpasses standard Mel-frequency cepstral coefficients (MFCCs) and their enhanced variants (via mean subtraction, variance normalization, and time-sequence filtering), and yields significant improvements over a state-of-the-art noise-robust feature scheme, on both speech and speaker recognition tasks.
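
For reference, the baseline normalizations named above are easy to reproduce (a sketch; librosa and the synthetic input are assumptions, not the authors' toolchain): per-utterance cepstral mean subtraction and variance normalization (CMVN) applied to an MFCC matrix.

    import numpy as np
    import librosa

    sr = 16000
    y = np.random.default_rng(0).normal(size=2 * sr).astype(np.float32)  # stand-in clip
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # (coefficient x frame)

    # CMVN: remove the channel offset and equalize dynamic range per coefficient
    mean = mfcc.mean(axis=1, keepdims=True)
    std = mfcc.std(axis=1, keepdims=True)
    mfcc_cmvn = (mfcc - mean) / (std + 1e-8)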


Journal of the Acoustical Society of America | 2010

A biomimetic multi-resolution spectrotemporal model for musical timbre recognition.

Kailash Patil; Mounya Elhilali

Timbre is usually defined as the property of sound, distinct from pitch and loudness, that helps in identifying sounds. Such a definition by what timbre is not is in itself problematic and unsatisfactory. The complexity of characterizing a timbre space stems from the intricate interactions between the spectrotemporal dynamics of sound, which overshadow the simple description of individual acoustic dimensions usually captured by techniques such as multidimensional scaling. By contrast, sound encoding in the mammalian auditory system, and particularly its sensitivity to spectrotemporal modulations, offers a rich feature space in which to explore perceptual representations of timbre. In the present work, the problem of musical timbre modeling was cast as a generative model based on a reduced set of multi-resolution spectrotemporal features inspired by the encoding of complex sound in the auditory cortex. A probabilistic system based on Gaussian mixtures was built to perform a timbre recognition task on a data set consisti...
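
The generative back end can be sketched as one Gaussian mixture per instrument, with recognition by maximum log-likelihood (toy features and class names below, not the paper's data set):

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(1)
    train = {name: rng.normal(loc=i, size=(100, 8))  # toy per-instrument features
             for i, name in enumerate(["violin", "flute", "piano"])}

    models = {name: GaussianMixture(n_components=4, covariance_type="diag",
                                    random_state=0).fit(X)
              for name, X in train.items()}

    x = rng.normal(loc=1.0, size=(1, 8))             # a "flute"-like test vector
    print(max(models, key=lambda name: models[name].score(x)))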


Conference of the International Speech Communication Association | 2011

Multistream Bandpass Modulation Features for Robust Speech Recognition.

Sridhar Krishna Nemala; Kailash Patil; Mounya Elhilali


Conference of the International Speech Communication Association | 2010

A phoneme recognition framework based on auditory spectro-temporal receptive fields.

Samuel Thomas; Kailash Patil; Sriram Ganapathy; Nima Mesgarani; Hynek Hermansky


Conference of the International Speech Communication Association | 2012

Goal-Oriented Auditory Scene Recognition.

Kailash Patil; Mounya Elhilali

Collaboration


Dive into Kailash Patil's collaboration.

Top Co-Authors

Elie Khoury

Idiap Research Institute


James E. West

Johns Hopkins University
