Nikhil Deshpande
Rensselaer Polytechnic Institute
Publications
Featured research published by Nikhil Deshpande.
Journal of the Acoustical Society of America | 2016
Nikhil Deshpande; Jonas Braasch
This model takes a mixture of two simultaneous speech signals at unique azimuth positions and extracts either speaker using the equalization/cancellation (EC) method. Head-related transfer functions are used to spatialize the two sound sources. The model localizes the sources by analyzing interaural time differences and then virtually rotates its head to find the position with the best signal-to-noise ratio. Next, the model segments the mixed speech signal into time and frequency bins and uses an EC algorithm in each bin to cancel the target signal from the mixture. From the residual non-canceled energy, it generates a binary map and overlays this on the spectrogram. The ability of the model to cancel out the target signal determines the bins where the target is actually present. The signal is then reconstructed in time and frequency, leaving only the desired target signal. The model achieves signal-to-noise ratios of up to 80 dB. [This material is based upon work supported by the National Science Foundat...
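To make the EC masking step concrete, here is a minimal sketch in Python. It assumes the target's interaural time and level differences are already known from the localization stage; the function name, STFT parameters, and threshold are illustrative, not the authors' implementation.

```python
# Minimal sketch of the EC-style binary mask (not the authors' code).
# Assumes the target's ITD (seconds) and ILD (dB) are known from localization.
import numpy as np
from scipy.signal import stft, istft

def ec_binary_mask(left, right, itd_s, ild_db, fs=16000, thresh=0.5):
    """Cancel the target in each time-frequency bin; bins where cancellation
    removes most of the energy are attributed to the target."""
    f, _, L = stft(left, fs=fs, nperseg=512)
    _, _, R = stft(right, fs=fs, nperseg=512)
    # Equalization step: undo the target's ILD/ITD on the right channel.
    eq = 10.0 ** (-ild_db / 20.0) * np.exp(-2j * np.pi * f[:, None] * itd_s)
    residual = L - eq * R                       # cancellation step
    cancelled = 1.0 - np.abs(residual) ** 2 / (np.abs(L) ** 2 + 1e-12)
    mask = (cancelled > thresh).astype(float)   # binary map over the spectrogram
    _, target = istft(mask * L, fs=fs, nperseg=512)  # reconstruct the target
    return target, mask
```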
Journal of the Acoustical Society of America | 2018
Nikhil Deshpande; Jonas Braasch
This model identifies a lateralized direct sound and a single reflection. Anechoic speech is lateralized using a head-related transfer function. The speech is then time-delayed and lateralized to a different angle. The model then utilizes a binaurally integrated cross-correlation/auto-correlation mechanism (BICAM) to analyze the lead/lag stimulus and generate a band-limited binaural activity pattern. This output is analyzed to calculate the time delay of the reflection, and the model then uses a neural network to estimate the lateralization of both the direct and reflected sounds. From here, the model can remove the reflected copy of the sound and recover the original, anechoic speech. [This work was supported by the HASS Fellowship at Rensselaer Polytechnic Institute, and the National Science Foundation Grant No. NSF BCS-1539276.]
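BICAM itself is not reproduced here, but the delay-estimation and echo-removal steps can be sketched for the simplest case of a single reflection; the helper names and the gain estimate are illustrative assumptions.

```python
# Toy sketch of reflection-delay estimation and echo removal (not BICAM).
# Assumes a single reflection: y[n] = x[n] + g * x[n - d], with g < 1.
import numpy as np

def estimate_reflection(y, max_lag=2000):
    """Take the reflection delay d from the largest non-zero-lag
    autocorrelation peak; the gain estimate is deliberately rough."""
    ac = np.correlate(y, y, mode="full")[len(y) - 1:]
    d = int(np.argmax(ac[1:max_lag])) + 1      # skip the zero-lag peak
    g = ac[d] / ac[0]                          # rough gain estimate
    return d, g

def remove_reflection(y, d, g):
    """Invert y[n] = x[n] + g*x[n-d] by recursively subtracting the
    lagged copy (the exact inverse of the one-echo FIR)."""
    x = y.astype(float).copy()
    for n in range(d, len(x)):
        x[n] -= g * x[n - d]
    return x
```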
Journal of the Acoustical Society of America | 2018
Nikhil Deshpande; Jonas Braasch
The earliest digital reverberation algorithms were designed to provide sound with a wider and more immersive sense of physical time-based space. These algorithms successfully replicated the perceptual attributes of realistic rooms, but there was also an interesting and perhaps unforeseen secondary outcome. The ubiquity of digital reverberation made the creative misuse of these algorithms to construct physically impossible spaces readily accessible to musicians. By providing users with access to algorithm controls, reverb grew from a tool for constructing realistic perceptions of rooms into the creation of sound design—and often, the centerpiece of composition. Rather than provide a perceptually accurate presentation of sounds in a stereo field, artists instead began to approach reverb as a smaller part of larger timbral systems. Examples span from reverse reverb to the more complex Eno-based shimmer system, and reverb’s place in production and perception will be considered and discussed.
Archive | 2017
Jonas Braasch; Selmer Bringsjord; Nikhil Deshpande; Pauline Oliveros; Doug Van Nort
In this chapter, we describe an intelligent music system approach that utilizes a joint bottom-up/top-down structure. The bottom-up structure is purely signal driven and calculates pitch, loudness, and information rate among other parameters using auditory models that simulate the functions of different parts of the brain. The top-down structure builds on a logic-based reasoning system and an ontology that was developed to reflect rules in jazz practice. Two instances of the agent have been developed to perform traditional and free jazz, and it is shown that the same general structure can be used to improvise different styles of jazz.
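The joint structure can be summarized in a short structural sketch; the feature set matches the chapter's description, but the rules and thresholds below are toy stand-ins, not the agent's actual auditory models or jazz ontology.

```python
# Structural sketch of a joint bottom-up/top-down agent (illustrative;
# the chapter's actual models and logic-based ontology are far richer).
from dataclasses import dataclass

@dataclass
class Features:
    """Bottom-up, signal-driven estimates from the auditory models."""
    pitch_hz: float
    loudness_db: float
    information_rate: float

# Top-down layer: toy stand-ins for logic-based jazz-practice rules.
RULES = {
    "traditional": lambda f: "comp the changes" if f.information_rate < 5.0
                             else "lay out",
    "free":        lambda f: "imitate" if f.loudness_db > 60.0 else "contrast",
}

def decide(style, features):
    """Same general structure, different rule set per style of jazz."""
    return RULES[style](features)
```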
Journal of the Acoustical Society of America | 2017
Nikhil Deshpande; Jonas Braasch
This study investigates how virtual head rotations can improve a binaural model's ability to segregate speech signals. The model takes two mixed speech sources spatialized to unique azimuth positions and localizes them. The model virtually rotates its head to orient itself for the maximum signal-to-noise ratio for extracting the target. An equalization-cancellation approach is used to generate a binary mask for the target based on localization cues. The mask is then overlaid onto the mixed signal's spectrogram to extract the target from the mixture. The improvement in signal-to-noise ratio from head rotation reaches over 30 dB.
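A hedged sketch of the head-rotation search follows; `render_binaural` is a hypothetical helper standing in for the HRTF re-spatialization and EC stages that the paper performs with its full model.

```python
# Sketch of the virtual head-rotation search (illustrative; HRTF
# re-spatialization and the EC stage hide behind `render_binaural`).
import numpy as np

def best_head_orientation(render_binaural, angles=range(-90, 91, 5)):
    """Try each head angle and keep the one with the highest SNR.
    `render_binaural(angle)` is a hypothetical helper returning
    (target_estimate, interferer_estimate) at that head orientation."""
    def snr_db(t, n):
        return 10.0 * np.log10(np.sum(t ** 2) / (np.sum(n ** 2) + 1e-12))
    return max(angles, key=lambda a: snr_db(*render_binaural(a)))
```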
Journal of the Acoustical Society of America | 2017
Nikhil Deshpande; Jonas Braasch
This model takes two simultaneous speech signals, spatialized to unique azimuth positions and convolved with a simple multi-tap stereo impulse response. The model first identifies the reflections and generates an inversion filter for the left and right channels. It then localizes the sources and virtually rotates its head to a known orientation for the best resulting segregation of the sources. Next, the model segments the input signals in time and frequency, applies the inverse filter, and searches for residual energy in each bin to cancel the target signal from the mixture. From the residual non-canceled energy, it generates a binary masking map and overlays this on the mixed signal’s spectrogram to extract only the target signal. The improvement in SNR from head rotation reaches over 30 dB.
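For the dereverberation step, here is a minimal sketch of inverting a known multi-tap impulse response; the paper's actual inversion-filter design is not given here, so this assumes a unit direct-sound tap and weak reflections.

```python
# Sketch of inverting a known multi-tap impulse response (illustrative;
# not the paper's inversion-filter design).
import numpy as np
from scipy.signal import lfilter

def invert_multitap(y, gains, delays):
    """Undo y = x * h for h[n] = delta[n] + sum_k gains[k]*delta[n - delays[k]].
    With a unit direct-sound tap, the exact inverse is the IIR filter 1/H(z);
    it stays stable when the reflection gains are small (e.g., their
    magnitudes sum to less than one)."""
    h = np.zeros(max(delays) + 1)
    h[0] = 1.0
    for g, d in zip(gains, delays):
        h[d] += g
    return lfilter([1.0], h, y)   # denominator h realizes 1/H(z)
```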
Journal of the Acoustical Society of America | 2017
Jonas Braasch; Nikhil Deshpande; Jonathan Mathews; Samuel Chabot
Recently, we completed the Collaborative-Research Augmented Immersive Virtual Environment Laboratory (CRAIVE-Lab), with a usable floor area of 12 m × 10 m, at Rensselaer. The CRAIVE-Lab project addresses the need for a specialized virtual-reality (VR) system for the study and enabling of communication-driven tasks with groups of users immersed in a high-fidelity multi-modal environment located in the same physical space. For the acoustic domain, a 134-loudspeaker-channel system has been installed for Wave Field Synthesis (WFS) with the support of Higher-Order Ambisonics (HoA) sound projection to render inhomogeneous acoustic fields. An integrated 16-channel spherical microphone array makes the CRAIVE-Lab an ideal test bed to study different spatial rendering techniques such as Wave Field Synthesis, Higher-Order Ambisonics, and Virtual Microphone Control (ViMiC). In this talk, sound-field measurements taken with a traditional binaural manikin will be compared to spherical microphone recordings to assess the qua...
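As a rough illustration of the HoA rendering named above, here is a minimal 2D encode/decode pair; the lab's 134-channel WFS/HoA renderer is far more involved, and the mode-matching decoder below is a textbook baseline, not the installed system.

```python
# Minimal 2D HoA encode/decode pair (textbook mode matching, not the
# CRAIVE-Lab renderer).
import numpy as np

def hoa2d_encode(azimuth, order):
    """Circular-harmonic coefficients of a plane wave from `azimuth` (radians)."""
    c = [1.0]
    for m in range(1, order + 1):
        c += [np.cos(m * azimuth), np.sin(m * azimuth)]
    return np.array(c)

def hoa2d_decode(coeffs, speaker_azimuths):
    """Least-squares (pseudo-inverse) gains for a ring of loudspeakers."""
    order = (len(coeffs) - 1) // 2
    Y = np.stack([hoa2d_encode(a, order) for a in speaker_azimuths])
    return np.linalg.pinv(Y.T) @ coeffs   # one gain per loudspeaker
```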
Journal of the Acoustical Society of America | 2015
Jonas Braasch; M. T. Pastore; Nikhil Deshpande; Jens Blauert
A bi-modal model is presented that predicts the psychophysical results of Valente and Braasch [Acustica, 2008]. The model simulates expectation of room-acoustical qualities due to visual cues. The visual part of the model estimates angles of incidence and delays of the first two side reflections of a given frontal sound source. To this end, a stereoscopic image is used to determine azimuth angles and distances for the two frontal room corners. The distance estimates are derived by using the angular differences between the left- and right-eye images of each corner. The model then calculates the room volume by reconstructing a rectangular room from these data, assuming a range of possibilities for the missing room coordinates. In the next step, logarithmic fits of volume to expected reverberation time and of volume to direct-to-reverberant energy ratio predict the expected value ranges for these two parameters. Using a feedback structure, the visually derived acoustic parameters become input to an auditory Pr...
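The visual pathway's geometry can be sketched as follows; the interocular base and the fit coefficients are placeholders, since the paper's fitted values are not given here.

```python
# Toy version of the visual pathway's geometry (the interocular base and
# the fit coefficients are placeholders, not the paper's values).
import numpy as np

EYE_BASE_M = 0.065  # assumed interocular distance in meters

def corner_distance(theta_left, theta_right):
    """Distance to a frontal room corner from the angular difference
    between its left- and right-eye images (angles in radians)."""
    disparity = abs(theta_left - theta_right)
    return EYE_BASE_M / (2.0 * np.tan(disparity / 2.0))

def expected_rt60(volume_m3, a=0.3, b=0.25):
    """Logarithmic fit of room volume to expected reverberation time;
    a and b are hypothetical coefficients."""
    return a + b * np.log10(volume_m3)
```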
Journal of the Acoustical Society of America | 2014
Nikhil Deshpande; Jonas Braasch
This algorithm is a real-time implementation of a polyphonic pitch perception model previously described in Braasch et al. [POMA 19, 015027 (2013)]. The model simulates the rate code for pitch detection by taking advantage of phase-locking techniques. Complex input tones are processed through a filter bank, and the output of each filter is run through its own separate autocorrelation following the Licklider model. After conversion to polar form, analysis is done on the phase output of the autocorrelation, where the algorithm computes the time delay between localized peaks. This gives the fundamental period of the tone within a given filter; these values are then normalized by the magnitude output of the autocorrelation and combined with output data from the other filters to give full spectral information. The algorithm uses an adjustable running frame window to trade off frequency resolution against responsiveness to rapid changes in pitch. The model can accurately extract missing or implied fundamental frequencies.
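In the spirit of the Licklider model cited above, here is a simplified summary-autocorrelation sketch; plain peak picking stands in for the paper's phase-based analysis, and the band edges and search range are illustrative.

```python
# Simplified summary-autocorrelation pass (plain peak picking stands in
# for the paper's phase-based analysis; band edges are illustrative).
import numpy as np
from scipy.signal import butter, lfilter

def summary_autocorrelation(x, fs, bands):
    """Band-pass x into channels, autocorrelate each (Licklider-style),
    normalize per channel, and sum across the bank."""
    sac = np.zeros(len(x))
    for lo, hi in bands:
        b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        y = lfilter(b, a, x)
        ac = np.correlate(y, y, mode="full")[len(y) - 1:]
        sac += ac / (ac[0] + 1e-12)
    return sac

def f0_from_sac(sac, fs, fmin=60.0, fmax=500.0):
    """The lag of the largest peak in the plausible range gives the
    fundamental period, including a missing fundamental."""
    lo, hi = int(fs / fmax), int(fs / fmin)
    return fs / (lo + np.argmax(sac[lo:hi]))
```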
Archive | 2015
Jonas Braasch; Nikhil Deshpande; Pauline Oliveros; Selmer Bringsjord