Publication


Featured research published by Daniel S. Benincasa.


International Conference on Acoustics, Speech, and Signal Processing | 2002

Co-channel speaker segment separation

Brett Y. Smolenski; Robert E. Yantorno; Daniel S. Benincasa; Stanley J. Wenndt

A novel approach to co-channel speaker separation is presented here. The technique uses the statistical properties of combinations of high Target-to-Interferer Ratio (TIR) speech segments extracted from a co-channel utterance with an overall TIR of 0 dB. The problem is broken down into three simpler decisions. First, closed-set speaker identification is used on combinations of high-TIR speech segments to determine which speakers are generating the co-channel speech. Next, the proportion of segments belonging to each speaker is estimated using a bimodal model. Lastly, a maximum likelihood decision is made as to which two combinations of segments best represent the two speakers. Using this approach, at least one of the speakers could readily be identified whenever that speaker contributed a segment 160 ms or more in length. Once the speakers were determined, speaker separation with greater than 90% reliability was obtained.
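
As an illustration of the final assignment step, here is a minimal sketch in Python, assuming per-segment log-likelihood scores from a closed-set speaker-identification front end are already available; the TIR-based segment extraction and the bimodal proportion model from the paper are not reproduced, and all names and values are hypothetical.

    import numpy as np

    # Hypothetical input: scores[i, k] = log-likelihood of segment i under enrolled
    # speaker model k, as produced by a closed-set speaker-ID front end (not shown).
    rng = np.random.default_rng(0)
    scores = rng.normal(size=(12, 5))   # 12 high-TIR segments, 5 enrolled speakers

    # Step 1 (simplified): pick the two speaker models that best explain the segments
    # overall; the paper scores combinations of segments and uses a bimodal model
    # for the segment proportions.
    totals = scores.sum(axis=0)
    spk_a, spk_b = np.argsort(totals)[-2:]

    # Step 2: maximum-likelihood assignment of each segment to one of the two speakers.
    assign_to_a = scores[:, spk_a] >= scores[:, spk_b]

    print("selected speakers:", spk_a, spk_b)
    print("segments for speaker", spk_a, ":", np.where(assign_to_a)[0])
    print("segments for speaker", spk_b, ":", np.where(~assign_to_a)[0])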


International Conference on Acoustics, Speech, and Signal Processing | 1997

Co-channel speaker separation using constrained nonlinear optimization

Daniel S. Benincasa; Michael Savic

This paper describes a technique to separate the speech of two speakers recorded over a single channel. The main focus of this research is to separate overlapping voiced speech signals using constrained nonlinear optimization. Based on the assumption that voiced speech can be modeled as a slowly varying vocal tract filter excited by a quasi-periodic train of impulses, the speech waveform is represented as a sum of sine waves with time-varying amplitude, frequency and phase. In this work the unknown parameters of the speech model are the amplitudes, frequencies and phases of the harmonics of both speech signals. Using constrained nonlinear optimization, we determine, on a frame-by-frame basis, the parameters that yield the least mean square error (LMSE) between the original co-channel speech signal and the sum of the reconstructed speech signals.
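
To make the parameter-fitting step concrete, the following is a minimal sketch for a single analysis frame, assuming a fixed small number of harmonics per speaker and a synthetic test signal; the sample rate, harmonic count, pitch bounds, and optimizer settings are illustrative choices, not values from the paper.

    import numpy as np
    from scipy.optimize import least_squares

    fs = 8000                      # sample rate (assumed for this sketch)
    t = np.arange(256) / fs        # one analysis frame
    H = 3                          # harmonics per speaker (kept small here)

    def harmonics(params, t):
        # params = [f0, a_1..a_H, phi_1..phi_H]: one speaker's harmonic sum.
        f0, amps, phis = params[0], params[1:1 + H], params[1 + H:]
        k = np.arange(1, H + 1)[:, None]
        return (amps[:, None] * np.sin(2 * np.pi * f0 * k * t + phis[:, None])).sum(axis=0)

    def residual(params, t, x):
        # Error between the co-channel frame and the sum of both reconstructions.
        half = 1 + 2 * H
        return harmonics(params[:half], t) + harmonics(params[half:], t) - x

    # Synthetic 0 dB co-channel frame built from two "speakers" (illustration only).
    true_a = np.concatenate(([120.0], [1.0, 0.5, 0.3], [0.1, 0.2, 0.3]))
    true_b = np.concatenate(([205.0], [0.9, 0.6, 0.2], [0.4, 0.5, 0.6]))
    x = harmonics(true_a, t) + harmonics(true_b, t)

    # Constrained nonlinear least squares: bound each fundamental to a plausible
    # pitch range, amplitudes to be nonnegative, and phases to [-pi, pi].
    x0 = np.concatenate(([110.0], np.full(H, 0.5), np.zeros(H),
                         [215.0], np.full(H, 0.5), np.zeros(H)))
    lb = np.concatenate(([80.0], np.zeros(H), -np.pi * np.ones(H)) * 2)
    ub = np.concatenate(([300.0], 2.0 * np.ones(H), np.pi * np.ones(H)) * 2)
    fit = least_squares(residual, x0, bounds=(lb, ub), args=(t, x))
    print("estimated fundamentals (Hz):", fit.x[0], fit.x[1 + 2 * H])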


International Conference on Acoustics, Speech, and Signal Processing | 1998

Voicing state determination of co-channel speech

Daniel S. Benincasa; Michael Savic

This paper presents a voicing state determination algorithm (VSDA) that simultaneously estimates the voicing state of two speakers present in a segment of co-channel speech. Supervised learning trains a Bayesian classifier to predict the voicing states. The possible voicing states are silence, voiced/voiced, voiced/unvoiced, unvoiced/voiced, and unvoiced/unvoiced. We treat the silent state as a subset of the unvoiced class, except when both speakers are silent. We have chosen a binary decision tree structure. Our feature set is a projection of a 37-dimensional feature vector onto a single dimension, applied at each branch of the decision tree using the Fisher linear discriminant. Co-channel speech produced from the TIMIT database is used for training and testing. Preliminary results, at a signal-to-interference ratio of 0 dB, show classification accuracies of 82.6%, 73.45%, and 68.24% on male/female, male/male, and female/female mixtures, respectively.
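
The sketch below illustrates the Fisher-discriminant projection at a single branch of such a decision tree; the synthetic 37-dimensional features, class labels, and midpoint threshold are placeholders, not the acoustic features or the Bayesian classifier used in the paper.

    import numpy as np

    # Hypothetical training data for one branch of the binary decision tree: X holds
    # 37-dimensional feature vectors per co-channel frame (the acoustic features are
    # not reproduced here); y is the binary label for this branch, e.g. voiced/voiced
    # versus everything else.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 1.0, size=(200, 37)),
                   rng.normal(0.8, 1.0, size=(200, 37))])
    y = np.repeat([0, 1], 200)

    # Fisher linear discriminant: w = Sw^-1 (mu1 - mu0), projecting 37 dims onto one.
    mu0, mu1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    Sw = np.cov(X[y == 0], rowvar=False) + np.cov(X[y == 1], rowvar=False)
    w = np.linalg.solve(Sw, mu1 - mu0)

    # One-dimensional projection and a simple midpoint threshold; the full system
    # applies such a projection at every branch of the tree.
    z = X @ w
    threshold = 0.5 * (z[y == 0].mean() + z[y == 1].mean())
    pred = (z > threshold).astype(int)
    print("training accuracy at this branch:", (pred == y).mean())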


Proceedings of SPIE, the International Society for Optical Engineering | 2001

Effects of cochannel speech on speaker identification

Robert E. Yantorno; Daniel S. Benincasa; Stanley J. Wenndt

Past studies have shown that speaker identification (SID) algorithms that use LPC cepstral features and a vector quantization classifier can be sensitive to changes in environmental conditions. Many experiments have examined the effects of noise on LPC cepstral features. This work studies the effects of co-channel speech on an SID system. It has been found that co-channel interference degrades the performance of a speaker identification system, but not significantly when compared to the effects of wideband noise. Our results show that when the interfering speaker is modeled as one of the speakers within the training set, it has less of an effect on the performance of an SID system than when the interfering speaker is outside the set of modeled speakers.
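
As a minimal sketch of the vector-quantization classifier side of such an SID system, the following trains one k-means codebook per enrolled speaker and scores a test utterance by average quantization distortion; the random 12-dimensional vectors stand in for LPC cepstral features, which are not computed here.

    import numpy as np
    from scipy.cluster.vq import kmeans2, vq

    # Hypothetical data: train_feats[s] holds feature frames for enrolled speaker s,
    # and test_feats holds frames from an unknown utterance. Random 12-dimensional
    # vectors stand in for LPC cepstral features.
    rng = np.random.default_rng(2)
    train_feats = {s: rng.normal(loc=s, scale=1.0, size=(500, 12)) for s in range(3)}
    test_feats = rng.normal(loc=1.0, scale=1.0, size=(100, 12))

    # One vector-quantization codebook per speaker, trained with k-means.
    codebooks = {s: kmeans2(f, k=16, minit="++", seed=0)[0] for s, f in train_feats.items()}

    def avg_distortion(frames, codebook):
        # Average distance of each frame to its nearest codeword.
        _, dists = vq(frames, codebook)
        return dists.mean()

    # Identify the speaker whose codebook yields the lowest average distortion.
    scores = {s: avg_distortion(test_feats, cb) for s, cb in codebooks.items()}
    print("identified speaker:", min(scores, key=scores.get))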


Proceedings of SPIE | 2017

Design and simulation of sensor networks for tracking WiFi users in outdoor urban environments

Christopher Thron; Khoi Tran; Douglas Smith; Daniel S. Benincasa

We present a proof-of-concept investigation into the use of sensor networks for tracking WiFi users in outdoor urban environments. Sensors are fixed and are capable of measuring signal power from users’ WiFi devices. We derive a maximum likelihood estimate for user location based on instantaneous sensor power measurements. The algorithm takes into account the effects of power control, and is self-calibrating in that the signal power model used by the location algorithm is adjusted and improved as part of the operation of the network. Simulation results to verify the system’s performance are presented. The simulation scenario is based on a 1.5 km² area of lower Manhattan. The self-calibration mechanism was verified for initial rms (root mean square) errors of up to 12 dB in the channel power estimates: rms errors were reduced by over 60% in 300 track-hours, in systems with limited power control. Under typical operating conditions with (without) power control, location rms errors are about 8.5 (5) meters, with 90% accuracy within 9 (13) meters, for both pedestrian and vehicular users. The distance error distributions for smaller distances (<30 m) are well approximated by an exponential distribution, while the distributions for large distance errors have fat tails. The issue of optimal sensor placement in the sensor network is also addressed. We specify a linear programming algorithm for determining sensor placement for networks with a reduced number of sensors. In our test case, the algorithm produces a network with 18.5% fewer sensors and comparable location estimation accuracy. Finally, we discuss future research directions for improving the accuracy and capabilities of sensor network systems in urban environments.
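
To illustrate the maximum-likelihood location step, here is a minimal sketch that grid-searches a log-distance path-loss model under Gaussian (log-normal) shadowing; the sensor layout, transmit power, path-loss exponent, and noise level are invented for the example, and power control and self-calibration are omitted.

    import numpy as np

    # Invented scenario: 20 fixed sensors over a 1 km x 1 km area.
    rng = np.random.default_rng(3)
    sensors = rng.uniform(0.0, 1000.0, size=(20, 2))   # fixed sensor positions (m)
    p_tx, ploss_exp, sigma_db = 20.0, 3.0, 6.0         # dBm, exponent, shadowing std (dB)

    def mean_rx_power(user_xy, sensor_xy):
        # Expected received power (dBm) at each sensor for a user at user_xy.
        d = np.linalg.norm(sensor_xy - user_xy, axis=-1)
        return p_tx - 10.0 * ploss_exp * np.log10(np.maximum(d, 1.0))

    # Simulated measurement from a user at an unknown location.
    true_xy = np.array([420.0, 610.0])
    measured = mean_rx_power(true_xy, sensors) + rng.normal(0.0, sigma_db, size=len(sensors))

    # Under Gaussian shadowing the ML estimate reduces to least squares over
    # candidate locations; a coarse grid search is used here for clarity.
    xs = np.arange(0.0, 1000.0, 5.0)
    grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
    predicted = mean_rx_power(grid[:, None, :], sensors[None, :, :])
    estimate = grid[np.argmin(((predicted - measured) ** 2).sum(axis=1))]
    print("true location:", true_xy, "estimate:", estimate)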


Journal of the Acoustical Society of America | 2008

Method for improving speaker identification by determining usable speech

Robert E. Yantorno; Daniel S. Benincasa; Stanley J. Wenndt; Brett Y. Smolenski


Archive | 2000

Spectral autocorrelation ratio as a usability measure of speech segments under co-channel conditions

Kasturi Rangan Krishnamachari; Robert E. Yantorno; Daniel S. Benincasa; Stanley J. Wenndt


International Conference on Acoustics, Speech, and Signal Processing | 2001

Developing usable speech criteria for speaker identification technology

Jereme M. Lovekin; Robert E. Yantorno; Kasturi Rangan Krishnamachari; Daniel S. Benincasa; Stanley J. Wenndt


International Conference on Acoustics, Speech, and Signal Processing | 2001

Use of local kurtosis measure for spotting usable speech segments in co-channel speech

Kasturi Rangan Krishnamachari; Robert E. Yantorno; Jereme M. Lovekin; Daniel S. Benincasa; Stanley J. Wenndt


SoutheastCon | 2018

I/Q Imbalances in QAM Communication Systems with Multi-Antenna Receivers and IF Architecture

Thomas Yang; Douglas Smith; Daniel S. Benincasa

Collaboration


Dive into Daniel S. Benincasa's collaborations.

Top Co-Authors

Stanley J. Wenndt
Air Force Research Laboratory

Michael Savic
Rensselaer Polytechnic Institute

Douglas Smith
Air Force Research Laboratory