
Publication


Featured research published by Marek B. Trawicki.


International Workshop on Machine Learning for Signal Processing | 2005

Automatic Song-Type Classification and Speaker Identification of Norwegian Ortolan Bunting (Emberiza hortulana) Vocalizations

Marek B. Trawicki; Michael T. Johnson; Tomasz S. Osiejuk

This paper presents an approach to song-type classification and speaker identification of Norwegian Ortolan Bunting (Emberiza hortulana) vocalizations using traditional human speech processing methods. Hidden Markov models (HMMs) are used for both tasks, with features including mel-frequency cepstral coefficients (MFCCs), log energy, and delta (velocity) and delta-delta (acceleration) coefficients. Vocalizations were tested using leave-one-out cross-validation. Classification accuracy for 5 song-types is 92.4%, dropping to 63.6% as the number and similarity of the songs increase. Song-type dependent speaker identification rates peak at 98.7%, with typical accuracies of 80-95% and a low end at 76.2% as the number of speakers increases. These experiments fit into a larger framework of research working towards methods for acoustic censusing of endangered species populations and more automated bioacoustic analysis methods.
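The delta (velocity) features mentioned above are standard regression coefficients computed over a short window of neighboring frames, and the delta-delta (acceleration) features are simply the deltas of the delta stream. A minimal NumPy sketch, assuming a common regression half-width of N = 2 (the paper does not state its window width):

```python
import numpy as np

def delta(features: np.ndarray, N: int = 2) -> np.ndarray:
    """Regression-based delta coefficients over a window of frames.

    features: (num_frames, num_coeffs) array, e.g. MFCCs per frame.
    N: half-width of the regression window (N = 2 is a common default).
    """
    num_frames = features.shape[0]
    denom = 2 * sum(n * n for n in range(1, N + 1))
    # Repeat edge frames so every frame has a full regression window.
    padded = np.pad(features, ((N, N), (0, 0)), mode="edge")
    out = np.zeros_like(features, dtype=float)
    for t in range(num_frames):
        acc = np.zeros(features.shape[1])
        for n in range(1, N + 1):
            acc += n * (padded[t + N + n] - padded[t + N - n])
        out[t] = acc / denom
    return out

# Delta-delta (acceleration) features are the delta of the delta stream:
# acceleration = delta(delta(mfccs))
```

On a linearly increasing feature track the interior delta values recover the slope, which is a quick way to check the implementation.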


Signal Processing | 2012

Distributed multichannel speech enhancement with minimum mean-square error short-time spectral amplitude, log-spectral amplitude, and spectral phase estimation

Marek B. Trawicki; Michael T. Johnson

In this paper, the authors present optimal multichannel frequency domain estimators for minimum mean-square error (MMSE) short-time spectral amplitude (STSA), log-spectral amplitude (LSA), and spectral phase estimation in a widely distributed microphone configuration. The estimators utilize Rayleigh and Gaussian statistical models for the speech prior and noise likelihood with a diffuse noise field for the surrounding environment. Based on the Signal-to-Noise Ratio (SNR) and Segmental Signal-to-Noise Ratio (SSNR) along with the Log-Likelihood Ratio (LLR) and Perceptual Evaluation of Speech Quality (PESQ) as objective metrics, the multichannel LSA estimator decreases background noise and speech distortion and increases speech quality compared to the baseline single-channel STSA and LSA estimators. The optimal multichannel spectral phase estimator contributes significantly to these improvements and provides robustness through time-alignment and attenuation-factor estimation. Overall, the optimal distributed microphone spectral estimators show strong results in noisy environments, with application to many consumer, industrial, and military products.


Speech Communication | 2014

Speech enhancement using Bayesian estimators of the perceptually-motivated short-time spectral amplitude (STSA) with Chi speech priors

Marek B. Trawicki; Michael T. Johnson

In this paper, the authors propose new perceptually-motivated Weighted Euclidean (WE) and Weighted Cosh (WCOSH) estimators that utilize more appropriate Chi statistical models for the speech prior with Gaussian statistical models for the noise likelihood. Whereas the perceptually-motivated WE and WCOSH cost functions emphasized spectral valleys rather than spectral peaks (formants) and indirectly accounted for auditory masking effects, the incorporation of the Chi distribution statistical models demonstrated distinct improvement over the Rayleigh statistical models for the speech prior. The estimators incorporate both weighting law and shape parameters on the cost functions and distributions. Performance is evaluated in terms of the Segmental Signal-to-Noise Ratio (SSNR), Perceptual Evaluation of Speech Quality (PESQ), and Signal-to-Noise Ratio (SNR) Loss objective quality measures to determine the amount of noise reduction along with overall speech quality and speech intelligibility improvement. Based on experimental results across three different input SNRs and eight unique noises along with various weighting law and shape parameters, the two general, less-complicated, closed-form derived solution estimators of WE and WCOSH with Chi speech priors provide significant gains in noise reduction and noticeable gains in overall speech quality and speech intelligibility over the baseline WE and WCOSH with the standard Rayleigh speech priors. Overall, the goal of the work is to capitalize on the mutual benefits of the WE and WCOSH cost functions and Chi speech priors to improve enhancement performance.
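The weighted Euclidean cost C(A, A_hat) = A^p (A_hat - A)^2 has a convenient property: its expected value is minimized by the ratio of posterior moments A_hat = E[A^(p+1)] / E[A^p]. A small Monte Carlo sanity check of that fact under an illustrative Chi-distributed amplitude (the degrees of freedom and weighting value below are arbitrary choices for illustration, not the paper's parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative amplitude samples: the square root of a chi-square
# variate is Chi-distributed (df = 2 would give the Rayleigh special
# case; df = 4 here is an arbitrary choice).
samples = rng.chisquare(df=4, size=200_000) ** 0.5

p = -1  # weighting-law exponent; negative values emphasize valleys

# Closed-form minimizer of E[A^p (A_hat - A)^2]: posterior moment ratio.
w = samples ** p
a_hat = np.mean(samples ** (p + 1)) / np.mean(w)

# A grid search over candidate estimates confirms the moment ratio wins.
grid = np.linspace(0.1, 5.0, 491)
costs = [np.mean(w * (g - samples) ** 2) for g in grid]
best = grid[int(np.argmin(costs))]
assert abs(a_hat - best) < 0.02
```

With p = 0 the moment ratio reduces to the ordinary MMSE amplitude estimate E[A], so the weighting-law parameter smoothly trades off which amplitude regions the estimator protects.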


International Conference on Acoustics, Speech, and Signal Processing | 2009

Optimal distributed microphone phase estimation

Marek B. Trawicki; Michael T. Johnson

This paper presents a minimum mean-square error spectral phase estimator for speech enhancement in the distributed multiple microphone scenario. The estimator uses Gaussian models for both the speech and noise priors under the assumption of a diffuse incoherent noise field representing ambient noise in a widely dispersed microphone configuration. Experiments demonstrate significant benefits of using the optimal multichannel phase estimator as compared to the noisy phase of a reference channel.
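One simple construction consistent with this idea is to take the phase of the attenuation-weighted sum of the time-aligned channel spectra, rather than the noisy phase of a single reference channel. The sketch below illustrates the concept only; it is not the paper's exact MMSE derivation:

```python
import numpy as np

def multichannel_phase(channel_spectra, attenuations):
    """Phase estimate from time-aligned, attenuation-weighted channels.

    channel_spectra: (M, K) complex STFT coefficients, one row per
    microphone, already time-aligned to a common reference.
    attenuations: (M,) per-channel attenuation factors a_m.
    Weighting each channel by a_m before summing lets stronger
    (higher-SNR) channels dominate the combined phase.
    """
    spectra = np.asarray(channel_spectra, dtype=complex)
    a = np.asarray(attenuations, dtype=float)[:, None]
    return np.angle(np.sum(a * spectra, axis=0))
```

In the noiseless case every channel carries the same clean phase, so the combined estimate recovers it exactly; with incoherent noise, the summation averages out phase perturbations across channels.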


International Conference on Audio, Language and Image Processing | 2012

Multichannel speech recognition using distributed microphone signal fusion strategies

Marek B. Trawicki; Michael T. Johnson; An Ji; Tomasz S. Osiejuk

Multichannel fusion strategies are presented for the distributed microphone recognition environment, for the task of song-type recognition in a multichannel songbird dataset. The signals are first fused together based on various heuristics, including their amplitudes, variances, physical distance, or squared distance, before passing the enhanced single-channel signal into the speech recognition system. The intensity-weighted fusion strategy achieved the highest overall recognition accuracy of 94.4%. By combining the noisy distributed microphone signals in an intelligent way that is proportional to the information contained in the signals, speech recognition systems can achieve higher recognition accuracies.
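The intensity-weighted strategy can be sketched as a weighted average of the time-aligned channel signals, with each weight proportional to that channel's mean power. The exact weighting in the paper may differ; this is an illustration of the "weight channels in proportion to their information" idea:

```python
import numpy as np

def intensity_weighted_fusion(channels):
    """Fuse multichannel signals into one enhanced single-channel signal.

    channels: (M, T) array of time-aligned microphone signals.
    Each channel is weighted by its mean signal power, so louder
    (presumably closer, higher-SNR) microphones dominate the mix.
    """
    x = np.asarray(channels, dtype=float)
    power = np.mean(x ** 2, axis=1)    # per-channel intensity
    weights = power / power.sum()      # normalize weights to sum to 1
    return weights @ x                 # (T,) fused signal
```

The fused single-channel signal is then passed to the recognizer exactly as an ordinary one-microphone recording would be.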


IET Signal Processing | 2013

Distributed multichannel speech enhancement based on perceptually-motivated Bayesian estimators of the spectral amplitude

Marek B. Trawicki; Michael T. Johnson

In this study, the authors propose multichannel weighted Euclidean (WE) and weighted cosh (WCOSH) cost function estimators for speech enhancement in the distributed microphone scenario. The goal of the work is to illustrate the advantages of utilising additional microphones and modified cost functions for improving signal-to-noise ratio (SNR) and segmental SNR (SSNR) along with log-likelihood ratio (LLR) and perceptual evaluation of speech quality (PESQ) objective metrics over the corresponding single-channel baseline estimators. As with their single-channel counterparts, the perceptually-motivated multichannel WE and WCOSH estimators are functions of a weighting law parameter, which controls the attention given to the noisy spectral amplitude through a spectral gain function, emphasises spectral peak (formant) information, and accounts for auditory masking effects. Based on the simulation results, the multichannel WE and WCOSH cost function estimators produced gains in SSNR improvement, LLR output, and PESQ output over the single-channel baseline results and unweighted cost functions, with the best improvements occurring at negative values of the weighting law parameter across all input SNR levels and noise types.


Journal of the Acoustical Society of America | 2007

Distributed multi-microphone (DMM) classification

Marek B. Trawicki; Michael T. Johnson; Tomasz S. Osiejuk

Over the past several decades, research in signal enhancement and speech recognition has concentrated on single channels and microphone arrays. Whereas single channels require subjects who are relatively close to the microphone, microphone arrays require close spacing and a priori knowledge of the geometry. In contrast to those stringent conditions, distributed multi-microphones (DMMs) can be utilized for situations that require microphones positioned far away from the subjects, with possibly unknown wide spacing and configurations, such as in meeting rooms or the wild. As opposed to performing recognition through microphone selection, feature integration, or likelihood combination, the proposed work focuses on processing the DMM signals to diminish the effects of ambient noise and form one optimal signal before passing it into the recognizer, through two methods: weighted sum of distances and weighted sum of signal powers. Song-type classification experiments are presented on eight-channel Norw...


International Conference on Acoustics, Speech, and Signal Processing | 2006

Generalized Perceptual Features for Vocalization Analysis Across Multiple Species

Patrick J. Clemins; Marek B. Trawicki; Kuntoro Adi; Jidong Tao; Michael T. Johnson


Archive | 2007

Combined Conditional Random Fields and n-Gram Language Models for Gene Mention Recognition

Craig A. Struble; Richard J. Povinelli; Michael T. Johnson; Dina Berchanskiy; Jidong Tao; Marek B. Trawicki


International Journal of Theoretical and Applied Mathematics | 2016

Multichannel MMSE Wiener Filter Using Complex Real and Imaginary Spectral Coefficients for Distributed Microphone Speech Enhancement

Marek B. Trawicki; Michael T. Johnson

Collaboration


Dive into Marek B. Trawicki's collaboration.

Top Co-Authors

Tomasz S. Osiejuk

Adam Mickiewicz University in Poznań

An Ji

Marquette University
