Publication


Featured research published by Abigail Anne Kressner.


IEEE Transactions on Audio, Speech, and Language Processing | 2013

Evaluating the Generalization of the Hearing Aid Speech Quality Index (HASQI)

Abigail Anne Kressner; David V. Anderson; Christopher J. Rozell

Many developers of audio signal processing strategies rely on objective measures of quality for initial evaluations of algorithms. As such, objective measures should be robust, and they should be able to predict quality accurately regardless of the dataset or testing conditions. Kates and Arehart have developed the Hearing Aid Speech Quality Index (HASQI) to predict the effects of noise, nonlinear distortion, and linear filtering on speech quality for both normal-hearing and hearing-impaired listeners, and they report very high performance with their training and testing datasets [Kates, J. and Arehart, K., J. Audio Eng. Soc., 58(5), 363-381 (2010)]. In order to investigate the generalizability of HASQI, we test its ability to predict normal-hearing listeners' subjective quality ratings of a dataset on which it was not trained. This dataset is designed specifically to contain a wide range of distortions introduced by real-world noises which have been processed by some of the most common noise suppression algorithms in hearing aids. We show that HASQI achieves prediction performance comparable to the Perceptual Evaluation of Speech Quality (PESQ), the standard for objective measures of quality, as well as some of the other measures in the literature. Furthermore, we identify areas of weakness and show that training can improve quantitative prediction.


Workshop on Applications of Signal Processing to Audio and Acoustics | 2011

Robustness of the Hearing Aid Speech Quality Index (HASQI)

Abigail Anne Kressner; David V. Anderson; Christopher J. Rozell

Objective measures of speech quality have been the subject of significant prior work, particularly in the areas of speech codecs and communication channels for normal-hearing listeners. One of the primary concerns of researchers in this area is how these metrics generalize to datasets or listener studies which are “unknown” to the measures. Another growing concern is how these metrics perform for the hearing-impaired community. Researchers working with this community need to be able to predict how hearing-impaired listeners will perceive the quality of speech, as well as how they will perceive the quality of speech processed specifically by hearing aids. A relatively recent metric, the Hearing Aid Speech Quality Index (HASQI), is a model-based objective measure of quality developed in the context of hearing aids for normal-hearing and hearing-impaired listeners (Kates & Arehart, Journal of the Audio Engineering Society, 2010). As such, HASQI makes substantial progress on some of the generalization issues. However, HASQI has not been tested thus far on any datasets other than the one on which it was trained. The objective of this study is to demonstrate the robustness of HASQI in predicting subjective quality. We use an “unknown” dataset of noisy speech processed by noise suppression algorithms, along with a corresponding set of subjective quality scores from normal-hearing listeners, to demonstrate HASQI's prediction performance. Furthermore, we compare HASQI's performance with that of several other objective measures in order to provide a point of reference.


Journal of the Acoustical Society of America | 2015

Structure in time-frequency binary masking errors and its impact on speech intelligibility

Abigail Anne Kressner; Christopher J. Rozell

Although requiring prior knowledge makes the ideal binary mask an impractical algorithm, substantial increases in measured intelligibility make it a desirable benchmark. While this benchmark has been studied extensively, many questions remain about the factors that influence the intelligibility of binary-masked speech with non-ideal masks. To date, researchers have used primarily uniformly random, uncorrelated mask errors and independently presented error types (i.e., false positives and negatives) to characterize the influence of estimation errors on intelligibility. However, practical estimation algorithms produce masks that contain errors of both types and with non-trivial amounts of structure. This paper introduces an investigation framework for binary masks and presents listener studies that use this framework to illustrate how interactions between error types and structure affect intelligibility. First, this study demonstrates that clustering (i.e., a form of structure) of mask errors reduces intelligibility. Furthermore, while previous research has suggested that false positives are more detrimental to intelligibility than false negatives, this study indicates that false negatives can be equally detrimental to intelligibility when they contain structure or when both error types are present. Finally, this study shows that listeners tolerate fewer mask errors when both types of errors are present, especially when the errors contain structure.


International Conference on Acoustics, Speech, and Signal Processing | 2013

A novel binary mask estimator based on sparse approximation

Abigail Anne Kressner; David V. Anderson; Christopher J. Rozell

While most single-channel noise reduction algorithms fail to improve speech intelligibility, the ideal binary mask (IBM) has demonstrated substantial intelligibility improvements. However, this approach exploits oracle knowledge. The main objective of this paper is to introduce a novel binary mask estimator based on a simple sparse approximation algorithm. Our approach does not require oracle knowledge and instead uses knowledge of speech structure.


Conference of the International Speech Communication Association | 2016

Comparing the Influence of Spectro-Temporal Integration in Computational Speech Segregation

Thomas Bentsen; Tobias May; Abigail Anne Kressner; Torsten Dau

The goal of computational speech segregation systems is to automatically segregate a target speaker from interfering maskers. Typically, these systems include a feature extraction stage in the front-end and a classification stage in the back-end. A spectro-temporal integration strategy can be applied in either the front-end, using the so-called delta features, or in the back-end, using a second classifier that exploits the posterior probability of speech from the first classifier across a spectro-temporal window. This study systematically analyzes the influence of such stages on segregation performance, the error distributions, and intelligibility predictions. Results indicated that it could be problematic to exploit context in the back-end, even though such a spectro-temporal integration stage improves the segregation performance. Also, the results emphasized the potential need for a single metric that comprehensively predicts computational segregation performance and correlates well with intelligibility. The outcome of this study could help to identify the most effective spectro-temporal integration strategy for computational segregation systems.


Journal of the Acoustical Society of America | 2016

Outcome measures based on classification performance fail to predict the intelligibility of binary-masked speech

Abigail Anne Kressner; Tobias May; Christopher J. Rozell

To date, the most commonly used outcome measure for assessing ideal binary mask estimation algorithms is based on the difference between the hit rate and the false alarm rate (H-FA). Recently, the error distribution has been shown to substantially affect intelligibility. However, H-FA treats each mask unit independently and does not take into account how errors are distributed. Alternatively, algorithms can be evaluated with the short-time objective intelligibility (STOI) metric using the reconstructed speech. This study investigates the ability of H-FA and STOI to predict intelligibility for binary-masked speech using masks with different error distributions. The results demonstrate the inability of H-FA to predict the behavioral intelligibility and also illustrate the limitations of STOI. Since every estimation algorithm will make errors that are distributed in different ways, performance evaluations should not be made solely on the basis of these metrics.
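As a rough illustration of why H-FA ignores how errors are distributed, the sketch below computes the hit rate, the false alarm rate, and their difference for two estimated masks. The mask layout (lists of rows, one row per frequency channel), the function name, and the toy masks are assumptions made for illustration and are not taken from the paper.

```python
def hit_minus_false_alarm(ideal_mask, estimated_mask):
    """Return (hit_rate, false_alarm_rate, h_fa) for two binary masks.

    Both masks are lists of rows holding 0 or 1. A 1 in the ideal mask
    marks a target-dominated unit. This layout is an assumption.
    """
    hits = false_alarms = target_units = interferer_units = 0
    for ideal_row, est_row in zip(ideal_mask, estimated_mask):
        for ideal, est in zip(ideal_row, est_row):
            if ideal == 1:
                target_units += 1
                hits += int(est == 1)          # target unit correctly kept
            else:
                interferer_units += 1
                false_alarms += int(est == 1)  # interferer unit wrongly kept
    hit_rate = hits / target_units if target_units else 0.0
    fa_rate = false_alarms / interferer_units if interferer_units else 0.0
    return hit_rate, fa_rate, hit_rate - fa_rate


# Hypothetical 4x4 ideal mask with 8 target-dominated units.
ideal = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

# Two misses placed next to each other (clustered errors).
clustered = [[0, 0, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 1]]

# Two misses placed far apart (scattered errors).
scattered = [[0, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 1, 1],
             [0, 0, 1, 0]]

print(hit_minus_false_alarm(ideal, clustered))  # (0.75, 0.0, 0.75)
print(hit_minus_false_alarm(ideal, scattered))  # (0.75, 0.0, 0.75)
```

Both masks receive the same score even though their errors are arranged differently, which is the property of H-FA that the abstract calls into question.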


Journal of the Acoustical Society of America | 2016

Cochlear implant speech intelligibility outcomes with structured and unstructured binary mask errors

Abigail Anne Kressner; Adam Westermann; Jörg M. Buchholz; Christopher J. Rozell

It has been shown that intelligibility can be improved for cochlear implant (CI) recipients with the ideal binary mask (IBM). In realistic scenarios where prior information is unavailable, however, the IBM must be estimated, and these estimations will inevitably contain errors. Although the effects of both unstructured and structured binary mask errors have been investigated with normal-hearing (NH) listeners, they have not been investigated with CI recipients. This study assesses these effects with CI recipients using masks that have been generated systematically with a statistical model. The results demonstrate that clustering of mask errors substantially decreases the tolerance of errors, that incorrectly removing target-dominated regions can be as detrimental to intelligibility as incorrectly adding interferer-dominated regions, and that the individual tolerances of the different types of errors can change when both are present. These trends follow those of NH listeners. However, analysis with a mixed effects model suggests that CI recipients tend to be less tolerant than NH listeners to mask errors in most conditions, at least with respect to the testing methods in each of the studies. This study clearly demonstrates that structure influences the tolerance of errors and therefore should be considered when analyzing binary-masking algorithms.


Journal of the Acoustical Society of America | 2013

Causal binary mask estimation for speech enhancement using sparsity constraints

Abigail Anne Kressner; David V. Anderson; Christopher J. Rozell

While most single-channel noise reduction algorithms fail to improve speech intelligibility, the ideal binary mask (IBM) has demonstrated substantial intelligibility improvements for both normal- and impaired-hearing listeners. However, this approach exploits oracle knowledge of the target and interferer signals to preserve only the time-frequency regions that are target-dominated. Single-channel noise suppression algorithms trying to approximate the IBM using locally estimated signal-to-noise ratios without oracle knowledge have had limited success. Thought of in another way, the IBM exploits the disjoint placement of the target and interferer in time and frequency to create a time-frequency signal representation that is more sparse (i.e., has fewer non-zeros). In recent work (submitted to ICASSP 2013) we have introduced a novel time-frequency masking algorithm based on a sparse approximation algorithm from the signal processing literature. However, the algorithm employs a non-causal estimator. The prese...
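The ideal binary mask mentioned in this abstract keeps only the time-frequency units that are target-dominated, which requires oracle access to the separate target and interferer signals. A minimal sketch of that idea follows, assuming per-unit energies are already available and using a hypothetical 0 dB threshold on the local signal-to-noise ratio; the function name, data layout, and threshold value are illustrative assumptions, not the authors' implementation.

```python
import math


def ideal_binary_mask(target_energy, interferer_energy, threshold_db=0.0):
    """Build a binary mask from oracle per-unit energies.

    Inputs are lists of rows (frequency channels) of non-negative energies,
    one value per time frame. A unit is kept (1) when its local SNR in dB
    exceeds threshold_db. The 0 dB default is an assumed value.
    """
    mask = []
    for target_row, interferer_row in zip(target_energy, interferer_energy):
        row = []
        for t, i in zip(target_row, interferer_row):
            if t <= 0:
                keep = False   # no target energy in this unit
            elif i <= 0:
                keep = True    # target present, no interferer at all
            else:
                keep = 10.0 * math.log10(t / i) > threshold_db
            row.append(1 if keep else 0)
        mask.append(row)
    return mask


# Toy energies for a 2-channel, 2-frame representation.
target = [[4.0, 1.0],
          [0.5, 9.0]]
interferer = [[1.0, 2.0],
              [0.5, 0.0]]

print(ideal_binary_mask(target, interferer))  # [[1, 0], [0, 1]]
```

A practical estimator has no access to `target_energy` and `interferer_energy` separately, which is why the abstract describes the ideal mask as relying on oracle knowledge.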


Journal of the Acoustical Society of America | 2018

The impact of exploiting spectro-temporal context in computational speech segregation

Thomas Bentsen; Abigail Anne Kressner; Torsten Dau; Tobias May

Computational speech segregation aims to automatically segregate speech from interfering noise, often by employing ideal binary mask estimation. Several studies have tried to exploit contextual information in speech to improve mask estimation accuracy by using two frequently used strategies that (1) incorporate delta features and (2) employ support vector machine (SVM) based integration. In this study, two experiments were conducted. In Experiment I, the impact of exploiting spectro-temporal context using these strategies was investigated in stationary and six-talker noise. In Experiment II, the delta features were explored in detail and tested in a setup that considered novel noise segments of the six-talker noise. Computing delta features led to higher intelligibility than employing SVM-based integration, and intelligibility increased with the amount of spectral information exploited via the delta features. The system did not, however, generalize well to novel segments of this noise type. Measured intelligibility was subsequently compared to extended short-term objective intelligibility, hit-false alarm rate, and the amount of mask clustering. None of these objective measures alone could account for measured intelligibility. The findings may have implications for the design of speech segregation systems, and for the selection of a cost function that correlates with intelligibility.


International Conference on Digital Signal Processing | 2011

A causal Locally Competitive Algorithm for the sparse decomposition of audio signals

Adam S. Charles; Abigail Anne Kressner; Christopher J. Rozell

While current inference methods can decompose audio signals, they require the entire signal upfront and are therefore ill-suited for real-time applications requiring causal processing. We propose a neurally-inspired, causal, sparse inference scheme based on the Locally Competitive Algorithm (LCA) over a temporal-spectral neighborhood. We demonstrate that this causal inference scheme can achieve lower sparsity levels and better signal fidelity than current filter and threshold approaches. Additionally, for some regimes, the sparsity level approaches that of Matching Pursuit while still maintaining signal integrity.

Collaboration


Dive into Abigail Anne Kressner's collaborations.

Top Co-Authors

Christopher J. Rozell

Georgia Institute of Technology

David V. Anderson

Georgia Institute of Technology

Tobias May

Technical University of Denmark

Thomas Bentsen

Technical University of Denmark

Torsten Dau

Technical University of Denmark

Kristine Aavild Juhl

Technical University of Denmark

Adam S. Charles

Georgia Institute of Technology
