Maria Koutsogiannaki
University of Crete
Publication
Featured research published by Maria Koutsogiannaki.
Computer Speech & Language | 2014
Elizabeth Godoy; Maria Koutsogiannaki; Yannis Stylianou
Lombard and Clear speech represent two acoustically and perceptually distinct speaking styles that humans employ to increase intelligibility. Lombard speech consistently shows increased spectral energy in a band spanning the range of formants, effectively augmenting loudness, while Clear speech exhibits vowel space expansion, indicating greater articulation. On the other hand, analyses in the first part of this work illustrate that Clear speech does not exhibit significant spectral energy boosting, nor does the Lombard effect invoke an expansion of vowel space. Accordingly, though these two acoustic phenomena are largely credited with the respective intelligibility gains of the styles, the present analyses suggest that they are mutually exclusive in human speech production. However, these phenomena can be used to inspire signal processing algorithms that seek to exploit and ultimately compound their respective intelligibility gains, as is explored in the second part of this work. While Lombard-inspired spectral shaping has been shown to successfully increase intelligibility, Clear speech-inspired modifications to expand vowel space are rarely explored. With this in mind, the latter part of this work focuses mainly on a novel frequency warping technique that is shown to achieve vowel space expansion. The frequency warping is then incorporated into an established Lombard-inspired method that pairs Spectral Shaping with Dynamic Range Compression (SSDRC) to maximize speech audibility. Finally, objective and subjective evaluations are presented in order to assess and compare the intelligibility gains of the different styles and their inspired modifications.
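Vowel space expansion is commonly quantified as the area of the polygon spanned by vowels in the (F1, F2) formant plane. The following is a minimal sketch of that measure only, not of the frequency warping technique from the paper. The formant values and the helper names `polygon_area` and `expand_about_centroid` are hypothetical, and the uniform scaling about the centroid is just a stand-in used to produce an "expanded" set of points for comparison.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula.

    points: list of (F1, F2) pairs, given in order around the polygon.
    """
    twice_area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def expand_about_centroid(points, factor):
    """Scale points away from their centroid (illustrative stand-in only)."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return [(cx + factor * (x - cx), cy + factor * (y - cy)) for x, y in points]


# Hypothetical (F1, F2) values in Hz for /i/, /a/, /u/; not taken from the paper.
casual = [(300.0, 2300.0), (750.0, 1300.0), (350.0, 900.0)]
expanded = expand_about_centroid(casual, 1.2)

print("casual vowel space area:  ", polygon_area(casual))
print("expanded vowel space area:", polygon_area(expanded))
```

Scaling every point by a factor `s` about the centroid multiplies the area by `s` squared, so a larger area in this measure corresponds to vowels that lie further apart in the formant plane.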
Conference of the International Speech Communication Association | 2016
Maria Koutsogiannaki; Yannis Stylianou
In this paper, speech intelligibility is enhanced by manipulating the modulation spectrum of the signal. First, the signal is decomposed into Amplitude Modulation (AM) and Frequency Modulation (FM) components using a high-resolution adaptive quasi-harmonic model of speech. Then, the AM part of the midrange frequencies of the speech spectrum is modified by applying a transforming function that follows the characteristics of the clear style of speaking. This increases the modulation depth of the temporal envelopes of casual speech, as in clear speech. The modified AM components of speech are then combined with the original FM parts to synthesize the final processed signal. Subjective listening tests evaluating the intelligibility of speech in noise showed that the suggested approach increases the intelligibility of speech by 40% on average, and that it is comparable to recently proposed state-of-the-art intelligibility-boosting algorithms.
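To make the notion of modulation depth concrete, here is a minimal sketch that deepens a temporal envelope by stretching it around its mean. It is a simplified stand-in and an assumption throughout: it does not implement the adaptive quasi-harmonic model or the transforming function described in the abstract, and the names `modulation_depth` and `expand_envelope`, the linear expansion rule, and the test envelope are all hypothetical.

```python
import math


def modulation_depth(envelope):
    """Modulation depth (max - min) / (max + min) of a non-negative envelope."""
    hi, lo = max(envelope), min(envelope)
    return (hi - lo) / (hi + lo)


def expand_envelope(envelope, factor):
    """Stretch the envelope around its mean; factor > 1 deepens the modulation.

    Values are clipped at zero so the result remains a valid amplitude envelope.
    """
    mean = sum(envelope) / len(envelope)
    return [max(0.0, mean + factor * (e - mean)) for e in envelope]


# Hypothetical envelope: 5 Hz modulation, depth 0.3, sampled at 100 Hz for 1 s.
rate = 100
envelope = [1.0 + 0.3 * math.sin(2.0 * math.pi * 5.0 * i / rate) for i in range(rate)]
deeper = expand_envelope(envelope, 2.0)

print("depth before:", modulation_depth(envelope))
print("depth after: ", modulation_depth(deeper))
```

In a full AM-FM pipeline the modified envelope would be recombined with the unmodified FM (carrier) part of each component before resynthesis, which this sketch leaves out.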
International Conference on Acoustics, Speech, and Signal Processing | 2014
Maria Koutsogiannaki; Yannis Stylianou
In this paper, the problem of modifying casual speech to reach the intelligibility level of clear speech is addressed. Unlike other studies, the modifications to casual speech in this work consider both intelligibility and speech quality. To achieve this, the authors focus on human-like modifications inspired by clear speech. An acoustic analysis performed on clear and casual speech reveals energy differences in specific frequency bands between the two speaking styles. Then, a simple method is used to boost these frequency regions in casual speech. The proposed method, called mix-filtering, uses a multi-band filtering scheme to isolate the information in these frequency bands and then adds this information to the original signal. The method is compared in terms of intelligibility and quality with unmodified casual speech and with a highly intelligible spectral modification technique, namely Spectral Shaping and Dynamic Range Compression (SSDRC). Two different objective measures that are highly correlated with subjective intelligibility scores are used to estimate intelligibility, whereas preference listening tests are performed to evaluate quality. Results show that the mix-filtering technique increases the intelligibility of casual speech while maintaining its quality. On the other hand, while SSDRC achieves higher intelligibility, it significantly degrades the quality of casual speech.
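The idea of isolating band information and adding it back to the original signal can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not give the bands, filter type, or gains, so the second-order band-pass filter, the band centre of 2000 Hz, the Q of 2, the gain of 1.0, and the names `bandpass` and `mix_filter` are all hypothetical.

```python
import math


def bandpass(signal, fs, center_hz, q):
    """Second-order band-pass filter with unit gain at center_hz (biquad)."""
    w0 = 2.0 * math.pi * center_hz / fs
    alpha = math.sin(w0) / (2.0 * q)
    b0, b2 = alpha, -alpha  # the b1 coefficient is zero for this design
    a0, a1, a2 = 1.0 + alpha, -2.0 * math.cos(w0), 1.0 - alpha
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in signal:
        y = (b0 * x + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def mix_filter(signal, fs, bands, gain):
    """Add the band-passed content of each (center_hz, q) band to the signal."""
    mixed = list(signal)
    for center_hz, q in bands:
        band = bandpass(signal, fs, center_hz, q)
        mixed = [m + gain * b for m, b in zip(mixed, band)]
    return mixed


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


def tone(freq_hz, fs, n):
    return [math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]


fs = 16000
bands = [(2000.0, 2.0)]          # hypothetical band to boost
inside = tone(2000.0, fs, 8000)  # tone at the band centre
outside = tone(200.0, fs, 8000)  # tone well below the band

gain_inside = rms(mix_filter(inside, fs, bands, 1.0)) / rms(inside)
gain_outside = rms(mix_filter(outside, fs, bands, 1.0)) / rms(outside)
print("level change inside band: ", gain_inside)
print("level change outside band:", gain_outside)
```

Because the original signal is kept and only band-limited copies are added, content outside the chosen bands passes through almost unchanged, which is consistent with the abstract's emphasis on preserving quality.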
Conference of the International Speech Communication Association | 2014
Maria Koutsogiannaki; Olympia Simantiraki; Gilles Degottex; Yannis Stylianou
Conference of the International Speech Communication Association | 2012
Maria Koutsogiannaki; Michèle Pettinato; Cassie Mayo; Varvara Kandia; Yannis Stylianou
Computer Software and Applications Conference | 2009
Asterios Leonidis; George Baryannis; Xenofon Fafoutis; Maria Korozi; Niki Gazoni; Michail Dimitriou; Maria Koutsogiannaki; Aikaterini Boutsika; Myron Papadakis; Haridimos Papagiannakis; George Tesseris; Emmanouil Voskakis; Antonis Bikakis; Grigoris Antoniou
Conference of the International Speech Communication Association | 2013
Elizabeth Godoy; Maria Koutsogiannaki; Yannis Stylianou
Conference of the International Speech Communication Association | 2017
Maria Koutsogiannaki; Holly L. Francois; Ki-hyun Choo; Eunmi Oh
MAVEBA | 2009
Yannis Pantazis; Maria Koutsogiannaki; Yannis Stylianou
Journal of the Audio Engineering Society | 2017
Ki-hyun Choo; Anton Porov; Maria Koutsogiannaki; Holly L. Francois; Jong-Hoon Jeong; Ho-Sang Sung; Eunmi Oh