
Publication


Featured research published by Asterios Toutios.


Journal of the Acoustical Society of America | 2014

Real-time magnetic resonance imaging and electromagnetic articulography database for speech production research (TC)

Shrikanth Narayanan; Asterios Toutios; Vikram Ramanarayanan; Adam C. Lammert; Jangwon Kim; Sungbok Lee; Krishna S. Nayak; Yoon Chul Kim; Yinghua Zhu; Louis Goldstein; Dani Byrd; Erik Bresch; Athanasios Katsamanis; Michael Proctor

USC-TIMIT is an extensive database of multimodal speech production data, developed to complement existing resources available to the speech research community and with the intention of being continuously refined and augmented. The database currently includes real-time magnetic resonance imaging data from five male and five female speakers of American English. Electromagnetic articulography data have also been collected to date from four of these speakers. The two modalities were recorded in two independent sessions while the subjects produced the same 460-sentence corpus used previously in the MOCHA-TIMIT database. In both cases the audio signal was recorded and synchronized with the articulatory data. The database and companion software are freely available to the research community.


Magnetic Resonance in Medicine | 2017

A fast and flexible MRI system for the study of dynamic vocal tract shaping.

Sajan Goud Lingala; Yinghua Zhu; Yoon-Chul Kim; Asterios Toutios; Shrikanth Narayanan; Krishna S. Nayak

The aim of this work was to develop and evaluate an MRI-based system for the study of dynamic vocal tract shaping during speech production, which provides high spatial and temporal resolution.


Journal of the Acoustical Society of America | 2011

Estimating the control parameters of an articulatory model from electromagnetic articulograph data

Asterios Toutios; Slim Ouni; Yves Laprie

Finding the control parameters of an articulatory model that result in given acoustics is an important problem in speech research. However, one should also be able to derive the same parameters from measured articulatory data. In this paper, a method is presented to estimate the control parameters of the model by Maeda from electromagnetic articulography (EMA) data, which allows the derivation of full sagittal vocal tract slices from sparse flesh-point information. First, the articulatory grid system involved in the model's definition is adapted to the speaker involved in the experiment, and EMA data are registered to it automatically. Then, articulatory variables that correspond to measurements defined by Maeda on the grid are extracted. An initial solution for the articulatory control parameters is found by a least-squares method, under constraints ensuring vocal tract shape naturalness. Dynamic smoothness of the parameter trajectories is then imposed by a variational regularization method. Generated vocal tract slices for vowels are compared with slices appearing in magnetic resonance images of the same speaker or found in the literature. Formants synthesized on the basis of these generated slices are adequately close to those tracked in real speech recorded concurrently with EMA.
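
The two fitting stages described in this abstract (a per-frame constrained least-squares fit, followed by a variational smoothing of the parameter trajectories) can be illustrated with a small numerical sketch. Everything below is a placeholder: the model matrix, parameter bounds, and smoothing weight are invented, and Maeda's mapping is treated as linear purely for illustration; the smoothing step is solved in closed form without the bounds, which is a simplification of the paper's method.

```python
# Hedged sketch: per-frame bounded least squares, then quadratic trajectory
# smoothing, assuming a linearized articulatory model x ~ A @ p.
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
T, n_params, n_coords = 100, 7, 12        # frames, model parameters, EMA coordinates
A = rng.normal(size=(n_coords, n_params)) # placeholder linear model matrix
X = rng.normal(size=(T, n_coords))        # placeholder EMA observations, one row per frame

# 1) Frame-by-frame initial solution, with "naturalness" constraints
#    represented here as simple box bounds on the parameters.
P0 = np.array([lsq_linear(A, x, bounds=(-3.0, 3.0)).x for x in X])

# 2) Variational regularization: minimize sum_t ||A p_t - x_t||^2
#    + lambda * ||second temporal difference of p||^2. The problem is
#    quadratic, so the smoothed trajectories have a closed-form solution.
lambda_smooth = 10.0
D2 = np.diff(np.eye(T), n=2, axis=0)      # (T-2, T) second-difference operator
big_lhs = (np.kron(np.eye(T), A.T @ A)
           + lambda_smooth * np.kron(D2.T @ D2, np.eye(n_params)))
big_rhs = (X @ A).reshape(-1)             # stacked A.T @ x_t terms, frame by frame
P = np.linalg.solve(big_lhs, big_rhs).reshape(T, n_params)
print(P0.shape, P.shape)                  # (100, 7) (100, 7)
```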


APSIPA Transactions on Signal and Information Processing | 2016

Advances in real-time magnetic resonance imaging of the vocal tract for speech science and technology research

Asterios Toutios; Shrikanth Narayanan

Real-time magnetic resonance imaging (rtMRI) of the moving vocal tract during running speech production is an important emerging tool for speech production research, providing dynamic information about a speaker's upper airway from the entire mid-sagittal plane or any other scan plane of interest. There have been several advances in the development of speech rtMRI and corresponding analysis tools, and their application to domains such as phonetics and phonological theory, articulatory modeling, and speaker characterization. An important recent development has been the open release of a database that includes speech rtMRI data from five male and five female speakers of American English, each producing 460 phonetically balanced sentences. The purpose of the present paper is to give an overview and outlook of the advances in rtMRI as a tool for speech research and technology development.


Journal of the Acoustical Society of America | 2015

A kinematic study of critical and non-critical articulators in emotional speech production.

Jangwon Kim; Asterios Toutios; Sungbok Lee; Shrikanth Narayanan

This study explores one aspect of the articulatory mechanism that underlies emotional speech production, namely, the behavior of linguistically critical and non-critical articulators in the encoding of emotional information. The hypothesis is that the possibly larger kinematic variability in the behavior of non-critical articulators reveals the underlying emotional expression goals more explicitly than that of the critical articulators; the critical articulators are strictly controlled in service of achieving linguistic goals and exhibit smaller kinematic variability. This hypothesis is examined by kinematic analysis of the movements of critical and non-critical speech articulators gathered using electromagnetic articulography during spoken expressions of five categorical emotions. Analysis results at the level of consonant-vowel-consonant segments reveal that critical articulators for the consonants show more (less) peripheral articulations during production of the consonant-vowel-consonant syllables for high (low) arousal emotions, while non-critical articulators show emotional variation of articulatory position that is less sensitive to the linguistic gestures. Analysis results at the individual phonetic targets show that, overall, between- and within-emotion variability in articulatory positions is larger for non-critical cases than for critical cases. Finally, the results of simulation experiments suggest that the postural variation of non-critical articulators depending on emotion is significantly associated with the controls of critical articulators.
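
A minimal sketch of the kind of variability comparison reported here: pooled within-emotion and between-emotion standard deviations of an articulator position, computed separately for a "critical" and a "non-critical" sensor. All data, emotion labels, and numbers below are made-up placeholders, not values from the study.

```python
# Hedged sketch: within- vs. between-emotion variability of articulator positions.
import numpy as np

rng = np.random.default_rng(1)
emotions = ["neutral", "anger", "happiness", "sadness", "fear"]

def variability(positions_by_emotion):
    """positions_by_emotion: dict emotion -> 1-D array of positions (mm)."""
    means = np.array([v.mean() for v in positions_by_emotion.values()])
    within = np.sqrt(np.mean([v.var(ddof=1) for v in positions_by_emotion.values()]))
    between = means.std(ddof=1)
    return within, between

# Placeholder measurements: 30 tokens per emotion for each articulator.
critical = {e: 10 + 0.5 * rng.normal(size=30) for e in emotions}
noncritical = {e: 5 + i + 2.0 * rng.normal(size=30) for i, e in enumerate(emotions)}

for name, data in [("critical", critical), ("non-critical", noncritical)]:
    w, b = variability(data)
    print(f"{name:13s} within-emotion SD = {w:.2f} mm, between-emotion SD = {b:.2f} mm")
```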


Eurasip Journal on Audio, Speech, and Music Processing | 2013

Acoustic-visual synthesis technique using bimodal unit-selection

Slim Ouni; Vincent Colotte; Utpala Musti; Asterios Toutios; Brigitte Wrobel-Dautcourt; Marie-Odile Berger; Caroline Lavecchia

This paper presents a bimodal acoustic-visual synthesis technique that concurrently generates the acoustic speech signal and a 3D animation of the speaker’s outer face. This is done by concatenating bimodal diphone units that consist of both acoustic and visual information. In the visual domain, we mainly focus on the dynamics of the face rather than on rendering. The proposed technique overcomes the problems of asynchrony and incoherence inherent in classic approaches to audiovisual synthesis. The different synthesis steps are similar to typical concatenative speech synthesis but are generalized to the acoustic-visual domain. The bimodal synthesis was evaluated using perceptual and subjective evaluations. The overall outcome of the evaluation indicates that the proposed bimodal acoustic-visual synthesis technique provides intelligible speech in both acoustic and visual channels.
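
For illustration, here is a minimal sketch of bimodal unit selection by dynamic programming: each candidate diphone unit carries an acoustic and a visual feature vector, and the search minimizes a weighted sum of target costs and concatenation (join) costs over both streams. The features, weights, and cost definitions are assumptions made for the sketch, not the system's actual formulation.

```python
# Hedged sketch: Viterbi-style unit selection with joint acoustic-visual costs.
import numpy as np

rng = np.random.default_rng(2)

def select_units(candidates, targets, w_acoustic=1.0, w_visual=1.0, w_join=1.0):
    """candidates[i]: list of (acoustic_vec, visual_vec) units for target slot i."""
    n = len(targets)
    cost = [np.full(len(candidates[i]), np.inf) for i in range(n)]
    back = [np.zeros(len(candidates[i]), dtype=int) for i in range(n)]

    def target_cost(unit, tgt):
        return (w_acoustic * np.linalg.norm(unit[0] - tgt[0])
                + w_visual * np.linalg.norm(unit[1] - tgt[1]))

    def join_cost(u_prev, u_next):
        # Penalize acoustic and visual discontinuity at the concatenation point.
        return w_join * (np.linalg.norm(u_prev[0] - u_next[0])
                         + np.linalg.norm(u_prev[1] - u_next[1]))

    cost[0] = np.array([target_cost(u, targets[0]) for u in candidates[0]])
    for i in range(1, n):
        for j, u in enumerate(candidates[i]):
            joins = cost[i - 1] + np.array([join_cost(p, u) for p in candidates[i - 1]])
            back[i][j] = int(np.argmin(joins))
            cost[i][j] = joins[back[i][j]] + target_cost(u, targets[i])

    # Backtrack the lowest-cost unit sequence.
    path = [int(np.argmin(cost[-1]))]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]

# Toy usage: 3 target diphones, 4 candidate units each, 5-D features per stream.
targets = [(rng.normal(size=5), rng.normal(size=5)) for _ in range(3)]
candidates = [[(rng.normal(size=5), rng.normal(size=5)) for _ in range(4)] for _ in range(3)]
print(select_units(candidates, targets))
```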


International Conference on Acoustics, Speech, and Signal Processing | 2009

Registration of multimodal data for estimating the parameters of an articulatory model

Michael Aron; Asterios Toutios; Marie-Odile Berger; Erwan Kerrien; Brigitte Wrobel-Dautcourt; Yves Laprie

Being able to animate a speech production model with articulatory data would open applications in many domains. In this paper, we first consider the problem of acquiring articulatory data from non-invasive image and sensor modalities: dynamic ultrasound (US) images, stereovision 3D data, electromagnetic sensors, and MRI. We especially focus on automatic registration methods which enable the fusion of the articulatory features in a common frame. We then derive articulatory parameters by fitting these features with Maeda's model. To our knowledge, this is the first attempt to derive articulatory parameters from features automatically extracted and registered across the modalities. Results prove the soundness of the approach and the reliability of the fused articulatory data.
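
One building block of such a registration pipeline is rigid alignment of corresponding landmarks into a common frame. The sketch below uses the standard Kabsch/Procrustes solution on placeholder 3D landmarks; the paper's actual multimodal registration (ultrasound, stereovision, EMA, MRI) involves considerably more than this single step.

```python
# Hedged sketch: rigid (rotation + translation) alignment of corresponding points.
import numpy as np

def rigid_align(src, dst):
    """Return R, t such that R @ src_i + t best matches dst_i in least squares."""
    src_c, dst_c = src - src.mean(0), dst - dst.mean(0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(0) - R @ src.mean(0)
    return R, t

# Toy usage: recover a known rotation and translation from six 3-D landmarks.
rng = np.random.default_rng(3)
src = rng.normal(size=(6, 3))
angle = 0.4
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([2.0, -1.0, 0.5])
R, t = rigid_align(src, dst)
print(np.allclose(R, R_true), np.allclose(t, [2.0, -1.0, 0.5]))
```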


Computer Speech & Language | 2008

Estimating electropalatographic patterns from the speech signal

Asterios Toutios; Konstantinos G. Margaritis

Electropalatography is a well-established technique for recording information on the patterns of contact between the tongue and the hard palate during speech, leading to a stream of binary vectors representing contacts or non-contacts between the tongue and certain positions on the hard palate. A data-driven approach to mapping the speech signal onto electropalatographic information is presented. Principal component analysis is used to model the spatial structure of the electropalatographic data and support vector regression is used to map acoustic parameters onto projections of the electropalatographic data on the principal components.
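
A minimal sketch of this mapping, assuming a 62-electrode palate and 13-dimensional acoustic parameter vectors (both placeholder dimensions): PCA compresses the binary EPG frames, one support vector regressor per principal component maps acoustic parameters to the projections, and predicted projections are reconstructed and thresholded back to contact patterns.

```python
# Hedged sketch: PCA on binary EPG frames + per-component SVR from acoustics.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(4)
n_frames, n_epg, n_acoustic, n_components = 500, 62, 13, 8

epg = (rng.random((n_frames, n_epg)) > 0.5).astype(float)   # placeholder binary EPG frames
acoustic = rng.normal(size=(n_frames, n_acoustic))           # placeholder acoustic parameters

pca = PCA(n_components=n_components).fit(epg)
targets = pca.transform(epg)                                  # projections on the principal components

# One SVR per principal component, trained on the acoustic parameters.
model = MultiOutputRegressor(SVR(kernel="rbf", C=1.0)).fit(acoustic, targets)

# Predict projections for new acoustic frames, reconstruct EPG patterns,
# and threshold back to contact / non-contact.
pred_proj = model.predict(acoustic[:5])
pred_epg = (pca.inverse_transform(pred_proj) > 0.5).astype(int)
print(pred_epg.shape)   # (5, 62)
```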


Journal of the Acoustical Society of America | 2017

Test–retest repeatability of human speech biomarkers from static and real-time dynamic magnetic resonance imaging

Johannes Töger; Tanner Sorensen; Krishna Somandepalli; Asterios Toutios; Sajan Goud Lingala; Shrikanth Narayanan; Krishna S. Nayak

Static anatomical and real-time dynamic magnetic resonance imaging (RT-MRI) of the upper airway is a valuable method for studying speech production in research and clinical settings. The test-retest repeatability of quantitative imaging biomarkers is an important parameter, since it limits the effect sizes and intragroup differences that can be studied. Therefore, this study aims to present a framework for determining the test-retest repeatability of quantitative speech biomarkers from static MRI and RT-MRI, and apply the framework to healthy volunteers. Subjects (n = 8, 4 females, 4 males) are imaged in two scans on the same day, including static images and dynamic RT-MRI of speech tasks. The inter-study agreement is quantified using intraclass correlation coefficient (ICC) and mean within-subject standard deviation (σe). Inter-study agreement is strong to very strong for static measures (ICC: min/median/max 0.71/0.89/0.98, σe: 0.90/2.20/6.72 mm), poor to strong for dynamic RT-MRI measures of articulator motion range (ICC: 0.26/0.75/0.90, σe: 1.6/2.5/3.6 mm), and poor to very strong for velocities (ICC: 0.21/0.56/0.93, σe: 2.2/4.4/16.7 cm/s). In conclusion, this study characterizes repeatability of static and dynamic MRI-derived speech biomarkers using state-of-the-art imaging. The introduced framework can be used to guide future development of speech biomarkers. Test-retest MRI data are provided free for research use.
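
The two repeatability metrics can be computed from a subjects-by-scans matrix as sketched below, using a one-way random-effects ICC(1,1) and the within-subject standard deviation; the specific ICC form used in the paper may differ, and the biomarker values are made-up placeholders, not data from the study.

```python
# Hedged sketch: test-retest ICC and within-subject SD for one biomarker.
import numpy as np

def icc_1_1_and_sigma_e(scan1, scan2):
    """scan1, scan2: arrays with one biomarker value per subject per scan."""
    x = np.column_stack([scan1, scan2]).astype(float)
    n, k = x.shape
    grand = x.mean()
    ms_between = k * np.sum((x.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_within = np.sum((x - x.mean(axis=1, keepdims=True)) ** 2) / (n * (k - 1))
    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    sigma_e = np.sqrt(ms_within)
    return icc, sigma_e

# Placeholder example: a static vocal-tract measurement (mm) for 8 subjects.
scan1 = np.array([172.0, 165.5, 180.2, 158.9, 176.4, 169.1, 161.7, 174.3])
scan2 = np.array([171.2, 166.0, 181.0, 158.1, 177.5, 168.4, 162.3, 173.6])
icc, sigma_e = icc_1_1_and_sigma_e(scan1, scan2)
print(f"ICC = {icc:.2f}, within-subject SD = {sigma_e:.2f} mm")
```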


Conference of the International Speech Communication Association | 2016

State-of-the-Art MRI Protocol for Comprehensive Assessment of Vocal Tract Structure and Function.

Sajan Goud Lingala; Asterios Toutios; Johannes Töger; Yongwan Lim; Yinghua Zhu; Yoon-Chul Kim; Colin Vaz; Shrikanth Narayanan; Krishna S. Nayak

Magnetic Resonance Imaging (MRI) provides a safe and flexible means to study the vocal tract, and is increasingly used in speech production research. This work details a state-of-the-art MRI protocol for comprehensive assessment of vocal tract structure and function, and presents results from representative speakers. The system incorporates (a) custom upper airway coils that are maximally sensitive to vocal tract tissues, (b) a graphical user interface for 2D real-time MRI that provides on-the-fly reconstruction for interactive localization and correction of imaging artifacts, (c) off-line constrained reconstruction for generating dynamic images with high spatio-temporal resolution (83 frames per second, 2.4 mm), (d) 3D static imaging of sounds sustained for 7 seconds with full vocal tract coverage and isotropic resolution (1.25 mm), (e) T2-weighted high-resolution, high-contrast depiction of soft-tissue boundaries of the full vocal tract (axial, coronal, and sagittal sweeps with 0.58 x 0.58 x 3 mm resolution), and (f) simultaneous audio recording with off-line noise cancellation and temporal alignment of the audio with the 2D real-time MRI. A stimulus set was designed to efficiently capture salient static and dynamic articulatory and morphological aspects of speech production in 90-minute data acquisition sessions.

Collaboration


Dive into Asterios Toutios's collaborations.

Top Co-Authors

Shrikanth Narayanan, University of Southern California
Krishna S. Nayak, University of Southern California
Louis Goldstein, University of Southern California
Slim Ouni, University of Lorraine
Tanner Sorensen, University of Southern California
Dani Byrd, University of Southern California
Yinghua Zhu, University of Southern California