Daniel Voigt
Max Planck Society
Publication
Featured research published by Daniel Voigt.
Journal of the Acoustical Society of America | 2010
Anxiong Yang; Jörg Lohscheller; David A. Berry; Stefan Becker; Ulrich Eysholdt; Daniel Voigt; Michael Döllinger
Human voice originates from the three-dimensional (3D) oscillations of the vocal folds. In previous studies, biomechanical properties of vocal fold tissues have been predicted by optimizing the parameters of simple two-mass models to fit their dynamics to the high-speed imaging data from the clinic. However, only lateral and longitudinal displacements of the vocal folds were considered. To extend previous studies, a 3D mass-spring cover model is developed that predicts the 3D vibrations of the entire medial surface of the vocal fold. The model consists of five mass planes arranged in the vertical direction. Each plane contains five longitudinal, coupled mass-spring oscillators. Feasibility of the model is assessed using a large body of dynamical data previously obtained from excised human larynx experiments, in vivo canine larynx experiments, physical models, and numerical models. Typical model output was found to be similar to existing findings. The resulting model enables visualization of the 3D dynamics of the human vocal folds during phonation for both symmetric and asymmetric vibrations.
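As a rough illustration of what "coupled mass-spring oscillators" means in practice, the following Python sketch integrates a single row of five masses, each anchored by its own spring and coupled to its neighbors. It is not the published model: the function name simulate_chain, all parameter values, the restriction to one-dimensional motion, and the semi-implicit Euler integrator are assumptions made for illustration, and aerodynamic driving forces are left out entirely.

```python
def simulate_chain(x0=None, n_masses=5, mass=1.0, k_anchor=1.0,
                   k_couple=0.5, damping=0.0, dt=0.001, steps=1000):
    """Integrate a row of coupled mass-spring oscillators (illustrative only).

    Each mass is tied to its rest position by an anchor spring (k_anchor)
    and to its immediate neighbors by coupling springs (k_couple).
    Returns the final displacements and velocities.
    """
    x = list(x0) if x0 is not None else [0.0] * n_masses
    if len(x) != n_masses:
        raise ValueError("x0 must contain one displacement per mass")
    v = [0.0] * n_masses

    for _ in range(steps):
        # Compute all forces from the current state before updating anything.
        forces = []
        for i in range(n_masses):
            f = -k_anchor * x[i] - damping * v[i]
            if i > 0:
                f += k_couple * (x[i - 1] - x[i])
            if i < n_masses - 1:
                f += k_couple * (x[i + 1] - x[i])
            forces.append(f)
        # Semi-implicit Euler: update velocity first, then position.
        for i in range(n_masses):
            v[i] += dt * forces[i] / mass
            x[i] += dt * v[i]
    return x, v
```

Calling simulate_chain(x0=[1.0, 0.0, 0.0, 0.0, 0.0]) displaces only the first mass, and the coupling springs then pass the motion along the row. The published model extends this idea to five such rows stacked vertically and to motion in three dimensions.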
Artificial Intelligence in Medicine | 2010
Daniel Voigt; Michael Döllinger; Thomas Braunschweig; Anxiong Yang; Ulrich Eysholdt; Jörg Lohscheller
Objective: This work presents a computer-aided method for automatically and objectively classifying individuals with healthy and dysfunctional vocal fold vibration patterns as depicted in clinical high-speed (HS) videos of the larynx. Methods: By employing a specialized image segmentation and vocal fold movement visualization technique, namely phonovibrography, a novel set of numerical features is derived from laryngeal HS videos capturing the dynamic behavior and the symmetry of oscillating vocal folds. In order to assess the discriminatory power of the features, a support vector machine is applied to the preprocessed data with regard to clinically relevant diagnostic tasks. Finally, the classification performance of the learned nonlinear models is evaluated to allow for conclusions to be drawn about suitability of features and data resulting from different examination paradigms. As a reference, a second feature set is determined which corresponds to more traditional voice analysis approaches. Results: For the first time an automatic classification of healthy and pathological voices could be obtained by analyzing the vibratory patterns of vocal folds using phonovibrograms (PVGs). An average classification accuracy of approximately 81% was achieved for 2-class discrimination with PVG features. This exceeds the results obtained through traditional voice analysis features. Furthermore, a relevant influence of phonation frequency on classification accuracy was substantiated by the clinical HS data. Conclusion: The PVG feature extraction and classification approach can be assessed as being promising with regard to the diagnosis of functional voice disorders. The obtained results indicate that an objective analysis of dysfunctional vocal fold vibration can be achieved with considerably high accuracy. Moreover, the PVG classification method holds a lot of potential when it comes to the clinical assessment of voice pathologies in general, as the diagnostic support can be provided to the voice clinician in a timely and reliable manner. Due to the observed interdependency between phonation frequency and classification accuracy, in future comparative studies of HS recordings of oscillating vocal folds homogeneous frequencies should be taken into account during examination.
Journal of the Acoustical Society of America | 2011
Anxiong Yang; Michael Stingl; David A. Berry; Jörg Lohscheller; Daniel Voigt; Ulrich Eysholdt; Michael Döllinger
With the use of an endoscopic, high-speed camera, vocal fold dynamics may be observed clinically during phonation. However, observation and subjective judgment alone may be insufficient for clinical diagnosis and documentation of improved vocal function, especially when the laryngeal disease lacks any clear morphological presentation. In this study, biomechanical parameters of the vocal folds are computed by adjusting the corresponding parameters of a three-dimensional model until the dynamics of both systems are similar. First, a mathematical optimization method is presented. Next, model parameters (such as pressure, tension and masses) are adjusted to reproduce vocal fold dynamics, and the deduced parameters are physiologically interpreted. Various combinations of global and local optimization techniques are attempted. Evaluation of the optimization procedure is performed using 50 synthetically generated data sets. The results show sufficient reliability, including 0.07 normalized error, 96% correlation, and 91% accuracy. The technique is also demonstrated on data from human hemilarynx experiments, in which a low normalized error (0.16) and high correlation (84%) values were achieved. In the future, this technique may be applied to clinical high-speed images, yielding objective measures with which to document improved vocal function of patients with voice disorders.
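The idea of adjusting model parameters until the model reproduces observed dynamics, using a global search followed by a local one, can be sketched in a few lines of Python. Everything here is a deliberately small stand-in rather than the method of the paper: the model is a single undamped oscillator with one unknown stiffness, the global stage is a coarse grid search, the local stage is a golden-section search, and the normalized error is assumed to be the root of the squared residual divided by the squared target signal (the abstract does not define it). All function names are hypothetical.

```python
import math


def simulate(stiffness, times, mass=1.0):
    """Displacement of an undamped oscillator released from x = 1 (toy model)."""
    omega = math.sqrt(stiffness / mass)
    return [math.cos(omega * t) for t in times]


def normalized_error(model, target):
    """Assumed error measure: residual energy relative to target energy."""
    num = sum((m - y) ** 2 for m, y in zip(model, target))
    den = sum(y ** 2 for y in target)
    return math.sqrt(num / den)


def golden_section_minimize(f, lo, hi, tol=1e-6):
    """Local stage: shrink the bracket [lo, hi] around a single minimum of f."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def fit_stiffness(times, target, k_min=0.5, k_max=10.0, grid_points=40):
    """Global grid search for a good bracket, then local refinement inside it."""
    def objective(k):
        return normalized_error(simulate(k, times), target)

    step = (k_max - k_min) / (grid_points - 1)
    grid = [k_min + i * step for i in range(grid_points)]
    best = min(range(grid_points), key=lambda i: objective(grid[i]))
    lo = grid[max(best - 1, 0)]
    hi = grid[min(best + 1, grid_points - 1)]
    return golden_section_minimize(objective, lo, hi)
```

The two-stage design matters because the error surface of an oscillating model typically has many local minima: the coarse global pass only needs to land in the right basin, and the local pass then refines the estimate cheaply.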
European Archives of Oto-rhino-laryngology | 2010
Katrin Werth; Daniel Voigt; Michael Döllinger; Ulrich Eysholdt; Jörg Lohscheller
In this study, a retrospective analysis of clinical voice perturbation measures, the Dysphonia Severity Index and subjectively perceived hoarseness was performed to determine their clinical value. The study included the data of 580 healthy and 1,700 pathologic voices, which were investigated under the following aspects. The relevant parameters were identified and their interrelation determined. Group differences between healthy and pathologic voices were identified, and it was investigated whether voice quality measures allowed an automatic diagnosis of voice disorders. The analysis revealed significant differences between the clinical groups, which indicate the diagnostic relevance of voice quality measures. However, an individual diagnosis of the underlying voice disorder failed due to a vast spread of the parameter values within the respective groups. Classification accuracies of 75–90% were achieved. The high misclassification rate of up to 25% implied that in voice disorder diagnosis, the individual interpretation of the parameter values has to be done carefully.
Journal of the Acoustical Society of America | 2010
Daniel Voigt; Michael Döllinger; Ulrich Eysholdt; Anxiong Yang; Ercan Gürlek; Jörg Lohscheller
In this work a detection algorithm for mucosal wave propagation is presented. By incorporating physiological knowledge of mucosal wave properties and taking the segmented lateral movement of both vocal fold edges as a basis, the spatio-temporal position of the traveling mucosal wave is identified and quantitatively captured. The course of mucosal wave propagation can be successfully detected and analyzed with regard to discriminating different types of mucosal wave activity (in terms of spread velocity and symmetry). The preliminary results obtained for six exemplary laryngeal high-speed recordings are promising and demonstrate the potential of the proposed detection and objective description approach.
Journal of Phonetics | 2016
Leonardo Lancia; Daniel Voigt; Georgy Krasovitskiy
This paper introduces an original variant of recurrence analysis to quantify the degree of regularity of vocal fold vibration as captured by electroglottography during phonation. The proposed technique is applied to the analysis of laryngealized phonation as this phonation type typically shows irregular vibration cycles. The reliability of this approach is validated with synthetic vocal fold vibration signals, demonstrating that it permits measuring the regularity of vocal fold vibration, unaffected by changes in fundamental frequency. The method is also applied to real electroglottographic signals recorded at the onset of vowel-initial nonsense words produced in a speeded repetition task by five female German speakers. Results show that the degree of laryngealization during the production of word-initial vowels is modulated by the presence of stress (with stressed vowels being less laryngealized). Due to its robustness to changes of F0, the proposed technique proves to be a suitable tool for studying vocal fold regularity in concatenated speech. Its applications are not limited to the study of glottalization, since the degree of regularity of vocal fold vibration has paralinguistic functions and is a clinically relevant measure of voice pathologies.
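For readers unfamiliar with recurrence analysis, the sketch below computes the textbook recurrence rate of a signal in Python: the signal is delay-embedded, and the rate is the fraction of point pairs in the embedded space that lie within a given radius of each other. This is the generic measure, not the variant introduced in the paper, and the function names, the default embedding parameters, and the radius are assumptions chosen for illustration.

```python
import math


def delay_embed(signal, dim, delay):
    """Turn a 1-D signal into points (s[i], s[i+delay], ..., s[i+(dim-1)*delay])."""
    n = len(signal) - (dim - 1) * delay
    if n < 2:
        raise ValueError("signal too short for this embedding")
    return [tuple(signal[i + k * delay] for k in range(dim)) for i in range(n)]


def recurrence_rate(signal, dim=2, delay=1, radius=0.1):
    """Fraction of distinct point pairs closer than `radius` in embedded space."""
    points = delay_embed(signal, dim, delay)
    n = len(points)
    recurrent = 0
    for i in range(n):
        for j in range(n):
            # Skip the trivial self-match on the main diagonal.
            if i != j and math.dist(points[i], points[j]) <= radius:
                recurrent += 1
    return recurrent / (n * (n - 1))
```

A strictly periodic signal revisits the same states every cycle and therefore yields a higher recurrence rate than an irregular one of similar amplitude, which is the intuition behind using recurrence-based measures to quantify the regularity of vocal fold vibration.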
Folia Phoniatrica Et Logopaedica | 2015
Sebastian Dippold; Daniel Voigt; Bernhard Richter; Matthias Echternach
Background: Little data are available concerning register functions in different styles of singing such as classically or jazz-trained voices. Differences between registers seem to be much more audible in jazz singing than classical singing, and so we hypothesized that classically trained singers exhibit a smoother register transition, stemming from more regular vocal fold oscillation patterns. Methods: High-speed digital imaging (HSDI) was used for 19 male singers (10 jazz-trained singers, 9 classically trained) who performed a glissando from modal to falsetto register across the register transition. Vocal fold oscillation patterns were analyzed in terms of different parameters of regularity such as relative average perturbation (RAP), correlation dimension (D2) and shimmer. Results: HSDI observations showed more regular vocal fold oscillation patterns during the register transition for the classically trained singers. Additionally, the RAP and D2 values were generally lower and more consistent for the classically trained singers compared to the jazz singers. However, intergroup comparisons showed no statistically significant differences. Conclusion: Some of our results may support the hypothesis that classically trained singers exhibit a smoother register transition from modal to falsetto register.
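Two of the regularity parameters named above have simple closed forms. The Python sketch below follows the commonly used definitions: relative average perturbation (RAP) is the mean absolute deviation of each cycle period from the average of itself and its two neighbors, divided by the mean period, and local shimmer is the mean absolute difference between consecutive cycle amplitudes, divided by the mean amplitude. Whether the study used exactly these variants is an assumption, and the function names are hypothetical.

```python
def relative_average_perturbation(periods):
    """RAP of a sequence of cycle periods (three-point smoothing)."""
    n = len(periods)
    if n < 3:
        raise ValueError("RAP needs at least three cycle periods")
    deviations = [
        abs(periods[i] - (periods[i - 1] + periods[i] + periods[i + 1]) / 3.0)
        for i in range(1, n - 1)
    ]
    mean_deviation = sum(deviations) / (n - 2)
    mean_period = sum(periods) / n
    return mean_deviation / mean_period


def local_shimmer(amplitudes):
    """Local shimmer of a sequence of per-cycle peak amplitudes."""
    n = len(amplitudes)
    if n < 2:
        raise ValueError("shimmer needs at least two cycle amplitudes")
    diffs = [abs(amplitudes[i + 1] - amplitudes[i]) for i in range(n - 1)]
    mean_diff = sum(diffs) / (n - 1)
    mean_amplitude = sum(amplitudes) / n
    return mean_diff / mean_amplitude
```

Both measures are zero for perfectly regular oscillation and grow with cycle-to-cycle variability, so lower values correspond to the "more regular" oscillation patterns reported for the classically trained singers.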
Journal of the Acoustical Society of America | 2011
Daniel Voigt; Ulrich Eysholdt
In the previous work, a computer-based analysis framework was proposed, which is capable of objectively and automatically classifying vocal fold vibrations as captured by high-speed videoendoscopy during phonation. The method is based on quantitative feature extraction from Phonovibrograms combined with nonlinear machine learning techniques, allowing for the discrimination of normal and pathological laryngeal movement patterns. The diagnostic reliability and potential of this analysis approach were demonstrated. However, the practically relevant question, whether certain control parameters of the procedure can lead to increased classification accuracy, remained partially unanswered. In this study, the following parameter sets of the analysis framework were investigated in a systematic manner: method of feature extraction, type of feature aggregation and normalization, number of considered oscillation cycles, feature laterality, classification task, and employed machine learning algorithm. For this purpose...
Artificial Intelligence in Medicine in Europe | 2009
Daniel Voigt; Michael Döllinger; Anxiong Yang; Ulrich Eysholdt; Jörg Lohscheller
For the diagnosis of pathological voices it is of particular importance to examine the dynamic properties of the underlying vocal fold (VF) movements occurring at a fundamental frequency of 100–300 Hz. To this end, a patient's laryngeal oscillation patterns are captured with state-of-the-art endoscopic high-speed (HS) camera systems capable of recording 4000 frames/second. To date the clinical analysis of these HS videos is commonly performed in a subjective manner via slow-motion playback. Hence, the resulting diagnoses are inherently error-prone, exhibiting high inter-rater variability. In this paper an objective method for overcoming this drawback is presented which employs a quantitative description and classification approach based on a novel image analysis strategy called Phonovibrography. By extracting the relevant VF movement information from HS videos the spatio-temporal patterns of laryngeal activity are captured using a set of specialized features. As a reference for performance, conventional voice analysis features are also computed. The derived features are analyzed with different machine learning (ML) algorithms regarding clinically meaningful classification tasks. The applicability of the approach is demonstrated using a clinical data set comprising individuals with normophonic and paralytic voices. The results indicate that the presented approach holds a lot of promise for providing reliable diagnosis support in the future.
Journal of the Acoustical Society of America | 2008
Jörg Lohscheller; Daniel Voigt; Michael Doellinger
Clinical examination of voice disorders demands an endoscopic observation of vocal fold vibrations. High-speed endoscopy is the state-of-the-art technology for investigation of vocal fold vibrations. A novel visualization strategy is proposed which transforms the segmented contours of vocal fold edges into a set of two-dimensional images, denoted Phonovibrograms (PVG). Within PVGs the individual type of vocal fold vibration becomes uniquely characterized by specific geometric patterns which can be seen as fingerprints of vocal fold vibration. The PVGs give an intuitive access to the type and degree of the laryngeal asymmetry which is essential to quantify the effects of functional and organic voice disorders. To determine the vibration characteristics within the computed PVG, pattern recognition algorithms are applied. Thus, for each vocal fold the vibration type can be quantified and classified. The results of the PVG classification will be presented for 80 subjects (normal and pathological voices). It will be shown that a classification of the vibration type can be performed very precisely even in disturbed vocal fold vibrations. The obtained PVG images can be documented and stored on a hard disk using a lossless image data format. The quantitative description of PVG patterns has the potential to realize a novel classification of vocal fold vibrations.
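One way to picture the transformation from segmented edge contours to a two-dimensional image is the Python sketch below. It assumes that, for every video frame, each vocal fold edge has already been reduced to a list of distances from the glottal axis, sampled at the same positions along that axis. The layout used here (left fold reversed and stacked on top of the right fold, one image column per frame) is an illustrative assumption, and the function name build_pvg is hypothetical; the abstract does not specify these details.

```python
def build_pvg(left_frames, right_frames):
    """Assemble a PVG-like image from per-frame edge distances (illustrative).

    left_frames / right_frames: one list per video frame, each holding the
    distances of that fold's edge from the glottal axis.
    Returns a 2-D list: rows are positions along the folds, columns are time.
    """
    if len(left_frames) != len(right_frames):
        raise ValueError("both folds need the same number of frames")
    columns = []
    for left, right in zip(left_frames, right_frames):
        # Unfold the glottis: left fold reversed on top, right fold below.
        columns.append(list(reversed(left)) + list(right))
    # Transpose so that each frame becomes one column of the image.
    return [list(row) for row in zip(*columns)]
```

In an image built this way, a symmetric and periodic vibration appears as a repeating pattern mirrored about the horizontal midline, while asymmetries between the two folds show up as differences between the upper and lower halves.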