Publication


Featured research published by Jarek Krajewski.


Proceedings of the 4th International Workshop on Audio/Visual Emotion Challenge | 2014

AVEC 2014: 3D Dimensional Affect and Depression Recognition Challenge

Michel F. Valstar; Björn W. Schuller; Kirsty Smith; Timur R. Almaev; Florian Eyben; Jarek Krajewski; Roddy Cowie; Maja Pantic

Mood disorders are inherently related to emotion. In particular, the behaviour of people suffering from mood disorders such as unipolar depression shows a strong temporal correlation with the affective dimensions valence, arousal and dominance. In addition to structured self-report questionnaires, psychologists and psychiatrists draw on observations of facial expressions and vocal cues when evaluating a patient's level of depression. It is in this context that we present the fourth Audio-Visual Emotion recognition Challenge (AVEC 2014). This edition of the challenge uses a subset of the tasks used in a previous challenge, allowing for more focussed studies. In addition, labels for a third dimension (Dominance) have been added and the number of annotators per clip has been increased to a minimum of three, with most clips annotated by five. The challenge has two goals logically organised as sub-challenges: the first is to predict the continuous values of the affective dimensions valence, arousal and dominance at each moment in time. The second is to predict the value of a single self-reported severity of depression indicator for each recording in the dataset. This paper presents the challenge guidelines, the common data used, and the performance of the baseline system on the two tasks.
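The two sub-challenge tasks lend themselves to standard regression-style scoring. The minimal sketch below assumes Pearson's correlation for the continuous affect traces and RMSE/MAE for the per-recording depression score; these are common AVEC metric choices and an assumption here, not taken from the abstract itself.

```python
# Sketch: scoring AVEC-style sub-challenge outputs.
# Metric choice (Pearson's r for continuous affect, RMSE/MAE for the
# depression score) is an assumption of this sketch, not from the paper.
import numpy as np
from scipy.stats import pearsonr

def score_affect(pred: np.ndarray, gold: np.ndarray) -> float:
    """Correlation between predicted and annotated per-frame affect traces."""
    r, _ = pearsonr(pred, gold)
    return r

def score_depression(pred: np.ndarray, gold: np.ndarray):
    """RMSE and MAE over one self-reported depression score per recording."""
    err = pred - gold
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(np.abs(err)))

# Toy usage with random data standing in for valence predictions and labels.
rng = np.random.default_rng(0)
gold = rng.normal(size=500)
pred = gold + rng.normal(scale=0.5, size=500)
print(score_affect(pred, gold))
print(score_depression(pred, gold))
```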


Speech Communication | 2015

A review of depression and suicide risk assessment using speech analysis

Nicholas Cummins; Stefan Scherer; Jarek Krajewski; Sebastian Schnieder; Julien Epps; Thomas F. Quatieri

Highlights: a review of current diagnostic and assessment methods for depression and suicidality; a review of the characteristics of existing depressed and suicidal speech databases; a discussion of the effects of depression and suicidality on common speech characteristics; a review of studies that use speech to classify or predict depression or suicidality; and a discussion of future challenges in finding speech-based markers of either condition.

This paper is the first review into the automatic analysis of speech for use as an objective predictor of depression and suicidality. Both conditions are major public health concerns; depression has long been recognised as a prominent cause of disability and burden worldwide, whilst suicide is a misunderstood and complex cause of death that strongly impacts the quality of life and mental health of the families and communities left behind. Despite this prevalence, the diagnosis of depression and the assessment of suicide risk are difficult tasks owing to their complex clinical characterisations, and are nominally achieved by the categorical assessment of a set of specific symptoms. However, many of the key symptoms of either condition, such as altered mood and motivation, are not physical in nature; assigning a categorical score to them therefore introduces a range of subjective biases into the diagnostic procedure. Due to these difficulties, research into finding a set of biological, physiological and behavioural markers to aid clinical assessment is gaining in popularity. This review starts by building the case for speech as a key objective marker for both conditions: it reviews current diagnostic and assessment methods for depression and suicidality, including key non-speech biological, physiological and behavioural markers, and highlights the expected cognitive and physiological changes associated with both conditions that affect speech production. We then review the key characteristics (size, associated clinical scores and collection paradigm) of existing depressed and suicidal speech databases. The main focus of this paper is on how common paralinguistic speech characteristics are affected by depression and suicidality and on the application of this information in classification and prediction systems. The paper concludes with an in-depth discussion of the key challenges that will shape the future research directions of this rapidly growing field of speech processing research: improving generalisability through greater research collaboration and increased standardisation of data collection, and mitigating unwanted sources of variability.
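As a loose illustration of the kind of paralinguistic descriptors the review discusses, the toy sketch below computes short-time energy and a pause proportion from a raw waveform. The frame length and threshold are arbitrary choices for this sketch and are not drawn from the review.

```python
# Toy illustration (not from the review) of two paralinguistic descriptors:
# short-time energy and the proportion of pause frames.
import numpy as np

def frame_energy(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Root-mean-square energy per frame (25 ms frames, 10 ms hop at 16 kHz)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.sqrt(np.mean(f ** 2)) for f in frames])

def pause_proportion(signal: np.ndarray, rel_threshold: float = 0.1) -> float:
    """Fraction of frames whose energy falls below a relative threshold."""
    e = frame_energy(signal)
    return float(np.mean(e < rel_threshold * e.max()))

# Usage with a synthetic signal: 1 s of 'speech' followed by 1 s of silence.
sr = 16000
speech = np.sin(2 * np.pi * 150 * np.arange(sr) / sr)
silence = np.zeros(sr)
print(pause_proportion(np.concatenate([speech, silence])))  # roughly 0.5
```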


Behavior Research Methods | 2009

Acoustic sleepiness detection: Framework and validation of a speech-adapted pattern recognition approach

Jarek Krajewski; Anton Batliner; Martin Golz

This article describes a general framework for detecting sleepiness states on the basis of prosody, articulation, and speech-quality-related speech characteristics. The advantages of this automatic real-time approach are that obtaining speech data is nonobtrusive and free from sensor application and calibration efforts. Different types of acoustic features derived from speech, speaker, and emotion recognition were employed (frame-level speech features). Combining these features with high-level contour descriptors, which capture the temporal information of frame-level descriptor contours, results in 45,088 features per speech sample. In general, the measurement process follows the speech-adapted steps of pattern recognition: (1) recording speech, (2) preprocessing, (3) feature computation (using perceptual and signal-processing-related features such as fundamental frequency, intensity, pause patterns, formants, and cepstral coefficients), (4) dimensionality reduction, (5) classification, and (6) evaluation. After a correlation-filter-based feature subset selection was applied to the feature space to find the most relevant features, different classification models were trained. The best model, a support-vector machine, achieved 86.1% classification accuracy in predicting sleepiness in a sleep deprivation study (two-class problem, N = 12; 1:00-8:00 a.m.).
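A minimal sketch of such a filter-then-classify chain is shown below, using scikit-learn stand-ins: SelectKBest with f_classif in place of the paper's correlation filter, and a synthetic feature matrix in place of the 45,088 acoustic features. Both substitutions are assumptions of this sketch, not the authors' exact setup.

```python
# Minimal sketch of the speech-adapted pattern-recognition chain:
# feature filtering followed by an SVM, evaluated with cross-validation.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 2000))          # speech samples x acoustic features
y = rng.integers(0, 2, size=120)          # sleepy vs. non-sleepy labels
X[y == 1, :20] += 0.8                     # plant a weak signal in 20 features

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=50),         # filter step: keep 50 best features
    SVC(kernel="rbf", C=1.0),             # classification step
)
print(cross_val_score(clf, X, y, cv=5).mean())
```

Keeping the filter inside the pipeline ensures feature selection is re-fit within each cross-validation fold, avoiding optimistic bias.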


Neurocomputing | 2012

Applying multiple classifiers and non-linear dynamics features for detecting sleepiness from speech

Jarek Krajewski; Sebastian Schnieder; David Sommer; Anton Batliner; Björn W. Schuller

Comparing different novel feature sets and classifiers for speech-processing-based fatigue detection is the primary aim of this study. To this end, we conducted a within-subject partial sleep deprivation design (20.00-04.00 h, N=77 participants) and recorded 372 speech samples of sustained vowel phonation. Self-reports on the Karolinska Sleepiness Scale (KSS) and an observer report on the KSS, the KSS Observer Scale, were used to determine sleepiness reference values. Feature extraction methods from non-linear dynamics (NLD) provide additional information regarding the dynamics and structure of sleepy speech. In all, 395 NLD features and 170 phonetic features were computed, which partially represent as yet unknown auditory-perceptual concepts. Several NLD and phonetic features show significant correlations with KSS ratings, e.g., among the NLD features, the skewness of vector length within the reconstructed phase space for male speakers (r=.56) and the mean of Cao's minimum embedding dimension for female speakers (r=-.39). After a correlation-filter feature subset selection, different classification models and ensemble classifiers (AdaBoost, Bagging) were trained. Bagging procedures achieved the best performance for male and female speakers on both the phonetic and the NLD feature sets. The best models for the phonetic feature set achieved classification accuracies in detecting sleepiness of 78.3% (Naive Bayes) for male and 68.5% (Bagging Bayes Net) for female speakers. The best models for the NLD feature set achieved 77.2% (Bagging Bayes Net) for male and 76.8% (Bagging Bayes Net) for female speakers. Moreover, combining the phonetic and NLD feature sets provided additional information and thus resulted in an improved highest unweighted accuracy (UA) of 79.6% for male (Bayes Net) and 77.1% for female (AdaBoost Nearest Neighbor) speakers.
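A rough sketch of this kind of ensemble comparison with scikit-learn follows. GaussianNB stands in for the Bayes-net base learner (scikit-learn has no Bayesian-network classifier) and random data stands in for the phonetic/NLD feature sets, so the printed numbers are illustrative only.

```python
# Sketch of an ensemble-classifier comparison for sleepiness detection.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(372, 170))           # 372 samples x phonetic features
y = rng.integers(0, 2, size=372)          # sleepy vs. alert

models = {
    "NaiveBayes": GaussianNB(),
    "Bagging(NaiveBayes)": BaggingClassifier(GaussianNB(), n_estimators=25),
    "Bagging(kNN)": BaggingClassifier(KNeighborsClassifier(), n_estimators=25),
    "AdaBoost(stumps)": AdaBoostClassifier(n_estimators=50),
}
for name, model in models.items():
    print(name, cross_val_score(model, X, y, cv=5).mean())
```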


Journal of Occupational Health Psychology | 2010

Regulating strain states by using the recovery potential of lunch breaks.

Jarek Krajewski; Rainer Wieland; Martin Sauerland

The aim of this worksite study is to elucidate the strain-reducing impact of different ways of spending lunch breaks. With the help of the so-called silent-room cabin concept, it was possible to provide a lunch-break relaxation opportunity offering visual and territorial privacy. To evaluate the proposed effects, 14 call center agents were assigned to either a 20-min progressive muscle relaxation (PMR) or a small-talk (ST) break group. We analyzed the data in a controlled trial over a period of 6 months (four measurements a day at 12:00, 13:00, 16:00, and 20:00, taken every 2 months) using independent observer and self-report ratings of emotional, mental, motivational, and physical strain. Results indicated that only the PMR break reduced post-lunchtime and afternoon strain. Although further intervention research is required, our results suggest that PMR lunch breaks may sustainably reduce strain states in real worksite settings.


IEEE Transactions on Affective Computing | 2012

The Voice of Leadership: Models and Performances of Automatic Analysis in Online Speeches

Felix Weninger; Jarek Krajewski; Anton Batliner; Björn W. Schuller

We introduce the automatic determination of leadership emergence from acoustic and linguistic features in online speeches. Full realism is provided by the varying and challenging acoustic conditions of the presented YouTube corpus of speeches available online, labeled by 10 raters, and by processing that includes Long Short-Term Memory-based robust voice activity detection (VAD) and automatic speech recognition (ASR) prior to feature extraction. We discuss cluster-preserving scaling of the 10 original dimensions for discrete and continuous task modeling, ground truth establishment, and appropriate feature extraction for this novel speaker trait analysis paradigm. In extensive classification and regression runs, different temporal chunkings and optimal late fusion strategies (LFSs) of feature streams are presented. As a result, achievers, charismatic speakers, and team players can be recognized significantly above chance level, reaching up to 72.5 percent accuracy on unseen test data.
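One simple way to realise a late fusion strategy over separate feature streams is to average the class posteriors of per-stream classifiers. The sketch below assumes synthetic acoustic and linguistic feature blocks and logistic-regression stand-ins rather than the paper's actual system.

```python
# Sketch of simple late fusion: average class probabilities from classifiers
# trained on separate acoustic and linguistic feature streams.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 400
X_acoustic = rng.normal(size=(n, 60))      # e.g. prosodic/spectral functionals
X_linguistic = rng.normal(size=(n, 30))    # e.g. bag-of-words from ASR output
y = rng.integers(0, 2, size=n)             # high vs. low rated leadership trait

idx_train, idx_test = train_test_split(np.arange(n), test_size=0.25, random_state=0)

clf_a = LogisticRegression(max_iter=1000).fit(X_acoustic[idx_train], y[idx_train])
clf_l = LogisticRegression(max_iter=1000).fit(X_linguistic[idx_train], y[idx_train])

# Late fusion: average the two streams' class-posterior estimates.
proba = (0.5 * clf_a.predict_proba(X_acoustic[idx_test])
         + 0.5 * clf_l.predict_proba(X_linguistic[idx_test]))
pred = proba.argmax(axis=1)
print("fused accuracy:", (pred == y[idx_test]).mean())
```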


International Conference on Computers Helping People with Special Needs | 2008

An Acoustic Framework for Detecting Fatigue in Speech Based Human-Computer-Interaction

Jarek Krajewski; Rainer Wieland; Anton Batliner

This article describes a general framework for detecting accident-prone fatigue states based on prosody, articulation, and speech-quality-related speech characteristics. The advantages of this real-time measurement approach are that obtaining speech data is non-obtrusive and free from sensor application and calibration efforts. The main part of the feature computation is the combination of frame-level speech features and high-level contour descriptors, resulting in over 8,500 features per speech sample. In general, the measurement process follows the speech-adapted steps of pattern recognition: (a) recording speech, (b) preprocessing (segmenting speech units of interest), (c) feature computation (using perceptual and signal-processing-related features such as fundamental frequency, intensity, pause patterns, formants, and cepstral coefficients), (d) dimensionality reduction (filter- and wrapper-based feature subset selection, (un-)supervised feature transformation), (e) classification (e.g., SVM and k-NN classifiers), and (f) evaluation (e.g., 10-fold cross-validation). The validity of this approach is briefly discussed by summarizing the empirical results of a sleep deprivation study.
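To illustrate the combination of frame-level features with high-level contour descriptors, the sketch below summarises a per-frame energy contour with a handful of statistical functionals. The chosen functionals are an illustrative subset, not the paper's descriptor set.

```python
# Sketch: summarise a frame-level feature contour with global functionals.
import numpy as np

def contour_functionals(contour: np.ndarray) -> dict:
    """Summarise a frame-level feature contour with a few global statistics."""
    x = np.arange(len(contour))
    slope = np.polyfit(x, contour, 1)[0]           # linear trend of the contour
    return {
        "mean": float(np.mean(contour)),
        "std": float(np.std(contour)),
        "min": float(np.min(contour)),
        "max": float(np.max(contour)),
        "range": float(np.ptp(contour)),
        "slope": float(slope),
    }

# Usage: a fake energy contour that slowly decays, as fatigued speech might.
frames = np.linspace(1.0, 0.4, num=300) + np.random.default_rng(4).normal(scale=0.05, size=300)
print(contour_functionals(frames))
```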


Driving Assessment 2009: 5th International Driving Symposium on Human Factors in Driving Assessment, Training and Vehicle Design | 2009

Operator fatigue estimation using heart rate measures

C. Heinze; U. Trutschel; T. Schnupp; D. Sommer; A. Schenka; Jarek Krajewski; M. Golz

The growing number of fatigue-related accidents in recent years has become a serious concern. Accidents caused by fatigue in transportation and in mining operations involving heavy equipment can lead to substantial damage and loss of human life. Preventing such fatigue-related accidents is highly desirable but requires techniques for continuously estimating and predicting the operator's alertness state. This paper proposes ECG-based operator fatigue estimation. To this end, ECG was recorded continuously, and several heart rate measures were calculated and correlated with other well-established fatigue labels. As a result, changes in the operator's fatigue during a night-time study could be depicted under three different conditions. In the first condition, subjective and objective fatigue measures were collected during a 40-minute monotonous driving task. In the second and third conditions, a 10-minute Compensatory Tracking Task (CTT) and a 5-minute Psychomotor Vigilance Test (PVT), respectively, delivered a set of additional objective fatigue measures. Correlations between heart rate and fatigue measures were calculated using the experimental results of two volunteers, who each completed two nights in a real-car lab following a partial sleep deprivation design. The subjects went through all three conditions (driving, CTT, and PVT) eight times during the course of one night.
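A small sketch of the analysis idea follows: derive two standard heart-rate-variability measures (SDNN, RMSSD) from R-R intervals and correlate one of them with a fatigue rating. The R-R series and fatigue scores below are synthetic, and the paper's exact set of heart rate measures is not reproduced.

```python
# Sketch: heart-rate-variability measures from R-R intervals, correlated with
# a fatigue rating per measurement block.
import numpy as np
from scipy.stats import pearsonr

def sdnn(rr_ms: np.ndarray) -> float:
    """Standard deviation of R-R intervals (ms)."""
    return float(np.std(rr_ms, ddof=1))

def rmssd(rr_ms: np.ndarray) -> float:
    """Root mean square of successive R-R differences (ms)."""
    return float(np.sqrt(np.mean(np.diff(rr_ms) ** 2)))

# One synthetic 5-minute R-R segment per condition block, plus a fatigue score.
rng = np.random.default_rng(5)
segments = [rng.normal(loc=850 + 10 * k, scale=30 + 2 * k, size=300) for k in range(16)]
fatigue = np.arange(16) + rng.normal(scale=1.0, size=16)   # rises over the night

hrv_sdnn = np.array([sdnn(s) for s in segments])
print("SDNN vs. fatigue:", pearsonr(hrv_sdnn, fatigue))
```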


Speech Communication | 2015

Analysis of acoustic space variability in speech affected by depression

Nicholas Cummins; Vidhyasaharan Sethu; Julien Epps; Sebastian Schnieder; Jarek Krajewski

Highlights: presents the probabilistic acoustic volume, a robust acoustic variability measure; as depression increases, phonetic events become concentrated in acoustic space; the MFCC feature space becomes tightly concentrated with increasing depression; the speech trajectory in acoustic space becomes smoother with increasing depression; and the choice of speech collection paradigm may adversely affect depression detection.

The spectral and energy properties of speech have consistently been observed to change with a speaker's level of clinical depression. This has resulted in spectral and energy based features being a key component in many speech-based classification and prediction systems. However, there has been no in-depth investigation into how acoustic models of spectral features are affected by depression. This paper investigates the hypothesis that the effects of depression in speech manifest as a reduction in the spread of phonetic events in acoustic space as modelled by Gaussian Mixture Models (GMMs) in combination with Mel Frequency Cepstral Coefficients (MFCCs). Our investigation uses three measures of acoustic variability: Average Weighted Variance (AWV), Acoustic Movement (AM) and Acoustic Volume, which attempt to model depression-specific acoustic variations (AWV and Acoustic Volume) or the trajectory of speech in the acoustic space (AM). Within our analysis we present the Probabilistic Acoustic Volume (PAV), a novel method for robustly estimating Acoustic Volume using Monte Carlo sampling of the feature distribution being modelled. We show that using an array of PAV points we gain insights into how the concentration of the feature vectors in the feature space changes with depression. Key results, found on two commonly used depression corpora, consistently indicate that as a speaker's level of depression increases there are statistically significant reductions in both AWV (-0.44≤rs≤-0.18 with p<.05) and AM (-0.26≤rs≤-0.19 with p<.05) values, indicating a decrease in localised acoustic variance and a smoothing of the acoustic trajectory, respectively. Further, there are also statistically significant reductions (-0.32≤rs≤-0.20 with p<.05) in Acoustic Volume measures and strong statistical evidence (-0.48≤rs≤-0.23 with p<.05) that the MFCC feature space becomes more concentrated. Quantifying these effects is expected to be a key step towards building an objective classification or prediction system that is robust to many of the sources of variability modulated into a speech signal that are unwanted in terms of depression analysis.
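One plausible reading of the Probabilistic Acoustic Volume idea is sketched below: fit a GMM to (here two-dimensional, synthetic) features and estimate, by Monte Carlo sampling over a bounding box, the volume of the region where the modelled density exceeds a threshold. Dimensionality, threshold and sample count are choices made for this sketch, not the paper's configuration.

```python
# Sketch: Monte Carlo estimate of the "acoustic volume" of a fitted GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
features = rng.normal(size=(2000, 2)) * [1.0, 0.4]   # stand-in for MFCC frames

gmm = GaussianMixture(n_components=4, random_state=0).fit(features)

# Uniformly sample a bounding box that comfortably contains the data.
lo, hi = features.min(axis=0) - 1.0, features.max(axis=0) + 1.0
samples = rng.uniform(lo, hi, size=(200_000, 2))
box_volume = np.prod(hi - lo)

density = np.exp(gmm.score_samples(samples))         # p(x) at each sample
threshold = 0.01                                      # density cut-off
acoustic_volume = box_volume * np.mean(density > threshold)
print(acoustic_volume)
```

A tighter acoustic space would yield a smaller estimated volume at the same density threshold, which is the direction of effect the paper reports for increasing depression.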


International Conference on Acoustics, Speech, and Signal Processing | 2014

Variability compensation in small data: Oversampled extraction of i-vectors for the classification of depressed speech

Nicholas Cummins; Julien Epps; Vidhyasaharan Sethu; Jarek Krajewski

Variations in the acoustic space due to changes in speaker mental state are potentially overshadowed by variability due to speaker identity and phonetic content. Using the Audio/Visual Emotion Challenge and Workshop 2013 Depression Dataset, we explore the suitability of i-vectors for reducing these latter sources of variability when distinguishing between low and high levels of speaker depression. In addition, we investigate whether supervised variability compensation methods such as Linear Discriminant Analysis (LDA) and Within-Class Covariance Normalisation (WCCN), applied in the i-vector domain, can compensate for speaker and phonetic variability. Classification results show that i-vectors formed using an oversampling methodology outperform a baseline set by KL-means supervectors. However, neither of the two compensation methods appears to improve system accuracy. Visualisations afforded by the t-Distributed Stochastic Neighbour Embedding (t-SNE) technique suggest that, despite the application of these techniques, speaker variability remains a strong confounding effect.
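A textbook Within-Class Covariance Normalisation, as it might be applied in the i-vector domain, is sketched below with placeholder vectors and speaker labels; this is the standard WCCN recipe, not code from the paper.

```python
# Sketch: textbook WCCN for i-vector-like features.
import numpy as np

def wccn_projection(vectors: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Return the WCCN projection matrix B such that x -> B.T @ x."""
    dim = vectors.shape[1]
    w = np.zeros((dim, dim))
    classes = np.unique(labels)
    for c in classes:
        xc = vectors[labels == c]
        xc = xc - xc.mean(axis=0)
        w += xc.T @ xc / len(xc)
    w /= len(classes)                       # average within-class covariance
    return np.linalg.cholesky(np.linalg.inv(w))

rng = np.random.default_rng(7)
ivectors = rng.normal(size=(200, 20))       # placeholder 20-dim i-vectors
speakers = rng.integers(0, 10, size=200)    # placeholder speaker labels
B = wccn_projection(ivectors, speakers)
normalised = ivectors @ B                   # rows are WCCN-projected i-vectors
print(normalised.shape)
```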

Collaboration


Dive into Jarek Krajewski's collaborations.

Top Co-Authors

Anton Batliner, University of Erlangen-Nuremberg
Nicholas Cummins, University of New South Wales
Julien Epps, University of New South Wales
Martin Sauerland, University of Koblenz and Landau
Vidhyasaharan Sethu, University of New South Wales
Stefan Steidl, University of Erlangen-Nuremberg