Publication


Featured research published by Noam Amir.


Affective Computing and Intelligent Interaction | 2007

The HUMAINE Database: Addressing the Collection and Annotation of Naturalistic and Induced Emotional Data

Ellen Douglas-Cowie; Roddy Cowie; Ian Sneddon; Cate Cox; Orla Lowry; Margaret McRorie; Jean-Claude Martin; Laurence Devillers; Sarkis Abrilian; Anton Batliner; Noam Amir; Kostas Karpouzis

The HUMAINE project is concerned with developing interfaces that will register and respond to emotion, particularly pervasive emotion (forms of feeling, expression and action that colour most of human life). The HUMAINE Database provides naturalistic clips which record that kind of material, in multiple modalities, and labelling techniques that are suited to describing it.


International Journal of Audiology | 2012

Preferred listening levels of personal listening devices in young teenagers: Self reports and physical measurements

Chava Muchnik; Noam Amir; Ester Shabtai; Ricky Kaplan-Neeman

Objective: To assess the potential risk of hearing loss to young listeners due to the use of personal listening devices (PLDs). Design: The study included two parts: (1) a self-report questionnaire on music listening habits, and (2) physical measurements of preferred listening levels, in quiet and in everyday background noise. Study sample: Young teenagers aged 13 to 17 years. Part 1 included 289 participants with a mean age of 14 years. Part 2 included participants with a mean age of 15 years: 11 listened to PLDs in quiet conditions (part 2A) and 74 in everyday background noise (part 2B). Results: The main questionnaire findings indicated that most participants reported high or very high volume settings and showed low awareness of the consequences of loud music listening. Physical measurements corrected for the diffuse field indicated mean preferred listening levels of 82 (SD = 9) dBA in quiet and 89 (SD = 9) dBA in the presence of background noise. The potential risk to the hearing of PLD users was calculated using the 8-hour equivalent level. Conclusion: More than 25% of the participants in the noisy condition were found to be at risk according to the NIOSH (1998) occupational damage-risk criteria.
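
The 8-hour equivalent level referred to above normalizes a measured listening level and daily listening duration to a standard 8-hour exposure via the equal-energy (3-dB exchange) rule used by NIOSH; a minimal sketch of that calculation, using hypothetical values rather than the study's measurements, is shown below.

```python
import math

NIOSH_LIMIT_DBA = 85.0  # NIOSH (1998) 8-hour occupational exposure limit

def eight_hour_equivalent_level(level_dba: float, hours: float) -> float:
    """Normalize a measured listening level to an 8-hour equivalent (LAeq,8h).

    Equal-energy rule: LAeq,8h = L + 10 * log10(t / 8).
    """
    return level_dba + 10.0 * math.log10(hours / 8.0)

# Hypothetical example: 89 dBA preferred level for 2 hours of daily listening.
leq_8h = eight_hour_equivalent_level(89.0, 2.0)
print(f"LAeq,8h = {leq_8h:.1f} dBA, at risk: {leq_8h >= NIOSH_LIMIT_DBA}")
```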


Archive | 2011

The Automatic Recognition of Emotions in Speech

Anton Batliner; Björn W. Schuller; Dino Seppi; Stefan Steidl; Laurence Devillers; Laurence Vidrascu; Thurid Vogt; Vered Aharonson; Noam Amir

In this chapter, we focus on the automatic recognition of emotional states using acoustic and linguistic parameters as features and classifiers as tools to predict the ‘correct’ emotional states. We first sketch the history and state of the art in this field; then we describe the process of ‘corpus engineering’, i.e. the design and recording of databases, the annotation of emotional states, and further processing such as manual or automatic segmentation. Next, we present an overview of acoustic and linguistic features that are extracted automatically or manually. In the section on classifiers, we deal with topics such as the curse of dimensionality, the sparse-data problem, and evaluation. At the end of each section, we point out important aspects that should be taken into account in the planning or assessment of studies. The subject area of this chapter is not emotions in some narrow sense but in a wider sense, encompassing emotion-related states such as moods, attitudes, or interpersonal stances as well. We do not aim at an in-depth treatise of specific aspects or algorithms but at an overview of the approaches and strategies that have been or should be used.
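
As a rough illustration of the feature-plus-classifier paradigm the chapter surveys (not the authors' own system), the sketch below trains a support vector machine on a placeholder matrix of precomputed acoustic/linguistic feature vectors and evaluates it with cross-validation, one simple guard against the sparse-data problem mentioned above.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Placeholder data: one row of acoustic/linguistic features per utterance,
# one categorical emotion label per utterance (random here, for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 40))     # 200 utterances, 40 features each
y = rng.integers(0, 4, size=200)   # 4 emotion classes

# Scale the features, then classify; cross-validation gives a less
# optimistic accuracy estimate on small corpora.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean cross-validated accuracy: {scores.mean():.2f}")
```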


Journal on Multimodal User Interfaces | 2010

Multimodal user's affective state analysis in naturalistic interaction

George Caridakis; Kostas Karpouzis; Manolis Wallace; Loic Kessous; Noam Amir

Affective and human-centered computing have attracted considerable attention in recent years, mainly due to the abundance of environments and applications able to exploit and adapt to multimodal input from users. The combination of facial expressions with prosody information allows us to capture the users’ emotional state in an unintrusive manner, relying on the best-performing modality in cases where one modality suffers from noise or bad sensing conditions. In this paper, we describe a multi-cue, dynamic approach to detecting emotion in naturalistic video sequences, where input is taken from nearly real-world situations, in contrast to controlled recording conditions of audiovisual material. Recognition is performed via a recurrent neural network, whose short-term memory and approximation capabilities cater for modeling dynamic events in facial and prosodic expressivity. This approach also differs from existing work in that it models user expressivity using a dimensional representation instead of detecting discrete ‘universal emotions’, which are scarce in everyday human-machine interaction. The algorithm is deployed on an audiovisual database that was recorded to simulate human-human discourse and therefore contains less extreme expressivity and subtle variations of a number of emotion labels. Results show that in turns lasting more than a few frames, recognition rates rise to 98%.
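
To make the data flow concrete, the sketch below runs a minimal Elman-style recurrent pass over a sequence of fused facial-plus-prosodic feature frames and maps the final hidden state to dimensional valence/activation outputs; the weights, feature values, and layer sizes are random placeholders, not the trained network described in the paper.

```python
import numpy as np

# Minimal Elman-style recurrent pass over a sequence of fused feature frames
# (facial expressivity features concatenated with prosodic features per frame).
rng = np.random.default_rng(1)
n_frames, n_feat, n_hidden = 50, 24, 16

frames = rng.normal(size=(n_frames, n_feat))        # fused face+prosody features
W_in = rng.normal(scale=0.1, size=(n_hidden, n_feat))
W_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.1, size=(2, n_hidden))   # 2 outputs: valence, activation

h = np.zeros(n_hidden)
for x in frames:                     # short-term memory carried across frames
    h = np.tanh(W_in @ x + W_rec @ h)

valence, activation = W_out @ h      # dimensional emotion estimate for the turn
print(f"valence={valence:+.2f}, activation={activation:+.2f}")
```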


Autism Research | 2015

Prosody recognition in adults with high-functioning autism spectrum disorders: from psychoacoustics to cognition.

Eitan Globerson; Noam Amir; Liat Kishon-Rabin; Ofer Golan

Prosody is an important tool of human communication, carrying both affective and pragmatic messages in speech. Prosody recognition relies on processing of acoustic cues, such as the fundamental frequency of the voice signal, and their interpretation according to acquired socioemotional scripts. Individuals with autism spectrum disorders (ASD) show deficiencies in affective prosody recognition. These deficiencies have been mostly associated with general difficulties in emotion recognition. The current study explored an additional association between affective prosody recognition in ASD and auditory perceptual abilities. Twenty high-functioning male adults with ASD and 32 typically developing male adults, matched on age and verbal abilities, undertook a battery of auditory tasks. These included affective and pragmatic prosody recognition tasks, two psychoacoustic tasks (pitch direction recognition and pitch discrimination), and a facial emotion recognition task, representing nonvocal emotion recognition. Compared with controls, the ASD group demonstrated poorer performance on both vocal and facial emotion recognition, but not on pragmatic prosody recognition or on any of the psychoacoustic tasks. Both groups showed strong associations between psychoacoustic abilities and prosody recognition, both affective and pragmatic, although these were more pronounced in the ASD group. Facial emotion recognition predicted vocal emotion recognition in the ASD group only. These findings suggest that auditory perceptual abilities, alongside general emotion recognition abilities, play a significant role in affective prosody recognition in ASD. Autism Res 2015, 8: 153–163.


Archive | 2011

Issues in Data Collection

Roddy Cowie; Ellen Douglas-Cowie; Margaret McRorie; Ian Sneddon; Laurence Devillers; Noam Amir

The chapter reviews methods of obtaining records that show signs of emotion. Concern with authenticity is central to the task. Converging lines of argument indicate that even sophisticated acting does not reproduce emotion as it appears in everyday action and interaction. Acting is the appropriate source for some kinds of material, and work on that topic is described. Methods that aim for complete naturalism are also described, and the problems associated with them are noted. Techniques for inducing emotion are considered under five headings: classical induction; physical induction; games; task settings; and conversational interactions. The ethical issues that affect the area are outlined, and a framework for dealing with them is set out.


Attention, Perception, & Psychophysics | 2013

Psychoacoustic abilities as predictors of vocal emotion recognition

Eitan Globerson; Noam Amir; Ofer Golan; Liat Kishon-Rabin; Michal Lavidor

Prosodic attributes of speech, such as intonation, influence our ability to recognize, comprehend, and produce affect, as well as semantic and pragmatic meaning, in vocal utterances. The present study examines associations between auditory perceptual abilities and the perception of prosody, both pragmatic and affective. This association has not been previously examined. Ninety-seven participants (49 female and 48 male participants) with normal hearing thresholds took part in two experiments, involving both prosody recognition and psychoacoustic tasks. The prosody recognition tasks included a vocal emotion recognition task and a focus perception task requiring recognition of an accented word in a spoken sentence. The psychoacoustic tasks included a task requiring pitch discrimination and three tasks also requiring pitch direction (i.e., high/low, rising/falling, changing/steady pitch). Results demonstrate that psychoacoustic thresholds can predict 31% and 38% of affective and pragmatic prosody recognition scores, respectively. Psychoacoustic tasks requiring pitch direction recognition were the only significant predictors of prosody recognition scores. These findings contribute to a better understanding of the mechanisms underlying prosody recognition and may have an impact on the assessment and rehabilitation of individuals suffering from deficient prosodic perception.
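
The explained-variance figures above come from regressing prosody-recognition scores on psychoacoustic thresholds; a minimal sketch of that kind of analysis, with random placeholder data rather than the study's measurements, is shown below.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Regress prosody-recognition scores on psychoacoustic thresholds and report
# the explained variance (R^2). Data are random placeholders for illustration.
rng = np.random.default_rng(2)
thresholds = rng.normal(size=(97, 4))   # 97 listeners, 4 psychoacoustic tasks
prosody_scores = (thresholds @ np.array([0.5, -0.3, 0.2, 0.0])
                  + rng.normal(scale=1.0, size=97))

model = LinearRegression().fit(thresholds, prosody_scores)
print(f"explained variance R^2 = {model.score(thresholds, prosody_scores):.2f}")
```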


International Conference on Acoustics, Speech, and Signal Processing | 2012

Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music

Felix Weninger; Noam Amir; Ofer Amir; Irit Ronen; Florian Eyben; Björn W. Schuller

We address the robustness of features for fully automatic recognition of vibrato, which is usually defined as a periodic oscillation of the pitch (F0) of the singing voice, in recorded polyphonic music. Using an evaluation database covering jazz, pop and opera music, we show that the extraction of pitch is challenging in the presence of instrumental accompaniment, leading to unsatisfactory classification accuracy (61.1%) if only the F0 frequency spectrum is used as features. To alleviate this, we investigate alternative functionals of F0, alternative low-level features besides F0, and extraction of vocals by monaural source separation. Finally, we propose using inter-quartile ranges of F0 delta regression coefficients as features, which are highly robust against pitch extraction errors, reaching up to 86.9% accuracy in real-life conditions without any signal enhancement.
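
The proposed feature, the inter-quartile range of F0 delta regression coefficients, can be sketched as follows; the delta window length and the synthetic vibrato contour are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def delta_regression(x: np.ndarray, k: int = 2) -> np.ndarray:
    """Standard delta (regression) coefficients over a +/- k frame window."""
    pad = np.pad(x, k, mode="edge")
    num = sum(j * (pad[k + j:len(pad) - k + j] - pad[k - j:len(pad) - k - j])
              for j in range(1, k + 1))
    return num / (2 * sum(j * j for j in range(1, k + 1)))

# Synthetic F0 contour with ~6 Hz vibrato (values in Hz, 100 frames per second).
t = np.arange(0, 1.0, 0.01)
f0 = 220.0 + 15.0 * np.sin(2 * np.pi * 6.0 * t)

deltas = delta_regression(f0)
q1, q3 = np.percentile(deltas, [25, 75])
print(f"IQR of F0 delta coefficients: {q3 - q1:.2f}")
```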


Affective Computing and Intelligent Interaction | 2007

Characterizing Emotion in the Soundtrack of an Animated Film: Credible or Incredible?

Noam Amir; Rachel Cohen

In this study we present a novel emotional speech corpus, consisting of dialog extracted from an animated film. This type of corpus presents an interesting compromise between the sparsity of emotion found in spontaneous speech and the contrived emotion found in speech acted solely for research purposes. The dialog was segmented into 453 short units and judged for emotional content by native and non-native English speakers. Emotion was rated on two scales: Activation and Valence. Acoustic analysis gave a comprehensive set of 100 features covering F0, intensity, voice quality and spectrum. We found that Activation is more strongly correlated with our acoustic features than Valence. Activation was correlated with several types of features, whereas Valence was correlated mainly with intensity-related features. Further, an ANOVA showed some interesting contrasts between the two scales, and interesting differences between the judgments of native and non-native English speakers.
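
A minimal sketch of the correlation analysis described above, computing the Pearson correlation of each acoustic feature with the Activation and Valence ratings, is given below; the feature and rating values are random placeholders, not the corpus data.

```python
import numpy as np

rng = np.random.default_rng(3)
n_units, n_features = 453, 100
features = rng.normal(size=(n_units, n_features))   # placeholder acoustic features
activation = rng.normal(size=n_units)               # placeholder Activation ratings
valence = rng.normal(size=n_units)                  # placeholder Valence ratings

def corr_with(ratings: np.ndarray) -> np.ndarray:
    """Pearson correlation of every feature column with one rating scale."""
    f = (features - features.mean(0)) / features.std(0)
    r = (ratings - ratings.mean()) / ratings.std()
    return f.T @ r / len(r)

print("strongest |r| with Activation:", np.abs(corr_with(activation)).max().round(3))
print("strongest |r| with Valence:  ", np.abs(corr_with(valence)).max().round(3))
```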


Journal of Phonetics | 2016

The acoustic correlates of lexical stress in Israeli Hebrew

Vered Silber-Varod; Hagit Sagi; Noam Amir

Lexical stress is an omnipresent phenomenon in spoken language, serving in many cases to disambiguate lexically identical words. Lexical stress in spoken Israeli Hebrew is usually either penultimate or final; however, its prosodic cues have not been measured systematically. In the present study the acoustics of lexical stress were characterized in detail. A list of 34 two-syllable stress-based minimal pairs was collected, where each word form has a different meaning. Each lexical word was embedded in a carrier sentence, thus creating 68 sentences, in which each word form appeared once in sentence-middle position. Thirty speakers, gender balanced, uttered each sentence, giving a total of 2040 utterance productions. All syllable and vowel boundaries of the target words were annotated manually, and three acoustic parameters – duration, f0 and intensity – were measured for each nucleus vowel. Statistical analysis revealed that vowel duration was the dominant marker of Hebrew lexical stress; f0 played only a minor role in indicating stress, while intensity played a more prominent role than f0. Regression analysis showed that these three cues together explained 80% of the variance, with duration contributing 77% and intensity and f0 1.5% each.
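
The paper does not specify the measurement software, but the per-vowel measures it reports (duration, mean f0, mean intensity) can be sketched with parselmouth, a Python interface to Praat; the file name and vowel boundaries below are hypothetical.

```python
import numpy as np
import parselmouth  # Python interface to Praat; an assumed tool, not necessarily the authors'

# Hypothetical inputs: one recorded sentence and the manually annotated
# boundaries (in seconds) of one target nucleus vowel.
snd = parselmouth.Sound("sentence_001.wav")
vowel_start, vowel_end = 0.42, 0.55

duration_ms = (vowel_end - vowel_start) * 1000.0

pitch = snd.to_pitch()
f0 = pitch.selected_array["frequency"]
f0_times = pitch.xs()
in_vowel = (f0_times >= vowel_start) & (f0_times <= vowel_end)
# Unvoiced frames are reported as 0 Hz; exclude them from the mean.
mean_f0 = np.nanmean(np.where(f0[in_vowel] > 0, f0[in_vowel], np.nan))

intensity = snd.to_intensity()
int_values = intensity.values[0]
int_times = intensity.xs()
mean_intensity = int_values[(int_times >= vowel_start) & (int_times <= vowel_end)].mean()

print(f"duration={duration_ms:.0f} ms, mean f0={mean_f0:.1f} Hz, "
      f"mean intensity={mean_intensity:.1f} dB")
```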

Collaboration


Dive into Noam Amir's collaborations.

Top Co-Authors

Dino Seppi (Katholieke Universiteit Leuven)
Stefan Steidl (University of Erlangen-Nuremberg)
Thurid Vogt (University of Augsburg)