Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Oya Aran is active.

Publication


Featured research published by Oya Aran.


IEEE Transactions on Multimedia | 2012

A Nonverbal Behavior Approach to Identify Emergent Leaders in Small Groups

Dairazalia Sanchez-Cortes; Oya Aran; Marianne Schmid Mast; Daniel Gatica-Perez

Identifying emergent leaders in organizations is a key issue in organizational behavioral research, and a new problem in social computing. This paper presents an analysis of how an emergent leader is perceived in newly formed, small groups, and then tackles the task of automatically inferring emergent leaders, using a variety of communicative nonverbal cues extracted from audio and video channels. The inference task uses rule-based and collective classification approaches with the combination of acoustic and visual features extracted from a new small group corpus specifically collected to analyze the emergent leadership phenomenon. Our results show that the emergent leader is perceived by his/her peers as an active and dominant person; that visual information augments acoustic information; and that adding relational information to the nonverbal cues improves the inference of each participant's leadership ranking in the group.


Pattern Recognition | 2010

A multi-class classification strategy for Fisher scores: Application to signer independent sign language recognition

Oya Aran; Lale Akarun

Fisher kernels combine the powers of discriminative and generative classifiers by mapping variable-length sequences to a new fixed-length feature space, called the Fisher score space. The mapping is based on a single generative model and the classifier is intrinsically binary. We propose a multi-class classification strategy that applies a multi-class classifier on each Fisher score space and combines the decisions of these classifiers. We experimentally show that the Fisher scores of one class provide discriminative information for the other classes as well. We compare several multi-class classification strategies for Fisher scores generated from the hidden Markov models of sign sequences. The proposed multi-class classification strategy increases the classification accuracy in comparison with state-of-the-art strategies based on combining binary classifiers. To reduce the computational complexity of the Fisher score extraction and the training phases, we also propose a score space selection method and show that similar or even higher accuracies can be obtained by using only a subset of the score spaces. Based on the proposed score space selection method, a signer adaptation technique is also presented that does not require any re-training.
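
The combination strategy described above can be illustrated with a short sketch. This is not the paper's implementation: `fisher_score_fn` is a hypothetical interface standing in for the gradient-of-log-likelihood mapping derived from each class's HMM, the per-space classifiers are ordinary scikit-learn SVMs, and summing their one-vs-rest decision scores is one plausible way of combining the per-space decisions.

```python
import numpy as np
from sklearn.svm import SVC


def train_per_space_classifiers(class_models, fisher_score_fn, train_seqs, train_labels):
    """Train one multi-class SVM per Fisher score space (one score space per
    class-specific generative model, e.g. one HMM per sign).

    fisher_score_fn(model, seq) is assumed to map a variable-length sequence to a
    fixed-length Fisher score vector for the given generative model.
    Labels are assumed to be 0..K-1 with K = len(class_models), K > 2.
    """
    classifiers = []
    for model in class_models:
        # Map every training sequence into this class's score space.
        X = np.vstack([fisher_score_fn(model, s) for s in train_seqs])
        clf = SVC(kernel="linear", decision_function_shape="ovr")
        clf.fit(X, train_labels)
        classifiers.append(clf)
    return classifiers


def predict_combined(class_models, classifiers, fisher_score_fn, seq):
    """Combine per-space decisions by summing each SVM's per-class decision scores."""
    combined = np.zeros(len(class_models))
    for model, clf in zip(class_models, classifiers):
        x = np.asarray(fisher_score_fn(model, seq)).reshape(1, -1)
        combined += clf.decision_function(x).ravel()
    return int(np.argmax(combined))
```

In this sketch, the score space selection proposed in the paper would simply correspond to restricting `class_models` to a subset of the generative models before training.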


Journal on Multimodal User Interfaces | 2013

Emergent leaders through looking and speaking: from audio-visual data to multimodal recognition

Dairazalia Sanchez-Cortes; Oya Aran; Dinesh Babu Jayagopi; Marianne Schmid Mast; Daniel Gatica-Perez

In this paper we present a multimodal analysis of emergent leadership in small groups using audio-visual features and discuss our experience in designing and collecting a data corpus for this purpose. The ELEA Audio-Visual Synchronized corpus (ELEA AVS) was collected using a light portable setup and contains recordings of small group meetings. The participants in each group performed the winter survival task and filled in questionnaires related to personality and several social concepts such as leadership and dominance. In addition, the corpus includes annotations on participants’ performance in the survival task, and also annotations of social concepts from external viewers. Based on this corpus, we demonstrate the feasibility of predicting the emergent leader in small groups using automatically extracted audio and visual features, based on speaking turns and visual attention, and we focus specifically on multimodal features that make use of the "looking at participants while speaking" and "looking at participants while not speaking" measures. Our findings indicate that emergent leadership is related, but not equivalent, to dominance, and while multimodal features bring a moderate degree of effectiveness in inferring the leader, much simpler features extracted from the audio channel are found to give better performance.
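
As a rough illustration of the kind of gaze-and-speech measures mentioned above, the sketch below computes, for one participant, the fraction of speaking and non-speaking frames spent looking at other participants. The frame-level annotation format assumed here (a per-frame speaking matrix and a per-frame gaze-target vector) is a convenience of this sketch, not the ELEA AVS format.

```python
import numpy as np


def gaze_speaking_features(speaking, gaze, person):
    """Per-participant "looking at participants while (not) speaking" measures.

    speaking : (T, P) boolean array; speaking[t, p] is True if person p speaks at frame t.
    gaze     : (T,) integer array; gaze[t] is the index of the participant that `person`
               looks at in frame t, or -1 when looking at no participant.
    person   : index of the participant to compute features for.
    """
    looks_at_someone = gaze >= 0                 # frames spent looking at any participant
    speaking_p = speaking[:, person]
    n_speaking = max(int(speaking_p.sum()), 1)
    n_silent = max(int((~speaking_p).sum()), 1)
    return {
        # Fraction of the person's speaking frames spent looking at other participants.
        "looking_while_speaking": int((looks_at_someone & speaking_p).sum()) / n_speaking,
        # Fraction of the person's non-speaking frames spent looking at other participants.
        "looking_while_not_speaking": int((looks_at_someone & ~speaking_p).sum()) / n_silent,
    }
```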


International Conference on Multimodal Interfaces | 2010

Identifying emergent leadership in small groups using nonverbal communicative cues

Dairazalia Sanchez-Cortes; Oya Aran; Marianne Schmid Mast; Daniel Gatica-Perez

This paper first presents an analysis of how an emergent leader is perceived in newly formed small groups, and then explores correlations between the perception of leadership and automatically extracted nonverbal communicative cues. We hypothesize that the difference in individual nonverbal features between emergent leaders and non-emergent leaders is significant and measurable using speech activity. Our results on a new interaction corpus show that such an approach is promising, identifying the emergent leader with an accuracy of up to 80%.
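
A minimal sketch of a speech-activity rule in this spirit is given below; the "most total speaking time wins" rule is one simple instance of the cues described in the abstract, and the per-frame binary speaking-status input is an assumed representation.

```python
import numpy as np


def speaking_turn_stats(speaking_status, frame_rate=1.0):
    """speaking_status: (T, P) boolean array of per-frame speaking activity.
    Returns total speaking time (in seconds if frame_rate is frames per second)
    and the number of speaking turns for each participant."""
    total_time = speaking_status.sum(axis=0) / frame_rate
    # A turn starts whenever the status flips from silent to speaking.
    starts = (~speaking_status[:-1]) & speaking_status[1:]
    n_turns = starts.sum(axis=0) + speaking_status[0].astype(int)
    return total_time, n_turns


def predict_emergent_leader(speaking_status):
    """Rule-based guess: the participant with the most total speaking time."""
    total_time, _ = speaking_turn_stats(speaking_status)
    return int(np.argmax(total_time))
```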


International Conference on Pattern Recognition | 2010

Fusing Audio-Visual Nonverbal Cues to Detect Dominant People in Group Conversations

Oya Aran; Daniel Gatica-Perez

This paper addresses the multimodal nature of social dominance and presents multimodal fusion techniques to combine audio and visual nonverbal cues for dominance estimation in small group conversations. We combine the two modalities both at the feature extraction level and at the classifier level via score and rank level fusion. The classification is done by a simple rule-based estimator. We perform experiments on a new 10-hour dataset derived from the popular AMI meeting corpus. We objectively evaluate the performance of each modality and each cue alone and in combination. Our results show that the combination of audio and visual cues is necessary to achieve the best performance.
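
The score-level and rank-level fusion mentioned above can be sketched as follows, assuming each modality already produces one dominance score per participant per meeting; the min-max normalisation, the equal default weighting, and the rank averaging are illustrative choices rather than the paper's exact formulation.

```python
import numpy as np
from scipy.stats import rankdata


def score_level_fusion(audio_scores, visual_scores, w_audio=0.5):
    """Fuse per-participant dominance scores by a weighted sum after
    min-max normalising each modality within the meeting."""
    def normalise(scores):
        s = np.asarray(scores, dtype=float)
        rng = s.max() - s.min()
        return (s - s.min()) / rng if rng > 0 else np.zeros_like(s)
    return w_audio * normalise(audio_scores) + (1 - w_audio) * normalise(visual_scores)


def rank_level_fusion(audio_scores, visual_scores):
    """Fuse by averaging the within-meeting ranks from each modality
    (higher rank means more dominant)."""
    return (rankdata(audio_scores) + rankdata(visual_scores)) / 2.0


def most_dominant(fused_scores):
    """Rule-based estimate: the participant with the highest fused score."""
    return int(np.argmax(fused_scores))
```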


Signal Processing and Communications Applications Conference | 2007

A Database of Non-Manual Signs in Turkish Sign Language

Oya Aran; Ismail Ari; Amaç Güvensan; Hakan Haberdar; Zeynep Kurt; İrem Türkmen; Asli Uyar; Lale Akarun

Sign languages are visual languages. The message is not only transferred via hand gestures (manual signs) but also head/body motion and facial expressions (non-manual signs). In this article, we present a database of non-manual signs in Turkish sign language (TSL). There are eight non-manual signs in the database, which are frequently used in TSL. The database contains the videos of these signs as well as a ground truth data of 60 manually landmarked points of the face.


Digital Television Conference | 2007

A Multimodal 3D Healthcare Communication System

Cem Keskin; Koray Balci; Oya Aran; Bülent Sankur; Lale Akarun

We present a system that integrates gesture recognition and 3D talking head technologies for a patient communication application at a hospital or healthcare setting for supporting patients treated in bed. As a multimodal user interface, we get the input from patients using hand gestures and provide feedback by using a 3D talking avatar.


Journal on Multimodal User Interfaces | 2008

Speech and sliding text aided sign retrieval from hearing impaired sign news videos

Oya Aran; Ismail Ari; Lale Akarun; Erinç Dikici; Siddika Parlak; Murat Saraclar; Pavel Campr; Marek Hrúz

The objective of this study is to automatically extract annotated sign data from broadcast news recordings for the hearing impaired. These recordings present an excellent source for automatically generating annotated data: in news for the hearing impaired, the speaker also signs with the hands as she talks. On top of this, there is also corresponding sliding text superimposed on the video. The video of the signer can be segmented with the help of either the speech alone or both the speech and the text, generating segmented and annotated sign videos. We call this application Signiary, and aim to use it as a sign dictionary where users enter a word as text and retrieve videos of the related sign. The application can also be used to automatically create annotated sign databases for training recognizers.
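
As an illustration of the retrieval step, the sketch below selects padded time spans for a queried word from word-level timestamps (e.g. produced by speech recognition or sliding-text alignment); the data structure and the fixed padding are assumptions of this sketch, not Signiary's actual interface.

```python
from dataclasses import dataclass


@dataclass
class WordHypothesis:
    """One recognized word with its time span in the broadcast (seconds)."""
    word: str
    start: float
    end: float


def retrieve_sign_segments(query, hypotheses, pad=0.5):
    """Return (start, end) video spans for a queried word, padded so that each
    segment is likely to cover the whole corresponding sign."""
    query = query.lower()
    return [(max(h.start - pad, 0.0), h.end + pad)
            for h in hypotheses if h.word.lower() == query]
```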


ACM Multimedia | 2006

Recognizing two handed gestures with generative, discriminative and ensemble methods via fisher kernels

Oya Aran; Lale Akarun

Use of gestures extends Human Computer Interaction (HCI) possibilities in multimodal environments. However, the great variability of gestures in time, size, and position, as well as interpersonal differences, makes the recognition task difficult. With their power in modeling sequence data and processing variable-length sequences, Hidden Markov Models (HMMs) are a natural choice for modeling hand gestures. On the other hand, discriminative methods such as Support Vector Machines (SVMs) have flexible decision boundaries and better classification performance than model-based approaches such as HMMs. By extracting features from gesture sequences via Fisher kernels based on HMMs, classification can be done by a discriminative classifier. We compared the performance of this combined classifier with generative and discriminative classifiers on a small database of two-handed gestures recorded with two cameras. We used Kalman tracking of hands from the two cameras, with both center-of-mass and blob tracking. The results show that (i) blob tracking incorporates general hand shape with hand motion and performs better than simple center-of-mass tracking, (ii) in a stereo camera setup, even if 3D reconstruction is not possible, combining 2D information from each camera at the feature level decreases the error rates, and (iii) the Fisher score methodology combines the powers of generative and discriminative approaches and increases classification performance.
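
Since the abstract does not detail the tracker, the block below is a generic constant-velocity Kalman filter for a 2D hand center of mass in plain NumPy, offered as a sketch of the kind of tracking step described; the noise parameters are arbitrary placeholders.

```python
import numpy as np


class ConstantVelocityKalman2D:
    """Minimal constant-velocity Kalman filter for tracking a 2D point
    (e.g. the center of mass of a hand blob). State vector: [x, y, vx, vy]."""

    def __init__(self, dt=1.0, process_var=1.0, meas_var=10.0):
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)   # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], dtype=float)   # only position is observed
        self.Q = process_var * np.eye(4)                 # process noise covariance
        self.R = meas_var * np.eye(2)                    # measurement noise covariance
        self.x = np.zeros(4)                             # state estimate
        self.P = np.eye(4) * 1000.0                      # state covariance (uncertain start)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, measurement):
        z = np.asarray(measurement, dtype=float)
        y = z - self.H @ self.x                          # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)         # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]
```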


International Conference on Multimodal Interfaces | 2013

Cross-domain personality prediction: from video blogs to small group meetings

Oya Aran; Daniel Gatica-Perez

In this study, we investigate the use of social media content as a domain to learn personality trait impressions, particularly extraversion. Our aim is to transfer the knowledge that can be extracted from conversational videos on video blogging sites to small group settings, in order to predict the extraversion trait from nonverbal cues. We use YouTube data containing personality impression scores of 442 people as the source domain and small-group meeting data from a total of 102 people as the target domain. Our results show that, for the extraversion trait, by using user-created video blogs as part of the training data together with a small amount of adaptation data from the target domain, we are able to achieve higher prediction accuracies than by using only the data recorded in small group settings.
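
One generic way to realize the "source data plus a small amount of target adaptation data" idea is to pool both domains and upweight the few target samples; the weighted ridge regression below is a stand-in illustration, not the paper's actual transfer method.

```python
import numpy as np
from sklearn.linear_model import Ridge


def train_with_adaptation(X_source, y_source, X_target, y_target, target_weight=5.0):
    """Fit a single regressor for extraversion scores on pooled data, giving the few
    target-domain (small-group meeting) samples a larger sample weight than the
    many source-domain (video blog) samples."""
    X = np.vstack([X_source, X_target])
    y = np.concatenate([y_source, y_target])
    weights = np.concatenate([
        np.ones(len(y_source)),                 # source samples: weight 1
        np.full(len(y_target), target_weight),  # target adaptation samples: upweighted
    ])
    model = Ridge(alpha=1.0)
    model.fit(X, y, sample_weight=weights)
    return model
```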

Collaboration


Dive into Oya Aran's collaborations.

Top Co-Authors

Daniel Gatica-Perez
École Polytechnique Fédérale de Lausanne

Pavel Campr
University of West Bohemia

Marek Hrúz
University of West Bohemia

Alexey Karpov
Russian Academy of Sciences

Hayley Hung
Delft University of Technology

Alice Caplier
Centre national de la recherche scientifique