
Publication


Featured research published by Jaebok Kim.


Robot and Human Interactive Communication | 2015

Dynamics of social positioning patterns in group-robot interactions

Jered Hendrik Vroon; Michiel Joosse; Manja Lohse; Jan Kolkmeier; Jaebok Kim; Khiet Phuong Truong; Gwenn Englebienne; Dirk Heylen; Vanessa Evers

When a mobile robot interacts with a group of people, it has to consider its position and orientation. We introduce a novel study aimed at generating hypotheses on suitable behavior for such social positioning, explicitly focusing on interaction with small groups of users and allowing for the temporal and social dynamics inherent in most interactions. In particular, the interactions we look at are approaching, conversing, and retreating. In this study, groups of three participants and a telepresence robot (controlled remotely by a fourth participant) solved a task together while we collected quantitative and qualitative data, including tracking of positioning/orientation and ratings of the behaviors used. In the data we observed a variety of patterns that can be extrapolated to hypotheses using inductive reasoning. One such pattern/hypothesis is that a (telepresence) robot can pass through a group when retreating without affecting how comfortable that retreat is for the group members. Another is that a group will rate the position/orientation of a (telepresence) robot as more comfortable when it is aimed more at the center of the group.


Engineering Applications of Artificial Intelligence | 2016

Multistage data selection-based unsupervised speaker adaptation for personalized speech emotion recognition

Jaebok Kim; Jeong-Sik Park

This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is user-dependent performance, induced by the speaker-independent (SI) acoustic model framework. However, handheld communication devices such as smartphones accumulate individual voice data, providing suitable conditions for personalized SER that outperforms the SI model framework. By taking advantage of personal devices, we propose an efficient personalized SER scheme employing maximum likelihood linear regression (MLLR), a representative speaker adaptation technique. To further adapt the conventional MLLR technique to SER tasks, the proposed approach selects data that convey emotionally discriminative acoustic characteristics and uses only those data for adaptation. For reliable data selection, we conduct multistage selection using a log-likelihood distance-based measure and a universal background model. In SER experiments on a Linguistic Data Consortium emotional speech corpus, our approach outperformed conventional adaptation techniques as well as the SI model framework.
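
As a rough illustration of the data-selection idea, the sketch below scores candidate utterances under hypothetical per-emotion GMMs and a universal background model, and keeps only those whose log-likelihood distance from the UBM is large. All names, thresholds, and the single-stage structure shown here are assumptions; the paper uses a multistage scheme whose exact criteria are not reproduced.

```python
# Hypothetical sketch of log-likelihood-distance data selection for
# speaker adaptation, assuming per-utterance MFCC matrices (arrays of
# shape [n_frames, n_mfcc]) and pre-trained per-emotion GMMs in
# `emotion_models`. Names and the margin threshold are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_ubm(utterances, n_components=64):
    """Fit a universal background model on pooled frames."""
    frames = np.vstack(utterances)
    ubm = GaussianMixture(n_components=n_components, covariance_type="diag")
    ubm.fit(frames)
    return ubm

def select_adaptation_data(personal_pool, emotion_models, ubm, margin=1.0):
    """Keep utterances whose best emotion model beats the UBM by `margin`
    average log-likelihood per frame (one stage of a multistage scheme)."""
    selected = []
    for utt in personal_pool:
        ubm_ll = ubm.score(utt)                       # mean log-likelihood/frame
        best_ll = max(m.score(utt) for m in emotion_models.values())
        if best_ll - ubm_ll > margin:                 # emotionally discriminative
            selected.append(utt)
    return selected
```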


ACM Multimedia | 2017

Deep Temporal Models using Identity Skip-Connections for Speech Emotion Recognition

Jaebok Kim; Gwenn Englebienne; Khiet Phuong Truong; Vanessa Evers

Deep architectures using identity skip-connections have demonstrated groundbreaking performance in the field of image classification. Recently, empirical studies have suggested that identity skip-connections enable ensemble-like behaviour of shallow networks, and that depth alone does not explain their success. We therefore examine the potential of identity skip-connections for the task of Speech Emotion Recognition (SER), where moderately deep temporal architectures are often employed. To this end, we propose a novel architecture which regulates unimpeded feature flows and captures long-term dependencies via gate-based skip-connections and a memory mechanism. Our proposed architecture is compared to other state-of-the-art methods of SER and is evaluated on large aggregated corpora recorded in different contexts. It outperforms the state-of-the-art methods by 9–15% and achieves an Unweighted Accuracy of 80.5% under an imbalanced class distribution. In addition, we examine a variant adopting the simplified skip-connections of Residual Networks (ResNet) and show that gate-based skip-connections are more effective.
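
The gate-based skip-connection can be pictured as a highway-style layer that mixes a transformed path with the identity path. The PyTorch sketch below is a minimal reading of that idea, assuming an LSTM as the memory mechanism; the class name, layer sizes, and stacking depth are illustrative, not the paper's architecture.

```python
# Minimal PyTorch sketch of a gate-based skip-connection around a
# recurrent layer: a sigmoid gate mixes the LSTM output with the
# identity path at every time step. `GatedSkipBlock` is an assumed name.
import torch
import torch.nn as nn

class GatedSkipBlock(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim, batch_first=True)  # memory mechanism
        self.gate = nn.Linear(dim, dim)                  # transform gate

    def forward(self, x):                  # x: [batch, time, dim]
        h, _ = self.lstm(x)                # transformed features
        g = torch.sigmoid(self.gate(x))    # per-step gate in (0, 1)
        return g * h + (1.0 - g) * x       # gated mix of new and identity path

# Stacking a few blocks gives a moderately deep temporal model:
model = nn.Sequential(GatedSkipBlock(40), GatedSkipBlock(40))
out = model(torch.randn(8, 100, 40))       # e.g. 8 clips, 100 frames, 40 features
```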


Workshop on Child Computer Interaction | 2016

Automatic detection of children's engagement using non-verbal features and ordinal learning

Jaebok Kim; Khiet Phuong Truong; Vanessa Evers

In collaborative play, young children can exhibit different types of engagement. Some children are engaged with other children in the play activity while others are just looking. In this study, we investigated methods to automatically detect children's levels of engagement in play settings using non-verbal vocal features. Rather than labelling the level of engagement in an absolute manner, as has frequently been done in previous related studies, we designed an annotation scheme that takes the order of children's engagement levels into account. Taking full advantage of the ordinal annotations, we explored the use of SVM-based ordinal learning, i.e. ordinal regression and ranking, and compared these to a rule-based ranking and a classification method. We found promising performance for the ordinal methods. In particular, the ranking method was the most robust to the large variation across children and their interactions.
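
One standard way to realise an SVM-based ranking method is the pairwise transform: ordinal labels are converted into difference vectors and a linear SVM learns their order. The sketch below follows that recipe, assuming per-segment feature vectors; it is not necessarily the exact formulation used in the paper.

```python
# Sketch of SVM-based ranking via the pairwise transform: each pair of
# segments with different engagement levels becomes a difference vector
# labelled by which one ranks higher. Data here is random placeholder.
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, y):
    """Build difference vectors for all pairs with different levels."""
    diffs, signs = [], []
    for i in range(len(y)):
        for j in range(i + 1, len(y)):
            if y[i] != y[j]:
                diffs.append(X[i] - X[j])
                signs.append(np.sign(y[i] - y[j]))
    return np.array(diffs), np.array(signs)

X = np.random.randn(40, 12)                 # 40 segments, 12 non-verbal features
y = np.random.randint(0, 3, size=40)        # ordinal engagement levels 0 < 1 < 2
Xp, yp = pairwise_transform(X, y)
ranker = LinearSVC().fit(Xp, yp)            # w · (x_i - x_j) predicts the order
scores = X @ ranker.coef_.ravel()           # higher score = higher engagement
```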


Computer Speech & Language | 2018

Automatic temporal ranking of children’s engagement levels using multi-modal cues

Jaebok Kim; Khiet Phuong Truong; Vanessa Evers

As children of ages 5–8 often play with each other in small groups, their differences in social development and personality traits typically lead to varying levels of engagement within the group. For example, one child may just observe without engaging with the others at all, while another child may be interested in both the other children and the activity. To develop child-friendly interaction technology, such as social robots that can adapt their behaviour to the social situation of a group of children and facilitate harmonious engagement, we aim to study how to automatically detect children's engagement levels. In this paper, we present a novel automatic method that ranks the children in a group over time according to their engagement level, based on non-verbal cues that are robust in naturalistic group settings. Our method combines the emission probability of each rank, derived from the discriminative outputs of an SVM ranking method, with the transition probability between ranks over time. Comparing our proposed method to existing methods (such as rule-based ranking, basic SVM, SVM ordinal regression, SVM ranking, and SVMHMM), we found that it yields promising results.
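
Combining per-step rank probabilities with rank transition probabilities amounts to HMM-style decoding. The sketch below shows a plain Viterbi pass over log-probabilities, assuming the SVM ranking scores have already been turned into per-step rank log-probabilities (e.g. via a softmax); array names and shapes are illustrative, not the paper's exact procedure.

```python
# Hedged sketch: Viterbi decoding of the best rank sequence over time,
# combining per-step rank log-probabilities with rank-to-rank
# transition log-probabilities.
import numpy as np

def viterbi(log_emit, log_trans):
    """log_emit: [T, R] per-step rank log-probabilities;
    log_trans: [R, R] rank transition log-probabilities."""
    T, R = log_emit.shape
    delta = log_emit[0].copy()                     # best score ending in each rank
    back = np.zeros((T, R), dtype=int)             # backpointers
    for t in range(1, T):
        cand = delta[:, None] + log_trans          # [R_prev, R_next]
        back[t] = cand.argmax(axis=0)
        delta = cand.max(axis=0) + log_emit[t]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):                  # trace best path backwards
        path.append(int(back[t][path[-1]]))
    return path[::-1]                              # best rank per time step
```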


Workshop on Child Computer Interaction | 2016

Automatic analysis of children’s engagement using interactional network features

Jaebok Kim; Khiet Phuong Truong

We explored the automatic analysis of vocal non-verbal cues of a group of children in the context of engagement and collaborative play. For the current study, we defined two types of engagement for groups of children: harmonised and unharmonised. We collected a spontaneous audiovisual corpus of groups of children collaboratively building a 3D puzzle. With this corpus, we modelled the interactions among children using network-based features representing the centrality and similarity of interactions. Centrality measures the extent to which interactions among group members are concentrated on a specific speaker, while similarity measures how alike the members' interaction patterns are. We examined their discriminative characteristics in harmonised and unharmonised engagement situations: unharmonised engagement showed high centrality and low similarity, whereas harmonised engagement showed low centrality and high similarity. These results suggest that interactional network features are promising for the automatic detection of engagement at the group level.
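
A minimal reading of the two features, assuming a speaker-by-speaker interaction count matrix: centrality as the share of interactions concentrated on the most-addressed child, and similarity as the mean pairwise cosine similarity of the children's interaction profiles. The exact definitions in the paper may differ.

```python
# Illustrative computation of the two interactional network features,
# from a matrix where entry [i, j] counts how often child i responds
# to child j. Definitions here are a plausible reading, not verbatim.
import numpy as np

def centrality(interactions):
    """Share of all interactions directed at the most-addressed child."""
    per_child = interactions.sum(axis=0)           # times each child is addressed
    return per_child.max() / per_child.sum()

def similarity(interactions):
    """Mean pairwise cosine similarity of the children's interaction profiles."""
    norm = interactions / np.linalg.norm(interactions, axis=1, keepdims=True)
    cos = norm @ norm.T
    n = len(interactions)
    return (cos.sum() - n) / (n * (n - 1))         # average off-diagonal entry

counts = np.array([[0, 5, 4], [6, 0, 5], [5, 4, 0]])   # a balanced triad
print(centrality(counts), similarity(counts))          # low centrality, high similarity
```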


International Workshop on Human Behavior Understanding | 2016

Multimodal Detection of Engagement in Groups of Children Using Rank Learning

Jaebok Kim; Khiet Phuong Truong; Vasiliki Charisi; Cristina Zaga; Vanessa Evers; Mohamed Chetouani

In collaborative play, children exhibit different levels of engagement. Some children are engaged with other children while some play alone. In this study, we investigated multimodal detection of individual levels of engagement using a ranking method and two non-verbal features: turn-taking and body movement. First, we automatically extracted turn-taking and body movement features in naturalistic and challenging settings. Second, we used an ordinal annotation scheme and employed a ranking method to account for the considerable heterogeneity and temporal dynamics of engagement in interactions. We showed that levels of engagement can be characterised by the relative levels between children. In particular, a ranking method, Ranking SVM, outperformed a conventional SVM classification. While neither turn-taking nor body movement features alone achieved promising results, combining the two yielded a significant error reduction, demonstrating their complementary power.
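
A simple way to combine the two modalities before rank learning is early (feature-level) fusion; the sketch below z-normalises each modality and concatenates per child. The feature names and the normalisation choice are assumptions, and the fused vectors would feed a Ranking SVM as sketched earlier.

```python
# Assumed early-fusion step: z-normalise each modality separately so
# neither dominates, then concatenate per child before rank learning.
import numpy as np

def fuse(turns, motion):
    """Z-normalise each modality and concatenate per child."""
    z = lambda m: (m - m.mean(axis=0)) / (m.std(axis=0) + 1e-8)
    return np.hstack([z(turns), z(motion)])

turns = np.random.randn(12, 6)      # 12 children: e.g. speaking time, overlaps
motion = np.random.randn(12, 4)     # 12 children: e.g. movement energy, posture
X = fuse(turns, motion)             # input to a Ranking SVM as sketched earlier
```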


Workshop on Machine Learning for Social Robotics 2015 | 2015

TERESA: a socially intelligent semi-autonomous telepresence system

Kyriacos Shiarlis; João V. Messias; M. van Someren; Shimon Whiteson; Jaebok Kim; Jered Hendrik Vroon; Gwenn Englebienne; Khiet Phuong Truong; Vanessa Evers; Noé Pérez-Higueras; Ignacio Pérez-Hurtado; Rafael Ramón-Vigo; Fernando Caballero; Luis Merino; Jie Shen; Stavros Petridis; Maja Pantic; L. Hedman; M. Scherlund; R. Koster; H. Michel


Conference of the International Speech Communication Association | 2015

Vocal turn-taking patterns in groups of children performing collaborative tasks: an exploratory study

Jaebok Kim; Khiet Phuong Truong; Vasiliki Charisi; Cristina Zaga; Manja Lohse; Dirk Heylen; Vanessa Evers


Conference of the International Speech Communication Association | 2017

Towards Speech Emotion Recognition "in the Wild" Using Aggregated Corpora and Deep Multi-Task Learning

Jaebok Kim; Gwenn Englebienne; Khiet Phuong Truong; Vanessa Evers

Collaboration


Dive into Jaebok Kim's collaboration.

Top Co-Authors

Minha Lee

Eindhoven University of Technology


Femke Beute

Eindhoven University of Technology
