Michael Xuelin Huang
Hong Kong Polytechnic University
Publications
Featured research published by Michael Xuelin Huang.
Human Factors in Computing Systems | 2016
Michael Xuelin Huang; Tiffany C.K. Kwok; Grace Ngai; Stephen Chi-fai Chan; Hong Va Leong
We present PACE, a Personalized, Automatically Calibrating Eye-tracking system that identifies and collects data unobtrusively from user interaction events on standard computing systems without the need for specialized equipment. PACE relies on eye/facial analysis of webcam data based on a set of robust geometric gaze features and a two-layer data validation mechanism to identify good training samples from daily interaction data. The design of the system is founded on an in-depth investigation of the relationship between gaze patterns and interaction cues, and takes into consideration user preferences and habits. The result is an adaptive, data-driven approach that continuously recalibrates, adapts and improves with additional use. Quantitative evaluation on 31 subjects across different interaction behaviors shows that training instances identified by the PACE data collection have higher gaze point-interaction cue consistency than those identified by conventional approaches. An in-situ study using real-life tasks on a diverse set of interactive applications demonstrates that the PACE gaze estimation achieves an average error of 2.56°, which is comparable to state-of-the-art, but without the need for explicit training or calibration. This demonstrates the effectiveness of both the gaze estimation method and the corresponding data collection mechanism.
ACM Multimedia | 2014
Michael Xuelin Huang; Tiffany C.K. Kwok; Grace Ngai; Hong Va Leong; Stephen Chi-fai Chan
Most eye gaze estimation systems rely on explicit calibration, which is inconvenient for the user and limits the amount of possible training data, and consequently the performance. Since there is likely a strong correlation between gaze and interaction cues, such as cursor and caret locations, a supervised learning algorithm can learn the complex mapping between gaze features and the gaze point by training on incremental data collected implicitly from normal computer interactions. We develop a set of robust geometric gaze features and a corresponding data validation mechanism that identifies good training data from noisy interaction-informed data collected in real-use scenarios. Based on a study of gaze movement patterns, we apply behavior-informed validation to extract gaze features that correspond with the interaction cue, and data-driven validation provides another level of cross-checking using previous good data. Experimental evaluation shows that the proposed method achieves an average error of 4.06°, demonstrating the effectiveness of both the proposed gaze estimation method and the corresponding validation mechanism.
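A minimal sketch of the core idea behind this line of work: treat interaction cues (e.g., cursor positions at click time) as implicit, noisy gaze labels, and fit a per-axis mapping from a geometric gaze feature to screen coordinates by least squares. All names here are hypothetical illustrations, not the authors' implementation; a real system would use richer features and the validation mechanism described above.

```python
def fit_axis(features, cues):
    """Least-squares fit of: screen coordinate = gain * feature + offset,
    using interaction cues (e.g. click x-positions) as implicit labels."""
    n = len(features)
    mean_f = sum(features) / n
    mean_c = sum(cues) / n
    cov = sum((f - mean_f) * (c - mean_c) for f, c in zip(features, cues))
    var = sum((f - mean_f) ** 2 for f in features)
    gain = cov / var
    offset = mean_c - gain * mean_f
    return gain, offset

def estimate(feature, gain, offset):
    """Map a new gaze feature to an estimated screen coordinate."""
    return gain * feature + offset

# Implicitly collected samples: (geometric gaze feature, cursor x at click)
features = [0.10, 0.25, 0.40, 0.55, 0.70]
cues     = [120,  310,  500,  690,  880]   # screen x-coordinates (px)

gain, offset = fit_axis(features, cues)
```

Because the fit is a closed-form computation over accumulated samples, it can be rerun cheaply as new interaction events arrive, which mirrors the continuous, implicit recalibration the papers describe.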
Intelligent User Interfaces | 2015
Tiffany C.K. Kwok; Michael Xuelin Huang; Wai Cheong Tam; Grace Ngai
Affect exchange is essential for healthy physical and social development [7], and friends and family communicate their emotions to each other instinctively. In particular, watching movies has always been a popular mode of socialization, and video sharing is increasingly viewed as an effective way to facilitate the communication of feelings and affect, even when the parties are not in the same location. We present an asynchronous video-sharing platform that uses Emotars to facilitate affect sharing, in order to create and enhance the sense of togetherness through the experience of asynchronous movie watching. We investigate its potential impact and benefits, including a better viewing experience, supporting relationships, and strengthening engagement, connectedness and emotion awareness among individuals.
Designing Interactive Systems | 2012
Michael Xuelin Huang; Will W. W. Tang; Kenneth W. K. Lo; Chi Kin Lau; Grace Ngai; Stephen Chi-fai Chan
MelodicBrush is a novel system that connects two ancient art forms: Chinese ink-brush calligraphy and Chinese music. Our system uses vision-based techniques to create a digitized ink-brush calligraphic writing surface with enhanced interaction functionalities. The music generation combines cross-modal stroke-note mapping and statistical language modeling techniques into a hybrid model that generates music as a real-time, auditory response and feedback to the user's calligraphic strokes. Our system is in fact a new cross-modal musical system that endows the ancient art of calligraphy writing with a novel auditory representation to provide users with a natural and novel artistic experience. Experimental evaluations with real users suggest that MelodicBrush is intuitive and realistic, and can also be easily used to exercise creativity and support art generation.
ACM Multimedia | 2016
Michael Xuelin Huang; Jiajia Li; Grace Ngai; Hong Va Leong
Stress sensing is valuable in many applications, including online learning, crowdsourcing, and other daily human-computer interactions. Traditional affective computing techniques investigate affect inference based on different individual modalities, such as facial expressions, vocal tones, and physiological signals, or on the aggregation of signals from these independent modalities, without explicitly exploiting their inter-connections. In contrast, this paper focuses on exploring the impact of mental stress on the coordination between two human nervous systems, the somatic and autonomic nervous systems. Specifically, we present the analysis of the subtle but indicative pattern of human gaze behaviors surrounding a mouse-click event, i.e., the gaze-click pattern. Our evaluation shows that mental stress affects the gaze-click pattern, and this influence has largely been ignored in previous work. This paper, therefore, further proposes a non-intrusive approach to inferring human stress level based on the gaze-click pattern, using only data collected from a common computer webcam and mouse. We conducted a human study on solving math questions under different stress levels to explore the validity of stress recognition based on this coordination pattern. Experimental results show the effectiveness of our technique and the generalizability of the proposed features for user-independent modeling. Our results suggest that it may be possible to detect stress non-intrusively in the wild, without the need for specialized equipment.
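One way to operationalize a gaze-click coordination pattern is to measure how long gaze settles on a target before the click lands. The sketch below is a toy illustration with hypothetical names, not the paper's feature set: it derives per-click gaze-lead times and summary statistics that a downstream stress classifier could consume.

```python
def gaze_click_features(gaze_arrivals, click_times):
    """Per-click gaze-lead times (click time minus the moment gaze first
    settled on the clicked target), plus simple summary statistics.
    Both inputs are timestamps in seconds, paired by click."""
    leads = [c - g for g, c in zip(gaze_arrivals, click_times)]
    n = len(leads)
    mean = sum(leads) / n
    var = sum((x - mean) ** 2 for x in leads) / n
    return {"mean_lead": mean, "var_lead": var, "min_lead": min(leads)}

# Toy data: gaze reaches each button 0.3-0.5 s before the click (seconds)
gaze = [10.0, 14.2, 18.9]
clicks = [10.4, 14.5, 19.4]
feats = gaze_click_features(gaze, clicks)
```

The hypothesis described in the abstract is that stress shifts the distribution of such timing features, so statistics like the mean and variance of the lead time become informative inputs for a classifier.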
Computer Software and Applications Conference | 2014
Yujun Fu; Hong Va Leong; Grace Ngai; Michael Xuelin Huang; Stephen Chi-fai Chan
Human-centered computing is rapidly becoming a major research direction in human-computer interaction research. Among the various research issues, we believe that affective computing, or the ability of computers to react according to what a user feels, is very important. In order to recognize human affect (feeling), one can rely on the analysis of signal inputs captured by a multitude of means. In this paper, we propose to make use of human physiological signals as a new form of modality in determining human affects, in a non-intrusive manner. This is achieved via the physiological mouse, as a first step towards affective computing. We augment the mouse with a small optical component for capturing the user's photoplethysmographic (PPG) signal. From the PPG signal, we are able to derive human physiological signals. We built a prototype of the physiological mouse and measured raw PPG readings. We performed experiments to study the accuracy of our approach in determining human physiological signals from the mouse PPG data. We believe that our research will provide a new dimension for multimodal affective computing research.
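A PPG waveform pulses once per heartbeat, so one standard derivation from it is heart rate via peak detection. The sketch below is a simplified illustration under that assumption, not the authors' pipeline: it counts local maxima above a threshold and converts the mean inter-peak interval to beats per minute.

```python
import math

def heart_rate_bpm(ppg, fs, threshold=0.5):
    """Estimate heart rate from a PPG waveform: find local maxima above
    a threshold, then average the inter-peak intervals.
    fs: sampling rate in Hz. Returns None if fewer than two peaks."""
    peaks = [i for i in range(1, len(ppg) - 1)
             if ppg[i] > threshold
             and ppg[i] >= ppg[i - 1]
             and ppg[i] > ppg[i + 1]]
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    if not intervals:
        return None
    return 60.0 / (sum(intervals) / len(intervals))

# Synthetic PPG stand-in: a 1.2 Hz pulse (72 bpm) sampled at 100 Hz for 5 s
fs = 100
ppg = [math.sin(2 * math.pi * 1.2 * t / fs) for t in range(5 * fs)]
bpm = heart_rate_bpm(ppg, fs)
```

Real mouse-captured PPG is far noisier than this clean sinusoid, so a practical system would add filtering and artifact rejection before peak detection.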
Human Factors in Computing Systems | 2017
Michael Xuelin Huang; Jiajia Li; Grace Ngai; Hong Va Leong
Gaze estimation has widespread applications. However, little work has explored gaze estimation on smartphones, even though they are fast becoming ubiquitous. This paper presents ScreenGlint, a novel approach which exploits the glint (reflection) of the screen on the user's cornea for gaze estimation, using only the image captured by the front-facing camera. We first conduct a user study on common postures during smartphone use. We then design an experiment to evaluate the accuracy of ScreenGlint under varying face-to-screen distances. An in-depth evaluation involving multiple users is conducted and the impact of head pose variations is investigated. ScreenGlint achieves an overall angular error of 2.44° without head pose variations, and 2.94° with head pose variations. Our technique compares favorably to state-of-the-art research works, indicating that the glint of the screen is an effective and practical cue for gaze estimation on the smartphone platform. We believe that this work can open up new possibilities for practical and ubiquitous gaze-aware applications.
IEEE Transactions on Affective Computing | 2016
Michael Xuelin Huang; Grace Ngai; Kien A. Hua; Stephen Chi-fai Chan; Hong Va Leong
This paper presents Personalized Affect Detection with Minimal Annotation (PADMA), a user-dependent approach for identifying affective states from spontaneous facial expressions without the need for expert annotation. The conventional approach relies on the use of key frames in recorded affect sequences and requires an expert observer to identify and annotate the frames. It is susceptible to user variability and accommodating individual differences is difficult. The alternative is a user-dependent approach, but it would be prohibitively expensive to collect and annotate data for each user. PADMA uses a novel Association-based Multiple Instance Learning (AMIL) method, which learns a personal facial affect model through expression frequency analysis, and does not need expert input or frame-based annotation. PADMA involves a training/calibration phase in which the user watches short video segments and reports the affect that best describes his/her overall feeling throughout the segment. The most indicative facial gestures are identified and extracted from the facial response video, and the association between gesture and affect labels is determined by the distribution of the gesture over all reported affects. Hence both the geometric deformation and distribution of key facial gestures are specially adapted for each user. We show results that demonstrate the feasibility, effectiveness and extensibility of our approach.
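The frequency-analysis idea behind the association step can be illustrated with a toy sketch (hypothetical names and data, not the authors' AMIL implementation): count how often each facial gesture appears in segments carrying each self-reported affect label, and associate a gesture with the affect under which it occurs disproportionately often.

```python
from collections import Counter, defaultdict

def associate_gestures(segments):
    """segments: list of (affect_label, [gesture ids seen in that segment]).
    Returns gesture -> affect with the highest relative frequency."""
    per_affect = defaultdict(Counter)   # affect -> gesture counts
    totals = Counter()                  # affect -> total gestures seen
    for affect, gestures in segments:
        per_affect[affect].update(gestures)
        totals[affect] += len(gestures)
    gesture_ids = {g for _, gs in segments for g in gs}
    assoc = {}
    for g in gesture_ids:
        # pick the affect under which g is relatively most frequent
        assoc[g] = max(per_affect, key=lambda a: per_affect[a][g] / totals[a])
    return assoc

# Toy segment-level labels, as in PADMA's one-label-per-segment setting
segments = [
    ("happy", ["smile", "smile", "blink"]),
    ("happy", ["smile", "brow_raise"]),
    ("frustrated", ["frown", "blink", "frown"]),
]
assoc = associate_gestures(segments)
```

Note how only coarse, segment-level labels are needed: the distribution of a gesture across segments, not frame-by-frame annotation, determines its affect association.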
Human Factors in Computing Systems | 2012
Michael Xuelin Huang; Will W. W. Tang; Kenneth W. K. Lo; Chi Kin Lau; Grace Ngai; Stephen Chi-fai Chan
MelodicBrush is a novel cross-modal musical system that connects two ancient art forms: Chinese ink-brush calligraphy and Chinese music. Our system endows the process of calligraphy writing with a novel auditory representation in a natural and intuitive manner to create a novel artistic experience. The writing effect is simulated as though the user were writing on an infinitely large piece of paper viewed through a viewport. The real-time musical generation effects are motivated by principles of metaphoric congruence and statistical music modeling.
IEEE Transactions on Multimedia | 2018
Michael Xuelin Huang; Jiajia Li; Grace Ngai; Hong Va Leong; Kien A. Hua
A user-specific model generally performs better in facial affect recognition. Existing solutions, however, have usability issues, since the annotation can be long and tedious for end users (e.g., consumers). We address this critical issue by presenting a more user-friendly user-adaptive model to make the personalized approach more practical. This paper proposes a novel user-adaptive model, which we have called Fast-Personal Affect Detection with Minimal Annotation (Fast-PADMA). Fast-PADMA integrates data from multiple source subjects with a small amount of data from the target subject. Collecting this target subject data is feasible, since Fast-PADMA requires only one self-reported affect annotation per facial video segment. To alleviate overfitting in this context of limited individual training data, we propose an efficient bootstrapping technique, which strengthens the contribution of multiple similar source subjects. Specifically, we employ an ensemble classifier to construct pretrained weak generic classifiers from the data of multiple source subjects, which are weighted according to the available data from the target user. The result is a model that does not require expensive computation, such as distribution dissimilarity calculation or model retraining. We evaluate our method with in-depth experimental evaluations on five publicly available facial datasets, with results that compare favorably with the state-of-the-art performance on classifying pain, arousal, and valence. Our findings show that Fast-PADMA is effective at rapidly constructing a user-adaptive model that outperforms both its generic and user-specific counterparts. This efficient technique has the potential to significantly improve user-adaptive facial affect recognition for personal use and, therefore, enable comprehensive affect-aware applications.
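The weighting idea can be sketched in a few lines (a toy illustration with hypothetical names, not the Fast-PADMA code): pretrain one weak classifier per source subject, weight each by its accuracy on the target user's few annotated samples, and predict by weighted vote — no retraining of the source models is needed.

```python
def fit_weights(classifiers, target_x, target_y):
    """Weight each source-subject classifier by its accuracy on the
    target user's small annotated set (no model retraining)."""
    return [sum(clf(x) == y for x, y in zip(target_x, target_y)) / len(target_y)
            for clf in classifiers]

def weighted_vote(classifiers, weights, x):
    """Weighted majority vote over the pretrained per-subject classifiers."""
    scores = {}
    for clf, w in zip(classifiers, weights):
        label = clf(x)
        scores[label] = scores.get(label, 0.0) + w
    return max(scores, key=scores.get)

# Toy per-subject "classifiers": threshold rules on a 1-D feature
clfs = [lambda x: "high" if x > 0.5 else "low",
        lambda x: "high" if x > 0.8 else "low",
        lambda x: "low"]  # a poorly matched source subject

# Target user's few self-reported annotations
tx, ty = [0.2, 0.6, 0.9, 0.7], ["low", "high", "high", "high"]
w = fit_weights(clfs, tx, ty)
pred = weighted_vote(clfs, w, 0.7)
```

Because weighting only requires scoring each frozen source model on a handful of target samples, adaptation stays cheap, consistent with the abstract's claim of avoiding distribution-dissimilarity calculation and retraining.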