Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Hongying Meng is active.

Publication


Featured research published by Hongying Meng.


International Conference on Machine Learning | 2005

The 2005 PASCAL visual object classes challenge

Mark Everingham; Andrew Zisserman; Christopher K. I. Williams; Luc Van Gool; Moray Allan; Christopher M. Bishop; Olivier Chapelle; Navneet Dalal; Thomas Deselaers; Gyuri Dorkó; Stefan Duffner; Jan Eichhorn; Jason Farquhar; Mario Fritz; Christophe Garcia; Thomas L. Griffiths; Frédéric Jurie; Daniel Keysers; Markus Koskela; Jorma Laaksonen; Diane Larlus; Bastian Leibe; Hongying Meng; Hermann Ney; Bernt Schiele; Cordelia Schmid; Edgar Seemann; John Shawe-Taylor; Amos J. Storkey; Sandor Szedmak

The PASCAL Visual Object Classes Challenge ran from February to March 2005. The goal of the challenge was to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). Four object classes were selected: motorbikes, bicycles, cars and people. Twelve teams entered the challenge. In this chapter we provide details of the datasets, algorithms used by the teams, evaluation criteria, and results achieved.


ACM Transactions on Computer-Human Interaction | 2012

What Does Touch Tell Us about Emotions in Touchscreen-Based Gameplay?

Yuan Gao; Nadia Bianchi-Berthouze; Hongying Meng

The increasing number of people playing games on touch-screen mobile phones raises the question of whether touch behaviors reflect players’ emotional states. This prospect would not only be a valuable evaluation indicator for game designers, but also for real-time personalization of the game experience. Psychology studies on acted touch behavior show the existence of discriminative affective profiles. In this article, finger-stroke features during gameplay on an iPod were extracted and their discriminative power analyzed. Machine learning algorithms were used to build systems for automatically discriminating between four emotional states (Excited, Relaxed, Frustrated, Bored), two levels of arousal and two levels of valence. Accuracy reached between 69% and 77% for the four emotional states, and higher results (~89%) were obtained for discriminating between two levels of arousal and two levels of valence. We conclude by discussing the factors relevant to the generalization of the results to applications other than games.
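
As a rough illustration of the kind of pipeline described above, the sketch below computes a few stroke-level summary features and feeds them to an off-the-shelf classifier. The feature set, data layout, and classifier choice here are illustrative assumptions, not the paper's exact design.

```python
# Hypothetical sketch: finger-stroke features -> emotion classifier.
# Feature names and data layout are illustrative assumptions only.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def stroke_features(xy, t, pressure):
    """Summarise one stroke: xy (N,2) positions, t (N,) timestamps, pressure (N,)."""
    d = np.diff(xy, axis=0)
    seg_len = np.linalg.norm(d, axis=1)    # length of each stroke segment
    duration = t[-1] - t[0]
    return np.array([
        seg_len.sum(),                         # total stroke length
        duration,                              # stroke duration
        seg_len.sum() / max(duration, 1e-6),   # mean speed
        pressure.mean(),                       # mean pressure
        pressure.max(),                        # peak pressure
    ])

# X: one feature row per stroke; y: labels such as
# {0: excited, 1: relaxed, 2: frustrated, 3: bored}.
# X, y = build_dataset(...)                     # assumed to exist
# clf = SVC(kernel="rbf")
# print(cross_val_score(clf, X, y, cv=5).mean())
```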


Computer Vision and Pattern Recognition | 2007

A Human Action Recognition System for Embedded Computer Vision Application

Hongying Meng; Nick Pears; Chris Bailey

In this paper, we propose a human action recognition system suitable for embedded computer vision applications in security systems, human-computer interaction and intelligent environments. Our system is suited to embedded applications for three reasons. Firstly, it is based on a linear support vector machine (SVM) classifier whose classification stage can be implemented easily and efficiently in embedded hardware. Secondly, we use compact motion features that are easily obtained from video. We address the limitations of the well-known motion history image (MHI) and propose a new hierarchical motion history histogram (HMHH) feature to represent the motion information. HMHH not only provides rich motion information, but also remains computationally inexpensive. Finally, we combine MHI and HMHH and extract a low-dimensional feature vector for use in the SVM classifiers. Experimental results show that our system achieves a significant improvement in recognition performance.
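
For context, the motion history image (MHI) that the paper builds on uses a simple per-pixel update: pixels where motion is detected are stamped with a maximum value tau, and all other pixels decay by one each frame. A minimal NumPy sketch, with illustrative threshold values:

```python
# Sketch of the standard MHI update rule; tau and the frame-difference
# threshold are illustrative values, not the paper's settings.
import numpy as np

def update_mhi(mhi, prev_frame, frame, tau=20, diff_thresh=25):
    """One MHI step: refresh moving pixels to tau, decay the rest by one."""
    motion = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16)) > diff_thresh
    mhi = np.maximum(mhi - 1, 0)   # decay old motion
    mhi[motion] = tau              # stamp current motion
    return mhi
```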


ACM Multimedia | 2013

Depression recognition based on dynamic facial and vocal expression features using partial least square regression

Hongying Meng; Di Huang; Heng Wang; Hongyu Yang; Mohammed Al-Shuraifi; Yunhong Wang

Depression is a common mood disorder, and people who remain in this state for long periods face risks to their mental and even physical health. In recent years there has therefore been increasing interest in machine-based depression analysis. In such a low mood, both the facial expression and the voice of a person differ from those in normal states. This paper presents a novel method that comprehensively models the visual and vocal modalities and automatically predicts the scale of depression. On one hand, the Motion History Histogram (MHH) extracts dynamics from the corresponding video and audio data to characterise subtle changes in the facial and vocal expression of depression. On the other hand, for each modality, the Partial Least Square (PLS) regression algorithm is applied to learn the relationship between the dynamic features and depression scales from training data, and then to predict the depression scale for unseen samples. The predictions from the visual and vocal cues are further combined at the decision level for a final decision. The proposed approach is evaluated on the AVEC2013 dataset, and the experimental results clearly highlight its effectiveness and better performance than the baseline results provided by the AVEC2013 challenge organisers.
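
A minimal sketch of the modelling step described above, using scikit-learn's PLS regression per modality and a simple weighted average for the decision-level fusion; the MHH feature extraction is assumed to have already produced the feature matrices, and the fusion weight is an illustrative assumption:

```python
# Sketch: per-modality PLS regression with decision-level fusion.
# Feature matrices are assumed precomputed; weight w is illustrative.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def train_predict(X_train, y_train, X_test, n_components=10):
    pls = PLSRegression(n_components=n_components)
    pls.fit(X_train, y_train)
    return pls.predict(X_test).ravel()

# pred_v = train_predict(Xv_train, y_train, Xv_test)   # visual modality
# pred_a = train_predict(Xa_train, y_train, Xa_test)   # vocal modality
# w = 0.5                                              # assumed fusion weight
# depression_scale = w * pred_v + (1 - w) * pred_a     # decision-level fusion
```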


Affective Computing and Intelligent Interaction | 2011

Naturalistic affective expression classification by a multi-stage approach based on hidden Markov models

Hongying Meng; Nadia Bianchi-Berthouze

In naturalistic behaviour, the affective states of a person change at a rate much slower than the typical rate at which video or audio is recorded (e.g. 25 fps for video). Hence, there is a high probability that consecutive recorded instants of expression represent the same affective content. In this paper, a multi-stage automatic affective expression recognition system is proposed which uses Hidden Markov Models (HMMs) to take this temporal relationship into account and finalize the classification process. The hidden states of the HMMs are associated with the levels of affective dimensions, converting the classification problem into a best-path-finding problem in the HMM. The system was tested on the audio data of the Audio/Visual Emotion Challenge (AVEC) datasets, showing performance significantly above that of a one-stage classification system that does not take the temporal relationship into account, as well as above the baseline set provided by the Challenge. Owing to the generality of the approach, the system could be applied to other types of affective modalities.
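
The "best path finding" step corresponds to standard Viterbi decoding over the affective-level states. A compact log-space sketch, assuming discrete observations and placeholder model matrices:

```python
# Viterbi decoding sketch; model matrices are placeholders and the
# discrete-observation setup is an assumption for illustration.
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """log_pi: (S,) initial log-probs; log_A: (S,S) transition log-probs;
    log_B: (S,O) emission log-probs; obs: observation indices over time.
    Returns the most likely hidden state sequence."""
    S, T = log_A.shape[0], len(obs)
    delta = np.empty((T, S))
    psi = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # (prev state, cur state)
        psi[t] = scores.argmax(axis=0)           # best predecessor per state
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]             # backtrack from best end state
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]
```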


IEEE Transactions on Systems, Man, and Cybernetics | 2014

Affective State Level Recognition in Naturalistic Facial and Vocal Expressions

Hongying Meng; Nadia Bianchi-Berthouze

Naturalistic affective expressions change at a rate much slower than the typical rate at which video or audio is recorded. This increases the probability that consecutive recorded instants of expressions represent the same affective content. In this paper, we exploit such a relationship to improve the recognition performance of continuous naturalistic affective expressions. Using datasets of naturalistic affective expressions (AVEC 2011 audio and video dataset, PAINFUL video dataset) continuously labeled over time and over different dimensions, we analyze the transitions between levels of those dimensions (e.g., transitions in pain intensity level). We use an information theory approach to show that the transitions occur very slowly and hence suggest modeling them as first-order Markov models. The dimension levels are considered to be the hidden states in the Hidden Markov Model (HMM) framework. Their discrete transition and emission matrices are trained by using the labels provided with the training set. The recognition problem is converted into a best path-finding problem to obtain the best hidden states sequence in HMMs. This is a key difference from previous use of HMMs as classifiers. Modeling of the transitions between dimension levels is integrated in a multistage approach, where the first level performs a mapping between the affective expression features and a soft decision value (e.g., an affective dimension level), and further classification stages are modeled as HMMs that refine that mapping by taking into account the temporal relationships between the output decision labels. The experimental results for each of the unimodal datasets show overall performance to be significantly above that of a standard classification system that does not take into account temporal relationships. In particular, the results on the AVEC 2011 audio dataset outperform all other systems presented at the international competition.
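
Since the transition matrix here is trained directly from the labels of the training set, it can be estimated by simple counting over consecutive label pairs. A small sketch, with add-one smoothing as an illustrative assumption:

```python
# Sketch: first-order transition matrix from training label sequences.
# Add-one smoothing is an illustrative choice, not the paper's.
import numpy as np

def transition_matrix(label_seqs, n_levels):
    counts = np.ones((n_levels, n_levels))   # add-one smoothing
    for seq in label_seqs:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

# A = transition_matrix([[0, 0, 1, 1, 1, 2], [1, 1, 0]], n_levels=3)
# A[i, j] ~ P(level j at t+1 | level i at t); mass concentrates on the
# diagonal because affective levels change slowly.
```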


IEEE Journal of Solid-State Circuits | 2004

A VLSI architecture of JPEG2000 encoder

Leibo Liu; Ning Chen; Hongying Meng; Li Zhang; Zhihua Wang; Hongyi Chen

This paper proposes a VLSI architecture for a JPEG2000 encoder, which functionally consists of two parts: the discrete wavelet transform (DWT) and embedded block coding with optimized truncation (EBCOT). For the DWT, a spatial combinative lifting algorithm (SCLA)-based scheme with both 5/3 reversible and 9/7 irreversible filters is adopted to reduce multiplication computations by 50% and 42%, respectively, compared with the conventional lifting-based implementation (LBI). For EBCOT, a dynamic memory control (DMC) strategy for Tier-1 encoding is adopted to reduce the on-chip wavelet coefficient storage by 60%, and a subband parallel-processing method is employed to speed up the EBCOT context formation (CF) process; an architecture for Tier-2 encoding is presented to reduce the on-chip bitstream buffering from full-tile size down to three-code-block size and to largely eliminate the iterations of the rate-distortion (RD) truncation.
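
For reference, the 5/3 reversible filter mentioned above is the standard JPEG2000 integer lifting scheme: a predict step forms the detail coefficients and an update step forms the approximation coefficients. The sketch below shows one plain 1-D level (even-length input, symmetric extension); it does not attempt to reproduce the paper's SCLA reorganisation of these steps:

```python
# One 1-D level of the JPEG2000 5/3 reversible lifting DWT.
# Assumes even-length input; boundary handling uses symmetric extension.
import numpy as np

def dwt53_1d(x):
    x = np.asarray(x, dtype=np.int64)
    xe, xo = x[0::2], x[1::2]
    # Predict: detail d[n] = odd[n] - floor((even[n] + even[n+1]) / 2)
    right = np.append(xe[1:], xe[-1])      # symmetric extension at the end
    d = xo - ((xe + right) >> 1)
    # Update: approx s[n] = even[n] + floor((d[n-1] + d[n] + 2) / 4)
    left = np.insert(d[:-1], 0, d[0])      # symmetric extension at the start
    s = xe + ((left + d + 2) >> 2)
    return s, d
```

Because every step is integer arithmetic with exact inverses, the transform is reversible, which is what makes the 5/3 filter suitable for lossless coding.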


Computer Vision and Image Understanding | 2010

Accelerated hardware video object segmentation: From foreground detection to connected components labelling

Kofi Appiah; Andrew Hunter; Patrick Dickinson; Hongying Meng

This paper demonstrates the use of a single-chip FPGA for the segmentation of moving objects in a video sequence. The system maintains highly accurate background models and integrates the detection of foreground pixels with the labelling of objects using a connected components algorithm. The background models are based on 24-bit RGB values and 8-bit grayscale intensity values. A multimodal background differencing algorithm is presented, using a single FPGA chip and four blocks of RAM. The real-time connected component labelling algorithm, also designed for FPGA implementation, run-length encodes the output of the background subtraction and performs connected component analysis on this representation. The run-length encoding, together with other parts of the algorithm, is performed in parallel; sequential operations are minimized, since the number of run-lengths is typically much smaller than the number of pixels. The two algorithms are pipelined together for maximum efficiency.
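
A small sketch of the run-length encoding step that feeds the labelling stage; the exact hardware representation in the paper will differ, so treat this as an illustration of the idea only:

```python
# Sketch: run-length encoding one row of the binary foreground mask.
import numpy as np

def rle_row(row):
    """Return (start, end) pairs of foreground runs in a 0/1 row."""
    padded = np.concatenate(([0], row.astype(np.int8), [0]))
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)        # 0 -> 1 transitions
    ends = np.flatnonzero(edges == -1) - 1     # 1 -> 0 transitions
    return list(zip(starts, ends))

# rle_row(np.array([0, 1, 1, 0, 1])) -> [(1, 2), (4, 4)]
# Labelling then only merges runs that overlap runs in the previous row,
# so the work scales with the number of runs rather than pixels.
```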


Journal of Real-Time Image Processing | 2008

Real-time human action recognition on an embedded, reconfigurable video processing architecture

Hongying Meng; Michael J. Freeman; Nick Pears; Chris Bailey

In recent years, automatic human action recognition has been widely researched within the computer vision and image processing communities. Here we propose a real-time, embedded vision solution for human action recognition, implemented on an FPGA-based ubiquitous device. There are three main contributions in this paper. Firstly, we have developed a fast human action recognition system with simple motion features and a linear support vector machine classifier. The method has been tested on a large, public human action dataset and achieved competitive performance for the temporal template class of approaches, which include “Motion History Image” based techniques. Secondly, we have developed a reconfigurable, FPGA based video processing architecture. One advantage of this architecture is that the system processing performance can be reconfigured for a particular application, with the addition of new or replicated processing cores. Finally, we have successfully implemented a human action recognition system on this reconfigurable architecture. With a small number of human actions (hand gestures), this stand-alone system is operating reliably at 12 frames/s, with an 80% average recognition rate using limited training data. This type of system has applications in security systems, man–machine communications and intelligent environments.
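
One reason a linear SVM suits this kind of embedded target is that inference reduces to a dot product and a bias per class, which maps directly onto multiply-accumulate hardware. A minimal sketch with placeholder weights:

```python
# Sketch: linear SVM inference as a dot product per class.
# W and b stand in for trained weights; values are placeholders.
import numpy as np

def linear_svm_predict(x, W, b):
    """x: (D,) feature vector; W: (C, D) per-class weights; b: (C,) biases."""
    scores = W @ x + b          # C multiply-accumulate passes over x
    return int(np.argmax(scores))

# On an FPGA this maps onto a MAC pipeline; no kernel evaluations or
# support-vector storage are needed at inference time.
```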


In: Kisačanin, B., Bhattacharyya, S. S., and Chai, S. (eds.), Embedded Computer Vision, pp. 139–162. Springer-Verlag New York Inc. (2008) | 2009

Motion History Histograms for Human Action Recognition

Hongying Meng; Nick Pears; Michael J. Freeman; Chris Bailey

In this chapter, a compact human action recognition system is presented with a view to applications in security systems, human-computer interaction, and intelligent environments. There are three main contributions: Firstly, the framework of an embedded human action recognition system based on a support vector machine (SVM) classifier and some compact motion features has been presented. Secondly, the limitations of the well-known motion history image (MHI) are addressed and a new motion history histograms (MHH) feature is introduced to represent the motion information in the video. MHH not only provides rich motion information, but also remains computationally inexpensive. We combine MHI and MHH into a low-dimensional feature vector for the system and achieve improved performance in human action recognition over comparable methods that use tracking-free temporal template motion representations. Finally, a simple system based on SVM and MHI has been implemented on a reconfigurable embedded computer vision architecture for real-time gesture recognition.
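
A rough sketch of the MHH idea as described above: for each pixel, count how often it stays in motion for exactly i consecutive frames. The binary motion masks are assumed precomputed, and this simplified version may differ in detail from the chapter's definition:

```python
# Simplified Motion History Histogram (MHH) sketch; motion masks are
# assumed precomputed, and details may differ from the chapter's version.
import numpy as np

def mhh(motion_masks, max_run=5):
    """motion_masks: (T, H, W) 0/1 array. Returns (max_run, H, W) counts,
    where bin i counts runs of length i+1 (longer runs fall in the top bin)."""
    T, H, W = motion_masks.shape
    hist = np.zeros((max_run, H, W), dtype=np.int32)
    run = np.zeros((H, W), dtype=np.int32)     # current run length per pixel
    for t in range(T):
        m = motion_masks[t].astype(bool)
        ended = (~m) & (run > 0)               # runs that just finished
        lengths = np.clip(run[ended], 1, max_run) - 1
        np.add.at(hist, (lengths, *np.nonzero(ended)), 1)
        run = np.where(m, run + 1, 0)
    # Runs still open at t = T-1 are not flushed in this short sketch.
    return hist
```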

Collaboration


Dive into Hongying Meng's collaborations.

Top Co-Authors

Kofi Appiah

Nottingham Trent University

Asim Jan

Brunel University London

Fan Zhang

Brunel University London

John Cosmas

Brunel University London
