Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Min Chun Hu is active.

Publication


Featured research published by Min Chun Hu.


ACM Multimedia | 2012

Human action recognition and retrieval using sole depth information

Yan Ching Lin; Min Chun Hu; Wen-Huang Cheng; Yung Huan Hsieh; Hong Ming Chen

Observing the widespread use of Kinect-like depth cameras, in this work we investigate the problem of using depth data alone for human action recognition and retrieval in videos. We propose simple depth descriptors that require no learning-based optimization, achieve performance comparable to that of leading methods based on color images and videos, and can be effectively applied in real-time applications. Because of the infrared nature of depth cameras, the proposed approach is especially useful under poor lighting conditions, e.g., surveillance environments without sufficient lighting. We also introduce a large Depth-included Human Action video dataset, DHA, which contains 357 videos of performed human actions belonging to 17 categories. To the best of our knowledge, DHA is one of the largest depth-included video datasets of human actions.
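A "simple depth descriptor without learning optimization" can be illustrated with a toy motion-energy map that accumulates frame-to-frame depth differences. This is only a sketch in the spirit of the abstract, not the paper's actual descriptor; the 2x2 depth frames and the threshold are made-up values.

```python
# Toy depth-based motion descriptor: count, per pixel, how often the depth
# value changes by more than a threshold between consecutive frames.
# The frames below are fabricated 2x2 depth maps for illustration only.

def motion_energy(frames, threshold=10):
    """Per-pixel count of significant depth changes across a frame sequence."""
    h, w = len(frames[0]), len(frames[0][0])
    energy = [[0] * w for _ in range(h)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(h):
            for x in range(w):
                if abs(cur[y][x] - prev[y][x]) > threshold:
                    energy[y][x] += 1
    return energy

frames = [
    [[100, 100], [100, 100]],
    [[100, 150], [100, 100]],   # top-right pixel moves closer to the camera
    [[100, 100], [100, 100]],   # and back again
]
print(motion_energy(frames))  # → [[0, 2], [0, 0]]
```

Such a map needs no training data, which is why descriptors of this family can run in real time.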


IEEE Transactions on Systems, Man, and Cybernetics | 2015

Real-Time Human Movement Retrieval and Assessment With Kinect Sensor

Min Chun Hu; Chi Wen Chen; Wen-Huang Cheng; Che-Han Chang; Jui Hsin Lai; Ja-Ling Wu

The difficulty of vision-based posture estimation is greatly reduced with the aid of commercial depth cameras such as the Microsoft Kinect. However, much remains to be done to bridge the gap between human posture estimation and the understanding of human movements. Human movement assessment is an important technique for exercise learning in the field of healthcare. In this paper, we propose an action tutor system that enables the user to interactively retrieve a learning exemplar of the target action movement and to immediately receive motion instructions while practicing it in front of the Kinect. The proposed system consists of two stages. In the retrieval stage, nonlinear time warping algorithms retrieve video segments similar to the query movement roughly performed by the user. In the learning stage, the user practices according to the selected video exemplar, and a motion assessment covering both static and dynamic differences is presented in an effective, organized way, helping him/her perform the action movement correctly. Experiments conducted on videos of ten action types show that the proposed human action descriptor is effective for action video retrieval and that the tutor system can meaningfully assist the user while learning action movements.
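The retrieval stage rests on nonlinear time warping: a roughly performed query is aligned against each exemplar so that faster or slower executions of the same movement still match. A minimal sketch of the underlying idea using classic dynamic time warping on 1-D sequences (the paper's joint-based action descriptor is simplified away here, and all sequence values are fabricated):

```python
# Dynamic time warping (DTW): align two sequences nonlinearly and return
# the minimal cumulative matching cost. Used here to pick the exemplar
# whose warped alignment to the query is cheapest.

def dtw_distance(a, b):
    """DTW alignment cost between two sequences of numbers."""
    n, m = len(a), len(b)
    INF = float("inf")
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of match / insertion / deletion
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    return cost[n][m]

query = [0.0, 1.0, 2.0, 1.0, 0.0]        # a noisy performance of "squat"
exemplars = {
    "squat": [0.0, 1.1, 2.1, 1.0, 0.1],
    "wave":  [2.0, 2.0, 0.0, 2.0, 2.0],
}
best = min(exemplars, key=lambda k: dtw_distance(query, exemplars[k]))
print(best)  # → squat
```

Because the alignment path is free to stretch and compress time, a user who performs the movement at a different speed still retrieves the right exemplar.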


IEEE Transactions on Image Processing | 2014

Learning and Recognition of On-Premise Signs From Weakly Labeled Street View Images

Tsung Hung Tsai; Wen-Huang Cheng; Chuang Wen You; Min Chun Hu; Arvin Wen Tsui; Heng Yu Chi

Camera-enabled mobile devices are commonly used as interaction platforms for linking the user's virtual and physical worlds in numerous research and commercial applications, such as serving as an augmented reality interface for mobile information retrieval. These application scenarios give rise to a key technique: visual object recognition in daily life. On-premise signs (OPSs), a popular form of commercial advertising, are widely used in everyday environments. OPSs often exhibit great visual diversity (e.g., appearing in arbitrary sizes) and are accompanied by complex environmental conditions (e.g., foreground and background clutter). Observing that such real-world characteristics are lacking in most existing image datasets, in this paper we first propose an OPS dataset, OPS-62, in which a total of 4649 OPS images of 62 different businesses are collected from Google Street View. Further, to address the problem of real-world OPS learning and recognition, we develop a probabilistic framework based on distributional clustering, in which we exploit the distributional information of each visual feature (the distribution of its associated OPS labels) as a reliable selection criterion for building discriminative OPS models. Experiments on the OPS-62 dataset demonstrate that our approach outperforms state-of-the-art probabilistic latent semantic analysis models, with more accurate recognition, fewer false alarms, and a significant 151.28% relative improvement in the average recognition rate. Meanwhile, our approach is simple, linear, and can be executed in parallel, making it practical and scalable for large-scale multimedia applications.


IEEE Transactions on Multimedia | 2015

Efficient QR Code Beautification With High Quality Visual Content

Shih Syun Lin; Min Chun Hu; Chien Han Lee; Tong-Yee Lee

Quick response (QR) codes are generally used for embedding messages so that people can conveniently capture them with mobile devices and acquire the information through a QR code reader. In the past, QR code generators aimed only at high decodability, and the produced QR codes usually looked like random black-and-white patterns without visual semantics. In recent years, researchers have tried to endow QR codes with aesthetic elements, and QR code beautification has been formulated as an optimization problem that minimizes visual perception distortion subject to an acceptable decoding rate. However, the visual quality of QR codes generated by existing methods still leaves much to be desired. In this work, we propose a two-stage approach to generate QR codes with high-quality visual content. In the first stage, a baseline QR code with reliable decodability but poor visual quality is synthesized based on a Gauss-Jordan elimination procedure. In the second stage, a rendering mechanism improves the visual quality while preserving the decodability of the QR code. Experimental results show that the proposed method substantially enhances the appearance of the QR code, and the processing runs in near real-time.
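The first stage relies on Gauss-Jordan elimination over GF(2), where QR modules are bits and row operations are XORs. The sketch below solves a tiny toy system of parity constraints; it illustrates the linear-algebra machinery only, not actual QR encoding.

```python
# Gauss-Jordan elimination over GF(2): rows are bit vectors, addition is
# XOR. Reduces an augmented binary matrix to reduced row-echelon form so
# the right-hand column reads off a solution.

def gauss_jordan_gf2(rows):
    """Reduce an augmented binary matrix (list of bit-lists) in place."""
    n_cols = len(rows[0]) - 1  # last column is the right-hand side
    pivot_row = 0
    for col in range(n_cols):
        # find a row with a 1 in this column to act as pivot
        for r in range(pivot_row, len(rows)):
            if rows[r][col]:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                break
        else:
            continue  # no pivot in this column
        # XOR the pivot row into every other row holding a 1 here
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

# Toy constraints over GF(2): x0^x1 = 1, x1^x2 = 0, x2 = 1
system = [[1, 1, 0, 1],
          [0, 1, 1, 0],
          [0, 0, 1, 1]]
solved = gauss_jordan_gf2(system)
solution = [row[-1] for row in solved]
print(solution)  # → [0, 1, 1]
```

In the paper's setting the unknowns would be module values constrained by the code's error-correction structure, letting the generator pin chosen modules to the desired image while keeping the code decodable.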


International Conference on Computer Vision | 2013

Rectangling Stereographic Projection for Wide-Angle Image Visualization

Che-Han Chang; Min Chun Hu; Wen-Huang Cheng; Yung-Yu Chuang

This paper proposes a new projection model for mapping a hemisphere to a plane. Such a model is useful for viewing wide-angle images. Our model consists of two steps. In the first step, the hemisphere is projected onto a swung surface constructed from a circular profile and a rounded rectangular trajectory. The second step maps the projected image on the swung surface onto the image plane through perspective projection. We also propose a method for automatically determining proper parameters for the projection model based on image content. The proposed model has several advantages. It is simple, efficient, and easy to control. Most importantly, it strikes a better compromise between distortion minimization and line preservation than popular projection models such as the stereographic and Pannini projections. Experiments and analysis demonstrate the effectiveness of our model.
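The trade-off the paper addresses can be seen in the baseline models it compares against. A brief sketch of why stereographic projection is attractive for wide angles (this reproduces only the standard textbook formulas, not the paper's swung-surface model; unit focal length is assumed):

```python
# Radial image-plane distance of a viewing ray at angle theta from the
# optical axis, for two classical projections of a hemisphere.
import math

def stereographic_radius(theta):
    """Stereographic projection (from the sphere's far pole): r = 2 tan(theta/2)."""
    return 2.0 * math.tan(theta / 2.0)

def perspective_radius(theta):
    """Ordinary perspective (rectilinear) projection: r = tan(theta)."""
    return math.tan(theta)

# Near the optical axis the two nearly agree; approaching 90 degrees the
# perspective radius blows up while the stereographic radius stays finite.
for deg in (10, 45, 85):
    t = math.radians(deg)
    print(deg, round(stereographic_radius(t), 3), round(perspective_radius(t), 3))
```

Stereographic projection keeps the field of view bounded and preserves local shape, but bends straight lines; perspective projection keeps lines straight but stretches the periphery. The paper's swung-surface model interpolates between such behaviors.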


IEEE International Conference on Multimedia Big Data | 2016

Real-Time Sign Language Recognition in Complex Background Scene Based on a Hierarchical Clustering Classification Method

Tse Yu Pan; Li Yun Lo; Chung Wei Yeh; Jhe Wei Li; Hou Tim Liu; Min Chun Hu

Cameras are embedded in many mobile and wearable devices and can be used for gesture recognition, or even sign language recognition, to help deaf people communicate with others. In this paper, we propose a vision-based gesture recognition system that can be used in environments with complex backgrounds. We design a method to adaptively update the skin color model for different users and varying lighting conditions. Three kinds of features are combined to describe the contours and salient points of hand gestures. Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Support Vector Machines (SVMs) are integrated to construct a novel hierarchical classification scheme. We evaluated the proposed recognition method on two datasets: (1) the CSL dataset collected by ourselves, in which images were captured against complex backgrounds, and (2) the public ASL dataset, in which images of the same gesture were captured under different lighting conditions. Our method achieves accuracies of 99.8% and 94%, respectively, outperforming existing works.
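The hierarchical idea, routing a query to a coarse cluster of similar gestures first and only then discriminating within that cluster, can be sketched with a toy nearest-centroid stand-in. The real system uses PCA/LDA features and SVMs at each level; the 2-D points, cluster names, and gesture labels below are all invented for illustration.

```python
# Two-stage hierarchical classification sketch: stage 1 picks the coarse
# gesture cluster, stage 2 picks the class inside it (stand-in for SVMs).

def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# two coarse clusters, each holding two (hypothetical) gesture classes
clusters = {
    "open_hand_like": {"palm":  [(0.0, 0.0), (0.1, 0.1)],
                       "five":  [(0.3, 0.0), (0.4, 0.1)]},
    "fist_like":      {"fist":  [(5.0, 5.0), (5.1, 5.1)],
                       "thumb": [(5.4, 5.0), (5.5, 5.1)]},
}

def classify(query):
    # stage 1: route to the coarse cluster with the nearest centroid
    cents = {name: centroid([p for cls in members.values() for p in cls])
             for name, members in clusters.items()}
    best_cluster = min(cents, key=lambda n: sq_dist(query, cents[n]))
    # stage 2: discriminate among the classes inside that cluster
    members = clusters[best_cluster]
    return min(members, key=lambda c: sq_dist(query, centroid(members[c])))

print(classify((0.05, 0.05)))  # → palm
```

Splitting the decision this way keeps each classifier small: confusable gestures only compete against their own cluster-mates rather than against every class at once.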


Conference on Multimedia Modeling | 2016

Locality Constrained Sparse Representation for Cat Recognition

Yu-Chen Chen; Shintami Chusnul Hidayati; Wen-Huang Cheng; Min Chun Hu; Kai-Lung Hua

The cat (Felis catus) plays an important social role within our society and can provide considerable emotional support for its owner. Missing, swapped, and stolen cats, along with false insurance claims, have become a global problem. Reliable cat identification is thus essential for the effective management of the owned cat population. Traditional identification methods relying on permanent (e.g., tattoos, microchips, ear tips/notches, and freeze branding), semi-permanent (e.g., identification collars and ear tags), or temporary (e.g., paint/dye and radio transmitters) procedures are not robust enough to provide an adequate level of security. Moreover, these methods may have adverse effects on the cats. Although animal identification based on phenotype appearance (face and coat patterns) has received much attention in recent years, none of this work specifically targets cats. In this paper, we therefore propose a novel biometric method to recognize cats by exploiting their noses, which cat professionals believe to be a unique identifier. As a first effort on this research topic, we collect a Cat Database containing 700 cat nose images from 70 different cats. Based on this dataset, we design a representative dictionary with a data locality constraint for cat identification. Experimental results demonstrate the effectiveness of the proposed method compared to several state-of-the-art feature-based algorithms.


Journal of Medical Systems | 2016

Color Correction Parameter Estimation on the Smartphone and Its Application to Automatic Tongue Diagnosis

Min Chun Hu; Ming Hsun Cheng; Kun Chan Lan

Background: An automatic tongue diagnosis framework is proposed to analyze tongue images taken by smartphones. Unlike conventional tongue diagnosis systems, our input tongue images are usually of low resolution and taken under unknown lighting conditions, so existing tongue diagnosis methods cannot be directly applied to give accurate results. Materials and Methods: We use a support vector machine (SVM) to predict the lighting condition and the corresponding color correction matrix according to the color difference between images taken with and without flash. We also modify the state-of-the-art approach to fur and fissure detection in tongue images by taking hue information into consideration and adding a denoising step. Results: Our method corrects the color of tongue images under different lighting conditions (e.g., fluorescent, incandescent, and halogen illuminants) and provides better accuracy in tongue feature detection with less processing complexity than the prior work. Conclusions: We propose an automatic tongue diagnosis framework that can be applied on smartphones. Unlike prior work that operates only in a controlled environment, our system adapts to different lighting conditions by employing a novel color correction parameter estimation scheme.
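Once the lighting condition is predicted, correction amounts to applying the selected 3x3 matrix to each RGB pixel. A minimal sketch of that step; the matrices below are invented placeholders, not the paper's estimated parameters.

```python
# Apply a per-illuminant 3x3 linear color correction to one RGB triple,
# clamping each channel to the displayable [0, 255] range.

def correct_pixel(rgb, matrix):
    """Linear color correction of a single (R, G, B) value."""
    out = []
    for row in matrix:
        v = sum(m * c for m, c in zip(row, rgb))
        out.append(max(0.0, min(255.0, v)))
    return tuple(out)

# identity leaves color untouched; a (made-up) warm-cast correction tones
# down red and boosts blue
identity  = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cool_down = [[0.75, 0, 0], [0, 1, 0], [0, 0, 1.25]]

print(correct_pixel((200, 150, 100), cool_down))
```

In the full framework, the SVM's predicted lighting class selects which correction matrix to apply before the fur and fissure detectors run.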


Multimedia Signal Processing | 2015

VRank: Voting system on Ranking model for human age estimation

Tekoing Lim; Kai-Lung Hua; Hong Cyuan Wang; Kai Wen Zhao; Min Chun Hu; Wen-Huang Cheng

Ranking algorithms have shown potential for human age estimation. A common paradigm is to compare the input face with reference faces of known age to generate a ranking relation, whereby the first-ranked reference is used to label the input face. In this paper, we propose a framework that improves upon the typical ranking model, called the Voting system on Ranking model (VRank), by leveraging relational information (comparative relations, i.e., whether the input face is younger or older than each of the references) to make a more robust estimation. Our approach has several advantages: first, comparative relations are explicitly involved to benefit the estimation task; second, a few incorrect comparisons will not greatly influence the accuracy of the result, making the approach more robust than the conventional one; finally, we incorporate a deep learning architecture for training, which extracts robust facial features to increase the effectiveness of classification. Compared to the best results of state-of-the-art methods, VRank shows significant gains on all benchmarks, with relative improvements of 5.74%~69.45% (FG-NET), 19.09%~68.71% (MORPH), and 0.55%~17.73% (IoG).
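The voting idea can be sketched compactly: each reference of known age contributes an "older/younger" relation, and every candidate age consistent with that relation receives a vote. This is a toy rendering of the voting principle only (the reference ages, comparison outcomes, and age range below are fabricated, and the deep-learning comparator is omitted).

```python
# Voting over comparative relations: the estimated age is the candidate
# consistent with the most "query is older/younger than reference" votes.

def vrank_estimate(relations, age_range):
    """relations: list of (reference_age, 'older' | 'younger'), meaning the
    query face was judged older/younger than that reference."""
    votes = {age: 0 for age in age_range}
    for ref_age, rel in relations:
        for age in age_range:
            consistent = age > ref_age if rel == "older" else age < ref_age
            if consistent:
                votes[age] += 1
    return max(votes, key=votes.get)

# query judged older than 20- and 25-year-old references, younger than a
# 30-year-old reference → estimate lands between 25 and 30
relations = [(20, "older"), (25, "older"), (30, "younger")]
print(vrank_estimate(relations, range(15, 41)))
```

The robustness claim falls out directly: one flipped comparison changes each candidate's tally by at most one vote, so the winning age band usually survives.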


Multimedia Systems | 2015

Efficient human detection in crowded environment

Min Chun Hu; Wen-Huang Cheng; Chuan Shen Hu; Ja-Ling Wu; Jhe Wei Li

Detecting humans in crowded environments is valuable but challenging in video surveillance. We propose an efficient human detection method that combines motion and appearance cues. Moving pixels are first extracted by background subtraction, and a filtering step then narrows the search range for human template matching. We use integral images to quickly generate shape information from the edge maps of each frame and define a matching probability capable of detecting both full bodies and partial bodies. Representative human templates are constructed from sparse contours on the basis of the point distribution model. Moreover, linear regression analysis is applied to adaptively adjust the template sizes. With the aid of the proposed foreground ratio filtering and multi-sized template matching techniques, experimental results show that our method not only efficiently detects humans in crowded environments but also largely enhances detection accuracy.
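The efficiency hinges on the integral image (summed-area table): after one pass over an edge map, the edge count inside any rectangle costs just four lookups. A minimal sketch of that trick, using a fabricated 3x3 edge map:

```python
# Integral image: ii[y][x] holds the sum of all pixels above-left of (x, y),
# with an extra zero row/column so rectangle sums need no boundary checks.

def integral_image(img):
    """Build a summed-area table for a 2-D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x0, y0, x1, y1):
    """Sum of img[y0:y1][x0:x1] from four table lookups (O(1) per query)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]

edge_map = [[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]]
ii = integral_image(edge_map)
print(rect_sum(ii, 0, 0, 3, 3))  # → 5, total edge pixels in the map
```

This constant-time rectangle sum is what makes sliding multi-sized templates over every candidate window affordable.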

Collaboration


Dive into Min Chun Hu's collaborations.

Top Co-Authors

Wen-Huang Cheng
Center for Information Technology

Ja-Ling Wu
National Taiwan University

Tse Yu Pan
National Cheng Kung University

Kai-Lung Hua
National Taiwan University of Science and Technology

Kun Chan Lan
National Cheng Kung University

Li Yun Lo
National Cheng Kung University

Jui Hsin Lai
National Taiwan University

Wan-Lun Tsai
National Cheng Kung University