
Publication


Featured research published by Wenming Cao.


IEEE Transactions on Multimedia | 2016

Animal Detection From Highly Cluttered Natural Scenes Using Spatiotemporal Object Region Proposals and Patch Verification

Zhi Zhang; Zhihai He; Guitao Cao; Wenming Cao

In this paper, we consider animal object detection and segmentation in wildlife monitoring videos captured by motion-triggered cameras, called camera traps. For these types of videos, existing approaches often suffer from low detection rates, due to low contrast between the foreground animals and the cluttered background, and from high false positive rates, due to the dynamic background. To address these issues, we first develop a new approach to generating animal object region proposals using multilevel graph cut in the spatiotemporal domain. We then develop a cross-frame temporal patch verification method to determine whether these region proposals are true animals or background patches. We construct an efficient feature description for animal detection using joint deep learning and histogram of oriented gradients features encoded with Fisher vectors. Our extensive experimental results and performance comparisons over a diverse set of challenging camera-trap data demonstrate that the proposed spatiotemporal object proposal and patch verification framework outperforms state-of-the-art methods, including the recent Faster R-CNN, in animal object detection accuracy by up to 4.5%.
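The cross-frame patch verification step can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: `verify_proposal`, the raw cosine similarity, and the 0.95 threshold are assumptions, whereas the paper uses joint deep-learning and HOG features encoded with Fisher vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def verify_proposal(patch_feat, same_location_feats, threshold=0.95):
    """Flag a region proposal as background if the patch at the same
    location in neighboring frames looks nearly identical (a static
    background patch); otherwise keep it as a candidate animal."""
    sims = [cosine(patch_feat, f) for f in same_location_feats]
    mean_sim = sum(sims) / len(sims)
    return "background" if mean_sim >= threshold else "animal"
```

The intuition is the one the abstract describes: a dynamic-background patch recurs almost unchanged across frames, while a moving animal does not.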


IEEE Access | 2017

Liver Fibrosis Classification Based on Transfer Learning and FCNet for Ultrasound Images

Dan Meng; Libo Zhang; Guitao Cao; Wenming Cao; Guixu Zhang; Bing Hu

Diagnostic ultrasound offers great improvements in diagnostic accuracy and robustness. However, it is difficult to make objective and uniform diagnoses, because the quality of ultrasound images is easily influenced by machine settings, the characteristics of ultrasonic waves, the interactions between ultrasound and body tissues, and other uncontrollable factors. In this paper, we propose a novel liver fibrosis classification method based on transfer learning (TL) using VGGNet and a deep classifier called the fully connected network (FCNet). When samples are insufficient, deep features extracted with the TL strategy can still provide sufficient classification information. These deep features are then sent to the FCNet for the classification of different liver fibrosis statuses. With this framework, tests show that our deep features combined with the FCNet provide suitable information for constructing the most accurate prediction model among the compared methods.
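A minimal sketch of the TL-plus-classifier pipeline: assume deep features have already been extracted by a frozen pre-trained network, and train a small fully connected classifier on them. Here the FCNet is reduced, purely for illustration, to a single logistic unit trained by gradient descent; `train_fcnet` and `predict` are hypothetical names.

```python
import math

def train_fcnet(features, labels, lr=0.5, epochs=200):
    """Train a one-layer fully connected classifier (logistic
    regression) on fixed, pre-extracted deep features.
    Returns the learned weights and bias."""
    dim = len(features[0])
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid activation
            g = p - y                        # gradient of the log loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict(w, b, x):
    """Predict the class (0 or 1) for one feature vector."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0
```

The design point the abstract makes is that only this small head is trained; the feature extractor stays fixed, which is what makes the approach workable with few labeled samples.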


IEEE Transactions on Multimedia | 2016

Task-Driven Progressive Part Localization for Fine-Grained Object Recognition

Chen Huang; Zhihai He; Guitao Cao; Wenming Cao

The problem of fine-grained object recognition is very challenging due to the subtle visual differences between object categories. In this paper, we propose a task-driven progressive part localization (TPPL) approach for fine-grained object recognition. Most existing methods follow a two-step approach that first detects salient object parts to suppress interference from background scenes and then classifies objects based on features extracted from these regions. The part detector and object classifier are often independently designed and trained. In this paper, our major finding is that the part detector should be jointly designed and progressively refined with the object classifier, so that the detected regions provide the most distinctive features for final object recognition. Specifically, we develop a part-based SPP-net (Part-SPP) as our baseline part detector. We then establish a TPPL framework, which takes the predicted boxes of Part-SPP as an initial guess and examines new regions in the neighborhood using a particle swarm optimization approach, searching for more discriminative image regions to maximize the objective function and the recognition performance. This procedure is performed iteratively to progressively improve the joint part detection and object classification performance. Experimental results on the Caltech-UCSD Birds-200-2011 dataset demonstrate that our method outperforms state-of-the-art fine-grained categorization methods both in part localization and in classification, even without requiring a bounding box during testing.
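The particle-swarm refinement step can be illustrated with a generic PSO over a score function. As a simplifying assumption, the part box is reduced to a 2-D center, and the inertia and acceleration constants (0.7, 1.5) are conventional PSO choices, not values taken from the paper; in TPPL the score would be the classifier's recognition confidence.

```python
import random

def pso_refine(score, init_box, iters=50, n_particles=20, seed=0):
    """Refine an initial part location (x, y) by particle swarm search,
    returning the position that maximizes `score` (any callable on a
    2-D point). Sketch of the TPPL neighborhood-search idea."""
    rng = random.Random(seed)
    # Scatter particles around the initial detector guess.
    pos = [(init_box[0] + rng.uniform(-5, 5),
            init_box[1] + rng.uniform(-5, 5)) for _ in range(n_particles)]
    vel = [(0.0, 0.0)] * n_particles
    pbest = list(pos)                      # per-particle best positions
    gbest = max(pos, key=score)            # global best position
    for _ in range(iters):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Standard velocity update: inertia + pull toward bests.
            vx = (0.7 * vel[i][0] + 1.5 * r1 * (pbest[i][0] - pos[i][0])
                  + 1.5 * r2 * (gbest[0] - pos[i][0]))
            vy = (0.7 * vel[i][1] + 1.5 * r1 * (pbest[i][1] - pos[i][1])
                  + 1.5 * r2 * (gbest[1] - pos[i][1]))
            vel[i] = (vx, vy)
            pos[i] = (pos[i][0] + vx, pos[i][1] + vy)
            if score(pos[i]) > score(pbest[i]):
                pbest[i] = pos[i]
            if score(pos[i]) > score(gbest):
                gbest = pos[i]
    return gbest
```

For example, with a score peaked at (3, 4), `pso_refine(lambda p: -((p[0]-3)**2 + (p[1]-4)**2), (0, 0))` converges near that peak.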


International Conference on Bioinformatics and Biomedicine | 2016

Automatic fall detection of human in video using combination of features

Kun Wang; Guitao Cao; Dan Meng; Weiting Chen; Wenming Cao

Automatic fall detection for older people living alone is a popular research topic: falls are one of the major health hazards for the population aged 65 and above, and in China this population exceeds 100 million. In this paper, we present an automatic human fall detection framework based on video surveillance that can improve the safety of elders in indoor environments. First, a vision component detects and extracts moving people in videos from static cameras. Then, we combine Histograms of Oriented Gradients (HOG), Local Binary Patterns (LBP), and features extracted by the deep learning framework Caffe into a new augmented feature named HLC. We use HLC to represent a person's motion state in one frame of a video sequence. Because a fall is a sequence of movements, we use HLC features extracted from consecutive frames of a video sequence to perform fall detection. With the HLC feature, we achieve an average fall detection result of 93.7% sensitivity and 92.0% specificity on three different datasets.
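The HLC fusion amounts to concatenating the three per-frame feature vectors. The per-block L2 normalization in this sketch is an assumption for illustration, since the fusion details are not given in the abstract.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm (zero vectors pass through)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def build_hlc(hog, lbp, caffe_feat):
    """Concatenate HOG, LBP, and deep (Caffe) features into one
    augmented 'HLC' vector, normalizing each block so that no
    feature family dominates by raw magnitude."""
    return l2_normalize(hog) + l2_normalize(lbp) + l2_normalize(caffe_feat)
```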


Journal of Visual Communication and Image Representation | 2016

Constellational contour parsing for deformable object detection

Chen Huang; Tony X. Han; Zhihai He; Wenming Cao

In this paper we propose a novel framework for contour-based object detection in cluttered environments. Given a contour model for a class of objects, it is first decomposed into fragments; then, in the test image, we simultaneously perform selection of relevant contour fragments in edge images, grouping of the selected contour fragments, and finding of the best geometry-preserving matching to the model contours. Finding the best matching is inherently a computationally expensive problem. To address this challenge, we developed local shape descriptors and an additive similarity metric function that can be computed locally while preserving the ability to match deformable shapes globally. This allows us to establish a constellational shape parsing framework that uses low-complexity dynamic programming to find the optimal configuration of contour segments in test images matching the model contour. To effectively detect objects with large deformation, we augmented the metric function with a local motion search and modeled the relationship between different shape parts using multiple concurrent dynamic programming shape parsers. Our experimental results show that the proposed method outperforms state-of-the-art contour-based object detection algorithms on two benchmark datasets in terms of average precision.
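A toy version of the low-complexity dynamic program: match model fragments to detected edge fragments in order, allowing skips on either side. The scalar `cost` and the `miss_cost` penalty are hypothetical stand-ins for the paper's local shape descriptors and additive similarity metric.

```python
def parse_contour(model, detected, miss_cost=1.0):
    """Order-preserving DP matching of model contour fragments to
    detected edge fragments; returns the minimum total matching cost."""
    def cost(m, d):
        return abs(m - d)              # toy local shape dissimilarity

    M, D = len(model), len(detected)
    # dp[i][j]: best cost matching model[:i] against detected[:j]
    # (row 0 is all zeros: unused detected fragments are free).
    dp = [[0.0] * (D + 1) for _ in range(M + 1)]
    for i in range(1, M + 1):
        dp[i][0] = dp[i - 1][0] + miss_cost
        for j in range(1, D + 1):
            dp[i][j] = min(
                dp[i][j - 1],                     # skip a detected fragment
                dp[i - 1][j] + miss_cost,         # leave model fragment unmatched
                dp[i - 1][j - 1] + cost(model[i - 1], detected[j - 1]),
            )
    return dp[M][D]
```

Because the metric is additive over fragments, the optimum over all order-preserving configurations is found in O(M * D) time, which is the complexity property the framework relies on.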


IEEE Access | 2018

Fast Deep Neural Networks With Knowledge Guided Training and Predicted Regions of Interests for Real-Time Video Object Detection

Wenming Cao; Jianhe Yuan; Zhihai He; Zhi Zhang; Zhiquan He

It has been recognized that deeper and wider neural networks continuously advance the state-of-the-art performance of various computer vision and machine learning tasks. However, they often require large sets of labeled data for effective training and suffer from extremely high computational complexity, preventing them from being deployed in real-time systems, for example, vehicle object detection from vehicle cameras for assisted driving. In this paper, we aim to develop a fast deep neural network for real-time video object detection by exploring the ideas of knowledge-guided training and predicted regions of interest. Specifically, we develop a new framework for training deep neural networks on datasets with limited labeled samples using cross-network knowledge projection, which improves network performance while reducing the overall computational complexity significantly. A large pre-trained teacher network is used to observe samples from the training data. A projection matrix is learned to project this teacher-level knowledge and its visual representations from an intermediate layer of the teacher network to an intermediate layer of a thinner and faster student network, to guide and regulate the training process. To further speed up the network, we propose to train a low-complexity object detector using traditional machine learning methods, such as support vector machines. Using this low-complexity detector, we identify the regions of interest that contain the target objects with high confidence, and we derive a mathematical formula for estimating the regions of interest to save computation in each convolution layer. Our experimental results on vehicle detection from videos demonstrate that the proposed method speeds up the network by up to 16 times while maintaining object detection performance.
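The knowledge-projection step can be sketched as learning a matrix W that maps teacher-layer features t toward the student-layer features s by minimizing ||W t - s||^2. The plain stochastic gradient descent below, and the names `learn_projection` and `project`, are illustrative assumptions; the paper learns the projection jointly with student training.

```python
def learn_projection(teacher_feats, student_feats, lr=0.05, epochs=500):
    """Learn a projection matrix W (student_dim x teacher_dim) that maps
    intermediate teacher features to the student feature space by
    minimizing the squared projection error with SGD."""
    dt = len(teacher_feats[0])
    ds = len(student_feats[0])
    W = [[0.0] * dt for _ in range(ds)]
    for _ in range(epochs):
        for t, s in zip(teacher_feats, student_feats):
            pred = [sum(W[i][j] * t[j] for j in range(dt)) for i in range(ds)]
            for i in range(ds):
                err = pred[i] - s[i]          # residual for output dim i
                for j in range(dt):
                    W[i][j] -= lr * err * t[j]
    return W

def project(W, t):
    """Project one teacher feature vector into the student space."""
    return [sum(wi[j] * t[j] for j in range(len(t))) for wi in W]
```

During training, the projected teacher features would serve as a regression target regulating the student's intermediate layer.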


International Conference on Audio, Language and Image Processing | 2016

Supervised Feature Learning Network Based on the Improved LLE for face recognition

Dan Meng; Guitao Cao; Wenming Cao; Zhihai He

Deep neural networks (DNNs) have been successfully applied in computer vision and pattern recognition. One drawback of DNNs is that most existing models and their variants need to learn a very large set of parameters. Another is that they do not fully take class labels and local structure into account during training. To address these issues, this paper proposes a novel approach, the Supervised Feature Learning Network based on the improved LLE (SFLNet), for face recognition. The goal of SFLNet is to extract features efficiently; it thus consists of kernel learning based on improved Locally Linear Embedding (LLE) and multiscale feature analysis. Instead of taking image pixels as the input of the LLE algorithm, the improved LLE uses a linear discriminant kernel distance (LDKD). Moreover, the outputs of the improved LLE are convolutional kernels, not dimension-reduced features. Multiscale feature analysis enhances insensitivity to complex changes caused by large pose, expression, or illumination variations. SFLNet therefore has better discrimination and is more suitable for the face recognition task. Experimental results on the Extended Yale B and AR datasets show the marked improvement of the proposed method and its robustness to occlusion compared with other state-of-the-art methods.
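At the core of any LLE variant is the reconstruction step: expressing a sample as a weighted combination of its neighbors, with the weights constrained to sum to one. The sketch below solves the two-neighbor case in closed form (the general k-neighbor case solves a small Gram system); the function name and the restriction to two neighbors are illustrative choices, not the paper's LDKD formulation.

```python
def lle_weights_two_neighbors(x, n1, n2):
    """Reconstruction weights (w1, w2) with w1 + w2 = 1 minimizing
    ||x - w1*n1 - w2*n2||^2, the basic LLE building block.
    With w2 = 1 - w1 the problem is a 1-D least squares in w1."""
    d = [a - b for a, b in zip(n1, n2)]   # direction n1 - n2
    r = [a - b for a, b in zip(x, n2)]    # residual x - n2
    dd = sum(v * v for v in d) or 1.0
    w1 = sum(ri * di for ri, di in zip(r, d)) / dd   # projection of r onto d
    return w1, 1.0 - w1
```

In SFLNet these reconstruction relations are computed in a discriminant kernel space, and the resulting structures are turned into convolutional kernels rather than low-dimensional embeddings.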


International Conference on Bioinformatics and Biomedicine | 2016

Automated human physical function measurement using constrained high dispersal network with SVM-linear

Dan Meng; Guitao Cao; Xinyu Song; Weiting Chen; Wenming Cao

Physical function measurement is becoming increasingly helpful in monitoring human health status. Manual measurement of physical status is time-consuming and may lead to misdiagnosis, so an automatic method for identifying physical status is urgently needed. This paper presents a novel feature extraction method based on a constrained high dispersal network for depth images, coupled with a Support Vector Machine (SVM), to measure human physical function. The proposed method captures the most representative features of depth images belonging to different actions and statuses. We analyze the representation efficiency of hand-crafted features (HOG and LBP), deep learning features (CNN and PCANet), and our proposed deep features separately, in order to validate the efficiency and accuracy of our method. The results show a superior performance of 85.19% on 3840 samples (three actions, each with four different statuses, every status containing sixteen sequences) when the proposed deep features are combined with the SVM.


International Conference on Bioinformatics and Biomedicine | 2016

Facial expression recognition based on LLENet

Dan Meng; Guitao Cao; Zhihai He; Wenming Cao

Facial expression recognition plays an important role in lie detection and computer-aided diagnosis. Many deep learning methods for facial expression feature extraction achieve great improvements in recognition accuracy and robustness over traditional feature extraction methods. However, most current deep learning methods require special parameter tuning and ad hoc fine-tuning tricks. This paper proposes a novel feature extraction model called the Locally Linear Embedding Network (LLENet) for facial expression recognition. The proposed LLENet first reconstructs image sets from the cropped images. Unlike previous deep convolutional neural networks that initialize convolutional kernels randomly, we learn multi-stage kernels from the reconstructed image sets directly, in a supervised way. We also create an improved LLE to select kernels, from which we can obtain the most representative feature maps. Furthermore, to better measure the contribution of these kernels, a new distance based on the kernel Euclidean metric is proposed. After multi-scale feature analysis, the feature representations are finally sent to a linear classifier. Experimental results on the CK+ facial expression dataset show that the proposed model captures the most representative features and thus improves on previous results.


International Conference on Audio, Language and Image Processing | 2010

Fuzzy geometric localization for triangular grid deployment in passive sensor networks

Rui Wang; Wenming Cao; Wanggen Wan; Yanshan Li

For bearings-only target localization in passive sensor networks, a novel analysis approach based on fuzzy geometry is introduced to investigate the fuzzy measurability of a moving target in R2 space. Fuzzy analytical bias expressions are derived, and the interplay between fuzzy localization geometry and fuzzy estimation bias is analyzed in detail for the case of fuzzy triangular grid deployment, which enables target localization, including fuzzy estimates of position and velocity, by measuring the fuzzy azimuth angles at fixed time intervals. The theoretical findings of the paper are backed up by simulation results.
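For intuition, here is the crisp (non-fuzzy) bearings-only fix from two azimuth measurements; the paper's fuzzy-geometry analysis would replace the angles with fuzzy numbers, yielding a localization region rather than a point. The function and its conventions (azimuths measured from the x-axis, 2-D sensors) are assumptions for illustration.

```python
import math

def triangulate(p1, theta1, p2, theta2):
    """Intersect two bearing lines from sensors at p1 and p2 with
    azimuths theta1 and theta2 (radians from the x-axis) to obtain
    the target position. Line i: (x, y) = p_i + t_i*(cos th_i, sin th_i)."""
    c1, s1 = math.cos(theta1), math.sin(theta1)
    c2, s2 = math.cos(theta2), math.sin(theta2)
    det = c1 * s2 - s1 * c2
    if abs(det) < 1e-12:
        raise ValueError("bearings are parallel; target not observable")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t1 = (dx * s2 - dy * c2) / det      # range along the first bearing
    return (p1[0] + t1 * c1, p1[1] + t1 * s1)
```

The geometry-dependent bias the paper studies shows up here in `det`: as the two bearings approach parallel, the intersection becomes ill-conditioned and small angle errors produce large position errors.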

Collaboration


Dive into Wenming Cao's collaboration network.

Top Co-Authors

Guitao Cao, East China Normal University
Zhihai He, University of Missouri
Dan Meng, East China Normal University
Weiting Chen, East China Normal University
Kun Wang, East China Normal University
Xinyu Song, East China Normal University
Chen Huang, University of Missouri
Zhi Zhang, University of Missouri