Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Wenguan Wang is active.

Publication


Featured research published by Wenguan Wang.


Computer Vision and Pattern Recognition | 2015

Saliency-aware geodesic video object segmentation

Wenguan Wang; Jianbing Shen; Fatih Porikli

We introduce an unsupervised, geodesic distance based, salient video object segmentation method. Unlike traditional methods, our method incorporates saliency as a prior for objects via the computation of robust geodesic measurements. We consider two discriminative visual features, spatial edges and temporal motion boundaries, as indicators of foreground object locations. We first generate frame-wise spatiotemporal saliency maps using geodesic distance from these indicators. Building on the observation that foreground areas are surrounded by regions with high spatiotemporal edge values, the geodesic distance provides an initial estimate of foreground and background. Then, high-quality saliency results are produced via the geodesic distances to background regions in the subsequent frames. From the resulting saliency maps, we build global appearance models for foreground and background. By imposing motion continuity, we establish a dynamic location model for each frame. Finally, the spatiotemporal saliency maps, appearance models, and dynamic location models are combined into an energy minimization framework to attain both spatially and temporally coherent object segmentation. Extensive quantitative and qualitative experiments on benchmark video datasets demonstrate the superiority of the proposed method over state-of-the-art algorithms.
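As an illustration of the central prior, the sketch below computes per-pixel geodesic distance from the image border over an edge-strength map; `edge_map` (assumed here to be a combined spatial-edge and motion-boundary map in [0, 1]) and the 4-connected grid are assumptions for illustration, not the authors' exact formulation.

```python
# Minimal sketch of a geodesic saliency prior: a pixel's saliency is
# its geodesic distance, over an edge-strength map, to the image
# border, which is assumed to be background.
import heapq
import numpy as np

def geodesic_saliency(edge_map: np.ndarray) -> np.ndarray:
    """Dijkstra on a 4-connected grid; step cost = mean edge strength."""
    h, w = edge_map.shape
    dist = np.full((h, w), np.inf)
    pq = []
    # Seed from the border, assumed to be background.
    for y in range(h):
        for x in range(w):
            if y in (0, h - 1) or x in (0, w - 1):
                dist[y, x] = 0.0
                heapq.heappush(pq, (0.0, y, x))
    while pq:
        d, y, x = heapq.heappop(pq)
        if d > dist[y, x]:
            continue
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                nd = d + 0.5 * (edge_map[y, x] + edge_map[ny, nx])
                if nd < dist[ny, nx]:
                    dist[ny, nx] = nd
                    heapq.heappush(pq, (nd, ny, nx))
    return dist / (dist.max() + 1e-8)  # normalize to [0, 1]
```

Thresholding the returned map, or feeding it into appearance and location models as the abstract describes, would mirror the pipeline's later stages.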


IEEE Transactions on Image Processing | 2014

Lazy Random Walks for Superpixel Segmentation

Jianbing Shen; Yunfan Du; Wenguan Wang; Xuelong Li

In this paper, we present a novel image superpixel segmentation approach using the proposed lazy random walk (LRW) algorithm. Our method begins by initializing the seed positions and runs the LRW algorithm on the input image to obtain the probabilities of each pixel. Then, the boundaries of initial superpixels are obtained according to the probabilities and the commute time. The initial superpixels are iteratively optimized by a new energy function, which is defined on the commute time and a texture measurement. Our LRW algorithm with self-loops has the merit of segmenting weak boundaries and complicated texture regions very well through the new global probability maps and the commute-time strategy. The quality of the superpixels is further improved by relocating the center positions of superpixels and dividing large superpixels into small ones with the proposed optimization algorithm. The experimental results demonstrate that our method achieves better performance than previous superpixel approaches.
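The sketch below illustrates the self-loop idea on a generic pixel affinity matrix `W` (assumed symmetric and nonnegative, e.g., built from color similarity of neighboring pixels); the random-walk-with-restart diffusion used here stands in for the paper's commute-time machinery and is not its exact formulation.

```python
# Illustrative lazy random walk on a pixel graph: self-loops keep part
# of the probability mass at each node, which slows diffusion across
# weak boundaries.
import numpy as np

def lazy_transition(W: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """T = alpha*I + (1-alpha) * D^{-1} W, with self-loop weight alpha.
    Assumes every node has at least one neighbor (nonzero degree)."""
    d = W.sum(axis=1)
    T = (1.0 - alpha) * (W / d[:, None])
    T[np.diag_indices_from(T)] += alpha
    return T

def assign_to_seeds(W, seeds, alpha=0.5, restart=0.1):
    """Random-walk-with-restart diffusion from each seed (a stand-in
    for the paper's commute-time criterion); argmax labels each pixel."""
    n = W.shape[0]
    T = lazy_transition(W, alpha)
    I = np.eye(n)
    probs = np.zeros((len(seeds), n))
    for k, s in enumerate(seeds):
        e = np.zeros(n)
        e[s] = 1.0
        probs[k] = restart * np.linalg.solve(I - (1 - restart) * T.T, e)
    return probs.argmax(axis=0)  # superpixel label per pixel
```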


IEEE Transactions on Image Processing | 2015

Consistent Video Saliency Using Local Gradient Flow Optimization and Global Refinement

Wenguan Wang; Jianbing Shen; Ling Shao

We present a novel spatiotemporal saliency detection method to estimate salient regions in videos based on the gradient flow field and energy optimization. The proposed gradient flow field incorporates two distinctive features: 1) intra-frame boundary information and 2) inter-frame motion information together for indicating the salient regions. Based on the effective utilization of both intra-frame and inter-frame information in the gradient flow field, our algorithm is robust enough to estimate the object and background in complex scenes with various motion patterns and appearances. Then, we introduce local as well as global contrast saliency measures using the foreground and background information estimated from the gradient flow field. These enhanced contrast saliency cues uniformly highlight an entire object. We further propose a new energy function to encourage the spatiotemporal consistency of the output saliency maps, which is seldom explored in previous video saliency methods. The experimental results show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.
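A minimal sketch of fusing the two cues named above appears below; the per-cue normalization and the linear mixing weight `w_spatial` are assumptions for illustration, not the paper's actual gradient flow field construction.

```python
# Fuse intra-frame spatial gradients with a crude inter-frame motion
# cue into one boundary-strength map.
import numpy as np

def gradient_flow_field(frame_t: np.ndarray, frame_t1: np.ndarray,
                        w_spatial: float = 0.5) -> np.ndarray:
    """Both frames are float grayscale arrays in [0, 1]."""
    gy, gx = np.gradient(frame_t)
    spatial = np.hypot(gx, gy)             # intra-frame boundaries
    temporal = np.abs(frame_t1 - frame_t)  # crude motion cue
    # Normalize each cue before mixing so neither dominates.
    spatial /= spatial.max() + 1e-8
    temporal /= temporal.max() + 1e-8
    return w_spatial * spatial + (1 - w_spatial) * temporal
```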


IEEE Transactions on Image Processing | 2015

Robust Video Object Cosegmentation

Wenguan Wang; Jianbing Shen; Xuelong Li; Fatih Porikli

With ever-increasing volumes of video data, automatic extraction of salient object regions has become even more significant for visual analytic solutions. This surge has also opened up opportunities for taking advantage of collective cues encapsulated in multiple videos in a cooperative manner. However, it also brings up major challenges, such as handling drastic appearance, motion-pattern, and pose variations of foreground objects, as well as indiscriminate backgrounds. Here, we present a cosegmentation framework to discover and segment common object regions across multiple frames and multiple videos in a joint fashion. We incorporate three types of cues, i.e., intraframe saliency, interframe consistency, and across-video similarity, into an energy optimization framework that does not make restrictive assumptions on foreground appearance and motion models, and does not require objects to be visible in all frames. We also introduce a spatio-temporal scale-invariant feature transform (SIFT) flow descriptor to integrate across-video correspondence from conventional SIFT flow with interframe motion flow from optical flow. This novel spatio-temporal SIFT flow generates reliable estimations of common foregrounds over the entire video data set. Experimental results show that our method outperforms the state-of-the-art on a new extensive data set (ViCoSeg).
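One way to read the spatio-temporal flow integration is as displacement-field composition; the sketch below chains two fields (say, an optical-flow field within one video and a SIFT-flow field across videos) with nearest-neighbor lookup, which is a simplification of whatever composition the paper actually performs.

```python
# Compose two dense displacement fields a->b and b->c into a->c.
import numpy as np

def compose_flows(flow_ab: np.ndarray, flow_bc: np.ndarray) -> np.ndarray:
    """flow_* are (H, W, 2) displacement fields in (dx, dy) order.
    Returns the composed a->c field via nearest-neighbor lookup."""
    h, w, _ = flow_ab.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Where does each pixel of frame a land in frame b?
    xb = np.clip(np.rint(xs + flow_ab[..., 0]), 0, w - 1).astype(int)
    yb = np.clip(np.rint(ys + flow_ab[..., 1]), 0, h - 1).astype(int)
    return flow_ab + flow_bc[yb, xb]
```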


IEEE Transactions on Image Processing | 2018

Video Salient Object Detection via Fully Convolutional Networks

Wenguan Wang; Jianbing Shen; Ling Shao

This paper proposes a deep learning model to efficiently detect salient regions in videos. It addresses two important issues: 1) deep video saliency model training in the absence of sufficiently large, pixel-wise annotated video data and 2) fast video saliency training and detection. The proposed deep video saliency network consists of two modules for capturing spatial and temporal saliency information, respectively. The dynamic saliency model, explicitly incorporating saliency estimates from the static saliency model, directly produces spatiotemporal saliency inference without time-consuming optical flow computation. We further propose a novel data augmentation technique that simulates video training data from existing annotated image data sets, which enables our network to learn diverse saliency information and prevents overfitting with the limited number of training videos. Leveraging our synthetic video data (150K video sequences) and real videos, our deep video saliency model successfully learns both spatial and temporal saliency cues, thus producing accurate spatiotemporal saliency estimates. We advance the state-of-the-art on the densely annotated video segmentation data set (MAE of .06) and the Freiburg-Berkeley Motion Segmentation data set (MAE of .07), and do so with much improved speed (2 fps with all steps).
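The augmentation idea can be illustrated by fabricating a two-frame "video" from a single annotated still; the whole-frame shift below merely mimics camera motion and is an assumption for illustration, as the paper's synthesis scheme is richer.

```python
# Synthesize a video-like training pair from one annotated image by
# shifting it, giving known inter-frame motion and a shifted mask.
import numpy as np

def synthesize_pair(image, mask, max_shift=10, rng=None):
    """image: (H, W, C), mask: (H, W). Returns (frame_t, frame_t1,
    mask_t1); np.roll wraps at the border, a toy simplification."""
    rng = rng or np.random.default_rng()
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
    frame_t1 = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
    mask_t1 = np.roll(np.roll(mask, dy, axis=0), dx, axis=1)
    return image, frame_t1, mask_t1
```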


IEEE Transactions on Image Processing | 2016

Correspondence Driven Saliency Transfer

Wenguan Wang; Jianbing Shen; Ling Shao; Fatih Porikli

In this paper, we show that large annotated data sets have great potential to provide strong priors for saliency estimation rather than merely serving for benchmark evaluations. To this end, we present a novel image saliency detection method called saliency transfer. Given an input image, we first retrieve a support set of best matches from a large database of saliency-annotated images. Then, we assign transitional saliency scores by warping the support set annotations onto the input image according to computed dense correspondences. To incorporate context, we employ two complementary correspondence strategies: a global matching scheme based on scene-level analysis and a local matching scheme based on patch-level inference. We then introduce two refinement measures to further refine the saliency maps and apply random walk with restart, exploring the global saliency structure to estimate the affinity between foreground and background assignments. Extensive experimental results on four publicly available benchmark data sets demonstrate that the proposed saliency algorithm consistently outperforms the current state-of-the-art methods.
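A deliberately simplified sketch of the retrieval-and-transfer step follows: nearest neighbors by a global descriptor, then a distance-weighted average of their annotations. The dense-correspondence warping from the paper is omitted, and the descriptors are assumed precomputed.

```python
# Transfer saliency annotations from retrieved database matches.
import numpy as np

def transfer_saliency(query_desc, db_descs, db_masks, k=5):
    """query_desc: (d,), db_descs: (n, d), db_masks: (n, H, W).
    Returns a soft saliency map for the query image."""
    dists = np.linalg.norm(db_descs - query_desc, axis=1)
    nearest = np.argsort(dists)[:k]
    # Distance-weighted average of the retrieved annotations.
    weights = 1.0 / (dists[nearest] + 1e-8)
    masks = db_masks[nearest].astype(float)
    return np.tensordot(weights, masks, axes=1) / weights.sum()
```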


IEEE Transactions on Image Processing | 2016

Real-Time Superpixel Segmentation by DBSCAN Clustering Algorithm

Jianbing Shen; Xiaopeng Hao; Zhiyuan Liang; Yu Liu; Wenguan Wang; Ling Shao

In this paper, we propose a real-time image superpixel segmentation method running at 50 frames/s using the density-based spatial clustering of applications with noise (DBSCAN) algorithm. In order to decrease the computational costs of superpixel algorithms, we adopt a fast two-step framework. In the first clustering stage, the DBSCAN algorithm with color-similarity and geometric restrictions is used to rapidly cluster the pixels; then, in the second merging stage, small clusters are merged into superpixels by their neighborhood through a distance measurement defined by color and spatial features. A robust and simple distance function is defined for obtaining better superpixels in these two steps. The experimental results demonstrate that our real-time DBSCAN-based superpixel algorithm (50 frames/s) outperforms the state-of-the-art superpixel segmentation methods in terms of both accuracy and efficiency.
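The first (clustering) stage can be sketched as greedy region growing with a color threshold and a size cap, in the spirit of DBSCAN with geometric restrictions; `lab_image` is assumed to be a CIELAB float image, and the paper's exact distance function and merging stage are omitted.

```python
# Toy DBSCAN-style clustering stage: grow clusters of color-similar
# pixels, capped in size (the geometric restriction).
import numpy as np
from collections import deque

def grow_clusters(lab_image, color_eps=8.0, max_size=400):
    h, w, _ = lab_image.shape
    labels = -np.ones((h, w), dtype=int)
    cur = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy, sx] != -1:
                continue
            seed = lab_image[sy, sx]
            q, size = deque([(sy, sx)]), 0
            labels[sy, sx] = cur
            while q and size < max_size:   # geometric restriction
                y, x = q.popleft()
                size += 1
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny, nx] == -1
                            and np.linalg.norm(lab_image[ny, nx] - seed)
                                < color_eps):
                        labels[ny, nx] = cur
                        q.append((ny, nx))
            cur += 1
    return labels
```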


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2018

Saliency-Aware Video Object Segmentation

Wenguan Wang; Jianbing Shen; Ruigang Yang; Fatih Porikli

Video saliency, which aims to estimate a single dominant object in a sequence, offers strong object-level cues for unsupervised video object segmentation. In this paper, we present a geodesic distance based technique that provides reliable and temporally consistent saliency measurements over superpixels as a prior for pixel-wise labeling. Using undirected intra-frame and inter-frame graphs constructed from spatiotemporal edges of appearance and motion, and a skeleton abstraction step to further enhance saliency estimates, our method formulates the pixel-wise segmentation task as an energy minimization problem on a function that consists of unary terms of global foreground and background models, dynamic location models, and pairwise terms of label smoothness potentials. We perform extensive quantitative and qualitative experiments on benchmark datasets. Our method achieves superior performance in comparison to the current state-of-the-art in terms of accuracy and speed.
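The energy described above has the classic unary-plus-smoothness shape; the sketch below minimizes a toy version with a few ICM sweeps instead of graph cuts, which is a simplification, not the authors' solver.

```python
# Binary segmentation by iterated conditional modes (ICM): per-pixel
# unary costs plus a Potts-style label-smoothness penalty.
import numpy as np

def icm_segment(unary_fg, unary_bg, smooth=0.5, sweeps=5):
    """unary_* are (H, W) per-pixel costs; returns a binary mask."""
    labels = (unary_fg < unary_bg).astype(int)   # greedy init
    h, w = labels.shape
    for _ in range(sweeps):
        for y in range(h):
            for x in range(w):
                nbrs = [labels[ny, nx]
                        for ny, nx in ((y-1, x), (y+1, x),
                                       (y, x-1), (y, x+1))
                        if 0 <= ny < h and 0 <= nx < w]
                cost_fg = unary_fg[y, x] + smooth * sum(n != 1 for n in nbrs)
                cost_bg = unary_bg[y, x] + smooth * sum(n != 0 for n in nbrs)
                labels[y, x] = int(cost_fg < cost_bg)
    return labels
```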


IEEE Transactions on Multimedia | 2016

Higher-Order Image Co-segmentation

Wenguan Wang; Jianbing Shen

A novel interactive image cosegmentation algorithm using likelihood estimation and higher-order energy optimization is proposed for extracting common foreground objects from a group of related images. Our approach successfully introduces higher-order clique energies into the cosegmentation optimization process. A region-based likelihood estimation procedure is first performed to provide the prior knowledge for our higher-order energy function. Then, a new cosegmentation energy function using higher-order cliques is developed, which can efficiently cosegment foreground objects with large appearance variations from a group of images in complex scenes. Both quantitative and qualitative experimental results on representative datasets demonstrate that the accuracy of our cosegmentation results is much higher than that of state-of-the-art cosegmentation methods.
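One standard reading of higher-order clique energies is a robust P^n Potts-style potential, evaluated below for a single region; whether the paper uses exactly this form is an assumption.

```python
# Robust P^n Potts-style higher-order potential for one clique
# (region): zero if all pixels agree on a label, growing with the
# number of dissenters, and saturating at gamma_max.
import numpy as np

def robust_pn_potts(labels_in_clique, gamma_max=1.0, truncation=0.3):
    """labels_in_clique: 1-D array of nonnegative integer labels."""
    n = len(labels_in_clique)
    counts = np.bincount(labels_in_clique)
    disagree = n - counts.max()         # pixels off the majority label
    q = max(1, int(truncation * n))     # truncation threshold
    return gamma_max * min(disagree, q) / q
```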


IEEE Transactions on Image Processing | 2018

Deep Visual Attention Prediction

Wenguan Wang; Jianbing Shen

In this paper, we aim to predict human eye fixation in view-free scenes based on an end-to-end deep learning architecture. Although convolutional neural networks (CNNs) have made substantial improvements in human attention prediction, CNN-based attention models can still be improved by efficiently leveraging multi-scale features. Our visual attention network is proposed to capture hierarchical saliency information, from deep, coarse layers with global saliency information to shallow, fine layers with local saliency responses. Our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various receptive fields. The final saliency prediction is achieved via the cooperation of these global and local predictions. Our model is learned in a deeply supervised manner, where supervision is directly fed into multi-level layers, instead of, as in previous approaches, providing supervision only at the output layer and propagating it back to earlier layers. Our model thus incorporates multi-level saliency predictions within a single network, which significantly decreases the redundancy of previous approaches that learn multiple network streams with different input scales. Extensive experimental analysis on various challenging benchmark data sets demonstrates that our method yields state-of-the-art performance with competitive inference time. Our source code is available at https://github.com/wenguanwang/deepattention.
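The skip-layer read-outs and deep supervision can be sketched in a few lines of PyTorch; the layer widths, the three-stage backbone, and the fusion by averaging below are placeholders, not the paper's architecture.

```python
# Skip-layer saliency read-outs with deep supervision: each stage gets
# a 1x1 prediction head, every head's output is upsampled and receives
# its own loss, and the fused map is the mean of all read-outs.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkipLayerSaliency(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        self.stage2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        self.stage3 = nn.Sequential(nn.Conv2d(64, 128, 3, padding=1),
                                    nn.ReLU(), nn.MaxPool2d(2))
        # One 1x1 read-out head per stage predicts a saliency map.
        self.heads = nn.ModuleList([nn.Conv2d(c, 1, 1)
                                    for c in (32, 64, 128)])

    def forward(self, x):
        size = x.shape[-2:]
        preds = []
        for stage, head in zip((self.stage1, self.stage2, self.stage3),
                               self.heads):
            x = stage(x)
            preds.append(F.interpolate(torch.sigmoid(head(x)), size=size,
                                       mode='bilinear',
                                       align_corners=False))
        # Fuse the multi-level predictions; each also gets its own loss.
        return preds, torch.stack(preds).mean(dim=0)

def deeply_supervised_loss(preds, fused, target):
    """Supervision at every read-out plus the fused prediction."""
    return sum(F.binary_cross_entropy(p, target) for p in preds + [fused])
```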

Collaboration


Dive into Wenguan Wang's collaborations.

Top Co-Authors

Jianbing Shen, Beijing Institute of Technology
Fatih Porikli, Australian National University
Ling Shao, University of East Anglia
Song-Chun Zhu, University of California
Xingping Dong, Beijing Institute of Technology
Yu Liu, Huazhong University of Science and Technology
Yuanlu Xu, Sun Yat-sen University
Hanqiu Sun, The Chinese University of Hong Kong
Fang Guo, Beijing Institute of Technology