Haoqian Wang
Tsinghua University
Publication
Featured research published by Haoqian Wang.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015
Kai Li; Jue Wang; Haoqian Wang; Qionghai Dai
We present a fully automatic system for extracting the semantic structure of a typical academic presentation video, which captures the whole presentation stage with abundant camera motions such as panning, tilting, and zooming. Our system automatically detects and tracks both the projection screen and the presenter whenever they are visible in the video. By analyzing the image content of the tracked screen region, our system is able to detect slide progressions and extract a high-quality, non-occluded, geometrically-compensated image for each slide, resulting in a list of representative images that reconstruct the main presentation structure. Afterwards, our system recognizes text content and extracts keywords from the slides, which can be used for keyword-based video retrieval and browsing. Experimental results show that our system is able to generate more stable and accurate screen localization results than commonly-used object tracking methods. Our system also extracts more accurate presentation structures than general video summarization methods, for this specific type of video.
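The slide-progression detection described above can be illustrated with a minimal sketch. This assumes rectified grayscale crops of the tracked screen region and a hypothetical difference threshold; the actual system analyzes the screen content in a more sophisticated, occlusion-aware way:

```python
import numpy as np

def detect_slide_changes(screen_frames, threshold=0.1):
    """Flag frame indices where the rectified screen crop changes enough
    to suggest a slide progression (normalized mean absolute difference)."""
    changes = []
    for i in range(1, len(screen_frames)):
        prev = screen_frames[i - 1].astype(np.float64)
        curr = screen_frames[i].astype(np.float64)
        # Normalized mean absolute difference between consecutive crops.
        diff = np.abs(curr - prev).mean() / 255.0
        if diff > threshold:
            changes.append(i)
    return changes
```

Frames between two detected change points would then be candidates for extracting one clean, non-occluded representative image per slide.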
IEEE Transactions on Circuits and Systems for Video Technology | 2017
Yongbing Zhang; Huijin Lv; Yebin Liu; Haoqian Wang; Xingzheng Wang; Qian Huang; Xinguang Xiang; Qionghai Dai
In this paper, we propose a novel method for 4D light-field (LF) depth estimation that exploits the special linear structure of the epipolar plane image (EPI) together with locally linear embedding (LLE). Without high computational complexity, depth maps are estimated locally by locating the optimal slope of each line segment on the EPIs, onto which the corresponding scene points are projected. For each pixel to be processed, we build and then minimize a matching cost that aggregates pixel intensity, pixel gradient, spatial consistency, and a reliability measure in order to select the optimal slope from a predefined set of directions. Next, a sub-angle estimation method is proposed to further refine the obtained optimal slope of each pixel. Furthermore, based on a local reliability measure, all pixels are classified as reliable or unreliable. For the unreliable pixels, LLE is employed to propagate depth from the reliable pixels, based on the manifold-preserving property of natural images. We demonstrate the effectiveness of our approach on a number of synthetic LF examples and real-world LF data sets, and show that it outperforms both classical and recent state-of-the-art LF stereo matching methods.
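The core EPI operation, picking the slope from a predefined set that minimizes a matching cost at a pixel, can be sketched as follows. This toy version uses only intensity variance along the candidate line as the cost; the paper's cost additionally aggregates gradients, spatial consistency, and a reliability term:

```python
import numpy as np

def best_epi_slope(epi, u, s, slopes):
    """Pick the slope (proportional to disparity) whose line through pixel
    (s, u) of the EPI is most photo-consistent, i.e. has minimal intensity
    variance across the angular dimension. epi has shape (num_views, width)."""
    num_views, width = epi.shape
    best, best_cost = None, np.inf
    for d in slopes:
        # Sample the EPI along the line u(s') = u + d * (s' - s).
        cols = np.round(u + d * (np.arange(num_views) - s)).astype(int)
        if cols.min() < 0 or cols.max() >= width:
            continue  # line leaves the EPI; skip this candidate
        samples = epi[np.arange(num_views), cols]
        cost = samples.var()  # photo-consistency cost
        if cost < best_cost:
            best, best_cost = d, cost
    return best
```

A sub-angle refinement step would then search a finer set of slopes around the returned candidate.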
Visual Communications and Image Processing | 2013
Haoqian Wang; Mian Wu; Yongbing Zhang; Lei Zhang
In this paper, we propose an effective stereo matching algorithm using reliable points and region-based graph cut. First, initial disparity maps are calculated via a local window-based method. Second, unreliable points are detected according to the DSI (disparity space image), and the estimated disparity value of each unreliable point is obtained by considering its surrounding points. The scheme of reliable points is then introduced into a region-based graph cut framework to optimize the initial result. Finally, remaining errors in the disparity results are effectively handled in a multi-step refinement process. Experimental results show that the proposed algorithm achieves a significant reduction in computational cost while guaranteeing high matching quality.
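The local window-based initial estimate can be sketched as a per-pixel sum-of-absolute-differences (SAD) search, the kind of cost that populates a DSI. This is a generic illustration, not the paper's full pipeline with reliability detection and graph-cut optimization:

```python
import numpy as np

def sad_disparity(left, right, x, y, max_disp, win=1):
    """Initial disparity for pixel (y, x) of the left image by minimizing
    the sum of absolute differences (SAD) over a (2*win+1)^2 window."""
    patch_l = left[y - win:y + win + 1, x - win:x + win + 1].astype(np.float64)
    best_d, best_cost = 0, np.inf
    for d in range(max_disp + 1):
        if x - d - win < 0:
            break  # matching window would fall outside the right image
        patch_r = right[y - win:y + win + 1,
                        x - d - win:x - d + win + 1].astype(np.float64)
        cost = np.abs(patch_l - patch_r).sum()
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d
```

Unreliable points would be those whose DSI cost curve is flat or ambiguous; their disparities are then re-estimated from neighboring reliable points.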
IEEE Transactions on Circuits and Systems for Video Technology | 2013
Yongbing Zhang; Xiangyang Ji; Haoqian Wang; Qionghai Dai
Stereo interleaving video coding, in which the left and right view frames are each subsampled to half size and multiplexed into a single frame before being encoded by a traditional 2-D video encoder, is an efficient encoding scenario for stereoscopic video. Many existing stereo interleaving video coding methods subsample each frame with fixed subsampling filter coefficients. Such methods are easy to implement, but they ignore the varying properties of the frame signal. By jointly considering the influences of subsampling and compression, we present a rate-distortion analysis of stereo interleaving video coding: the final distortion is the sum of the error caused by subsampling (the distortion between the subsampled-then-interpolated image and the original full-resolution one) and the error caused by quantization during compression. Based on this rate-distortion analysis, a content-adaptive image subsampling (CAIS) scheme is also proposed. In CAIS, the half-size frames are generated by optimal subsampling filters, which are calculated from the frame contents and the targeted interpolation coefficients. Experimental results demonstrate that the proposed CAIS greatly improves the compression efficiency of stereo interleaving video coding.
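The content-adaptive idea, choosing subsampling filter taps to minimize reconstruction error for a fixed interpolation scheme, can be sketched as a 1-D least-squares problem. The signal model, tap count, and linear interpolation used here are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def adaptive_subsampling_filter(x, taps=3):
    """Find the 'taps'-tap filter h such that downsampling x by 2 with h
    and then linearly interpolating back best approximates x in L2.
    Returns the filter and the residual reconstruction error."""
    n = len(x)
    def reconstruct(h):
        # Downsample: y[m] = sum_k h[k] * x[2m + k] (zero-padded at the end).
        xp = np.concatenate([x, np.zeros(taps)])
        y = np.array([xp[2 * m:2 * m + taps] @ h for m in range(n // 2)])
        # Fixed linear interpolation back to full resolution.
        out = np.zeros(n)
        out[0::2] = y
        yp = np.concatenate([y, y[-1:]])
        out[1::2] = 0.5 * (yp[:-1] + yp[1:])[:len(out[1::2])]
        return out
    # Reconstruction is linear in h, so stack unit-impulse responses as columns.
    M = np.column_stack([reconstruct(e) for e in np.eye(taps)])
    h, *_ = np.linalg.lstsq(M, x, rcond=None)
    return h, np.linalg.norm(M @ h - x)
```

The actual CAIS derivation couples this subsampling error with the quantization error from compression when optimizing the filter.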
IEEE Transactions on Image Processing | 2016
Jinli Suo; Dongsheng An; Xiangyang Ji; Haoqian Wang; Qionghai Dai
Specular reflection is ubiquitous in photography and causes the recorded color to deviate from its true value; fast, high-quality highlight removal from a single natural image is therefore of great importance. Despite decades of progress in highlight removal, achieving wide applicability across the large diversity of natural scenes remains challenging. To address this problem, we propose an analytic solution to highlight removal based on an L2 chromaticity definition and a corresponding dichromatic model. Specifically, this paper derives a normalized dichromatic model for pixels with identical diffuse color: a unit circle equation of projection coefficients in two subspaces that are orthogonal to and parallel with the illumination, respectively. In the former, illumination-orthogonal subspace, which is specular-free, we can conduct robust clustering with an explicit criterion to determine the cluster number adaptively. In the latter, illumination-parallel subspace, a property called the pure diffuse pixels distribution rule helps map each specular-influenced pixel to its diffuse component. In terms of efficiency, the proposed approach involves few complex calculations and can thus quickly remove highlights from high-resolution images. Experiments show that the method achieves superior performance in various challenging cases.
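The two-subspace decomposition at the heart of the analysis can be sketched directly: each RGB pixel splits into a component parallel to the illumination color (which carries the specular contribution) and an orthogonal, specular-free residual. This shows only the geometry, assuming a known illumination color, not the clustering or the diffuse-recovery rule:

```python
import numpy as np

def split_by_illumination(pixels, illum):
    """Decompose RGB pixels (shape (n, 3)) into components parallel and
    orthogonal to the illumination color. The orthogonal part is
    specular-free by construction."""
    L = illum / np.linalg.norm(illum)   # unit illumination direction
    par_coeff = pixels @ L              # projection coefficient onto illumination
    parallel = np.outer(par_coeff, L)   # illumination-parallel component
    orthogonal = pixels - parallel      # specular-free residual
    return parallel, orthogonal
```

Clustering pixels of identical diffuse color then happens in the orthogonal subspace, where specularity cannot interfere.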
Visual Communications and Image Processing | 2015
Yulun Zhang; Yongbing Zhang; Jian Zhang; Haoqian Wang; Xingzheng Wang; Qionghai Dai
We propose a fast single-image super-resolution algorithm based on adaptive local nonparametric regression. Making use of dictionary learning and regression, we learn multiple projection matrices that directly map low-resolution features to their high-resolution counterparts. Unlike previous linear regression approaches that require fixed constant parameters, our method uses no extra parameters for regression. We use the mutual coherence between dictionary atoms and the low-resolution feature as a label to reconstruct a more sophisticated high-resolution feature. Because the same form of mutual coherence serves as the label in both the training and testing phases, our method amounts to an adaptive local linear regression model. Moreover, we investigate the statistical properties of the dictionary atoms learned from the training features. Utilizing these statistical priors, our method not only obtains more useful dictionary atoms but also further reduces computation time. As shown in our experimental results, the proposed method yields super-resolved images of higher quality, both quantitatively and visually, than state-of-the-art methods.
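The selection step, scoring each unit-norm dictionary atom by its mutual coherence with the low-resolution feature and applying the associated learned projection, can be sketched as follows. The dictionary and projection matrices are placeholders for trained components:

```python
import numpy as np

def adaptive_projection(y, dictionary, projections):
    """Map an LR feature y to an HR feature via the projection matrix
    associated with the most coherent dictionary atom. 'dictionary' holds
    unit-norm atoms as columns; 'projections' is one matrix per atom."""
    yn = y / np.linalg.norm(y)
    # Mutual coherence of y with every atom: |<atom, y/||y||>|.
    coherence = np.abs(dictionary.T @ yn)
    k = int(np.argmax(coherence))
    return projections[k] @ y, k
```

Because the same coherence score is computed at training time to group features per atom, the regression adapts locally without any tunable parameter.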
Advances in Multimedia | 2015
Yulun Zhang; Yongbing Zhang; Jian Zhang; Haoqian Wang; Qionghai Dai
We propose a new model called iterative collaborative representation (ICR) for image super-resolution (SR). Most popular SR approaches extract low-resolution (LR) features directly from the given LR image to recover its corresponding high-resolution (HR) features; however, they neglect to exploit the reconstructed HR image for further SR enhancement. Based on this observation, we extract features from the reconstructed HR image to progressively upscale the LR image in an iterative way. In the learning phase, we use both the reconstructed and the original HR images as inputs to train the mapping models, which are then used to upscale the original LR images. In the reconstruction phase, these mapping models and the LR features extracted from the LR and reconstructed images are used to conduct image SR in each iteration. Experimental results on standard images demonstrate that ICR achieves state-of-the-art SR performance quantitatively and visually, surpassing recently published leading SR methods.
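The iterative structure can be sketched as a simple loop: an initial upscaling followed by per-stage trained mappings that refine the current HR estimate using both it and the original LR input. The `upscale` and `models` callables stand in for the learned components:

```python
import numpy as np

def iterative_sr(lr, upscale, models):
    """ICR-style loop (sketch): each iteration refines the current HR
    estimate with that stage's learned mapping, which may also consult
    the original LR input."""
    hr = upscale(lr)            # initial HR estimate from the LR input
    for model in models:        # one trained mapping per iteration
        hr = model(hr, lr)      # refine using current HR + original LR
    return hr
```

Training each stage on reconstructions produced by the previous stage is what lets later mappings correct the systematic errors of earlier ones.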
Computer Graphics Forum | 2013
Jinlong Ju; Jue Wang; Yebin Liu; Haoqian Wang; Qionghai Dai
Previous video matting approaches mostly adopt the “binary segmentation + matting” strategy, i.e., first segment each frame into foreground and background regions, then extract the fine details of the foreground boundary using matting techniques. This framework has several limitations that stem from its use of binary segmentation. In this paper, we propose a new supervised video matting approach. Instead of applying binary segmentation, we explicitly model segmentation uncertainty in a novel tri-level segmentation procedure. The segmentation is done progressively, enabling us to handle difficult cases such as large topology changes, which are challenging for previous approaches. The tri-level segmentation results can be naturally fed into matting techniques to generate the final alpha mattes. Experimental results show that our system generates high-quality results with less user input than state-of-the-art methods.
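A tri-level map of the kind that feeds a matting step can be illustrated by carving an unknown band around a binary mask's boundary. This box dilation/erosion sketch only shows what a tri-level segmentation looks like; the paper builds it progressively with modeled uncertainty rather than from a fixed binary mask:

```python
import numpy as np

def trilevel_from_mask(mask, band=1):
    """Binary foreground mask -> tri-level map: 0 = background,
    1 = unknown boundary band, 2 = confident foreground core."""
    m = mask.astype(bool)
    dil, ero = m.copy(), m.copy()
    for _ in range(band):
        p = np.pad(dil, 1)
        # Box dilation: on if any pixel in the 3x3 neighborhood is on.
        dil = (p[:-2, 1:-1] | p[2:, 1:-1] | p[1:-1, :-2] | p[1:-1, 2:]
               | p[1:-1, 1:-1] | p[:-2, :-2] | p[:-2, 2:] | p[2:, :-2] | p[2:, 2:])
        q = np.pad(ero, 1)
        # Box erosion: on only if the whole 3x3 neighborhood is on.
        ero = (q[:-2, 1:-1] & q[2:, 1:-1] & q[1:-1, :-2] & q[1:-1, 2:]
               & q[1:-1, 1:-1] & q[:-2, :-2] & q[:-2, 2:] & q[2:, :-2] & q[2:, 2:])
    tri = np.zeros(mask.shape, dtype=np.uint8)
    tri[dil] = 1  # everything near the boundary starts as unknown
    tri[ero] = 2  # then the eroded core is marked confident foreground
    return tri
```

The matting stage then solves for alpha values only inside the unknown band, with the other two levels acting as hard constraints.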
Visual Communications and Image Processing | 2015
Judong Wu; Haoqian Wang; Xingzheng Wang; Yongbing Zhang
We propose a novel light-field super-resolution framework based on a hybrid imaging system that combines two different imaging mechanisms: conventional imaging and state-of-the-art light-field imaging. We exploit the spatial-resolution advantage of conventional imaging to compensate for the light field and reconstruct a higher-quality light field. In our method, we classify the points of the 3D scene: for highlights and occlusions, dictionary-learning-based interpolation is utilized; for other areas, an improved patch matching algorithm is applied. Experimental results against four methods, including state-of-the-art algorithms, show that our approach is effective.
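The basic operation behind patch-based detail transfer from the high-resolution conventional image to the light-field views is a nearest-patch search. This brute-force sketch is generic, not the paper's improved matching algorithm:

```python
import numpy as np

def match_patch(query, image, size):
    """Return the top-left (row, col) of the size x size patch in 'image'
    that best matches 'query' under squared error (exhaustive search)."""
    h, w = image.shape
    best, best_cost = (0, 0), np.inf
    for i in range(h - size + 1):
        for j in range(w - size + 1):
            cost = ((image[i:i + size, j:j + size] - query) ** 2).sum()
            if cost < best_cost:
                best, best_cost = (i, j), cost
    return best
```

In the hybrid setup, low-resolution light-field patches are matched against the conventional image so that its high-frequency detail can be copied into the reconstructed views.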
Optics Express | 2015
Dongsheng An; Jinli Suo; Haoqian Wang; Qionghai Dai
The reflection spectrum of an object characterizes its surface material, but for non-Lambertian scenes, the recorded spectrum often deviates owing to specular contamination. To compensate for this deviation, the illumination spectrum is required, and it can be estimated from specularity. However, existing illumination-estimation methods often degenerate in challenging cases, especially when only weak specularity exists. By adopting the dichromatic reflection model, which formulates a specular-influenced image as a linear combination of diffuse and specular components, this paper explores two individual priors and one mutual prior upon these two components: (i) The chromaticity of a specular component is identical over all the pixels. (ii) The diffuse component of a specular-contaminated pixel can be reconstructed using its specular-free counterpart describing the same material. (iii) The spectrum of illumination usually has low correlation with that of diffuse reflection. A general optimization framework is proposed to estimate the illumination spectrum from the specular component robustly and accurately. The results of both simulation and real experiments demonstrate the robustness and accuracy of our method.
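Prior (iii), that the illumination spectrum has low correlation with the diffuse reflection spectrum, can be sketched as a scoring rule over candidate illumination spectra. This toy selector stands in for the paper's full optimization framework:

```python
import numpy as np

def pick_illumination(candidates, diffuse):
    """Return the candidate illumination spectrum with the lowest absolute
    Pearson correlation to the diffuse reflection spectrum."""
    def corr(a, b):
        a = a - a.mean()
        b = b - b.mean()
        return abs((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = [corr(c, diffuse) for c in candidates]
    return candidates[int(np.argmin(scores))]
```

In the actual method this prior is one term of a joint objective, combined with the specular-chromaticity and diffuse-reconstruction priors, rather than a standalone selection rule.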