Xiangyang Ji
Tsinghua University
Publications
Featured research published by Xiangyang Ji.
European Conference on Computer Vision | 2012
Genzhi Ye; Yebin Liu; Nils Hasler; Xiangyang Ji; Qionghai Dai; Christian Theobalt
We present an algorithm for marker-less performance capture of interacting humans using only three hand-held Kinect cameras. Our method reconstructs human skeletal poses, deforming surface geometry and camera poses for every time step of the depth video. Skeletal configurations and camera poses are found by solving a joint energy minimization problem which optimizes the alignment of RGBZ data from all cameras, as well as the alignment of human shape templates to the Kinect data. The energy function is based on a combination of geometric correspondence finding, implicit scene segmentation, and correspondence finding using image features. Only the combination of geometric and photometric correspondences and the integration of human pose and camera pose estimation enables reliable performance capture with only three sensors. As opposed to previous performance capture methods, our algorithm succeeds on general uncontrolled indoor scenes with potentially dynamic background, and it succeeds even if the cameras are moving.
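To make the joint-optimization structure concrete, here is a minimal sketch, not the authors' implementation: a shared template pose (reduced to a single translation for brevity) is refined by minimizing the summed alignment residuals against depth data from several cameras at once, mirroring how the energy above couples all sensors. All names and the toy data are illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of the joint energy structure:
# a shared template offset is refined by minimizing the summed alignment
# residuals against depth data from several cameras simultaneously.
import numpy as np
from scipy.optimize import minimize

def alignment_residual(offset, depth_points, template_points):
    """Squared alignment error of the translated template against one
    camera's depth points (the real term uses full pose + correspondences)."""
    return np.sum((template_points + offset - depth_points) ** 2)

def joint_energy(offset, depth_per_camera, template, weights):
    # Weighted sum over all cameras, as in the combined energy above.
    return sum(w * alignment_residual(offset, d, template)
               for w, d in zip(weights, depth_per_camera))

rng = np.random.default_rng(0)
template = rng.normal(size=(200, 3))
true_offset = np.array([0.2, -0.1, 0.3])
# Three "Kinect" views of the same moved template, with sensor noise.
views = [template + true_offset + 0.01 * rng.normal(size=(200, 3))
         for _ in range(3)]
res = minimize(joint_energy, x0=np.zeros(3),
               args=(views, template, [1.0, 1.0, 1.0]))
print("recovered offset:", res.x)   # close to [0.2, -0.1, 0.3]
```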
IEEE Transactions on Circuits and Systems for Video Technology | 2009
Yongbing Zhang; Debin Zhao; Xiangyang Ji; Ronggang Wang; Wen Gao
This paper proposes a spatio-temporal autoregressive (STAR) model for frame rate upconversion. In the STAR model, each pixel in the interpolated frame is approximated as a weighted combination of a sample space comprising the pixels within its two temporal neighborhoods in the previous and following original frames, as well as the available interpolated pixels within its spatial neighborhood in the current to-be-interpolated frame. To derive accurate STAR weights, an iterative self-feedback weight training algorithm is proposed. In each iteration, the pixels of each training window in the interpolated frames are first approximated by the sample space from the previous and following original frames and the to-be-interpolated frame. Then the actual pixels of each training window in the original frame are approximated, with the same weights, by the sample space from the previous and following interpolated frames and the current original frame. The weights of each training window are calculated by jointly minimizing the distortion between the interpolated frames in the current and previous iterations as well as the distortion between the original frame and its interpolated counterpart. Extensive simulation results demonstrate that the proposed STAR model yields interpolated frames of high subjective and objective quality.
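The core mechanism, a pixel expressed as a weighted sum of spatio-temporal neighbors with weights fit by least squares over a training window, can be sketched in a few lines. This toy version keeps only the temporal part and a single training pass; the iterative self-feedback refinement and spatial terms are omitted, and all sizes are illustrative assumptions.

```python
# A minimal illustrative sketch (assumptions, not the paper's code) of the
# STAR idea: each interpolated pixel is a weighted sum of co-located 3x3
# neighborhoods in the previous and following original frames; the weights
# come from least squares over a training window.
import numpy as np

def star_interpolate(prev_patch, next_patch, weights):
    samples = np.concatenate([prev_patch.ravel(), next_patch.ravel()])
    return samples @ weights

def train_weights(prev_frame, next_frame, target_frame):
    """Solve least squares for one weight vector over all 3x3 neighborhoods."""
    H, W = target_frame.shape
    A, b = [], []
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            p = prev_frame[y-1:y+2, x-1:x+2].ravel()
            n = next_frame[y-1:y+2, x-1:x+2].ravel()
            A.append(np.concatenate([p, n]))
            b.append(target_frame[y, x])
    w, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return w

# Toy frames: a ramp that translates by one pixel per frame.
base = np.tile(np.arange(16, dtype=float), (16, 1))
f0, f1, f2 = base, np.roll(base, 1, axis=1), np.roll(base, 2, axis=1)
w = train_weights(f0, f2, f1)               # train on a known middle frame
est = star_interpolate(f0[4:7, 7:10], f2[4:7, 7:10], w)
print(est, f1[5, 8])                        # estimate vs. ground truth
```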
IEEE Transactions on Circuits and Systems for Video Technology | 2012
Qifei Wang; Xiangyang Ji; Qionghai Dai; Naiyao Zhang
To improve free viewpoint video (FVV) coding efficiency and optimize the quality of the synthesized virtual view video, this paper proposes a depth-assisted FVV coding framework and analyzes the rate-distortion (R-D) property of the synthesized virtual view video in FVV coding. In the depth-assisted FVV coding framework, depth-assisted disparity-compensated prediction is introduced to exploit the correlation between multiview video (MVV) and depth. To model the R-D property of the synthesized virtual view video, a region-based view synthesis distortion estimation approach is investigated with respect to the distortion of MVV and depth. Subsequently, general R-D property estimation models of MVV and depth are analyzed. Finally, a rate-allocation scheme is designed to optimize the quantization parameter pair of MVV and depth in FVV coding. The simulation results demonstrate that the proposed depth-assisted FVV coding framework can improve FVV coding efficiency. The region-based view synthesis distortion estimation approach and the general R-D model are able to precisely approximate the R-D property of synthesized virtual view video in multiview-video-plus-depth-based FVV coding frameworks. The proposed rate-allocation scheme can optimize overall FVV coding efficiency to achieve a high-quality reconstructed video at the desired viewpoint under a given rate constraint.
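The rate-allocation step amounts to searching the quantization parameter pair for texture and depth that minimizes estimated synthesized-view distortion under a total rate constraint. The sketch below shows that search; the rate and distortion models are generic placeholders, not the paper's fitted models, and all parameter values are assumptions.

```python
# A hedged toy sketch of the rate-allocation step: search the quantization
# parameter pair (QP_video, QP_depth) that minimizes an estimated synthesized
# view distortion under a total rate constraint.
import itertools

def rate(qp):             # placeholder exponential rate model
    return 1000.0 * 2 ** (-(qp - 20) / 6.0)

def synth_distortion(qp_v, qp_d):    # placeholder: texture errors dominate,
    return 0.7 * qp_v ** 2 + 0.3 * qp_d ** 2   # depth errors warp the view

def allocate(rate_budget, qp_range=range(20, 45)):
    best = None
    for qp_v, qp_d in itertools.product(qp_range, repeat=2):
        if rate(qp_v) + rate(qp_d) <= rate_budget:
            d = synth_distortion(qp_v, qp_d)
            if best is None or d < best[0]:
                best = (d, qp_v, qp_d)
    return best

print(allocate(rate_budget=800.0))   # (distortion, QP_video, QP_depth)
```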
IEEE Transactions on Image Processing | 2013
Jingyu Lin; Xiangyang Ji; Wenli Xu; Qionghai Dai
Shape from defocus (SFD) is one of the most popular techniques in monocular 3D vision. While most SFD approaches require two or more images of the same scene captured from a fixed viewpoint, this paper presents an efficient approach to estimate absolute depth from a single defocused image. Instead of directly measuring the defocus level of each pixel, we propose to design a sequence of aperture-shape filters to segment a defocused image by defocus level. A boundary-weighted belief propagation algorithm is employed to obtain a smooth depth map. We also give an estimate of the depth error. Extensive experiments show that our approach outperforms state-of-the-art single-image SFD approaches in both the precision of the estimated absolute depth and running time.
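For intuition on single-image defocus estimation, here is a hedged toy sketch. It does not reproduce the paper's aperture-shape filter bank; instead it uses a related classic cue: at a step edge blurred by sigma, re-blurring with sigma_r scales the gradient peak by sigma / sqrt(sigma^2 + sigma_r^2), which can be inverted for sigma, and depth then follows from the thin-lens model.

```python
# Toy single-image blur estimation via the gradient-ratio cue (a stand-in
# for the paper's aperture-shape filters, named plainly as such).
import numpy as np
from scipy.ndimage import gaussian_filter1d

true_sigma, sigma_r = 2.0, 1.5
edge = np.zeros(201); edge[100:] = 1.0             # ideal step edge
observed = gaussian_filter1d(edge, true_sigma)     # single defocused input

reblurred = gaussian_filter1d(observed, sigma_r)
g1 = np.max(np.abs(np.gradient(observed)))         # gradient peak, input
g2 = np.max(np.abs(np.gradient(reblurred)))        # gradient peak, re-blur
ratio = g1 / g2                                    # = sqrt(s^2 + r^2) / s
sigma_est = sigma_r / np.sqrt(ratio ** 2 - 1.0)
print(f"estimated defocus sigma: {sigma_est:.2f}") # ~2.0
```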
IEEE Transactions on Image Processing | 2011
Long Xu; Debin Zhao; Xiangyang Ji; Lei Deng; Sam Kwong; Wen Gao
In rate control, smooth picture quality and smooth buffer occupancy are both important but conflict with each other at a given bit rate. Achieving a good tradeoff between them has received little attention previously. To address this problem, a theoretical window model is proposed in this paper, in which several adjacent frames are grouped into a window and considered together. Smoothness of both picture quality and buffer occupancy can be gracefully achieved by regulating the size of the window. To illustrate the use of the window model, a window-level rate control algorithm coupled with the traditional ρ-domain rate-distortion model is further introduced. In experiments, we first show how the proposed window model achieves the tradeoff between picture-quality smoothness and buffer smoothness, and then demonstrate the significant PSNR improvement, bit-control accuracy, and consistent visual quality of the proposed window-level rate control algorithm.
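As a minimal sketch of how window-level allocation sits on top of the classic ρ-domain model R = θ(1 − ρ) (where ρ is the fraction of zero quantized coefficients), the code below splits a window's bit budget across frames and inverts the model for a target ρ per frame. All numbers are illustrative assumptions, not the paper's parameters.

```python
# Window-level bit allocation over the rho-domain model R = theta * (1 - rho).
def allocate_window(window_budget, complexities):
    """Split a window's bit budget across frames in proportion to an
    estimated complexity measure, smoothing per-frame fluctuation."""
    total = sum(complexities)
    return [window_budget * c / total for c in complexities]

def rho_for_target(bits, theta):
    """Invert R = theta * (1 - rho) for the target zero fraction, which a
    real encoder then maps to a quantization step."""
    return max(0.0, min(1.0, 1.0 - bits / theta))

window_budget = 120_000               # bits for a 4-frame window (assumed)
complexities = [1.0, 1.4, 0.8, 1.2]   # e.g. mean absolute residual per frame
theta = 300_000                       # rho-domain slope fitted on past frames
for i, b in enumerate(allocate_window(window_budget, complexities)):
    print(f"frame {i}: {b:.0f} bits, target rho = {rho_for_target(b, theta):.3f}")
```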
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Da Liu; Debin Zhao; Xiangyang Ji; Wen Gao
In dual frame motion compensation (DFMC), one short-term reference frame and one long-term reference frame (LTR) are utilized for motion compensation. The performance of DFMC is heavily influenced by the jump updating parameter and the bit allocation for the reference frames. In this paper, a rate-distortion performance analysis of motion-compensated prediction in DFMC is first presented. Based on this analysis, an adaptive jump updating DFMC (JU-DFMC) with optimal LTR selection and bit allocation is proposed. Subsequently, an error-resilient JU-DFMC is presented based on an error propagation analysis of the proposed adaptive JU-DFMC. The experimental results show that the proposed adaptive JU-DFMC outperforms existing JU-DFMC schemes and the normal DFMC scheme, in which the two most recently decoded frames are used as references. The performance of the adaptive JU-DFMC is significantly improved for video transmission over noisy channels when the specified error resilience functionality is introduced.
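The per-block decision at the heart of DFMC, choosing between the short-term and long-term reference by matching cost, can be illustrated with a hypothetical, heavily simplified sketch (no motion search, SAD cost only; all data is synthetic):

```python
# Toy dual-frame reference selection: pick whichever of the short-term and
# long-term reference blocks yields the lower matching cost.
import numpy as np

def sad(a, b):
    return float(np.abs(a - b).sum())    # sum of absolute differences

def choose_reference(block, short_ref_block, long_ref_block):
    cost_s, cost_l = sad(block, short_ref_block), sad(block, long_ref_block)
    return ("short-term", cost_s) if cost_s <= cost_l else ("long-term", cost_l)

rng = np.random.default_rng(2)
cur = rng.integers(0, 255, (8, 8)).astype(float)
short_ref = cur + rng.normal(0, 4, (8, 8))     # recent frame: small change
long_ref = cur + rng.normal(0, 25, (8, 8))     # older frame: larger drift
print(choose_reference(cur, short_ref, long_ref))   # usually "short-term"
```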
International Conference on Image Processing | 2007
Xiaoming Li; Debin Zhao; Xiangyang Ji; Qiang Wang; Wen Gao
Multi-view video coding improves coding efficiency by utilizing motion-compensated prediction (MCP) and disparity-compensated prediction (DCP). However, the complexity of inter-frame prediction is very high, especially when rate-distortion optimization is used. This paper presents a fast inter-frame prediction algorithm to reduce this complexity. First, the prediction type is decided according to the reference frames. Then, unnecessary search regions in the view direction are removed. Finally, a fast inter-mode decision strategy is proposed based on the relationship between MCP and DCP. Experimental results verify that the proposed algorithm greatly increases prediction speed with negligible loss of coding efficiency.
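The early-termination idea behind such fast mode decisions can be sketched as follows; thresholds, costs, and the single-candidate lists are illustrative assumptions, not the paper's decision rules:

```python
# Hedged sketch of early termination: if motion-compensated prediction (MCP)
# already matches well, prune the disparity-compensated prediction (DCP)
# search in the view direction.
import numpy as np

def block_cost(block, pred):
    return float(np.abs(block - pred).sum())

def fast_inter_prediction(block, mcp_candidates, dcp_candidates, skip_thresh):
    best = min(block_cost(block, p) for p in mcp_candidates)
    if best < skip_thresh:             # MCP is good enough: skip DCP search
        return "MCP", best
    dcp_best = min(block_cost(block, p) for p in dcp_candidates)
    return ("MCP", best) if best <= dcp_best else ("DCP", dcp_best)

rng = np.random.default_rng(3)
blk = rng.integers(0, 255, (8, 8)).astype(float)
mcp = [blk + rng.normal(0, 2, (8, 8))]         # temporal prediction is close
dcp = [blk + rng.normal(0, 30, (8, 8)) for _ in range(4)]
print(fast_inter_prediction(blk, mcp, dcp, skip_thresh=8 * 8 * 4))
```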
IEEE Transactions on Systems, Man, and Cybernetics | 2014
Yue Deng; Yipeng Li; Yanjun Qian; Xiangyang Ji; Qionghai Dai
Codebook-based learning provides a flexible way to extract the contents of an image in a data-driven manner for visual recognition. One central task in such frameworks is codeword assignment, which allocates local image descriptors to the most similar codewords in the dictionary to generate histograms for categorization. Nevertheless, existing assignment approaches, e.g., the nearest-neighbor strategy (hard assignment) and Gaussian similarity (soft assignment), suffer from two problems: 1) an overly strong Euclidean assumption and 2) neglect of the label information of the local descriptors. To address these two challenges, we propose a graph assignment method with maximal mutual information (GAMI) regularization. GAMI exploits the manifold structure to better reveal the relationships among a massive number of local features through a nonlinear graph metric. Meanwhile, the mutual information of descriptor-label pairs is optimized in the embedding space to enhance the discriminant property of the selected codewords. Based on this objective, two optimization models, inexact-GAMI and exact-GAMI, are proposed in this paper. The inexact model can be efficiently solved with a closed-form solution. The stricter exact-GAMI nonparametrically estimates the entropy of descriptor-label pairs in the embedding space and thus leads to a relatively complicated but still tractable optimization. The effectiveness of the GAMI models is verified on both public datasets and our own.
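To ground the two baselines the abstract criticizes, here is a short sketch contrasting hard assignment (nearest codeword) with soft assignment (Gaussian similarity). GAMI itself replaces the Euclidean metric with a graph-based one; that part is not reproduced here, and all sizes are assumptions.

```python
# Hard vs. soft codeword assignment producing per-image histograms.
import numpy as np

def hard_assign_histogram(descriptors, codebook):
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(codebook))
    return hist / hist.sum()

def soft_assign_histogram(descriptors, codebook, sigma=1.0):
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    w = np.exp(-d ** 2 / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)       # each descriptor sums to 1
    return w.sum(axis=0) / len(descriptors)

rng = np.random.default_rng(4)
codebook = rng.normal(size=(16, 32))        # 16 codewords, 32-D descriptors
descs = rng.normal(size=(200, 32))          # local descriptors of one image
print(hard_assign_histogram(descs, codebook))
print(soft_assign_histogram(descs, codebook))
```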
International Symposium on Circuits and Systems | 2007
Long Xu; Wen Gao; Xiangyang Ji; Debin Zhao
Coding performance can be further improved when hierarchical B-picture coding is introduced into H.264/AVC. However, existing rate control schemes cannot work efficiently within this new coding framework. This paper proposes a novel rate control algorithm for hierarchical B-picture coding in H.264/AVC. First, a set of scaling factors for designing the cascaded quantizers of B-frames at different temporal levels is introduced. Based on these scaling factors, an efficient bit-allocation strategy for hierarchical B-picture coding is presented. Experiments show that the proposed rate control algorithm improves PSNR by up to 0.7 dB compared with the existing hierarchical B-picture coding in H.264/AVC, while the mismatch between the target and actual bit rate does not exceed 2%.
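The cascaded-quantizer idea, coarser quantization and smaller bit shares at deeper temporal levels, can be illustrated with the sketch below. The per-level QP offset and decay factor are assumptions for illustration, not the paper's fitted scaling factors.

```python
# Toy cascaded quantizers and bit shares for a hierarchical-B GOP of 8.
def cascaded_qp(base_qp, level, step_per_level=1):
    """Quantizer grows with temporal level: deeper B-frames are referenced
    less, so they tolerate coarser quantization."""
    return base_qp + step_per_level * level

def level_bit_shares(levels, decay=0.6):
    """Toy bit allocation: each deeper level receives a decayed share."""
    raw = [decay ** lvl for lvl in levels]
    total = sum(raw)
    return [r / total for r in raw]

# Display order of a GOP of 8 with 3 temporal levels below the key frame.
levels = [0, 3, 2, 3, 1, 3, 2, 3]
shares = level_bit_shares(levels)
for i, (lvl, share) in enumerate(zip(levels, shares)):
    print(f"frame {i}: level {lvl}, QP {cascaded_qp(28, lvl)}, "
          f"share {share:.3f}")
```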
IEEE Transactions on Circuits and Systems for Video Technology | 2010
Hongjiang Xiao; Qionghai Dai; Xiangyang Ji; Wenwu Zhu
Multiple-input multiple-output (MIMO) and cooperative communication are two state-of-the-art techniques for providing high-rate, high-quality video communication services. Taking advantage of both, this paper presents a novel joint source-channel coding (JSCC) framework for scalable video transmission over cooperative MIMO. In this framework, we first propose a cooperative MIMO architecture that employs a macro-micro power control strategy as a relaying protocol to determine the on/off mode of relays and the specific power allocation among them, either by equal power amplification or by cooperative beamforming. Then, an unequal error protection structure is proposed to protect video layers of different importance by concatenating rate-variable low-density parity-check codes with diversity-embedded space-time block codes. Moreover, for channel adaptation, the switching of space-time codes is designed to achieve different diversity-multiplexing tradeoff points. Finally, a JSCC algorithm integrating the diversity-multiplexing-coding gain tradeoff is proposed to optimize the resources of the cooperative system and improve the transmission quality of scalable video. Experimental results demonstrate the effectiveness of the proposed schemes.
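As a hedged toy sketch of the unequal error protection idea, the code below gives the most important scalable-video layers the strongest (lowest-rate) channel codes. The layer importances and candidate code rates are illustrative assumptions, not the LDPC/space-time configuration of the paper.

```python
# Toy unequal error protection: stronger codes for more important layers.
def assign_code_rates(layer_importance, code_rates):
    """Pair layers, sorted by descending importance, with code rates
    sorted ascending (lowest rate = strongest protection first)."""
    order = sorted(range(len(layer_importance)),
                   key=lambda i: -layer_importance[i])
    rates = sorted(code_rates)
    assignment = [None] * len(layer_importance)
    for idx, r in zip(order, rates):
        assignment[idx] = r
    return assignment

layers = ["base", "enhance-1", "enhance-2"]
importance = [1.0, 0.6, 0.3]           # base layer matters most
code_rates = [1/2, 2/3, 5/6]           # candidate channel code rates
for name, r in zip(layers, assign_code_rates(importance, code_rates)):
    print(f"{name}: code rate {r:.3f}")
```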