Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Jiangjian Xiao is active.

Publication


Featured research published by Jiangjian Xiao.


European Conference on Computer Vision | 2006

Bilateral filtering-based optical flow estimation with occlusion detection

Jiangjian Xiao; Hui Cheng; Harpreet S. Sawhney; Cen Rao; Michael Anthony Isnardi

When variational approaches are used to estimate optical flow between two frames, the flow discontinuities between different motion fields are usually not distinguished, even when an anisotropic diffusion operator is applied. In this paper, we propose a multi-cue driven adaptive bilateral filter to regularize the flow computation, which achieves a smoothly varying optical flow field with highly desirable motion discontinuities. First, we separate the traditional one-step variational updating model into a two-step filtering-based updating model. Then, employing our occlusion detector, we reformulate the energy functional of optical flow estimation by explicitly introducing an occlusion term to balance the energy loss due to occlusion or mismatches. Furthermore, based on the two-step updating framework, a novel multi-cue driven bilateral filter is proposed to replace the original anisotropic diffusion process; it adaptively controls the diffusion according to occlusion detection, image intensity dissimilarity, and motion dissimilarity. Applying our approach to various video sources (movie and TV) in the presence of occlusion, motion blur, non-rigid deformation, and weak texture, we generate a spatially coherent flow field between each pair of input frames and detect more accurate flow discontinuities along the motion boundaries.
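The two-step idea lends itself to a compact illustration. The Python/NumPy sketch below shows one pass of a multi-cue bilateral filter over a flow field: each neighbor's weight combines spatial distance, intensity dissimilarity, flow dissimilarity, and an occlusion mask. The window radius, the sigma parameters, and the occlusion down-weighting factor are illustrative assumptions, not values from the paper.

```python
import numpy as np

def multi_cue_bilateral_step(flow, image, occlusion, radius=3,
                             sigma_s=2.0, sigma_i=10.0, sigma_f=1.0):
    """One filtering pass: each pixel's flow becomes a weighted average of its
    neighbors, with weights falling off with spatial distance, intensity
    dissimilarity, and flow dissimilarity; occluded neighbors are
    down-weighted so they do not pollute the estimate.
    flow: (H, W, 2), image: (H, W) grayscale, occlusion: (H, W) bool."""
    h, w = image.shape
    out = flow.copy()
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            acc = np.zeros(2)
            norm = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    w_s = np.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                    w_i = np.exp(-(image[ny, nx] - image[y, x]) ** 2 / (2 * sigma_i ** 2))
                    df = flow[ny, nx] - flow[y, x]
                    w_f = np.exp(-np.dot(df, df) / (2 * sigma_f ** 2))
                    w_o = 0.1 if occlusion[ny, nx] else 1.0  # occluded neighbors contribute little
                    wgt = w_s * w_i * w_f * w_o
                    acc += wgt * flow[ny, nx]
                    norm += wgt
            out[y, x] = acc / norm
    return out
```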


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2005

Motion layer extraction in the presence of occlusion using graph cuts

Jiangjian Xiao; Mubarak Shah

Extracting layers from video is very important for video representation, analysis, compression, and recognition. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust, novel approach to automatically extract a set of affine transformations induced by these regions and accurately segment the scene into several motion layers. First, a number of seed regions are determined using two-frame correspondences. Then, the seed regions are expanded and refined using a level set representation and the graph cut method. Next, these initial regions are merged into several initial layers according to motion similarity. Finally, after exploiting the occlusion order constraint over multiple frames, robust layer extraction is obtained by the graph cut algorithm, and the occlusions between overlapping layers are explicitly determined. Several examples in the experiments demonstrate that our approach is effective and robust.
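As a rough illustration of the pieces involved, the sketch below defines the per-pixel data cost of explaining one frame by another under a candidate affine motion, then assigns each pixel to its cheapest layer. The paper minimizes the full energy (data plus smoothness, with the occlusion order constraint) by graph cuts; the greedy per-pixel labeling here is a deliberate simplification to keep the example self-contained, and the out-of-bounds penalty is an arbitrary assumption.

```python
import numpy as np

def affine_warp_coords(h, w, A):
    """Map each pixel (x, y) through the 2x3 affine matrix A."""
    ys, xs = np.mgrid[0:h, 0:w]
    xw = A[0, 0] * xs + A[0, 1] * ys + A[0, 2]
    yw = A[1, 0] * xs + A[1, 1] * ys + A[1, 2]
    return xw, yw

def data_cost(frame1, frame2, A):
    """Per-pixel residual of frame1 explained by frame2 under motion A
    (nearest-neighbor sampling; pixels warped out of bounds get a high cost)."""
    h, w = frame1.shape
    xw, yw = affine_warp_coords(h, w, A)
    xi = np.clip(np.round(xw).astype(int), 0, w - 1)
    yi = np.clip(np.round(yw).astype(int), 0, h - 1)
    cost = np.abs(frame1 - frame2[yi, xi])
    oob = (xw < 0) | (xw >= w) | (yw < 0) | (yw >= h)
    cost[oob] = 1e3
    return cost

def greedy_layers(frame1, frame2, affines):
    """Label each pixel with the candidate affine motion that explains it best."""
    costs = np.stack([data_cost(frame1, frame2, A) for A in affines])
    return np.argmin(costs, axis=0)
```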


Workshop on Applications of Computer Vision | 2005

Motion Layer Based Object Removal in Videos

Yunjun Zhang; Jiangjian Xiao; Mubarak Shah

This paper proposes a novel method to generate plausible video sequences after removing relatively large objects from the original videos. To maintain temporal coherence among the frames, a motion layer segmentation method is applied. Then, a set of synthesized layers is generated by applying motion compensation and a region completion algorithm. Finally, a new video, in which the selected object is removed, is plausibly rendered from the synthesized layers and the motion parameters. A number of example videos demonstrate the effectiveness of our method.
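The final rendering step can be pictured as ordinary back-to-front compositing that simply skips the removed layer. The sketch below assumes each synthesized layer already comes with a color image and an alpha mask (i.e., motion compensation and region completion have been done), which is a strong simplification of the pipeline.

```python
import numpy as np

def composite_without(layers, alphas, remove_idx):
    """Back-to-front over-compositing that skips the removed layer.
    layers: list of (H, W, 3) images, back-to-front; alphas: list of (H, W)."""
    h, w, _ = layers[0].shape
    out = np.zeros((h, w, 3))
    for i, (img, a) in enumerate(zip(layers, alphas)):
        if i == remove_idx:
            continue  # the selected object simply never gets drawn
        out = a[..., None] * img + (1.0 - a[..., None]) * out
    return out
```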


Computer Vision and Pattern Recognition | 2010

Vehicle detection and tracking in wide field-of-view aerial video

Jiangjian Xiao; Hui Cheng; Harpreet S. Sawhney; Feng Han

This paper presents a joint probabilistic relation graph approach to simultaneously detect and track a large number of vehicles in low frame rate aerial videos. Due to the low frame rate, low spatial resolution, and sheer number of moving objects, detection and tracking in wide-area video pose unique challenges. In this paper, we derive a vehicle behavior model from the road structure and generate a set of constraints to regulate both an object-based vertex matching scheme and a pairwise edge matching scheme. The proposed relation graph approach then unifies these two matching schemes into a single cost minimization framework to produce a quadratically optimized association result. Experiments on hours of real video demonstrate that the graph matching framework with the vehicle behavior model effectively improves tracking performance in large-scale, dense-traffic scenarios.
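To make the two matching terms concrete, the sketch below scores a candidate frame-to-frame assignment with a unary (vertex) cost on per-vehicle displacement and a pairwise (edge) cost that penalizes changes in inter-vehicle spacing. The paper unifies both terms into a single quadratic optimization; scoring one given assignment, as here, is only the inner loop of that, and the weights are illustrative assumptions.

```python
import numpy as np

def assignment_cost(prev, curr, assign, w_unary=1.0, w_pair=0.5):
    """prev, curr: (N, 2) vehicle positions in consecutive frames;
    assign[i] = index in curr matched to prev[i]."""
    # Vertex term: displacement of each matched vehicle (low frame rate
    # means large but bounded motion along the road).
    unary = sum(np.linalg.norm(curr[assign[i]] - prev[i])
                for i in range(len(prev)))
    # Edge term: matched pairs should keep roughly the same spacing, since
    # vehicles in dense traffic move coherently along the road structure.
    pair = 0.0
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            d_prev = np.linalg.norm(prev[i] - prev[j])
            d_curr = np.linalg.norm(curr[assign[i]] - curr[assign[j]])
            pair += abs(d_prev - d_curr)
    return w_unary * unary + w_pair * pair
```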


Computer Vision and Pattern Recognition | 2004

Motion layer extraction in the presence of occlusion using graph cut

Jiangjian Xiao; Mubarak Shah

Extracting layers from video is very important for video representation, analysis, compression, and recognition. Assuming that a scene can be approximately described by multiple planar regions, this paper describes a robust, novel approach to automatically extract a set of affine transformations induced by these regions and accurately segment the scene into several motion layers. First, a number of seed regions are determined using two-frame correspondences. Then, the seed regions are expanded and refined using a level set representation and the graph cut method. Next, these initial regions are merged into several initial layers according to motion similarity. Finally, after exploiting the occlusion order constraint over multiple frames, robust layer extraction is obtained by the graph cut algorithm, and the occlusions between overlapping layers are explicitly determined. Several examples in the experiments demonstrate that our approach is effective and robust.


International Conference on Computer Vision | 2003

Two-frame wide baseline matching

Jiangjian Xiao; Mubarak Shah

We describe a novel approach to automatically recover corresponding feature points and epipolar geometry over two wide baseline frames. Our contributions consist of several aspects. First, an affine invariant feature, the edge-corner, is introduced to provide robust and consistent matching primitives. Second, based on the SVD of the affine matrix, the affine matching space between two corners can be approximately divided into two independent spaces by rotation angle and scaling factor. Exploiting this property, a two-stage affine matching algorithm is designed to obtain robust matches over two frames. Third, using the epipolar geometry estimated from these matches, more corresponding feature points are determined. Based on these robust correspondences, the fundamental matrix is refined and a series of virtual views of the scene is synthesized. Finally, several experiments illustrate that a number of robust correspondences can be stably determined for two wide baseline images under significant camera motion with illumination changes, occlusions, and self-similarities. After testing a number of examples and comparing with existing methods, the experimental results demonstrate that our matching method outperforms state-of-the-art algorithms on all of the test cases.
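The SVD property behind the two-stage matcher is easy to verify numerically. In the sketch below, a 2x2 affine matrix is factored via its SVD into a rotation (the polar factor U V^T) and its singular values, recovering rotation angle and scaling factors independently; the example matrix is made up for illustration.

```python
import numpy as np

def decompose_affine(A):
    """Return (rotation angle in radians, singular values) of a 2x2 affine A."""
    U, S, Vt = np.linalg.svd(A)
    R = U @ Vt                   # rotation factor of the polar decomposition
    if np.linalg.det(R) < 0:     # guard against a reflection
        U[:, -1] *= -1
        R = U @ Vt
    theta = np.arctan2(R[1, 0], R[0, 0])
    return theta, S

# Example: a 30-degree rotation followed by anisotropic scaling.
theta = np.deg2rad(30)
Rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = np.diag([1.5, 0.8]) @ Rot
print(decompose_affine(A))  # recovers ~0.52 rad (30 degrees) and scales ~(1.5, 0.8)
```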


Computer Vision and Image Understanding | 2004

Tri-view morphing

Jiangjian Xiao; Mubarak Shah

This paper presents an efficient image-based approach to navigating a scene from only three wide-baseline uncalibrated images, without the explicit use of a 3D model. After automatically recovering corresponding points between each pair of images, an accurate trifocal plane is extracted from the trifocal tensor of the three images. Next, based on a small number of feature marks entered through a simple GUI, correct dense disparity maps are obtained using our trinocular-stereo algorithm. Employing a barycentric warping scheme with the computed disparities, we can generate an arbitrary novel view within the triangle spanned by the three camera centers. Furthermore, after self-calibration of the cameras, 3D objects can be correctly augmented into the virtual environment synthesized by the tri-view morphing algorithm. Three applications of the tri-view morphing algorithm are demonstrated. The first is 4D video synthesis, which fills the gap between a few sparsely located video cameras to synthetically generate video from a virtual moving camera; this synthetic camera can view the dynamic scene from a novel viewpoint instead of the original static camera views. The second application is multiple view morphing, where we can seamlessly fly through the scene over a 2D space constructed by more than three cameras. The last is dynamic scene synthesis using three still images, where several rigid objects may move in any orientation or direction; after segmenting the three reference frames into several layers, novel views of the dynamic scene can be generated by applying our algorithm. Finally, experiments illustrate that a series of photo-realistic virtual views can be generated to fly through a virtual environment covered by several static cameras.
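The barycentric warping step can be sketched as follows, under the strong assumption that the three reference views have already been pre-warped into the target geometry using the computed disparities (the hard part of the pipeline): the virtual camera center's barycentric coordinates inside the camera triangle then serve directly as blending weights.

```python
import numpy as np

def barycentric_weights(p, a, b, c):
    """Barycentric coordinates of 2D point p in the triangle (a, b, c),
    here the virtual camera center inside the triangle of camera centers."""
    T = np.array([[b[0] - a[0], c[0] - a[0]],
                  [b[1] - a[1], c[1] - a[1]]])
    w1, w2 = np.linalg.solve(T, np.asarray(p, float) - np.asarray(a, float))
    return np.array([1.0 - w1 - w2, w1, w2])

def blend_views(warped_views, weights):
    """Novel view = weighted sum of the three pre-warped reference views."""
    return sum(w * v for w, v in zip(weights, warped_views))
```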


Computer Vision and Pattern Recognition | 2005

Accurate motion layer segmentation and matting

Jiangjian Xiao; Mubarak Shah

Given a video sequence, obtaining accurate layer segmentation and alpha matting is very important for various applications. However, when a non-textured or smooth area is present in the scene, segmentation based on a single motion cue alone usually cannot provide satisfactory results. Conversely, most matting approaches require a smoothness assumption on the foreground and background to obtain a good result. In this paper, we combine the merits of motion segmentation and alpha matting to simultaneously achieve high-quality layer segmentation and alpha mattes. First, we exploit a general occlusion constraint and design a novel graph cuts framework to solve the layer-based motion segmentation problem for the textured regions using multiple frames. Then, an alpha matting technique is used to refine the segmentation and resolve the non-textured ambiguities by determining proper alpha values for the foreground and background, respectively.
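A toy version of the refinement step illustrates how matting resolves what motion alone cannot: inside an uncertain band around the motion-layer boundary, alpha is estimated from the compositing equation I = αF + (1 − α)B, with F and B approximated by local means of confidently labeled pixels. The band definition, window size, and fallbacks below are illustrative assumptions, not the paper's matting technique.

```python
import numpy as np

def estimate_alpha(image, trimap, radius=5):
    """image: (H, W) grayscale; trimap: 1 = foreground, 0 = background,
    0.5 = unknown band around the motion-layer boundary."""
    h, w = image.shape
    alpha = trimap.astype(float).copy()
    for y in range(h):
        for x in range(w):
            if trimap[y, x] != 0.5:
                continue  # confidently labeled by motion segmentation
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            patch, labels = image[y0:y1, x0:x1], trimap[y0:y1, x0:x1]
            # Local color models: mean of nearby known fg / bg pixels.
            F = patch[labels == 1].mean() if (labels == 1).any() else patch.max()
            B = patch[labels == 0].mean() if (labels == 0).any() else patch.min()
            if abs(F - B) < 1e-6:
                alpha[y, x] = 0.5  # ambiguous: fg and bg colors agree locally
            else:
                # Invert the compositing equation I = a*F + (1 - a)*B.
                alpha[y, x] = np.clip((image[y, x] - B) / (F - B), 0.0, 1.0)
    return alpha
```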


Computer Vision and Pattern Recognition | 2008

Geo-spatial aerial video processing for scene understanding and object tracking

Jiangjian Xiao; Hui Cheng; Feng Han; Harpreet S. Sawhney

This paper presents an approach to extracting and using semantic layers from low-altitude aerial videos for scene understanding and object tracking. The input video is captured by low-flying aerial platforms and typically exhibits strong parallax from non-ground-plane structures. A key aspect of our approach is the geo-registration of video frames to reference image databases (such as those available from Terraserver and Google satellite imagery) to establish a geo-spatial coordinate system for pixels in the video. Geo-registration enables Euclidean 3D reconstruction with absolute scale, unlike traditional monocular structure from motion, where continuous scale estimation over long periods of time is an issue. Geo-registration also enables correlation of video data with other stored information sources such as GIS (geo-spatial information system) databases. In addition to the geo-registration and 3D reconstruction aspects, the key contributions of this paper include: (1) exploiting appearance and 3D shape constraints derived from geo-registered videos to label structures such as buildings, foliage, and roads for scene understanding, and (2) eliminating moving object detection and tracking errors using 3D parallax constraints and semantic labels derived from geo-registered videos. Experimental results on extended-time aerial video data demonstrate the qualitative and quantitative aspects of our work.
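A minimal sketch of what geo-registration provides: once a plane-to-plane homography from video pixels to the reference imagery's geo-grid has been estimated (e.g., from matched features), any ground-plane pixel maps to absolute geo-spatial coordinates. The homography values below are fictitious stand-ins, not values from the paper.

```python
import numpy as np

def pixel_to_geo(H, u, v):
    """Apply the 3x3 homography H to pixel (u, v) and dehomogenize."""
    p = H @ np.array([u, v, 1.0])
    return p[:2] / p[2]

# Illustrative homography: scale pixels to ~0.5 m/pixel and translate to a
# fictitious UTM-like offset (y axis flipped, as image rows grow downward).
H = np.array([[0.5,  0.0,  500000.0],
              [0.0, -0.5, 4000000.0],
              [0.0,  0.0,       1.0]])
print(pixel_to_geo(H, 320, 240))
```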


Computer Vision and Image Understanding | 2006

Self-calibration from turn-table sequences in presence of zoom and focus

Xiaochun Cao; Jiangjian Xiao; Hassan Foroosh; Mubarak Shah

This paper proposes a novel method, using constant inter-frame motion, for self-calibration from an image sequence of an object rotating around a single axis with varying camera internal parameters. Our approach makes use of the facts that in many commercial systems rotation angles are controlled by an electromechanical system, and that the inter-frame essential matrices are invariant if the rotation angles are constant but not necessarily known. Therefore, the camera internal parameters can be recovered by exploiting the equivalence of the essential matrices, which relate the unknown calibration matrices to the fundamental matrices computed from point correspondences. We also describe a linear method that works under restrictive conditions on the camera internal parameters, whose solution can be used as the starting point of an iterative non-linear method with looser constraints. The results are refined by enforcing the global constraint that the projected trajectory of any 3D point should be a conic after compensating for the focusing and zooming effects. Finally, using a bundle adjustment method tailored to this special case (static camera, constant object rotation), the 3D structure of the object is recovered and the camera parameters are further refined simultaneously. To determine the accuracy and robustness of the proposed algorithm, we present results on both synthetic and real sequences.
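The central constraint can be written directly in code: each inter-frame essential matrix is E_i = K_{i+1}^T F_i K_i, and with a constant (even if unknown) rotation angle all E_i should agree up to scale and sign. The sketch below measures the residual of that equivalence for candidate calibration matrices, the quantity a solver would drive toward zero; the function names are ours, not the paper's.

```python
import numpy as np

def essential_from_fundamental(F, K1, K2):
    """E = K2^T F K1, normalized since essential matrices are scale-free."""
    E = K2.T @ F @ K1
    return E / np.linalg.norm(E)

def equivalence_residual(F_list, K_list):
    """How far the per-pair essential matrices are from being identical
    (up to scale and sign) under candidate calibrations K_list, where
    F_list[i] relates frame i to frame i+1."""
    Es = [essential_from_fundamental(F, K_list[i], K_list[i + 1])
          for i, F in enumerate(F_list)]
    E0 = Es[0]
    return sum(min(np.linalg.norm(E - E0), np.linalg.norm(E + E0))
               for E in Es[1:])
```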

Collaboration


Dive into Jiangjian Xiao's collaborations.

Top Co-Authors

Mubarak Shah

University of Central Florida

Hassan Foroosh

University of Central Florida
