Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Tai-Peng Tian is active.

Publication


Featured research published by Tai-Peng Tian.


european conference on computer vision | 2006

Monocular tracking of 3d human motion with a coordinated mixture of factor analyzers

Rui Li; Ming-Hsuan Yang; Stan Sclaroff; Tai-Peng Tian

Filtering-based algorithms have become popular in tracking human body pose. Such algorithms can suffer from the curse of dimensionality due to the high dimensionality of the pose state space; therefore, efforts have been dedicated to either smart sampling or reducing the dimensionality of the original pose state space. In this paper, a novel formulation that employs a dimensionality-reduced state space for multi-hypothesis tracking is proposed. During off-line training, a mixture of factor analyzers is learned. Each factor analyzer can be thought of as a “local dimensionality reducer” that locally approximates the pose manifold. Global coordination between local factor analyzers is achieved by learning a set of linear mixture functions that enforces agreement between local factor analyzers. The formulation allows easy bidirectional mapping between the original body pose space and the low-dimensional space. During online tracking, the clusters of factor analyzers are utilized in a multiple hypothesis tracking algorithm. Experiments demonstrate that the proposed algorithm tracks 3D body pose efficiently and accurately, even when self-occlusion, motion blur and large limb movements occur. Quantitative comparisons show that the formulation produces more accurate 3D pose estimates over time than those that can be obtained via a number of previously proposed particle-filtering-based tracking algorithms.
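
The centerpiece is a set of locally linear, bidirectional maps between the high-dimensional pose space and a learned low-dimensional space. As a rough sketch of that idea only (not the authors' code; the component means and loading matrices below are random placeholders rather than trained values):

```python
import numpy as np

# Sketch: a mixture of factor analyzers used as locally linear maps between a
# low-dimensional latent space and the pose space. In the paper these
# parameters come from off-line training on motion-capture data; here they
# are random placeholders.

rng = np.random.default_rng(0)
D, d, K = 30, 3, 4                    # pose dim, latent dim, number of analyzers
mu = rng.normal(size=(K, D))          # hypothetical component means in pose space
Lam = rng.normal(size=(K, D, d))      # hypothetical factor loading matrices

def latent_to_pose(z, k):
    """Locally linear 'decoder': lift a latent point through analyzer k."""
    return Lam[k] @ z + mu[k]

def pose_to_latent(x, k):
    """Locally linear 'encoder': least-squares projection onto analyzer k."""
    return np.linalg.lstsq(Lam[k], x - mu[k], rcond=None)[0]

# Round trip through one local model: pose -> latent -> pose.
x = mu[1] + Lam[1] @ rng.normal(size=d)
z = pose_to_latent(x, k=1)
print("reconstruction error:", np.linalg.norm(x - latent_to_pose(z, k=1)))
```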


computer vision and pattern recognition | 2005

Articulated Pose Estimation in a Learned Smooth Space of Feasible Solutions

Tai-Peng Tian; Rui Li; Stan Sclaroff

A learning-based framework is proposed for estimating human body pose from a single image. Given a differentiable function that maps from pose space to image feature space, the goal is to invert the process: estimate the pose given only image features. The inversion is an ill-posed problem because the inverse mapping is one-to-many, hence multiple solutions exist. It is desirable to restrict the solution space to a smaller subset of feasible solutions. The space of feasible solutions may not admit a closed-form description; the proposed framework seeks to learn an approximation over such a space using Gaussian Process Latent Variable Modelling (GPLVM). The scaled conjugate gradient method is used to find the best-matching pose in the learned space. The formulation allows easy incorporation of various constraints for more accurate pose estimation. The performance of the proposed approach is evaluated on the task of upper-body pose estimation from silhouettes and compared with the Specialized Mapping Architecture. The proposed approach achieves better estimation accuracy on synthetic data and gives qualitatively better results on real video of humans performing gestures.
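
The inversion step can be made concrete as minimizing a feature-matching error over the learned latent space. In the sketch below, the forward map and its parameters are invented stand-ins for the GPLVM prediction, and SciPy's plain conjugate-gradient optimizer stands in for scaled conjugate gradient:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch: pose estimation as inversion of a differentiable map from a learned
# low-dimensional space to image-feature space. forward() is a toy stand-in
# for the GPLVM mean prediction.

rng = np.random.default_rng(1)
d, m = 2, 10                       # latent dim, feature dim
W = rng.normal(size=(m, d))        # hypothetical learned weights

def forward(z):
    """Toy stand-in for the learned latent -> feature mapping."""
    return W @ np.tanh(z)

y_obs = forward(np.array([0.7, -1.2])) + 0.01 * rng.normal(size=m)

def objective(z):
    r = forward(z) - y_obs
    return 0.5 * float(r @ r)      # squared feature-matching error

res = minimize(objective, x0=np.zeros(d), method="CG")
print("estimated latent pose coordinates:", res.x)
```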


computer vision and pattern recognition | 2010

Fast globally optimal 2D human detection with loopy graph models

Tai-Peng Tian; Stan Sclaroff

This paper presents an algorithm for recovering the globally optimal 2D human figure detection using a loopy graph model. This is computationally challenging because the time complexity scales exponentially in the size of the largest clique in the graph. The proposed algorithm uses Branch and Bound (BB) to search for the globally optimal solution. The algorithm converges rapidly in practice, owing to a novel method for quickly computing tree-based lower bounds. The key idea is to recycle the dynamic programming (DP) tables associated with the tree model to look up the tree-based lower bound rather than recomputing the lower bound from scratch. This technique is further sped up using Range Minimum Query data structures to provide O(1) cost for computing the lower bound for most iterations of the BB algorithm. The algorithm is evaluated on the Iterative Parsing dataset and is shown to run fast empirically.
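
The O(1) lower-bound lookups rest on a standard Range Minimum Query structure. A minimal sparse-table RMQ of that kind might look as follows; the array values are arbitrary placeholders, whereas in the detector they would be entries of the recycled DP tables:

```python
# Sketch: sparse-table Range Minimum Query (RMQ). After O(n log n)
# preprocessing, the minimum over any contiguous range is returned in O(1).

class SparseTableRMQ:
    def __init__(self, values):
        n = len(values)
        self.table = [list(values)]            # level j stores window minima of length 2^j
        j = 1
        while (1 << j) <= n:
            prev = self.table[j - 1]
            half = 1 << (j - 1)
            self.table.append([min(prev[i], prev[i + half])
                               for i in range(n - (1 << j) + 1)])
            j += 1

    def query(self, lo, hi):
        """Minimum of values[lo..hi] (inclusive) in O(1)."""
        k = (hi - lo + 1).bit_length() - 1
        return min(self.table[k][lo], self.table[k][hi - (1 << k) + 1])

costs = [7, 3, 9, 1, 6, 4, 8, 2]
rmq = SparseTableRMQ(costs)
print(rmq.query(2, 6))   # minimum of costs[2..6] -> 1
```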


international conference on computer vision | 2007

Simultaneous Learning of Nonlinear Manifold and Dynamical Models for High-dimensional Time Series

Rui Li; Tai-Peng Tian; Stan Sclaroff

The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and conversely, the low-dimensional space allows dynamics to be learned efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. The proposed solution approximates the nonlinear manifold and dynamics using piecewise linear models. The interactions among the linear models are captured in a graphical model. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
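
A toy illustration of the piecewise-linear idea (not the paper's graphical model; the local models and the nearest-centre switching rule below are invented placeholders):

```python
import numpy as np

# Sketch: a nonlinear dynamical process in a low-dimensional latent space
# approximated by piecewise linear models. The active local model is chosen
# by proximity to a placeholder cluster centre, standing in for the
# graphical-model coordination learned in the paper.

rng = np.random.default_rng(2)
centres = np.array([[-2.0, 0.0], [2.0, 0.0]])   # hypothetical local-model centres
A = np.array([[[0.90, -0.20], [0.20, 0.90]],    # local transition matrices
              [[0.95, 0.10], [-0.10, 0.95]]])

def step(z):
    k = np.argmin(np.linalg.norm(centres - z, axis=1))   # pick the nearest local model
    return centres[k] + A[k] @ (z - centres[k]) + 0.02 * rng.normal(size=2)

z = np.array([-2.5, 1.0])
for _ in range(50):
    z = step(z)
print("final latent state:", z)
```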


International Journal of Computer Vision | 2010

3D Human Motion Tracking with a Coordinated Mixture of Factor Analyzers

Rui Li; Tai-Peng Tian; Stan Sclaroff; Ming-Hsuan Yang

A major challenge in applying Bayesian tracking methods for tracking 3D human body pose is the high dimensionality of the pose state space. It has been observed that the 3D human body pose parameters typically can be assumed to lie on a low-dimensional manifold embedded in the high-dimensional space. The goal of this work is to approximate the low-dimensional manifold so that a low-dimensional state vector can be obtained for efficient and effective Bayesian tracking. To achieve this goal, a globally coordinated mixture of factor analyzers is learned from motion capture data. Each factor analyzer in the mixture is a “locally linear dimensionality reducer” that approximates a part of the manifold. The global parametrization of the manifold is obtained by aligning these locally linear pieces in a global coordinate system. To enable automatic and optimal selection of the number of factor analyzers and the dimensionality of the manifold, a variational Bayesian formulation of the globally coordinated mixture of factor analyzers is proposed. The advantages of the proposed model are demonstrated in a multiple hypothesis tracker for tracking 3D human body pose. Quantitative comparisons on benchmark datasets show that the proposed method produces more accurate 3D pose estimates over time than those obtained from two previously proposed Bayesian tracking methods.
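
On the tracking side, a generic multiple-hypothesis (particle-style) tracker operating in the low-dimensional space can be sketched as below; the latent dynamics and observation model are toy stand-ins for the learned coordinated mixture of factor analyzers:

```python
import numpy as np

# Sketch: multiple-hypothesis tracking carried out in a low-dimensional
# latent space. Dynamics and observation model are toy placeholders.

rng = np.random.default_rng(3)
n_hyp, d = 100, 3                               # number of hypotheses, latent dim

def propagate(Z):
    return Z + 0.05 * rng.normal(size=Z.shape)  # toy latent dynamics

def likelihood(Z, obs):
    d2 = np.sum((Z - obs) ** 2, axis=1)         # toy observation model
    return np.exp(-0.5 * d2 / 0.5 ** 2) + 1e-12

Z = rng.normal(size=(n_hyp, d))                 # initial hypotheses
for t in range(20):
    obs = np.full(d, 0.05 * t)                  # synthetic observation sequence
    Z = propagate(Z)
    w = likelihood(Z, obs)
    w /= w.sum()
    Z = Z[rng.choice(n_hyp, size=n_hyp, p=w)]   # resample in proportion to weight
print("posterior mean latent state:", Z.mean(axis=0))
```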


international conference on document analysis and recognition | 2005

Tracking, analysis, and recognition of human gestures in video

Stan Sclaroff; Margrit Betke; George Kollios; Jonathan Alon; Vassilis Athitsos; Rui Li; John J. Magee; Tai-Peng Tian

An overview of research in automated gesture spotting, tracking and recognition by the Image and Video Computing Group at Boston University is given. Approaches for localizing and tracking human hands in video, estimating hand shape and upper-body pose, tracking head and facial motion, and efficiently spotting and recognizing specific gestures in video streams are summarized. Methods for efficient dimensionality reduction of gesture time series, boosting of classifiers for nearest neighbor search in pose space, and model-based pruning of gesture alignment hypotheses are described. Algorithms are demonstrated in three domains: American Sign Language, hand signals like those employed by flight directors on airport runways, and gesture-based interfaces for severely disabled users. The methods described are general and can be applied in other domains that require efficient detection and analysis of patterns in time series, images or video.
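
Gesture spotting and recognition ultimately involve aligning an observed feature time series against stored templates. As a generic illustration of such alignment only (plain dynamic time warping on synthetic sequences, not the group's specific method):

```python
import numpy as np

# Sketch: dynamic time warping (DTW) between two 1-D feature sequences,
# a standard way to compare gesture time series of different lengths.

def dtw_distance(a, b):
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a frame of a
                                 cost[i, j - 1],      # skip a frame of b
                                 cost[i - 1, j - 1])  # match the two frames
    return cost[n, m]

query = np.sin(np.linspace(0, np.pi, 20))        # observed gesture feature track
template = np.sin(np.linspace(0, np.pi, 30))     # stored gesture template
print("DTW distance:", dtw_distance(query, template))
```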


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Divide, Conquer and Coordinate: Globally Coordinated Switching Linear Dynamical System

Rui Li; Tai-Peng Tian; Stan Sclaroff

The goal of this work is to learn a parsimonious and informative representation for high-dimensional time series. Conceptually, this comprises two distinct yet tightly coupled tasks: learning a low-dimensional manifold and modeling the dynamical process. These two tasks have a complementary relationship as the temporal constraints provide valuable neighborhood information for dimensionality reduction and, conversely, the low-dimensional space allows dynamics to be learned efficiently. Solving these two tasks simultaneously allows important information to be exchanged mutually. If nonlinear models are required to capture the rich complexity of time series, then the learning problem becomes harder as the nonlinearities in both tasks are coupled. A divide, conquer, and coordinate method is proposed. The solution approximates the nonlinear manifold and dynamics using simple piecewise linear models. The interactions and coordinations among the linear models are captured in a graphical model. The model structure setup and parameter learning are done using a variational Bayesian approach, which enables automatic Bayesian model structure selection, hence solving the problem of overfitting. By exploiting the model structure, efficient inference and learning algorithms are obtained without oversimplifying the model of the underlying dynamical process. Evaluation of the proposed framework with competing approaches is conducted in three sets of experiments: dimensionality reduction and reconstruction using synthetic time series, video synthesis using a dynamic texture database, and human motion synthesis, classification, and tracking on a benchmark data set. In all experiments, the proposed approach provides superior performance.
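
A bare-bones switching linear dynamical system, the building block the paper coordinates, can be simulated as follows; the regimes and switch probabilities are toy values, whereas the paper learns both the parameters and the model structure variationally:

```python
import numpy as np

# Sketch: a switching linear dynamical system (SLDS). A discrete Markov
# switch state selects which local linear model drives the continuous state.

rng = np.random.default_rng(4)
A = [np.array([[0.99, -0.10], [0.10, 0.99]]),   # regime 0: slow rotation
     np.array([[0.90, 0.00], [0.00, 0.50]])]    # regime 1: contraction
P = np.array([[0.95, 0.05],                     # switch-state transition matrix
              [0.10, 0.90]])

s, x = 0, np.array([1.0, 0.0])
switches = []
for t in range(100):
    s = rng.choice(2, p=P[s])                   # sample the next discrete regime
    x = A[s] @ x + 0.01 * rng.normal(size=2)    # apply that regime's dynamics
    switches.append(s)
print("steps spent in regime 1:", sum(switches), "of", len(switches))
```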


computer vision and pattern recognition | 2011

Scale and rotation invariant matching using linearly augmented trees

Hao Jiang; Tai-Peng Tian; Stan Sclaroff

We propose a novel linearly augmented tree method for efficient scale and rotation invariant object matching. The proposed method enforces pairwise matching consistency defined on trees, and high-order constraints on all the sites of a template. The pairwise constraints admit arbitrary metrics, while the high-order constraints use L1 norms and therefore can be linearized. Such a linearly augmented tree formulation introduces hyperedges and loops into the basic tree structure, but unlike a general loopy graph, its special structure allows us to relax and decompose the optimization into a sequence of tree matching problems efficiently solvable by dynamic programming. The proposed method also works on continuous scale and rotation parameters, so matching can handle arbitrarily large scale changes with the same efficiency. Our experiments on ground-truth data and a variety of real images and videos show that the proposed method is efficient, accurate and reliable.
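
The decomposition leans on tree-matching subproblems that dynamic programming solves exactly. The sketch below shows min-sum message passing on a small tree-structured template with random placeholder costs; the high-order L1 terms of the full method are omitted:

```python
import numpy as np

# Sketch: exact min-sum dynamic programming on a tree-structured template.
# unary[p, s] is the cost of placing part p at site s; pairwise[p, i, j] is the
# cost of part p at site j given its parent at site i (random placeholders).

rng = np.random.default_rng(5)
n_parts, n_sites = 4, 6
parent = [-1, 0, 0, 1]                               # part 0 is the root
unary = rng.random((n_parts, n_sites))
pairwise = rng.random((n_parts, n_sites, n_sites))

def tree_match():
    """Minimum total cost over all joint placements of the parts."""
    msg = np.zeros((n_parts, n_sites))               # messages sent to each part
    for p in range(n_parts - 1, 0, -1):              # children before their parents
        child_cost = unary[p] + msg[p]
        msg[parent[p]] += (pairwise[p] + child_cost[None, :]).min(axis=1)
    return (unary[0] + msg[0]).min()

print("optimal matching cost:", tree_match())
```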


european conference on computer vision | 2010

Fast multi-aspect 2D human detection

Tai-Peng Tian; Stan Sclaroff

We address the problem of detecting human figures in images, taking into account that the image of the human figure may be taken from a range of viewpoints. We capture the geometric deformations of the 2D human figure using an extension of the Common Factor Model (CFM) of Lan and Huttenlocher. The key contribution of the paper is an improved iterative message passing inference algorithm that runs faster than the original CFM algorithm. It is based on the insight that messages created using the distance transform are shift invariant; therefore messages can be created once and then shifted for subsequent iterations. Since shifting (O(1) complexity) is faster than computing a distance transform (O(n) complexity), a significant speedup is observed in the experiments. We demonstrate the effectiveness of the new model for the human parsing problem using the Iterative Parsing data set, and results are competitive with the state-of-the-art detection algorithm of Andriluka et al.
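
The speed-up hinges on the shift property of these messages: the message for one relative part offset is just a shifted copy of the message for another. A small numerical check of that property, using brute-force messages and synthetic unary costs:

```python
import numpy as np

# Sketch: the message m(y) = min_x cost[x] + (y - x - offset)^2 depends on the
# offset only through a shift, so it can be computed once and re-used.

rng = np.random.default_rng(6)
n = 50
cost = rng.random(n)                  # placeholder unary costs for the child part
xs = np.arange(n)

def message(offset):
    """Brute-force squared-distance message for a given part offset."""
    ys = np.arange(n)
    return np.min(cost[None, :] + (ys[:, None] - xs[None, :] - offset) ** 2, axis=1)

m0 = message(0)                       # compute once (a distance transform in the paper)
m3 = message(3)                       # recomputed from scratch for a different offset
print(np.allclose(m3[3:], m0[:-3]))   # True: the new message is just a shifted copy
```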


workshop on applications of computer vision | 2005

Handsignals Recognition From Video Using 3D Motion Capture Data

Tai-Peng Tian; Stan Sclaroff

Hand signals are commonly used in applications such as giving instructions to a pilot for airplane takeoff or the direction of a crane operator by a foreman on the ground. A new algorithm for recognizing hand signals from a single camera is proposed. Typically, tracked 2D feature positions of hand signals are matched to 2D training images. In contrast, our approach matches the 2D feature positions to an archive of 3D motion capture sequences. The method avoids explicit reconstruction of the 3D articulated motion from 2D image features. Instead, the matching between the 2D and 3D sequences is done by backprojecting the 3D motion capture data into 2D. Experiments demonstrate the effectiveness of the approach in an example application: recognizing six classes of basketball referee hand signals in video.
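
The matching step can be pictured as projecting a 3D motion-capture frame into the image and scoring the 2D discrepancy against the tracked features. The sketch below uses a simple scaled orthographic camera and synthetic joints; it only illustrates the backprojection-and-score idea, not the paper's full matching algorithm:

```python
import numpy as np

# Sketch: score a 3D motion-capture frame against tracked 2D features by
# projecting its joints into the image plane and summing squared 2D errors.

rng = np.random.default_rng(7)
n_joints = 12
mocap_frame = rng.normal(size=(n_joints, 3))         # synthetic 3D mocap frame

def project(points_3d, scale=100.0, center=(320.0, 240.0)):
    """Scaled orthographic projection of 3D joints into image coordinates."""
    return scale * points_3d[:, :2] + np.array(center)

def frame_score(features_2d, mocap_3d):
    """Sum of squared 2D distances between tracked features and projected joints."""
    return float(np.sum((features_2d - project(mocap_3d)) ** 2))

# Tracked 2D features from video (here: the projection plus tracking noise).
observed = project(mocap_frame) + rng.normal(scale=2.0, size=(n_joints, 2))
print("matching score for this mocap frame:", frame_score(observed, mocap_frame))
```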

Collaboration


Dive into Tai-Peng Tian's collaborations.
