Publication


Featured research published by Steve Gu.


Asian Conference on Computer Vision | 2010

Efficient visual object tracking with online nearest neighbor classifier

Steve Gu; Ying Zheng; Carlo Tomasi

A tracking-by-detection framework is proposed that combines nearest-neighbor classification of bags of features, efficient subwindow search, and a novel feature selection and pruning method to achieve stability and plasticity in tracking targets of changing appearance. Experiments show that near-frame-rate performance is achieved (sans feature detection), and that the state of the art is improved in terms of handling occlusions, clutter, changes of scale, and of appearance. A theoretical analysis shows why nearest neighbor works better than more sophisticated classifiers in the context of tracking.


International Conference on Computer Vision | 2011

Detailed reconstruction of 3D plant root shape

Ying Zheng; Steve Gu; Herbert Edelsbrunner; Carlo Tomasi; Philip N. Benfey

We study the 3D reconstruction of plant roots from multiple 2D images. To meet the challenge caused by the delicate nature of thin branches, we make three innovations to cope with the sensitivity to image quality and calibration. First, we model the background as a harmonic function to improve the segmentation of the root in each 2D image. Second, we develop the concept of the regularized visual hull which reduces the effect of jittering and refraction by ensuring consistency with one 2D image. Third, we guarantee connectedness through adjustments to the 3D reconstruction that minimize global error. Our software is part of a biological phenotype/genotype study of agricultural root systems. It has been tested on more than 40 plant roots and results are promising in terms of reconstruction quality and efficiency.


European Conference on Computer Vision | 2010

Critical nets and beta-stable features for image matching

Steve Gu; Ying Zheng; Carlo Tomasi

We propose new ideas and efficient algorithms towards bridging the gap between bag-of-features and constellation descriptors for image matching. Specifically, we show how to compute connections between local image features in the form of a critical net whose construction is repeatable across changes of viewing conditions or scene configuration. Arcs of the net provide a more reliable frame of reference than individual features do for the purpose of invariance. In addition, regions associated with either small stars or loops in the critical net can be used as parts for recognition or retrieval, and subgraphs of the critical net that are matched across images exhibit common structures shared by different images. We also introduce the notion of beta-stable features, a variation on the notion of feature lifetime from the literature of scale space. Our experiments show that arc-based SIFT-like descriptors of beta-stable features are more repeatable and more accurate than competing descriptors. We also provide anecdotal evidence of the usefulness of image parts and of the structures that are found to be common across images.


International Conference on Computer Vision | 2011

Linear time offline tracking and lower envelope algorithms

Steve Gu; Ying Zheng; Carlo Tomasi

Offline tracking of visual objects is particularly helpful in the presence of significant occlusions, when a frame-by-frame, causal tracker is likely to lose sight of the target. In addition, the trajectories found by offline tracking are typically smoother and more stable because of the global optimization this approach entails. In contrast with previous work, we show that this global optimization can be performed in O(MNT) time for T frames of video at M × N resolution, with the help of the generalized distance transform developed by Felzenszwalb and Huttenlocher [13]. Recognizing the importance of this distance transform, we extend the computation to a more general lower envelope algorithm in certain heterogeneous ℓ1-distance metric spaces. The generalized lower envelope algorithm is of complexity O(MN(M+N)) and is useful for a more challenging offline tracking problem. Experiments show that trajectories found by offline tracking are superior to those computed by online tracking methods, and are computed at 100 frames per second.
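
The generalized distance transform mentioned in this abstract can be illustrated in one dimension. The following is a minimal sketch of the well-known lower-envelope-of-parabolas method for the squared Euclidean cost, not the paper's tracking code or its l1 extension. The function name `distance_transform_1d` is chosen here for illustration, and the input costs are assumed to be finite numbers.

```python
import math


def distance_transform_1d(f):
    """Return d with d[p] = min over q of ((p - q)**2 + f[q]), in O(n) time.

    Illustrative sketch only; assumes every f[q] is a finite number.
    """
    n = len(f)
    if n == 0:
        return []
    v = [0] * n            # v[k]: grid position of the k-th parabola in the envelope
    z = [0.0] * (n + 1)    # z[k], z[k+1]: range where parabola k is the lowest
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Abscissa where the parabolas rooted at p and q intersect.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola k is hidden by the new one; discard it
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    # Read the envelope back at every grid position.
    d = [0] * n
    k = 0
    for p in range(n):
        while z[k + 1] < p:
            k += 1
        d[p] = (p - v[k]) ** 2 + f[v[k]]
    return d


costs = [3, 0, 4, 1, 5]
print(distance_transform_1d(costs))
```

Each grid position enters and leaves the envelope at most once, which is where the linear running time comes from. Applying the 1D pass along rows and then columns gives a 2D transform for separable costs.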


Computer Vision and Pattern Recognition | 2011

Branch and track

Steve Gu; Carlo Tomasi

We present a new paradigm for tracking objects in video in the presence of other similar objects. This branch-and-track paradigm is also useful in the absence of motion, for the discovery of repetitive patterns in images. The object of interest is the lead object and the distracters are extras. The lead tracker branches out trackers for extras when they are detected, and all trackers share a common set of features. Sometimes, extras are tracked because they are of interest in their own right. In other cases, and perhaps more importantly, tracking extras makes tracking the lead nimbler and more robust, both because shared features provide a richer object model, and because tracking extras accounts for sources of confusion explicitly. Sharing features also makes joint tracking less expensive, and coordinating tracking across lead and extras allows optimizing window positions jointly rather than separately, for better results. The joint tracking of both lead and extras can be solved optimally by dynamic programming and branching is quickly determined by efficient subwindow search. Matlab experiments show near real time performance at 5–30 frames per second on a single-core laptop for 240 by 320 images.


ACM Multimedia | 2011

Detecting motion synchrony by video tubes

Ying Zheng; Steve Gu; Carlo Tomasi

Motion synchrony, i.e., the coordinated motion of a group of individuals, is an interesting phenomenon in nature and daily life. Fish swim in schools, birds fly in flocks, soldiers march in platoons, etc. Our goal is to detect motion synchrony that may be present in video data, and to track the group of moving objects as a whole. This opens the door to novel algorithms and applications. To this end, we model individual motions as video tubes in space-time, define motion synchrony by the geometric relation among video tubes, and track a whole set of tubes by dynamic programming. The resulting algorithm is highly efficient in practice. Given a video clip of T frames of resolution X × Y, we show that finding the K spatially correlated video tubes and determining the presence of synchrony can be solved optimally in O(XYTK) time. Preliminary experiments show that our method is both effective and efficient. Typical running times are 30 to 100 VGA-resolution frames per second after feature extraction, and the accuracy for the detection of synchrony is more than 90% as evaluated on our annotated data set.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Topological persistence on a Jordan curve

Ying Zheng; Steve Gu; Carlo Tomasi

Topological persistence measures the resilience of extrema of a function to perturbations, and has received increasing attention in computer graphics, visualization and computer vision. While the notion of topological persistence for piecewise linear functions defined on a simplicial complex has been well studied, the time complexity of all the known algorithms is super-linear (e.g. O(n log n)) in the size n of the complex. We give an O(n) algorithm to compute topological persistence for a function defined on a Jordan curve. To the best of our knowledge, our algorithm is the first to attain linear asymptotic complexity, and is asymptotically optimal. We demonstrate the usefulness of persistence in shape abstraction and compression.
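
To make the notion of persistence on a closed curve concrete, here is a minimal sketch that pairs local minima with the values at which they merge into an older component. It uses the standard union-find approach with sorting, so it runs in O(n log n) and is not the O(n) algorithm of the paper. The function name `persistence_pairs_on_cycle` is hypothetical, and the curve is assumed to be given as a cyclic list of sampled function values.

```python
def persistence_pairs_on_cycle(values):
    """Pair each non-global local minimum with the value that kills it.

    Sweeps the sublevel sets of a function sampled on a closed curve
    (a cyclic sequence). Illustrative O(n log n) sketch, not the paper's method.
    """
    n = len(values)
    parent = [-1] * n          # -1 marks a sample not yet swept
    lowest = list(range(n))    # for a root: index of its component's minimum

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda j: values[j]):
        parent[i] = i
        for nb in ((i - 1) % n, (i + 1) % n):
            if parent[nb] == -1:
                continue
            a, b = find(i), find(nb)
            if a == b:
                continue
            # Elder rule: the component with the higher minimum dies.
            if values[lowest[a]] > values[lowest[b]]:
                a, b = b, a
            young = lowest[b]
            if values[young] < values[i]:   # skip zero-persistence pairs
                pairs.append((values[young], values[i]))
            parent[b] = a
    return pairs


samples = [0, 3, 1, 4, 2, 5]
print(persistence_pairs_on_cycle(samples))
```

The global minimum is never paired in this sketch because its component survives the whole sweep. Pairs with a small difference between death and birth correspond to extrema that a small perturbation could remove, which is what persistence-based simplification discards.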


European Conference on Computer Vision | 2012

Fast tiered labeling with topological priors

Ying Zheng; Steve Gu; Carlo Tomasi

We consider labeling an image with multiple tiers. Tiers, one on top of another, enforce a strict vertical order among objects (e.g. sky is above the ground). Two new ideas are explored. First, under a simplification of the general tiered labeling framework proposed by Felzenszwalb and Veksler [1], we design an efficient O(KN) algorithm for the approximately optimal labeling of an image of N pixels with K tiers. Our algorithm runs at over 100 frames per second on images of VGA resolution when K is less than 6. When K=3, our solution overlaps with the globally optimal one by Felzenszwalb and Veksler in over 99% of all pixels but runs 1000 times faster. Second, we define a topological prior that specifies the number of local extrema in the tier boundaries, and give an O(NM) algorithm to find a single, optimal tier boundary with exactly M local maxima and minima. These two extensions enrich the general tiered labeling framework and enable fast computation. The proposed topological prior further improves the accuracy in labeling details.


International Conference on Acoustics, Speech, and Signal Processing | 2012

Shape from point features

Steve Gu; Ying Zheng; Carlo Tomasi

We present a nonparametric and efficient method for shape localization that improves on the traditional sub-window search in capturing the fine geometry of an object from a small number of feature points. Our method implies that the discrete set of features captures more appearance and shape information than is commonly exploited. We use the α-complex by Edelsbrunner et al. to build a filtration of simplicial complexes from a user-provided set of features. The optimal value of α is determined automatically by a search for the densest complex connected component, resulting in a parameter-free algorithm. Given K features, localization occurs in O(K log K) time. For VGA-resolution images, computation typically takes less than 10 milliseconds. We use our method for interactive object cut, with promising results.


European Conference on Computer Vision | 2012

Nested pictorial structures

Steve Gu; Ying Zheng; Carlo Tomasi

We propose a theoretical construct coined nested pictorial structure to represent an object by parts that are recursively nested. Three innovative ideas are proposed: First, the nested pictorial structure finds a part configuration that is allowed to be deformed in geometric arrangement, while being confined to be topologically nested. Second, we define nested features which lend themselves to better, more detailed accounting of pixel data cost and describe occlusion in a principled way. Third, we develop the concept of constrained distance transform, a variation of the generalized distance transform, to guarantee the topological nesting relations and to further enforce that parts have no overlap with each other. We show that matching an optimal nested pictorial structure of K parts on an image of N pixels takes O(NK) time using dynamic programming and constrained distance transform. In our MATLAB/C++ implementation, it takes less than 0.1 seconds to do the globally optimal matching when K = 10 and N = 400 × 400. We demonstrate the usefulness of nested pictorial structures in the matching of objects of nested patterns, objects in occlusion, and objects that live in a context.

Collaboration


Explore Steve Gu's collaborations.

Top Co-Authors

Herbert Edelsbrunner

Institute of Science and Technology Austria
