
Publication


Featured research published by Jingyong Su.


Computer Vision and Pattern Recognition | 2015

Elastic functional coding of human actions: From vector-fields to latent variables

Rushil Anirudh; Pavan K. Turaga; Jingyong Su; Anuj Srivastava

Human activities observed from visual sensors often give rise to a sequence of smoothly varying features. In many cases, the space of features can be formally defined as a manifold, where the action becomes a trajectory on the manifold. Such trajectories are high dimensional in addition to being non-linear, which can severely limit computations on them. We also argue that, by their nature, human actions themselves lie on a much lower dimensional manifold than the high dimensional feature space. Learning an accurate low dimensional embedding for actions could have a huge impact in the areas of efficient search and retrieval, visualization, learning, and recognition. Traditional manifold learning addresses this problem for static points in ℝ^n, but its extension to trajectories on Riemannian manifolds is non-trivial and has remained unexplored. The challenge arises from the inherent non-linearity and temporal variability that can significantly distort the distance metric between trajectories. To address these issues we use the transported square-root velocity function (TSRVF) space, a recently proposed representation that provides a metric with favorable theoretical properties such as invariance to group action. We propose to learn the low dimensional embedding with a manifold functional variant of principal component analysis (mfPCA). We show that mfPCA effectively models the manifold trajectories in several applications such as action recognition, clustering, and diverse sequence sampling, while reducing the dimensionality by a factor of ~250×. The mfPCA features can also be reconstructed back to the original manifold to allow easy visualization of the latent variable space.
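The manifold-valued machinery is involved, but the core idea (represent each trajectory by its square-root velocity and run PCA on the vectorized result) can be sketched in the Euclidean special case. The functions below are illustrative simplifications, not the paper's mfPCA implementation:

```python
import numpy as np

def srvf(traj, dt=1.0):
    """Square-root velocity representation of a Euclidean trajectory.

    traj: (T, d) array of points sampled along the curve. Returns a
    (T-1, d) array q with q_i = v_i / sqrt(||v_i||), where v_i are
    finite-difference velocities.
    """
    v = np.diff(traj, axis=0) / dt
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    return v / np.sqrt(np.maximum(norms, 1e-12))

def pca_embed(trajs, k=2):
    """Embed a set of trajectories into k dimensions via PCA on their SRVFs."""
    X = np.stack([srvf(t).ravel() for t in trajs])   # (N, (T-1)*d)
    X = X - X.mean(axis=0)
    # principal directions from the SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T                              # (N, k) latent codes

# toy usage: 20 noisy 2-D trajectories, embedded into 2 latent dimensions
rng = np.random.default_rng(0)
trajs = [np.cumsum(rng.normal(size=(50, 2)), axis=0) for _ in range(20)]
codes = pca_embed(trajs, k=2)
print(codes.shape)  # (20, 2)
```

The manifold version additionally parallel-transports the velocity vectors to a common tangent space before flattening, which is what makes the representation well defined on curved spaces.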


Image and Vision Computing | 2012

Fitting smoothing splines to time-indexed, noisy points on nonlinear manifolds

Jingyong Su; Ian L. Dryden; Eric Klassen; Huiling Le; Anuj Srivastava

We address the problem of estimating full curves/paths on certain nonlinear manifolds using only a set of time-indexed points, for use in interpolation, smoothing, and prediction of dynamic systems. These curves are analogous to smoothing splines in Euclidean spaces: they are optimal under a similar objective function, a weighted sum of a fitting-related (data) term and a regularity-related (smoothing) term. The search for smoothing splines on manifolds is based on a Palais-metric-based steepest-descent algorithm developed in Samir et al. [38]. Using three representative manifolds (the rotation group for pose tracking, the space of symmetric positive-definite matrices for DTI analysis, and Kendall's shape space for video-based activity recognition), we demonstrate the effectiveness of the proposed algorithm for optimal curve fitting. This paper derives certain geometric elements on these manifolds, namely the exponential map and its inverse, parallel transport of tangents, and the curvature tensor, that are needed in the gradient-based search for smoothing splines. These ideas are illustrated using experimental results on both simulated and real data, with comparisons to current algorithms such as piecewise geodesic curves and splines on tangent spaces, including the method of Kume et al. [24].
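As a rough illustration of the geometric elements such algorithms need, here are the exponential map and its inverse on the unit sphere S^2, the simplest curved manifold (the paper itself works on the rotation group, SPD matrices, and Kendall's shape space); a minimal numpy sketch:

```python
import numpy as np

def exp_map(p, v):
    """Exponential map on the unit sphere: shoot from p along tangent v."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return p.copy()
    return np.cos(nv) * p + np.sin(nv) * (v / nv)

def log_map(p, q):
    """Inverse exponential map: tangent vector at p pointing toward q."""
    w = q - np.dot(p, q) * p          # project q onto the tangent space at p
    nw = np.linalg.norm(w)
    if nw < 1e-12:
        return np.zeros_like(p)
    theta = np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))
    return theta * w / nw             # length = geodesic distance p -> q

# round trip: log then exp recovers the target point
p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
v = log_map(p, q)
print(np.allclose(exp_map(p, v), q))  # True
```

In the gradient-based spline search, the log map supplies the data-term gradient (how to pull the curve toward an observed point) and the exp map takes the update step while staying on the manifold.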


Computer Vision and Pattern Recognition | 2014

Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition

Jingyong Su; Anuj Srivastava; Fillipe D. M. de Souza; Sudeep Sarkar

In statistical analysis of video sequences for speech recognition, and more generally activity recognition, it is natural to treat temporal evolutions of features as trajectories on Riemannian manifolds. However, different evolution patterns result in arbitrary parameterizations of these trajectories. We investigate a recent framework from the statistics literature that handles this nuisance variability using a cost function/distance for temporal registration and for statistical summarization and modeling of trajectories. It is based on a mathematical representation of trajectories, termed the transported square-root vector field (TSRVF), and the L2 norm on the space of TSRVFs. We apply this framework to the problem of speech recognition using both audio and visual components. In each case, we extract features, form trajectories on the corresponding manifolds, and compute parametrization-invariant distances using TSRVFs for speech classification. On the OuluVS database the classification performance under this metric increases significantly, by nearly 100%, under both modalities and for all choices of features. We obtained speaker-dependent classification rates of 70% and 96% for the visual and audio components, respectively.
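The continuous optimization over reparameterizations has a familiar discrete stand-in: dynamic-programming time warping. The sketch below aligns two Euclidean feature sequences this way; it is illustrative only and is not the TSRVF-based elastic metric used in the paper:

```python
import numpy as np

def rate_invariant_distance(A, B):
    """Dynamic-programming alignment of two feature sequences.

    A: (m, d), B: (n, d). Returns the cost of the best monotone time
    warping, a discrete stand-in for optimizing over reparameterizations
    of one trajectory relative to the other.
    """
    m, n = len(A), len(B)
    cost = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)  # (m, n)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n]

# a sequence and a time-rescaled copy of it: warping can only lower the cost
# relative to the rigid index-by-index comparison
t = np.linspace(0, 1, 40)
A = np.stack([np.sin(2 * np.pi * t), np.cos(2 * np.pi * t)], axis=1)
s = t ** 2                                  # nonlinear time change
B = np.stack([np.sin(2 * np.pi * s), np.cos(2 * np.pi * s)], axis=1)
pointwise = np.linalg.norm(A - B, axis=1).sum()
print(rate_invariant_distance(A, B) <= pointwise)  # True
```

The elastic framework replaces the per-step Euclidean cost with the L2 distance between TSRVFs, which makes the resulting distance a proper metric rather than a heuristic alignment score.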


Information Hiding | 2006

A Fire-Alarming Method Based on Video Processing

Ping-He Huang; Jingyong Su; Zhe-Ming Lu; Jeng-Shyang Pan

This paper presents a fire-alarming method based on video processing. We propose a system that uses color and motion information extracted from video sequences to detect fire. Flame can be recognized by its color, which is a primary element of fire images; choosing a suitable color model is thus key to detecting flames in fire images. An effective color-model-based fire detection criterion, derived through extensive experiments and training, is proposed in this paper. The detection criterion is first used to obtain a raw localization of fire regions. However, color alone is not enough for fire detection. To identify a real burning fire, dynamic features are usually adopted in addition to chromatic features to distinguish fire from its aliases. In this paper, both the growth of the fire region and the invariability of the flame are utilized to further verify fire regions, complementing the color criterion. The effectiveness of the proposed fire-alarming method is demonstrated by experiments on a large number of scenes.
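A raw color-based localization of candidate fire pixels can be sketched with a simple RGB rule. The thresholds below are hypothetical placeholders, not the criterion trained in the paper, and a real system would follow this stage with the dynamic checks described above:

```python
import numpy as np

def fire_candidate_mask(rgb, r_thresh=190, chroma_margin=20):
    """Raw localization of fire-colored pixels with a simple RGB rule.

    Flame pixels are typically bright with channel ordering R >= G > B;
    the thresholds here are illustrative, not trained values.
    rgb: (H, W, 3) uint8 image. Returns a boolean (H, W) mask.
    """
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r > r_thresh) & (r >= g) & (g > b + chroma_margin)

# toy frame: one flame-like pixel and one gray background pixel
frame = np.zeros((2, 2, 3), dtype=np.uint8)
frame[0, 0] = (255, 160, 40)   # bright orange, flame-like
frame[1, 1] = (120, 120, 120)  # gray, should be rejected
mask = fire_candidate_mask(frame)
print(mask[0, 0], mask[1, 1])  # True False
```

The temporal stage would then track the masked regions across frames and keep only those whose area evolves the way real flames do.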


Semantics, Knowledge and Grid | 2008

Pornographic Images Detection Based on CBIR and Skin Analysis

Bei-bei Liu; Jingyong Su; Zhe-Ming Lu; Zhen Li

A novel two-stage scheme for pornographic image detection is proposed in this paper. Specifically, we first apply content-based image retrieval to determine whether humans are present in the images. A detailed skin color analysis is then performed to confirm the presence of pornographic content. Experimental results show that the proposed algorithm detects pornographic images accurately and quickly.
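The second (skin analysis) stage can be approximated by measuring the fraction of pixels whose chroma falls in a common YCbCr skin range. The Cb/Cr bounds below are widely cited rules of thumb, not the paper's trained thresholds:

```python
import numpy as np

def skin_ratio(rgb):
    """Fraction of pixels whose YCbCr chroma lies in a common skin range.

    Uses the BT.601 RGB -> YCbCr conversion; the Cb/Cr bounds are
    illustrative rules of thumb. rgb: (H, W, 3) uint8 image.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    skin = (77 <= cb) & (cb <= 127) & (133 <= cr) & (cr <= 173)
    return skin.mean()

# toy check: a skin-toned patch scores high, a pure blue patch scores zero
skin_patch = np.full((4, 4, 3), (200, 140, 110), dtype=np.uint8)
blue_patch = np.full((4, 4, 3), (0, 0, 255), dtype=np.uint8)
print(skin_ratio(skin_patch) > 0.9, skin_ratio(blue_patch) == 0.0)  # True True
```

A classifier would threshold this ratio (and the shape of the skin regions) only on images the retrieval stage already flagged as containing people, which keeps the false-positive rate down on skin-colored scenery.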


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2017

Elastic Functional Coding of Riemannian Trajectories

Rushil Anirudh; Pavan K. Turaga; Jingyong Su; Anuj Srivastava

Visual observations of dynamic phenomena, such as human actions, are often represented as sequences of smoothly-varying features. In cases where the feature spaces can be structured as Riemannian manifolds, the corresponding representations become trajectories on manifolds. Analysis of these trajectories is challenging due to the non-linearity of the underlying spaces and the high dimensionality of the trajectories. In vision problems, given the nature of the physical systems involved, these phenomena are better characterized on a low-dimensional manifold than in the space of Riemannian trajectories. For instance, in data involving human action analysis, if one does not impose the physical constraints of the human body, the resulting representation space will have highly redundant features. Learning an effective, low-dimensional embedding for action representations will have a huge impact in the areas of search and retrieval, visualization, learning, and recognition. Traditional manifold learning addresses this problem for static points in Euclidean space, but its extension to Riemannian trajectories is non-trivial and remains unexplored. The difficulty lies in the inherent non-linearity of the domain and the temporal variability of actions, which can distort any traditional metric between trajectories. To overcome these issues, we use the framework based on transported square-root velocity fields (TSRVF); this framework has several desirable properties, including a rate-invariant metric and vector space representations. We propose to learn an embedding such that each action trajectory is mapped to a single point in a low-dimensional Euclidean space, and trajectories that differ only in temporal rates map to the same point. We utilize the TSRVF representation, and accompanying statistical summaries of Riemannian trajectories, to extend existing coding methods such as PCA, KSVD, and Label Consistent KSVD to Riemannian trajectories, or more generally to Riemannian functions. We show that such coding efficiently captures trajectories in applications such as action recognition, stroke rehabilitation, visual speech recognition, clustering, and diverse sequence sampling. Using this framework, we obtain state-of-the-art recognition results while reducing the dimensionality/complexity by a factor of 100-250×. Since these mappings and codes are invertible, they can also be used to interactively visualize Riemannian trajectories and synthesize actions.


Computational Statistics & Data Analysis | 2013

Detection, classification and estimation of individual shapes in 2D and 3D point clouds

Jingyong Su; Anuj Srivastava; Fred W. Huffer

The problems of detecting, classifying, and estimating shapes in point cloud data are important due to their general applicability in image analysis, computer vision, and graphics. They are challenging because the data is typically noisy, cluttered, and unordered. We study these problems using a fully statistical model in which the data is modeled by a Poisson process on the object's boundary (curves or surfaces), corrupted by additive noise and a clutter process. Using likelihood functions dictated by the model, we develop a generalized likelihood ratio test for detecting a shape in a point cloud. This ratio test is based on optimizing over some unknown parameters, including the pose and scale associated with hypothesized objects, and an empirical evaluation of the log-likelihood ratio distribution. Additionally, we develop a procedure for estimating the most likely shapes in observed point clouds under given shape hypotheses. We demonstrate this framework using examples of 2D and 3D shape detection and estimation in both real and simulated data, and a usage of this framework in shape retrieval from a 3D shape database.


Computer Vision and Pattern Recognition | 2015

Temporally coherent interpretations for long videos using pattern theory

Fillipe D. M. de Souza; Sudeep Sarkar; Anuj Srivastava; Jingyong Su

Graph-theoretical methods have successfully provided semantic and structural interpretations of images and videos. A recent paper introduced a pattern-theoretic approach that allows construction of flexible graphs for representing interactions of actors with objects, with inference accomplished by an efficient annealing algorithm. Actions and objects are termed generators, and their interactions are termed bonds; together they form high-probability configurations, or interpretations, of observed scenes. This work and other structural methods have generally been limited to analyzing short videos involving isolated actions. Here we provide an extension that uses additional temporal bonds across individual actions to enable semantic interpretations of longer videos. Longer temporal connections improve scene interpretations, as they help discard (temporally) local solutions in favor of globally superior ones. Using this extension, we demonstrate improvements in understanding longer videos, compared to individual interpretations of non-overlapping time segments. We verified the success of our approach by generating interpretations for more than 700 video segments from the YouCook data set, with intricate videos that exhibit cluttered backgrounds, occlusion, viewpoint variations, and changing illumination. Interpretations for long video segments yielded performance increases of about 70% and, in addition, proved more robust to severe classification errors.


International Conference on Pattern Recognition | 2014

Pattern Theory-Based Interpretation of Activities

Fillipe D. M. de Souza; Sudeep Sarkar; Anuj Srivastava; Jingyong Su


International Conference on Pattern Recognition | 2010

Detection of Shapes in 2D Point Clouds Generated from Images

Jingyong Su; Zhiqiang Zhu; Anuj Srivastava; Fred W. Huffer

Collaboration


Jingyong Su's top co-authors:

Sudeep Sarkar, University of South Florida
Eric Klassen, Florida State University
Fred W. Huffer, Florida State University
Zhengwu Zhang, Florida State University
Huiling Le, University of Nottingham
Jeng-Shyang Pan, Fujian University of Technology