Ziyan Wu
Rensselaer Polytechnic Institute
Publication
Featured research published by Ziyan Wu.
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015
Ziyan Wu; Yang Li; Richard J. Radke
Human re-identification across cameras with non-overlapping fields of view is one of the most important and difficult problems in video surveillance and analysis. However, current algorithms are likely to fail in real-world scenarios for several reasons. For example, surveillance cameras are typically mounted high above the ground plane, causing serious perspective changes. Also, most algorithms approach matching across images using the same descriptors, regardless of camera viewpoint or human pose. Here, we introduce a re-identification algorithm that addresses both problems. We build a model for human appearance as a function of pose, using training data gathered from a calibrated camera. We then apply this “pose prior” in online re-identification to make matching and identification more robust to viewpoint. We further integrate person-specific features learned over the course of tracking to improve the algorithm's performance. We evaluate the performance of the proposed algorithm and compare it to several state-of-the-art algorithms, demonstrating superior performance on standard benchmarking datasets as well as a challenging new airport surveillance scenario.
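One way to see the effect of a pose prior is as a pose-dependent weighting of descriptor regions when computing a matching distance. The sketch below is purely illustrative (the descriptors, regions, and weights are invented for this example, not taken from the paper):

```python
import numpy as np

# Two appearance descriptors of the same person seen from different
# viewpoints: the torso region (first half) is stable, while the leg
# region (second half) is distorted by the overhead camera perspective.
probe = np.concatenate([np.full(4, 1.0), np.full(4, 0.2)])
gallery = np.concatenate([np.full(4, 1.0), np.full(4, 0.9)])

# Uniform weighting treats all descriptor regions equally.
d_uniform = np.linalg.norm(probe - gallery)

# A pose prior downweights regions known to be unreliable for this
# viewpoint (these weights are hypothetical, chosen for illustration).
w = np.concatenate([np.full(4, 1.0), np.full(4, 0.1)])
d_pose = np.linalg.norm(np.sqrt(w) * (probe - gallery))
```

Downweighting the viewpoint-distorted region shrinks the distance between two views of the same person, which is the intuition behind making matching "more robust to viewpoint."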
IEEE Transactions on Pattern Analysis and Machine Intelligence | 2013
Ziyan Wu; Richard J. Radke
Pan-tilt-zoom (PTZ) cameras are pervasive in modern surveillance systems. However, we demonstrate that the (pan, tilt) coordinates reported by PTZ cameras become inaccurate after many hours of operation, endangering tracking and 3D localization algorithms that rely on the accuracy of such values. To solve this problem, we propose a complete model for a PTZ camera that explicitly reflects how focal length and lens distortion vary as a function of zoom scale. We show how the parameters of this model can be quickly and accurately estimated using a series of simple initialization steps followed by a nonlinear optimization. Our method requires only 10 images to achieve accurate calibration results. Next, we show how the calibration parameters can be maintained using a one-shot dynamic correction process; this ensures that the camera returns the same field of view every time the user requests a given (pan, tilt, zoom), even after hundreds of hours of operation. The dynamic calibration algorithm is based on matching the current image against a stored feature library created at the time the PTZ camera is mounted. We evaluate the calibration and dynamic correction algorithms on both experimental and real-world datasets, demonstrating the effectiveness of the techniques.
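The core of the calibration model is treating intrinsics such as focal length as a smooth function of zoom scale, estimated from a handful of calibration images. A minimal sketch of that idea, using a low-order polynomial fit on hypothetical (zoom, focal length) samples rather than the paper's actual model or data:

```python
import numpy as np

# Hypothetical (zoom_scale, focal_length_px) calibration samples;
# the paper estimates such values from roughly 10 calibration images.
zoom = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
focal = np.array([800.0, 1610.0, 3190.0, 6420.0, 12800.0])

# Model focal length as a low-order polynomial in zoom scale and fit
# it by least squares (one simple choice of parametric model).
coeffs = np.polyfit(zoom, focal, deg=2)
f_of_z = np.poly1d(coeffs)

# Predict the focal length at an intermediate zoom setting.
f_at_6 = float(f_of_z(6.0))
```

With such a model in hand, the camera's field of view at any requested zoom can be predicted without re-calibrating at every zoom stop, which is what makes the one-shot dynamic correction practical.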
Workshop on Applications of Computer Vision | 2015
Yang Li; Ziyan Wu; Richard J. Radke
Human re-identification remains one of the fundamental, difficult problems in video surveillance and analysis. Current metric learning algorithms mainly focus on finding an optimized vector space such that observations of the same person in this space have a smaller distance than observations of two different people. In this paper, we propose a novel metric learning approach to the human re-identification problem, with an emphasis on the multi-shot scenario. First, we perform dimensionality reduction on image feature vectors through random projection. Next, a random forest is trained based on pairwise constraints in the projected subspace. This procedure repeats with a number of random projection bases, so that a series of random forests are trained in various feature subspaces. Finally, we select personalized random forests for each subject using their multi-shot appearances. We evaluate the performance of our algorithm on three benchmark datasets.
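The first step, random projection, is easy to sketch: a Gaussian matrix maps high-dimensional descriptors into a random subspace in which same-person pairs should remain closer than different-person pairs; the per-subspace classifiers are then trained on those pairwise constraints. The toy data below stands in for real re-id descriptors, and the distance check replaces the random-forest stage:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "image feature vectors": two observations each of two people in
# a 500-D space (stand-ins for real re-id appearance descriptors).
person_a = rng.normal(0.0, 0.1, size=(2, 500)) + 1.0
person_b = rng.normal(0.0, 0.1, size=(2, 500)) - 1.0

# Random projection: a Gaussian matrix mapping 500-D features into a
# 32-D subspace (one of many random bases the method would draw).
proj = rng.normal(0.0, 1.0 / np.sqrt(32), size=(500, 32))
pa = person_a @ proj
pb = person_b @ proj

# Pairwise distances in the projected subspace: same-person pairs
# should stay closer than different-person pairs, which is the
# constraint the per-subspace classifiers are trained to encode.
same_dist = np.linalg.norm(pa[0] - pa[1])
diff_dist = np.linalg.norm(pa[0] - pb[0])
```

Repeating this with many independent projection matrices yields the ensemble of feature subspaces from which personalized forests are selected.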
British Machine Vision Conference | 2015
Yang Li; Ziyan Wu; Srikrishna Karanam; Richard J. Radke
While much research in human re-identification has focused on the single-shot case, in real-world applications we are likely to have an image sequence from both the person to be matched and each candidate in the gallery, extracted from automated video tracking. It is desirable to take advantage of the multiple visual aspects (states) of each subject observed during training and testing. However, since each subject may spend different amounts of time in each state, equally weighting all the images in a sequence is likely to produce suboptimal performance. To address this problem, we introduce an algorithm to hierarchically cluster image sequences and use the representative data samples to learn a feature subspace maximizing the Fisher criterion. The clustering and subspace learning processes are applied iteratively to obtain diversity-preserving discriminative features. A metric learning step is then applied to bridge the appearance difference between two cameras. The proposed method is evaluated on three multi-shot re-id datasets, and the results outperform those of state-of-the-art methods.
IEEE Transactions on Circuits and Systems for Video Technology | 2017
Octavia I. Camps; Mengran Gou; Tom Hebble; Srikrishna Karanam; Oliver Lehmann; Yang Li; Richard J. Radke; Ziyan Wu; Fei Xiong
Over the past ten years, human re-identification has received increased attention from the computer vision research community. However, for the most part, these research papers are divorced from the context of how such algorithms would be used in a real-world system. This paper describes the unique opportunity our group of academic researchers had to design and deploy a human re-identification system in a demanding real-world environment: a busy airport. The system had to be designed from the ground up, including robust modules for real-time human detection and tracking, a distributed, low-latency software architecture, and a front-end user interface designed for a specific scenario. None of these issues are typically addressed in re-identification research papers, but all are critical to an effective system that end users would actually be willing to adopt. We detail the challenges of the real-world airport environment, the computer vision algorithms underlying our human detection and re-identification algorithms, our robust software architecture, and the ground-truthing system required to provide the training and validation data for the algorithms. Our initial results show that despite the challenges and constraints of the airport environment, the proposed system achieves very good performance while operating in real time.
Computer Vision and Pattern Recognition | 2011
Ziyan Wu; Richard J. Radke
We introduce an airport security checkpoint surveillance system using a camera network. The system tracks the movement of each passenger and carry-on bag, continuously maintains the association between bags and passengers, and verifies that passengers leave the checkpoint with the correct bags. We present methods for calibrating the camera network and tracking the many moving objects in the environment. We define a state machine for bag tracking and association, dividing the imaged area into several semantically meaningful regions. The real-time algorithms are validated on a full-scale simulation of a security checkpoint with several runs of volunteer groups, demonstrating high performance in a challenging environment.
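A state machine driven by semantically meaningful regions can be sketched very compactly. The states, regions, and events below are hypothetical, chosen to illustrate the bag-tracking idea rather than to reproduce the deployed system's design:

```python
from enum import Enum, auto

# Illustrative life cycle of a tracked carry-on bag at a checkpoint.
class BagState(Enum):
    WITH_OWNER = auto()
    ON_BELT = auto()
    IN_SCREENING = auto()
    AT_PICKUP = auto()
    RECLAIMED = auto()

# Transitions fire when the tracked bag enters a region (or its
# associated passenger reappears nearby); names are invented here.
TRANSITIONS = {
    (BagState.WITH_OWNER, "divest_region"): BagState.ON_BELT,
    (BagState.ON_BELT, "xray_region"): BagState.IN_SCREENING,
    (BagState.IN_SCREENING, "pickup_region"): BagState.AT_PICKUP,
    (BagState.AT_PICKUP, "owner_nearby"): BagState.RECLAIMED,
}

def step(state, event):
    # Events that are invalid in the current state leave it unchanged.
    return TRANSITIONS.get((state, event), state)

state = BagState.WITH_OWNER
for event in ["divest_region", "xray_region", "pickup_region", "owner_nearby"]:
    state = step(state, event)
```

A bag that never reaches a terminal state (e.g., one picked up without its owner nearby) is exactly the kind of anomaly such a machine makes easy to flag.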
International Conference on Distributed Smart Cameras | 2014
Yang Li; Ziyan Wu; Srikrishna Karanam; Richard J. Radke
Human re-identification across non-overlapping fields of view is one of the fundamental problems in video surveillance. While most reported research for this problem is focused on improving the matching rate between pairs of cropped rectangles around humans, the situation is quite different when it comes to creating a re-identification algorithm that operates robustly in the real world. In this paper, we describe an end-to-end re-identification system deployed in an airport environment, with a focus on the challenges posed by the real-world scenario. We discuss the high-level system design of the video surveillance application, and the issues we encountered during our development and testing. We also describe the algorithm framework for our human re-identification software, and discuss considerations of speed and matching performance. Finally, we report the results of an experiment conducted to illustrate the output of the developed software as well as its feasibility for the airport surveillance task.
Computer Vision and Pattern Recognition | 2012
Ziyan Wu; Richard J. Radke
We introduce two novel methods to improve the performance of wide area video surveillance applications by using scene features. First, we evaluate the drift in intrinsic and extrinsic parameters for typical pan-tilt-zoom (PTZ) cameras, which stems from accumulated mechanical and random errors after many hours of operation. When the PTZ camera is out of calibration, we show how the pose and internal parameters can be dynamically corrected by matching the scene features in the current image with a precomputed feature library. Experimental results show that the proposed method can keep a PTZ camera calibrated, even over long surveillance sequences. Second, we introduce a classifier to identify scene feature points, which can be used to improve robustness in tracking foreground objects and detect jitter in surveillance video sequences. We show that the classifier produces improved performance on the problem of detecting counterflow in real surveillance video.
International Symposium on Biomedical Imaging | 2016
Bor-Jeng Chen; Ziyan Wu; Shanhui Sun; Dong Zhang; Terrence Chen
Robust guidewire tracking is important in many clinical applications, yet it remains a very challenging problem. A guidewire is a thin, deformable medical device that usually appears in low-quality X-ray imagery. Moreover, distracters with stronger gradient responses may also be present in the scene. In this paper, we model the wire-like structure as a sequence of small segments and formulate guidewire tracking as a graph-based optimization problem that aims to find the optimal link set. To overcome distracters, we extract them from the dominant motion pattern and propose a confidence re-weighting process in the appearance measurement. Validation on 54 clinical X-ray sequences shows over a 50% reduction in mean tracking error compared with a state-of-the-art guidewire tracking method.
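Formulating the wire as a chain of small segments turns tracking into a shortest-path problem over candidate detections: each segment pays an appearance cost for the candidate it picks, and neighboring segments pay a smoothness cost for disagreeing. A toy Viterbi-style dynamic program over invented costs (the paper's actual graph, costs, and re-weighting are richer than this):

```python
# Candidate appearance costs per wire segment (rows = segments,
# entries = candidate detections); values are made up for illustration.
appearance = [
    [0.2, 1.0],   # segment 0
    [0.9, 0.1],   # segment 1
    [0.3, 0.8],   # segment 2
]

def smooth(i, j):
    # Penalize switching candidate indices between neighboring
    # segments (a stand-in for geometric continuity of the wire).
    return 0.5 * abs(i - j)

# Dynamic programming over the chain of segments.
best = list(appearance[0])
back = []
for costs in appearance[1:]:
    prev = best
    best, ptr = [], []
    for j, c in enumerate(costs):
        cands = [prev[i] + smooth(i, j) for i in range(len(prev))]
        i_star = min(range(len(cands)), key=cands.__getitem__)
        best.append(cands[i_star] + c)
        ptr.append(i_star)
    back.append(ptr)

# Backtrack the optimal link set.
j = min(range(len(best)), key=best.__getitem__)
path = [j]
for ptr in reversed(back):
    j = ptr[j]
    path.append(j)
path.reverse()
```

Here the chain prefers a slightly worse appearance match on segment 1 over breaking continuity, which is how the smoothness term suppresses strong-gradient distracters.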
International Conference on Document Analysis and Recognition | 2011
Elisa H. Barney Smith; Daniel P. Lopresti; George Nagy; Ziyan Wu
Resources are presented for fostering paper-based election technology. They comprise a diverse collection of real and simulated ballot and survey images, and software tools for ballot synthesis, registration, segmentation, and ground truthing. The grids underlying the designated location of voter marks are extracted from 13,315 degraded ballot images. The actual skew angles of sample ballots, recorded as part of complete ballot descriptions compiled with the interactive ground-truthing tool, are compared with their automatically extracted parameters. The average error is 0.1 degrees. These results provide a baseline for the application of digital image analysis to the scrutiny of electoral ballots.