Quanfu Fan | Researchain

Publication

Featured researches published by Quanfu Fan.

european conference on computer vision | 2016

A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection

Zhaowei Cai; Quanfu Fan; Rogério Schmidt Feris; Nuno Vasconcelos

A unified deep neural network, denoted the multi-scale CNN (MS-CNN), is proposed for fast multi-scale object detection. The MS-CNN consists of a proposal sub-network and a detection sub-network. In the proposal sub-network, detection is performed at multiple output layers, so that receptive fields match objects of different scales. These complementary scale-specific detectors are combined to produce a strong multi-scale object detector. The unified network is learned end-to-end, by optimizing a multi-task loss. Feature upsampling by deconvolution is also explored, as an alternative to input upsampling, to reduce the memory and computation costs. State-of-the-art object detection performance, at up to 15 fps, is reported on datasets, such as KITTI and Caltech, containing a substantial number of small objects.

computer vision and pattern recognition | 2009

Recognition of repetitive sequential human activity

Quanfu Fan; Russell P. Bobbitt; Yun Zhai; Akira Yanagawa; Sharath Pankanti; Arun Hampapur

We present a novel framework for recognizing repetitive sequential events performed by human actors with strong temporal dependencies and potential parallel overlap. Our solution incorporates sub-event (or primitive) detectors and a spatiotemporal model for sequential event changes. We develop an effective and efficient method to integrate primitives into a set of sequential events where strong temporal constraints are imposed on the ordering of the primitives. In particular, the combination process is approached as an optimization problem. A specialized Viterbi algorithm is designed to learn and infer the target sequential events and handle the event overlap simultaneously. To demonstrate the effectiveness of the proposed framework, we report detailed quantitative analysis on a large set of cashier checkout activities in a retail store.
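
As a rough illustration of how Viterbi decoding can impose an ordering on noisy primitive detections, the following sketch runs the textbook Viterbi algorithm on a toy cyclic model. It is not the specialized variant described in the abstract and it does not handle overlapping events. The primitive names (`pickup`, `scan`, `drop`) and all probabilities are assumptions chosen for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Missing transitions are treated as forbidden (probability 0).
            prob, prev = max(
                (best[-1][r][0] * trans_p[r].get(s, 0.0), r) for r in states
            )
            column[s] = (prob * emit_p[s][obs], prev)
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Hypothetical checkout cycle: pickup -> scan -> drop -> pickup -> ...
STATES = ["pickup", "scan", "drop"]
START = {"pickup": 1.0, "scan": 0.0, "drop": 0.0}
TRANS = {
    "pickup": {"pickup": 0.3, "scan": 0.7},
    "scan": {"scan": 0.3, "drop": 0.7},
    "drop": {"drop": 0.3, "pickup": 0.7},
}
# Each primitive detector fires correctly with probability 0.8 (assumed).
EMIT = {s: {o: (0.8 if o == s else 0.1) for o in STATES} for s in STATES}

# The detector misfires at the second step ("drop" instead of "scan").
noisy = ["pickup", "drop", "drop", "pickup"]
decoded = viterbi(noisy, STATES, START, TRANS, EMIT)
```

Because the toy model forbids a direct transition from `pickup` to `drop`, the decoder relabels the misfired second detection as `scan`.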

computer vision and pattern recognition | 2014

Temporal Sequence Modeling for Video Event Detection

Yu Cheng; Quanfu Fan; Sharath Pankanti; Alok N. Choudhary

We present a novel approach for event detection in video by temporal sequence modeling. Exploiting temporal information has lain at the core of many approaches for video analysis (e.g., action, activity and event recognition). Unlike previous works doing temporal modeling at semantic event level, we propose to model temporal dependencies in the data at sub-event level without using event annotations. This frees our model from ground truth and addresses several limitations in previous work on temporal modeling. Based on this idea, we represent a video by a sequence of visual words learnt from the video, and apply the Sequence Memoizer [21] to capture long-range dependencies in a temporal context in the visual sequence. This data-driven temporal model is further integrated with event classification for jointly performing segmentation and classification of events in a video. We demonstrate the efficacy of our approach on two challenging datasets for visual recognition.

advanced video and signal based surveillance | 2011

Modeling of temporarily static objects for robust abandoned object detection in urban surveillance

Quanfu Fan; Sharath Pankanti

We propose a robust approach for abandoned object detection in urban surveillance with thousands of cameras. For such large-scale monitoring based on intelligent video analysis, it is critical that a system be designed with careful control of false alarms. Our approach is based on proactive modeling of temporarily static objects (TSOs) such as cars stopped at a red light and pedestrians standing still in the street. We develop a finite state machine to track the entire life cycle of each TSO from creation to termination. The semantically meaningful object information provided by the state machine in turn allows adaptive region-level updating of the background model without using any sophisticated object classification techniques. We demonstrate that our approach significantly mitigates the problem of false alarms related to people in city surveillance, using both a small publicly available data set and a large one collected from various realistic urban scenarios.
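
A finite state machine that follows a static object from creation to termination could look roughly like the sketch below. The abstract does not list the actual states or events, so every state name, event name and transition here is hypothetical.

```python
from enum import Enum


class TsoState(Enum):
    # Hypothetical life-cycle states; the abstract does not enumerate them.
    CANDIDATE = "candidate"
    STATIC = "static"
    OCCLUDED = "occluded"
    TERMINATED = "terminated"


# (current state, event) -> next state; event names are assumptions.
TRANSITIONS = {
    (TsoState.CANDIDATE, "stayed_still"): TsoState.STATIC,
    (TsoState.CANDIDATE, "moved"): TsoState.TERMINATED,
    (TsoState.STATIC, "occluded"): TsoState.OCCLUDED,
    (TsoState.STATIC, "moved"): TsoState.TERMINATED,
    (TsoState.OCCLUDED, "visible_again"): TsoState.STATIC,
    (TsoState.OCCLUDED, "moved"): TsoState.TERMINATED,
}


def step(state, event):
    """Advance one step; unknown events leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=TsoState.CANDIDATE):
    """Return the full state history for a sequence of events."""
    history = [state]
    for event in events:
        state = step(state, event)
        history.append(state)
    return history


history = run(["stayed_still", "occluded", "visible_again", "moved"])
```

In this sketch `TERMINATED` has no outgoing transitions, so a terminated object stays terminated whatever events follow.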

international conference on image processing | 2011

Robust abandoned object detection using region-level analysis

Jiyan Pan; Quanfu Fan; Sharath Pankanti

We propose a robust abandoned object detection algorithm for real-time video surveillance. Different from conventional approaches that mostly rely on pixel-level processing, we perform region-level analysis in both background maintenance and static foreground object detection. In background maintenance, region-level information is fed back to adaptively control the learning rate. In static foreground object detection, region-level analysis double-checks the validity of candidate abandoned blobs. Owing to such analysis, our algorithm is robust against illumination change, “ghosts” left by removed objects, distractions from partially static objects, and occlusions. Experiments on nearly 130,000 frames of the i-LIDS dataset show the superior performance of our approach.
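
One simple way region-level feedback can control the background learning rate is to freeze the update inside regions flagged as static foreground, so that a candidate abandoned object is not absorbed into the background. The sketch below assumes a running-average background model and a binary region mask. Both are illustrative choices and are not taken from the paper.

```python
def update_background(background, frame, static_mask, base_rate=0.05):
    """Running-average background update with a per-pixel learning rate.

    background, frame: 2-D lists of grey-level values.
    static_mask: 2-D list of booleans, True inside regions flagged as
    static foreground (hypothetical output of a region-level analysis).
    """
    updated = []
    for bg_row, frame_row, mask_row in zip(background, frame, static_mask):
        row = []
        for bg, pixel, is_static in zip(bg_row, frame_row, mask_row):
            # Freeze learning inside flagged regions; learn normally elsewhere.
            rate = 0.0 if is_static else base_rate
            row.append((1.0 - rate) * bg + rate * pixel)
        updated.append(row)
    return updated


background = [[100.0, 100.0]]
frame = [[200.0, 200.0]]
static_mask = [[False, True]]
new_background = update_background(background, frame, static_mask, base_rate=0.5)
```

With `base_rate=0.5`, the unflagged pixel moves halfway toward the new frame while the flagged pixel keeps its old background value.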

international conference on computer vision | 2013

Relative Attributes for Large-Scale Abandoned Object Detection

Quanfu Fan; Prasad Gabbur; Sharath Pankanti

Effective reduction of false alarms in large-scale video surveillance is rather challenging, especially for applications where abnormal events of interest rarely occur, such as abandoned object detection. We develop an approach to prioritize alerts by ranking them, and demonstrate its great effectiveness in reducing false positives while keeping good detection accuracy. Our approach benefits from a novel representation of abandoned object alerts by relative attributes, namely staticness, foregroundness and abandonment. The relative strengths of these attributes are quantified using a ranking function [19] learnt on suitably designed low-level spatial and temporal features. These attributes of varying strengths are not only powerful in distinguishing abandoned objects from false alarms such as people and light artifacts, but also computationally efficient for large-scale deployment. With these features, we apply a linear ranking algorithm to sort alerts according to their relevance to the end-user. We test the effectiveness of our approach on both public data sets and large ones collected from the real world.

international conference on acoustics, speech, and signal processing | 2011

Detecting human activities in retail surveillance using hierarchical finite state machine

Hoang Trinh; Quanfu Fan; Jiyan Pan; Prasad Gabbur; Sachiko Miyazawa; Sharath Pankanti

Cashiers in retail stores usually exhibit certain repetitive and periodic activities when processing items. Detecting such activities plays a key role in most retail fraud detection systems. In this paper, we propose a highly efficient, effective and robust vision technique to detect checkout-related primitive activities, based on a hierarchical finite state machine (FSM). Our deterministic approach uses visual features and prior spatial constraints on the hand motion to capture particular motion patterns performed in primitive activities. We also apply our approach to the problem of retail fraud detection. Experimental results on a large set of video data captured from retail stores show that our approach, while much simpler and faster, achieves significantly better results than state-of-the-art machine learning-based techniques both in detecting checkout-related activities and in detecting checkout-related fraudulent incidents.

computer vision and pattern recognition | 2014

Random Laplace Feature Maps for Semigroup Kernels on Histograms

Jiyan Yang; Vikas Sindhwani; Quanfu Fan; Haim Avron; Michael W. Mahoney

With the goal of reducing the training and testing complexity of nonlinear kernel methods, several recent papers have proposed explicit embeddings of the input data into low-dimensional feature spaces, where fast linear methods can instead be used to generate approximate solutions. Analogous to random Fourier feature maps to approximate shift-invariant kernels, such as the Gaussian kernel, on R^d, we develop a new randomized technique called random Laplace features, to approximate a family of kernel functions adapted to the semigroup structure of R_+^d. This is the natural algebraic structure on the set of histograms and other non-negative data representations. We provide theoretical results on the uniform convergence of random Laplace features. Empirical analyses on image classification and surveillance event detection tasks demonstrate the attractiveness of using random Laplace features relative to several other feature maps proposed in the literature.
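
For context, the random Fourier feature construction that the abstract uses as its point of comparison can be sketched in a few lines. This shows the standard Fourier map for the Gaussian kernel, not the paper's random Laplace features. The function names, the number of features and the bandwidth `sigma` are assumptions for illustration.

```python
import math
import random


def make_rff_map(dim, num_features, sigma=1.0, seed=0):
    """Build a random Fourier feature map z with z(x).z(y) ~ exp(-|x-y|^2 / (2 sigma^2))."""
    rng = random.Random(seed)
    # Frequencies drawn from the Gaussian kernel's spectral density N(0, I / sigma^2).
    weights = [
        [rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
        for _ in range(num_features)
    ]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian kernel, used here only to check the approximation."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


x = [0.2, 0.5, 0.1]
y = [0.4, 0.1, 0.3]
z = make_rff_map(dim=3, num_features=4000)
approx = sum(a * b for a, b in zip(z(x), z(y)))
exact = gaussian_kernel(x, y)
```

The inner product of the two feature vectors approximates the exact kernel value, with an error that shrinks roughly as one over the square root of the number of features. A linear method can then be trained on `z(x)` in place of the nonlinear kernel.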

computer vision and pattern recognition | 2012

Hand tracking by binary quadratic programming and its application to retail activity recognition

Hoang Trinh; Quanfu Fan; Prasad Gabbur; Sharath Pankanti

Substantial ambiguities arise in hand tracking due to issues such as small hand size, deformable hand shapes and similar hand appearances. These issues have greatly limited the capability of current multi-target tracking techniques in hand tracking. As an example, state-of-the-art approaches for people tracking handle identity switching by exploiting the appearance cues using advanced object detectors. For hand tracking, such approaches will fail due to similar, or even identical hand appearances. The main contribution of our work is a global optimization framework based on binary quadratic programming (BQP) that seamlessly integrates appearance, motion and complex interactions between hands. Our approach effectively handles key challenges such as occlusion, detection failure, identity switching, and robustly tracks both hands in two challenging real-life scenarios: retail surveillance and sign language. In addition, we demonstrate that an automatic method based on hand trajectory analysis outperforms state-of-the-art on checkout-related activity recognition in grocery stores.

ieee intelligent vehicles symposium | 2016

A closer look at Faster R-CNN for vehicle detection

Quanfu Fan; Lisa M. Brown; John R. Smith

Faster R-CNN achieves state-of-the-art performance on generic object detection. However, a simple application of this method to a large vehicle dataset performs unimpressively. In this paper, we take a closer look at this approach as it applies to vehicle detection. We conduct a wide range of experiments and provide a comprehensive analysis of the underlying structure of this model. We show that through suitable parameter tuning and algorithmic modification, we can significantly improve the performance of Faster R-CNN on vehicle detection and achieve competitive results on the KITTI vehicle dataset. We believe our studies are instructive for other researchers investigating the application of Faster R-CNN to their problems and datasets.
