Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Niels Haering is active.

Publications


Featured research published by Niels Haering.


IEEE Transactions on Circuits and Systems for Video Technology | 2000

A semantic event-detection approach and its application to detecting hunts in wildlife video

Niels Haering; Richard J. Qian; M.I. Sezan

We propose a three-level video-event detection methodology and apply it to animal-hunt detection in wildlife documentaries. The first level extracts color, texture, and motion features, and detects shot boundaries and moving object blobs. The mid-level employs a neural network to determine the object class of the moving object blobs; it also generates shot descriptors that combine features from the first level with inferences from the mid-level. The shot descriptors are then used by the domain-specific inference process at the third level to detect video segments that match the user-defined event model. The method can be adapted to different events by retraining the classifier at the intermediate level and by specifying a new event model at the highest level. Event-based video indexing, summarization, and browsing are among the applications of the proposed approach.
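
A minimal Python sketch of this three-level architecture, with hypothetical placeholder components: the paper's color/texture/motion extractors, shot-boundary detector, neural-network blob classifier, and domain-specific inference are abstracted behind the `classify` and `event_model` callables.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class ShotDescriptor:
    shot_id: int
    object_class: str        # level-2 label, e.g. "animal" (hypothetical)
    motion_magnitude: float  # level-1 global motion estimate

def detect_events(shots: List[dict],
                  classify: Callable[[dict], str],
                  event_model: Callable[[List[ShotDescriptor]], list]) -> list:
    """shots: one low-level feature dict per shot, with a 'motion' entry."""
    descriptors = [
        ShotDescriptor(i, classify(feats), feats["motion"])  # levels 1 -> 2
        for i, feats in enumerate(shots)
    ]
    return event_model(descriptors)  # level 3: match the event model
```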


Computer Vision and Pattern Recognition | 1999

A computational approach to semantic event detection

Richard J. Qian; Niels Haering; M. Ibrahim Sezan

We propose a three-level video event detection algorithm and apply it to animal hunt detection in wildlife documentaries. The first level extracts texture, color, and motion features, and detects motion blobs. The mid-level employs a neural network to verify whether the motion blobs belong to objects of interest; it also generates shot summaries in terms of intermediate-level descriptors that combine the low-level features with the results of mid-level, domain-specific inferences made on the basis of shot features. The shot summaries are then used by a domain-specific inference process at the third level to detect the video segments that contain events of interest, e.g., hunts. Event-based video indexing, summarization, and browsing are among the applications of the proposed approach.
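
Continuing the sketch above, an illustrative stand-in for the level-3 domain inference: flag a hunt candidate when shots classified as "animal" show sustained fast motion that then drops. The rule, label, and thresholds are hypothetical simplifications, not the paper's actual event model; the function is usable as `event_model` in the earlier sketch.

```python
def detect_hunts(descriptors, fast=0.5, min_chase=2):
    """Flag (first_chase_shot, resolution_shot) pairs as hunt candidates."""
    hunts, chase = [], []
    for d in descriptors:
        if d.object_class == "animal" and d.motion_magnitude >= fast:
            chase.append(d.shot_id)              # chase continues
        else:
            if len(chase) >= min_chase:          # sustained chase, then stop
                hunts.append((chase[0], d.shot_id))
            chase = []
    return hunts
```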


Computer Vision and Image Understanding | 1999

Features and Classification Methods to Locate Deciduous Trees in Images

Niels Haering; Niels da Vitoria Lobo

We compare features and classification methods to locate deciduous trees in images. From this comparison we conclude that a back-propagation neural network achieves better classification results than the other classifiers we tested. Our analysis of the relevance of 51 features from seven feature extraction methods, based on the gray-level co-occurrence matrix, Gabor filters, fractal dimension, steerable filters, the Fourier transform, entropy, and color, shows that each feature contributes important information. We show how we obtain a 13-feature subset that significantly reduces the feature extraction time while retaining most of the complete feature set's power and robustness. The best subsets of features were found to be combinations of features from each of the extraction methods. Methods for classification and feature-relevance determination that are based on the covariance or correlation matrix of the features (such as eigenanalyses or linear or quadratic classifiers) generally cannot be used, since even small sets of features are usually highly linearly redundant, rendering their covariance or correlation matrices too singular to be invertible. We argue that representing deciduous trees, and many other objects, by rich image descriptions can significantly aid their classification. We make no assumptions about shape, location, viewpoint, viewing distance, lighting conditions, or camera parameters, and we only expect scanning methods and compression schemes to retain a “reasonable” image quality.
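
A minimal sketch, assuming scikit-image, of two of the seven feature families compared here: gray-level co-occurrence statistics and Gabor filter energies. The offsets, angles, and frequencies are illustrative choices, not the paper's settings.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops
from skimage.filters import gabor

def texture_features(gray_u8):
    """gray_u8: 2-D uint8 image region; returns a small feature vector."""
    # Gray-level co-occurrence statistics at a few offsets/orientations.
    glcm = graycomatrix(gray_u8, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    feats = [graycoprops(glcm, p).ravel()
             for p in ("contrast", "homogeneity", "energy")]
    # Mean Gabor response magnitude at two illustrative frequencies.
    img = gray_u8.astype(float) / 255.0
    for freq in (0.1, 0.3):
        real, imag = gabor(img, frequency=freq)
        feats.append([np.hypot(real, imag).mean()])
    return np.concatenate(feats)
```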


Archive | 2001

Visual Event Detection

Niels Haering; Niels da Vitoria Lobo

Contents: 1. Introduction. 2. A Framework for the Design of Visual Event Detectors. 3. Features and Classification Methods. 4. Results. 5. Summary and Discussion of Alternatives. A. Appendix. References. Index.


Computer Vision and Pattern Recognition | 2008

SAVE: A framework for semantic annotation of visual events

Mun Wai Lee; Asaad Hakeem; Niels Haering; Song-Chun Zhu

In this paper we propose a framework that performs automatic semantic annotation of visual events (SAVE). This is an enabling technology for content-based video annotation, query, and retrieval, with applications in Internet video search and video data mining. The method involves identifying objects in the scene, describing their inter-relations, detecting events of interest, and representing them semantically in a human-readable and queryable format. The SAVE framework is composed of three main components. The first component is an image parsing engine that performs scene content extraction using bottom-up image analysis and a stochastic attribute image grammar, where we define a visual vocabulary of pixels, primitives, parts, objects, and scenes, and specify their spatio-temporal and compositional relations; a combined bottom-up/top-down strategy is used for inference. The second component is an event inference engine, where the video event markup language (VEML) is adopted for semantic representation, and a grammar-based approach is used for event analysis and detection. The third component is a text generation engine that produces text reports using head-driven phrase structure grammar (HPSG). The main contribution of this paper is a framework for an end-to-end system that infers visual events and annotates a large collection of videos. Experiments with maritime and urban scenes indicate the feasibility of the proposed approach.
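
A hedged sketch of the representation and reporting side only: a detected event stored in a small VEML-like record and rendered as text. The actual system emits VEML and uses HPSG-based generation; this dataclass and template are simplified stand-ins, and the field values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VisualEvent:          # simplified stand-in for a VEML event record
    event_type: str         # e.g. "approaches"
    agent: str              # e.g. "boat-3"
    patient: str            # e.g. "the pier"
    start_frame: int
    end_frame: int

def to_text(e: VisualEvent) -> str:
    # Template-based rendering; the paper uses HPSG-based generation.
    return (f"Between frames {e.start_frame} and {e.end_frame}, "
            f"{e.agent} {e.event_type} {e.patient}.")

print(to_text(VisualEvent("approaches", "boat-3", "the pier", 120, 310)))
# -> Between frames 120 and 310, boat-3 approaches the pier.
```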


Workshop on Applications of Computer Vision | 2008

Self Calibrating Visual Sensor Networks

Khurram Shafique; Asaad Hakeem; Omar Javed; Niels Haering

This paper presents an unsupervised, data-driven scheme to automatically estimate the relative topology of overlapping cameras in a large visual sensor network. The proposed method learns the camera topology by employing the statistics of co-occurring observations (of moving targets) in each sensor. Since target observation data is typically very noisy in realistic scenarios, an efficient two-step method is used for robust estimation of the planar homography between camera views. In the first step, modes in the co-occurrence data are learned using mean-shift. In the second step, a RANSAC-based procedure is used to estimate the homography from the weighted co-occurrence modes. The first step not only lessens the effects of noise but also reduces the search space, making the calculation efficient. Unlike most existing algorithms for overlapping camera calibration, the proposed method uses an update mechanism to adapt online to changes in network topology. The method does not assume prior knowledge about the scene, target, or network properties. It is also robust to noise, traffic intensity, and the amount of overlap between the fields of view. Experiments and quantitative evaluation using both synthetic and real data are presented to support the above claims.
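
A minimal sketch of the two-step estimation, assuming scikit-learn for the mean-shift step and OpenCV for the RANSAC homography fit; the bandwidth and reprojection threshold are illustrative values, not the paper's.

```python
import cv2
import numpy as np
from sklearn.cluster import MeanShift

def estimate_homography(cooccurrences):
    """cooccurrences: (N, 4) array of simultaneous target observations
    (x1, y1) in camera 1 and (x2, y2) in camera 2, typically noisy."""
    # Step 1: mean-shift modes summarize the noisy co-occurrence data.
    modes = MeanShift(bandwidth=20.0).fit(cooccurrences).cluster_centers_
    src = modes[:, :2].astype(np.float32)   # needs >= 4 modes
    dst = modes[:, 2:].astype(np.float32)
    # Step 2: RANSAC rejects modes from non-corresponding observations.
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, inlier_mask
```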


Computer Vision and Pattern Recognition | 2008

A rank constrained continuous formulation of multi-frame multi-target tracking problem

Khurram Shafique; Mun Wai Lee; Niels Haering

This paper presents a multi-frame data association algorithm for tracking multiple targets in video sequences. Multi-frame data association involves finding the most probable correspondences between target tracks and measurements (collected over multiple time instances) as well as handling common tracking problems such as track initiations and terminations, occlusions, and noisy detections. The problem is known to be NP-hard for more than two frames. A rank-constrained continuous formulation of the problem is presented that can be efficiently solved using nonlinear optimization methods. It is shown that the global and local extrema of the continuous problem respectively coincide with the maximum and the maximal solutions of its discrete counterpart. Using this formulation, a scanning-window tracking algorithm is developed that performs well under noisy conditions with frequent occlusions and multiple track initiations and terminations. The above claims are supported by experiments and quantitative evaluations using both synthetic and real data under different operating conditions.
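
The multi-frame problem is NP-hard, and the paper's rank-constrained continuous relaxation is not reproduced here; as a point of reference, this sketch solves the classical two-frame special case optimally with the Hungarian algorithm from SciPy. The gating distance is an illustrative parameter.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(tracks, detections, gate=50.0):
    """tracks, detections: (N, 2) and (M, 2) position arrays.
    Returns optimal (track, detection) pairs under a gating distance;
    unmatched entries model track terminations and initiations."""
    cost = np.linalg.norm(tracks[:, None, :] - detections[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= gate]
```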


Computer Vision and Pattern Recognition | 2010

Traffic analysis with low frame rate camera networks

Tae Eun Choe; Mun Wai Lee; Niels Haering

We propose a new traffic analysis framework using existing traffic camera networks. The framework integrates vehicle detection and image-based matching methods with geographic context to match vehicles across different views and analyze traffic. This is a challenging problem due to the low frame rate of traffic cameras and the large distance between views: a vehicle may not appear in every camera because of the long inter-frame interval or occlusion. We applied the proposed method to a traffic camera network to detect and track key vehicles and analyze traffic conditions. Vehicles are detected using a multi-view approach. By integrating camera calibration information and GIS data, we extract traffic-lane information and prior knowledge of the expected vehicle orientation and image size at each image location. This improves detection speed and reduces false alarms by discarding unlikely scales and orientations. Subsequently, detected vehicles are matched across cameras using a view-invariant appearance model. For more accurate vehicle matching, the traffic patterns observed at two sets of cameras are temporally aligned. Finally, key vehicles are globally tracked across cameras using a max-flow/min-cut network tracking algorithm. Traffic conditions at each camera location are presented on a map.
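
A hedged sketch, assuming NetworkX, of the min-cost-flow style of global association named above: each detection becomes a unit-capacity node pair, and each of a fixed number of flow units traces one cross-camera track. The graph construction, costs, and `match_cost` interface are illustrative, not the paper's exact formulation.

```python
import networkx as nx

def global_tracks(detections, match_cost, n_tracks, entry_cost=100):
    """detections: hashable ids; match_cost(a, b) -> integer appearance
    cost for temporally plausible pairs, or None. Each of n_tracks flow
    units enters at S, chains through detections, and exits at T."""
    G = nx.DiGraph()
    G.add_node("S", demand=-n_tracks)
    G.add_node("T", demand=n_tracks)
    for d in detections:
        # Unit capacity: each detection joins at most one global track.
        G.add_edge(("in", d), ("out", d), capacity=1, weight=0)
        G.add_edge("S", ("in", d), capacity=1, weight=entry_cost)
        G.add_edge(("out", d), "T", capacity=1, weight=entry_cost)
    for a in detections:
        for b in detections:
            c = match_cost(a, b)
            if c is not None:
                G.add_edge(("out", a), ("in", b), capacity=1, weight=int(c))
    return nx.min_cost_flow(G)  # chains of saturated edges are the tracks
```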


Proceedings IEEE Workshop on Content-Based Access of Image and Video Libraries | 1997

Locating deciduous trees

Niels Haering; Z. Myles; N. da Vitoria Lobo

We present a method to obtain information about the presence of deciduous trees in images. Since a single measure, observation, or model is unlikely to yield robust recognition of trees, we present an approach that combines color measures with estimates of the complexity, structure, roughness, and directionality of the image based on entropy measures, grey-level co-occurrence matrices, Fourier transforms, multi-resolution Gabor filter sets, steerable filters, and the fractal dimension. A standard backpropagation neural network is used to arbitrate among the different measures and to find a set of robust and mutually consistent “tree experts”.
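
A minimal sketch, assuming scikit-learn, of the arbitration step: a backpropagation network trained on the concatenated measures to score each region as tree or non-tree. Feature extraction is assumed done elsewhere, and the network size is an illustrative choice.

```python
from sklearn.neural_network import MLPClassifier

def train_tree_arbiter(X, y):
    """X: (n_regions, n_features) concatenated measures; y: 1 = tree."""
    net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000)
    net.fit(X, y)
    return net  # net.predict_proba(X_new)[:, 1] gives a tree likelihood
```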


ACM Multimedia | 2009

Semantic video search using natural language queries

Asaad Hakeem; Mun Wai Lee; Omar Javed; Niels Haering

Recent advances in computer vision and artificial intelligence algorithms have allowed the automatic extraction of metadata from video. This metadata can be represented using an RDF/OWL ontology, which encodes scene objects and their relationships in an unambiguous and well-formed manner. The encoded data can be queried using SPARQL. However, SPARQL has a steep learning curve and cannot be used directly by a general user for video content search. In this paper, we propose a method to bridge this gap by automatically translating a user-provided natural language query into an ontology-based SPARQL query for semantic video search. The proposed method consists of three major steps. First, a semantically labeled training corpus of natural language query sentences is used to learn a Semantic Stochastic Context-Free Grammar (SSCFG). Second, given a user-provided natural language query sentence, we use the Earley-Stolcke parsing algorithm to determine the maximum-likelihood semantic parse of the query sentence. This parse infers the semantic meaning of each word in the query sentence, from which the SPARQL query is constructed. Third, the SPARQL query is executed to retrieve relevant video segments from the RDF/OWL video content database. The method is evaluated by running natural language queries on surveillance videos from maritime and land-based domains, though the framework itself is general and extensible to searching videos from other domains.
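
A hedged sketch of the final step only, using rdflib: executing a generated SPARQL query against an RDF/OWL video-content store. The ontology terms (`ex:EnterEvent`, `ex:Vehicle`, `ex:DockArea`) and file name are hypothetical placeholders, not the paper's vocabulary; the SSCFG learning and Earley-Stolcke parsing steps are not shown.

```python
from rdflib import Graph

g = Graph()
g.parse("video_annotations.rdf")  # assumed RDF/OWL metadata file

# e.g. the translation of "show vehicles that entered the dock area":
sparql = """
PREFIX ex: <http://example.org/video#>
SELECT ?event ?time WHERE {
    ?event a ex:EnterEvent ;
           ex:agent ?v ;
           ex:zone ex:DockArea ;
           ex:time ?time .
    ?v a ex:Vehicle .
}
"""
for row in g.query(sparql):
    print(row.event, row.time)
```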

Collaboration


Dive into Niels Haering's collaborations.

Top Co-Authors

Mun Wai Lee (University of Southern California)
Asaad Hakeem (University of Central Florida)
Niels da Vitoria Lobo (University of Central Florida)
Zeeshan Rasheed (University of Central Florida)
Tae Eun Choe (University of Southern California)
Omar Javed (University of Central Florida)
Li Yu (University of Calgary)
Khurram Shafique (University of Central Florida)