Jan Neumann
University of Maryland, College Park
Publications
Featured research published by Jan Neumann.
Computer Vision and Pattern Recognition | 2007
Vinay D. Shet; Jan Neumann; Visvanathan Ramesh; Larry S. Davis
The capacity to robustly detect humans in video is a critical component of automated visual surveillance systems. This paper describes a bilattice-based logical reasoning approach that exploits contextual information and knowledge about interactions between humans, and augments it with the output of different low-level detectors for human detection. Detections from low-level parts-based detectors are treated as logical facts and used to reason explicitly about the presence or absence of humans in the scene. Positive and negative information from different sources, as well as uncertainties from detections and logical rules, are integrated within the bilattice framework. This approach also generates proofs or justifications for each hypothesis it proposes. These justifications (or lack thereof) are further employed by the system to explain and validate, or reject, potential hypotheses. This allows the system to explicitly reason about complex interactions between humans and to handle occlusions. These proofs are also available to the end user as an explanation of why the system considers a particular hypothesis to be a human. We employ a detector based on a boosted cascade of gradient histograms to detect individual body parts. We have applied this framework to analyze the presence of humans in static images from different datasets.
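To make the bilattice machinery concrete, here is a minimal Python sketch assuming the common square bilattice of (belief, disbelief) pairs over [0, 1]; the operator definitions and example values are illustrative assumptions, not the paper's exact formulation.

# Minimal sketch of evidence combination on the square bilattice
# B = [0,1] x [0,1], where a truth value is a pair (belief, disbelief).
# The meet/join definitions below are one common choice and are an
# assumption, not necessarily the exact operators used in the paper.

from dataclasses import dataclass

@dataclass(frozen=True)
class BilatticeValue:
    belief: float     # evidence for the hypothesis, in [0, 1]
    disbelief: float  # evidence against the hypothesis, in [0, 1]

    def join_knowledge(self, other):
        # Accumulate information from two sources: take the stronger
        # evidence on each axis (join along the knowledge ordering).
        return BilatticeValue(max(self.belief, other.belief),
                              max(self.disbelief, other.disbelief))

    def meet_truth(self, other):
        # Conjunction along the truth ordering: a composite hypothesis
        # is only as believable as its weakest part.
        return BilatticeValue(min(self.belief, other.belief),
                              max(self.disbelief, other.disbelief))

# A head detection and a torso detection both support "human present";
# a scene-geometry rule contributes negative evidence.
head = BilatticeValue(belief=0.8, disbelief=0.0)
torso = BilatticeValue(belief=0.6, disbelief=0.0)
geometry_rule = BilatticeValue(belief=0.0, disbelief=0.3)

human = head.meet_truth(torso).join_knowledge(geometry_rule)
print(human)  # BilatticeValue(belief=0.6, disbelief=0.3)

The point the sketch illustrates is that evidence for and against a hypothesis accumulate on separate axes, so negative information from one rule never silently erases positive detections from another.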
International Journal of Computer Vision | 2001
Jan Neumann; Yiannis Aloimonos
We present a method to automatically extract spatio-temporal descriptions of moving objects from synchronized and calibrated multi-view sequences. The object is modeled by a time-varying multi-resolution subdivision surface that is fitted to the image data using spatio-temporal multi-view stereo information, as well as contour constraints. The stereo data is utilized by computing the normalized correlation between corresponding spatio-temporal image trajectories of surface patches, while the contour information is determined using incremental segmentation of the viewing volume into object and background. We globally optimize the shape of the spatio-temporal surface in a coarse-to-fine manner using the multi-resolution structure of the subdivision mesh. The method presented incorporates the available image information in a unified framework and automatically reconstructs accurate spatio-temporal representations of complex non-rigidly moving objects.
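The stereo term the abstract describes can be illustrated with a short sketch of normalized correlation between the spatio-temporal intensity trajectories of a patch in two views; the array shapes and the synthetic data are assumptions for illustration only.

# Sketch: normalized cross-correlation between the spatio-temporal
# intensity trajectories of one surface patch seen in two views.
# Each trajectory is the patch's pixel intensities stacked over time,
# flattened to a single vector; shapes here are illustrative.

import numpy as np

def normalized_correlation(traj_a: np.ndarray, traj_b: np.ndarray) -> float:
    """traj_a, traj_b: arrays of shape (frames, patch_pixels)."""
    a = traj_a.ravel().astype(float)
    b = traj_b.ravel().astype(float)
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Two hypothetical 5-frame trajectories of a 7x7 patch:
rng = np.random.default_rng(0)
view1 = rng.standard_normal((5, 49))
view2 = view1 + 0.1 * rng.standard_normal((5, 49))  # nearly matching patch
print(normalized_correlation(view1, view2))  # close to 1 for a good match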
Computer Vision and Pattern Recognition | 2007
Adam O'Donovan; Ramani Duraiswami; Jan Neumann
Combinations of microphones and cameras allow the joint audio-visual sensing of a scene. Such arrangements of sensors are common in biological organisms and in applications such as meeting recording and surveillance, where both modalities are necessary for scene understanding. Microphone arrays provide geometrical information on the source location and allow the sound sources in the scene to be separated and the noise suppressed, while cameras allow the scene geometry and the location and motion of people and other objects to be estimated. In most previous work, the fusion of the audio-visual information occurs at a relatively late stage. In contrast, we take the viewpoint that both cameras and microphone arrays are geometry sensors, and treat the microphone arrays as generalized cameras. We employ computer-vision-inspired algorithms to treat the combined system of arrays and cameras. In particular, we consider the geometry introduced by a general microphone array and by spherical microphone arrays. The latter exhibit a geometry very close to that of central-projection cameras, and we show how standard vision-based calibration algorithms can be profitably applied to them. Experiments demonstrate the usefulness of the approach.
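As a rough illustration of treating a microphone array as a geometry sensor, the sketch below steers a simple time-domain delay-and-sum beamformer over a grid of directions and records one output energy per direction, i.e. one "pixel" per steering angle. This far-field beamformer is a stand-in for the spherical-array processing used in the paper; the function and its parameters are hypothetical.

# Sketch: treat a microphone array as a "camera" by steering a
# delay-and-sum beamformer over a grid of directions and recording
# the output energy per direction, one pixel per steering angle.

import numpy as np

C = 343.0  # speed of sound, m/s

def audio_image(signals, mic_pos, directions, fs):
    """signals: (n_mics, n_samples); mic_pos: (n_mics, 3) in meters;
    directions: (n_dirs, 3) unit vectors; fs: sample rate in Hz."""
    n_mics, n_samples = signals.shape
    energies = np.empty(len(directions))
    for d, u in enumerate(directions):
        # Far-field model: the delay at each mic is the projection of
        # its position onto the steering direction, in samples.
        delays = (mic_pos @ u) / C * fs
        summed = np.zeros(n_samples)
        for m in range(n_mics):
            summed += np.roll(signals[m], int(round(delays[m])))
        energies[d] = np.sum(summed ** 2)
    return energies  # one "pixel" intensity per steered direction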
Computer Vision and Pattern Recognition | 2003
Jan Neumann; Cornelia Fermüller; Yiannis Aloimonos
Most cameras used in computer vision applications are still based on the pinhole principle inspired by our own eyes. It has been found, though, that this is not necessarily the optimal image-formation principle for processing visual information with a machine. We describe how to find the optimal camera for 3D motion estimation by analyzing the structure of the space formed by the light rays passing through a volume of space. Every camera corresponds to a sampling pattern in light ray space, so the question of camera design can be rephrased as finding the sampling pattern that is optimal for a given task. This framework suggests that large field-of-view multi-perspective (polydioptric) cameras are the optimal image sensors for 3D motion estimation. We conclude by proposing design principles for polydioptric cameras and describe an algorithm for such a camera that estimates its 3D motion in a scene-independent and robust manner.
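The practical consequence of the claimed optimality is that ego-motion estimation becomes a linear problem. The sketch below shows only that estimation step: it assumes each sampled ray contributes one constraint linear in the six motion parameters (v, omega) and solves the stacked system by least squares. The constraint coefficients here are random placeholders; the paper derives the actual coefficients from plenoptic derivatives.

# Sketch: if every sampled light ray yields one constraint that is
# linear in the six rigid-motion parameters (v, omega), ego-motion
# reduces to a single linear least-squares solve. The rows of A are
# random placeholders standing in for plenoptic-derivative terms.

import numpy as np

rng = np.random.default_rng(1)
n_rays = 1000
motion_true = np.array([0.1, 0.0, 0.5, 0.01, -0.02, 0.0])  # (v, omega)

A = rng.standard_normal((n_rays, 6))  # per-ray constraint coefficients
b = A @ motion_true + 0.01 * rng.standard_normal(n_rays)  # noisy measurements

motion_est, *_ = np.linalg.lstsq(A, b, rcond=None)
print(motion_est.round(3))  # close to motion_true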
Pattern Recognition Letters | 2002
Jan Neumann; Hanan Samet; Aya Soffer
A comparison is made of global and local methods for the shape analysis of logos in an image database. The quality of each method is judged by using its shape signatures to define a similarity metric on the logos. As representatives of the two classes of methods, we use the negative shape method, which is based on local shape information, and a wavelet-based method, which makes use of global information. We apply both methods to images with different kinds of degradations and examine how a particular degradation highlights the strengths and shortcomings of each method. Finally, we use these results to develop a new adaptive weighting scheme based on the relative performances of the two methods. The resulting method is much more robust with respect to all degradations examined and works by automatically predicting whether the negative shape method or the wavelet method is performing better.
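A minimal sketch of the adaptive weighting idea follows, assuming a per-query weight that blends the two distance measures; predict_wavelet_weight is a hypothetical stand-in for the paper's predictor of relative performance.

# Sketch of adaptive weighting: given a per-query prediction of which
# signature is more reliable, blend the two distance measures.
# predict_wavelet_weight() is a hypothetical stand-in for the paper's
# learned predictor of relative performance.

def combined_distance(neg_shape_dist, wavelet_dist, wavelet_weight):
    """Blend local (negative-shape) and global (wavelet) distances.
    wavelet_weight in [0, 1]: 1 trusts the wavelet method fully."""
    return (1.0 - wavelet_weight) * neg_shape_dist + wavelet_weight * wavelet_dist

def predict_wavelet_weight(degradation_score):
    # Hypothetical heuristic: heavier degradation of local contours
    # shifts trust toward the global wavelet signature.
    return min(1.0, max(0.0, degradation_score))

w = predict_wavelet_weight(degradation_score=0.7)
print(combined_distance(neg_shape_dist=0.4, wavelet_dist=0.2, wavelet_weight=w))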
Intelligent Robots and Systems | 2004
Jan Neumann; Cornelia Fermüller; Yiannis Aloimonos; Vladimir Brajovic
We describe a compound-eye vision sensor for 3D ego-motion computation. Inspired by the eyes of insects, we show that the compound-eye sampling geometry is optimal for 3D camera motion estimation. This optimality allows us to estimate the 3D camera motion in a scene-independent and robust manner using linear equations. The mathematical model of the new sensor can be implemented in analog networks, resulting in a compact computational sensor for instantaneous 3D ego-motion measurements in full six degrees of freedom.
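The compound-eye geometry amounts to sampling ray directions roughly uniformly over the full sphere. As an illustration of such a sampling pattern (not the paper's specific ommatidia layout), the Fibonacci-spiral construction below generates approximately uniform unit directions.

# Sketch: a compound eye samples ray directions roughly uniformly over
# the sphere. The Fibonacci-spiral construction is a standard way to
# generate such directions; it illustrates the sampling geometry only.

import numpy as np

def fibonacci_sphere(n: int) -> np.ndarray:
    """Return n approximately uniformly distributed unit vectors."""
    i = np.arange(n)
    golden = (1 + 5 ** 0.5) / 2
    z = 1 - (2 * i + 1) / n          # latitudes, evenly spaced in z
    theta = 2 * np.pi * i / golden   # longitudes, golden-ratio steps
    r = np.sqrt(1 - z ** 2)
    return np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

dirs = fibonacci_sphere(600)  # 600 "ommatidia" viewing directions
print(dirs.shape, float(np.linalg.norm(dirs, axis=1).max()))  # all unit length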
Proceedings of the IEEE Workshop on Omnidirectional Vision 2002. Held in conjunction with ECCV'02 | 2002
Jan Neumann; Cornelia Fermüller; Yiannis Aloimonos
We investigate the relationship between camera design and the problem of recovering the motion and structure of a scene from video data. The visual information that could possibly be obtained is described by the plenoptic function. A camera can be viewed as a device that captures a subset of this function; that is, it measures some of the light rays in some part of the space. The information contained in this subset determines how difficult the subsequent interpretation processes are to solve. By examining the differential structure of the time-varying plenoptic function, we relate different known and new camera models to the spatio-temporal structure of the observed scene. This allows us to define a hierarchy of camera designs, where the order is determined by the stability and complexity of the computations necessary to estimate structure and motion. At the low end of this hierarchy is the standard planar pinhole camera, for which the structure-from-motion problem is non-linear and ill-posed. At the high end is a new camera, which we call the full field-of-view polydioptric camera, for which the problem is linear and stable. In between are multiple-view cameras with large fields of view, which we have built, as well as catadioptric panoramic sensors and other omnidirectional cameras. We develop design suggestions for the polydioptric camera and, based on this new design, propose a linear algorithm for ego-motion estimation that in essence combines differential motion estimation with differential stereo.
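One way to see why the polydioptric case becomes linear, sketched here under stated assumptions rather than transcribed from the paper: write the plenoptic function as L(x, r, t), the radiance at position x in direction r at time t. If the scene is static and the sensor undergoes rigid motion with translational velocity v and rotational velocity omega, each ray's radiance is preserved along its motion, and differentiating that identity gives

\[
\frac{\partial L}{\partial t} + \nabla_{x} L \cdot (v + \omega \times x) + \nabla_{r} L \cdot (\omega \times r) = 0,
\]

which, for every ray (x, r) with measured plenoptic derivatives, is one equation linear in the six unknowns (v, omega) and independent of scene depth (the signs depend on whether sensor or scene motion is taken as positive).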
Computer Vision and Image Understanding | 2004
Jan Neumann; Cornelia Fermüller; Yiannis Aloimonos
The view-independent visualization of 3D scenes is most often based on rendering accurate 3D models or utilizes image-based rendering techniques. To compute the 3D structure of a scene from a moving vision sensor, or to use image-based rendering approaches, we need to be able to estimate the motion of the sensor from the recorded image information with high accuracy, a problem that has been well studied. In this work, we investigate the relationship between camera design and our ability to perform accurate 3D photography by examining the influence of camera design on the estimation of the motion and structure of a scene from video data. By relating the differential structure of the time-varying plenoptic function to different known and new camera designs, we can establish a hierarchy of cameras based upon the stability and complexity of the computations necessary to estimate structure and motion. At the low end of this hierarchy is the standard planar pinhole camera, for which the structure-from-motion problem is non-linear and ill-posed. At the high end is a camera, which we call the full field-of-view polydioptric camera, for which the motion estimation problem can be solved independently of the depth of the scene, leading to fast and robust algorithms for 3D photography. In between are multiple-view cameras with a large field of view, which we have built, as well as omnidirectional sensors.
The Visual Computer | 2003
Jan Neumann; Cornelia Fermüller
More and more processing of visual information is nowadays done by computers, but the images captured by conventional cameras are still based on the pinhole principle inspired by our own eyes. This principle, though, is not necessarily the optimal image-formation principle for automated processing of visual information. Each camera samples the space of light rays according to some pattern. If we understand the structure of the space formed by the light rays passing through a volume of space, we can determine the camera, or in other words the sampling pattern of light rays, that is optimal with regard to a given task. In this work, we analyze the differential structure of the space of time-varying light rays described by the plenoptic function and use this analysis to relate the rigid motion of an imaging device to the derivatives of the plenoptic function. The results can be used to define a hierarchy of camera models with respect to the structure-from-motion problem and to formulate a linear, scene-independent estimation problem for the rigid motion of the sensor purely in terms of the captured images.
Lecture Notes in Computer Science | 2001
Jan Neumann; Hanan Samet; Aya Soffer
A comparison is made of global and local methods for the shape analysis of logos in an image database. The quality of each method is judged by using its shape signatures to define a similarity metric on the logos. As representatives of the two classes of methods, we use the negative shape method, which is based on local shape information, and a wavelet-based method, which makes use of global information. We apply both methods to images with different kinds of degradations and examine how a given degradation highlights the strengths and shortcomings of each method. Finally, we use these results to combine information from both methods and develop a new method based on the relative performances of the two methods.