Publication


Featured research published by Martin Herman.


Artificial Intelligence | 1986

Incremental reconstruction of 3D scenes from multiple, complex images

Martin Herman; Takeo Kanade

The 3D Mosaic system is a vision system that incrementally reconstructs complex 3D scenes from a sequence of images obtained from multiple viewpoints. The system encompasses several levels of the vision process, starting with images and ending with symbolic scene descriptions. This paper describes the various components of the system, including stereo analysis, monocular analysis, and constructing and updating the scene model. In addition, the representation of the scene model is described. This model is intended for tasks such as matching, display generation, planning paths through the scene, and making other decisions about the scene environment. Examples showing how the system is used to interpret complex aerial photographs of urban scenes are presented. Each view of the scene, which may be either a single image or a stereo pair, undergoes analysis which results in a 3D wire-frame description that represents portions of edges and vertices of objects. The model is a surface-based description constructed from the wire frames. With each successive view, the model is incrementally updated and gradually becomes more accurate and complete. Task-specific knowledge, involving block-shaped objects in an urban scene, is used to extract the wire frames and construct and update the model. The model is represented as a graph in terms of symbolic primitives such as faces, edges, vertices, and their topology and geometry. This permits the representation of partially complete, planar-faced objects. Because incremental modifications to the model must be easy to perform, the model contains mechanisms to (1) add primitives in a manner such that constraints on geometry imposed by these additions are propagated throughout the model, and (2) modify and delete primitives if discrepancies arise between newly derived and current information. The model also contains mechanisms that permit the generation, addition, and deletion of hypotheses for parts of the scene for which there is little data.
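
The graph-of-primitives representation lends itself to a compact sketch. The following Python sketch shows one way such a model could hold vertices, edges, and faces with topology, hypothesis flags, and incremental update; all class and method names are illustrative assumptions, not the 3D Mosaic code.

# Minimal sketch of a scene model held as a graph of symbolic
# primitives (vertices, edges, faces), in the spirit of the
# description above. All names are illustrative assumptions.

class Vertex:
    def __init__(self, xyz, hypothesized=False):
        self.xyz = xyz                    # 3-D position, possibly approximate
        self.hypothesized = hypothesized  # inferred rather than observed
        self.edges = []                   # topology: incident edges

class Edge:
    def __init__(self, v1, v2, hypothesized=False):
        self.verts = (v1, v2)
        self.hypothesized = hypothesized
        v1.edges.append(self)
        v2.edges.append(self)

class Face:
    def __init__(self, boundary_edges):
        self.boundary = boundary_edges    # ordered edge loop, assumed planar

class SceneModel:
    def __init__(self):
        self.vertices, self.edges, self.faces = [], [], []

    def confirm_vertex(self, v, observed_xyz):
        # A new view refines a position; observed data overrides a
        # hypothesis. A full system would also propagate the planarity
        # constraints of the faces touching v, as the abstract notes.
        v.xyz = observed_xyz
        v.hypothesized = False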


International Conference on Robotics and Automation | 1986

Fast, three-dimensional, collision-free motion planning

Martin Herman

Issues dealing with fast, 3-D, collision-free motion planning are discussed, and a fast path planning system under development at NBS is described. The components of a general motion planner are outlined, and some of their computational aspects are discussed. It is argued that an octree representation of the obstacles in the world leads to fast path planning algorithms. The system we are developing uses such an octree representation. The robot and its swept-volume paths are approximated by primitive shapes so as to result in fast collision detection algorithms. The search for a path is performed in the octree space, and combines hypothesize-and-test, hill climbing, and A* search.
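
As a rough illustration of searching an obstacle map of this kind, here is a minimal A* sketch in Python. The is_free query standing in for an octree occupancy test, the lattice step, and all names are assumptions; the actual system also uses hypothesize-and-test and hill climbing, which are not shown.

import heapq
import itertools

def a_star(is_free, start, goal, step=1.0):
    """A* over a 3-D lattice of points; is_free(p) stands in for an
    octree occupancy query on the swept volume around p."""
    def h(p):  # Euclidean distance to the goal (admissible heuristic)
        return sum((a - b) ** 2 for a, b in zip(p, goal)) ** 0.5

    tie = itertools.count()               # tie-breaker for equal costs
    frontier = [(h(start), 0.0, next(tie), start, None)]
    parent, g_cost = {}, {start: 0.0}
    while frontier:
        _, g, _, p, par = heapq.heappop(frontier)
        if p in parent:                   # already expanded
            continue
        parent[p] = par
        if h(p) < step:                   # close enough to the goal
            path = [p]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
        for d in ((step, 0, 0), (-step, 0, 0), (0, step, 0),
                  (0, -step, 0), (0, 0, step), (0, 0, -step)):
            q = tuple(a + b for a, b in zip(p, d))
            if not is_free(q):            # collision: prune this move
                continue
            ng = g + step
            if ng < g_cost.get(q, float("inf")):
                g_cost[q] = ng
                heapq.heappush(frontier, (ng + h(q), ng, next(tie), q, p))
    return None                           # no collision-free path found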


International Conference on Pattern Recognition | 1996

Real-time single-workstation obstacle avoidance using only wide-field flow divergence

Theodore (Ted) Camus; David Coombs; Martin Herman; Tsai-Hong Hong

A real-time robot vision system is described which uses only the divergence of the optical flow field for both steering control and collision detection. The robot has wandered about the lab at 20 cm/s for as long as 26 minutes without collision. The entire system is implemented on a single ordinary UNIX workstation without the benefit of real-time operating system support. Dense optical flow data are calculated in real-time across the entire wide-angle image. The divergence of this optical flow field is calculated everywhere and used to control steering and collision behavior. Divergence alone has proven sufficient for steering past objects and detecting imminent collision. The major contribution is the demonstration of a simple, robust, minimal system that uses flow-derived measures to control steering and speed to avoid collision in real time for extended periods. Such a system can be embedded in a general, multi-level perception/control system.
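
For intuition, a minimal Python sketch of divergence-based steering follows. It assumes dense flow fields u and v are already computed; the left/right comparison and the collision threshold are illustrative assumptions, not the paper's control law.

import numpy as np

def steer_from_divergence(u, v, collision_thresh=0.5):
    """u, v: per-pixel horizontal and vertical flow (numpy arrays)."""
    div = np.gradient(u, axis=1) + np.gradient(v, axis=0)  # du/dx + dv/dy

    h, w = div.shape
    left = div[:, : w // 2].mean()        # mean divergence, left half
    right = div[:, w // 2 :].mean()       # mean divergence, right half

    if max(left, right) > collision_thresh:
        return 0.0, "stop"                # imminent collision: halt
    # Turn away from the side that is expanding faster
    # (positive command = steer left, by this sketch's convention).
    return float(right - left), "go"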


International Journal of Computer Vision | 1997

A General Motion Model and Spatio-Temporal Filters for Computing Optical Flow

Hongche Liu; Tsai-Hong Hong; Martin Herman; Rama Chellappa

Traditional optical flow algorithms assume local image translational motion and apply simple image filtering techniques. Recent studies have taken two separate approaches toward improving the accuracy of computed flow: the application of spatio-temporal filtering schemes and the use of advanced motion models such as the affine model. Each has achieved some improvement over traditional algorithms in specialized situations but the computation of accurate optical flow for general motion has been elusive. In this paper, we exploit the interdependency between these two approaches and propose a unified approach. The general motion model we adopt characterizes arbitrary 3-D steady motion. Under perspective projection, we derive an image motion equation that describes the spatio-temporal relation of gray-scale intensity in an image sequence, thus making the utilization of 3-D filtering possible. However, to accommodate this motion model, we need to extend the filter design to derive additional motion constraint equations. Using Hermite polynomials, we design differentiation filters, whose orthogonality and Gaussian derivative properties ensure numerical stability; a recursive relation facilitates application of the general nonlinear motion model while separability promotes efficiency. The resulting algorithm produces accurate optical flow and other useful motion parameters. It is evaluated quantitatively using the scheme established by Barron et al. (1994) and qualitatively with real images.
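
The filter design rests on the identity that the n-th derivative of a Gaussian is a Hermite polynomial times the Gaussian, with the polynomials generated by a simple recurrence. A hedged Python sketch follows; the tap count, sigma, and normalization are illustrative assumptions, not the paper's design.

# Sketch of separable Gaussian-derivative filter taps built from
# probabilists' Hermite polynomials, He_{n+1}(x) = x*He_n(x) - n*He_{n-1}(x).
# The n-th derivative of exp(-x^2/2) is (-1)^n * He_n(x) * exp(-x^2/2),
# which is what makes these filters numerically well behaved.

import numpy as np

def hermite(n, x):
    h_prev, h = np.ones_like(x), x.copy()
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def gaussian_derivative_taps(n, sigma=1.5, radius=6):
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-(x / sigma) ** 2 / 2.0)
    taps = (-1) ** n * hermite(n, x / sigma) * g / sigma ** n
    return taps / np.abs(taps).sum()      # crude normalization

# Separability: a 2-D (or 3-D) derivative filter factors into 1-D
# passes, e.g. d2I/dxdy ~ filter rows with the first-derivative taps,
# then filter columns with the first-derivative taps.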


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1984

Incremental Acquisition of a Three-Dimensional Scene Model from Images

Martin Herman; Takeo Kanade; Shigeru Kuroe

We describe the current state of the 3-D Mosaic project, whose goal is to incrementally acquire a 3-D model of a complex urban scene from images. The notion of incremental acquisition arises from the observations that 1) single images contain only partial information about a scene, 2) complex images are difficult to fully interpret, and 3) different features of a given scene tend to be easier to extract in different images because of differences in viewpoint and lighting conditions. In our approach, multiple images of the scene are sequentially analyzed so as to incrementally construct the model. Each new image provides information which refines the model. We describe some experiments toward this end. Our method of extracting 3-D shape information from the images is stereo analysis. Because we are dealing with urban scenes, a junction-based matching technique proves very useful. This technique produces rather sparse wire-frame descriptions of the scene. A reasoning system that relies on task-specific knowledge generates an approximate model of the scene from the stereo output. Gray scale information is also acquired for the faces in the model. Finally, we describe an experiment in combining two views of the scene to obtain a refined model.
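
A toy sketch of the junction-based matching idea follows, assuming rectified images and junctions given as (x, y, incident-edge angles); the tolerances and ordering test are assumptions, not the paper's algorithm.

# In rectified stereo, candidate matches for a junction lie on the
# same scanline, and the angles of the line segments meeting at the
# junction should agree between the two views.

def match_junctions(left, right, y_tol=1.0, ang_tol=0.15):
    matches = []
    for xl, yl, angs_l in left:
        best, best_err = None, float("inf")
        for xr, yr, angs_r in right:
            if abs(yl - yr) > y_tol or xr > xl:   # epipolar + ordering
                continue
            if len(angs_l) != len(angs_r):        # junction degree differs
                continue
            err = max(abs(a - b) for a, b in zip(sorted(angs_l),
                                                 sorted(angs_r)))
            if err < ang_tol and err < best_err:
                best, best_err = (xr, yr), err
        if best is not None:
            matches.append(((xl, yl), best))      # disparity = xl - xr
    return matches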


Readings in Computer Vision: Issues, Problems, Principles, and Paradigms | 1987

The 3D MOSAIC scene understanding system: incremental reconstruction of 3D scenes from complex images

Martin Herman; Takeo Kanade

The 3D Mosaic system is a vision system that incrementally reconstructs complex 3D scenes from multiple images. The system encompasses several levels of the vision process, starting with images and ending with symbolic scene descriptions. This paper describes the various components of the system, including stereo analysis, monocular analysis, and constructing and modifying the scene model. In addition, the representation of the scene model is described. This model is intended for tasks such as matching, display generation, planning paths through the scene, and making other decisions about the scene environment. Examples showing how the system is used to interpret complex aerial photographs of urban scenes are presented. Each view of the scene, which may be either a single image or a stereo pair, undergoes analysis which results in a 3D wire-frame description that represents portions of edges and vertices of objects. The model is a surface-based description constructed from the wire frames. With each successive view, the model is incrementally updated and gradually becomes more accurate and complete. Task-specific knowledge, involving block-shaped objects in an urban scene, is used to extract the wire frames and construct and update the model.


Workshop on Applications of Computer Vision | 2000

Head tracking using stereo

Daniel B. Russakoff; Martin Herman

Head tracking is an important primitive for smart environments and perceptual user interfaces where the poses and movements of body parts need to be determined. Most previous solutions to this problem are based on intensity images and, as a result, suffer from a host of problems including sensitivity to background clutter and lighting variations. Our approach avoids these pitfalls by using stereo depth data together with a simple human-torso model to create a head-tracking system that is both fast and robust. We use stereo data (Commercial equipment and materials are identified in order to adequately specify certain procedures. In no case does such identification imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.) to derive a depth model of the background that is then employed to provide accurate foreground segmentation. We then use directed local edge detectors on the foreground to find occluding edges that are used as features to fit to a torso model. Once we have the model parameters, the location and orientation of the head can be easily estimated. A useful side effect from using stereo data is the ability to track head movement through a room in three dimensions. Experimental results on real image sequences are given.
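
A minimal sketch of the depth-background step, assuming depth maps as numpy arrays with 0 marking stereo dropouts; the margin value is an illustrative assumption, not the paper's.

import numpy as np

def build_background(depth_frames):
    # Median over frames of the empty scene is robust to stereo dropouts.
    return np.median(np.stack(depth_frames), axis=0)

def foreground_mask(depth, background, margin=0.15):
    # Foreground = pixels with valid depth significantly closer than
    # the background model (margin in the depth map's units).
    valid = (depth > 0) & (background > 0)    # 0 = no stereo match
    return valid & (depth < background - margin)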


Proceedings of the IEEE Workshop on Visual Motion | 1991

A new approach to vision and control for road following

Daniel Raviv; Martin Herman

The paper deals with a new, quantitative, vision-based approach to road following. It is based on the theoretical framework of the recently developed optical flow-based visual field theory. By building on this theory, the authors suggest that motion commands can be generated from a visual feature, or cue, consisting of the projection into the image of the tangent point on the edge of the road, along with the optical flow of this point. Using this cue, they suggest several different vision-based control approaches. There are several advantages to using this visual cue: (1) it is extracted directly from the image, i.e. there is no need to reconstruct the scene, (2) it can be used in a tight perception-action loop to directly generate action commands, (3) for many road following situations this visual cue is sufficient, (4) it has a scientific basis, and (5) the related computations are relatively simple and thus suitable for real-time applications. For each control approach, they derive the value of the related steering commands.
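
As a toy illustration of turning this cue into action, one could steer in proportion to the tangent point's horizontal image offset, damped by its optical flow. The gains and the exact law below are assumptions, not the paper's derived commands.

def steering_command(x_tp, u_tp, k_p=0.8, k_d=0.3):
    """x_tp: image x of the tangent point (0 = image center, right > 0).
    u_tp: horizontal optical flow of that point (pixels per frame).
    Returns a steering rate that drives the tangent point toward the
    image center, a simple proportional-derivative law on the cue."""
    return -(k_p * x_tp + k_d * u_tp)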


IEEE Transactions on Systems, Man, and Cybernetics | 1994

A unified approach to camera fixation and vision-based road following

Daniel Raviv; Martin Herman

Both camera fixation and vision-based road following are problems that involve tracking or fixating on 3-D points and features. This paper presents a unified theoretical approach to analyzing camera fixation and vision-based road following. The approach is based on the concept of equal flow circles (EFCs) and zero flow circles (ZFCs). Using EFCs it is possible to locate points in space relative to the fixation point and predict the behavior of their optical flow. The camera's instantaneous direction of translation and the fixation point determine the plane on which the EFCs can be found. We show that points on an EFC inside the ZFC produce optical flow that is opposite in sign to that produced by points outside the ZFC. When a point in space crosses a ZFC it produces zero flow. For explanation purposes we analyzed a special case of motion. However, a similar approach can be taken for a more general motion of the camera. The analysis for the current motion can also be extended to find equal flow curves.
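
The flow-sign idea can be made concrete with a small 2-D sketch: for a camera at c translating with velocity v while rotating to fixate point f, the angular rate of the bearing to a point p is cross(v, p - c) / |p - c|^2, and the flow of p in the fixating frame is that rate minus the rate for f. This is a 2-D simplification of the paper's setup, not its derivation.

def cross2(a, b):
    return a[0] * b[1] - a[1] * b[0]

def bearing_rate(p, c, v):
    # Rate of change of the bearing angle to fixed point p, seen from
    # camera position c moving with velocity v: cross(v, r) / |r|^2.
    dx, dy = p[0] - c[0], p[1] - c[1]
    return cross2(v, (dx, dy)) / (dx * dx + dy * dy)

def relative_flow(p, f, c, v):
    """Apparent angular motion of p while the camera fixates f.
    Zero on the zero flow circle; the sign tells which side of the
    ZFC the point p lies on."""
    return bearing_rate(p, c, v) - bearing_rate(f, c, v)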


International Conference on Robotics and Automation | 1988

Overview of the multiple autonomous underwater vehicles (MAUV) project

Martin Herman; James S. Albus

The US National Bureau of Standards multiple autonomous underwater vehicles (MAUV) project involves the development of a real-time intelligent control system that performs sensing, world modeling, planning, and execution for underwater robot vehicles. The goal of the project is to have multiple vehicles exhibiting intelligent, autonomous, cooperative behavior. Initial tests have involved two identical vehicles engaged in various scenarios in Lake Winnipesaukee in New Hampshire. All software for controlling the vehicles resides on computer boards mounted onboard the vehicles.

Collaboration


Dive into Martin Herman's collaborations.

Top Co-Authors

Tsai-Hong Hong, National Institute of Standards and Technology
Daniel Raviv, Florida Atlantic University
Hongche Liu, National Institute of Standards and Technology
Gin-Shu Young, National Institute of Standards and Technology
Marilyn Nashman, National Institute of Standards and Technology
James S. Albus, National Institute of Standards and Technology
David Coombs, National Institute of Standards and Technology
Takeo Kanade, Carnegie Mellon University
Michaela Iorga, National Institute of Standards and Technology