Publication


Featured research published by Irving Biederman.


Psychological Review | 1987

Recognition-by-Components: A Theory of Human Image Understanding

Irving Biederman

The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into an arrangement of simple geometric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of generalized-cone components, called geons (N ≤ 36), can be derived from contrasts of five readily detectable properties of edges in a two-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and


Psychological Review | 1992

Dynamic binding in a neural network for shape recognition

John E. Hummel; Irving Biederman

image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or is degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Pragnanz) characterize not the complete object but the object's components. Representational power derives from an allowance of free combinations of the geons. A Principle of Componential Recovery can account for the major phenomena of object recognition: If an arrangement of two or three geons can be recovered from the input, objects can be quickly recognized even when they are occluded, novel, rotated in depth, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory. Any single object can project an infinity of image configurations to the retina. The orientation of the object to the viewer can vary continuously, each giving rise to a different two-dimensional projection. The object can be occluded by other objects or texture fields, as when viewed behind foliage. The object need not be presented as a full-colored textured image but instead can be a simplified line drawing. Moreover, the object can even be missing some of its parts or be a novel exemplar of its particular category. But it is only with rare exceptions that an image fails to be rapidly and readily classified, either as an instance of a familiar object category or as an instance that cannot be so classified (itself a form of classification).


Cognitive Psychology | 1982

Scene Perception: Detecting and Judging Objects Undergoing Relational Violations

Irving Biederman; Robert J. Mezzanotte; Jan C. Rabinowitz

Given a single view of an object, humans can readily recognize that object from other views that preserve the parts in the original view. Empirical evidence suggests that this capacity reflects the activation of a viewpoint-invariant structural description specifying the object's parts and the relations among them. This article presents a neural network that generates such a description. Structural description is made possible through a solution to the dynamic binding problem: Temporary conjunctions of attributes (parts and relations) are represented by synchronized oscillatory activity among independent units representing those attributes. Specifically, the model uses synchrony (a) to parse images into their constituent parts, (b) to bind together the attributes of a part, and (c) to bind the relations to the parts to which they apply. Because it conjoins independent units temporarily, dynamic binding allows tremendous economy of representation and permits the representation to reflect the attribute structure of the shapes represented.


Science | 1972

Perceiving Real-World Scenes

Irving Biederman

Five classes of relations between an object and its setting can characterize the organization of objects into real-world scenes. The relations are (1) Interposition (objects interrupt their background), (2) Support (objects tend to rest on surfaces), (3) Probability (objects tend to be found in some scenes but not others), (4) Position (given an object is probable in a scene, it often is found in some positions and not others), and (5) familiar Size (objects have a limited set of size relations with other objects). In two experiments subjects viewed brief (150 msec) presentations of slides of scenes in which an object in a cued location in the scene was either in a normal relation to its background or violated from one to three of the relations. Such objects appear to (1) have the background pass through them, (2) float in air, (3) be unlikely in that particular scene, (4) be in an inappropriate position, and (5) be too large or too small relative to the other objects in the scene. In Experiment I, subjects attempted to determine whether the cued object corresponded to a target object which had been specified in advance by name. With the exception of the Interposition violation, violation costs were incurred in that the detection of objects undergoing violations was less accurate and slower than when those same objects were in normal relations to their setting. However, the detection of objects in normal relations to their setting (innocent bystanders) was unaffected by the presence of another object undergoing a violation in that same setting. This indicates that the violation costs were incurred not because of an unsuccessful elicitation of a frame or schema for the scene but because properly formed frames interfered with (or did not facilitate) the perceptibility of objects undergoing violations. As the number of violations increased, target detectability generally decreased.
Thus, the relations were accessed from the results of a single fixation and were available sufficiently early during the time course of scene perception to affect the perception of the objects in the scene. Contrary to expectations from a bottom-up account of scene perception, violations of the pervasive physical relations of Support and Interposition were not more disruptive on object detection than the semantic violations of Probability, Position and Size. These are termed semantic because they require access to the referential meaning of the object. In Experiment II, subjects attempted to detect the presence of the violations themselves. Violations of the semantic relations were detected more accurately than violations of Interposition and at least as accurately as violations of Support. As the number of violations increased, the detectability of the incongruities between an object and its setting increased. These results provide converging evidence that semantic relations can be accessed from the results of a single fixation. In both experiments information about Position was accessed at least as quickly as information on Probability. Thus in Experiment I, the interference that resulted from placing a fire hydrant in a kitchen was not greater than the interference from placing it on top of a mailbox in a street scene. Similarly, violations of Probability in Experiment II were not more detectable than violations of Position. Thus, the semantic relations which were accessed included information about the detailed interactions among the objects, information which is more specific than what can be inferred from the general setting. Access to the semantic relations among the entities in a scene is not deferred until the completion of spatial and depth processing and object identification. Instead, an object's semantic relations are accessed simultaneously with its physical relations as well as with its own identification.


Graphical Models / Graphical Models and Image Processing / Computer Vision, Graphics, and Image Processing | 1985

Human image understanding: Recent research and a theory

Irving Biederman

When a briefly presented real-world scene was jumbled, the accuracy of identifying a single, cued object was less than that when the scene was coherent. Jumbling remained an effective variable even when the subject knew where to look and what to look for. Thus an object's meaningful context may affect the course of perceptual recognition and not just peripheral scanning or memory.


Cognitive Psychology | 1988

Surface versus Edge-Based Determinants of Visual Recognition

Irving Biederman; Ginny Ju

The perceptual recognition of objects is conceptualized to be a process in which the image of the input is segmented at regions of deep concavity into simple volumetric components, such as blocks, cylinders, wedges, and cones. The fundamental assumption of the proposed theory, recognition-by-components (RBC), is that a modest set of components [N probably ≤ 36] can be derived from contrasts of five readily detectable properties of edges in a 2-dimensional image: curvature, collinearity, symmetry, parallelism, and cotermination. The detection of these properties is generally invariant over viewing position and image quality and consequently allows robust object perception when the image is projected from a novel viewpoint or degraded. RBC thus provides a principled account of the heretofore undecided relation between the classic principles of perceptual organization and pattern recognition: The constraints toward regularization (Pragnanz) characterize not the complete object but the object's components. A principle of componential recovery can account for the major phenomena of object recognition: If an arrangement of two or three primitive components can be recovered from the input, objects can be quickly recognized even when they are occluded, rotated in depth, novel, or extensively degraded. The results from experiments on the perception of briefly presented pictures by human observers provide empirical support for the theory.


American Journal of Psychology | 1976

Mental set and mental shift revisited

Amos Spector; Irving Biederman

Two roles hypothesized for surface characteristics, such as color, brightness, and texture, in object recognition are that such information can (a) define the gradients needed for a 2½-D sketch so that a 3-D representation can be derived (e.g., Marr & Nishihara, 1978) and (b) provide additional distinctive features for accessing memory. In a series of five experiments, subjects either named or verified (against a target name) brief (50–100 ms) presentations of slides of common objects. Each object was shown in two versions: professionally photographed in full color or as a simplified line drawing showing only the object's major components (which typically corresponded to its parts). Although one or the other type of picture would be slightly favored in a particular condition of exposure (duration or masking), overall mean reaction times and error rates were virtually identical for the two types of stimuli. These results support a view that edge-based representations mediate real-time object recognition in contrast to surface gradient or multiple cue representations. A previously unexplored distinction of color diagnosticity allowed us to determine whether color (and brightness) was employed as an additional feature in accessing memory for those objects or conditions where there might have been an advantage for the color slides. For some objects, e.g., banana, fork, fish, and camera, color is diagnostic as to the object's classification. For other objects, e.g., chair, pen, mitten, and bicycle pump, color is not diagnostic, as such objects can be of any color. If color was employed in accessing memory, color-diagnostic objects should have shown a relative advantage when presented as color slides compared to the line drawing versions of the same objects. Also, this advantage would be magnified when subjects could anticipate the color of an object in the verification task, particularly on NO trials when the foil was of a different color.
Neither an overall advantage for color-diagnostic objects when presented in color nor a magnification of a relative advantage on the NO trials in the verification task was obtained. Although differences in surface characteristics such as color, brightness, and texture can be instrumental in defining edges and can provide cues for visual search, they play only a secondary role in the real-time recognition of an intact object when its edges can be readily extracted.


Perception | 1991

Evidence for Complete Translational and Reflectional Invariance in Visual Object Priming

Irving Biederman; Eric E. Cooper

In 1927, Jersild found that alternately subtracting 3 from a two-digit number and giving the common opposite to a word in a mixed list of numbers and words was faster than the average speed of subtracting 3s from a pure list of numbers and giving the opposites to a pure list of words. Experiment I replicated those findings: mixed lists were slightly, albeit nonsignificantly, faster than pure lists. Experiments II, III, and IV were designed to determine why changes of set did not slow performance on mixed lists: the results suggest that a shift of operations will take little or no time if the stimulus can serve as a retrieval cue for the operation to be performed on it. But changes of set will have a large effect when the selection of the appropriate operation requires that one keep track of previously performed operations.


Cognitive Psychology | 1991

Priming Contour-Deleted Images: Evidence for Intermediate Representations in Visual Object Recognition

Irving Biederman; Eric E. Cooper

The magnitude of priming on naming reaction times and on the error rates, resulting from the perception of a briefly presented picture of an object approximately 7 min before the primed object, was found to be independent of whether the primed object was originally viewed in the same hemifield, left-right or upper-lower, or in the same left-right orientation. Performance for same-name, different-exemplar images was worse than for identical images, indicating that not only was there priming from block one to block two, but that some of the priming was visual, rather than purely verbal or conceptual. These results provide evidence for complete translational and reflectional invariance in the representation of objects for purposes of visual recognition. Explicit recognition memory for position and orientation was above chance, suggesting that the representation of objects for recognition is independent of the representations of the location and left-right orientation of objects in space.


Journal of Experimental Psychology: Learning, Memory and Cognition | 1987

Sexing Day-Old Chicks: A Case Study and Expert Systems Analysis of a Difficult Perceptual-Learning Task

Irving Biederman; Margaret M. Shiffrar

The speed and accuracy of perceptual recognition of a briefly presented picture of an object is facilitated by its prior presentation. Picture priming tasks were used to assess whether the facilitation is a function of the repetition of: (a) the object's image features (viz., vertices and edges), (b) the object model (e.g., that it is a grand piano), or (c) a representation intermediate between (a) and (b) consisting of convex or singly concave components of the object, roughly corresponding to the object's parts. Subjects viewed pictures with half their contour removed by deleting either (a) every other image feature from each part, or (b) half the components. On a second (primed) block of trials, subjects saw: (a) the identical image that they viewed on the first block, (b) the complement which had the missing contours, or (c) a same name-different exemplar of the object class (e.g., a grand piano when an upright piano had been shown on the first block). With deletion of features, speed and accuracy of naming identical and complementary images were equivalent, indicating that none of the priming could be attributed to the features actually present in the image. Performance with both types of image enjoyed an advantage over that with the different exemplars, establishing that the priming was visual rather than verbal or conceptual. With deletion of the components, performance with identical images was much better than that with their complements. The latter were equivalent to the different exemplars, indicating that all the visual priming of an image of an object is through the activation of a representation of its components in specified relations.
In terms of a recent neural net implementation of object recognition (Hummel & Biederman, in press), the results suggest that the locus of object priming may be at changes in the weight matrix for a geon assembly layer, where units have self-organized to represent combinations of convex or singly concave components (or geons) and their attributes (e.g., aspect ratio, orientation, and relations with other geons such as TOP-OF). The results of these experiments provide evidence for the psychological reality of intermediate representations in real-time visual object recognition.

Collaboration


Dive into Irving Biederman's collaborations.

Top Co-Authors

Xiaokun Xu

University of Southern California

Mark D. Lescroart

University of Southern California

Ori Amir

University of Southern California

Jiye G. Kim

University of Southern California

Rufin Vogels

Katholieke Universiteit Leuven
