Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Donald Geman is active.

Publication


Featured research published by Donald Geman.


Journal of Applied Statistics | 1993

Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images

Stuart Geman; Donald Geman

We make an analogy between images and statistical mechanics systems. Pixel gray levels and the presence and orientation of edges are viewed as states of atoms or molecules in a lattice-like physical system. The assignment of an energy function in the physical system determines its Gibbs distribution. Because of the equivalence between Gibbs distributions and Markov random fields (MRFs), this assignment also determines an MRF image model. The energy function is a more convenient and natural mechanism for embodying picture attributes than are the local characteristics of the MRF. For a range of degradation mechanisms, including blurring, non-linear deformations, and multiplicative or additive noise, the posterior distribution is an MRF with a structure akin to the image model. By the analogy, the posterior distribution defines another (imaginary) physical system. Gradual temperature reduction in the physical system isolates low-energy states ('annealing') or, what is the same thing, the most probable states under the Gibbs distribution.
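
The core computational step described here, stochastic relaxation with a slowly decreasing temperature, is easy to sketch. Below is a minimal, illustrative Gibbs sampler for a binary (±1) image under an Ising smoothness prior with a Gaussian data term; the energy function, 4-neighbor system, cooling schedule, and parameter values are assumptions made for the sketch, not the paper's exact model.

```python
import numpy as np

def anneal_binary_image(y, n_sweeps=50, beta=1.0, sigma=0.5):
    """Simulated annealing via Gibbs sampling for MAP restoration of a
    binary (+/-1) image y under an Ising smoothness prior (illustrative)."""
    rng = np.random.default_rng(0)
    x = y.astype(float).copy()
    h, w = x.shape
    for sweep in range(n_sweeps):
        T = 1.0 / np.log(2.0 + sweep)          # logarithmic cooling schedule
        for i in range(h):
            for j in range(w):
                nb = 0.0                        # sum over the 4-neighborhood
                if i > 0:     nb += x[i - 1, j]
                if i < h - 1: nb += x[i + 1, j]
                if j > 0:     nb += x[i, j - 1]
                if j < w - 1: nb += x[i, j + 1]
                # energy difference between x[i,j] = +1 and -1:
                # Ising prior term plus Gaussian-noise data term
                dE = -2.0 * (beta * nb + y[i, j] / sigma**2)
                p_plus = 1.0 / (1.0 + np.exp(dE / T))
                x[i, j] = 1.0 if rng.random() < p_plus else -1.0
    return x
```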


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1992

Constrained restoration and the recovery of discontinuities

Donald Geman; George Reynolds

The linear image restoration problem is to recover an original brightness distribution X⁰ given the blurred and noisy observations Y = KX⁰ + B, where K and B represent the point spread function and measurement error, respectively. This problem is typical of ill-conditioned inverse problems that frequently arise in low-level computer vision. A conventional method to stabilize the problem is to introduce a priori constraints on X⁰ and design a cost functional H(X) over images X, which is a weighted average of the prior constraints (regularization term) and posterior constraints (data term); the reconstruction is then the image X that minimizes H. A prominent weakness in this approach, especially with quadratic-type stabilizers, is the difficulty in recovering discontinuities. The authors therefore examine prior smoothness constraints of a different form, which permit the recovery of discontinuities without introducing auxiliary variables for marking the location of jumps and suspending the constraints in their vicinity. In this sense, discontinuities are addressed implicitly rather than explicitly.
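
As a hedged illustration of this setup, the sketch below runs gradient descent on a cost functional of the described form: a quadratic data term plus a bounded (hence discontinuity-preserving) smoothness penalty. The particular penalty ρ(t) = t²/(1 + t²), the scale δ, and the step size are illustrative choices, not the authors' exact stabilizer.

```python
import numpy as np

def restore(y, K, lam=0.1, delta=1.0, steps=200, lr=0.25):
    """Gradient descent on H(x) = ||Kx - y||^2 + lam * sum rho(Dx / delta),
    with the bounded penalty rho(t) = t^2 / (1 + t^2).  K = (apply, adjoint)
    gives the blur operator and its transpose as callables."""
    Kf, Kt = K
    x = Kt(y).copy()
    for _ in range(steps):
        g = 2.0 * Kt(Kf(x) - y)                          # data-term gradient
        for axis in (0, 1):                              # penalty on first differences
            d = np.diff(x, axis=axis) / delta
            gd = (2.0 * d / (1.0 + d**2) ** 2) / delta   # rho'(d), chain rule
            pad = [(1, 1) if a == axis else (0, 0) for a in (0, 1)]
            g -= lam * np.diff(np.pad(gd, pad), axis=axis)  # -div of rho'(Dx)
        x -= lr * g
    return x
```

With no blur, pass the identity pair K = (lambda v: v, lambda v: v); a real point spread function would supply its convolution and correlation.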


Nature Reviews Genetics | 2010

Tackling the widespread and critical impact of batch effects in high-throughput data

Jeffrey T. Leek; Robert B. Scharpf; Héctor Corrada Bravo; David Simcha; Benjamin Langmead; W. Evan Johnson; Donald Geman; Keith A. Baggerly; Rafael A. Irizarry

High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.
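
A toy simulation (not from the paper) of the failure mode described: when batch is perfectly confounded with the outcome, ordinary differential tests flag many genes even though none is truly associated with the outcome. The sizes and effect scales below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_genes, n = 500, 10
# Worst case: group 0 was processed entirely in batch A, group 1 in batch B,
# so a per-gene batch offset masquerades as a group difference.
batch_shift = rng.normal(0.0, 1.0, size=n_genes)
g0 = rng.normal(size=(n, n_genes))
g1 = rng.normal(size=(n, n_genes)) + batch_shift

# Naive two-sample t statistics: a large fraction of genes look "differential"
t = (g1.mean(0) - g0.mean(0)) / np.sqrt(g1.var(0) / n + g0.var(0) / n)
print(f"fraction with |t| > 2: {(np.abs(t) > 2).mean():.2f}")
```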


Neural Computation | 1997

Shape quantization and recognition with randomized trees

Yali Amit; Donald Geman

We explore a new approach to shape recognition based on a virtually infinite family of binary features (queries) of the image data, designed to accommodate prior information about shape invariance and regularity. Each query corresponds to a spatial arrangement of several local topographic codes (or tags), which are in themselves too primitive and common to be informative about shape. All the discriminating power derives from relative angles and distances among the tags. The important attributes of the queries are a natural partial ordering corresponding to increasing structure and complexity; semi-invariance, meaning that most shapes of a given class will answer the same way to two queries that are successive in the ordering; and stability, since the queries are not based on distinguished points and substructures. No classifier based on the full feature set can be evaluated, and it is impossible to determine a priori which arrangements are informative. Our approach is to select informative features and build tree classifiers at the same time by inductive learning. In effect, each tree provides an approximation to the full posterior where the features chosen depend on the branch that is traversed. Due to the number and nature of the queries, standard decision tree construction based on a fixed-length feature vector is not feasible. Instead we entertain only a small random sample of queries at each node, constrain their complexity to increase with tree depth, and grow multiple trees. The terminal nodes are labeled by estimates of the corresponding posterior distribution over shape classes. An image is classified by sending it down every tree and aggregating the resulting distributions. The method is applied to classifying handwritten digits and synthetic linear and nonlinear deformations of three hundred symbols. State-of-the-art error rates are achieved on the National Institute of Standards and Technology database of digits. The principal goal of the experiments on symbols is to analyze invariance, generalization error and related issues, and a comparison with artificial neural network methods is presented in this context.
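
A compact sketch of the scheme described: binary features (queries), only a small random sample of which is examined at each node, with terminal posteriors averaged across trees. The split score below is a crude balance criterion standing in for an empirical information-gain measure, and the paper's constraint that query complexity grow with depth is omitted; both are assumptions of the sketch.

```python
import numpy as np

def grow_tree(X, y, n_classes, depth, n_queries, rng):
    """Grow one randomized tree on binary features X (0/1) and labels y,
    examining only a small random sample of queries at each node."""
    if depth == 0 or len(np.unique(y)) == 1:
        return np.bincount(y, minlength=n_classes) / len(y)   # leaf posterior
    best, best_q = 0.0, None
    for q in rng.choice(X.shape[1], size=n_queries, replace=False):
        p = X[:, q].mean()
        score = p * (1 - p)       # crude balance score (stand-in for info gain)
        if score > best:
            best, best_q = score, q
    if best_q is None:            # every sampled query was constant here
        return np.bincount(y, minlength=n_classes) / len(y)
    m = X[:, best_q] == 1
    return (best_q,
            grow_tree(X[~m], y[~m], n_classes, depth - 1, n_queries, rng),
            grow_tree(X[m], y[m], n_classes, depth - 1, n_queries, rng))

def classify(trees, x, n_classes):
    """Send x down every tree and average the terminal posteriors."""
    agg = np.zeros(n_classes)
    for t in trees:
        while isinstance(t, tuple):
            q, left, right = t
            t = right if x[q] == 1 else left
        agg += t
    return agg / len(trees)
```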


IEEE Transactions on Image Processing | 1995

Nonlinear image recovery with half-quadratic regularization

Donald Geman; Chengda Yang

One popular method for the recovery of an ideal intensity image from corrupted or indirect measurements is regularization: minimize an objective function that enforces a roughness penalty in addition to coherence with the data. Linear estimates are relatively easy to compute but generally introduce systematic errors; for example, they are incapable of recovering discontinuities and other important image attributes. In contrast, nonlinear estimates are more accurate but are often far less accessible. This is particularly true when the objective function is nonconvex, and the distribution of each data component depends on many image components through a linear operator with broad support. Our approach is based on an auxiliary array and an extended objective function in which the original variables appear quadratically and the auxiliary variables are decoupled. Minimizing over the auxiliary array alone yields the original function so that the original image estimate can be obtained by joint minimization. This can be done efficiently by Monte Carlo methods, for example by FFT-based annealing using a Markov chain that alternates between (global) transitions from one array to the other. Experiments are reported in optical astronomy, with space telescope data, and computed tomography.
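
A minimal illustrative version of the half-quadratic idea, in 1-D with an identity forward operator: with the auxiliary weights fixed, the image update is a quadratic (here tridiagonal) solve; with the image fixed, the auxiliary array has a closed form. The paper carries out the joint minimization with FFT-based Monte Carlo annealing; the deterministic alternation and the penalty ρ(t) = t²/(1 + t²) below are simpler stand-ins.

```python
import numpy as np

def half_quadratic(y, lam=0.1, iters=30):
    """Alternating minimization over a 1-D signal x and decoupled auxiliary
    weights b; the extended objective is quadratic in x for fixed b."""
    x = y.copy()
    n = len(x)
    idx = np.arange(n - 1)
    for _ in range(iters):
        d = np.diff(x)
        b = 1.0 / (1.0 + d**2) ** 2          # closed-form auxiliary update
        # x-update: solve (I + lam * D^T diag(b) D) x = y, a tridiagonal system
        A = np.eye(n)
        A[idx, idx] += lam * b
        A[idx + 1, idx + 1] += lam * b
        A[idx, idx + 1] -= lam * b
        A[idx + 1, idx] -= lam * b
        x = np.linalg.solve(A, y)
    return x
```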


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1990

Boundary detection by constrained optimization

Donald Geman; Stuart Geman; Christine Graffigne; Ping Dong

A statistical framework is used for finding boundaries and for partitioning scenes into homogeneous regions. The model is a joint probability distribution for the array of pixel gray levels and an array of labels. In boundary finding, the labels are binary (zero or one), representing the absence or presence of boundary elements. In partitioning, the label values are generic: two labels are the same when the corresponding scene locations are considered to belong to the same region. The distribution incorporates a measure of disparity between certain spatial features of block pairs of pixel gray levels, using the Kolmogorov-Smirnov nonparametric measures of difference between the distributions of these features. The number of model parameters is minimized by forbidding certain label configurations, which are assigned probability zero. The maximum a posteriori estimator of boundary placements and partitionings is examined. The forbidden states introduce constraints into the calculation of these configurations. Stochastic relaxation methods are extended to accommodate constrained optimization.
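
The disparity measure itself is easy to make concrete: the two-sample Kolmogorov-Smirnov distance between the empirical gray-level (or feature) distributions of two pixel blocks. A small self-contained version follows; scipy.stats.ks_2samp computes the same statistic.

```python
import numpy as np

def ks_disparity(block_a, block_b):
    """Two-sample Kolmogorov-Smirnov distance between the empirical
    distributions of the values in two pixel blocks."""
    a = np.sort(block_a.ravel())
    b = np.sort(block_b.ravel())
    grid = np.concatenate([a, b])    # the max is attained at a sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.abs(cdf_a - cdf_b).max()
```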


International Journal of Computer Vision | 2001

Coarse-to-Fine Face Detection

François Fleuret; Donald Geman

We study visual selection: detect and roughly localize all instances of a generic object class, such as a face, in a greyscale scene, measuring performance in terms of computation and false alarms. Our approach is sequential testing, which is coarse-to-fine both in the exploration of poses and in the representation of objects. All the tests are binary and indicate the presence or absence of loose spatial arrangements of oriented edge fragments. Starting from training examples, we recursively find larger and larger arrangements which are “decomposable,” which implies that the probability of an arrangement appearing on an object decays slowly with its size. Detection means finding a sufficient number of arrangements of each size along a decreasing sequence of pose cells. At the beginning, the tests are simple and universal, accommodating many poses simultaneously, but the false alarm rate is relatively high. Eventually, the tests are more discriminating, but also more complex and dedicated to specific poses. As a result, the spatial distribution of processing is highly skewed and detection is rapid, but at the expense of (isolated) false alarms which, presumably, could be eliminated with localized, more intensive, processing.
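
The control structure, though not the tests themselves, fits in a few lines. Below is an illustrative skeleton of the coarse-to-fine search: a cheap, permissive test guards each coarse pose cell, and only cells that pass are refined and re-tested with more dedicated tests. The PoseCell container and the test_for factory are hypothetical names introduced for the sketch.

```python
from collections import namedtuple

# Hypothetical pose-cell node: a pose range plus its finer subdivisions.
PoseCell = namedtuple("PoseCell", ["bounds", "children"])

def detect(region, cell, test_for, level=0):
    """Coarse-to-fine search over a pose-cell hierarchy.  test_for(cell, level)
    returns a binary test; deeper levels use more dedicated, costlier tests."""
    if not test_for(cell, level)(region):
        return []                  # one cheap test prunes this entire subtree
    if not cell.children:
        return [cell]              # finest cell passed every level: detection
    hits = []
    for child in cell.children:
        hits += detect(region, child, test_for, level + 1)
    return hits
```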


Statistical Applications in Genetics and Molecular Biology | 2004

Classifying Gene Expression Profiles from Pairwise mRNA Comparisons

Donald Geman; Christian d'Avignon; Daniel Q. Naiman; Raimond L. Winslow

We present a new approach to molecular classification based on mRNA comparisons. Our method, referred to as the top-scoring pair(s) (TSP) classifier, is motivated by current technical and practical limitations in using gene expression microarray data for class prediction, for example to detect disease, identify tumors or predict treatment response. Accurate statistical inference from such data is difficult due to the small number of observations, typically tens, relative to the large number of genes, typically thousands. Moreover, conventional methods from machine learning lead to decisions which are usually very difficult to interpret in simple or biologically meaningful terms. In contrast, the TSP classifier provides decision rules which i) involve very few genes and only relative expression values (e.g., comparing the mRNA counts within a single pair of genes); ii) are both accurate and transparent; and iii) provide specific hypotheses for follow-up studies. In particular, the TSP classifier achieves prediction rates with standard cancer data that are as high as those of previous studies which use considerably more genes and complex procedures. Finally, the TSP classifier is parameter-free, thus avoiding the type of over-fitting and inflated estimates of performance that result when all aspects of learning a predictor are not properly cross-validated.
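
The decision rule is simple enough to state in full: find the gene pair (i, j) whose ordering X_i < X_j best separates the two classes, then classify a new profile by which way that single comparison goes. A minimal sketch follows; the all-pairs comparison is fine for small panels, while genome-scale data would need a memory-careful loop.

```python
import numpy as np

def fit_tsp(X0, X1):
    """Top-scoring pair: maximize |P(gene_i < gene_j | class 0) -
    P(gene_i < gene_j | class 1)| over all gene pairs.
    X0, X1: samples-by-genes expression matrices for the two classes."""
    p0 = (X0[:, :, None] < X0[:, None, :]).mean(axis=0)   # P(i < j) in class 0
    p1 = (X1[:, :, None] < X1[:, None, :]).mean(axis=0)   # P(i < j) in class 1
    i, j = np.unravel_index(np.argmax(np.abs(p0 - p1)), p0.shape)
    return i, j, p0[i, j] > p1[i, j]       # direction that favors class 0

def predict_tsp(x, rule):
    """Classify a single expression profile x with one pairwise comparison."""
    i, j, class0_if_less = rule
    return 0 if (x[i] < x[j]) == class0_if_less else 1
```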


IEEE Transactions on Pattern Analysis and Machine Intelligence | 1984

Bayes Smoothing Algorithms for Segmentation of Binary Images Modeled by Markov Random Fields

Haluk Derin; H. Elliott; Roberto Cristi; Donald Geman

A new image segmentation algorithm is presented, based on recursive Bayes smoothing of images modeled by Markov random fields and corrupted by independent additive noise. The Bayes smoothing algorithm yields the a posteriori distribution of the scene value at each pixel, given the total noisy image, in a recursive way. The a posteriori distribution together with a criterion of optimality then determine a Bayes estimate of the scene. The algorithm presented is an extension of a 1-D Bayes smoothing algorithm to 2-D and it gives the optimum Bayes estimate for the scene value at each pixel. Computational concerns in 2-D, however, necessitate certain simplifying assumptions on the model and approximations on the implementation of the algorithm. In particular, the scene (noiseless image) is modeled as a Markov mesh random field, a special class of Markov random fields, and the Bayes smoothing algorithm is applied on overlapping strips (horizontal/vertical) of the image consisting of several rows (columns). It is assumed that the signal (scene values) vector sequence along the strip is a vector Markov chain. Since signal correlation in one of the dimensions is not fully used along the edges of the strip, estimates are generated only along the middle sections of the strips. The overlapping strips are chosen such that the union of the middle sections of the strips gives the whole image. The Bayes smoothing algorithm presented here is valid for scene random fields consisting of multilevel (discrete) or continuous random variables.
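
The 1-D recursion at the core of the method is the classic forward-backward smoother. A minimal chain version is below; the 2-D algorithm applies this idea along overlapping strips of a Markov mesh, which the sketch does not attempt.

```python
import numpy as np

def smooth_chain(obs_lik, trans, prior):
    """Forward-backward (Bayes) smoothing on a 1-D Markov chain: returns
    P(state_t | all observations) for every t.
    obs_lik[t, s] = P(y_t | state s); trans[s, s'] = P(s' | s)."""
    n, k = obs_lik.shape
    fwd = np.zeros((n, k))
    fwd[0] = prior * obs_lik[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, n):                        # forward (filtering) pass
        fwd[t] = (fwd[t - 1] @ trans) * obs_lik[t]
        fwd[t] /= fwd[t].sum()
    post = np.zeros((n, k))
    post[-1] = fwd[-1]
    bwd = np.ones(k)
    for t in range(n - 2, -1, -1):               # backward pass
        bwd = trans @ (obs_lik[t + 1] * bwd)
        bwd /= bwd.sum()                         # rescale to avoid underflow
        post[t] = fwd[t] * bwd
        post[t] /= post[t].sum()
    return post
```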


Neural Computation | 1999

A computational model for visual selection

Yali Amit; Donald Geman

We propose a computational model for detecting and localizing instances from an object class in static gray-level images. We divide detection into visual selection and final classification, concentrating on the former: drastically reducing the number of candidate regions that require further, usually more intensive, processing, but with a minimum of computation and missed detections. Bottom-up processing is based on local groupings of edge fragments constrained by loose geometrical relationships. They have no a priori semantic or geometric interpretation. The role of training is to select special groupings that are moderately likely at certain places on the object but rare in the background. We show that the statistics in both populations are stable. The candidate regions are those that contain global arrangements of several local groupings. Whereas our model was not conceived to explain brain functions, it does cohere with evidence about the functions of neurons in V1 and V2, such as responses to coarse or incomplete patterns (e.g., illusory contours) and to scale and translation invariance in IT. Finally, the algorithm is applied to face and symbol detection.
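
The selection step lends itself to a short sketch: estimate each local grouping's frequency on object and background training populations, keep the groupings that are moderately likely on the object but rare in the background, and pass along regions containing enough of them. The thresholds below are illustrative assumptions.

```python
import numpy as np

def select_groupings(obj_hits, bg_hits, p_min=0.5, q_max=0.05):
    """obj_hits[s, g], bg_hits[s, g]: did local grouping g occur in training
    sample s of the object / background population (boolean matrices)?"""
    p = obj_hits.mean(axis=0)              # frequency on object examples
    q = bg_hits.mean(axis=0)               # frequency on background examples
    return np.flatnonzero((p >= p_min) & (q <= q_max))

def candidate_regions(region_hits, selected, k_min=3):
    """A region survives selection when it contains a global arrangement of
    at least k_min of the selected local groupings."""
    return np.flatnonzero(region_hits[:, selected].sum(axis=1) >= k_min)
```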

Collaboration


Dive into Donald Geman's collaborations.

Top Co-Authors

Joseph Horowitz (University of Massachusetts Amherst)
Laurent Younes (Johns Hopkins University)
Bahman Afsari (Johns Hopkins University)
Yali Amit (University of Chicago)
Elana J. Fertig (Johns Hopkins University School of Medicine)
Luigi Marchionni (Johns Hopkins University School of Medicine)
Bruno Jedynak (Johns Hopkins University)