Herbert K. H. Lee
University of California, Santa Cruz
Publications
Featured research published by Herbert K. H. Lee.
Journal of the American Statistical Association | 2008
Robert B. Gramacy; Herbert K. H. Lee
Motivated by a computer experiment for the design of a rocket booster, this article explores nonstationary modeling methodologies that couple stationary Gaussian processes with treed partitioning. Partitioning is a simple but effective method for dealing with nonstationarity. The methodological developments and statistical computing details that make this approach efficient are described in detail. In addition to providing an analysis of the rocket booster simulator, we show that our approach is effective in other arenas as well.
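A minimal sketch of the treed-partition idea, assuming a fixed, hand-chosen split and scikit-learn GPs in place of the paper's fully Bayesian treed GP; the one-dimensional data and split point are illustrative only, not the rocket-booster example.

```python
# Minimal sketch of the treed-GP idea: split the input space at an
# (assumed, fixed) axis-aligned point and fit an independent stationary
# GP in each region.  The real method infers the tree itself.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(80, 1))
# Toy nonstationary response: nearly flat on one side, wiggly on the other.
y = np.where(X[:, 0] < 1.0, 0.1 * X[:, 0],
             np.sin(10.0 * X[:, 0])) + 0.05 * rng.standard_normal(80)

split = 1.0                                   # hypothetical, fixed partition point
models = {}
for name, mask in [("left", X[:, 0] < split), ("right", X[:, 0] >= split)]:
    gp = GaussianProcessRegressor(
        kernel=RBF(length_scale=0.3) + WhiteKernel(noise_level=1e-3),
        normalize_y=True)
    models[name] = gp.fit(X[mask], y[mask])

xx = np.linspace(0.0, 2.0, 200).reshape(-1, 1)
mean = np.where(xx[:, 0] < split,
                models["left"].predict(xx),
                models["right"].predict(xx))
```

Each region gets its own fitted length-scale and noise level, which is how stationary pieces combine into a nonstationary whole; the actual method averages over trees with Markov chain Monte Carlo rather than fixing one split.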
Technometrics | 2009
Robert B. Gramacy; Herbert K. H. Lee
Computer experiments often are performed to allow modeling of a response surface of a physical experiment that can be too costly or difficult to run except by using a simulator. Running the experiment over a dense grid can be prohibitively expensive, yet running over a sparse design chosen in advance can result in insufficient information in parts of the space, particularly when the surface calls for a nonstationary model. We propose an approach that automatically explores the space while simultaneously fitting the response surface, using predictive uncertainty to guide subsequent experimental runs. We use the newly developed Bayesian treed Gaussian process as the surrogate model; a fully Bayesian approach allows explicit measures of uncertainty. We develop an adaptive sequential design framework to cope with an asynchronous, random, agent-based supercomputing environment by using a hybrid approach that melds optimal strategies from the statistics literature with flexible strategies from the active learning literature. The merits of this approach are borne out in several examples, including the motivating computational fluid dynamics simulation of a rocket booster.
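The guiding principle can be sketched in a few lines: refit a surrogate, then run the next simulation where predictive uncertainty is largest. The loop below uses a plain scikit-learn GP, a made-up one-dimensional simulator, and a pure maximum-variance rule, so it is only a toy stand-in for the paper's treed-GP surrogate and hybrid design strategy.

```python
# Toy adaptive sequential design: start from a small design, refit a GP
# surrogate, and add the candidate point with the largest predictive
# standard deviation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulator(x):                      # hypothetical cheap stand-in simulator
    return np.sin(5.0 * x) / (1.0 + x)

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 2.0, size=(6, 1))        # small initial design
y = simulator(X[:, 0])
candidates = np.linspace(0.0, 2.0, 200).reshape(-1, 1)

for step in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(length_scale=0.3, nu=2.5),
                                  alpha=1e-8, normalize_y=True).fit(X, y)
    _, sd = gp.predict(candidates, return_std=True)
    x_new = candidates[np.argmax(sd)].reshape(1, 1)   # most uncertain location
    X = np.vstack([X, x_new])
    y = np.append(y, simulator(x_new[0, 0]))
```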
Statistics and Computing | 2012
Robert B. Gramacy; Herbert K. H. Lee
Most surrogate models for computer experiments are interpolators, and the most common interpolator is a Gaussian process (GP) that deliberately omits a small-scale (measurement) error term called the nugget. The explanation is that computer experiments are, by definition, “deterministic”, and so there is no measurement error. We think this is too narrow a focus for computer experiments and a statistically inefficient way to model them. We show that estimating a (non-zero) nugget can lead to surrogate models with better statistical properties, such as predictive accuracy and coverage, in a variety of common situations.
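A small illustration of the comparison being made, assuming a toy deterministic "simulator" and scikit-learn GPs: fit the same surrogate with an essentially zero nugget and with an estimated one, then check empirical coverage of 90% predictive intervals on held-out runs.

```python
# Compare a (near-)interpolating GP with one that estimates a nugget,
# on a deterministic but rough response.  Toy data, not the paper's experiments.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
def f(x):                              # deterministic but rough "simulator"
    return np.sin(8.0 * x) + 0.3 * np.sin(37.0 * x)

X = rng.uniform(0.0, 1.0, size=(40, 1));  y = f(X[:, 0])
Xt = rng.uniform(0.0, 1.0, size=(200, 1)); yt = f(Xt[:, 0])

fits = {
    "no nugget": GaussianProcessRegressor(kernel=RBF(0.1), alpha=1e-8),  # tiny jitter only
    "nugget":    GaussianProcessRegressor(kernel=RBF(0.1) + WhiteKernel(1e-2)),
}
for name, gp in fits.items():
    gp.fit(X, y)
    mu, sd = gp.predict(Xt, return_std=True)
    cover = np.mean(np.abs(yt - mu) <= 1.645 * sd)   # empirical 90% coverage
    print(f"{name}: empirical coverage = {cover:.2f}")
```

The estimated nugget soaks up small-scale structure the surrogate cannot resolve, which is what tends to improve coverage in practice.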
Physical Review Letters | 2010
Tracy Holsclaw; Ujjaini Alam; Bruno Sansó; Herbert K. H. Lee; Katrin Heitmann; Salman Habib; David Higdon
Understanding the origin of the accelerated expansion of the Universe poses one of the greatest challenges in physics today. Lacking a compelling fundamental theory to test, observational efforts are targeted at a better characterization of the underlying cause. If a new form of mass-energy, dark energy, is driving the acceleration, the redshift evolution of the equation of state parameter w(z) will hold essential clues as to its origin. To best exploit data from observations it is necessary to develop a robust and accurate reconstruction approach, with controlled errors, for w(z). We introduce a new, nonparametric method for solving the associated statistical inverse problem based on Gaussian process modeling and Markov chain Monte Carlo sampling. Applying this method to recent supernova measurements, we reconstruct the continuous history of w out to redshift z=1.5.
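As a rough illustration of the nonparametric ingredient only: the sketch below places a GP over w(z) and smooths hypothetical noisy point estimates with scikit-learn. The actual analysis is harder, because w(z) enters supernova data only through integrals and is reconstructed via Markov chain Monte Carlo.

```python
# Toy sketch of GP-based reconstruction of w(z) from hypothetical direct,
# noisy estimates.  This shows only the GP-over-a-function ingredient,
# not the paper's full statistical inverse problem.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
z = np.sort(rng.uniform(0.0, 1.5, size=30)).reshape(-1, 1)
w_true = -1.0 + 0.2 * z[:, 0]                   # hypothetical evolving w(z)
w_obs = w_true + 0.1 * rng.standard_normal(30)  # hypothetical noisy estimates

gp = GaussianProcessRegressor(
    kernel=RBF(length_scale=0.5) + WhiteKernel(noise_level=0.01),
    normalize_y=True).fit(z, w_obs)
zz = np.linspace(0.0, 1.5, 100).reshape(-1, 1)
w_mean, w_sd = gp.predict(zz, return_std=True)  # reconstruction with error bands
```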
Technometrics | 2002
Herbert K. H. Lee; David Higdon; Zhuoxin Bi; Marco A. R. Ferreira; Mike West
We give an approach for using flow information from a system of wells to characterize hydrologic properties of an aquifer. In particular, we consider experiments where an impulse of tracer fluid is injected along with the water at the input wells and its concentration is recorded over time at the uptake wells. We focus on characterizing the spatially varying permeability field, which is a key attribute of the aquifer for determining flow paths and rates for a given flow experiment. As is standard for estimation from such flow data, we use complicated subsurface flow code that simulates the fluid flow through the aquifer for a particular well configuration and aquifer specification, in particular the permeability field over a grid. The solution to this ill-posed problem requires that some regularity conditions be imposed on the permeability field. Typically, this regularity is accomplished by specifying a stationary Gaussian process model for the permeability field. Here we use an intrinsically stationary Markov random field, which compares favorably to Gaussian process models and offers some additional flexibility and computational advantages. Our interest in quantifying uncertainty leads us to take a Bayesian approach, using Markov chain Monte Carlo for exploring the high-dimensional posterior distribution. We demonstrate our approach with several examples. We also note that the methodology is general and is not specific to hydrology applications.
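To give a concrete sense of the prior, here is a sketch of drawing a gridded log-permeability field from an intrinsic first-order Gaussian Markov random field whose precision matrix is a scaled graph Laplacian; the grid size and precision scale are arbitrary, and the expensive flow simulator and the MCMC around it are omitted.

```python
# Intrinsic first-order Gaussian MRF prior on an n x n grid:
# precision = tau * (graph Laplacian).  The Laplacian is singular (the
# "intrinsic" part), so we draw from the prior in its non-null eigenspace.
import numpy as np

n = 16                                    # n x n permeability grid (arbitrary)
N = n * n
idx = np.arange(N).reshape(n, n)
Q = np.zeros((N, N))                      # graph Laplacian
for i in range(n):
    for j in range(n):
        for di, dj in [(1, 0), (0, 1)]:
            if i + di < n and j + dj < n:
                a, b = idx[i, j], idx[i + di, j + dj]
                Q[a, a] += 1; Q[b, b] += 1
                Q[a, b] -= 1; Q[b, a] -= 1

tau = 4.0                                 # hypothetical precision scale
vals, vecs = np.linalg.eigh(tau * Q)
keep = vals > 1e-8                        # drop the constant null eigenvector
z = np.random.default_rng(4).standard_normal(keep.sum())
field = vecs[:, keep] @ (z / np.sqrt(vals[keep]))   # one prior draw
log_perm = field.reshape(n, n)
```

Because the precision matrix is sparse and local, evaluating the prior inside an MCMC loop is cheap, which is one of the computational advantages mentioned above.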
IEEE Transactions on Signal Processing | 2002
Dave Higdon; Herbert K. H. Lee; Zhuoxin Bi
The Bayesian approach allows one to easily quantify uncertainty, at least in theory. In practice, however, the Markov chain Monte Carlo (MCMC) method can be computationally expensive, particularly in complicated inverse problems. We present a methodology for improving the speed and efficiency of an MCMC analysis by combining runs on different scales. By using a coarser scale, the chain can run faster (particularly when there is an external forward simulator involved in the likelihood evaluation) and better explore the posterior, being less likely to become stuck in local maxima. We discuss methods for linking the coarse chain back to the original fine-scale chain of interest. The resulting coupled chain can thus be run more efficiently without sacrificing the accuracy achieved at the finer scale.
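A related but simpler construction, often called delayed acceptance, conveys the flavor: screen each proposal with the cheap coarse-scale posterior before paying for the fine-scale evaluation, with a second acceptance step that keeps the chain targeting the fine-scale posterior. The toy densities below stand in for real simulator-based posteriors, and this is a stand-in sketch, not the paper's coupled-chain construction.

```python
# Two-stage Metropolis with a coarse-scale screen.  The first stage uses
# the cheap coarse posterior; the second stage corrects so that the chain
# still targets the fine-scale posterior exactly.
import numpy as np

rng = np.random.default_rng(5)

def log_post_fine(x):      # expensive fine-scale log posterior (toy here)
    return -0.5 * np.sum((x - 1.0) ** 2) / 0.25

def log_post_coarse(x):    # cheap coarse-scale approximation (toy here)
    return -0.5 * np.sum((x - 0.9) ** 2) / 0.30

x = np.zeros(2)
lp_f, lp_c = log_post_fine(x), log_post_coarse(x)
samples = []
for it in range(5000):
    xp = x + 0.3 * rng.standard_normal(2)          # symmetric random walk
    lp_c_prop = log_post_coarse(xp)
    # Stage 1: cheap screen using the coarse posterior.
    if np.log(rng.uniform()) < lp_c_prop - lp_c:
        # Stage 2: pay for the fine-scale evaluation and correct.
        lp_f_prop = log_post_fine(xp)
        if np.log(rng.uniform()) < (lp_f_prop - lp_f) - (lp_c_prop - lp_c):
            x, lp_f, lp_c = xp, lp_f_prop, lp_c_prop
    samples.append(x.copy())
```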
International Conference on Machine Learning | 2004
Robert B. Gramacy; Herbert K. H. Lee; William G. Macready
Computer experiments often require dense sweeps over input parameters to obtain a qualitative understanding of their response. Such sweeps can be prohibitively expensive, and are unnecessary in regions where the response is easily predicted; well-chosen designs could allow a mapping of the response with far fewer simulation runs. Thus, there is a need for computationally inexpensive surrogate models and an accompanying method for selecting small designs. We explore a general methodology for addressing this need that uses non-stationary Gaussian processes. Binary trees partition the input space to facilitate non-stationarity and a Bayesian interpretation provides an explicit measure of predictive uncertainty that can be used to guide sampling. Our methods are illustrated on several examples, including a motivating example involving computational fluid dynamics simulation of a NASA reentry vehicle.
Technometrics | 2009
Matthew A. Taddy; Herbert K. H. Lee; Genetha Anne Gray; Joshua D. Griffin
Optimization for complex systems in engineering often involves the use of expensive computer simulation. By combining statistical emulation using treed Gaussian processes with pattern search optimization, we are able to perform robust local optimization more efficiently and effectively than when using either method alone. Our approach is based on the augmentation of local search patterns with location sets generated through improvement prediction over the input space. We further develop a computational framework for asynchronous parallel implementation of the optimization algorithm. We demonstrate our methods on two standard test problems and our motivating example of calibrating a circuit device simulator.
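One step of the hybrid idea can be sketched as follows, with an ordinary scikit-learn GP standing in for the treed GP and a toy objective: merge a compass pattern around the incumbent with space-filling candidates, rank everything by expected improvement, and send the winner to the simulator.

```python
# One candidate-selection step of a pattern-search / GP-emulation hybrid.
# Assumptions: ordinary GP instead of a treed GP, toy 2-D objective,
# fixed pattern step size.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):                       # toy expensive objective (to minimize)
    return np.sum((x - 0.3) ** 2) + 0.1 * np.sin(20.0 * x[0])

rng = np.random.default_rng(6)
X = rng.uniform(0.0, 1.0, size=(10, 2))
y = np.array([objective(x) for x in X])

gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
best = np.argmin(y)
x_best, f_best = X[best], y[best]

step = 0.1                              # pattern-search step size (assumed)
pattern = x_best + step * np.vstack([np.eye(2), -np.eye(2)])   # compass points
space = rng.uniform(0.0, 1.0, size=(200, 2))                   # global candidates
cand = np.clip(np.vstack([pattern, space]), 0.0, 1.0)

mu, sd = gp.predict(cand, return_std=True)
sd = np.maximum(sd, 1e-9)
imp = f_best - mu
ei = imp * norm.cdf(imp / sd) + sd * norm.pdf(imp / sd)   # expected improvement
x_next = cand[np.argmax(ei)]            # evaluate objective(x_next) next
```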
Physical Review D | 2010
Tracy Holsclaw; David Higdon; Katrin Heitmann; Bruno Sansó; Ujjaini Alam; Herbert K. H. Lee; Salman Habib
A basic aim of ongoing and upcoming cosmological surveys is to unravel the mystery of dark energy. In the absence of a compelling theory to test, a natural approach is to better characterize the properties of dark energy in search of clues that can lead to a more fundamental understanding. One way to view this characterization is the improved determination of the redshift-dependence of the dark energy equation of state parameter, w(z). To do this requires a robust and bias-free method for reconstructing w(z) from data that does not rely on restrictive expansion schemes or assumed functional forms for w(z). We present a new nonparametric reconstruction method that solves for w(z) as a statistical inverse problem, based on a Gaussian process representation. This method reliably captures nontrivial behavior of w(z) and provides controlled error bounds. We demonstrate the power of the method on different sets of simulated supernova data; the approach can be easily extended to include diverse cosmological probes.
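For orientation, these are the standard flat-universe relations (standard cosmology background, ignoring radiation, and not quoted from the paper; notation may differ) that make the reconstruction a statistical inverse problem: w(z) enters the observed distance moduli only through nested integrals.

```latex
H^2(z) = H_0^2\left[\Omega_m (1+z)^3
        + (1-\Omega_m)\exp\!\left(3\int_0^z \frac{1+w(u)}{1+u}\,du\right)\right],
\qquad
d_L(z) = (1+z)\,c\int_0^z \frac{dz'}{H(z')},
\qquad
\mu(z) = 5\log_{10}\!\frac{d_L(z)}{10\,\mathrm{pc}} .
```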
Statistical Modelling | 2005
Herbert K. H. Lee; Dave Higdon; Catherine A. Calder; Christopher H. Holloman
Gaussian processes (GP) have proven to be useful and versatile stochastic models in a wide variety of applications including computer experiments, environmental monitoring, hydrology and climate modeling. A GP model is determined by its mean and covariance functions. In most cases, the mean is specified to be a constant, or some other simple linear function, whereas the covariance function is governed by a few parameters. A Bayesian formulation is attractive as it allows for formal incorporation of uncertainty regarding the parameters governing the GP. However, estimation of these parameters can be problematic. Large datasets, posterior correlation and inverse problems can all lead to difficulties in exploring the posterior distribution. Here, we propose an alternative model which is quite tractable computationally, even with large datasets or indirectly observed data, while still maintaining the flexibility and adaptiveness of traditional GP models. This model is based on convolving simple Markov random fields with a smoothing kernel. We consider applications in hydrology and aircraft prototype testing.
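A minimal one-dimensional sketch of the process-convolution construction, with independent latent values standing in for the Markov random field and an assumed Gaussian kernel width; the point is that a handful of latent values plus a smoothing kernel define a continuous field with far fewer parameters than a full GP covariance.

```python
# Process convolution in 1-D: latent values on a coarse grid, smoothed by
# a Gaussian kernel, yield a continuous random field.  The latent values
# are iid here for simplicity; the paper places a Markov random field on them.
import numpy as np

rng = np.random.default_rng(7)
knots = np.linspace(0.0, 1.0, 12)            # coarse latent grid
z = rng.standard_normal(knots.size)          # latent values (iid stand-in)
bandwidth = 0.1                              # hypothetical kernel width

def field(x):
    """Evaluate the convolved process at locations x (1-D array)."""
    K = np.exp(-0.5 * ((x[:, None] - knots[None, :]) / bandwidth) ** 2)
    return K @ z

x = np.linspace(0.0, 1.0, 200)
y = field(x)                                  # one realization of the smooth field
```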