
Publications


Featured research published by Murali Haran.


Journal of the American Statistical Association | 2006

Fixed-Width Output Analysis for Markov Chain Monte Carlo

Galin L. Jones; Murali Haran; Brian Caffo; Ronald C. Neath

Markov chain Monte Carlo is a method of producing a correlated sample to estimate features of a target distribution through ergodic averages. A fundamental question is when sampling should stop; that is, at what point the ergodic averages are good estimates of the desired quantities. We consider a method that stops the simulation when the width of a confidence interval based on an ergodic average is less than a user-specified value. Hence calculating a Monte Carlo standard error is a critical step in assessing the simulation output. We consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We give sufficient conditions for the strong consistency of both methods and investigate their finite-sample properties in various examples.
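The batch means estimator discussed in this paper can be sketched in a few lines. This is a minimal illustration of the general technique, not the authors' implementation; the function name and the default number of batches are choices made here for the example.

```python
import numpy as np

def batch_means_se(chain, n_batches=30):
    """Monte Carlo standard error of the chain's sample mean,
    estimated from non-overlapping batch means."""
    chain = np.asarray(chain, dtype=float)
    batch_len = len(chain) // n_batches
    n = batch_len * n_batches                      # drop any leftover draws
    means = chain[:n].reshape(n_batches, batch_len).mean(axis=1)
    # The sample variance of the batch means, divided by the number of
    # batches, estimates the variance of the overall sample mean.
    return np.sqrt(means.var(ddof=1) / n_batches)
```

A fixed-width rule then stops the simulation once, say, `1.96 * batch_means_se(chain)` falls below a user-specified tolerance.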


Statistical Science | 2008

Markov Chain Monte Carlo: Can We Trust the Third Significant Figure?

James M. Flegal; Murali Haran; Galin L. Jones

Current reporting of results based on Markov chain Monte Carlo computations could be improved. In particular, a measure of the accuracy of the resulting estimates is rarely reported. Thus we have little ability to objectively assess the quality of the reported estimates. We address this issue by discussing why Monte Carlo standard errors are important, how they can be easily calculated in Markov chain Monte Carlo, and how they can be used to decide when to stop the simulation. We compare their use to a popular alternative in the context of two examples.
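The stop-when-accurate idea advocated here can be demonstrated end to end on a toy chain. The kernel below is a simple AR(1) process standing in for a real MCMC sampler; the function names, tolerances, and batch count are illustrative choices, not the paper's.

```python
import numpy as np

def half_width(chain, n_batches=30, z=1.96):
    """95% confidence-interval half-width for the chain mean (batch means)."""
    m = len(chain) // n_batches
    means = np.asarray(chain[: m * n_batches]).reshape(n_batches, m).mean(axis=1)
    return z * np.sqrt(means.var(ddof=1) / n_batches)

def run_until_fixed_width(step, x0, eps=0.01, check_every=5000, max_iter=500_000):
    """Extend the chain until the confidence-interval half-width drops below eps."""
    rng = np.random.default_rng(1)
    chain, x = [], x0
    while len(chain) < max_iter:
        for _ in range(check_every):
            x = step(x, rng)
            chain.append(x)
        if half_width(chain) < eps:
            break
    return np.asarray(chain)

# Toy "MCMC" kernel: an AR(1) chain whose stationary distribution is N(0, 1).
ar1 = lambda x, rng: 0.5 * x + rng.normal(scale=np.sqrt(1 - 0.25))
```

Because the chain is autocorrelated, the stopping rule automatically demands more draws than an i.i.d. sample of the same target would need.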


Foundations of Software Engineering | 2005

Applying classification techniques to remotely-collected program execution data

Murali Haran; Alan F. Karr; Alessandro Orso; Adam A. Porter; Ashish P. Sanil

There is an increasing interest in techniques that support measurement and analysis of fielded software systems. One of the main goals of these techniques is to better understand how software actually behaves in the field. In particular, many of these techniques require a way to distinguish, in the field, failing from passing executions. So far, researchers and practitioners have only partially addressed this problem: they have simply assumed that program failure status is either obvious (i.e., the program crashes) or provided by an external source (e.g., the users). In this paper, we propose a technique for automatically classifying execution data, collected in the field, as coming from either passing or failing program runs. (Failing program runs are executions that terminate with a failure, such as a wrong outcome.) We use statistical learning algorithms to build the classification models. Our approach builds the models by analyzing executions performed in a controlled environment (e.g., test cases run in-house) and then uses the models to predict whether execution data produced by a fielded instance were generated by a passing or failing program execution. We also present results from an initial feasibility study, based on multiple versions of a software subject, in which we investigate several issues vital to the applicability of the technique. Finally, we present some lessons learned regarding the interplay between the reliability of classification models and the amount and type of data collected.
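The paper evaluates several statistical learning algorithms; as one hedged stand-in, the sketch below trains a logistic-regression classifier on labeled in-house executions and applies it to unlabeled field data. The feature encoding and function names are hypothetical, chosen only to make the workflow concrete.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit a logistic-regression classifier by gradient ascent.
    X: per-run execution features (e.g., event counts), one row per run.
    y: 1 for failing runs, 0 for passing runs."""
    X = np.hstack([np.ones((len(X), 1)), X])       # bias column
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))           # predicted failure probability
        w += lr * X.T @ (y - p) / len(y)           # log-likelihood gradient step
    return w

def predict_fail(w, X):
    """Classify field executions as failing (True) or passing (False)."""
    X = np.hstack([np.ones((len(X), 1)), X])
    return (1.0 / (1.0 + np.exp(-X @ w))) > 0.5
```

In the paper's setting, the training rows would come from controlled in-house test runs and the prediction rows from lightweight data collected in the field.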


Journal of Geophysical Research | 2012

A climate sensitivity estimate using Bayesian fusion of instrumental observations and an Earth System model

Roman Olson; Ryan L. Sriver; Marlos Goes; Nathan M. Urban; H. Damon Matthews; Murali Haran; Klaus Keller

Current climate model projections are uncertain. This uncertainty is partly driven by the uncertainty in key model parameters such as climate sensitivity (CS), vertical ocean diffusivity (Kv), and strength of anthropogenic sulfate aerosol forcing. These parameters are commonly estimated using ensembles of model runs constrained by observations. Here we obtain a probability density function (pdf) of these parameters using the University of Victoria Earth System Climate Model (UVic ESCM), an intermediate-complexity model with a dynamic three-dimensional ocean. Specifically, we run an ensemble of UVic ESCM runs varying parameters that affect CS, ocean vertical diffusion, and the effects of anthropogenic sulfate aerosols. We use a statistical emulator that interpolates the UVic ESCM output to parameter settings where the model was not evaluated. We adopt a Bayesian approach to constrain the model output with instrumental surface temperature and ocean heat observations. Our approach accounts for the uncertainties in the properties of model-data residuals. We use a Markov chain Monte Carlo method to obtain a posterior pdf of these parameters. The mode of the climate sensitivity estimate is 2.8°C, with the corresponding 95% credible interval ranging from 1.8 to 4.9°C. These results are generally consistent with previous studies. The CS pdf is sensitive to the assumptions about the priors, to the effects of anthropogenic sulfate aerosols, and to the background vertical ocean diffusivity. Our method can be used with more complex climate models.
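The statistical emulator described here interpolates expensive model output across parameter settings. A generic sketch of that idea is the Gaussian-process posterior mean with a squared-exponential kernel, shown below; the kernel choice, length-scale, and function names are assumptions for illustration, not the study's actual emulator.

```python
import numpy as np

def rbf_kernel(A, B, ell=1.0):
    """Squared-exponential covariance between two sets of parameter vectors."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_emulator(theta_train, y_train, ell=1.0, nugget=1e-6):
    """Return a function interpolating scalar model output y
    to parameter settings where the model was not evaluated."""
    K = rbf_kernel(theta_train, theta_train, ell)
    alpha = np.linalg.solve(K + nugget * np.eye(len(K)), y_train)
    return lambda theta_new: rbf_kernel(theta_new, theta_train, ell) @ alpha
```

With the emulator in place, an MCMC sampler can evaluate the (approximate) likelihood at arbitrary parameter values without rerunning the climate model.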


Statistics and Computing | 2011

Parallel multivariate slice sampling

Matthew M. Tibbits; Murali Haran; John Liechty

Slice sampling provides an easily implemented method for constructing a Markov chain Monte Carlo (MCMC) algorithm. However, slice sampling has two major drawbacks: (i) it requires repeated evaluation of likelihoods for each update, which can make it impractical when evaluations are expensive or as the number of evaluations grows (geometrically) with the dimension of the slice sampler, and (ii) since it can be challenging to construct multivariate updates, the updates are typically univariate, which often results in slow mixing samplers. We propose an approach to multivariate slice sampling that naturally lends itself to a parallel implementation. Our approach takes advantage of recent advances in computer architectures, for instance, the newest generation of graphics cards can execute roughly 30,000 threads simultaneously. We demonstrate that it is possible to construct a multivariate slice sampler that has good mixing properties and is efficient in terms of computing time. The contributions of this article are therefore twofold. We study approaches for constructing a multivariate slice sampler, and we show how parallel computing can be useful for making MCMC algorithms computationally efficient. We study various implementations of our algorithm in the context of real and simulated data.
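For readers unfamiliar with the base algorithm the paper parallelizes, here is a univariate slice sampler with stepping-out and shrinkage in the style of Neal (2003). This is the standard single-variable version for illustration only; the paper's contribution is a multivariate, parallel construction.

```python
import numpy as np

def slice_sample(logf, x0, n, w=1.0, rng=None):
    """Univariate slice sampler: stepping-out bracket, then shrinkage."""
    if rng is None:
        rng = np.random.default_rng()
    xs, x = [], x0
    for _ in range(n):
        logy = logf(x) + np.log(rng.uniform())      # vertical slice level
        L = x - w * rng.uniform()                   # random initial bracket
        R = L + w
        while logf(L) > logy:                       # step out left
            L -= w
        while logf(R) > logy:                       # step out right
            R += w
        while True:                                 # shrink until acceptance
            x1 = rng.uniform(L, R)
            if logf(x1) > logy:
                x = x1
                break
            if x1 < x:
                L = x1
            else:
                R = x1
        xs.append(x)
    return np.asarray(xs)
```

Each update requires several log-density evaluations, which is exactly the cost that motivates the parallel evaluation scheme studied in the paper.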


Journal of Computational and Graphical Statistics | 2003

Accelerating Computation in Markov Random Field Models for Spatial Data via Structured MCMC

Murali Haran; James S. Hodges; Bradley P. Carlin

Spatial Poisson models for areal count data use nonstationary “intrinsic autoregressions,” also often referred to as “conditionally autoregressive” (CAR) models. Bayesian inference for these models has generally involved using single parameter updating Markov chain Monte Carlo algorithms, which often exhibit slow mixing (i.e., poor convergence) properties. These spatial models are richly parameterized and lend themselves to the structured Markov chain Monte Carlo (SMCMC) algorithms. SMCMC provides a simple, general, and flexible framework for accelerating convergence in an MCMC sampler by providing a systematic way to block groups of similar parameters while taking full advantage of the posterior correlation structure induced by the model and data. Among the SMCMC strategies considered here are blocking using different size blocks (grouping by geographical region), reparameterization, updating jointly with and without model hyperparameters, “oversampling” some of the model parameters, and “pilot adaptation” versus continuous tuning techniques for the proposal density. We apply the techniques presented here to datasets on cancer mortality and late detection in the state of Minnesota. We find that, compared to univariate sampling procedures, our techniques will typically lead to more accurate posterior estimates, and they are sometimes also far more efficient in terms of the number of effective samples generated per second.
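The benefit of blocking that SMCMC exploits can be seen in a deliberately simple toy problem, not the paper's CAR models: for a bivariate normal with correlation 0.99, coordinate-wise Gibbs updates mix very slowly, while a joint (blocked) draw produces independent samples. All names and settings below are illustrative.

```python
import numpy as np

rho = 0.99
Sigma = np.array([[1.0, rho], [rho, 1.0]])

def gibbs_univariate(n, rng):
    """Update one coordinate at a time from its full conditional."""
    x, out = np.zeros(2), np.empty((n, 2))
    sd = np.sqrt(1 - rho**2)
    for i in range(n):
        x[0] = rho * x[1] + sd * rng.normal()
        x[1] = rho * x[0] + sd * rng.normal()
        out[i] = x
    return out

def gibbs_blocked(n, rng):
    """Update the whole block jointly via a Cholesky draw."""
    L = np.linalg.cholesky(Sigma)
    return rng.normal(size=(n, 2)) @ L.T
```

The univariate chain's successive draws are nearly perfectly correlated, whereas the blocked draws are i.i.d.; SMCMC generalizes this joint-update idea to large, structured parameter blocks.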


The Annals of Applied Statistics | 2014

Fast dimension-reduced climate model calibration and the effect of data aggregation

Won Chang; Murali Haran; Roman Olson; Klaus Keller

How will the climate system respond to anthropogenic forcings? One approach to this question relies on climate model projections. Current climate projections are considerably uncertain. Characterizing and, if possible, reducing this uncertainty is an area of ongoing research. We consider the problem of making projections of the North Atlantic meridional overturning circulation (AMOC). Uncertainties about climate model parameters play a key role in uncertainties in AMOC projections. When the observational data and the climate model output are high-dimensional spatial data sets, the data are typically aggregated due to computational constraints. The effects of aggregation are unclear because statistically rigorous approaches for model parameter inference have been infeasible for high-resolution data. Here we develop a flexible and computationally efficient approach using principal components and basis expansions to study the effect of spatial data aggregation on parametric and projection uncertainties. Our Bayesian reduced-dimensional calibration approach allows us to study the effect of complicated error structures and data-model discrepancies on our ability to learn about climate model parameters from high-dimensional data. Considering high-dimensional spatial observations reduces the effect of deep uncertainty associated with prior specifications for the data-model discrepancy. Also, using the unaggregated data results in sharper projections based on our climate model. Our computationally efficient approach may be widely applicable to a variety of high-dimensional computer model calibration problems.
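The principal-component reduction at the heart of this approach can be sketched generically: project the ensemble of high-dimensional spatial outputs onto a few leading patterns and calibrate in that low-dimensional space. The sketch below is a plain SVD-based reduction; function names and shapes are assumptions for illustration.

```python
import numpy as np

def pca_reduce(Y, k):
    """Project ensemble output Y (runs x grid cells) onto its leading
    k principal components; return scores and a reconstruction map."""
    mu = Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    basis = Vt[:k]                       # k leading spatial patterns
    scores = (Y - mu) @ basis.T          # low-dimensional representation
    reconstruct = lambda z: mu + z @ basis
    return scores, reconstruct
```

Calibration then compares observed and simulated fields through their k scores rather than through every grid cell, which is what makes high-resolution (unaggregated) data computationally tractable.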


Biometrics | 2015

An attraction-repulsion point process model for respiratory syncytial virus infections.

Joshua Goldstein; Murali Haran; Ivan Simeonov; John Fricks; Francesca Chiaromonte

How is the progression of a virus influenced by properties intrinsic to individual cells? We address this question by studying the susceptibility of cells infected with two strains of the human respiratory syncytial virus (RSV-A and RSV-B) in an in vitro experiment. Spatial patterns of infected cells give us insight into how local conditions influence susceptibility to the virus. We observe a complicated attraction and repulsion behavior, a tendency for infected cells to lump together or remain apart. We develop a new spatial point process model to describe this behavior. Inference on spatial point processes is difficult because the likelihood functions of these models contain intractable normalizing constants; we adapt an MCMC algorithm called double Metropolis-Hastings to overcome this computational challenge. Our methods are computationally efficient even for large point patterns consisting of over 10,000 points. We illustrate the application of our model and inferential approach to simulated data examples and fit our model to various RSV experiments. Because our model parameters are easy to interpret, we are able to draw meaningful scientific conclusions from the fitted models.


Journal of Climate | 2015

On Discriminating between GCM Forcing Configurations Using Bayesian Reconstructions of Late-Holocene Temperatures

Martin P. Tingley; Peter F. Craigmile; Murali Haran; Bo Li; Elizabeth Mannshardt; Bala Rajaratnam

Several climate modeling groups have recently generated ensembles of last-millennium climate simulations under different forcing scenarios. These experiments represent an ideal opportunity to establish the baseline feasibility of using proxy-based reconstructions of late-Holocene climate as out-of-calibration tests of the fidelity of the general circulation models used to project future climate. This paper develops a formal statistical model for assessing the agreement between members of an ensemble of climate simulations and the ensemble of possible climate histories produced from a hierarchical Bayesian climate reconstruction. As the internal variabilities of the simulated and reconstructed climate are decoupled from one another, the comparison is between the two latent, or unobserved, forced responses. Comparisons of the spatial average of a 600-yr high northern latitude temperature reconstruction to suites of last-millennium climate simulations from the GISS-E2 and CSIRO models, respectively, ...


Journal of the American Statistical Association | 2016

Calibrating an Ice Sheet Model Using High-Dimensional Binary Spatial Data

Won Chang; Murali Haran; Patrick J. Applegate; David Pollard

Rapid retreat of ice in the Amundsen Sea sector of West Antarctica may cause drastic sea level rise, posing significant risks to populations in low-lying coastal regions. Calibration of computer models representing the behavior of the West Antarctic Ice Sheet is key for informative projections of future sea level rise. However, both the relevant observations and the model output are high-dimensional binary spatial data; existing computer model calibration methods are unable to handle such data. Here we present a novel calibration method for computer models whose output is in the form of binary spatial data. To mitigate the computational and inferential challenges posed by our approach, we apply a generalized principal component based dimension reduction method. To demonstrate the utility of our method, we calibrate the PSU3D-ICE model by comparing the output from a 499-member perturbed-parameter ensemble with observations from the Amundsen Sea sector of the ice sheet. Our methods help rigorously characterize the parameter uncertainty even in the presence of systematic data-model discrepancies and dependence in the errors. Our method also helps inform environmental risk analyses by contributing to improved projections of sea level rise from the ice sheets. Supplementary materials for this article are available online.

Collaboration


Dive into Murali Haran's collaborations.

Top Co-Authors

Klaus Keller (Pennsylvania State University)
Won Chang (University of Chicago)
Roman Olson (Pennsylvania State University)
David Pollard (Pennsylvania State University)
Patrick J. Applegate (Pennsylvania State University)
Nathan M. Urban (Los Alamos National Laboratory)
Jaewoo Park (Pennsylvania State University)
K. Sham Bhat (Pennsylvania State University)