Publication


Featured research published by Dawn B. Woodard.


European Conference on Computer Systems | 2010

Fingerprinting the datacenter: automated classification of performance crises

Peter Bodik; Moises Goldszmidt; Armando Fox; Dawn B. Woodard; Hans Christian Andersen

Contemporary datacenters comprise hundreds or thousands of machines running applications requiring high availability and responsiveness. Although a performance crisis is easily detected by monitoring key end-to-end performance indicators (KPIs) such as response latency or request throughput, the variety of conditions that can lead to KPI degradation makes it difficult to select appropriate recovery actions. We propose and evaluate a methodology for automatic classification and identification of crises, and in particular for detecting whether a given crisis has been seen before, so that a known solution may be immediately applied. Our approach is based on a new and efficient representation of the datacenter's state called a fingerprint, constructed by statistical selection and summarization of the hundreds of performance metrics typically collected on such systems. Our evaluation uses 4 months of trouble-ticket data from a production datacenter with hundreds of machines running a 24x7 enterprise-class user-facing application. In experiments in a realistic and rigorous operational setting, our approach provides operators the information necessary to initiate recovery actions with 80% correctness in an average of 10 minutes, which is 50 minutes earlier than the deadline provided to us by the operators. To the best of our knowledge this is the first rigorous evaluation of any such approach on a large-scale production installation.


The Annals of Applied Statistics | 2011

Forecasting emergency medical service call arrival rates

David S. Matteson; Mathew W. McLean; Dawn B. Woodard; Shane G. Henderson

We introduce a new method for forecasting emergency call arrival rates that combines integer-valued time series models with a dynamic latent factor structure. Covariate information is captured via simple constraints on the factor loadings. We directly model the count-valued arrivals per hour, rather than using an artificial assumption of normality. This is crucial for the emergency medical service context, in which the volume of calls may be very low. Smoothing splines are used in estimating the factor levels and loadings to improve long-term forecasts. We impose time series structure at the hourly level, rather than at the daily level, capturing the fine-scale dependence in addition to the long-term structure. Our analysis considers all emergency priority calls received by Toronto EMS between January 2007 and December 2008 for which an ambulance was dispatched. Empirical results demonstrate significantly reduced error in forecasting call arrival volume. To quantify the impact of reduced forecast errors, we design a queueing model simulation that approximates the dynamics of an ambulance system. The results show better performance as the forecasting method improves. This notion of quantifying the operational impact of improved statistical procedures may be of independent interest.


The Annals of Applied Statistics | 2013

Travel time estimation for ambulances using Bayesian data augmentation

Bradford S. Westgate; Dawn B. Woodard; David S. Matteson; Shane G. Henderson

We introduce a Bayesian model for estimating the distribution of ambulance travel times on each road segment in a city, using Global Positioning System (GPS) data. Due to sparseness and error in the GPS data, the exact ambulance paths and travel times on each road segment are unknown. We simultaneously estimate the paths, travel times, and parameters of each road segment travel time distribution using Bayesian data augmentation. To draw ambulance path samples, we use a novel reversible jump Metropolis-Hastings step. We also introduce two simpler estimation methods based on GPS speed data. We compare these methods to a recently published travel time estimation method, using simulated data and data from Toronto EMS. In both cases, out-of-sample point and interval estimates of ambulance trip times from the Bayesian method outperform estimates from the alternative methods. We also construct probability-of-coverage maps for ambulances. The Bayesian method gives more realistic maps than the recently published method. Finally, path estimates from the Bayesian method interpolate well between sparsely recorded GPS readings and are robust to GPS location errors.


Annals of Applied Probability | 2009

Conditions for rapid and torpid mixing of parallel and simulated tempering on multimodal distributions

Dawn B. Woodard; Scott C. Schmidler; Mark Huber

We obtain upper bounds on the convergence rates of Markov chains constructed by parallel and simulated tempering. These bounds are used to provide a set of sufficient conditions for torpid mixing of both techniques. We apply these conditions to show torpid mixing of parallel and simulated tempering for three examples: a normal mixture model with unequal covariances in R^M and the mean-field Potts model with q ≥ 3, regardless of the number and choice of temperatures, and the mean-field Ising model when an insufficient set of temperatures is chosen. The latter result contrasts with the rapid mixing of parallel and simulated tempering on the mean-field Ising model with a linearly increasing set of temperatures as shown previously.
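For readers unfamiliar with the algorithm whose mixing is being analyzed, the following is a minimal sketch of parallel tempering on a toy one-dimensional bimodal target. It is not taken from the paper: the target density, the inverse-temperature ladder `betas`, the step size, and all function names are assumptions chosen for illustration. The toy target has two equally wide modes, a case where tempering tends to work; the torpid-mixing examples in the abstract involve different structure (for instance unequal covariances).

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of an equal mixture of N(-4, 0.5^2) and N(4, 0.5^2)."""
    a = -0.5 * ((x + 4.0) / 0.5) ** 2
    b = -0.5 * ((x - 4.0) / 0.5) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def parallel_tempering(log_density, betas, n_iter, step, rng):
    """Run one chain per inverse temperature; return the samples of the beta = betas[0] chain."""
    states = [0.0 for _ in betas]
    cold_samples = []
    for _ in range(n_iter):
        # Within-chain move: random-walk Metropolis targeting density^beta.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step / math.sqrt(beta))
            log_ratio = beta * (log_density(proposal) - log_density(states[k]))
            if math.log(1.0 - rng.random()) < log_ratio:
                states[k] = proposal
        # Swap move: propose exchanging the states of one adjacent pair of chains.
        k = rng.randrange(len(betas) - 1)
        log_ratio = (betas[k] - betas[k + 1]) * (
            log_density(states[k + 1]) - log_density(states[k])
        )
        if math.log(1.0 - rng.random()) < log_ratio:
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples


rng = random.Random(1)
betas = [1.0, 0.3, 0.1, 0.03]  # hypothetical ladder; betas[0] = 1 is the target itself
samples = parallel_tempering(log_target, betas, n_iter=20000, step=0.5, rng=rng)
right_fraction = sum(1 for x in samples if x > 0) / len(samples)
```

A `right_fraction` near one half suggests the cold chain is moving between both modes; a value stuck near 0 or 1 would be a symptom of the slow mixing that the paper characterizes.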


Electronic Journal of Statistics | 2011

Stationarity of generalized autoregressive moving average models

Dawn B. Woodard; David S. Matteson; Shane G. Henderson

Time series models are often constructed by combining nonstationary effects such as trends with stochastic processes that are believed to be stationary. Although stationarity of the underlying process is typically crucial to ensure desirable properties or even validity of statistical estimators, there are numerous time series models for which this stationarity is not yet proven. A major barrier is that the most commonly-used methods assume φ-irreducibility, a condition that can be violated for the important class of discrete-valued observation-driven models. We show (strict) stationarity for the class of Generalized Autoregressive Moving Average (GARMA) models, which provides a flexible analogue of ARMA models for count, binary, or other discrete-valued data. We do this from two perspectives. First, we show stationarity and ergodicity of a perturbed version of the GARMA model, and show that the perturbed model yields parameter estimates that are arbitrarily close to those of the original model. This approach utilizes the fact that the perturbed model is φ-irreducible. Second, we show that the original GARMA model has a unique stationary distribution (so is strictly stationary when initialized in that distribution).
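To make the notion of a discrete-valued observation-driven model concrete, here is a minimal simulation sketch of a first-order Poisson autoregression with a log link, in the spirit of a GARMA model for counts. The specific recursion, the parameter names `beta0` and `phi`, and the `floor` used to keep the logarithm defined at zero counts are assumptions for illustration, not the paper's exact specification.

```python
import math
import random


def sample_poisson(mu, rng):
    """Draw a Poisson(mu) variate by Knuth's multiplication method (adequate for small mu)."""
    limit = math.exp(-mu)
    k = 0
    p = rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate_count_series(beta0, phi, n_steps, rng, floor=0.1):
    """Simulate counts whose conditional mean depends on the previous observed count.

    Assumed recursion: log(mu_t) = beta0 + phi * (log(max(y_{t-1}, floor)) - beta0).
    """
    counts = []
    prev = math.exp(beta0)  # start the recursion at the baseline mean
    for _ in range(n_steps):
        y_star = max(prev, floor)  # thresholding avoids log(0)
        log_mu = beta0 + phi * (math.log(y_star) - beta0)
        y = sample_poisson(math.exp(log_mu), rng)
        counts.append(y)
        prev = y
    return counts


rng = random.Random(0)
series = simulate_count_series(beta0=math.log(10.0), phi=0.5, n_steps=2000, rng=rng)
sample_mean = sum(series) / len(series)
```

Because the conditional mean is a deterministic function of past integer-valued observations, the state of such a chain lives on a countable set of values determined by the starting point, which is the kind of setting where φ-irreducibility can fail.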


Journal of the American Statistical Association | 2015

A Spatio-Temporal Point Process Model for Ambulance Demand

Zhengyi Zhou; David S. Matteson; Dawn B. Woodard; Shane G. Henderson; Athanasios C. Micheas

Ambulance demand estimation at fine time and location scales is critical for fleet management and dynamic deployment. We are motivated by the problem of estimating the spatial distribution of ambulance demand in Toronto, Canada, as it changes over discrete 2-hour intervals. This large-scale dataset is sparse at the desired temporal resolutions and exhibits location-specific serial dependence as well as daily and weekly seasonality. We address these challenges by introducing a novel characterization of time-varying Gaussian mixture models. We fix the mixture component distributions across all time periods to overcome data sparsity and accurately describe Toronto’s spatial structure, while representing the complex spatio-temporal dynamics through time-varying mixture weights. We constrain the mixture weights to capture weekly seasonality, and apply a conditionally autoregressive prior on the mixture weights of each component to represent location-specific short-term serial dependence and daily seasonality. While estimation may be performed using a fixed number of mixture components, we also extend to estimate the number of components using birth-and-death Markov chain Monte Carlo. The proposed model is shown to give higher statistical predictive accuracy and to reduce the error in predicting emergency medical service operational performance by as much as two-thirds compared to a typical industry practice.


Statistics and Computing | 2017

The use of a single pseudo-sample in approximate Bayesian computation

Luke Bornn; Natesh S. Pillai; Aaron Smith; Dawn B. Woodard

We analyze the computational efficiency of approximate Bayesian computation (ABC), which approximates a likelihood function by drawing pseudo-samples from the associated model. For the rejection sampling version of ABC, it is known that multiple pseudo-samples cannot substantially increase (and can substantially decrease) the efficiency of the algorithm as compared to employing a high-variance estimate based on a single pseudo-sample. We show that this conclusion also holds for a Markov chain Monte Carlo version of ABC, implying that it is unnecessary to tune the number of pseudo-samples used in ABC-MCMC. This conclusion is in contrast to particle MCMC methods, for which increasing the number of particles can provide large gains in computational efficiency.
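As a concrete picture of what "a single pseudo-sample" means, the sketch below implements rejection ABC in which each proposed parameter is judged on exactly one simulated dataset. The toy model (a normal mean with known unit variance, summarized by the sample mean), the uniform prior, the tolerance `epsilon`, and all names are assumptions for illustration; the paper's result concerns efficiency in general, not this example.

```python
import random


def abc_rejection(observed, prior_sample, simulate, distance, epsilon, n_accept, rng):
    """Rejection ABC using one pseudo-sample per proposed parameter value."""
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)
        pseudo = simulate(theta, rng)  # the single pseudo-sample for this proposal
        if distance(pseudo, observed) <= epsilon:
            accepted.append(theta)
    return accepted


N_OBS = 20            # hypothetical number of observations behind the summary
OBSERVED_MEAN = 1.0   # hypothetical observed summary statistic


def simulate_mean(theta, rng):
    """Simulate N_OBS draws from N(theta, 1) and return their sample mean."""
    return sum(rng.gauss(theta, 1.0) for _ in range(N_OBS)) / N_OBS


rng = random.Random(0)
draws = abc_rejection(
    observed=OBSERVED_MEAN,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    simulate=simulate_mean,
    distance=lambda a, b: abs(a - b),
    epsilon=0.2,
    n_accept=200,
    rng=rng,
)
posterior_mean = sum(draws) / len(draws)
```

Using several pseudo-samples per proposal would replace the single accept/reject indicator with an average of such indicators, at a proportionally higher simulation cost per proposal; the abstract's claim is that this trade rarely pays off for ABC.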


European Journal of Operational Research | 2016

Large-network travel time distribution estimation for ambulances

Bradford S. Westgate; Dawn B. Woodard; David S. Matteson; Shane G. Henderson

We propose a regression approach for estimating the distribution of ambulance travel times between any two locations in a road network. Our method uses ambulance location data that can be sparse in both time and network coverage, such as Global Positioning System data. Estimates depend on the path traveled and on explanatory variables such as the time of day and day of week. By modeling at the trip level, we account for dependence between travel times on individual road segments. Our method is parsimonious and computationally tractable for large road networks. We apply our method to estimate ambulance travel time distributions in Toronto, providing improved estimates compared to a recently published method and a commercial software package. We also demonstrate our method’s impact on ambulance fleet management decisions, showing substantial differences between our method and the recently published method in the predicted probability that an ambulance arrives within a time threshold.


Annals of Statistics | 2013

Convergence rate of Markov chain methods for genomic motif discovery

Dawn B. Woodard; Jeffrey S. Rosenthal

We analyze the convergence rate of a popular Gibbs sampling method used for statistical discovery of gene regulatory binding motifs in DNA sequences. This sampler satisfies a very strong form of ergodicity (uniform). However, we show that, due to multimodality of the posterior distribution, the rate of convergence often decreases exponentially as a function of the length of the DNA sequence. Specifically, we show that this occurs whenever there is more than one true repeating pattern in the data. In practice there are typically multiple, even numerous, such patterns in biological data, the goal being to detect the most well-conserved and frequently-occurring of these. Our findings match empirical results, in which the motif-discovery Gibbs sampler has exhibited such poor convergence that it is used only for finding modes of the posterior distribution (candidate motifs) rather than for obtaining samples from that distribution. Ours appear to be the first meaningful bounds on the convergence rate of a Markov chain method for sampling from a multimodal posterior distribution, as a function of statistical quantities like the number of observations.


Journal of Computational and Graphical Statistics | 2013

Hierarchical Adaptive Regression Kernels for Regression With Functional Predictors

Dawn B. Woodard; Ciprian M. Crainiceanu; David Ruppert

We propose a new method for regression using a parsimonious and scientifically interpretable representation of functional predictors. Our approach is designed for data that exhibit features such as spikes, dips, and plateaus whose frequency, location, size, and shape vary stochastically across subjects. We propose Bayesian inference of the joint functional and exposure models, and give a method for efficient computation. We contrast our approach with existing state-of-the-art methods for regression with functional predictors, and show that our method is more effective and efficient for data that include features occurring at varying locations. We apply our methodology to a large and complex dataset from the Sleep Heart Health Study, to quantify the association between sleep characteristics and health outcomes. Software and technical appendices are provided in the online supplementary materials.

Collaboration

Top Co-Authors

Luke Bornn, Simon Fraser University

Mark Huber, Claremont McKenna College