
Publication


Featured research published by Tilmann Gneiting.


Journal of the American Statistical Association | 2007

Strictly Proper Scoring Rules, Prediction, and Estimation

Tilmann Gneiting; Adrian E. Raftery

Scoring rules assess the quality of probabilistic forecasts, by assigning a numerical score based on the predictive distribution and on the event or value that materializes. A scoring rule is proper if the forecaster maximizes the expected score for an observation drawn from the distribution F if he or she issues the probabilistic forecast F, rather than G ≠ F. It is strictly proper if the maximum is unique. In prediction problems, proper scoring rules encourage the forecaster to make careful assessments and to be honest. In estimation problems, strictly proper scoring rules provide attractive loss and utility functions that can be tailored to the problem at hand. This article reviews and develops the theory of proper scoring rules on general probability spaces, and proposes and discusses examples thereof. Proper scoring rules derive from convex functions and relate to information measures, entropy functions, and Bregman divergences. In the case of categorical variables, we prove a rigorous version of the Savage representation. Examples of scoring rules for probabilistic forecasts in the form of predictive densities include the logarithmic, spherical, pseudospherical, and quadratic scores. The continuous ranked probability score applies to probabilistic forecasts that take the form of predictive cumulative distribution functions. It generalizes the absolute error and forms a special case of a new and very general type of score, the energy score. Like many other scoring rules, the energy score admits a kernel representation in terms of negative definite functions, with links to inequalities of Hoeffding type, in both univariate and multivariate settings. Proper scoring rules for quantile and interval forecasts are also discussed. We relate proper scoring rules to Bayes factors and to cross-validation, and propose a novel form of cross-validation known as random-fold cross-validation.
A case study on probabilistic weather forecasts in the North American Pacific Northwest illustrates the importance of propriety. We note optimum score approaches to point and quantile estimation, and propose the intuitively appealing interval score as a utility function in interval estimation that addresses width as well as coverage.
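The continuous ranked probability score discussed above has a well-known closed form when the predictive distribution is Gaussian. A minimal, standard-library-only sketch (parameter values in the usage line are illustrative):

```python
import math

def crps_normal(mu, sigma, y):
    """CRPS of a Gaussian predictive distribution N(mu, sigma^2)
    against a realized value y; lower is better (closed form)."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)   # standard normal density
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))            # standard normal CDF
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# The score generalizes the absolute error: as sigma -> 0 it tends
# to |y - mu|, and it penalizes both miscalibration and lack of sharpness.
print(round(crps_normal(0.0, 1.0, 0.0), 4))  # -> 0.2337
```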


Monthly Weather Review | 2005

Using Bayesian Model Averaging to Calibrate Forecast Ensembles

Adrian E. Raftery; Tilmann Gneiting; Fadoua Balabdaoui; Michael Polakowski

Ensembles used for probabilistic weather forecasting often exhibit a spread-error correlation, but they tend to be underdispersive. This paper proposes a statistical method for postprocessing ensembles based on Bayesian model averaging (BMA), which is a standard method for combining predictive distributions from different sources. The BMA predictive probability density function (PDF) of any quantity of interest is a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are equal to posterior probabilities of the models generating the forecasts and reflect the models’ relative contributions to predictive skill over the training period. The BMA weights can be used to assess the usefulness of ensemble members, and this can be used as a basis for selecting ensemble members; this can be useful given the cost of running large ensembles. The BMA PDF can be represented as an unweighted ensemble of any desired size, by simulating from the BMA predictive distribution. The BMA predictive variance can be decomposed into two components, one corresponding to the between-forecast variability, and the second to the within-forecast variability. Predictive PDFs or intervals based solely on the ensemble spread incorporate the first component but not the second. Thus BMA provides a theoretical explanation of the tendency of ensembles to exhibit a spread-error correlation yet still be underdispersive. The method was applied to 48-h forecasts of surface temperature in the Pacific Northwest in January–June 2000 using the University of Washington fifth-generation Pennsylvania State University–NCAR Mesoscale Model (MM5) ensemble. The predictive PDFs were much better calibrated than the raw ensemble, and the BMA forecasts were sharp in that 90% BMA prediction intervals were 66% shorter on average than those produced by sample climatology.
As a by-product, BMA yields a deterministic point forecast, and this had root-mean-square errors 7% lower than the best of the ensemble members and 8% lower than the ensemble mean. Similar results were obtained for forecasts of sea level pressure. Simulation experiments show that BMA performs reasonably well when the underlying ensemble is calibrated, or even overdispersed.
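The weighted-mixture construction described above can be sketched in a few lines. The member forecasts, weights, bias coefficients, and spread below are invented placeholders; in the paper they are estimated on training data (via the EM algorithm), which this sketch omits:

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bma_pdf(x, forecasts, weights, sigma, a=0.0, b=1.0):
    """BMA predictive density: a weighted mixture of Gaussians centered
    on bias-corrected member forecasts a + b * f_k.  Weights and the
    common spread sigma are assumed to have been fitted beforehand."""
    return sum(w * normal_pdf(x, a + b * f, sigma)
               for f, w in zip(forecasts, weights))

forecasts = [11.2, 12.5, 13.1]   # hypothetical ensemble member forecasts (deg C)
weights = [0.5, 0.3, 0.2]        # posterior model weights, summing to 1
density = bma_pdf(12.0, forecasts, weights, sigma=1.5)
```

Simulating from this mixture (pick a member with probability equal to its weight, then draw from its Gaussian) gives the unweighted resampled ensemble mentioned in the abstract.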


Journal of the American Statistical Association | 2002

Nonseparable, Stationary Covariance Functions for Space–Time Data

Tilmann Gneiting

Geostatistical approaches to spatiotemporal prediction in environmental science, climatology, meteorology, and related fields rely on appropriate covariance models. This article proposes general classes of nonseparable, stationary covariance functions for spatiotemporal random processes. The constructions are directly in the space–time domain and do not depend on closed-form Fourier inversions. The model parameters can be associated with the data's spatial and temporal structures, and a covariance model with a readily interpretable space–time interaction parameter is fitted to wind data from Ireland.
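One closed-form member of such a nonseparable class can be coded directly, following the construction C(h; u) = σ² ψ(u²)^(-d/2) φ(‖h‖²/ψ(u²)) with a completely monotone φ and a suitable ψ. The parameter values below are illustrative defaults, not those fitted to the Irish wind data:

```python
import math

def spacetime_cov(h, u, sigma2=1.0, a=1.0, alpha=1.0, beta=0.5,
                  c=1.0, gamma=0.5, d=2):
    """Nonseparable stationary space-time covariance built from
    phi(r) = exp(-c * r**gamma) and psi(r) = (a * r**alpha + 1)**beta:
        C(h; u) = sigma2 / psi(u^2)^(d/2) * phi(h^2 / psi(u^2)).
    h is the spatial lag length, u the temporal lag, d the spatial
    dimension; beta controls the strength of space-time interaction."""
    psi = (a * abs(u) ** (2 * alpha) + 1.0) ** beta
    return sigma2 / psi ** (d / 2) * math.exp(-c * (h * h / psi) ** gamma)
```

As required of a stationary covariance, the function peaks at the origin, decays in both the spatial and temporal lags, and is symmetric in the temporal lag.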


Monthly Weather Review | 2005

Calibrated Probabilistic Forecasting Using Ensemble Model Output Statistics and Minimum CRPS Estimation

Tilmann Gneiting; Adrian E. Raftery; Anton H. Westveld; Tom Goldman

Ensemble prediction systems typically show positive spread-error correlation, but they are subject to forecast bias and dispersion errors, and are therefore uncalibrated. This work proposes the use of ensemble model output statistics (EMOS), an easy-to-implement postprocessing technique that addresses both forecast bias and underdispersion and takes into account the spread-skill relationship. The technique is based on multiple linear regression and is akin to the superensemble approach that has traditionally been used for deterministic-style forecasts. The EMOS technique yields probabilistic forecasts that take the form of Gaussian predictive probability density functions (PDFs) for continuous weather variables and can be applied to gridded model output. The EMOS predictive mean is a bias-corrected weighted average of the ensemble member forecasts, with coefficients that can be interpreted in terms of the relative contributions of the member models to the ensemble, and provides a highly competiti...
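A toy sketch of the EMOS idea on synthetic data: a Gaussian predictive PDF whose mean is linear in the member forecasts and whose variance is linear in the ensemble variance, with parameters chosen to minimize average CRPS over a training set. The data-generating process and parameter grids are invented, and a crude grid search stands in for the numerical optimization used in the paper:

```python
import math, random

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of N(mu, sigma^2) against a realized value y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return sigma * (z * (2 * cdf - 1) + 2 * pdf - 1 / math.sqrt(math.pi))

# Synthetic training set with a spread-skill relationship: the noise
# variance grows with the ensemble variance (illustrative only).
random.seed(1)
train = []
for _ in range(400):
    f1, f2 = random.gauss(10, 3), random.gauss(10, 3)
    s2 = (f1 - f2) ** 2 / 2  # ensemble variance of the two members
    y = 1.0 + 0.6 * f1 + 0.4 * f2 + random.gauss(0, math.sqrt(1 + 0.5 * s2))
    train.append((f1, f2, s2, y))

def mean_crps(a, b1, b2, c, d):
    """Average CRPS of the EMOS predictive N(a + b1*f1 + b2*f2, c + d*s2)."""
    return sum(crps_normal(a + b1 * f1 + b2 * f2,
                           math.sqrt(c + d * s2), y)
               for f1, f2, s2, y in train) / len(train)

# Minimum-CRPS estimation via a coarse grid over all five parameters.
best = min(
    ((a, b1, b2, c, d)
     for a in (0.0, 0.5, 1.0, 1.5)
     for b1 in (0.4, 0.5, 0.6, 0.7)
     for b2 in (0.3, 0.4, 0.5)
     for c in (0.5, 1.0, 1.5)
     for d in (0.25, 0.5, 0.75)),
    key=lambda p: mean_crps(*p))
```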


Journal of the American Statistical Association | 2011

Making and Evaluating Point Forecasts

Tilmann Gneiting

Typically, point forecasting methods are compared and assessed by means of an error measure or scoring function, with the absolute error and the squared error being key examples. The individual scores are averaged over forecast cases, to result in a summary measure of the predictive performance, such as the mean absolute error or the mean squared error. I demonstrate that this common practice can lead to grossly misguided inferences, unless the scoring function and the forecasting task are carefully matched. Effective point forecasting requires that the scoring function be specified ex ante, or that the forecaster receives a directive in the form of a statistical functional, such as the mean or a quantile of the predictive distribution. If the scoring function is specified ex ante, the forecaster can issue the optimal point forecast, namely, the Bayes rule. If the forecaster receives a directive in the form of a functional, it is critical that the scoring function be consistent for it, in the sense that the expected score is minimized when following the directive. A functional is elicitable if there exists a scoring function that is strictly consistent for it. Expectations, ratios of expectations and quantiles are elicitable. For example, a scoring function is consistent for the mean functional if and only if it is a Bregman function. It is consistent for a quantile if and only if it is generalized piecewise linear. Similar characterizations apply to ratios of expectations and to expectiles. Weighted scoring functions are consistent for functionals that adapt to the weighting in peculiar ways. Not all functionals are elicitable; for instance, conditional value-at-risk is not, despite its popularity in quantitative finance.
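The central point of the abstract — that the scoring function determines which point forecast is optimal — can be shown on a tiny skewed example, where squared error elicits the mean and absolute error elicits the median (the data values are invented):

```python
# For a skewed sample, the point forecast minimizing average squared
# error is the mean, while absolute error elicits the median -- so the
# "best" point forecast depends on the scoring function chosen ex ante.
data = [1, 1, 1, 2, 2, 3, 10]            # hypothetical forecast cases
grid = [x / 100 for x in range(0, 1101)]  # candidate point forecasts

def argmin_score(score):
    """Point forecast on the grid with the smallest total score."""
    return min(grid, key=lambda p: sum(score(p, y) for y in data))

best_mse = argmin_score(lambda p, y: (p - y) ** 2)  # -> near the mean, 20/7
best_mae = argmin_score(lambda p, y: abs(p - y))    # -> the median, 2.0
```

The two optima differ markedly (about 2.86 versus 2.0), illustrating why a forecaster needs either the scoring function ex ante or a directive in the form of a functional.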


SIAM Review | 2004

Stochastic Models That Separate Fractal Dimension and the Hurst Effect

Tilmann Gneiting; Martin Schlather

Fractal behavior and long-range dependence have been observed in an astonishing number of physical, biological, geological, and socioeconomic systems. Time series, profiles, and surfaces have been characterized by their fractal dimension, a measure of roughness, and by the Hurst coefficient, a measure of long-memory dependence. Both phenomena have been modeled and explained by self-affine random functions, such as fractional Gaussian noise and fractional Brownian motion. The assumption of statistical self-affinity implies a linear relationship between fractal dimension and Hurst coefficient and thereby links the two phenomena. This article introduces stochastic models that allow for any combination of fractal dimension and Hurst coefficient. Associated software for the synthesis of images with arbitrary, prespecified fractal properties and power-law correlations is available. The new models suggest a test for self-affinity that assesses coupling and decoupling of local and global behavior.
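A well-known example of such a model is the Cauchy class, in which the fractal dimension is governed by one parameter (the behavior at the origin) and the Hurst coefficient by another (the tail) — a sketch of the covariance and the two derived quantities:

```python
def cauchy_cov(h, alpha=1.0, beta=0.5):
    """Cauchy-class correlation c(h) = (1 + |h|**alpha) ** (-beta/alpha),
    with 0 < alpha <= 2 and beta > 0.  For a Gaussian random function in
    R^n, sample paths have fractal dimension D = n + 1 - alpha/2, and
    for 0 < beta < 1 the Hurst coefficient is H = 1 - beta/2 -- the two
    parameters vary independently, unlike in self-affine models, where
    self-affinity forces a linear relationship between D and H."""
    return (1.0 + abs(h) ** alpha) ** (-beta / alpha)

def fractal_dimension(alpha, n=1):
    return n + 1 - alpha / 2

def hurst(beta):
    return 1 - beta / 2
```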


Monthly Weather Review | 2007

Probabilistic Quantitative Precipitation Forecasting Using Bayesian Model Averaging

J. McLean Sloughter; Adrian E. Raftery; Tilmann Gneiting; Chris Fraley

Abstract Bayesian model averaging (BMA) is a statistical way of postprocessing forecast ensembles to create predictive probability density functions (PDFs) for weather quantities. It represents the predictive PDF as a weighted average of PDFs centered on the individual bias-corrected forecasts, where the weights are posterior probabilities of the models generating the forecasts and reflect the forecasts’ relative contributions to predictive skill over a training period. It was developed initially for quantities whose PDFs can be approximated by normal distributions, such as temperature and sea level pressure. BMA does not apply in its original form to precipitation, because the predictive PDF of precipitation is nonnormal in two major ways: it has a positive probability of being equal to zero, and it is skewed. In this study BMA is extended to probabilistic quantitative precipitation forecasting. The predictive PDF corresponding to one ensemble member is a mixture of a discrete component at zero and a gam...
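The discrete-continuous structure described above (a point mass at zero plus a skewed positive part) can be sampled directly. The abstract is truncated, so the gamma form of the continuous component is an assumption here, and all parameter values are illustrative rather than fitted:

```python
import random

random.seed(0)

def sample_precip(p_zero, shape, scale, n):
    """Draw n values from a discrete-continuous mixture: with probability
    p_zero the precipitation amount is exactly zero; otherwise it is a
    positive, skewed amount (assumed gamma here -- the truncated abstract
    names only 'a discrete component at zero and a gam...')."""
    return [0.0 if random.random() < p_zero
            else random.gammavariate(shape, scale)
            for _ in range(n)]

draws = sample_precip(p_zero=0.4, shape=0.7, scale=3.0, n=10000)
frac_dry = sum(1 for x in draws if x == 0.0) / len(draws)
```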


Biometrics | 2009

Predictive Model Assessment for Count Data

Claudia Czado; Tilmann Gneiting; Leonhard Held

We discuss tools for the evaluation of probabilistic forecasts and the critique of statistical models for count data. Our proposals include a nonrandomized version of the probability integral transform, marginal calibration diagrams, and proper scoring rules, such as the predictive deviance. In case studies, we critique count regression models for patent data, and assess the predictive performance of Bayesian age-period-cohort models for larynx cancer counts in Germany. The toolbox applies in Bayesian or classical and parametric or nonparametric settings and to any type of ordered discrete outcomes.
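The nonrandomized probability integral transform mentioned above replaces the ill-defined PIT for discrete outcomes with a piecewise-linear CDF per observation, which is then averaged. A sketch for a Poisson predictive distribution (the Poisson choice and λ value are illustrative):

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), via the pmf sum."""
    if k < 0:
        return 0.0
    return sum(math.exp(-lam) * lam ** i / math.factorial(i)
               for i in range(k + 1))

def pit_cdf(u, y, lam):
    """Nonrandomized PIT for one count observation y under a Poisson(lam)
    predictive: linear interpolation between P(y - 1) and P(y)."""
    lo, hi = poisson_cdf(y - 1, lam), poisson_cdf(y, lam)
    if u <= lo:
        return 0.0
    if u >= hi:
        return 1.0
    return (u - lo) / (hi - lo)

def mean_pit(u, observations, lam):
    """Aggregate PIT over observations; for a calibrated predictive
    model this is close to the uniform CDF, i.e. close to u itself."""
    return sum(pit_cdf(u, y, lam) for y in observations) / len(observations)
```

A useful sanity check: under the true model, the expectation of pit_cdf(u, Y, λ) over Y equals u exactly, which is what makes the aggregate PIT comparable to a uniform CDF.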


Journal of the American Statistical Association | 2006

Calibrated Probabilistic Forecasting at the Stateline Wind Energy Center: The Regime-Switching Space–Time Method

Tilmann Gneiting; Kristin Larson; Kenneth Westrick; Marc G. Genton; Eric M. Aldrich

With the global proliferation of wind power, the need for accurate short-term forecasts of wind resources at wind energy sites is becoming paramount. Regime-switching space–time (RST) models merge meteorological and statistical expertise to obtain accurate and calibrated, fully probabilistic forecasts of wind speed and wind power. The model formulation is parsimonious, yet takes into account all of the salient features of wind speed: alternating atmospheric regimes, temporal and spatial correlation, diurnal and seasonal nonstationarity, conditional heteroscedasticity, and non-Gaussianity. The RST method identifies forecast regimes at a wind energy site and fits a conditional predictive model for each regime. Geographically dispersed meteorological observations in the vicinity of the wind farm are used as off-site predictors. The RST technique was applied to 2-hour-ahead forecasts of hourly average wind speed near the Stateline wind energy center in the U.S. Pacific Northwest. The RST point forecasts and distributional forecasts were accurate, calibrated, and sharp, and they compared favorably with predictions based on state-of-the-art time series techniques. This suggests that quality meteorological data from sites upwind of wind farms can be efficiently used to improve short-term forecasts of wind resources.
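The regime-switching idea — classify the current situation into a regime, then apply a predictive model fitted to that regime — can be sketched on synthetic data. The regimes, coefficients, and noise below are invented for illustration and are far simpler than the RST models in the article:

```python
import random

random.seed(2)

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

# Synthetic data: an off-site observation determines the regime, and the
# relationship between off-site predictor and local wind speed differs
# by regime (hypothetical regimes and coefficients).
data = []
for _ in range(500):
    offsite = random.uniform(0, 10)
    regime = "westerly" if offsite > 5 else "easterly"
    slope = 1.2 if regime == "westerly" else 0.3
    local = 2.0 + slope * offsite + random.gauss(0, 0.5)
    data.append((regime, offsite, local))

# One conditional predictive model per regime, as in the RST approach.
models = {}
for reg in ("westerly", "easterly"):
    xs = [x for r, x, _ in data if r == reg]
    ys = [y for r, _, y in data if r == reg]
    models[reg] = fit_line(xs, ys)

def predict(offsite):
    reg = "westerly" if offsite > 5 else "easterly"
    a, b = models[reg]
    return a + b * offsite
```

Fitting one line per regime recovers both slopes, whereas a single global line is biased wherever the regimes disagree.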


Journal of the American Statistical Association | 2010

Probabilistic Wind Speed Forecasting Using Ensembles and Bayesian Model Averaging

J. McLean Sloughter; Tilmann Gneiting; Adrian E. Raftery

The current weather forecasting paradigm is deterministic, based on numerical models. Multiple estimates of the current state of the atmosphere are used to generate an ensemble of deterministic predictions. Ensemble forecasts, while providing information on forecast uncertainty, are often uncalibrated. Bayesian model averaging (BMA) is a statistical ensemble postprocessing method that creates calibrated predictive probability density functions (PDFs). Probabilistic wind forecasting presents two challenges: a skewed distribution, and observations that are coarsely discretized. We extend BMA to wind speed, taking account of these challenges. This method provides calibrated and sharp probabilistic forecasts. Comparisons are made between several formulations.

Collaboration


Dive into Tilmann Gneiting's collaboration.

Top Co-Authors:

- Werner Ehm (Dresden University of Technology)
- Peter Guttorp (University of Washington)
- Chris Fraley (University of Washington)
- Eric P. Grimit (University of Washington)
- Adrian Raftery (National Center for Atmospheric Research)