
The SWAG solution for probabilistic predictions with a single neural network


Abstract


Ensemble predictions are essential to characterize forecast uncertainty and the likelihood of an event occurring. Stochasticity in predictions comes from data and model uncertainty. In deep learning (DL), data uncertainty can be approached by training an ensemble of DL models on data subsets or by performing data augmentations (e.g., random or singular value decomposition (SVD) perturbations). Model uncertainty is typically addressed by training a DL model multiple times from different weight initializations (DeepEnsemble) or by training sub-networks by dropping weights (Dropout). Dropout is cheap but less effective, while DeepEnsemble is computationally expensive.

We propose instead to tackle model uncertainty with SWAG (Maddox et al., 2019), a method to learn stochastic weights whose sampling allows hundreds of forecast realizations to be drawn at a fraction of the cost required by DeepEnsemble. In the context of data-driven weather forecasting, we demonstrate that the SWAG ensemble i) has better deterministic skill than a single DL model trained in the usual way, and ii) approaches the deterministic and probabilistic skill of DeepEnsemble at a fraction of the cost. Finally, multiSWAG (SWAG applied on top of DeepEnsemble models) provides a trade-off between computational cost, model diversity, and performance.

We believe that the method we present will become a common tool to generate large ensembles at a fraction of the current cost. Additionally, the possibility of sampling DL models allows the design of data-driven/emulated stochastic model components and sub-grid parameterizations.

Reference

Maddox, W. J., Garipov, T., Izmailov, P., Vetrov, D., and Wilson, A. G., 2019: A Simple Baseline for Bayesian Uncertainty in Deep Learning. arXiv:1902.02476
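Illustration (not part of the abstract): a minimal sketch of how SWAG weight moments can be collected along an SGD trajectory and then sampled to obtain ensemble members, following the diagonal variant of Maddox et al. (2019). It assumes a generic PyTorch setup; model, loader, loss_fn, and optimizer are hypothetical placeholders, and the exact training schedule used in our experiments is not shown.

import copy
import torch

def flatten_params(model):
    # Concatenate all model parameters into one 1-D tensor.
    return torch.cat([p.detach().reshape(-1) for p in model.parameters()])

def collect_swag_moments(model, loader, loss_fn, optimizer, n_epochs=20):
    # Track running first and second moments of the weights,
    # updated once per epoch along the SGD trajectory.
    mean = torch.zeros_like(flatten_params(model))
    sq_mean = torch.zeros_like(mean)
    n = 0
    for epoch in range(n_epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        w = flatten_params(model)
        mean = (n * mean + w) / (n + 1)
        sq_mean = (n * sq_mean + w ** 2) / (n + 1)
        n += 1
    return mean, sq_mean

def sample_swag_model(model, mean, sq_mean, scale=0.5):
    # Draw one weight realization from the diagonal Gaussian approximation
    # and load it into a copy of the model: one ensemble member per call.
    var = torch.clamp(sq_mean - mean ** 2, min=1e-30)
    sample = mean + scale * var.sqrt() * torch.randn_like(mean)
    member = copy.deepcopy(model)
    offset = 0
    for p in member.parameters():
        numel = p.numel()
        p.data.copy_(sample[offset:offset + numel].view_as(p))
        offset += numel
    return member

# Example: draw 100 forecast realizations from a single trained network.
# members = [sample_swag_model(model, mean, sq_mean) for _ in range(100)]

Under this reading, multiSWAG simply repeats the moment collection for each independently initialized DeepEnsemble member and pools the samples drawn from all of them.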

DOI 10.5194/EGUSPHERE-EGU21-2401
Language English
