Publication


Featured research published by Dorit Hammerling.


Journal of Computational and Graphical Statistics | 2015

A multi-resolution Gaussian process model for the analysis of large spatial data sets

Douglas Nychka; Soutir Bandyopadhyay; Dorit Hammerling; Finn Lindgren; Stephan R. Sain

We develop a multiresolution model to predict two-dimensional spatial fields based on irregularly spaced observations. The radial basis functions at each level of resolution are constructed using a Wendland compactly supported correlation function with the nodes arranged on a rectangular grid. The grid at each finer level increases by a factor of two and the basis functions are scaled to have a constant overlap. The coefficients associated with the basis functions at each level of resolution are distributed according to a Gaussian Markov random field (GMRF) and take advantage of the fact that the basis is organized as a lattice. Several numerical examples and analytical results establish that this scheme gives a good approximation to standard covariance functions such as the Matérn and also has flexibility to fit more complicated shapes. The other important feature of this model is that it can be applied to statistical inference for large spatial datasets because key matrices in the computations are sparse. The computational efficiency applies to both the evaluation of the likelihood and spatial predictions.


Journal of Geophysical Research | 2012

Mapping of CO2 at high spatiotemporal resolution using satellite observations: Global distributions from OCO-2

Dorit Hammerling; Anna M. Michalak; S. Randolph Kawa

Satellite observations of CO2 offer new opportunities to improve our understanding of the global carbon cycle. Using such observations to infer global maps of atmospheric CO2 and their associated uncertainties can provide key information about the distribution and dynamic behavior of CO2, through comparison to atmospheric CO2 distributions predicted from biospheric, oceanic, or fossil fuel flux emissions estimates coupled with atmospheric transport models. Ideally, these maps should be at temporal resolutions that are short enough to represent and capture the synoptic dynamics of atmospheric CO2. This study presents a geostatistical method that accomplishes this goal. The method can extract information about the spatial covariance structure of the CO2 field from the available CO2 retrievals, yields full coverage (Level 3) maps at high spatial resolutions, and provides estimates of the uncertainties associated with these maps. The method does not require information about CO2 fluxes or atmospheric transport, such that the Level 3 maps are informed entirely by available retrievals. The approach is assessed by investigating its performance using synthetic OCO-2 data generated from the PCTM/GEOS-4/CASA-GFED model, for time periods ranging from 1 to 16 days and a target spatial resolution of 1° latitude × 1.25° longitude. Results show that global CO2 fields from OCO-2 observations can be predicted well at surprisingly high temporal resolutions. Even one-day Level 3 maps reproduce the large-scale features of the atmospheric CO2 distribution, and yield realistic uncertainty bounds. Temporal resolutions of two to four days result in the best performance for a wide range of investigated scenarios, providing maps at an order of magnitude higher temporal resolution relative to the monthly or seasonal Level 3 maps typically reported in the literature.


Statistics and Computing | 2017

Parallel inference for massive distributed spatial data using low-rank models

Matthias Katzfuss; Dorit Hammerling

Due to rapid data growth, statistical analysis of massive datasets often has to be carried out in a distributed fashion, either because several datasets stored in separate physical locations are all relevant to a given problem, or simply to achieve faster (parallel) computation through a divide-and-conquer scheme. In both cases, the challenge is to obtain valid inference that does not require processing all data at a single central computing node. We show that for a very widely used class of spatial low-rank models, which can be written as a linear combination of spatial basis functions plus a fine-scale-variation component, parallel spatial inference and prediction for massive distributed data can be carried out exactly, meaning that the results are the same as for a traditional, non-distributed analysis. The communication cost of our distributed algorithms does not depend on the number of data points. After extending our results to the spatio-temporal case, we illustrate our methodology by carrying out distributed spatio-temporal particle filtering inference on total precipitable water measured by three different satellite sensor systems.


International Conference on Conceptual Structures | 2016

Towards Characterizing the Variability of Statistically Consistent Community Earth System Model Simulations

Daniel Milroy; Allison H. Baker; Dorit Hammerling; John M. Dennis; Sheri Mickelson; Elizabeth R. Jessup

Large, complex codes such as earth system models are in a constant state of development, requiring frequent software quality assurance. The recently developed Community Earth System Model (CESM) Ensemble Consistency Test (CESM-ECT) provides an objective measure of statistical consistency for new CESM simulation runs, which has greatly facilitated error detection and rapid feedback for model users and developers. CESM-ECT determines consistency based on an ensemble of simulations that represent the same earth system model. Its statistical distribution embodies the natural variability of the model. Clearly, the composition of the employed ensemble is critical to CESM-ECT's effectiveness. In this work we examine whether the composition of the CESM-ECT ensemble is adequate for characterizing the variability of a consistent climate. To this end, we introduce minimal code changes into CESM that should pass the CESM-ECT, and we evaluate the composition of the CESM-ECT ensemble in this context. We suggest an improved ensemble composition that better captures the accepted variability induced by code changes, compiler changes, and optimizations, thus more precisely facilitating the detection of errors in the CESM hardware or software stack as well as enabling more in-depth code optimization and the adoption of new technologies.


Journal of Geophysical Research | 2015

Detectability of CO2 flux signals by a space-based lidar mission

Dorit Hammerling; S. Randolph Kawa; Kevin Schaefer; Scott C. Doney; Anna M. Michalak

Satellite observations of carbon dioxide (CO2) offer novel and distinctive opportunities for improving our quantitative understanding of the carbon cycle. Prospective observations include those from space-based lidar such as the Active Sensing of CO2 Emissions over Nights, Days, and Seasons (ASCENDS) mission. Here we explore the ability of such a mission to detect regional changes in CO2 fluxes. We investigate these using three prototypical case studies, namely, the thawing of permafrost in the northern high latitudes, the shifting of fossil fuel emissions from Europe to China, and changes in the source/sink characteristics of the Southern Ocean. These three scenarios were used to design signal detection studies to investigate the ability to detect the unfolding of these scenarios compared to a baseline scenario. Results indicate that the ASCENDS mission could detect the types of signals investigated in this study, with the caveat that the study is based on some simplifying assumptions. The permafrost thawing flux perturbation is readily detectable at a high level of significance. The fossil fuel emission detectability is directly related to the strength of the signal and the level of measurement noise. For a nominal (lower) fossil fuel emission signal, only the idealized noise-free instrument test case produces a clearly detectable signal, while experiments with more realistic noise levels capture the signal only in the higher (exaggerated) signal case. For the Southern Ocean scenario, differences due to the natural variability in the El Niño–Southern Oscillation climatic mode are primarily detectable as a zonal increase.


Geophysical Research Letters | 2017

A Bayesian hierarchical model for climate change detection and attribution

Matthias Katzfuss; Dorit Hammerling; Richard L. Smith

Regression-based detection and attribution methods continue to take a central role in the study of climate change and its causes. Here we propose a novel Bayesian hierarchical approach to this problem, which allows us to address several open methodological questions. Specifically, we take into account the uncertainties in the true temperature change due to imperfect measurements, the uncertainty in the true climate signal under different forcing scenarios due to the availability of only a small number of climate model simulations, and the uncertainty associated with estimating the climate variability covariance matrix, including the truncation of the number of empirical orthogonal functions (EOFs) in this covariance matrix. We apply Bayesian model averaging to assign optimal probabilistic weights to different possible truncations and incorporate all uncertainties into the inference on the regression coefficients. We provide an efficient implementation of our method in a software package and illustrate its use with a realistic application.


Journal of the American Statistical Association | 2018

Compression and Conditional Emulation of Climate Model Output

Joseph Guinness; Dorit Hammerling

Numerical climate model simulations run at high spatial and temporal resolutions generate massive quantities of data. As our computing capabilities continue to increase, storing all of the data is not sustainable, and thus it is important to develop methods for representing the full datasets by smaller compressed versions. We propose a statistical compression and decompression algorithm based on storing a set of summary statistics as well as a statistical model describing the conditional distribution of the full dataset given the summary statistics. We decompress the data by computing conditional expectations and conditional simulations from the model given the summary statistics. Conditional expectations represent our best estimate of the original data but are subject to oversmoothing in space and time. Conditional simulations introduce realistic small-scale noise so that the decompressed fields are neither too smooth nor too rough compared with the original data. Considerable attention is paid to accurately modeling the original dataset (1 year of daily mean temperature data), particularly with regard to the inherent spatial nonstationarity in global fields, and to determining the statistics to be stored, so that the variation in the original data can be closely captured, while allowing for fast decompression and conditional emulation on modest computers. Supplementary materials for this article are available online.


Journal of Geophysical Research | 2018

On the Ability of Space-Based Passive and Active Remote Sensing Observations of CO2 to Detect Flux Perturbations to the Carbon Cycle

Sean Crowell; S. Randolph Kawa; Edward V. Browell; Dorit Hammerling; Berrien Moore; Kevin Schaefer; Scott C. Doney

Space-borne observations of CO2 are vital to gaining understanding of the carbon cycle in regions of the world that are difficult to measure directly, such as the tropical terrestrial biosphere, the high northern and southern latitudes, and in developing nations such as China. Measurements from passive instruments such as GOSAT and OCO-2, however, are constrained by solar zenith angle limitations as well as sensitivity to the presence of clouds and aerosols. Active measurements such as those in development for the Active Sensing of CO2 Emissions over Nights, Days and Seasons (ASCENDS) mission show strong potential for making measurements in the high-latitude winter and in cloudy regions. In this work we examine the enhanced flux constraint provided by the improved coverage from an active measurement such as ASCENDS. The simulation studies presented here show that with sufficient precision, ASCENDS will detect permafrost thaw and fossil fuel emissions shifts at annual and seasonal time scales, even in the presence of transport errors, representativeness errors, and biogenic flux errors. While OCO-2 can detect some of these perturbations at the annual scale, the seasonal sampling provided by ASCENDS provides the stronger constraint.

Plain Language Summary: Active and passive remote sensors show the potential to provide unprecedented information on the carbon cycle. With the all-season sampling, active remote sensors are more capable of constraining high-latitude emissions. The reduced sensitivity to cloud and aerosol also makes active sensors more capable of providing information in cloudy and polluted scenes with sufficient accuracy. These experiments account for errors that are fundamental to the top-down approach for constraining emissions, and even including these sources of error, we show that satellite remote sensors are critical for understanding the carbon cycle.


Archive | 2015

Incorporating MAGMA into the 'fields' spatial statistics package

Doug Nychka; John Paige; Isaac Lyngaas; Vinay Ramakrishnaiah; Dorit Hammerling; Raghu Kumar

In this report we describe how to incorporate the Cholesky decomposition from the Matrix Algebra on GPU and Multicore Architectures (MAGMA) library into some of the calculations of the ‘fields’ spatial statistics package in R. We provide MAGMA installation instructions as well as demonstrations of performance when applied to simulated datasets and the CO2 dataset available in fields. While there are other spatial statistics packages in R using parallelism, such as bigGP and parspatstat, none to our knowledge directly incorporates GPUs or other coprocessors. Our code is timed on Caldera computational nodes in the National Center for Atmospheric Research’s Yellowstone supercomputing environment. We find that for 40,000 × 40,000 matrices the MAGMA-accelerated decomposition has a 30.7 and 46.2 times speedup for 1 and 2 GPU implementations respectively over chol, the standard Cholesky decomposition function in R (with settings allowing R programmers to use our accelerated function as they would chol). The speedups are greater when using in-place calculations where the original matrix is overwritten and not copied. In that case, the equivalent speedups are 41.8 and 54.4 times for in-place decompositions on one and two GPUs respectively. We also time a simple spatial analysis workflow with maximum likelihood estimation with up to over 23,000 observations, where accelerated workflows achieved approximately 4.2 and 4.3 times speedup when using 1 and 2 GPUs respectively over a corresponding unaccelerated workflow. As problem size increases, speedups improve, and the 2 GPU decompositions perform increasingly well compared to their corresponding 1 GPU implementations. Performance for 2 GPU decompositions is slower than with 1 GPU in some cases due to additional communication overheads and data dependencies in the Cholesky decomposition algorithm, and will be explored further in Ramakrishnaiah et al. (2015).


PLOS ONE | 2014

Completing the results of the 2013 Boston marathon.

Dorit Hammerling; Matthew Cefalu; Jessi Cisewski; Francesca Dominici; Giovanni Parmigiani; Charles Paulson; Richard L. Smith

The 2013 Boston marathon was disrupted by two bombs placed near the finish line. The bombs resulted in three deaths and several hundred injuries. Of lesser concern, in the immediate aftermath, was the fact that nearly 6,000 runners failed to finish the race. We were approached by the marathon's organizers, the Boston Athletic Association (BAA), and asked to recommend a procedure for projecting finish times for the runners who could not complete the race. With assistance from the BAA, we created a dataset consisting of all the runners in the 2013 race who reached the halfway point but failed to finish, as well as all runners from the 2010 and 2011 Boston marathons. The data consist of split times from each of the 5 km sections of the course, as well as the final 2.2 km (from 40 km to the finish). The statistical objective is to predict the missing split times for the runners who failed to finish in 2013. We set this problem in the context of the matrix completion problem, examples of which include imputing missing data in DNA microarray experiments, and the Netflix prize problem. We propose five prediction methods and create a validation dataset to measure their performance by mean squared error and other measures. The best method used local regression based on a K-nearest-neighbors algorithm (KNN method), though several other methods produced results of similar quality. We show how the results were used to create projected times for the 2013 runners and discuss potential for future application of the same methodology. We present the whole project as an example of reproducible research, in that we are able to make the full data and all the algorithms we have used publicly available, which may facilitate future research extending the methods or proposing completely different approaches.

Collaboration


Dive into Dorit Hammerling's collaborations.

Top Co-Authors

Anna M. Michalak
Carnegie Institution for Science

Allison H. Baker
National Center for Atmospheric Research

Haiying Xu
National Center for Atmospheric Research

S. Randolph Kawa
Goddard Space Flight Center

Doug Nychka
National Center for Atmospheric Research

Douglas Nychka
National Center for Atmospheric Research

Daniel Milroy
University of Colorado Boulder

John M. Dennis
National Center for Atmospheric Research