Publication


Featured research published by Andrew O. Finley.


Computational Statistics & Data Analysis | 2009

Improving the performance of predictive process modeling for large datasets

Andrew O. Finley; Huiyan Sang; Sudipto Banerjee; Alan E. Gelfand

Advances in Geographical Information Systems (GIS) and Global Positioning Systems (GPS) enable accurate geocoding of locations where scientific data are collected. This has encouraged collection of large spatial datasets in many fields and has generated considerable interest in statistical modeling for location-referenced spatial data. The setting where the number of locations yielding observations is too large to fit the desired hierarchical spatial random effects models using Markov chain Monte Carlo methods is considered. This problem is exacerbated in spatial-temporal and multivariate settings where many observations occur at each location. The recently proposed predictive process, motivated by kriging ideas, aims to maintain the richness of desired hierarchical spatial modeling specifications in the presence of large datasets. A shortcoming of the original formulation of the predictive process is that it induces a positive bias in the non-spatial error term of the models. A modified predictive process is proposed to address this problem. The predictive process approach is knot-based leading to questions regarding knot design. An algorithm is designed to achieve approximately optimal spatial placement of knots. Detailed illustrations of the modified predictive process using multivariate spatial regression with both a simulated and a real dataset are offered.


Journal of the American Statistical Association | 2016

Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

Abhirup Datta; Sudipto Banerjee; Andrew O. Finley; Alan E. Gelfand

Spatial process models for analyzing geostatistical data entail computations that become prohibitive as the number of spatial locations becomes large. This article develops a class of highly scalable nearest-neighbor Gaussian process (NNGP) models to provide fully model-based inference for large geostatistical datasets. We establish that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices. We embed the NNGP as a sparsity-inducing prior within a rich hierarchical modeling framework and outline how computationally efficient Markov chain Monte Carlo (MCMC) algorithms can be executed without storing or decomposing large matrices. The number of floating point operations (flops) per iteration of this algorithm is linear in the number of spatial locations, thereby rendering substantial scalability. We illustrate the computational and inferential benefits of the NNGP over competing methods using simulation studies and also analyze forest biomass from a massive U.S. Forest Inventory dataset at a scale that precludes alternative dimension-reducing methods. Supplementary materials for this article are available online.


Ecological Monographs | 2012

Tropical tree growth is correlated with soil phosphorus, potassium, and calcium, though not for legumes

Thomas W. Baribault; Richard K. Kobe; Andrew O. Finley

Tropical forest productivity is widely assumed to be limited by soil phosphorus (P), but biogeochemical processes that deplete P also could deplete base cations, suggesting multiple resource limitation. Limitation by several resources could arise from species and functional diversity and from variation among groups in resource requirements, including ecophysiological strategies that minimize P limitation. We hypothesized that tree growth is positively related to soil base cation and P availability and negatively related to local competition; Fabaceae growth is weakly correlated with soil resources if fixed N is used indirectly to acquire other resources; growth of species with low wood density is more strongly related to soil resource availability than that of species with high wood density. Diameter growth and soil resource availability were measured in five mapped stands situated across natural soil resource gradients in lowland wet tropical forest (La Selva Biological Station, Costa Rica). Soil resourc...


Computational Statistics & Data Analysis | 2012

Approximate Bayesian inference for large spatial datasets using predictive process models

Jo Eidsvik; Andrew O. Finley; Sudipto Banerjee; Håvard Rue

The challenges of fitting hierarchical spatial models to large datasets are addressed. With the increasing availability of geocoded scientific data, hierarchical models involving spatial processes have become a popular method for carrying out spatial inference. Such models are customarily estimated using Markov chain Monte Carlo algorithms that, while immensely flexible, can become prohibitively expensive. In particular, fitting hierarchical spatial models often involves expensive decompositions of dense matrices whose computational complexity increases in cubic order with the number of spatial locations. Such matrix computations are required in each iteration of the Markov chain Monte Carlo algorithm, rendering them infeasible for large spatial datasets. The computational challenges in analyzing large spatial datasets are considered by merging two recent developments. First, the predictive process model is used as a reduced-rank spatial process, to diminish the dimensionality of the model. Then a computational framework is developed for estimating predictive process models using the integrated nested Laplace approximation. The settings where the first stage likelihood is Gaussian or non-Gaussian are discussed. Issues such as predictions and model comparisons are also discussed. Results are presented for synthetic data and several environmental datasets.


Journal of the American Statistical Association | 2010

Hierarchical Spatial Process Models for Multiple Traits in Large Genetic Trials

Sudipto Banerjee; Andrew O. Finley; Patrik Waldmann; Tore Ericsson

This article expands upon recent interest in Bayesian hierarchical models in quantitative genetics by developing spatial process models for inference on additive and dominance genetic variance within the context of large spatially referenced trial datasets of multiple traits of interest. Direct application of such multivariate models to large spatial datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. The situation is even worse in Markov chain Monte Carlo (MCMC) contexts where such computations are performed for several thousand iterations. Here, we discuss approaches that help obviate these hurdles without sacrificing the richness in modeling. For genetic effects, we demonstrate how an initial spectral decomposition of the relationship matrices negates the expensive matrix inversions required in previously proposed MCMC methods. For spatial effects we discuss a multivariate predictive process that reduces the computational burden by projecting the original process onto a subspace generated by realizations of the original process at a specified set of locations (or knots). We illustrate the proposed methods using a synthetic dataset with multivariate additive and dominance genetic effects and anisotropic spatial residuals, and a large dataset from a Scots pine (Pinus sylvestris L.) progeny study conducted in northern Sweden. Our approaches enable us to provide a comprehensive analysis of this large trial which amply demonstrates that, in addition to violating basic assumptions of the linear model, ignoring spatial effects can result in downwardly biased measures of heritability.


Frontiers in Ecology and the Environment | 2014

Approaches to advance scientific understanding of macrosystems ecology

Ofir Levy; Becky A. Ball; Ben Bond-Lamberty; Kendra Spence Cheruvelil; Andrew O. Finley; Noah R. Lottig; Surangi W. Punyasena; Jingfeng Xiao; Jizhong Zhou; Lauren B. Buckley; Christopher T. Filstrup; Timothy H. Keitt; James R. Kellner; Alan K. Knapp; Andrew D. Richardson; David K. Tcheng; Michael Toomey; Rodrigo Vargas; James W. Voordeckers; Tyler Wagner; John W. Williams

The emergence of macrosystems ecology (MSE), which focuses on regional- to continental-scale ecological patterns and processes, builds upon a history of long-term and broad-scale studies in ecology. Scientists face the difficulty of integrating the many elements that make up macrosystems, which consist of hierarchical processes at interacting spatial and temporal scales. Researchers must also identify the most relevant scales and variables to be considered, the required data resources, and the appropriate study design to provide the proper inferences. The large volumes of multi-thematic data often associated with macrosystem studies typically require validation, standardization, and assimilation. Finally, analytical approaches need to describe how cross-scale and hierarchical dynamics and interactions relate to macroscale phenomena. Here, we elaborate on some key methodological challenges of MSE research and discuss existing and novel approaches to meet them.


Environmental Research Letters | 2013

Permafrost and organic layer interactions over a climate gradient in a discontinuous permafrost zone

Kristofer Johnson; Jennifer W. Harden; A. David McGuire; Mark Clark; Fengming Yuan; Andrew O. Finley

Permafrost is tightly coupled to the organic soil layer, an interaction that mediates permafrost degradation in response to regional warming. We analyzed changes in permafrost occurrence and organic layer thickness (OLT) using more than 3000 soil pedons across a mean annual temperature (MAT) gradient. Cause and effect relationships between permafrost probability (PF), OLT, and other topographic factors were investigated using structural equation modeling in a multi-group analysis. Groups were defined by slope, soil texture type, and shallow (<28 cm) versus deep (≥28 cm) organic layers. The probability of observing permafrost sharply increased by 0.32 for every 10-cm OLT increase in shallow OLT soils (OLTs) due to an insulation effect, but PF decreased in deep OLT soils (OLTd) by 0.06 for every 10-cm increase. Across the MAT gradient, PF in sandy soils varied little, but PF in loamy and silty soils decreased substantially from cooler to warmer temperatures. The change in OLT was more heterogeneous across soil texture types: in some there was no change, while in others OLTs soils thinned and/or OLTd soils thickened at warmer locations. Furthermore, when soil organic carbon was estimated using a relationship with thickness, the average increase in carbon in OLTd soils was almost four times greater compared to the average decrease in carbon in OLTs soils across all soil types. If soils follow a trajectory of warming that mimics the spatial gradients found today, then heterogeneities of permafrost degradation and organic layer thinning and thickening should be considered in the regional carbon balance.


Journal of Agricultural, Biological, and Environmental Statistics | 2008

Bayesian multivariate process modeling for prediction of forest attributes

Andrew O. Finley; Sudipto Banerjee; Alan R. Ek; Ronald E. McRoberts

This article investigates multivariate spatial process models suitable for predicting multiple forest attributes using a multisource forest inventory approach. Such data settings involve several spatially dependent response variables arising in each location. Not only does each variable vary across space, but the variables are also likely to be correlated among themselves. Traditional approaches have attempted to model such data using simplifying assumptions, such as a common rate of decay in the spatial correlation or simplified cross-covariance structures among the response variables. Our current focus is to produce spatially explicit, tree-species-specific predictions of forest biomass per hectare over a region of interest. Modeling such associations presents challenges in terms of validity of probability distributions as well as issues concerning identifiability and estimability of parameters. Our template encompasses several models with different correlation structures. These models represent different hypotheses whose tenability is assessed using formal model comparisons. We adopt a Bayesian hierarchical approach offering a sampling-based inferential framework using efficient Markov chain Monte Carlo methods for estimating model parameters.


Journal of Geographical Systems | 2012

Bayesian dynamic modeling for large space-time datasets using Gaussian predictive processes

Andrew O. Finley; Sudipto Banerjee; Alan E. Gelfand

In this paper, we extend the applicability of a previously proposed class of dynamic space-time models by enabling them to accommodate large datasets. We focus on the common setting where space is viewed as continuous but time is taken to be discrete. Scalability is achieved by using a low-rank predictive process to reduce the dimensionality of the data and ease the computational burden of estimating the spatio-temporal process of interest. The proposed models are illustrated using weather station data collected over the northeastern United States between 2000 and 2005. Here our interest is to use readily available predictors, association among measurements at a given station, as well as dependence across space and time to improve prediction for incomplete station records and locations where station data does not exist.


The Annals of Applied Statistics | 2009

Hierarchical spatial models for predicting tree species assemblages across large domains

Andrew O. Finley; Sudipto Banerjee; Ronald E. McRoberts

Spatially explicit data layers of tree species assemblages, referred to as forest types or forest type groups, are a key component in large-scale assessments of forest sustainability, biodiversity, timber biomass, carbon sinks and forest health monitoring. This paper explores the utility of coupling georeferenced national forest inventory (NFI) data with readily available and spatially complete environmental predictor variables through spatially-varying multinomial logistic regression models to predict forest type groups across large forested landscapes. These models exploit underlying spatial associations within the NFI plot array and the spatially-varying impact of predictor variables to improve the accuracy of forest type group predictions. The richness of these models incurs onerous computational burdens and we discuss dimension reducing spatial processes that retain the richness in modeling. We illustrate using NFI data from Michigan, USA, where we provide a comprehensive analysis of this large study area and demonstrate improved prediction with associated measures of uncertainty.

Collaboration


Dive into Andrew O. Finley's collaborations.

Top Co-Authors

Bruce D. Cook

Goddard Space Flight Center

Chad Babcock

University of Washington

John B. Bradford

United States Geological Survey

Ronald E. McRoberts

United States Forest Service

Alan R. Ek

University of Minnesota

Benjamin Zuckerberg

University of Wisconsin-Madison
