Hans J. Skaug | Researchain

Publication

Featured research published by Hans J. Skaug.

Optimization Methods & Software | 2012

AD Model Builder: using automatic differentiation for statistical inference of highly parameterized complex nonlinear models

David A. Fournier; Hans J. Skaug; Johnoel Ancheta; James N. Ianelli; Arni Magnusson; Mark N. Maunder; Anders Paarup Nielsen; John R. Sibert

Many criteria for statistical parameter estimation, such as maximum likelihood, are formulated as a nonlinear optimization problem. Automatic Differentiation Model Builder (ADMB) is a programming framework based on automatic differentiation, aimed at highly nonlinear models with a large number of parameters. The benefits of using AD are computational efficiency and high numerical accuracy, both crucial in many practical problems. We describe the basic components and the underlying philosophy of ADMB, with an emphasis on functionality found in no other statistical software. One example of such a feature is the generic implementation of Laplace approximation of high-dimensional integrals for use in latent variable models. We also review the literature in which ADMB has been used, and discuss future development of ADMB as an open source project. Overall, the main advantages of ADMB are flexibility, speed, precision, stability and built-in methods to quantify uncertainty.

Journal of Statistical Software | 2016

TMB: Automatic Differentiation and Laplace Approximation

Kasper Kristensen; Anders Henry Nielsen; Casper Willestofte Berg; Hans J. Skaug; Brad Bell

TMB is an open source R package that enables quick implementation of complex nonlinear random effect (latent variable) models in a manner similar to the established AD Model Builder package (ADMB, admb-project.org). In addition, it offers easy access to parallel computations. The user defines the joint likelihood for the data and the random effects as a C++ template function, while all the other operations are done in R; e.g., reading in the data. The package evaluates and maximizes the Laplace approximation of the marginal likelihood where the random effects are automatically integrated out. This approximation, and its derivatives, are obtained using automatic differentiation (up to order three) of the joint likelihood. The computations are designed to be fast for problems with many random effects (~10^6) and parameters (~10^3). Computation times using ADMB and TMB are compared on a suite of examples ranging from simple models to large spatial models where the random effects are a Gaussian random field. Speedups ranging from 1.5 to about 100 are obtained with increasing gains for large problems. The package and examples are available at this http URL
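For reference, the Laplace approximation of the marginal likelihood mentioned above can be written in its standard textbook form. The symbols are notation chosen here for illustration, not taken from the abstract: f(u, θ) is the negative joint log-likelihood of the data and the random effects u, n is the number of random effects, and θ holds the parameters.

```latex
L(\theta) \;=\; \int_{\mathbb{R}^n} \exp\bigl(-f(u,\theta)\bigr)\,du
\;\approx\; (2\pi)^{n/2}\,\det\bigl(H(\theta)\bigr)^{-1/2}\,
\exp\bigl(-f(\hat{u}(\theta),\theta)\bigr),
\qquad
\hat{u}(\theta) = \arg\min_{u} f(u,\theta),
\quad
H(\theta) = \left.\frac{\partial^2 f(u,\theta)}{\partial u\,\partial u^{\top}}\right|_{u=\hat{u}(\theta)}
```

Maximizing this approximation over θ requires derivatives of a quantity that already involves the second-order term H, which is consistent with the abstract's remark that automatic differentiation up to order three is used.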

Methods in Ecology and Evolution | 2013

Strategies for fitting nonlinear ecological models in R, AD Model Builder, and BUGS

Benjamin M. Bolker; Beth Gardner; Mark N. Maunder; Casper Willestofte Berg; Mollie E. Brooks; Liza S. Comita; Elizabeth E. Crone; Sarah Cubaynes; Trevor Davies; Perry de Valpine; Jessica Ford; Olivier Gimenez; Marc Kéry; Eun Jung Kim; Cleridy E. Lennert-Cody; Arni Magnusson; Steve Martell; John C. Nash; Anders Paarup Nielsen; Jim Regetz; Hans J. Skaug; Elise F. Zipkin

1. Ecologists often use nonlinear fitting techniques to estimate the parameters of complex ecological models, with attendant frustration. This paper compares three open-source model fitting tools and discusses general strategies for defining and fitting models. 2. R is convenient and (relatively) easy to learn, AD Model Builder is fast and robust but comes with a steep learning curve, while BUGS provides the greatest flexibility at the price of speed. 3. Our model-fitting suggestions range from general cultural advice (where possible, use the tools and models that are most common in your subfield) to specific suggestions about how to change the mathematical description of models to make them more amenable to parameter estimation. 4. A companion web site (https://groups.nceas.ucsb.edu/nonlinear-modeling/projects) presents detailed examples of application of the three tools to a variety of typical ecological estimation problems; each example links both to a detailed project report and to full source code and data.

Methods in Ecology and Evolution | 2015

Spatial factor analysis: a new tool for estimating joint species distributions and correlations in species range

James T. Thorson; Mark D. Scheuerell; Andrew O. Shelton; Kevin See; Hans J. Skaug; Kasper Kristensen

Predicting and explaining the distribution and density of species is one of the oldest concerns in ecology. Species distributions can be estimated using geostatistical methods, which estimate a latent spatial variable explaining observed variation in densities, but geostatistical methods may be imprecise for species with low densities or few observations. Additionally, simple geostatistical methods fail to account for correlations in distribution among species and generally estimate such cross-correlations as a post hoc exercise. We therefore present spatial factor analysis (SFA), a spatial model for estimating a low-rank approximation to multivariate data, and use it to jointly estimate the distribution of multiple species simultaneously. We also derive an analytic estimate of cross-correlations among species from SFA parameters. As a first example, we show that distributions for 10 bird species in the breeding bird survey in 2012 can be parsimoniously represented using only five spatial factors. As a second case study, we show that forward prediction of catches for 20 rockfishes (Sebastes spp.) off the U.S. West Coast is more accurate using SFA than analysing each species individually. Finally, we show that single-species models give a different picture of cross-correlations than joint estimation using SFA. Spatial factor analysis complements a growing list of tools for jointly modelling the distribution of multiple species and provides a parsimonious summary of cross-correlation without requiring explicit declaration of habitat variables. We conclude by proposing future research that would model species cross-correlations using dissimilarity of species’ traits, and the development of spatial dynamic factor analysis for a low-rank approximation to spatial time-series data.

Ecology | 2015

The importance of spatial models for estimating the strength of density dependence

James T. Thorson; Hans J. Skaug; Kasper Kristensen; Andrew O. Shelton; Eric J. Ward; John H. Harms; James A. Benante

Identifying the existence and magnitude of density dependence is one of the oldest concerns in ecology. Ecologists have aimed to estimate density dependence in population and community data by fitting a simple autoregressive (Gompertz) model for density dependence to time series of abundance for an entire population. However, it is increasingly recognized that spatial heterogeneity in population densities has implications for population and community dynamics. We therefore adapt the Gompertz model to approximate local densities over continuous space instead of population-wide abundance, and allow productivity to vary spatially using Gaussian random fields. We then show that the conventional (nonspatial) Gompertz model can result in biased estimates of density dependence (e.g., identifying oscillatory dynamics when not present) if densities vary spatially. By contrast, the spatial Gompertz model provides accurate and precise estimates of density dependence for a variety of simulation scenarios and data availabilities. These results are corroborated when comparing spatial and nonspatial models for data from 10 years and ~100 sampling stations for three long-lived rockfishes (Sebastes spp.) off the California, USA coast. In this case, the nonspatial model estimates implausible oscillatory dynamics on an annual time scale, while the spatial model estimates strong autocorrelation and is supported by model selection tools. We conclude by discussing the importance of improved data archiving techniques, so that spatial models can be used to reexamine classic questions regarding the existence and magnitude of density dependence in wild populations.
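To make the nonspatial Gompertz model and the phrase "oscillatory dynamics" concrete, here is a minimal sketch in Python. It assumes one common parameterization on the log scale, x[t+1] = alpha + rho * x[t] + noise, where x is log density; the names alpha, rho and sigma are illustrative and may not match the paper's notation, and the spatial extension with Gaussian random fields is not shown.

```python
import random


def simulate_gompertz(alpha, rho, x0, n_steps, sigma=0.0, seed=0):
    """Simulate log-density x_t under x[t+1] = alpha + rho * x[t] + noise.

    This is an assumed, textbook-style parameterization, not the paper's code.
    """
    rng = random.Random(seed)
    x = [x0]
    for _ in range(n_steps):
        # Process noise is skipped entirely when sigma is zero.
        noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
        x.append(alpha + rho * x[-1] + noise)
    return x


def equilibrium(alpha, rho):
    """Fixed point of the noise-free recursion (meaningful for |rho| < 1)."""
    return alpha / (1.0 - rho)


# Noise-free trajectories: positive rho approaches equilibrium from one side,
# negative rho overshoots it on every step (oscillatory dynamics).
smooth = simulate_gompertz(alpha=1.0, rho=0.5, x0=0.0, n_steps=10)
oscillating = simulate_gompertz(alpha=1.0, rho=-0.5, x0=0.0, n_steps=10)
```

In this parameterization a negative estimate of rho is what produces oscillation around the equilibrium, which is the kind of pattern the abstract describes the nonspatial fit as reporting.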

Journal of Computational and Graphical Statistics | 2002

Automatic Differentiation to Facilitate Maximum Likelihood Estimation in Nonlinear Random Effects Models

Hans J. Skaug

Maximum likelihood estimation in random effects models for non-Gaussian data is a computationally challenging task that currently receives much attention. This article shows that the estimation process can be facilitated by the use of automatic differentiation, which is a technique for exact numerical differentiation of functions represented as computer programs. Automatic differentiation is applied to an approximation of the likelihood function, obtained by using either Laplace's method of integration or importance sampling. The approach is applied to generalized linear mixed models. The computational speed is high compared to the Monte Carlo EM algorithm and the Monte Carlo Newton–Raphson method.
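The idea of differentiating a function "represented as a computer program" can be illustrated with a small forward-mode sketch based on dual numbers. This is only one flavour of automatic differentiation and is not the article's implementation; the class Dual, the helper dexp and the toy Poisson likelihood are all hypothetical names introduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value carried together with its derivative w.r.t. one input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __neg__(self):
        return Dual(-self.val, -self.der)

    def __sub__(self, other):
        return self + (-_lift(other))

    def __rsub__(self, other):
        return _lift(other) + (-self)

    def __mul__(self, other):
        other = _lift(other)
        # Product rule applied to the derivative part.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def dexp(x):
    """exp with the chain rule applied to the derivative part."""
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def poisson_nll(log_rate, counts):
    """Negative Poisson log-likelihood in log_rate, without the log(y!) terms."""
    total = Dual(0.0)
    for y in counts:
        total = total + dexp(log_rate) - y * log_rate
    return total


def value_and_derivative(f, x):
    """Evaluate f and df/dx at x in a single pass."""
    out = f(Dual(x, 1.0))
    return out.val, out.der


# The derivative vanishes at the maximum likelihood estimate log(mean(counts)).
nll, grad = value_and_derivative(lambda e: poisson_nll(e, [2, 3, 4]), math.log(3.0))
```

The derivative is exact up to floating-point rounding because each arithmetic step propagates its own derivative rule, with no finite-difference step size involved.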

PLOS Computational Biology | 2014

Determining Individual Variation in Growth and Its Implication for Life-History and Population Processes Using the Empirical Bayes Method

Simone Vincenzi; Marc Mangel; Alain J. Crivelli; Stephan B. Munch; Hans J. Skaug

The differences in demographic and life-history processes between organisms living in the same population have important consequences for ecological and evolutionary dynamics. Modern statistical and computational methods allow the investigation of individual and shared (among homogeneous groups) determinants of the observed variation in growth. We use an Empirical Bayes approach to estimate individual and shared variation in somatic growth using a von Bertalanffy growth model with random effects. To illustrate the power and generality of the method, we consider two populations of marble trout Salmo marmoratus living in Slovenian streams, where individually tagged fish have been sampled for more than 15 years. We use year-of-birth cohort, population density during the first year of life, and individual random effects as potential predictors of the von Bertalanffy growth function's parameters k (rate of growth) and L∞ (asymptotic size). Our results showed that size ranks were largely maintained throughout marble trout lifetime in both populations. According to the Akaike Information Criterion (AIC), the best models showed different growth patterns for year-of-birth cohorts as well as the existence of substantial individual variation in growth trajectories after accounting for the cohort effect. For both populations, models including density during the first year of life showed that growth tended to decrease with increasing population density early in life. Model validation showed that predictions of individual growth trajectories using the random-effects model were more accurate than predictions based on mean size-at-age of fish.
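As a small illustration of the growth curve underlying this analysis, the sketch below evaluates the standard von Bertalanffy growth function, length = l_inf * (1 - exp(-k * (age - t0))). The argument names l_inf (asymptotic size) and t0 (hypothetical age at zero length), and the two example fish, are assumptions made here; the random-effects and Empirical Bayes estimation from the paper is not reproduced.

```python
import math


def von_bertalanffy(age, l_inf, k, t0=0.0):
    """Expected length at a given age under the von Bertalanffy growth function."""
    return l_inf * (1.0 - math.exp(-k * (age - t0)))


# Two hypothetical individuals sharing k but differing in asymptotic size,
# mimicking individual variation in a growth parameter.
ages = [1, 2, 4, 8]
fish_a = [von_bertalanffy(a, l_inf=300.0, k=0.3) for a in ages]
fish_b = [von_bertalanffy(a, l_inf=350.0, k=0.3) for a in ages]
```

With a shared k, the fish with the larger asymptotic size is longer at every positive age, which is one simple way size ranks can be maintained over a lifetime.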

Molecular Ecology | 2014

Next‐generation sequencing for molecular ecology: a caveat regarding pooled samples

Eric C. Anderson; Hans J. Skaug; Daniel J. Barshis

We develop a model based on the Dirichlet‐compound multinomial distribution (CMD) and Ewens sampling formula to predict the fraction of SNP loci that will appear fixed for alternate alleles between two pooled samples drawn from the same underlying population. We apply this model to next‐generation sequencing (NGS) data from Baltic Sea herring recently published by Corander et al. (Molecular Ecology, 2931–2940), and show that there are many more fixed loci than expected in the absence of genetic structure. However, we show through coalescent simulations that the degree of population structure required to explain the fraction of alternatively fixed SNPs is extraordinarily high and that the surplus of fixed loci is more likely a consequence of limited representation of individual gene copies in the pooled samples, than it is of population structure. Our analysis signals that the use of NGS on pooled samples to identify divergent SNPs warrants caution. With pooled samples, it is hard to diagnose when an NGS experiment has gone awry; especially when NGS data on pooled samples are of low read depth with a limited number of individuals, it may be worthwhile to temper claims of unexpected population differentiation from pooled samples, pending verification with more reliable methods or stricter adherence to recommended sampling designs for pooled sequencing (e.g. Futschik & Schlötterer, Genetics, 186, 207; Gautier et al., Molecular Ecology, 3766–3779). Analysis of the data and diagnosis of problems is easier and more reliable (and can be less costly) with individually barcoded samples. Consequently, for some scenarios, individual barcoding may be preferable to pooling of samples.

PLOS ONE | 2010

Migration of Antarctic Minke Whales to the Arctic

Kevin A. Glover; Naohisa Kanda; Tore Haug; Luis A. Pastene; Nils Øien; Mutsuo Goto; Bjørghild Breistein Seliussen; Hans J. Skaug

The Antarctic minke whale (Balaenoptera bonaerensis), and the common minke whale found in the North Atlantic (Balaenoptera acutorostrata acutorostrata), undertake synchronized seasonal migrations to feeding areas at their respective poles during spring, and to the tropics in the autumn where they overwinter. Differences in the timing of seasons between hemispheres prevent these species from mixing. Here, based upon analysis of mitochondrial and microsatellite DNA profiles, we report the observation of a single B. bonaerensis in 1996, and a hybrid with maternal contribution from B. bonaerensis in 2007, in the Arctic Northeast Atlantic. Paternal contribution was not conclusively resolved. This is the first documentation of B. bonaerensis north of the tropics, and the first documentation of hybridization between minke whale species.

Environmental and Ecological Statistics | 2006

Markov Modulated Poisson Processes for Clustered Line Transect Data

Hans J. Skaug

We model the points of detection along the transect line by a Markov modulated Poisson process (MMPP). The MMPP can accommodate the spatial cluster structure typical of many line transect surveys. The basic idea is that animal density switches between a low and a high level according to a latent Markov process. The MMPP is attractive from a mathematical point of view, as it provides an explicit expression for the likelihood function and other important quantities. We focus on estimating the level of overdispersion in the number of detected animals, as this is important for quantifying the precision of the line transect estimator of animal abundance. The approach is illustrated using both simulated data and data from a minke whale sighting survey conducted in the North Atlantic.
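The switching mechanism described above can be sketched as a short simulation of a two-state MMPP along a transect. This is an illustrative sketch only, using competing exponential waiting times; the function name, the argument names and the choice to start in state 0 are assumptions made here, both switching rates are assumed positive, and the likelihood-based estimation from the article is not shown.

```python
import random


def simulate_mmpp(rates, switch_rates, length, seed=0):
    """Simulate detection positions on [0, length) from a two-state MMPP.

    rates[s] is the detection intensity while the hidden chain is in state s;
    switch_rates[s] is the rate of leaving state s (assumed positive).
    """
    rng = random.Random(seed)
    state, pos, points = 0, 0.0, []
    while True:
        # Competing exponentials: next detection versus next state switch.
        total = rates[state] + switch_rates[state]
        pos += rng.expovariate(total)
        if pos >= length:
            return points
        if rng.random() < rates[state] / total:
            points.append(pos)  # a detection occurs, state is unchanged
        else:
            state = 1 - state   # the hidden density level switches


# Low density in state 0, high density in state 1: detections arrive in clusters.
detections = simulate_mmpp(rates=(0.1, 5.0), switch_rates=(0.5, 0.5), length=100.0, seed=1)
```

Counting detections in equal-length segments of such a simulated transect and comparing the variance of the counts with their mean gives a rough sense of the overdispersion the abstract refers to.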
