Cajo J. F. ter Braak
Wageningen University and Research Centre
Publications
Featured research published by Cajo J. F. ter Braak.
Plant Ecology | 1987
Cajo J. F. ter Braak
Canonical correspondence analysis (CCA) is introduced as a multivariate extension of weighted averaging ordination, which is a simple method for arranging species along environmental variables. CCA constructs those linear combinations of environmental variables, along which the distributions of the species are maximally separated. The eigenvalues produced by CCA measure this separation.
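The weighted averaging step that CCA extends can be sketched as follows. This is a minimal illustration only, not the CCA algorithm itself: a species score is taken as the abundance-weighted mean of one environmental variable over sites, and a site score as the abundance-weighted mean of the species scores. The function names and the toy data are hypothetical.

```python
def species_scores(abundance, env):
    """abundance[i][k] is the abundance of species k at site i; env[i] is the
    value of one environmental variable at site i."""
    n_species = len(abundance[0])
    scores = []
    for k in range(n_species):
        total = sum(row[k] for row in abundance)
        weighted = sum(row[k] * x for row, x in zip(abundance, env))
        scores.append(weighted / total)
    return scores


def site_scores(abundance, sp_scores):
    """Score each site as the abundance-weighted mean of the species scores."""
    return [sum(a * u for a, u in zip(row, sp_scores)) / sum(row)
            for row in abundance]


# Toy data (hypothetical): three sites along a moisture gradient, two species.
abundance = [[4, 0], [2, 2], [0, 4]]
moisture = [1.0, 2.0, 3.0]
u = species_scores(abundance, moisture)
x = site_scores(abundance, u)
```

In this toy example the two species scores are separated along the gradient, which is the kind of separation that, per the abstract, CCA maximizes over linear combinations of several environmental variables.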
Journal of Statistical Computation and Simulation | 2003
Marti J. Anderson; Cajo J. F. ter Braak
Several permutation strategies are often possible for tests of individual terms in analysis-of-variance (ANOVA) designs. These include restricted permutations, permutation of whole groups of units, permutation of some form of residuals or some combination of these. It is unclear, especially for complex designs involving random factors, mixed models or nested hierarchies, just which permutation strategy should be used for any particular test. The purpose of this paper is two-fold: (i) we provide a guideline for constructing an exact permutation strategy, where possible, for any individual term in any ANOVA design; and (ii) we provide results of Monte Carlo simulations to compare the level accuracy and power of different permutation strategies in two-way ANOVA, including random and mixed models, nested hierarchies and tests of interaction terms. Simulation results showed that permutation of residuals under a reduced model generally had greater power than the exact test or alternative approximate permutation methods (such as permutation of raw data). In several cases, restricted permutations, in particular, suffered more than other procedures, in terms of loss of power, challenging the conventional wisdom of using this approach. Our simulations also demonstrated that the choice of correct exchangeable units under the null hypothesis, in accordance with the guideline we provide, is essential for any permutation test, whether it be an exact test or an approximate test. For reference, we also provide appropriate permutation strategies for individual terms in any two-way or three-way ANOVA for the exact test (where possible) and for the approximate test using permutation of residuals.
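A minimal sketch of permutation of residuals under a reduced model is given below, under simplifying assumptions that are not taken from the paper: a balanced two-way layout with fixed factors A and B, no interaction term, and a test of factor A. The reduced model contains only B; its residuals are shuffled, added back to the fitted values, and the F statistic for A is recomputed. All function names and the simulated data are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def f_stat_a(a, b, y):
    """F statistic for factor A in a balanced additive two-way layout (A + B)."""
    grand = mean(y)

    def ss_factor(labels):
        ss = 0.0
        for level in set(labels):
            grp = [v for lab, v in zip(labels, y) if lab == level]
            ss += len(grp) * (mean(grp) - grand) ** 2
        return ss

    ss_a, ss_b = ss_factor(a), ss_factor(b)
    ss_total = sum((v - grand) ** 2 for v in y)
    df_a = len(set(a)) - 1
    df_res = len(y) - 1 - df_a - (len(set(b)) - 1)
    # In a balanced design the sums of squares are orthogonal, so the
    # residual sum of squares is what remains after removing A and B.
    ss_res = ss_total - ss_a - ss_b
    return (ss_a / df_a) / (ss_res / df_res)


def reduced_model_perm_test(a, b, y, n_perm=199, seed=0):
    rng = random.Random(seed)
    # Reduced model: B only (the term under test, A, is left out).
    b_means = {lvl: mean([v for lab, v in zip(b, y) if lab == lvl])
               for lvl in set(b)}
    fitted = [b_means[lab] for lab in b]
    resid = [v - f for v, f in zip(y, fitted)]
    f_obs = f_stat_a(a, b, y)
    hits = 1  # the observed statistic counts as one of the permutations
    for _ in range(n_perm):
        shuffled = resid[:]
        rng.shuffle(shuffled)
        y_star = [f + r for f, r in zip(fitted, shuffled)]
        if f_stat_a(a, b, y_star) >= f_obs:
            hits += 1
    return f_obs, hits / (n_perm + 1)


# Simulated data (hypothetical): 2 x 3 layout, 4 replicates per cell,
# with a strong effect of A.
data_rng = random.Random(1)
a = [i for i in (0, 1) for _ in range(12)]
b = [j for _ in (0, 1) for j in (0, 1, 2) for _ in range(4)]
y = [5.0 * ai + bi + data_rng.gauss(0, 1) for ai, bi in zip(a, b)]
f_obs, p_value = reduced_model_perm_test(a, b, y)
```

For designs with random factors or nesting, the exchangeable units differ and this unrestricted shuffle would not be appropriate; the sketch covers only the simplest fixed-effects case.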
Water Resources Research | 2008
Jasper A. Vrugt; Cajo J. F. ter Braak; Martyn P. Clark; James M. Hyman; Bruce A. Robinson
There is increasing consensus in the hydrologic literature that an appropriate framework for streamflow forecasting and simulation should include explicit recognition of forcing, parameter, and model structural error. This paper presents a novel Markov chain Monte Carlo (MCMC) sampler, entitled differential evolution adaptive Metropolis (DREAM), that is especially designed to efficiently estimate the posterior probability density function of hydrologic model parameters in complex, high-dimensional sampling problems. This MCMC scheme adaptively updates the scale and orientation of the proposal distribution during sampling and maintains detailed balance and ergodicity. It is then demonstrated how DREAM can be used to analyze forcing data error during watershed model calibration using a five-parameter rainfall-runoff model with streamflow data from two different catchments. Explicit treatment of precipitation error during hydrologic model calibration not only results in prediction uncertainty bounds that are more appropriate but also significantly alters the posterior distribution of the watershed model parameters. This has significant implications for regionalization studies. The approach also provides important new ways to estimate areal average watershed precipitation, information that is of utmost importance for testing hydrologic theory, diagnosing structural errors in models, and appropriately benchmarking rainfall measurement devices.
Statistics and Computing | 2006
Cajo J. F. ter Braak
Differential Evolution (DE) is a simple genetic algorithm for numerical optimization in real parameter spaces. In a statistical context one would not just want the optimum but also its uncertainty. The uncertainty distribution can be obtained by a Bayesian analysis (after specifying prior and likelihood) using Markov Chain Monte Carlo (MCMC) simulation. This paper integrates the essential ideas of DE and MCMC, resulting in Differential Evolution Markov Chain (DE-MC). DE-MC is a population MCMC algorithm, in which multiple chains are run in parallel. DE-MC solves an important problem in MCMC, namely that of choosing an appropriate scale and orientation for the jumping distribution. In DE-MC the jumps are simply a fixed multiple of the differences of two random parameter vectors that are currently in the population. The selection process of DE-MC works via the usual Metropolis ratio which defines the probability with which a proposal is accepted. In tests with known uncertainty distributions, the efficiency of DE-MC with respect to random walk Metropolis with optimal multivariate Normal jumps ranged from 68% for small population sizes to 100% for large population sizes and even to 500% for the 97.5% point of a variable from a 50-dimensional Student distribution. Two Bayesian examples illustrate the potential of DE-MC in practice. DE-MC is shown to facilitate multidimensional updates in a multi-chain “Metropolis-within-Gibbs” sampling approach. The advantages of DE-MC over conventional MCMC are simplicity, speed of calculation and convergence, even for nearly collinear parameters and multimodal densities.
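The update described in this abstract can be sketched as follows. This is an illustrative reading, not the paper's reference implementation: each chain proposes a jump equal to a fixed multiple of the difference of two other randomly chosen chains, and the proposal is accepted with the usual Metropolis ratio. The scale 2.38/sqrt(2d), the small Gaussian jitter `eps_sd`, and all function names are assumptions made for this sketch.

```python
import math
import random


def de_mc(log_post, init_pop, n_gen, seed=0, eps_sd=1e-4):
    """Run a population of chains; returns all visited states, generation by generation."""
    rng = random.Random(seed)
    pop = [list(x) for x in init_pop]
    n, d = len(pop), len(pop[0])
    gamma = 2.38 / math.sqrt(2 * d)  # assumed jump scale
    logp = [log_post(x) for x in pop]
    samples = []
    for _ in range(n_gen):
        for i in range(n):
            # Pick two distinct other chains from the current population.
            r1, r2 = rng.sample([j for j in range(n) if j != i], 2)
            prop = [pop[i][k] + gamma * (pop[r1][k] - pop[r2][k])
                    + rng.gauss(0, eps_sd) for k in range(d)]
            lp = log_post(prop)
            # Metropolis acceptance for a symmetric proposal.
            if rng.random() < math.exp(min(0.0, lp - logp[i])):
                pop[i], logp[i] = prop, lp
        samples.extend(list(x) for x in pop)
    return samples


def log_std_normal(x):
    """Toy target (hypothetical): unnormalized log density of a standard Normal."""
    return -0.5 * sum(v * v for v in x)


init_rng = random.Random(42)
init = [[init_rng.gauss(0, 3) for _ in range(2)] for _ in range(8)]
samples = de_mc(log_std_normal, init, n_gen=2000)
kept = samples[len(samples) // 2:]  # discard the first half as burn-in
mean0 = sum(s[0] for s in kept) / len(kept)
var0 = sum((s[0] - mean0) ** 2 for s in kept) / len(kept)
```

Because the jump is built from differences within the population, its scale and orientation follow the spread of the chains themselves, so no proposal covariance has to be tuned by hand.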
Ecology | 2008
André P. Schaffers; Ivo Raemakers; Karlè V. Sýkora; Cajo J. F. ter Braak
Insects and spiders comprise more than two-thirds of the Earth's total species diversity. There is wide concern, however, that the global diversity of arthropods may be declining even more rapidly than the diversity of vertebrates and plants. For adequate conservation planning, ecologists need to understand the driving factors for arthropod communities and devise methods that provide reliable predictions when resources do not permit exhaustive ground surveys. Which factor most successfully predicts arthropod community structure is still a matter of debate, however. The purpose of this study was to identify the factor best predicting arthropod assemblage composition. We investigated the species composition of seven functionally different arthropod groups (epigeic spiders, grasshoppers, ground beetles, weevils, hoppers, hoverflies, and bees) at 47 sites in The Netherlands comprising a range of seminatural grassland types and one heathland type. We then compared the actual arthropod composition with predictions based on plant species composition, vegetation structure, environmental data, flower richness, and landscape composition. For this we used the recently published method of predictive co-correspondence analysis, and a predictive variant of canonical correspondence analysis, depending on the type of predictor data. Our results demonstrate that local plant species composition is the most effective predictor of arthropod assemblage composition, for all investigated groups. In predicting arthropod assemblages, plant community composition consistently outperforms both vegetation structure and environmental conditions (even when the two are combined), and also performs better than the surrounding landscape. These results run against a common expectation of vegetation structure as the decisive factor.
Such expectations, however, have always been biased by the fact that until recently no methods existed that could use an entire (plant) species composition in the explanatory role. Although more recent experimental diversity work has reawakened interest in the role of plant species, these studies still have not used (or have not been able to use) entire species compositions. They only consider diversity measures, both for plant and insect assemblages, which may obscure relationships. The present study demonstrates that the species compositions of insect and plant communities are clearly linked.
Statistics and Computing | 2008
Cajo J. F. ter Braak; Jasper A. Vrugt
Differential Evolution Markov Chain (DE-MC) is an adaptive MCMC algorithm, in which multiple chains are run in parallel. Standard DE-MC requires at least N=2d chains to be run in parallel, where d is the dimensionality of the posterior. This paper extends DE-MC with a snooker updater and shows by simulation and real examples that DE-MC can work for d up to 50–100 with fewer parallel chains (e.g. N=3) by exploiting information from their past, generating jumps from differences of pairs of past states. This approach extends the practical applicability of DE-MC and is shown to be about 5–26 times more efficient than the optimal Normal random walk Metropolis sampler for the 97.5% point of a variable from a 25–50 dimensional Student t3 distribution. In a nonlinear mixed effects model example the approach outperformed a block-updater geared to the specific features of the model.
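The idea of generating jumps from past states can be sketched as below. This is a simplified illustration under stated assumptions: only the difference-based update is shown (the snooker updater is omitted), the archive of past states is extended every `thin` generations, and the jump scale 2.38/sqrt(2d), the jitter `eps_sd`, and all names are hypothetical choices for this sketch.

```python
import math
import random


def de_mc_past(log_post, init_archive, n_chains=3, n_gen=5000,
               thin=10, seed=0, eps_sd=1e-4):
    """Few chains whose jumps are differences of pairs of archived past states."""
    rng = random.Random(seed)
    archive = [list(z) for z in init_archive]
    d = len(archive[0])
    gamma = 2.38 / math.sqrt(2 * d)  # assumed jump scale
    chains = [list(z) for z in archive[-n_chains:]]
    logp = [log_post(x) for x in chains]
    samples = []
    for gen in range(1, n_gen + 1):
        m = len(archive)
        for i in range(n_chains):
            # Two distinct past states drawn from the archive.
            r1, r2 = rng.sample(range(m), 2)
            prop = [chains[i][k] + gamma * (archive[r1][k] - archive[r2][k])
                    + rng.gauss(0, eps_sd) for k in range(d)]
            lp = log_post(prop)
            if rng.random() < math.exp(min(0.0, lp - logp[i])):
                chains[i], logp[i] = prop, lp
        samples.extend(list(x) for x in chains)
        if gen % thin == 0:
            # Store the current states so later jumps can reuse them.
            archive.extend(list(x) for x in chains)
    return samples


def log_std_normal(x):
    """Toy target (hypothetical): unnormalized log density of a standard Normal."""
    return -0.5 * sum(v * v for v in x)


d = 3
init_rng = random.Random(7)
init_archive = [[init_rng.gauss(0, 3) for _ in range(d)] for _ in range(10 * d)]
draws = de_mc_past(log_std_normal, init_archive)
kept = draws[len(draws) // 2:]  # discard the first half as burn-in
mean0 = sum(s[0] for s in kept) / len(kept)
var0 = sum((s[0] - mean0) ** 2 for s in kept) / len(kept)
```

Because the differences come from a growing archive rather than from the current population, the number of parallel chains no longer has to scale with the dimension d.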
Aquatic Ecology | 1998
Paul J. Van den Brink; Cajo J. F. ter Braak
Experiments in microcosms and mesocosms, which can be carried out in an advanced tier of risk assessment, usually result in large data sets on the dynamics of biological communities of treated and control cosms. Multivariate techniques are an accepted tool to evaluate the community treatment effects resulting from these complex experiments. In this paper two methods of multivariate analysis are discussed on their merits: 1) the canonical ordination technique Principal Response Curves (PRC) and 2) the similarity indices of Bray-Curtis and Stander. For this, the data sets of a microcosm experiment were used to simultaneously study the impact of nutrient loading and insecticide application. Both similarity indices display, in a single graph, the total effect size against time and do not allow a direct interpretation down to the taxon level. In the PRC method, the principal components of the treatment effects are plotted against time. Since the species of the example data sets react in qualitatively different ways to the treatments, more than one PRC is needed for a proper description of the treatment effects. The first PRC of one of the data sets describes the effects due to the chlorpyrifos addition, the second one the effects as a result of the nutrient loading. The resulting principal response curves jointly summarize the essential features of the response curves of the individual taxa. This paper goes beyond the first PRC to visualize the effects of chemicals at the community level. In both multivariate analysis methods the statistical significance of the effects can be assessed by Monte Carlo permutation testing.
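As a small illustration of the similarity-index approach, the Bray-Curtis similarity between a treated and a control community can be computed per sampling date and plotted against time. The sketch below uses the common definition (one minus the sum of absolute abundance differences divided by the sum of all abundances); whether the paper applies any data transformation first is not stated here, and the abundance values are hypothetical.

```python
def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity between two abundance vectors of equal length."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return 1.0 - num / den


# Hypothetical abundances of three taxa at two sampling dates.
control = [[10, 5, 0], [10, 5, 1]]
treated = [[10, 5, 0], [4, 5, 2]]

# One similarity value per date: a single curve of total effect size over time.
similarity_over_time = [bray_curtis_similarity(c, t)
                        for c, t in zip(control, treated)]
```

The result is one number per date, which is why, as the abstract notes, such indices show total effect size but cannot be read down to the level of individual taxa.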
Plant Ecology | 1987
Cajo J. F. ter Braak; Niek J. M. Gremmen
Two methods for estimating ecological amplitudes of species with respect to Ellenberg’s moisture scale are discussed, one based on weighted averaging and the other on maximum likelihood. Both methods are applied to phytosociological data from the province of Noord-Brabant (The Netherlands), and estimate the range of occurrence of species to be about 4–6 units on the moisture scale. Due to the implicit nature of Ellenberg’s definition of moisture, it is impossible to improve the indicator values in a statistically sound way on the basis of floristic data only. The internal consistency of the Ellenberg indicator values is checked by using Gaussian logit regression. For 45 out of the 240 species studied the indicator value is inconsistent with those of the other species. The same method is used to estimate the optima and amplitudes of species considered moisture-indifferent and of some species not mentioned by Ellenberg. Some of these ‘indifferent’ species show a remarkably narrow amplitude.
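For the Gaussian logit approach, a species optimum and amplitude (tolerance) can be read off the coefficients of a quadratic logistic regression. The sketch below assumes the standard parameterization logit(p) = b0 + b1*x + b2*x^2 with b2 < 0, which is equivalent to logit(p) = a - (x - u)^2 / (2 t^2); the function name and the example coefficients are hypothetical, and fitting the regression itself is not shown.

```python
import math


def gaussian_logit_summary(b0, b1, b2):
    """Convert quadratic logit coefficients to (optimum, tolerance, max probability)."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal response curve")
    optimum = -b1 / (2.0 * b2)            # u: position of the peak
    tolerance = 1.0 / math.sqrt(-2.0 * b2)  # t: width of the curve
    logit_max = b0 + b1 * optimum + b2 * optimum ** 2
    p_max = 1.0 / (1.0 + math.exp(-logit_max))
    return optimum, tolerance, p_max


# Hypothetical coefficients built from u = 6, t = 1.5 and a peak probability of 0.5.
u, t, p_max = gaussian_logit_summary(-8.0, 6.0 / 2.25, -1.0 / 4.5)
```

A species whose fitted optimum lies far from its tabulated indicator value would be a candidate for the kind of inconsistency the abstract reports.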
Journal of Vegetation Science | 1994
Cajo J. F. ter Braak; Jaap Wiertz
A case study is presented on the statistical analysis and interpretation of vegetation change in a wetland subjected to water extraction and acidification, without precise information on the environmental changes. The vegetation is a Junco-Molinion grassland and the changes in vegetation are evaluated on the basis of relevés in 1977 and 1988 of 20 plots in a small nature reserve on moist oligotrophic, Pleistocene sands in the Netherlands. The changes are attributed to water extraction (since 1972) and soil acidification and the effect of the environmental changes on the vegetation is inferred from data on water depth and acidity collected in 1988. Many species typical of wetlands decreased in abundance, including rare species such as Parnassia palustris, Selinum carvifolia and Ophioglossum vulgatum. Some species increased, notably Anthoxanthum odoratum, Holcus lanatus and Plantago lanceolata. A significant decrease was found in the mean Ellenberg indicator values for moisture and acidity. The mean indicator value for nutrients did not change significantly. Multivariate analysis of the species data by Redundancy Analysis demonstrated the overall significance of the change in species composition between 1977 and 1988 (P < 0.01, Monte Carlo permutation). The spatial and temporal variation in the species data was displayed in ordination diagrams and interpreted in terms of water depth and pH. A simple model is developed to infer the change in water depth and pH from the relevé data and recent data on water depth and pH. Because the correlation between water depth and pH made a joint estimation of the changes useless, the change in pH was estimated for a series of likely changes in water depth. For the most likely change in water depth, significant acidification was inferred from the change in vegetation. The model is more generally applicable as a constrained calibration method.
Ecology | 2004
Cajo J. F. ter Braak; André P. Schaffers
A new ordination method, called co-correspondence analysis, is developed to relate two types of communities (e.g., a plant community and an animal community) sampled at a common set of sites in a direct way. The method improves the simple, indirect approach of applying correspondence analysis (reciprocal averaging) to the separate species data sets and correlating the resulting ordination axes. Co-correspondence analysis maximizes the weighted covariance between weighted averaged species scores of one community and weighted averaged species scores of the other community. It thus attempts to identify the patterns that are common to both communities. Both a symmetric descriptive and an asymmetric predictive form are developed. The symmetric form relates to co-inertia analysis and the asymmetric, predictive form to partial least-squares regression. In two examples the predictive power of co-correspondence analysis is compared with that of canonical correspondence analyses on syntaxonomic and environmental data. In the first example, carabid beetles in roadside verges are shown to be more closely related to plant species composition than to vegetation structure (biomass, height, roughness, among others), and, in the second example, bryophytes in spring meadows are shown to be more closely related to the species composition of the vascular plants than to the measured water chemistry.