Publications


Featured research published by Ruud Wetzels.


Journal of Personality and Social Psychology | 2011

Why psychologists must change the way they analyze their data: The case of psi: Comment on Bem (2011).

Eric-Jan Wagenmakers; Ruud Wetzels; Denny Borsboom; Han L. J. van der Maas

Does psi exist? D. J. Bem (2011) conducted 9 studies with over 1,000 participants in an attempt to demonstrate that future events retroactively affect people's responses. Here we discuss several limitations of Bem's experiments on psi; in particular, we show that the data analysis was partly exploratory and that one-sided p values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem's data with a default Bayesian t test and show that the evidence for psi is weak to nonexistent. We argue that in order to convince a skeptical audience of a controversial claim, one needs to conduct strictly confirmatory studies and analyze the results with statistical tests that are conservative rather than liberal. We conclude that Bem's p values do not indicate evidence in favor of precognition; instead, they indicate that experimental psychologists need to change the way they conduct their experiments and analyze their data.


Perspectives on Psychological Science | 2011

Statistical Evidence in Experimental Psychology: An Empirical Comparison Using 855 t Tests

Ruud Wetzels; Dora Matzke; Michael D. Lee; Jeffrey N. Rouder; Geoffrey J. Iverson; Eric-Jan Wagenmakers

Statistical inference in psychology has traditionally relied heavily on p-value significance testing. This approach to drawing conclusions from data, however, has been widely criticized, and two types of remedies have been advocated. The first proposal is to supplement p values with complementary measures of evidence, such as effect sizes. The second is to replace inference with Bayesian measures of evidence, such as the Bayes factor. The authors provide a practical comparison of p values, effect sizes, and default Bayes factors as measures of statistical evidence, using 855 recently published t tests in psychology. The comparison yields two main results. First, although p values and default Bayes factors almost always agree about what hypothesis is better supported by the data, the measures often disagree about the strength of this support; for 70% of the data sets for which the p value falls between .01 and .05, the default Bayes factor indicates that the evidence is only anecdotal. Second, effect sizes can provide additional evidence to p values and default Bayes factors. The authors conclude that the Bayesian approach is comparatively prudent, preventing researchers from overestimating the evidence in favor of an effect.
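The default Bayes factor compared in this paper is the JZS Bayes factor of Rouder, Speckman, Sun, Morey, and Iverson (2009), which can be computed directly from a t value and the sample size. Below is a minimal numerical sketch for the one-sample case; the function name and the example values are illustrative and do not come from the paper.

```python
# A minimal sketch of the default (JZS) Bayes factor for a one-sample t test,
# following the integral expression in Rouder, Speckman, Sun, Morey, and
# Iverson (2009). The interface and example values are illustrative only.

import numpy as np
from scipy import integrate


def jzs_bf01(t, n):
    """Approximate BF01 (evidence for H0 over H1) from a one-sample t value.

    t : observed t statistic
    n : sample size (degrees of freedom are n - 1)
    """
    nu = n - 1

    # Marginal likelihood under H0 (effect size delta = 0), up to a constant.
    null_like = (1 + t**2 / nu) ** (-(nu + 1) / 2)

    # Marginal likelihood under H1: integrate over the Cauchy prior on delta,
    # written as a scale mixture of normals with mixing density on g.
    def integrand(g):
        return ((1 + n * g) ** (-0.5)
                * (1 + t**2 / ((1 + n * g) * nu)) ** (-(nu + 1) / 2)
                * (2 * np.pi) ** (-0.5) * g ** (-1.5) * np.exp(-1 / (2 * g)))

    alt_like, _ = integrate.quad(integrand, 0, np.inf)
    return null_like / alt_like


# Example: a marginally significant result (t = 2.1, n = 40); the printed
# value says how much the data favor H0 over H1.
print(jzs_bf01(t=2.1, n=40))
```

For a two-sample comparison, the same integral applies with the effective sample size n1*n2/(n1+n2) and degrees of freedom n1 + n2 - 2, as in Rouder et al. (2009).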


Perspectives on Psychological Science | 2012

An Agenda for Purely Confirmatory Research

Eric-Jan Wagenmakers; Ruud Wetzels; Denny Borsboom; Han L. J. van der Maas; Rogier A. Kievit

The veracity of substantive research claims hinges on the way experimental data are collected and analyzed. In this article, we discuss an uncomfortable fact that threatens the core of psychology’s academic enterprise: almost without exception, psychologists do not commit themselves to a method of data analysis before they see the actual data. It then becomes tempting to fine tune the analysis to the data in order to obtain a desired result—a procedure that invalidates the interpretation of the common statistical tests. The extent of the fine tuning varies widely across experiments and experimenters but is almost impossible for reviewers and readers to gauge. To remedy the situation, we propose that researchers preregister their studies and indicate in advance the analyses they intend to conduct. Only these analyses deserve the label “confirmatory,” and only for these analyses are the common statistical tests valid. Other analyses can be carried out but these should be labeled “exploratory.” We illustrate our proposal with a confirmatory replication attempt of a study on extrasensory perception.


Psychonomic Bulletin & Review | 2012

A default Bayesian hypothesis test for correlations and partial correlations

Ruud Wetzels; Eric-Jan Wagenmakers

We propose a default Bayesian hypothesis test for the presence of a correlation or a partial correlation. The test is a direct application of Bayesian techniques for variable selection in regression models. The test is easy to apply and yields practical advantages that the standard frequentist tests lack; in particular, the Bayesian test can quantify evidence in favor of the null hypothesis and allows researchers to monitor the test results as the data come in. We illustrate the use of the Bayesian correlation test with three examples from the psychological literature. Computer code and example data are provided in the journal archives.
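The abstract notes that the test is built from Bayesian variable selection in regression. One concrete way to realize that construction, sketched below, treats the correlation as a one-predictor regression (so R-squared equals r-squared) and integrates the Liang et al. (2008) g-prior Bayes factor over a Zellner-Siow hyperprior on g. Whether this matches the paper's default prior in every detail is not asserted here; the function name and example values are illustrative.

```python
# A hedged sketch of a default Bayes factor for a correlation, built from the
# regression variable-selection machinery the abstract refers to: with one
# predictor, R^2 = r^2, and a Zellner-Siow prior on g gives the integral
# below (cf. Liang et al., 2008). Names and example values are illustrative.

import numpy as np
from scipy import integrate
from scipy.special import gamma as gamma_fn


def correlation_bf10(r, n):
    """Approximate BF10 for H1: rho != 0 versus H0: rho = 0."""
    def integrand(g):
        # Zellner-Siow hyperprior on g: inverse-gamma(1/2, n/2).
        zs_prior = (n / 2) ** 0.5 / gamma_fn(0.5) * g ** (-1.5) * np.exp(-n / (2 * g))
        return ((1 + g) ** ((n - 2) / 2)
                * (1 + (1 - r**2) * g) ** (-(n - 1) / 2)
                * zs_prior)

    bf10, _ = integrate.quad(integrand, 0, np.inf)
    return bf10


# Example: a small observed correlation in a modest sample.
print(correlation_bf10(r=0.25, n=50))
```

The ANOVA test listed further down is described by its abstract as an application of the same variable-selection machinery.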


Psychonomic Bulletin & Review | 2009

How to quantify support for and against the null hypothesis: a flexible WinBUGS implementation of a default Bayesian t test.

Ruud Wetzels; Jeroen G. W. Raaijmakers; Emöke Jakab; Eric-Jan Wagenmakers

We propose a sampling-based Bayesian t test that allows researchers to quantify the statistical evidence in favor of the null hypothesis. This Savage-Dickey (SD) t test is inspired by the Jeffreys-Zellner-Siow (JZS) t test recently proposed by Rouder, Speckman, Sun, Morey, and Iverson (2009). The SD test retains the key concepts of the JZS test but is applicable to a wider range of statistical problems. The SD test allows researchers to test order restrictions and applies to two-sample situations in which the different groups do not share the same variance.
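The Savage-Dickey device behind this test reduces the Bayes factor for a nested comparison to a ratio of two densities: the posterior and the prior of the effect size delta, both evaluated at delta = 0. The sketch below illustrates the ratio with placeholder posterior samples; in practice the samples would come from WinBUGS, JAGS, or a similar MCMC sampler.

```python
# A minimal sketch of the Savage-Dickey density ratio: for a nested model
# comparison, BF01 equals the posterior density of the effect size delta at 0
# divided by its prior density at 0. The posterior samples below are a
# placeholder standing in for real MCMC output.

import numpy as np
from scipy.stats import cauchy, gaussian_kde

# Placeholder: posterior samples of delta, e.g. drawn by an MCMC sampler for
# the JZS-style t-test model. Replace with output from your own sampler.
rng = np.random.default_rng(1)
posterior_delta = rng.normal(loc=0.3, scale=0.15, size=20_000)

# Prior on delta under H1: a standard Cauchy, as in the JZS setup.
prior_density_at_zero = cauchy.pdf(0, loc=0, scale=1)

# Estimate the posterior density at delta = 0 from the samples
# (a simple kernel density estimate; a logspline fit would also work).
posterior_density_at_zero = gaussian_kde(posterior_delta)(0.0)[0]

bf01 = posterior_density_at_zero / prior_density_at_zero
print(bf01)  # > 1 favors the null, < 1 favors the alternative
```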


The American Statistician | 2012

A Default Bayesian Hypothesis Test for ANOVA Designs

Ruud Wetzels; Raoul P. P. P. Grasman; Eric-Jan Wagenmakers

This article presents a Bayesian hypothesis test for analysis of variance (ANOVA) designs. The test is an application of standard Bayesian methods for variable selection in regression models. We illustrate the effect of various g-priors on the ANOVA hypothesis test. The Bayesian test for ANOVA designs is useful for empirical researchers and for students; both groups will get a more acute appreciation of Bayesian inference when they can apply it to practical statistical problems such as ANOVA. We illustrate the use of the test with two examples, and we provide R code that makes the test easy to use.


Behavior Research Methods | 2015

A default Bayesian hypothesis test for mediation

Michèle B. Nuijten; Ruud Wetzels; Dora Matzke; Conor V. Dolan; Eric-Jan Wagenmakers

In order to quantify the relationship between multiple variables, researchers often carry out a mediation analysis. In such an analysis, a mediator (e.g., knowledge of a healthy diet) transmits the effect from an independent variable (e.g., classroom instruction on a healthy diet) to a dependent variable (e.g., consumption of fruits and vegetables). Almost all mediation analyses in psychology use frequentist estimation and hypothesis-testing techniques. A recent exception is Yuan and MacKinnon (Psychological Methods, 14, 301–322, 2009), who outlined a Bayesian parameter estimation procedure for mediation analysis. Here we complete the Bayesian alternative to frequentist mediation analysis by specifying a default Bayesian hypothesis test based on the Jeffreys–Zellner–Siow approach. We further extend this default Bayesian test by allowing a comparison to directional or one-sided alternatives, using Markov chain Monte Carlo techniques implemented in JAGS. All Bayesian tests are implemented in the R package BayesMed (Nuijten, Wetzels, Matzke, Dolan, & Wagenmakers, 2014).


The Journal of Problem Solving | 2013

A comparison of reinforcement learning models for the Iowa Gambling Task using parameter space partitioning

Helen Steingroever; Ruud Wetzels; Eric-Jan Wagenmakers

The Iowa gambling task (IGT) is one of the most popular tasks used to study decision-making deficits in clinical populations. In order to decompose performance on the IGT into its constituent psychological processes, several cognitive models have been proposed (e.g., the Expectancy Valence (EV) and Prospect Valence Learning (PVL) models). Here we present a comparison of three models—the EV and PVL models, and a combination of these models (EV-PU)—based on the method of parameter space partitioning. This method allows us to assess the choice patterns predicted by the models across their entire parameter space. Our results show that the EV model is unable to account for a frequency-of-losses effect, whereas the PVL and EV-PU models are unable to account for a pronounced preference for the bad decks with many switches. All three models underrepresent pronounced choice patterns that are frequently seen in experiments. Overall, our results suggest that the search for an appropriate IGT model has not yet come to an end.


Frontiers in Psychology | 2013

Validating the PVL-Delta model for the Iowa gambling task

Helen Steingroever; Ruud Wetzels; Eric-Jan Wagenmakers

Decision-making deficits in clinical populations are often assessed with the Iowa gambling task (IGT). Performance on this task is driven by latent psychological processes, the assessment of which requires an analysis using cognitive models. Two popular examples of such models are the Expectancy Valence (EV) and Prospect Valence Learning (PVL) models. These models have recently been subjected to sophisticated procedures of model checking, spawning a hybrid version of the EV and PVL models: the PVL-Delta model. In order to test the validity of the PVL-Delta model, we present a parameter space partitioning (PSP) study and a test of selective influence. The PSP study allows one to assess the choice patterns that the PVL-Delta model generates across its entire parameter space. The PSP study revealed that the model accounts for empirical choice patterns featuring a preference for the good decks or the decks with infrequent losses; however, the model fails to account for empirical choice patterns featuring a preference for the bad decks. The test of selective influence investigates the effectiveness of experimental manipulations designed to target only a single model parameter. This test showed that the manipulations were successful for all but one parameter. To conclude, despite a few shortcomings, the PVL-Delta model seems to be a better IGT model than the popular EV and PVL models.
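For readers unfamiliar with the models compared in this and the preceding paper, the sketch below simulates a PVL-Delta agent (prospect-utility valuation, delta-rule learning, softmax choice with sensitivity 3^c - 1) on a simplified deck scheme, and then runs a crude Monte Carlo sweep over the parameter space in the spirit of parameter space partitioning. The payoff values, the classification rule, and the uniform parameter ranges are simplifying assumptions, and the sweep is not the MCMC-based PSP algorithm used in these papers.

```python
# A hedged sketch of the PVL-Delta model (prospect utility, delta-rule
# learning, softmax choice) on a simplified Iowa gambling task, followed by a
# crude Monte Carlo version of the parameter-space-partitioning idea: sample
# parameter vectors, simulate choices, and classify the resulting pattern.

import numpy as np

rng = np.random.default_rng(0)

# Simplified IGT payoffs: decks A, B are bad, C, D are good;
# A, C have frequent losses, B, D have infrequent losses.
GAINS = np.array([100.0, 100.0, 50.0, 50.0])
LOSSES = np.array([250.0, 1250.0, 50.0, 250.0])
LOSS_PROB = np.array([0.5, 0.1, 0.5, 0.1])


def simulate_pvl_delta(A, w, a, c, n_trials=100):
    """Return deck-choice counts for one synthetic participant."""
    theta = 3.0**c - 1.0          # choice sensitivity
    ev = np.zeros(4)              # expectancies per deck
    counts = np.zeros(4, dtype=int)
    for _ in range(n_trials):
        p = np.exp(theta * ev - np.max(theta * ev))
        p /= p.sum()
        deck = rng.choice(4, p=p)
        counts[deck] += 1
        net = GAINS[deck] - LOSSES[deck] * (rng.random() < LOSS_PROB[deck])
        utility = net**A if net >= 0 else -w * abs(net) ** A
        ev[deck] += a * (utility - ev[deck])   # delta learning rule
    return counts


def classify(counts):
    """Very coarse choice-pattern labels for the sweep below."""
    good, infrequent = counts[2] + counts[3], counts[1] + counts[3]
    if good >= 60:
        return "prefers good decks"
    if infrequent >= 60:
        return "prefers infrequent-loss decks"
    if counts[0] + counts[1] >= 60:
        return "prefers bad decks"
    return "no clear preference"


# Crude parameter-space sweep: sample parameters uniformly from their ranges
# and tally which qualitative patterns the model can produce.
patterns = {}
for _ in range(2000):
    params = dict(A=rng.uniform(0, 1), w=rng.uniform(0, 5),
                  a=rng.uniform(0, 1), c=rng.uniform(0, 5))
    label = classify(simulate_pvl_delta(**params))
    patterns[label] = patterns.get(label, 0) + 1
print(patterns)
```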


Behavior Research Methods | 2010

Bayesian inference using WBDev: a tutorial for social scientists.

Ruud Wetzels; Michael D. Lee; Eric-Jan Wagenmakers

Over the last decade, the popularity of Bayesian data analysis in the empirical sciences has greatly increased. This is partly due to the availability of WinBUGS, a free and flexible statistical software package that comes with an array of predefined functions and distributions, allowing users to build complex models with ease. For many applications in the psychological sciences, however, it is highly desirable to be able to define one’s own distributions and functions. This functionality is available through the WinBUGS Development Interface (WBDev). This tutorial illustrates the use of WBDev by means of concrete examples, featuring the expectancy-valence model for risky behavior in decision making, and the shifted Wald distribution of response times in speeded choice.
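As a rough illustration of the kind of user-defined distribution the tutorial covers, the sketch below evaluates a shifted Wald density in Python. The parameterization (drift gamma, boundary alpha, shift theta) is the common response-time formulation and is an assumption here; it is not code from the WBDev tutorial itself.

```python
# A minimal sketch of the shifted Wald (shifted inverse Gaussian) density,
# the kind of custom distribution one would register through WBDev. Written
# in Python for illustration; parameter names are assumptions.

import numpy as np


def shifted_wald_pdf(t, gamma, alpha, theta):
    """Density of a shifted Wald distribution at response time t > theta."""
    t = np.asarray(t, dtype=float)
    s = t - theta                                 # shifted time
    dens = np.zeros_like(t)
    ok = s > 0
    dens[ok] = (alpha / np.sqrt(2 * np.pi * s[ok] ** 3)
                * np.exp(-((alpha - gamma * s[ok]) ** 2) / (2 * s[ok])))
    return dens


# Example: evaluate the density over a grid of response times (in seconds).
times = np.linspace(0.2, 2.0, 5)
print(shifted_wald_pdf(times, gamma=3.0, alpha=1.0, theta=0.15))
```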

Collaboration


Ruud Wetzels's top co-authors and their affiliations.

Top Co-Authors

Michael D. Lee
University of California

Dora Matzke
University of Amsterdam

Annelies Bartlema
Katholieke Universiteit Leuven

Wolf Vanpaemel
Katholieke Universiteit Leuven

Angélique O. J. Cramer
University Medical Center Groningen