Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Marjan Bakker is active.

Publications


Featured research published by Marjan Bakker.


Perspectives on Psychological Science | 2012

The Rules of the Game Called Psychological Science

Marjan Bakker; Annette van Dijk; Jelte M. Wicherts

If science were a game, a dominant rule would probably be to collect results that are statistically significant. Several reviews of the psychological literature have shown that around 96% of papers involving the use of null hypothesis significance testing report significant outcomes for their main results but that the typical studies are insufficiently powerful for such a track record. We explain this paradox by showing that the use of several small underpowered samples often represents a more efficient research strategy (in terms of finding p < .05) than does the use of one larger (more powerful) sample. Publication bias and the most efficient strategy lead to inflated effects and high rates of false positives, especially when researchers also resort to questionable research practices, such as adding participants after intermediate testing. We provide simulations that highlight the severity of such biases in meta-analyses. We consider 13 meta-analyses covering 281 primary studies in various fields of psychology and find indications of biases and/or an excess of significant results in seven. These results highlight the need for sufficiently powerful replications and changes in journal policies.
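
The core claim, that several small underpowered studies can be a more "efficient" route to p < .05 than one well-powered study of the same total size, can be illustrated with a short simulation. The sketch below is not the paper's own code; the effect size, group sizes, and number of studies are assumed values chosen only for illustration.

    # Probability of obtaining at least one p < .05 from five small studies versus
    # from a single large study that uses the same total number of participants.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    d = 0.2                  # assumed true effect size (Cohen's d)
    n_small, k = 20, 5       # five small two-group studies, n = 20 per group
    n_large = n_small * k    # one large two-group study, n = 100 per group
    reps = 10_000

    def p_below_05(n):
        a = rng.normal(d, 1.0, size=(reps, n))    # "experimental" group
        b = rng.normal(0.0, 1.0, size=(reps, n))  # "control" group
        return stats.ttest_ind(a, b, axis=1).pvalue < 0.05

    hit_large = p_below_05(n_large).mean()
    hit_small = np.any([p_below_05(n_small) for _ in range(k)], axis=0).mean()
    print(f"one large study (n = {n_large} per group): {hit_large:.2f}")
    print(f"at least one of {k} small studies (n = {n_small} per group): {hit_small:.2f}")

Under these assumed numbers the chance of at least one significant result across the small studies exceeds the power of the single large study, which is the incentive structure the abstract describes.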


PLOS ONE | 2011

Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results

Jelte M. Wicherts; Marjan Bakker; Dylan Molenaar

Background: The widespread reluctance to share published research data is often hypothesized to be due to the authors' fear that reanalysis may expose errors in their work or may produce conclusions that contradict their own. However, these hypotheses have not previously been studied systematically. Methods and Findings: We related the reluctance to share research data for reanalysis to 1148 statistically significant results reported in 49 papers published in two major psychology journals. We found the reluctance to share data to be associated with weaker evidence (against the null hypothesis of no effect) and a higher prevalence of apparent errors in the reporting of statistical results. The unwillingness to share data was particularly clear when reporting errors had a bearing on statistical significance. Conclusions: Our findings on the basis of psychological papers suggest that statistical results are particularly hard to verify when reanalysis is more likely to lead to contrasting conclusions. This highlights the importance of establishing mandatory data archiving policies.


Behavior Research Methods | 2011

The (mis)reporting of statistical results in psychology journals

Marjan Bakker; Jelte M. Wicherts

In order to study the prevalence, nature (direction), and causes of reporting errors in psychology, we checked the consistency of reported test statistics, degrees of freedom, and p values in a random sample of high- and low-impact psychology journals. In a second study, we established the generality of reporting errors in a random sample of recent psychological articles. Our results, on the basis of 281 articles, indicate that around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers’ expectations. We classified the most common errors and contacted authors to shed light on the origins of the errors.
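
The consistency check described here can be sketched in a few lines: recompute the p value from the reported test statistic and degrees of freedom and compare it with the reported p. The function below is a simplified illustration with hypothetical reported values (in the spirit of later tools such as statcheck), not the authors' actual procedure.

    # Recompute a two-sided p value from a reported t statistic and its degrees of
    # freedom, and flag inconsistencies with the reported p value.
    from scipy import stats

    def check_t_report(t, df, reported_p, alpha=0.05, tol=0.005):
        recomputed = 2 * stats.t.sf(abs(t), df)
        inconsistent = abs(recomputed - reported_p) > tol
        # a gross error: reported and recomputed p fall on opposite sides of alpha
        gross = inconsistent and ((recomputed < alpha) != (reported_p < alpha))
        return recomputed, inconsistent, gross

    # Hypothetical report "t(28) = 1.70, p = .04": the recomputed p is about .10,
    # so the result is inconsistent and the error affects statistical significance.
    print(check_t_report(t=1.70, df=28, reported_p=0.04))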


Psychological Methods | 2014

Outlier removal, sum scores, and the inflation of the Type I error rate in independent samples t tests: the power of alternatives and recommendations

Marjan Bakker; Jelte M. Wicherts

In psychology, outliers are often excluded before running an independent samples t test, and data are often nonnormal because of the use of sum scores based on tests and questionnaires. This article concerns the handling of outliers in the context of independent samples t tests applied to nonnormal sum scores. After reviewing common practice, we present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. We found Type I error rates of above 20% after removing outliers with a threshold value of Z = 2 in a short and difficult test. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold values of Z after having seen the effects thereof on outcomes. We recommend the use of nonparametric Mann-Whitney-Wilcoxon tests or robust Yuen-Welch tests without removing outliers. These alternatives to independent samples t tests are found to have nominal Type I error rates with a minimal loss of power when no outliers are present in the data and to have nominal Type I error rates and good power when outliers are present.
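
The first finding is easy to make concrete with a small simulation. The sketch below is not the authors' simulation code: it assumes right-skewed, bounded scores as a stand-in for sum scores on a short, difficult test, and shows why removing |Z| > 2 values within each group inflates the Type I error rate of the t test while a Mann-Whitney test on the untouched data stays near the nominal level.

    # Under the null hypothesis (both groups drawn from the same skewed distribution),
    # estimate the false positive rate of a t test after per-group outlier removal
    # versus a Mann-Whitney U test on the full data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    reps, n = 5_000, 25

    def sample():
        # right-skewed, discrete scores, e.g. number correct on a hard 10-item test
        return rng.binomial(10, 0.2, size=n).astype(float)

    def drop_outliers(x, z_threshold=2.0):
        z = (x - x.mean()) / x.std(ddof=1)
        return x[np.abs(z) <= z_threshold]

    t_hits = mw_hits = 0
    for _ in range(reps):
        a, b = sample(), sample()
        t_hits += stats.ttest_ind(drop_outliers(a), drop_outliers(b)).pvalue < 0.05
        mw_hits += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < 0.05

    print(f"t test after outlier removal: Type I error ~ {t_hits / reps:.3f}")
    print(f"Mann-Whitney on full data:    Type I error ~ {mw_hits / reps:.3f}")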


Behavior Research Methods | 2015

A power fallacy

Eric-Jan Wagenmakers; Josine Verhagen; Alexander Ly; Marjan Bakker; Michael D. Lee; Dora Matzke; Jeffrey N. Rouder; Richard D. Morey

The power fallacy refers to the misconception that what holds on average, across an ensemble of hypothetical experiments, also holds for each case individually. According to the fallacy, high-power experiments always yield more informative data than do low-power experiments. Here we expose the fallacy with concrete examples, demonstrating that a particular outcome from a high-power experiment can be completely uninformative, whereas a particular outcome from a low-power experiment can be highly informative. Although power is useful in planning an experiment, it is less useful, and sometimes even misleading, for making inferences from observed data. To make inferences from data, we recommend the use of likelihood ratios or Bayes factors, which are the extension of likelihood ratios beyond point hypotheses. These methods of inference do not average over hypothetical replications of an experiment, but instead condition on the data that have actually been observed. In this way, likelihood ratios and Bayes factors rationally quantify the evidence that a particular data set provides for or against the null or any other hypothesis.
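
As a toy illustration of inference that conditions only on the observed data, the sketch below computes likelihood ratios between two point hypotheses for binomial outcomes. The hypotheses, sample sizes, and counts are made-up numbers, not examples taken from the article.

    # Likelihood ratio: how much more probable the observed data are under one
    # point hypothesis than under another, irrespective of the design's power.
    from scipy import stats

    def likelihood_ratio(successes, trials, theta1, theta0):
        return (stats.binom.pmf(successes, trials, theta1)
                / stats.binom.pmf(successes, trials, theta0))

    # A small experiment can yield strong evidence: 9 successes out of 10 favour
    # theta = 0.8 over theta = 0.5 by a factor of roughly 27.
    print(likelihood_ratio(9, 10, theta1=0.8, theta0=0.5))

    # A large experiment can yield a nearly uninformative outcome: 132 successes
    # out of 200 give a likelihood ratio close to 1.
    print(likelihood_ratio(132, 200, theta1=0.8, theta0=0.5))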


Frontiers in Computational Neuroscience | 2012

Letting the daylight in: Reviewing the reviewers and other ways to maximize transparency in science

Jelte M. Wicherts; Rogier A. Kievit; Marjan Bakker; Denny Borsboom

With the emergence of online publishing, opportunities to maximize transparency of scientific research have grown considerably. However, these possibilities are still only marginally used. We argue for the implementation of (1) peer-reviewed peer review, (2) transparent editorial hierarchies, and (3) online data publication. First, peer-reviewed peer review entails a community-wide review system in which reviews are published online and rated by peers. This ensures accountability of reviewers, thereby increasing academic quality of reviews. Second, reviewers who write many highly regarded reviews may move to higher editorial positions. Third, online publication of data ensures the possibility of independent verification of inferential claims in published papers. This counters statistical errors and overly positive reporting of statistical results. We illustrate the benefits of these strategies by discussing an example in which the classical publication system has gone awry, namely controversial IQ research. We argue that this case would have likely been avoided using more transparent publication practices. We argue that the proposed system leads to better reviews, meritocratic editorial hierarchies, and a higher degree of replicability of statistical analyses.


PLOS ONE | 2014

Outlier Removal and the Relation with Reporting Errors and Quality of Psychological Research

Marjan Bakker; Jelte M. Wicherts

Background: The removal of outliers to acquire a significant result is a questionable research practice that appears to be commonly used in psychology. In this study, we investigated whether the removal of outliers in psychology papers is related to weaker evidence (against the null hypothesis of no effect), a higher prevalence of reporting errors, and smaller sample sizes in these papers compared to papers in the same journals that did not report the exclusion of outliers from the analyses. Methods and Findings: We retrieved a total of 2667 statistical results of null hypothesis significance tests from 153 articles in main psychology journals, and compared results from articles in which outliers were removed (N = 92) with results from articles that reported no exclusion of outliers (N = 61). We preregistered our hypotheses and methods and analyzed the data at the level of articles. Results show no significant difference between the two types of articles in median p value, sample sizes, or prevalence of all reporting errors, large reporting errors, and reporting errors that concerned the statistical significance. However, we did find a discrepancy between the reported degrees of freedom of t tests and the reported sample size in 41% of articles that did not report removal of any data values. This suggests a common failure to report data exclusions (or missingness) in psychological articles. Conclusions: We failed to find that the removal of outliers from the analysis in psychological articles was related to weaker evidence (against the null hypothesis of no effect), sample size, or the prevalence of errors. However, our control sample might be contaminated due to nondisclosure of excluded values in articles that did not report exclusion of outliers. Results therefore highlight the importance of more transparent reporting of statistical analyses.
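
The degrees-of-freedom check mentioned in the findings is simple to sketch. The snippet below uses hypothetical numbers and only illustrates the logic, comparing a reported sample size with the degrees of freedom of an independent samples t test (df = N - 2); it is not the authors' coding procedure.

    # If an article reports N participants but a t test with fewer than N - 2
    # degrees of freedom, some observations were apparently excluded or missing
    # without being reported.
    def apparent_unreported_exclusions(reported_n, reported_df):
        expected_df = reported_n - 2   # independent samples t test with two groups
        return max(expected_df - reported_df, 0)

    # Hypothetical report: "N = 80" together with "t(74) = 2.31" implies that
    # 4 observations were dropped or missing without explanation.
    print(apparent_unreported_exclusions(reported_n=80, reported_df=74))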


Group Processes & Intergroup Relations | 2014

Broken windows, mediocre methods, and substandard statistics

Jelte M. Wicherts; Marjan Bakker

Broken windows theory states that cues of inappropriate behavior like litter or graffiti amplify norm-violating behavior. In a series of quasi-experiments, Keizer, Lindenberg, and Steg altered cues of inappropriate behavior in public places and observed how many passersby subsequently violated norms. They concluded that particular norm violations transgress to other misdemeanors (e.g., graffiti leads to littering or even theft) and that the presence of prohibition signs heightens the saliency of norm violations, thereby aggravating the negative effects of cues such as litter and graffiti. We raise several methodological and statistical issues that cast doubt on Keizer et al.’s results. Problems include confounding factors, observer bias, a lack of scoring protocols, a failure to establish interobserver reliabilities, inflated Type I error rates due to dependencies, sequential testing, and multiple testing. We highlight results of a highly similar study that does not support the notion that prohibition signs aggravate the effects of observed norm violations. We discuss potential improvements of the paradigm.
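
One of the statistical problems listed, uncorrected multiple testing, is easy to quantify. The numbers below are generic and not taken from the article; they only show how quickly the familywise false positive rate grows when several independent tests are each run at alpha = .05.

    # Familywise Type I error rate for m independent tests at the nominal alpha.
    alpha, m = 0.05, 6
    print(1 - (1 - alpha) ** m)   # about 0.26: a one-in-four chance of a spurious "effect"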


European Journal of Personality | 2013

Dwelling on the past

Marjan Bakker; Angélique O. J. Cramer; Dora Matzke; Rogier A. Kievit; H.L.J. van der Maas; Eric-Jan Wagenmakers; Denny Borsboom

With the growing number of fraudulent and non-replicable findings from experiments performed in laboratories worldwide, the target article proposes practices that investigators may use to increase replicability. We laud the authors for their thoughtful intentions and extend the discussion to two domains: the structure of psychological science and generalizability. The former represents a methodological/statistical opportunity.


Nature | 2009

Sharing: guidelines go one step forwards, two steps back

Jelte M. Wicherts; Marjan Bakker


Collaboration


Dive into Marjan Bakker's collaborations.

Top Co-Authors

Dora Matzke

University of Amsterdam

Rogier A. Kievit

Cognition and Brain Sciences Unit

Alexander Ly

University of Amsterdam