Featured Research

Other Statistics

"Playing the whole game": A data collection and analysis exercise with Google Calendar

We provide a computational exercise suitable for early introduction in an undergraduate statistics or data science course that allows students to 'play the whole game' of data science: performing both data collection and data analysis. While many teaching resources exist for data analysis, such resources are not as abundant for data collection given the inherent difficulty of the task. Our proposed exercise centers around student use of Google Calendar to collect data with the goal of answering the question 'How do I spend my time?' On the one hand, the exercise involves answering a question with near universal appeal, but on the other hand, the data collection mechanism is not beyond the reach of a typical undergraduate student. A further benefit of the exercise is that it provides an opportunity for discussions on ethical questions and considerations that data providers and data analysts face in today's age of large-scale internet-based data collection.

A Bayesian Redesign of the First Probability/Statistics Course

The traditional calculus-based introduction to statistical inference consists of a semester of probability followed by a semester of frequentist inference. Cobb (2015) challenges the statistical education community to rethink the undergraduate statistics curriculum. In particular, he suggests that we should focus on two goals: making fundamental concepts accessible and minimizing prerequisites to research. Using five underlying principles of Cobb, we describe a new calculus-based introduction to statistics based on simulation-based Bayesian computation.

A Bayesian Statistics Course for Undergraduates: Bayesian Thinking, Computing, and Research

We propose a semester-long Bayesian statistics course for undergraduate students with a calculus and probability background. We cultivate students' Bayesian thinking with Bayesian methods applied to real data problems. We leverage modern Bayesian computing techniques not only for implementing Bayesian methods, but also to deepen students' understanding of the methods. Collaborative case studies further enrich students' learning and provide experience in solving open-ended applied problems. The course has an emphasis on undergraduate research, where accessible academic journal articles are read, discussed, and critiqued in class. With increased confidence and familiarity, students take on the challenge of reading, implementing, and sometimes extending methods in journal articles for their course projects.

A Brief History of the Statistics Department of the University of California at Berkeley

The early history of our department was dominated by Jerzy Neyman (1894-1981), while the next phase was largely in the hands of Neyman's students, with Erich Lehmann (1917-2009) being a central, long-lived and much-loved member of this group. We are very fortunate in having Constance Reid's biography "Neyman -- From Life" and Erich's "Reminiscences of a Statistician: The Company I Kept" and other historical material documenting the founding and growth of the department, and the people in it. In what follows, we will draw heavily from these sources, describing what seems to us to be a remarkable success story: one person starting "a cell of statistical research and teaching ... not being hampered by any existing traditions and routines" and seeing that cell grow rapidly into a major force in academic statistics worldwide. That it has remained so for (at least) the half-century after its founding is a testament to the strength of Neyman's model for a department of statistics.

A Case Study of Promoting Informal Inferential Reasoning in Learning Sampling Distribution for High School Students

Drawing inferences from data is an important skill that helps students understand their everyday lives, so students need to learn the sampling distribution, a central topic in statistical inference. However, little is known about how to teach the topic to high school students, especially in the Indonesian context. Therefore, the present study reports a teaching experiment designed to support students' informal inferential reasoning in understanding the sampling distribution, and examines the students' perceptions of the teaching experiment. The subjects were three 11th-graders at a private school in Yogyakarta majoring in mathematics and natural science. Data were collected through direct observation of the sampling distribution learning process, interviews, and documentation. The study found that informal inferential reasoning with problem-based learning using contextual problems and real data could help the students understand the sampling distribution, and that the students responded positively to the learning experience.

A Coin-Tossing Conundrum

It is shown that an equiprobability hypothesis leads to a scenario in which it is possible to predict the outcome of a single toss of a fair coin with a success probability greater than 50%. We discuss whether this hypothesis might be independent of the usual hypotheses governing probability, as well as whether this hypothesis might be assumed as a result of the Principle of Indifference. Also discussed are ways to implement or circumvent the hypothesis.

A Combinatorial Solution to Causal Compatibility

Within the field of causal inference, it is desirable to learn the structure of causal relationships holding between a system of variables from the correlations that these variables exhibit; a sub-problem of which is to certify whether or not a given causal hypothesis is compatible with the observed correlations. A particularly challenging setting for assessing causal compatibility is in the presence of partial information, i.e. when some of the variables are hidden/latent. This paper introduces the possible worlds framework as a method for deciding causal compatibility in this difficult setting. We define a graphical object called a possible worlds diagram, which compactly depicts the set of all possible observations. From this construction, we demonstrate explicitly, using several examples, how to prove causal incompatibility. In fact, we use these constructions to prove causal incompatibility where no other techniques have been able to. Moreover, we prove that the possible worlds framework can be adapted to provide a complete solution to the possibilistic causal compatibility problem. We also discuss how to exploit graphical symmetries and cross-world consistency constraints in order to implement a hierarchy of necessary compatibility tests that we prove converges to sufficiency.

A Conceptual Introduction to Markov Chain Monte Carlo Methods

Markov Chain Monte Carlo (MCMC) methods have become a cornerstone of many modern scientific analyses by providing a straightforward approach to numerically estimate uncertainties in the parameters of a model using a sequence of random samples. This article provides a basic introduction to MCMC methods by establishing a strong conceptual understanding of what problems MCMC methods are trying to solve, why we want to use them, and how they work in theory and in practice. To develop these concepts, I outline the foundations of Bayesian inference, discuss how posterior distributions are used in practice, explore basic approaches to estimate posterior-based quantities, and derive their link to Monte Carlo sampling and MCMC. Using a simple toy problem, I then demonstrate how these concepts can be used to understand the benefits and drawbacks of various MCMC approaches. Exercises designed to highlight various concepts are also included throughout the article.
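As a rough illustration of the kind of sampler such an introduction typically covers, here is a minimal random-walk Metropolis sketch. It is not taken from the article: the function names (`metropolis`, `log_std_normal`), the standard normal target, the step size, and the seed are all assumptions chosen for the example.

```python
import math
import random


def metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log density.

    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current state.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p_new / p_current).
        accept_prob = math.exp(min(0.0, log_p_new - log_p))
        if rng.random() < accept_prob:
            x, log_p = proposal, log_p_new
        # The current state is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


def log_std_normal(x):
    """Log density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


samples = metropolis(log_std_normal, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean = {mean:.3f}, sample variance = {var:.3f}")
```

For this toy target the sample mean and variance should land near 0 and 1; the samples are correlated, so the estimates are noisier than an independent draw of the same size would give.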

A Concise Resolution to the Two Envelope Paradox

In this paper, I will demonstrate a new perspective on the Two Envelope Problem. I hope to show with convincing clarity how the paradox results from an inherent problem pertaining to the interpretation of Bayesian probability. Specifically, a subjective probability that is inconsistent with reality can mislead reasoning based on Bayesian decision theory.

A Constructive Algebraic Proof of Student's Theorem

Student's theorem is an important result in statistics which states that, for a normal population, the sample variance is independent of the sample mean and, suitably scaled, has a chi-square distribution. The existing proofs of this theorem either rely heavily on advanced tools such as moment generating functions, or fail to explicitly construct an orthogonal matrix used in the proof. This paper provides an elegant explicit construction of that matrix, making the algebraic proof complete. The constructive algebraic proof proposed here is thus well suited for inclusion in textbooks.
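For reference, the standard statement of the theorem can be written as follows. The notation ($X_i$, $\bar{X}$, $S^2$, $\mu$, $\sigma^2$) is assumed here for illustration and is not taken from the paper.

```latex
Let $X_1, \dots, X_n$ be i.i.d.\ $N(\mu, \sigma^2)$ with $n \ge 2$, and define
\[
  \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i ,
  \qquad
  S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 .
\]
Then $\bar{X}$ and $S^2$ are independent, and
\[
  \bar{X} \sim N\!\left( \mu, \frac{\sigma^2}{n} \right),
  \qquad
  \frac{(n-1)\, S^2}{\sigma^2} \sim \chi^2_{n-1} .
\]
```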
