Featured Research

Other Statistics

Introduction to Geodetic Time Series Analysis

This contribution is Chapter 2 of the book "Geodetic Time Series Analysis" (10.1007/978-3-030-21718-1). The book is dedicated to the art of fitting a trajectory model to geodetic time series in order to extract accurate geophysical information, with realistic error bars, for studies in geodynamics and environmental geodesy. Within the vast literature published on this topic in the past 25 years, we are specifically interested in parametric algorithms that estimate both functional and stochastic models using various Bayesian statistical tools (maximum likelihood, Markov chain Monte Carlo, Kalman filter, least-squares variance component estimation, information criteria). This chapter focuses on how the parameters of the trajectory model can be estimated. It is meant to give researchers new to this topic an accessible introduction to the theory, with references to key books and articles where more details can be found. In addition, we hope that it refreshes some of the details for more experienced readers. We pay special attention to the modelling of the noise, which has received much attention in the literature in recent years, and highlight some of the numerical aspects.
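
As a concrete illustration of the trajectory-model fitting described above, here is a minimal sketch (our own, with synthetic numbers; not code from the book): an offset, a linear trend, and an annual seasonal signal estimated by ordinary least squares. Realistic error bars additionally require a stochastic model for the coloured noise, which is the part of the problem the chapter treats in detail.

```python
import numpy as np

# Synthetic daily position series over ten years (units arbitrary)
rng = np.random.default_rng(0)
t = np.arange(0.0, 10.0, 1 / 365.25)       # epochs in years
truth = 2.0 + 3.0 * t + 1.5 * np.cos(2 * np.pi * t)
y = truth + rng.normal(0.0, 1.0, t.size)   # white noise; real GNSS noise is coloured

# Design matrix of the trajectory model: offset, trend, annual cosine/sine
A = np.column_stack([np.ones_like(t), t,
                     np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)])
params, *_ = np.linalg.lstsq(A, y, rcond=None)
print("offset, trend, annual cos, annual sin:", params.round(3))
```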

Inverse Ising techniques to infer underlying mechanisms from data

As a problem in data science, the inverse Ising (or Potts) problem is to infer the parameters of a Gibbs-Boltzmann distribution of an Ising (or Potts) model from samples drawn from that distribution. The algorithmic and computational interest stems from the fact that this inference task cannot be done efficiently by the maximum likelihood criterion, since the normalizing constant of the distribution (the partition function) cannot be calculated exactly and efficiently. The practical interest, on the other hand, flows from several outstanding applications, of which the best known has been predicting spatial contacts in protein structures from tables of homologous protein sequences. Most applications to date have been to data produced by a dynamical process which, as far as is known, cannot be expected to satisfy detailed balance. There is therefore no a priori reason to expect the distribution to be of the Gibbs-Boltzmann type, and no a priori reason to expect that inverse Ising (or Potts) techniques should yield useful information. In this review we discuss two types of problems where progress can nevertheless be made. We find that, depending on model parameters, there are phases where the distribution is in fact close to a Gibbs-Boltzmann distribution, the non-equilibrium nature of the underlying dynamics notwithstanding. We also discuss the relation between inferred Ising model parameters and the parameters of the underlying dynamics.
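
To make the partition-function obstacle concrete, here is a hedged sketch (ours, not code from the review) of the standard workaround, pseudo-likelihood maximization: the per-spin conditional distributions need no global normalizer, so inference reduces to independent logistic regressions, one per spin.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_samples = 5, 20000

# Ground-truth symmetric couplings with zero diagonal
J_true = rng.normal(0.0, 0.3, (n, n))
J_true = (J_true + J_true.T) / 2
np.fill_diagonal(J_true, 0.0)

# Gibbs sampling from p(s) ∝ exp(Σ_{i<j} J_ij s_i s_j) (burn-in omitted)
s = rng.choice([-1.0, 1.0], n)
samples = np.empty((n_samples, n))
for k in range(n_samples):
    for i in range(n):
        h = J_true[i] @ s                                  # local field on spin i
        s[i] = 1.0 if rng.random() < 1 / (1 + np.exp(-2 * h)) else -1.0
    samples[k] = s

# Pseudo-likelihood: maximize log p(s_i | s_rest) for each spin i; this is a
# logistic regression with gradient X^T (y - tanh(X w)), no partition function.
def fit_row(i, lr=0.5, iters=500):
    X = np.delete(samples, i, axis=1)
    y = samples[:, i]
    w = np.zeros(n - 1)
    for _ in range(iters):
        w += lr * X.T @ (y - np.tanh(X @ w)) / n_samples
    return w

J_hat = np.zeros((n, n))
for i in range(n):
    J_hat[i, np.arange(n) != i] = fit_row(i)
print("max coupling error:", np.abs(J_hat - J_true).max())
```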

It is Time to Stop Teaching Frequentism to Non-statisticians

We should cease teaching frequentist statistics to undergraduates and switch to Bayes. Doing so will reduce the confusion and over-certainty that are rife among users of statistics.

Iterated Integrals and Population Time Series Analysis

One of the core advantages that topological methods for data analysis provide is that the language of (co)chains can be mapped onto the semantics of the data, providing a natural avenue for human understanding of the results. Here, we describe such a semantic structure on Chen's classical iterated-integral cochain model for paths in Euclidean space. Specifically, in the context of population time series data, we observe that iterated integrals provide a model-free measure of pairwise influence that can be used for causality inference. Along the way, we survey recent results and applications, review the current standard methods for causality inference, and briefly give our outlook on generalizations beyond time series data.
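
As a hedged illustration of the pairwise-influence idea (our own sketch; the paper's construction may differ in detail), the antisymmetric part of the order-2 iterated integral, the signed (Lévy) area, detects which of two series leads the other:

```python
import numpy as np

def signed_area(x, y):
    """Antisymmetric order-2 iterated integral of the path (x, y):
    0.5 * (integral of x dy - integral of y dx), discretized over increments.
    Positive values suggest x 'leads' y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x - x[0]                           # start the path at the origin
    y = y - y[0]
    return 0.5 * (np.sum(x[:-1] * np.diff(y)) - np.sum(y[:-1] * np.diff(x)))

# Toy example: y is a lagged copy of x, so x should lead y
t = np.linspace(0, 20, 500)
x = np.sin(t)
y = np.sin(t - 0.8)                        # y trails x by a fixed lag
print(signed_area(x, y) > 0)               # True: x "leads" y
print(signed_area(y, x) < 0)               # True: antisymmetry flips the sign
```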

It's All on the Square - The Importance of the Sum of Squares and Making the General Linear Model Simple

Statistics is one of the most valuable of disciplines. Science is based on proof, and it alone produces results; other approaches are not, and do not. Statistics is the only acceptable language of proof in science. Yet statistics is difficult to understand for a large percentage of those who will be evaluating, and even doing, research. Reasons for this difficulty may be that statistics operates counter to the way people think, together with a widespread phobia of numbers. Adding to the difficulty, undergraduate textbooks tend to present statistical tests as an unorganized conglomeration of unrelated procedures, which leads students to miss that all of the parametric procedures they study in an introductory course are ultimately doing the same thing and stem from common sources. In statistics, precisely because the material is complex, the presentation must be simple! This article endeavors to do just that.
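
To illustrate the claim that the introductory parametric procedures are all one general linear model, here is a hedged sketch (ours, not the article's): a pooled two-sample t-test reproduced exactly as the test of the slope in a dummy-variable regression.

```python
import numpy as np
from scipy import stats

# Two synthetic groups (hypothetical data)
rng = np.random.default_rng(2)
a = rng.normal(10.0, 2.0, 40)
b = rng.normal(11.5, 2.0, 40)

# Classical pooled two-sample t-test
t_classic, _ = stats.ttest_ind(b, a)

# The same test as a general linear model: y = b0 + b1 * group + error
y = np.concatenate([a, b])
X = np.column_stack([np.ones(80), np.r_[np.zeros(40), np.ones(40)]])
beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
sigma2 = rss[0] / (80 - 2)                          # pooled residual variance
se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_glm = beta[1] / se_b1

print(t_classic, t_glm)   # identical up to floating-point error
```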

J. B. S. Haldane's Contribution to the Bayes Factor Hypothesis Test

This article brings attention to some historical developments that gave rise to the Bayes factor for testing a point null hypothesis against a composite alternative. In line with current thinking, we find that the conceptual innovation - to assign prior mass to a general law - is due to a series of three articles by Dorothy Wrinch and Sir Harold Jeffreys (1919, 1921, 1923). However, our historical investigation also suggests that in 1932 J. B. S. Haldane made an important contribution to the development of the Bayes factor by proposing the use of a mixture prior comprising a point mass and a continuous probability density. Jeffreys was aware of Haldane's work and it may have inspired him to pursue a more concrete statistical implementation for his conceptual ideas. It thus appears that Haldane may have played a much bigger role in the statistical development of the Bayes factor than has hitherto been assumed.

J.B.S. Haldane Could Have Done Better

In a review of the contribution of J.B.S. Haldane to the development of the Bayes factor hypothesis test (arXiv:1511.08180), Etz and Wagenmakers focus on Haldane's proposition of a mixture prior in a genetic example (Haldane 1932, A note on inverse probability. Mathematical Proceedings of the Cambridge Philosophical Society, 28, 55-61). As Haldane never followed up on these ideas, it is difficult to gauge his motivation and intentions. I argue that, contrary to Haldane's stated intention of replacing flat priors with more reasonable assumptions, he actually chose an unreasonable flat prior in this example. Considering the information available to Haldane, I derive a superior prior and compare it to Haldane's flat prior. Haldane's main intent with his article seems to have been to explore the different parameter regions of the binomial and the conjugate beta. Furthermore, I agree with Etz and Wagenmakers that Haldane serendipitously adopted a mixture prior comprising a point mass and a smooth distribution in his genetic example.

Julian Ernst Besag, 26 March 1945 – 6 August 2010, a biographical memoir

Julian Besag was an outstanding statistical scientist, distinguished for his pioneering work on the statistical theory and analysis of spatial processes, especially conditional lattice systems. His work has been seminal in statistical developments over the last several decades ranging from image analysis to Markov chain Monte Carlo methods. He clarified the role of auto-logistic and auto-normal models as instances of Markov random fields and paved the way for their use in diverse applications. Later work included investigations into the efficacy of nearest neighbour models to accommodate spatial dependence in the analysis of data from agricultural field trials, image restoration from noisy data, and texture generation using lattice models.

Jump balls, rating falls, and elite status: A sensitivity analysis of three quarterback rating statistics

Quarterback performance can be difficult to rank, and much effort has been spent on creating new rating systems. However, the input statistics for such ratings are subject to randomness and to factors outside the quarterback's control. To investigate this variance, we perform a sensitivity analysis of three quarterback rating statistics: the Traditional 1971 rating by Smith, the Burke rating, and the Wages of Wins rating. The comparisons are made at the team level for the 32 NFL teams from 2002 to 2015, thus giving each case an even 16 games. We compute quarterback ratings for each offense with 1-5 additional touchdowns, 1-5 fewer interceptions, 1-5 additional sacks, and a 1-5 percent increase in the passing completion rate. Our sensitivity analysis provides insight into whether an elite passing team could seem mediocre, or vice versa, based on random outcomes. The results indicate that the Traditional rating is the most sensitive statistic with respect to touchdowns, interceptions, and completions, whereas the Burke rating is most sensitive to sacks. The analysis suggests that team passing offense rankings are highly sensitive to aspects of football that are out of the quarterback's hands (e.g., deflected passes that lead to interceptions). Thus, on the margins, we show that arguments about whether a specific quarterback has entered the elite or remains mediocre are irrelevant.
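
For reference, the Traditional 1971 statistic is the familiar NFL passer rating, whose formula is public; the sketch below (our own, with made-up season totals) shows the kind of perturbation the sensitivity analysis applies, here five fewer interceptions.

```python
# NFL passer rating (the "Traditional 1971" statistic); each term is
# clamped to [0, 2.375] before averaging.
def passer_rating(comp, att, yds, td, ints):
    clamp = lambda v: max(0.0, min(v, 2.375))
    a = clamp((comp / att - 0.3) * 5)      # completion percentage term
    b = clamp((yds / att - 3) * 0.25)      # yards per attempt term
    c = clamp(td / att * 20)               # touchdown rate term
    d = clamp(2.375 - ints / att * 25)     # interception rate term
    return (a + b + c + d) / 6 * 100

# Hypothetical season totals, perturbed the way the paper's analysis does
base = passer_rating(350, 550, 4200, 28, 12)
fewer_picks = passer_rating(350, 550, 4200, 28, 7)   # five fewer interceptions
print(round(base, 1), round(fewer_picks, 1))
```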

Justifying the Norms of Inductive Inference

Bayesian inference is limited in scope because it cannot be applied in idealized contexts where none of the hypotheses under consideration is true, and because it is committed to always using the likelihood as a measure of evidential favoring, even when that is inappropriate. The purpose of this paper is to study inductive inference in a very general setting where finding the truth is not necessarily the goal and where the measure of evidential favoring is not necessarily the likelihood. I use an accuracy argument to argue for probabilism, and I develop a new kind of argument for two general updating rules, both of which are reasonable in different contexts. One of the updating rules has standard Bayesian updating, Bissiri et al.'s (2016) general Bayesian updating, Douven's (2016) IBE-based updating, and Vassend's (2019a) quasi-Bayesian updating as special cases. The other updating rule is novel.
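
To make the updating rules concrete, here is a hedged sketch (ours, not the paper's): standard Bayesian updating side by side with a generalized update in the style of Bissiri et al. (2016), where the log-likelihood is replaced by a scaled loss; with the negative log-likelihood as the loss and unit scale, the two coincide. The squared-error loss and learning rate below are illustrative assumptions.

```python
import numpy as np

# Discretized hypothesis space: bias of a coin
theta = np.linspace(0.01, 0.99, 99)
prior = np.full(theta.size, 1.0 / theta.size)
data = np.array([1, 1, 0, 1, 0, 1, 1])          # observed flips

# Standard Bayesian updating: posterior ∝ prior × likelihood
k, n = data.sum(), data.size
loglik = k * np.log(theta) + (n - k) * np.log(1 - theta)
bayes = prior * np.exp(loglik)
bayes /= bayes.sum()

# Generalized updating: posterior ∝ prior × exp(-w · loss); here a
# squared-error loss with learning rate w = 1.0 (both assumptions)
loss = ((data[:, None] - theta[None, :]) ** 2).sum(axis=0)
general = prior * np.exp(-1.0 * loss)
general /= general.sum()

print(theta[bayes.argmax()], theta[general.argmax()])   # both near 5/7
```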
