Featured Research

Other Statistics

A note on Fibonacci Sequences of Random Variables

The focus of this paper is random sequences of the form \{X_{0}, X_{1}, X_{n}=X_{n-2}+X_{n-1}, n=2,3,\ldots\}, referred to as a Fibonacci Random Sequence (FRS). The initial random variables X_{0} and X_{1} are assumed to be absolutely continuous with joint probability density function (pdf) f_{X_{0},X_{1}}. The FRS is completely determined by X_{0} and X_{1} and the members of the Fibonacci sequence \digamma \equiv \{0,1,1,2,3,5,8,13,21,34,55,89,144,\ldots\}. We examine the distributional and limit properties of the random sequence X_{n}, n=0,1,2,\ldots.
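
Because the recurrence is linear, each member can be written as X_{n} = F_{n-1} X_{0} + F_{n} X_{1} for n >= 1, where F_{k} is the k-th Fibonacci number with F_{0} = 0 and F_{1} = 1. The following is a minimal sketch that checks this identity numerically. The function names `fibonacci` and `frs` are illustrative and do not come from the paper, and the integer starting values stand in for one realization of (X_{0}, X_{1}).

```python
def fibonacci(n):
    """Return F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def frs(x0, x1, n):
    """Return [X_0, ..., X_n] generated by X_k = X_{k-2} + X_{k-1}."""
    seq = [x0, x1]
    for _ in range(2, n + 1):
        seq.append(seq[-2] + seq[-1])
    return seq[: n + 1]


# One realization of the starting pair (illustrative values, not from the paper).
x0, x1 = 2, 5
path = frs(x0, x1, 10)

# Each member is a fixed linear combination of X_0 and X_1.
for k in range(1, 11):
    assert path[k] == fibonacci(k - 1) * x0 + fibonacci(k) * x1
```

Since every X_{n} is a linear combination of the same two variables, its distribution follows from the joint pdf of (X_{0}, X_{1}) alone.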

A novel entropy recurrence quantification analysis

The growing study of time series, especially those related to nonlinear systems, has challenged the methodologies used to characterize and classify the dynamical structures of a signal. Here we conceive a new diagnostic tool for time series based on the concept of information entropy, in which the probabilities are associated with microstates defined from the recurrence phase space. Recurrence properties can be studied using recurrence plots, a methodology based on binary matrices in which trajectories in the phase space of a dynamical system are evaluated against other embedded trajectories. Our novel entropy methodology has several advantages over the traditional recurrence entropy defined in the literature, namely the correct evaluation of the chaoticity level of the signal, a weak dependence on parameters, the correct evaluation of periodic time series properties, and greater sensitivity to the noise level of time series. Furthermore, the new entropy quantifier developed in this manuscript also fixes inconsistent results of the traditional recurrence entropy concept, reproducing classical results with novel insights.
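
To make the idea concrete, here is a rough sketch of an entropy computed over microstates of a binary recurrence matrix. It is only an illustration under stated assumptions: the series is scalar with no embedding, two points count as recurrent when they differ by at most a threshold `eps`, microstates are all contiguous `size` x `size` blocks of the matrix, and the function names are hypothetical. The paper's own definition and sampling scheme may differ.

```python
import math
from collections import Counter


def recurrence_matrix(series, eps):
    """Binary matrix with entry 1 where |x_i - x_j| <= eps (assumed criterion)."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def microstate_entropy(series, eps, size=2):
    """Shannon entropy (in nats) of the size-by-size block patterns."""
    rec = recurrence_matrix(series, eps)
    n = len(series)
    counts = Counter()
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            # Flatten the block into a tuple so it can be used as a key.
            block = tuple(
                rec[i + a][j + b] for a in range(size) for b in range(size)
            )
            counts[block] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

A constant series yields a single microstate and therefore zero entropy, while a series that alternates between two values yields two block patterns and an entropy close to ln 2.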

A paradox on the spectral representation of stationary random processes

In this note our aim is to show a paradox in the spectral representation of stationary random processes.

A reckless guide to P-values: local evidence, global errors

This chapter demystifies P-values, hypothesis tests and significance tests, and introduces the concepts of local evidence and global error rates. The local evidence is embodied in \textit{this} data and concerns the hypotheses of interest for \textit{this} experiment, whereas the global error rate is a property of the statistical analysis and sampling procedure. It is shown using simple examples that local evidence and global error rates can be, and should be, considered together when making inferences. Power analyses for experimental design in hypothesis testing are explained, along with the more locally focussed expected P-values. Issues relating to multiple testing, HARKing, and P-hacking are explained, and it is shown that, in many situations, their effects on local evidence and global error rates are in conflict, a conflict that can always be overcome by a fresh dataset from replication of key experiments. Statistics is complicated, and so is science. There is no singular right way to do either, and universally acceptable compromises may not exist. Statistics offers a wide array of tools for assisting with scientific inference by calibrating uncertainty, but statistical inference is not a substitute for scientific inference. P-values are useful indices of evidence and deserve their place in the statistical toolbox of basic pharmacologists.
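
The distinction between a single P-value and a global error rate can be illustrated by simulation. The sketch below is not taken from the chapter: it assumes a two-sided z-test of a zero mean with known unit variance, and the function names, sample size, and replicate count are all illustrative choices. Each call to `z_test_p_value` gives the evidence from one dataset, whereas `rejection_rate` estimates a property of the procedure over many repetitions.

```python
import math
import random
from statistics import NormalDist


def z_test_p_value(sample, sigma=1.0):
    """Two-sided P-value for H0: mean = 0, with known sigma (assumed model)."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n=20, alpha=0.05, replicates=2000, seed=1):
    """Fraction of simulated experiments in which the P-value falls below alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replicates):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / replicates


type_one_error = rejection_rate(0.0)  # close to alpha when H0 is true
power = rejection_rate(0.5)           # rejection rate when the true mean is 0.5
```

With the null hypothesis true, the rejection rate should sit near `alpha`; with a true mean of 0.5 it estimates the power of the design.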

A response to critiques of "The reproducibility of research and the misinterpretation of p-values"

I proposed (8, 1, 3) that p values should be supplemented by an estimate of the false positive risk (FPR). FPR was defined as the probability that, if you claim that there is a real effect on the basis of a p value from a single unbiased experiment, you will be mistaken and the result has occurred by chance. This is a Bayesian quantity, which means that there is an infinitude of ways to calculate it. My choice of a way to estimate FPR was, therefore, arbitrary. I maintain that it is a reasonable way, and that it has the advantage of being mathematically simpler and easier to understand than other proposals. This might make it more easily accepted by users. As always, not every statistician agrees. This paper is a response to a critique of my 2017 paper (1) by Arandjelovic (2).
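
As the abstract notes, there are many ways to calculate an FPR. The sketch below shows one simple textbook version, which is not necessarily the calculation used in the paper: among all experiments that reach p < alpha, it takes the fraction in which there is no real effect. The prior probability of a real effect, the significance threshold, and the power are all assumed inputs, and the function name is illustrative.

```python
def false_positive_risk(prior, alpha=0.05, power=0.8):
    """Fraction of 'significant' results that are false positives.

    prior is the assumed probability that a real effect exists.
    """
    false_positives = alpha * (1 - prior)  # no real effect, yet p < alpha
    true_positives = power * prior         # real effect, detected
    return false_positives / (false_positives + true_positives)


# With a 10% prior chance of a real effect, about 36% of significant
# results are false positives under these assumptions.
risk = false_positive_risk(0.1)
```

The result depends strongly on the assumed prior, which is why the choice of calculation is open to debate.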

A review of problem- and team-based methods for teaching statistics in Higher Education

The teaching of statistics in higher education in the UK is still largely lecture-based. This is despite recommendations such as those given by the American Statistical Association's GAISE report that more emphasis should be placed on active learning strategies where students take more responsibility for their own learning. One possible model is that of collaborative learning, where students learn in groups through carefully crafted 'problems', which has long been suggested as a strategy for teaching statistics. In this article, we review two specific approaches that fall under the collaborative learning model: problem- and team-based learning. We consider the evidence for changing to this model of teaching in statistics, and give practical suggestions on how this could be implemented in typical statistics classes in Higher Education.

A score function for Bayesian cluster analysis

We propose a score function for Bayesian clustering. The function is parameter-free and captures the interplay between the within-cluster variance and the between-cluster entropy of a clustering. It can be used to choose the number of clusters in well-established clustering methods such as hierarchical clustering or the K-means algorithm.

A shiny update to an old experiment game

Games can be a powerful tool for learning about statistical methodology. Effective game design involves a fine balance between caricature and realism, to simultaneously illustrate salient concepts in a controlled setting and serve as a testament to real-world applicability. Striking that balance is particularly challenging in response surface and design domains, where real-world scenarios often play out over long time scales, during which theories are revised, model and inferential techniques are improved, and knowledge is updated. Here I present a game, borrowing liberally from one first played over forty years ago, that attempts to achieve that balance while reinforcing a cascade of topics in modern nonparametric response surfaces, sequential design and optimization. The game embeds a blackbox simulation within a shiny app whose interface is designed to simulate a realistic information-availability setting, while offering a stimulating, competitive environment wherein students can try out new methodology, and ultimately appreciate its power and limitations. Interface, rules, timing with course material, and evaluation are described, along with a "case study" involving a cohort of students at Virginia Tech.

A statistical framework for measuring the temporal stability of human mobility patterns

Despite the growing popularity of human mobility studies that collect GPS location data, the problem of determining the minimum required length of GPS monitoring has not been addressed in the current statistical literature. In this paper we tackle this problem by laying out a theoretical framework for assessing the temporal stability of human mobility based on GPS location data. We define several measures of the temporal dynamics of human spatiotemporal trajectories based on the average velocity process, and on activity distributions in a spatial observation window. We demonstrate the use of our methods with data that comprise the GPS locations of 185 individuals over the course of 18 months. Our empirical results suggest that GPS monitoring should be performed over periods of time that are significantly longer than what has been previously suggested. Furthermore, we argue that GPS study designs should take into account demographic groups.

Keywords: Density estimation; global positioning systems (GPS); human mobility; spatiotemporal trajectories; temporal dynamics

A statistical inference course based on p-values

Introductory statistical inference texts and courses treat the point estimation, hypothesis testing, and interval estimation problems separately, with primary emphasis on large-sample approximations. Here I present an alternative approach to teaching this course, built around p-values, emphasizing provably valid inference for all sample sizes. Details about computation and marginalization are also provided, with several illustrative examples, along with a course outline.
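
One way to see how a single p-value function can serve testing, interval estimation, and point estimation at once is to invert it over a grid of candidate parameter values. The sketch below is only an illustration and is not taken from the course: it assumes a normal mean with known standard deviation, and the function names, grid step, and search width are hypothetical choices.

```python
import math
from statistics import NormalDist


def p_value(theta, sample, sigma=1.0):
    """Two-sided p-value for H0: mean = theta, with known sigma (assumed model)."""
    n = len(sample)
    z = (sum(sample) / n - theta) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def interval_from_p_values(sample, alpha=0.05, sigma=1.0,
                           grid_step=0.001, half_width=5.0):
    """Keep every grid value of theta whose p-value exceeds alpha."""
    mean = sum(sample) / len(sample)
    steps = int(half_width / grid_step)
    kept = [
        mean + k * grid_step
        for k in range(-steps, steps + 1)
        if p_value(mean + k * grid_step, sample, sigma) > alpha
    ]
    return min(kept), max(kept)


sample = [0.5, 1.5, 0.8, 1.2]  # illustrative data with mean 1.0
lower, upper = interval_from_p_values(sample)
```

The value of theta with the largest p-value is the sample mean, which serves as the point estimate, and the retained range approximates the usual 95% interval of the mean plus or minus 1.96 standard errors.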
