Featured Research

Other Statistics

Linear Regression under Special Relativity

This study investigates the problem posed by using ordinary least squares (OLS) to estimate the parameters of simple linear regression in a special-relativity setting, where the independent variable is restricted to the open interval (-c, c). The OLS estimate of the slope coefficient is found not to be invariant under the Lorentz velocity transformation. Accordingly, an alternative estimator for the parameters of linear regression under special relativity is proposed. This estimator can be considered a generalization of the OLS estimator: as c approaches infinity, the proposed estimator and its variance converge to the OLS estimator and its variance, respectively. The variance of the proposed estimator is larger than that of the OLS estimator, which implies that hypothesis testing with the OLS estimator and its variance may yield a liberal test under special relativity.
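
The non-invariance claim can be illustrated with a minimal sketch. It does not implement the estimator proposed in the paper; it only assumes the standard relativistic velocity-addition formula and compares the OLS slope before and after transforming the regressor. The function names `lorentz_velocity` and `ols_slope`, the toy data, and the choices c = 1 and u = 0.5 are all hypothetical.

```python
def lorentz_velocity(v, u, c):
    """Velocity v as measured from a frame moving at velocity u
    (standard relativistic velocity addition; assumes |v|, |u| < c)."""
    return (v - u) / (1.0 - v * u / c**2)


def ols_slope(xs, ys):
    """Ordinary least squares slope of y regressed on x (with intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical toy data: velocities in (-c, c) with an exactly linear response.
c = 1.0
u = 0.5
xs = [-0.8, -0.4, 0.0, 0.4, 0.8]
ys = [2.0 * x + 1.0 for x in xs]

slope_original = ols_slope(xs, ys)
# A Galilean shift x -> x - u leaves the OLS slope unchanged.
slope_galilean = ols_slope([x - u for x in xs], ys)
# The Lorentz velocity transformation is nonlinear in x, so the slope changes.
slope_lorentz = ols_slope([lorentz_velocity(x, u, c) for x in xs], ys)
# With a very large c the Lorentz transformation approaches the Galilean shift.
slope_large_c = ols_slope([lorentz_velocity(x, u, 1e6) for x in xs], ys)

print(slope_original, slope_galilean, slope_lorentz, slope_large_c)
```

In this toy setting the slope survives a Galilean shift but not the Lorentz velocity transformation, and the discrepancy vanishes as c grows, which mirrors the limiting behavior described in the abstract.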

Liquid Scorecards

Traditional credit scorecards are generalized additive models (GAMs) with step functions as the component functions. The shapes of the step functions may be constrained in order to satisfy the PILE (Palatability, Interpretability, Legal, Explainability) constraints. Before 2003, FICO used Linear Programming to find the traditional scorecard that approximately maximizes divergence subject to the PILE constraints. In this paper, I introduce the Liquid Scorecard, which allows the component functions to be, at least partially, smooth curves. I use Quadratic Programming and B-Spline theory to find the Liquid Scorecard that exactly maximizes divergence subject to the PILE constraints. FICO uses aspects of this technology to develop the famous FICO Credit Score.
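
The abstract does not define divergence. As a rough sketch, the code below assumes the definition commonly used in credit scoring: the squared difference between the mean scores of "good" and "bad" accounts, divided by the average of their score variances. The function name `divergence`, the use of population variance, and the sample scores are assumptions for illustration, not the paper's method.

```python
from statistics import mean, pvariance


def divergence(good_scores, bad_scores):
    """Assumed definition: (mean_G - mean_B)^2 / ((var_G + var_B) / 2),
    using population variances of the two score samples."""
    mean_gap = mean(good_scores) - mean(bad_scores)
    avg_var = (pvariance(good_scores) + pvariance(bad_scores)) / 2.0
    return mean_gap**2 / avg_var


# Hypothetical scores for good and bad accounts.
good = [1.0, 2.0, 3.0]
bad = [0.0, 1.0, 2.0]
print(divergence(good, bad))
```

A scorecard that separates the two groups more cleanly (a larger gap between means relative to the spread) yields a larger value, which is the quantity the optimization described above seeks to maximize.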

Listwise Deletion in High Dimensions

We consider the properties of listwise deletion when both n and the number of variables grow large. We show that when (i) all data have some idiosyncratic missingness and (ii) the number of variables grows superlogarithmically in n, then, for large n, listwise deletion will drop all rows with probability 1. We present numerical illustrations to demonstrate finite-n implications. These results suggest that, in practice, using listwise deletion may mean using only a few of the variables available to the researcher, even when n is very large.
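
The finite-n intuition can be sketched under a simplifying assumption that is not taken from the paper: every cell is missing independently with the same probability q. A row with p variables is then complete with probability (1 - q)^p, and all n rows are dropped with probability (1 - (1 - q)^p)^n. The function names and the values of n, p, and q below are hypothetical.

```python
import math


def prob_row_complete(p, q):
    """Probability that a row with p variables has no missing cell,
    assuming each cell is missing independently with probability q."""
    return (1.0 - q) ** p


def prob_all_rows_dropped(n, p, q):
    """Probability that listwise deletion removes every one of n rows,
    under the same independence assumption."""
    r = prob_row_complete(p, q)
    if r >= 1.0:
        return 0.0  # no missingness: no row is ever dropped
    # (1 - r)^n computed on the log scale for numerical stability.
    return math.exp(n * math.log1p(-r))


n = 10**6
q = 0.05
for p in (10, 100, 500, 1000):
    expected_complete = n * prob_row_complete(p, q)
    print(p, expected_complete, prob_all_rows_dropped(n, p, q))
```

Because (1 - q)^p shrinks geometrically in p, the expected number of complete rows, n(1 - q)^p, tends to zero once p grows faster than log n, which is the regime the abstract describes.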

MH370 Burst Frequency Offset Analysis and Implications on Descent Rate at End-of-Flight

Malaysia Airlines flight MH370 veered off course unexpectedly during a scheduled trip from Kuala Lumpur to Beijing on the 7th of March 2014. MH370 was tracked via military radar into the Malacca Straits and, after disappearing from radar, is believed to have turned south towards the southern Indian Ocean before crashing approximately 6 hours later. This article specifically discusses the analysis of burst frequency offset (BFO) metadata from the SATCOM messages. It is shown that the BFOs corresponding to the last two SATCOM messages from the plane, at 00:19:29Z and 00:19:37Z on 8 March 2014, suggest that flight MH370 was rapidly descending and accelerating downwards when message exchange with the ground station ceased.

Machine Learning for high speed channel optimization

The design of a printed circuit board (PCB) stack-up requires consideration of characteristic impedance, insertion loss, and crosstalk. Because a PCB stack-up design has many parameters, their optimization needs to be efficient and accurate. A suboptimal stack-up would lead to expensive PCB material choices in high-speed designs. In this paper, an efficient global optimization method using parallel and intelligent Bayesian optimization is proposed for the stripline design.

Mandelbrot's 1/f fractional renewal models of 1963-67: The non-ergodic missing link between change points and long range dependence

The problem of 1/f noise has been with us for about a century. Because it is so often framed in Fourier spectral language, the most famous solutions have tended to be the stationary long-range dependent (LRD) models such as Mandelbrot's fractional Gaussian noise. In view of the increasing importance to physics of non-ergodic fractional renewal models, I present preliminary results of my research into the history of Mandelbrot's little-known work in that area from 1963-67. I speculate about how the lack of awareness of this work in the physics and statistics communities may have affected the development of complexity science, and I discuss the differences between the Hurst effect, 1/f noise, and LRD, concepts that are often treated as equivalent.

Many perspectives on Deborah Mayo's "Statistical Inference as Severe Testing: How to Get Beyond the Statistics Wars"

The new book by philosopher Deborah Mayo is relevant to data science for topical reasons, as she takes various controversial positions regarding hypothesis testing and statistical practice, and also as an entry point to thinking about the philosophy of statistics. The present article is a slightly expanded version of a series of informal reviews and comments on Mayo's book. We hope this discussion will introduce people to Mayo's ideas along with other perspectives on the topics she addresses.

Marginalization and Conditioning for LWF Chain Graphs

In this paper, we deal with the problem of marginalization over and conditioning on two disjoint subsets of the node set of chain graphs (CGs) with the LWF Markov property. For this purpose, we define the class of chain mixed graphs (CMGs) with three types of edges and, for this class, provide a separation criterion under which the class of CMGs is stable under marginalization and conditioning and contains the class of LWF CGs as its subclass. We provide a method for generating such graphs after marginalization and conditioning for a given CMG or a given LWF CG. We then define and study the class of anterial graphs, which is also stable under marginalization and conditioning and contains LWF CGs, but has a simpler structure than CMGs.

Markov Equivalences for Subclasses of Loopless Mixed Graphs

In this paper we discuss four problems regarding Markov equivalences for subclasses of loopless mixed graphs. We classify these four problems as finding conditions for internal Markov equivalence, which is Markov equivalence within a subclass, for external Markov equivalence, which is Markov equivalence between subclasses, for representational Markov equivalence, which is the possibility of a graph from a subclass being Markov equivalent to a graph from another subclass, and finding algorithms to generate a graph from a certain subclass that is Markov equivalent to a given graph. We particularly focus on the class of maximal ancestral graphs and its subclasses, namely regression graphs, bidirected graphs, undirected graphs, and directed acyclic graphs, and present novel results for representational Markov equivalence and algorithms.

Markov properties for mixed graphs

In this paper, we unify the Markov theory of a variety of different types of graphs used in graphical Markov models by introducing the class of loopless mixed graphs, and show that all independence models induced by m-separation on such graphs are compositional graphoids. We focus in particular on the subclass of ribbonless graphs, which includes as special cases undirected graphs, bidirected graphs, and directed acyclic graphs, as well as ancestral graphs and summary graphs. We define maximality of such graphs as well as a pairwise and a global Markov property. We prove that the global and pairwise Markov properties of a maximal ribbonless graph are equivalent for any independence model that is a compositional graphoid.
