Featured Research

Other Statistics

Removing Gaussian Noise by Optimization of Weights in Non-Local Means

A new image denoising algorithm for the additive white Gaussian noise model is presented. Like the non-local means method, the filter is based on a weighted average of the observations in a neighborhood, with weights depending on the similarity of local patches. In contrast to the non-local means filter, however, instead of using a fixed Gaussian kernel we propose to choose the weights by minimizing a tight upper bound on the mean squared error. This approach makes it possible to define weights adapted to the function at hand, mimicking the weights of the oracle filter. Under some regularity conditions on the target image, we show that the obtained estimator converges at the usual optimal rate. The proposed algorithm is parameter free in the sense that it automatically calculates the bandwidth of the smoothing kernel; it is fast and its implementation is straightforward. The performance of the new filter is illustrated by numerical simulations.
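As a rough companion to this abstract, the sketch below implements the fixed-Gaussian-kernel baseline that the paper's weight optimization is contrasted with: each pixel is replaced by a weighted average of its neighbours, with weights decaying in the squared distance between local patches. The function name, patch and search sizes, and the bandwidth h are illustrative choices, not taken from the paper.

```python
# Minimal non-local means sketch with a fixed Gaussian kernel on patch
# distances -- the baseline that the paper's weight optimization replaces.
# Names and the bandwidth parameter h are illustrative, not from the paper.
import numpy as np

def nlm_denoise(img, patch=3, search=7, h=0.1):
    """Denoise a 2D image by weighted averaging of pixels whose local
    patches are similar (fixed Gaussian kernel of bandwidth h)."""
    pad = patch // 2
    s = search // 2
    padded = np.pad(img, pad + s, mode="reflect")
    out = np.zeros_like(img, dtype=float)
    rows, cols = img.shape
    for i in range(rows):
        for j in range(cols):
            ic, jc = i + pad + s, j + pad + s          # centre in padded coords
            ref = padded[ic - pad:ic + pad + 1, jc - pad:jc + pad + 1]
            weights, values = [], []
            for di in range(-s, s + 1):
                for dj in range(-s, s + 1):
                    ii, jj = ic + di, jc + dj
                    cand = padded[ii - pad:ii + pad + 1, jj - pad:jj + pad + 1]
                    d2 = np.mean((ref - cand) ** 2)    # patch dissimilarity
                    weights.append(np.exp(-d2 / h ** 2))
                    values.append(padded[ii, jj])
            w = np.asarray(weights)
            out[i, j] = np.dot(w, values) / w.sum()
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = np.outer(np.linspace(0, 1, 32), np.linspace(0, 1, 32))
    noisy = clean + rng.normal(0, 0.05, clean.shape)   # additive Gaussian noise
    denoised = nlm_denoise(noisy)
    print("noisy MSE:   ", np.mean((noisy - clean) ** 2))
    print("denoised MSE:", np.mean((denoised - clean) ** 2))
```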

Read more
Other Statistics

Replacing P values with frequentist posterior probabilities - as possible parameter values must have uniform base-rate prior probabilities by definition in a random sampling model

Possible parameter values in a random sampling model are shown by definition to have uniform base-rate prior probabilities. This allows a frequentist posterior probability distribution to be calculated for such possible parameter values conditional solely on the actual study observations. If the likelihood probability distribution of a random selection is modelled with a symmetrical continuous function, then the frequentist posterior probability of a value equal to or more extreme than the null hypothesis equals the P-value; otherwise the P-value is an approximation. An idealistic probability of replication, based on an assumption of perfect methodological reproducibility, can be used as the upper bound of a realistic probability of replication that may be affected by various confounding factors. Bayesian distributions can be combined with these frequentist distributions. The idealistic frequentist posterior probability of replication may be easier than the P-value for non-statisticians to understand and to interpret.
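The correspondence claimed here can be checked numerically in a standard textbook setting. The sketch below assumes a normal likelihood with known variance and a flat (uniform) prior on the mean, under which the one-sided P-value for H0: mu <= 0 coincides with the posterior probability that mu <= 0; it illustrates the stated equivalence and is not the paper's derivation.

```python
# Hedged illustration (not the paper's derivation): with a normal likelihood of
# known variance and a flat prior on the mean, the one-sided P-value for
# H0: mu <= 0 coincides with the posterior probability P(mu <= 0 | data).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
sigma, n = 1.0, 25
x = rng.normal(0.3, sigma, n)            # observations from an unknown mean
xbar, se = x.mean(), sigma / np.sqrt(n)

# Frequentist one-sided P-value for H0: mu <= 0 (test statistic z = xbar / se).
p_value = norm.sf(xbar / se)             # P(Z >= z | mu = 0)

# Posterior under a flat prior: mu | data ~ Normal(xbar, se^2).
posterior_tail = norm.cdf(0.0, loc=xbar, scale=se)   # P(mu <= 0 | data)

print(f"one-sided P-value:         {p_value:.6f}")
print(f"posterior P(mu <= 0|data): {posterior_tail:.6f}")
```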

Read more
Other Statistics

Replication, Communication, and the Population Dynamics of Scientific Discovery

Many published research results are false, and controversy continues over the roles of replication and publication policy in improving the reliability of research. Addressing these problems is frustrated by the lack of a formal framework that jointly represents hypothesis formation, replication, publication bias, and variation in research quality. We develop a mathematical model of scientific discovery that combines all of these elements. This model provides both a dynamic model of research and a formal framework for reasoning about the normative structure of science. We show that replication may serve as a ratchet that gradually separates true hypotheses from false, but the same factors that make initial findings unreliable also make replications unreliable. The most important factors in improving the reliability of research are the rate of false positives and the base rate of true hypotheses, and we offer suggestions for addressing each. Our results also bring clarity to verbal debates about the communication of research. Surprisingly, publication bias is not always an obstacle and may instead have positive impacts: suppression of negative novel findings is often beneficial. We also find that communication of negative replications may aid true discovery even when attempts to replicate have diminished power. The model speaks constructively to ongoing debates about the design and conduct of science, focusing analysis and discussion on precise, internally consistent models, as well as highlighting the importance of population dynamics.
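A toy simulation can make the "ratchet" intuition concrete. The sketch below is not the paper's model: it simply assumes hypotheses are true at a base rate, that each study returns a positive with probability equal to the power when the hypothesis is true and to the false-positive rate when it is false, and it compares the share of true claims after one positive study with the share after a successful replication.

```python
# Toy simulation (not the paper's model): requiring a replication of an
# initial positive acts as a ratchet that raises the share of true claims.
import numpy as np

rng = np.random.default_rng(2)
b, alpha, power = 0.1, 0.05, 0.8        # base rate, false-positive rate, power
n = 100_000

truth = rng.random(n) < b
first = np.where(truth, rng.random(n) < power, rng.random(n) < alpha)
second = np.where(truth, rng.random(n) < power, rng.random(n) < alpha)

ppv_single = truth[first].mean()                   # PPV after one positive
ppv_replicated = truth[first & second].mean()      # PPV after a replication

print(f"PPV after one positive study:       {ppv_single:.3f}")
print(f"PPV after a successful replication: {ppv_replicated:.3f}")
```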

Read more
Other Statistics

Reproducible Research: A Retrospective

Rapid advances in computing technology over the past few decades have spurred two extraordinary phenomena in science: large-scale and high-throughput data collection coupled with the creation and implementation of complex statistical algorithms for data analysis. Together, these two phenomena have brought about tremendous advances in scientific discovery but have also raised two serious concerns, one relatively new and one quite familiar. The complexity of modern data analyses raises questions about the reproducibility of the analyses, meaning the ability of independent analysts to re-create the results claimed by the original authors using the original data and analysis techniques. While seemingly a straightforward concept, reproducibility of analyses is typically thwarted by the lack of availability of the data and computer code that were used in the analyses. A much more general concern is the replicability of scientific findings, which concerns the frequency with which scientific claims are confirmed by completely independent investigations. While the concepts of reproducibility and replicability are related, it is worth noting that they are focused on quite different goals and address different aspects of scientific progress. In this review, we will discuss the origins of reproducible research, characterize the current status of reproducibility in public health research, and connect reproducibility to current concerns about replicability of scientific findings. Finally, we describe a path forward for improving both the reproducibility and replicability of public health research in the future.

Read more
Other Statistics

Resolving the Lord's Paradox

An explanation of Lord's paradox using ordinary least squares regression models is given. It is not a paradox at all if the regression parameters are interpreted as predictive, or as causal under stricter conditions, and the laws of averages are kept in mind. As a solution, we derive a super-model from a given sub-model when its residuals can be modelled with other potential predictors.
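A small simulated example (with an invented data-generating process, not taken from the paper) shows how the two standard analyses in Lord's paradox disagree: comparing mean change finds no group effect, while the ordinary least squares model that adjusts for baseline reports a clear group coefficient.

```python
# Hedged numerical illustration of Lord's paradox; the data-generating process
# is invented for illustration. Two groups differ in baseline weight but
# neither gains weight on average: the change-score analysis finds no group
# effect, while OLS adjusting for baseline (ANCOVA) reports a group effect.
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
group = np.repeat([0, 1], n)                       # two dining halls / groups
base_mean = np.where(group == 0, 60.0, 70.0)
baseline = rng.normal(base_mean, 10.0)
# Final weight regresses toward each group's own mean: average gain is zero.
final = 0.5 * baseline + 0.5 * base_mean + rng.normal(0.0, 5.0, 2 * n)

# Statistician 1: compare mean weight change between groups.
gain = final - baseline
print("mean gain, group 0:", gain[group == 0].mean().round(2))
print("mean gain, group 1:", gain[group == 1].mean().round(2))

# Statistician 2: OLS of final on baseline and group (ANCOVA).
X = np.column_stack([np.ones(2 * n), baseline, group])
coef, *_ = np.linalg.lstsq(X, final, rcond=None)
print("ANCOVA group coefficient:", coef[2].round(2))   # clearly nonzero
```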

Read more
Other Statistics

Resolving the induction problem: Can we state with complete confidence via induction that the sun rises forever?

Induction is a form of reasoning from the particular example to the general rule. However, establishing the truth of a general proposition is problematic, because it is always possible for a conflicting observation to occur. This problem is known as the induction problem. The sunrise problem is a quintessential example of the induction problem, first introduced by Laplace (1814). In Laplace's solution, however, a zero probability is assigned to the proposition that the sun will rise forever, regardless of the number of observations made. It has therefore often been stated that complete confidence regarding a general proposition can never be attained via induction. In this study, we attempted to overcome this skepticism by using a recently developed, theoretically consistent procedure. The findings demonstrate that through induction one can rationally gain complete confidence in propositions based on scientific theory.
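The classical difficulty referred to here can be stated in a few lines. Under Laplace's uniform prior on the daily sunrise probability, observing n consecutive sunrises yields a Beta(n+1, 1) posterior, so the probability of one more sunrise is (n+1)/(n+2), yet the probability that the sun rises on all of the next k days is (n+1)/(n+k+1), which tends to zero as k grows. The snippet below merely evaluates these formulas; it does not reproduce the paper's proposed resolution.

```python
# Laplace's rule of succession: after n observed sunrises under a uniform
# prior, the posterior on the sunrise probability is Beta(n+1, 1), so
# P(next sunrise) = (n+1)/(n+2) and P(next k sunrises) = (n+1)/(n+k+1) -> 0.
n = 10_000                                    # consecutive observed sunrises

p_next = (n + 1) / (n + 2)
print(f"P(sun rises tomorrow) = {p_next:.6f}")

for k in (10, 10_000, 10_000_000):
    p_all_k = (n + 1) / (n + k + 1)           # E[p^k] under Beta(n+1, 1)
    print(f"P(sun rises on the next {k:>10,d} days) = {p_all_k:.6f}")
```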

Read more
Other Statistics

Restoring a smooth function from its noisy integrals

Numerical (and experimental) data analysis often requires the restoration of a smooth function from a set of sampled integrals over finite bins. We present the bin hierarchy method that efficiently computes the maximally smooth function from the sampled integrals using essentially all the information contained in the data. We perform extensive tests with different classes of functions and levels of data quality, including Monte Carlo data suffering from a severe sign problem and physical data for the Green's function of the Fröhlich polaron.
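For readers who want a concrete, if much simpler, stand-in for this task, the sketch below reconstructs a smooth function from noisy bin integrals by least-squares fitting a low-order polynomial whose exact bin integrals are matched to the data. It is not the bin hierarchy method; the test function, degree, and noise level are illustrative choices.

```python
# Hedged sketch (not the paper's bin hierarchy method): recover a smooth
# function from noisy integrals over bins by least-squares fitting a
# polynomial whose exact bin integrals match the data.
import numpy as np

rng = np.random.default_rng(4)

def f(x):                                    # "unknown" smooth ground truth
    return np.sin(2 * np.pi * x) + 0.5 * x

def F(x):                                    # antiderivative of f
    return -np.cos(2 * np.pi * x) / (2 * np.pi) + 0.25 * x ** 2

# Noisy integrals of f over 20 equal bins on [0, 1].
edges = np.linspace(0.0, 1.0, 21)
y = F(edges[1:]) - F(edges[:-1]) + rng.normal(0.0, 0.001, 20)

# Exact bin integrals of the monomial basis x^j: (b^(j+1) - a^(j+1)) / (j + 1).
degree = 7
M = np.column_stack([
    (edges[1:] ** (j + 1) - edges[:-1] ** (j + 1)) / (j + 1)
    for j in range(degree + 1)
])
coef, *_ = np.linalg.lstsq(M, y, rcond=None)

# Evaluate the reconstruction against the truth on a fine grid.
x = np.linspace(0.0, 1.0, 200)
recon = sum(c * x ** j for j, c in enumerate(coef))
print("max abs reconstruction error:", float(np.max(np.abs(recon - f(x)))))
```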

Read more
Other Statistics

Rethinking probabilistic prediction in the wake of the 2016 U.S. presidential election

To many statisticians and citizens, the outcome of the most recent U.S. presidential election represents a failure of data-driven methods on the grandest scale. This impression has led to much debate and discussion about how the election predictions went awry -- Were the polls inaccurate? Were the models wrong? Did we misinterpret the probabilities? -- and how they went right -- Perhaps the analyses were correct even though the predictions were wrong, that's just the nature of probabilistic forecasting. With this in mind, we analyze the election outcome with respect to a core set of effectiveness principles. Regardless of whether and how the election predictions were right or wrong, we argue that they were ineffective in conveying the extent to which the data was informative of the outcome and the level of uncertainty in making these assessments. Among other things, our analysis sheds light on the shortcomings of the classical interpretations of probability and its communication to consumers in the form of predictions. We present here an alternative approach, based on a notion of validity, which offers two immediate insights for predictive inference. First, the predictions are more conservative, arguably more realistic, and come with certain guarantees on the probability of an erroneous prediction. Second, our approach easily and naturally reflects the (possibly substantial) uncertainty about the model by outputting plausibilities instead of probabilities. Had these simple steps been taken by the popular prediction outlets, the election outcome may not have been so shocking.

Read more
Other Statistics

Revealing Sub-Optimality Conditions of Strategic Decisions

Conceptual views of fitness and the measurement of the fitness of strategic decisions on information systems, technological systems, and innovation have become more important in recent years. This paper identifies dynamics of the fitness landscape that lead decision makers to terminate their search before reaching the global maximum in strategic decisions. These dynamics are specified according to management decision-making models and supported by simulation results, which are reported in terms of "Fitness Value" and "Probability of Optimality". The correlation between these two measures is notable for revealing optimal values under innovative, research-based decision-making approaches alongside the sub-optimal results of traditional decision-making approaches.
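The premature-termination dynamic described here can be illustrated with a toy landscape (parameters and landscape invented for illustration, not drawn from the paper): greedy hill-climbing on a rugged fitness landscape usually stops at a local maximum, so one can track both the average fitness value reached and the probability of reaching the global optimum.

```python
# Toy simulation of search termination at local maxima on a rugged landscape.
import numpy as np

rng = np.random.default_rng(6)
landscape = rng.random(200)                # rugged one-dimensional fitness
global_max = landscape.argmax()

def hill_climb(start):
    """Move to the better neighbour until no neighbour improves fitness."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(landscape)]
        best = max(neighbours, key=lambda p: landscape[p])
        if landscape[best] <= landscape[pos]:
            return pos                     # local maximum: search terminates
        pos = best

finals = np.array([hill_climb(s) for s in range(len(landscape))])
print("mean fitness value reached:", landscape[finals].mean().round(3))
print("probability of optimality: ", (finals == global_max).mean().round(3))
```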

Read more
Other Statistics

Revealing the Beauty behind the Sleeping Beauty Problem

A large number of essays address the Sleeping Beauty problem, which is claimed to undermine the validity of Bayesian inference and Bas Van Fraassen's 'Reflection Principle'. In this study a straightforward analysis of the problem based on probability theory is presented. The key difference from previous works is that, apart from the random experiment imposed by the problem's description, a different one is also considered, in order to dispel the confusion about the conditional probabilities involved. The results of the analysis indicate that no inconsistency arises, and that both Bayesian inference and the 'Reflection Principle' remain valid.
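A Monte Carlo sketch (not necessarily the construction used in the paper) makes the distinction between the two sampling experiments explicit: the long-run fraction of experiment runs in which the coin lands heads differs from the long-run fraction of awakenings that occur under heads.

```python
# Simulate the Sleeping Beauty protocol and contrast two sampling experiments:
# picking a coin toss at random versus picking an awakening at random.
import numpy as np

rng = np.random.default_rng(5)
runs = 1_000_000
heads = rng.random(runs) < 0.5            # fair coin toss per experiment run

# Per run: 1 awakening on heads (Monday), 2 on tails (Monday and Tuesday).
awakenings_per_run = np.where(heads, 1, 2)

p_heads_per_run = heads.mean()
p_heads_per_awakening = heads.sum() / awakenings_per_run.sum()

print(f"fraction of runs with heads:       {p_heads_per_run:.3f}")       # ~1/2
print(f"fraction of awakenings with heads: {p_heads_per_awakening:.3f}") # ~1/3
```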

Read more
