Featured Research

Other Statistics

Comparison of Clinical Episode Outcomes between Bundled Payments for Care Improvement (BPCI) Initiative Participants and Non-Participants

Objective: To evaluate differences in major outcomes between Bundled Payments for Care Improvement (BPCI) participating providers and non-participating providers for both Major Joint Replacement of the Lower Extremity (MJRLE) and Acute Myocardial Infarction (AMI) episodes. Methods: A difference-in-differences approach estimated the differential change in outcomes between the baseline (January 2011 through September 2013) and intervention (October 2013 through December 2016) periods for Medicare beneficiaries who had an MJRLE or AMI episode at a BPCI-participating hospital, relative to beneficiaries with the same type of episode at a matched comparison hospital. Main Outcomes and Measures: Medicare payments, length of stay (LOS), and readmissions during the episode, which includes the anchor hospitalization and the 90-day post-discharge period. Results: Mean total Medicare payments for an MJRLE episode (anchor hospitalization plus the 90-day post-discharge period) declined $444 more (p < 0.0001) for Medicare beneficiaries whose episodes were initiated at a BPCI-participating provider than for beneficiaries at a comparison provider. This reduction was driven mainly by reduced institutional post-acute care (PAC) payments. Slight reductions in carrier payments and LOS were also estimated. Readmission rates were not statistically different between the BPCI and comparison populations. These findings suggest that PAC use can be reduced without adverse effects on recovery from MJRLE. The lack of statistically significant differences in effects for AMI could be explained by a smaller sample size or more heterogeneous recovery paths in AMI. Conclusions: Our findings suggest that, as currently designed, bundled payments can be effective in reducing payments for MJRLE episodes of care, but not necessarily for AMI. Most savings came from declines in PAC. These findings are consistent with the results reported in the BPCI model evaluation for CMS.
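To make the estimation strategy concrete, here is a minimal sketch of a difference-in-differences regression for episode payments in R. The data frame and variable names (episodes, payment, bpci, post) are illustrative assumptions, not fields from the study's actual claims data.

# Illustrative difference-in-differences regression for episode payments.
# 'episodes' is a hypothetical data frame with one row per episode:
#   payment - total Medicare episode payment (anchor stay plus 90-day post-discharge period)
#   bpci    - 1 if the episode was initiated at a BPCI-participating hospital, 0 for a matched comparison hospital
#   post    - 1 for the intervention period (Oct 2013 - Dec 2016), 0 for the baseline period
did_fit <- lm(payment ~ bpci * post, data = episodes)
summary(did_fit)
# The coefficient on bpci:post is the difference-in-differences estimate of the
# differential change in payments for BPCI episodes relative to comparison episodes.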

Other Statistics

Comparison of plotting system outputs in beginner analysts

The R programming language is built on an ecosystem of packages, some of which allow analysts to accomplish the same tasks. For example, there are at least two clear workflows for creating data visualizations in R: using the base graphics package (referred to as "base R") and the ggplot2 add-on package based on the grammar of graphics. Here we perform an empirical study of the quality of scientific graphics produced by beginning R users. In our experiment, learners taking a data science course on the Coursera platform were randomized to complete identical plotting exercises in either the base R or the ggplot2 system. Learners were then asked to evaluate their peers in terms of visual characteristics key to scientific cognition. We observed that graphics created with the two systems were rated similarly on many characteristics. However, ggplot2 graphics were generally judged to be more visually pleasing and, in the case of faceted scientific plots, easier to understand. Our results suggest that while both graphics systems are useful in the hands of beginning users, ggplot2's built-in faceting system may make it easier for beginning users to display more complex relationships.
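As an illustration of the two workflows being compared (not the course's actual exercises), the same faceted scatterplot can be drawn in both systems using the built-in mtcars data:

library(ggplot2)

# Base R: one panel per number of cylinders, laid out manually.
cyl_levels <- sort(unique(mtcars$cyl))
par(mfrow = c(1, length(cyl_levels)))
for (k in cyl_levels) {
  with(subset(mtcars, cyl == k),
       plot(wt, mpg, main = paste("cyl =", k), xlab = "Weight", ylab = "MPG"))
}
par(mfrow = c(1, 1))

# ggplot2: the same display using its built-in faceting.
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~ cyl) +
  labs(x = "Weight", y = "MPG")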

Other Statistics

Complementary Lipschitz continuity results for the distribution of intersections or unions of independent random sets in finite discrete spaces

We prove that intersections and unions of independent random sets in finite spaces achieve a form of Lipschitz continuity. More precisely, given the distribution of a random set Ξ, the function mapping any random set distribution to the distribution of its intersection (under an independence assumption) with Ξ is Lipschitz continuous with unit Lipschitz constant if the space of random set distributions is endowed with a metric defined as the L_k-norm distance between inclusion functionals, also known as commonalities. Moreover, the function mapping any random set distribution to the distribution of its union (under an independence assumption) with Ξ is Lipschitz continuous with unit Lipschitz constant if the space of random set distributions is endowed with a metric defined as the L_k-norm distance between hitting functionals, also known as plausibilities. Using the epistemic random set interpretation of belief functions, we also discuss the ability of these distances to yield conflict measures. All the proofs in this paper are derived in the framework of Dempster-Shafer belief functions. Apart from the discussion of conflict measures, it is straightforward to transcribe the proofs into general (not necessarily epistemic) random set terminology.
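To make the metrics concrete, here is a sketch in our own notation (the abstract does not fix a normalization, so details may differ from the paper). For random set distributions P and Q on a finite space Ω, with commonality (inclusion) functional q_P(A) = P(A ⊆ Ξ) and plausibility (hitting) functional pl_P(A) = P(A ∩ Ξ ≠ ∅), the distance underlying the intersection result is

d_k(P, Q) = ( Σ_{A ⊆ Ω} | q_P(A) − q_Q(A) |^k )^{1/k},

and the union result uses the same L_k form with pl in place of q.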

Other Statistics

Computing the Expected Value of Sample Information Efficiently: Expertise and Skills Required for Four Model-Based Methods

Objectives: Value of information (VOI) analyses can help policy-makers make informed decisions about whether to conduct, and how to design, future studies. Historically, a computationally expensive method to compute the Expected Value of Sample Information (EVSI) restricted the use of VOI to simple decision models and study designs. Recently, four EVSI approximation methods have made such analyses more feasible and accessible. We provide practical recommendations for analysts computing EVSI by evaluating these novel methods. Methods: Members of the Collaborative Network for Value of Information (ConVOI) compared the inputs, analyst expertise and skills, and software required by the four recently developed approximation methods. Information was also collected on the strengths and limitations of each approximation method. Results: All four EVSI methods require a decision-analytic model's probabilistic sensitivity analysis (PSA) output. One of the methods also requires the model to be re-run to obtain new PSA outputs for each EVSI estimation. To compute EVSI, analysts must be familiar with at least one of the following skills: advanced regression modeling, likelihood specification, and Bayesian modeling. All methods have different strengths and limitations; for example, some methods handle the evaluation of study designs with many outcomes more efficiently, while others quantify the uncertainty in their EVSI estimates. All methods are programmed in the statistical language R, and two of the methods provide online applications. Conclusion: Our paper helps to inform the choice between four efficient EVSI estimation methods, enabling analysts to assess the methods' strengths and limitations and select the most appropriate EVSI method given their situation and skills.
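For reference, the quantity that all four methods approximate can be written in standard notation (not specific to any one method): with NB(d, θ) the net benefit of decision option d under model parameters θ, and X the data the proposed study would generate,

EVSI = E_X [ max_d E_{θ|X} NB(d, θ) ] − max_d E_θ NB(d, θ).

Evaluating the inner posterior expectation by brute force requires nested Monte Carlo simulation over the decision model, which is what makes the naive approach computationally expensive and what the approximation methods, working from PSA output, are designed to avoid.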

Other Statistics

Conditional Visualization for Statistical Models: An Introduction to the condvis Package in R

The condvis package provides interactive visualization of sections in data space, showing fitted models on a section together with the observed data near it. The primary goals are the interpretation of complex models and showing how the observed data support the fitted model. There is a video accompaniment to this paper available at this https URL. This is a preprint version of an article to appear in the Journal of Statistical Software.
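A minimal usage sketch follows. It assumes the package's ceplot() entry point with data, model, and sectionvars arguments; these argument names are assumptions on our part and should be checked against the package documentation rather than read as a definitive API.

# Illustrative condvis session: fit a model, then explore sections of data space.
library(condvis)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)           # any fitted model
ceplot(data = mtcars, model = fit, sectionvars = "wt")    # interactive section plot;
# the remaining predictors become condition variables that can be adjusted interactively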

Other Statistics

Conditional quantile estimation through optimal quantization

In this paper, we use quantization to construct a nonparametric estimator of conditional quantiles of a scalar response Y given a d-dimensional vector of covariates X. First we focus on the population level and show how optimal quantization of X, which consists of discretizing X by projecting it onto an appropriate grid of N points, allows us to approximate conditional quantiles of Y given X. We show that this approximation becomes arbitrarily good as N goes to infinity and provide a rate of convergence for the approximation error. Then we turn to the sample case and define an estimator of conditional quantiles based on quantization ideas. We prove that this estimator is consistent for its fixed-N population counterpart. The results are illustrated on a numerical example. Dominance of our estimators over local constant/linear ones and nearest-neighbor ones is demonstrated through extensive simulations in the companion paper Charlier et al. (2014b).
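A toy R sketch of the underlying idea (not the paper's exact estimator): discretize X on a grid of N points and estimate a conditional quantile of Y by the empirical quantile within the quantization cell containing the point of interest. For simplicity the grid below comes from plain k-means rather than true optimal quantization, and d = 1.

# Toy quantization-based conditional quantile estimator (illustrative only).
set.seed(1)
n <- 1000
x <- runif(n)                                   # scalar covariate (d = 1)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

N <- 15                                                   # number of grid points
grid <- as.vector(kmeans(x, centers = N)$centers)         # stand-in for an optimal quantization grid
cell <- apply(abs(outer(x, grid, "-")), 1, which.min)     # project each x onto its nearest grid point

cond_quantile <- function(x0, alpha = 0.5) {
  j <- which.min(abs(grid - x0))                # cell containing x0
  quantile(y[cell == j], probs = alpha)         # empirical quantile of Y within that cell
}
cond_quantile(0.25, alpha = 0.9)                # estimated 90% conditional quantile at x = 0.25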

Other Statistics

Conducting Highly Principled Data Science: A Statistician's Job and Joy

Highly Principled Data Science insists on methodologies that are: (1) scientifically justified, (2) statistically principled, and (3) computationally efficient. An astrostatistics collaboration, together with some reminiscences, illustrates the increased roles statisticians can and should play to ensure this trio, and to advance the science of data along the way.

Other Statistics

Confidence biases and learning among intuitive Bayesians

We design a double-or-quits game to compare the speed of learning one's specific ability with the speed of rising confidence as the task gets increasingly difficult. We find that people on average learn to be overconfident faster than they learn their true ability, and we present an intuitive-Bayesian model of confidence that integrates confidence biases and learning. Uncertainty about one's true ability to perform a task in isolation can be responsible for large and stable confidence biases, namely limited discrimination, the hard-easy effect, the Dunning-Kruger effect, conservative learning from experience, and the overprecision phenomenon (without underprecision), if subjects act as Bayesian learners who rely only on sequentially perceived performance cues and contrarian illusory signals induced by doubt. Moreover, these biases are likely to persist, since the Bayesian aggregation of past information consolidates the accumulation of errors, and the perception of contrarian illusory signals generates conservatism and under-reaction to events. Taken together, these two features may explain why intuitive Bayesians make systematically wrong predictions of their own performance.

Other Statistics

Conjectures on Optimal Nested Generalized Group Testing Algorithm

Consider a finite population of N items, where item i has probability p_i of being defective. The goal is to identify all items by means of group testing. This is the generalized group testing problem (hereafter GGTP). In the case p_1 = ⋯ = p_N = p, [YH1990] proved that the pairwise testing algorithm is the optimal nested algorithm, with respect to the expected number of tests, for all N if and only if p ∈ [1 − 1/√2, (3 − √5)/2] (the R-range hereafter), being one of the optimal algorithms at the boundary values. In this note, we present a result that helps to define the generalized pairwise testing algorithm (hereafter GPTA) for the GGTP. We present two conjectures: (1) when all p_i, i = 1, …, N, belong to the R-range, GPTA is the optimal procedure among nested procedures applied to the p_i in nondecreasing order; (2) if all p_i, i = 1, …, N, belong to the R-range, GPTA is the optimal nested procedure, i.e., it minimizes the expected total number of tests over all possible testing orders in the class of nested procedures. Although these conjectures are logically reasonable, we were only able to empirically verify the first one up to a particular level of N. We also provide a short survey of GGTP.
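For reference, the numerical endpoints of the R-range quoted above are easy to compute directly:

# Endpoints of the R-range for the defect probability p.
lower <- 1 - 1 / sqrt(2)     # approximately 0.2929
upper <- (3 - sqrt(5)) / 2   # approximately 0.3820
c(lower = lower, upper = upper)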

Other Statistics

Consider avoiding the .05 significance level

It is suggested that some shortcomings of Null Hypothesis Significance Testing (NHST), viewed from the perspective of Bayesian statistics, turn benign once the traditional threshold p value of .05 is substituted by a sufficiently smaller value. To illustrate, the posterior probability of H0 stating P = .5, given data that just render it rejected by NHST with a p value of .05 (and a uniform prior), is shown here to be not much smaller than .50 for most values of N below 100 (and even exceeds .50 for N >= 100); in contrast, with a p value of .001 the posterior probability does not exceed .06 for N <= 100 (nor .25 for N < 9000). More interestingly, the posterior probability becomes quite independent of N with a p value of .0001, hence practically satisfying the alpha postulate, set by Cornfield (1966) as the condition for the p value to be a measure of evidence in itself. In view of the low prospect that most researchers will soon convert to using Bayesian statistics in any form, we suggest that researchers who elect the conservative option of resorting to NHST be encouraged to avoid as much as possible using a p value of .05 as a threshold for rejecting H0. The analysis presented here may be used to discuss afresh which threshold p value seems to be a reasonable, practical substitute.
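This kind of calculation is easy to reproduce in R under the stated assumptions (a point null P = .5 with prior probability .5 and a uniform prior on P under the alternative). The sketch below is ours, not code from the paper, and because the binomial test is discrete the achieved p value at the rejection point falls below the nominal threshold, so the numbers will not match the paper's exactly.

# Posterior probability of H0: P = .5 for binomial data that are just rejected
# at a given two-sided significance threshold (uniform prior on P under H1,
# prior odds 1:1). Illustrative sketch only.
posterior_h0 <- function(N, alpha = 0.05) {
  k <- N                                            # start from the most extreme count
  while (binom.test(k - 1, N, p = 0.5)$p.value <= alpha) k <- k - 1
  # k is now the smallest count still rejected at level alpha
  m0 <- dbinom(k, N, 0.5)                           # likelihood of the data under H0
  m1 <- 1 / (N + 1)                                 # marginal likelihood under a uniform prior on P
  m0 / (m0 + m1)                                    # posterior probability of H0
}
posterior_h0(50, alpha = 0.05)    # posterior P(H0) at the .05 rejection threshold
posterior_h0(50, alpha = 0.001)   # posterior P(H0) at the .001 rejection threshold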

