Featured Research

Other Statistics

Development and Initial Validation of a Scale to Measure Instructors' Attitudes toward Concept-Based Teaching of Introductory Statistics in the Health and Behavioral Sciences

Despite more than a decade of reform efforts, students continue to experience difficulty understanding and applying statistical concepts. The predominant focus of reform has been on content, pedagogy, technology, and assessment, with little attention to instructor characteristics. However, there is strong theoretical and empirical evidence that instructors' attitudes affect the quality of teaching and learning. The objective of this study was to develop and initially validate a scale to measure instructors' attitudes toward reform-oriented (or concept-based) teaching of introductory statistics in the health and behavioral sciences at the tertiary level, referred to as FATS (Faculty Attitudes Toward Statistics). Data were obtained from 227 instructors (USA and international) and analyzed using factor analysis, multidimensional scaling, and hierarchical cluster analysis. The overall scale consists of five subscales with a total of 25 items and an overall alpha of 0.89. Construct validity was established: the overall scale and the subscales (except perceived difficulty) plausibly differentiated between low-reform and high-reform practice instructors. Statistically significant differences in attitude were observed with respect to age, but not gender, employment status, membership in professional organizations, ethnicity, highest academic qualification, or degree concentration. The scale can therefore be considered a reliable and valid measure of instructors' attitudes toward reform-oriented (concept-based or constructivist) teaching of introductory statistics in the health and behavioral sciences at the tertiary level, with the five identified dimensions shaping those attitudes. Additional studies are required to confirm these structural and psychometric properties.
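
The reported reliability statistic is Cronbach's alpha (0.89 over 25 items). As a minimal illustration of how such a coefficient is computed from an item-response matrix, here is a sketch in Python; the responses below are simulated, not the study's data, and the item count and sample size merely mirror those reported.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the total score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated responses: 227 instructors x 25 Likert items (1-5), one latent factor
rng = np.random.default_rng(0)
latent = rng.normal(size=(227, 1))
scores = np.clip(np.round(3 + latent + rng.normal(scale=0.8, size=(227, 25))), 1, 5)
print(f"alpha = {cronbach_alpha(scores):.2f}")
```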

Read more
Other Statistics

Differentially Private Exponential Random Graphs

We propose methods to release and analyze synthetic graphs in order to protect the privacy of individual relationships captured by a social network. The proposed techniques aim at fitting and estimating a wide class of exponential random graph models (ERGMs) in a differentially private manner, and thus offer rigorous privacy guarantees. More specifically, we use the randomized response mechanism to release networks under ε-edge differential privacy. To maintain utility for statistical inference, we treat the original graph as missing and propose a way to use likelihood-based inference and Markov chain Monte Carlo (MCMC) techniques to fit ERGMs to the released synthetic networks. We demonstrate the usefulness of the proposed techniques on a real-data example.
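
As a rough illustration of the release step, here is a minimal sketch of the randomized response mechanism under ε-edge differential privacy, assuming the standard per-dyad flip probability 1/(1 + e^ε); the paper's actual pipeline (likelihood-based ERGM fitting via MCMC with the original graph treated as missing) is not reproduced here.

```python
import numpy as np

def randomized_response_release(adj: np.ndarray, epsilon: float, rng=None) -> np.ndarray:
    """Release a synthetic undirected graph under epsilon-edge differential
    privacy: each dyad is flipped independently with probability 1/(1 + e^eps)."""
    rng = rng if rng is not None else np.random.default_rng()
    p_flip = 1.0 / (1.0 + np.exp(epsilon))
    iu = np.triu_indices(adj.shape[0], k=1)          # each dyad considered once
    flips = rng.random(iu[0].size) < p_flip
    released = adj.copy()
    released[iu] = np.where(flips, 1 - adj[iu], adj[iu])
    released[(iu[1], iu[0])] = released[iu]          # restore symmetry
    return released

# Example: release a small random graph at epsilon = 1
rng = np.random.default_rng(7)
g = np.triu((rng.random((20, 20)) < 0.2).astype(int), k=1)
g = g + g.T
print(int(randomized_response_release(g, 1.0, rng).sum()) // 2, "edges released")
```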

Read more
Other Statistics

Differentiating the pseudo determinant

A class of derivatives is defined for the pseudo-determinant Det(A) of a Hermitian matrix A. This class is shown to be non-empty and to have a unique, canonical member ∇Det(A) = Det(A) A⁺, where A⁺ is the Moore-Penrose pseudoinverse. The classic identity for the gradient of the determinant is thus reproduced. Examples are provided, including the maximum likelihood problem for the rank-deficient covariance matrix of the degenerate multivariate Gaussian distribution.
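
The identity ∇Det(A) = Det(A) A⁺ can be checked numerically. Below is a minimal sketch that takes the pseudo-determinant to be the product of nonzero eigenvalues and compares a finite-difference directional derivative, along a direction inside the column space of A so that the rank is preserved, against the inner product with Det(A) A⁺; the test matrices are arbitrary, not taken from the paper.

```python
import numpy as np

def pdet(A: np.ndarray, tol: float = 1e-10) -> float:
    """Pseudo-determinant: product of the nonzero eigenvalues of a Hermitian matrix."""
    eig = np.linalg.eigvalsh(A)
    return float(np.prod(eig[np.abs(eig) > tol]))

rng = np.random.default_rng(1)
B = rng.normal(size=(5, 3))
A = B @ B.T                                   # Hermitian, rank-deficient (rank 3)

# Perturbation direction E inside the column space of A, so the rank is preserved
M = rng.normal(size=(3, 3))
M = M + M.T
U = np.linalg.svd(B)[0][:, :3]                # orthonormal basis of range(A)
E = U @ M @ U.T

t = 1e-6
fd = (pdet(A + t * E) - pdet(A - t * E)) / (2 * t)   # finite-difference derivative
grad = pdet(A) * np.linalg.pinv(A)                   # canonical gradient Det(A) A^+
print(fd, np.sum(grad * E))                          # the two values should agree
```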

Read more
Other Statistics

Discussion of "Nonparametric generalized fiducial inference for survival functions under censoring"

The following discussion is inspired by the paper "Nonparametric generalized fiducial inference for survival functions under censoring" by Cui and Hannig. The discussion comments on the results, but also indicates their importance more generally in the context of fiducial inference. A two-page introduction to fiducial inference is given to provide context.

Read more
Other Statistics

Discussion of "Single and Two-Stage Cross-Sectional and Time Series Benchmarking Procedures for SAE"

We congratulate the authors on a stimulating and valuable manuscript, providing a careful review of the state-of-the-art in cross-sectional and time-series benchmarking procedures for small area estimation. They develop a novel two-stage benchmarking method for hierarchical time series models and evaluate their procedure by estimating monthly total unemployment using data from the U.S. Census Bureau. We discuss three topics: linearity and model misspecification; computational complexity and model comparisons; and some aspects of small area estimation in practice. More specifically, we pose the following questions, which the authors may wish to answer: How robust is their model to misspecification? Is it time to move away from linear models of the type considered by Battese et al. (1988) and Fay and Herriot (1979)? What is the asymptotic computational complexity, and what comparisons can be made to other models? Should the benchmarking constraints be inherently fixed, or should they be random?

Read more
Other Statistics

Discussion on "Using Stacking to Average Bayesian Predictive Distributions" by Yao et al.

I begin by summarizing the key ideas of the paper under discussion. I then turn to a graphical modeling perspective, posterior contraction rates, and alternative methods of aggregation. I also discuss possible applications of the stacking method to other problems, in particular aggregating (sub)posterior distributions in distributed computing.
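
For readers unfamiliar with the method under discussion, here is a minimal sketch of the stacking optimization in the spirit of Yao et al.: given held-out log predictive densities for each candidate model, find simplex weights maximizing the stacked log score. The data and the helper name stacking_weights are illustrative, not from the paper.

```python
import numpy as np
from scipy.optimize import minimize

def stacking_weights(lpd: np.ndarray) -> np.ndarray:
    """lpd[i, k] = log predictive density of held-out point i under model k.
    Returns simplex weights maximizing the stacked log score."""
    n, K = lpd.shape
    # Row-wise max subtraction only shifts the objective by a constant
    dens = np.exp(lpd - lpd.max(axis=1, keepdims=True))

    def neg_score(w):
        return -np.sum(np.log(dens @ w + 1e-300))

    res = minimize(neg_score, np.full(K, 1.0 / K), method='SLSQP',
                   bounds=[(0.0, 1.0)] * K,
                   constraints=({'type': 'eq', 'fun': lambda w: w.sum() - 1.0},))
    return res.x

# Hypothetical log predictive densities for 100 points under 3 models
rng = np.random.default_rng(6)
lpd = rng.normal(loc=[-1.0, -1.2, -2.0], scale=0.3, size=(100, 3))
print(stacking_weights(lpd).round(3))        # most weight on the best models
```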

Read more
Other Statistics

Discussions on non-probabilistic convex modelling for uncertain problems

The non-probabilistic convex model uses a convex set to quantify the uncertainty domain of uncertain-but-bounded parameters, which is very effective for structural uncertainty analysis with limited or poor-quality experimental data. To overcome the complexity and diversity of the formulations of current convex models, this paper proposes a unified framework for constructing non-probabilistic convex models. By introducing a correlation analysis technique, the mathematical expression of a convex model can be conveniently formulated once the correlation matrix of the uncertain parameters is created. More importantly, at the theoretical level, an evaluation criterion for convex modelling methods is proposed, which can serve as a test standard for validating newly proposed convex modelling methods. At the practical level, two model assessment indexes are proposed, by which the adaptability of different convex models to a specific uncertain problem with given experimental samples can be estimated. Four numerical examples are investigated to demonstrate the effectiveness of the present study.
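
As one concrete instance of a correlation-based convex model of the kind the abstract describes, here is a minimal sketch of an ellipsoidal uncertainty set built from sample statistics; the construction, the scaling rule, and the function names are illustrative assumptions, not the paper's unified framework.

```python
import numpy as np

def ellipsoidal_convex_model(samples: np.ndarray):
    """Build an ellipsoidal uncertainty set {x : (x - c)^T W (x - c) <= 1}
    whose shape matrix reflects the sample correlation structure."""
    c = samples.mean(axis=0)
    W_unit = np.linalg.inv(np.cov(samples, rowvar=False))
    centered = samples - c
    # Scale the ellipsoid so that every observed sample lies inside it
    r2 = max(x @ W_unit @ x for x in centered)
    return c, W_unit / r2

def contains(c, W, x):
    return (x - c) @ W @ (x - c) <= 1.0

rng = np.random.default_rng(2)
data = rng.multivariate_normal([1.0, 2.0], [[1.0, 0.6], [0.6, 0.5]], size=50)
c, W = ellipsoidal_convex_model(data)
print(all(contains(c, W, x) for x in data))   # True: all samples are enclosed
```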

Read more
Other Statistics

Distance Correlation: A New Tool for Detecting Association and Measuring Correlation Between Data Sets

The difficulties of detecting association, measuring correlation, and establishing cause and effect have fascinated mankind since time immemorial. Democritus, the Greek philosopher, underscored well the importance and the difficulty of proving causality when he wrote, "I would rather discover one cause than gain the kingdom of Persia." To address the difficulties of relating cause and effect, statisticians have developed many inferential techniques. Perhaps the most well-known method stems from Karl Pearson's coefficient of correlation, which Pearson introduced in the late 19th century based on ideas of Francis Galton. This lecture describes the recently devised distance correlation coefficient and its advantages over the Pearson and other classical measures of correlation. We will examine an application of the distance correlation coefficient to data drawn from large astrophysical databases, where it is desired to classify galaxies according to various types. Further, the lecture will analyze data arising in the ongoing national discussion of the relationship between state-by-state homicide rates and the stringency of state laws governing firearm ownership. The lecture will also describe a remarkable singular integral which lies at the core of the theory of the distance correlation coefficient. We will see that this singular integral admits generalizations to the truncated Maclaurin expansions of the cosine function and to the theory of spherical functions on symmetric cones.
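
A minimal sketch of the sample distance correlation (the biased, V-statistic version computed via double-centered distance matrices) shows how it detects nonlinear dependence that the Pearson coefficient misses; the data below are simulated purely for illustration.

```python
import numpy as np

def distance_correlation(x: np.ndarray, y: np.ndarray) -> float:
    """Sample distance correlation for one-dimensional samples x and y."""
    def double_centered(z):
        d = np.abs(z[:, None] - z[None, :])   # pairwise distance matrix
        return d - d.mean(axis=0) - d.mean(axis=1)[:, None] + d.mean()
    A, B = double_centered(x), double_centered(y)
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(3)
x = rng.normal(size=500)
print(distance_correlation(x, x ** 2))        # clearly positive: dependence found
print(abs(np.corrcoef(x, x ** 2)[0, 1]))      # Pearson correlation stays near zero
```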

Read more
Other Statistics

Distributed Detection of a Random Process over a Multiple Access Channel under Energy and Bandwidth Constraints

We analyze a binary hypothesis testing problem built on a wireless sensor network (WSN) for detecting a stationary random process, distributed in both space and time with a circularly-symmetric complex Gaussian distribution, under the Neyman-Pearson framework. Using an analog scheme, the sensors transmit different linear combinations of their measurements through a multiple access channel (MAC) to the fusion center (FC), whose task is to decide whether the process is present or not. Considering an energy constraint on each node's transmission and a limited number of channel uses, we compute the miss error exponent of the proposed scheme using large deviation theory (LDT) and show that the proposed strategy is asymptotically optimal (as the number of sensors approaches infinity) among linear orthogonal schemes. We also show that the proposed scheme achieves significant energy savings in the low signal-to-noise ratio regime, which is the typical scenario for WSNs. Finally, a Monte Carlo simulation of a two-dimensional process in space validates the analytical results.
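
The flavor of the result can be conveyed with a toy experiment: a Monte Carlo estimate of the miss probability of a Neyman-Pearson energy detector for complex Gaussian observations, whose log-slope in the number of observations approximates a miss error exponent. This is a simplified stand-in, not the paper's MAC model with energy and bandwidth constraints.

```python
import numpy as np

# Toy detection problem: under H0 each observation is CN(0, sigma2); under H1
# an independent signal adds power s2. The energy of each observation is then
# exponential, and the LLR test reduces to a threshold test on total energy.
rng = np.random.default_rng(4)
sigma2, s2, alpha, trials = 1.0, 0.5, 0.05, 20000

for k in (10, 20, 40):
    e0 = rng.exponential(sigma2, size=(trials, k)).sum(axis=1)       # energy, H0
    e1 = rng.exponential(sigma2 + s2, size=(trials, k)).sum(axis=1)  # energy, H1
    tau = np.quantile(e0, 1 - alpha)      # threshold fixing the false-alarm level
    p_miss = np.mean(e1 < tau)
    # The log-slope of the miss probability in k approximates the error exponent
    print(f"k={k:3d}  P_miss={p_miss:.4f}  exponent~{-np.log(p_miss) / k:.3f}")
```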

Read more
Other Statistics

Distributed Hypothesis Testing over a Noisy Channel: Error-exponents Trade-off

A two-terminal distributed binary hypothesis testing (HT) problem over a noisy channel is studied. The two terminals, called the observer and the decision maker, each have access to n independent and identically distributed samples, denoted by U and V, respectively. The observer communicates with the decision maker over a discrete memoryless channel (DMC), and the decision maker performs a binary hypothesis test on the joint probability distribution of (U, V) based on V and the noisy information received from the observer. The trade-off between the exponents of the type I and type II error probabilities in HT is investigated. Two inner bounds are obtained: one using a separation-based scheme that involves type-based compression and unequal error-protection channel coding, and the other using a joint scheme that incorporates type-based hybrid coding. The separation-based scheme is shown to recover the inner bound obtained by Han and Kobayashi for the special case of a rate-limited noiseless channel, and also the one obtained previously by the authors for a corner point of the trade-off. An exact single-letter characterization of the optimal trade-off is established for the special case of testing for the marginal distribution of U when V is unavailable. Our results imply that separation holds in this case, in the sense that the optimal trade-off is achieved by a scheme that performs independent HT and channel coding. Finally, we show via an example that the joint scheme achieves a strictly tighter bound than the separation-based scheme at some points of the error-exponent trade-off.
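
For the special case of testing a marginal distribution, the classical benchmark is Stein's lemma: with the type I error held fixed, the optimal type II exponent is the Kullback-Leibler divergence D(P0||P1). A minimal numeric sketch of that benchmark follows (a hedged illustration, not the paper's channel-coding analysis).

```python
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """D(p || q) in nats for finite-alphabet distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Binary source: H0 says U ~ p0, H1 says U ~ p1
p0, p1 = np.array([0.7, 0.3]), np.array([0.4, 0.6])
print(f"Stein exponent D(p0||p1) = {kl_divergence(p0, p1):.4f} nats/sample")

# Monte Carlo check: the type II error of the optimal LLR test decays roughly
# as exp(-n D); the empirical exponent approaches D from below as n grows.
rng = np.random.default_rng(5)
n, trials = 50, 50000
llr = np.log(p0 / p1)
s0 = llr[rng.choice(2, size=(trials, n), p=p0)].sum(axis=1)   # stats under H0
s1 = llr[rng.choice(2, size=(trials, n), p=p1)].sum(axis=1)   # stats under H1
tau = np.quantile(s0, 0.05)                                   # type I level 0.05
beta = np.mean(s1 >= tau)                                     # type II error
print(f"empirical exponent -(1/n) log(beta) ~ {-np.log(beta) / n:.4f}")
```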

Read more
