Featured Research

Methodology

Accounting for recall bias in case-control studies: a causal inference approach

A case-control study is designed to help determine if an exposure is associated with an outcome. However, since case-control studies are retrospective, they are often subject to recall bias. Recall bias can occur when study subjects do not remember previous events accurately. In this paper, we first define the estimand of interest: the causal odds ratio (COR) for a case-control study. Second, we develop estimation approaches for the COR and present estimates as a function of recall bias. Third, we define a new quantity called the R-factor, which denotes the minimal amount of recall bias that leads to altering the initial conclusion. We show that a failure to account for recall bias can significantly bias estimation of the COR. Finally, we apply the proposed framework to a case-control study of the causal effect of childhood physical abuse on adulthood mental health.
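As a rough illustration of reporting an odds ratio as a function of recall bias, the sketch below uses a toy 2x2 table and a deliberately simple bias model in which a fraction eta of the cases who report exposure are assumed to have been unexposed. The counts, the bias model, and all function names are assumptions made for illustration. It computes the crude odds ratio, not the paper's COR estimator, and the tipping point it finds is only loosely analogous to the R-factor.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio from a 2x2 case-control table."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def adjusted_odds_ratio(table, eta):
    """Odds ratio after moving a fraction `eta` of exposed cases to unexposed.

    Assumed (hypothetical) bias model: cases over-report exposure,
    controls report accurately.
    """
    a, b, c, d = table
    moved = a * eta
    return odds_ratio(a - moved, b + moved, c, d)


def tipping_point(table, steps=100):
    """Smallest eta on an evenly spaced grid at which the adjusted odds ratio is <= 1."""
    for i in range(steps + 1):
        eta = i / steps
        if adjusted_odds_ratio(table, eta) <= 1.0:
            return eta
    return None  # the conclusion never flips on this grid


# Toy counts: (exposed cases, unexposed cases, exposed controls, unexposed controls)
toy_table = (40, 60, 20, 80)
```

With these toy counts the unadjusted odds ratio is 8/3, and the adjusted value reaches 1 once half of the exposed cases are reclassified.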

Methodology

Adaptive Change Point Monitoring for High-Dimensional Data

In this paper, we propose a class of monitoring statistics for a mean shift in a sequence of high-dimensional observations. Inspired by the recent U-statistic based retrospective tests developed by Wang et al. (2019) and Zhang et al. (2020), we advance the U-statistic based approach to the sequential monitoring problem by developing a new adaptive monitoring procedure that can detect both dense and sparse changes in real time. Unlike Wang et al. (2019) and Zhang et al. (2020), who used self-normalization in their tests, we instead introduce a class of estimators for the q-norm of the covariance matrix and prove their ratio consistency. To facilitate fast computation, we further develop recursive algorithms to improve the computational efficiency of the monitoring procedure. The advantage of the proposed methodology is demonstrated via simulation studies and real data illustrations.

Methodology

Adaptive Doubly Robust Estimator from Non-stationary Logging Policy under a Convergence of Average Probability

Adaptive experiments, including efficient average treatment effect estimation and multi-armed bandit algorithms, have garnered attention in various applications, such as social experiments, clinical trials, and online advertisement optimization. This paper considers estimating the mean outcome of an action from samples obtained in adaptive experiments. In causal inference, the mean outcome of an action plays a crucial role, and its estimation is an essential task, of which average treatment effect estimation and off-policy value estimation are variants. In adaptive experiments, the probability of choosing an action (the logging policy) is allowed to be sequentially updated based on past observations. Because the logging policy depends on past observations, the samples are often not independent and identically distributed (i.i.d.), which makes it difficult to develop an asymptotically normal estimator. A typical approach to this problem is to assume that the logging policy converges to a time-invariant function. However, this assumption is restrictive in various applications, such as when the logging policy fluctuates or becomes zero in some periods. To mitigate this limitation, we propose the alternative assumption that the average logging policy converges to a time-invariant function, and we show the asymptotic normality of the doubly robust (DR) estimator. Under this assumption, the logging policy itself can fluctuate or be zero for some actions. We also show the empirical properties by simulations.
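For readers unfamiliar with the doubly robust form, the sketch below computes a generic DR estimate of the mean outcome of one action from logged rounds. It is a textbook-style sketch under stated assumptions, not the paper's exact estimator: the function name, the argument layout, and the toy data are hypothetical, and the outcome-model predictions are assumed to be supplied by the caller and built only from earlier rounds.

```python
def doubly_robust_mean(actions, outcomes, logging_probs, model_preds, target_action):
    """Doubly robust estimate of the mean outcome of `target_action`.

    actions[t]       : action actually taken in round t
    outcomes[t]      : outcome observed in round t
    logging_probs[t] : probability the logging policy gave target_action in round t
    model_preds[t]   : outcome-model prediction for target_action in round t
                       (assumed to use only data from before round t)
    """
    total = 0.0
    for a, y, p, f in zip(actions, outcomes, logging_probs, model_preds):
        term = f  # model-based part, used in every round
        if a == target_action:
            if p <= 0.0:
                raise ValueError("action observed in a round where its logging probability is zero")
            term += (y - f) / p  # inverse-probability-weighted residual correction
        total += term
    return total / len(actions)


# Toy log: the logging probability of action 1 is zero in the second round,
# which is harmless because action 1 was not taken in that round.
toy_actions = [1, 0, 1, 1]
toy_outcomes = [2.0, 0.0, 4.0, 3.0]
toy_probs = [0.5, 0.0, 0.8, 0.5]
toy_preds = [1.0, 1.0, 1.0, 1.0]
toy_estimate = doubly_robust_mean(toy_actions, toy_outcomes, toy_probs, toy_preds, target_action=1)
```

The correction term is only evaluated in rounds where the target action was actually taken, so rounds in which its logging probability is zero contribute the model prediction alone.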

Methodology

Adaptive Frequency Band Analysis for Functional Time Series

The frequency-domain properties of nonstationary functional time series often contain valuable information. These properties are characterized through the time-varying power spectrum of the series. Practitioners seeking low-dimensional summary measures of the power spectrum often partition frequencies into bands and create collapsed measures of power within bands. However, standard frequency bands have largely been developed through manual inspection of time series data and may not adequately summarize power spectra. In this article, we propose a framework for adaptive frequency band estimation of nonstationary functional time series that optimally summarizes the time-varying dynamics of the series. We develop a scan statistic and search algorithm to detect changes in the frequency domain. We establish theoretical properties of this framework and develop a computationally efficient implementation. The validity of our method is also supported by numerous simulation studies and an application to electroencephalogram data from participants alternating between eyes-open and eyes-closed conditions.

Methodology

Adaptive Importance Sampling for Efficient Stochastic Root Finding and Quantile Estimation

In solving simulation-based stochastic root-finding or optimization problems that involve rare events, such as in extreme quantile estimation, running crude Monte Carlo can be prohibitively inefficient. To address this issue, importance sampling can be employed to drive down the sampling error to a desirable level. However, selecting a good importance sampler requires knowledge of the solution to the problem at hand, which is the goal to begin with and thus forms a circular challenge. We investigate the use of adaptive importance sampling to untie this circularity. Our procedure sequentially updates the importance sampler to reach the optimal sampler and the optimal solution simultaneously, and can be embedded in both sample average approximation and stochastic approximation-type algorithms. Our theoretical analysis establishes strong consistency and asymptotic normality of the resulting estimators. We also demonstrate, via a minimax perspective, the key role of using adaptivity in controlling asymptotic errors. Finally, we illustrate the effectiveness of our approach via numerical experiments.
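To make the circularity concrete, the sketch below estimates a standard normal tail probability with a fixed, mean-shifted importance sampler. It is a non-adaptive illustration under stated assumptions, not the adaptive procedure of the paper: the function names are hypothetical, and the shift is chosen by hand, which already presumes knowledge of where the rare region lies.

```python
import math
import random


def tail_prob_importance_sampling(threshold, n_samples, shift, seed=0):
    """Estimate P(Z > threshold) for standard normal Z by sampling from N(shift, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(shift, 1), evaluated at x.
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def exact_tail_prob(threshold):
    """Exact P(Z > threshold) via the complementary error function."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))
```

Setting shift=0 recovers crude Monte Carlo, which for a threshold such as 4 would rarely see a single exceedance in a few thousand draws. Setting the shift near the threshold concentrates samples in the rare region, but a good shift for a quantile problem depends on the unknown quantile itself.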

Methodology

Adaptive Inference for Change Points in High-Dimensional Data

In this article, we propose a class of test statistics for a change point in the mean of high-dimensional independent data. Our test integrates the U-statistic based approach in a recent work by \cite{hdcp} and the L_q-norm based high-dimensional test in \cite{he2018}, and inherits several appealing features such as being tuning-parameter free and asymptotic independence for test statistics corresponding to even values of q. A simple combination of test statistics corresponding to several different values of q leads to a test with adaptive power property, that is, it can be powerful against both sparse and dense alternatives. On the estimation front, we obtain the convergence rate of the maximizer of our test statistic standardized by sample size when there is one change-point in mean and q = 2, and propose to combine our tests with a wild binary segmentation (WBS) algorithm to estimate the change-point number and locations when there are multiple change-points. Numerical comparisons using both simulated and real data demonstrate the advantage of our adaptive test and its corresponding estimation method.

Methodology

Adaptive Randomization in Network Data

Network data have appeared frequently in recent research. For example, in comparing the effects of different types of treatment, network models have been proposed to improve the quality of estimation and hypothesis testing. In this paper, we focus on efficiently estimating the average treatment effect using an adaptive randomization procedure in networks. We work with causal models in which the treatment outcome of a subject is affected by its own covariate as well as those of its neighbors. Moreover, we consider the case in which, when we assign treatments to the current subject, only the subnetwork of existing subjects is revealed. New randomized procedures are proposed to minimize the mean squared error of the estimated differences between treatment effects. In network data, it is usually difficult to obtain theoretical properties because the numbers of nodes and connections increase simultaneously. Under mild assumptions, our proposed procedure is closely related to a time-varying inhomogeneous Markov chain. We then use Lyapunov functions to derive the theoretical properties of the proposed procedures. The advantages of the proposed procedures are also demonstrated by extensive simulations and experiments on real network data.

Methodology

Adaptive Step-Length Selection in Gradient Boosting for Generalized Additive Models for Location, Scale and Shape

Tuning of model-based boosting algorithms relies mainly on the number of iterations, while the step-length is fixed at a predefined value. For complex models with several predictors such as Generalized Additive Models for Location, Scale and Shape (GAMLSS), imbalanced updates of predictors, where some distribution parameters are updated more frequently than others, can be a problem that prevents some submodels from being appropriately fitted within a limited number of boosting iterations. We propose an approach using adaptive step-length (ASL) determination within a non-cyclical boosting algorithm for GAMLSS to prevent such imbalance. Moreover, for the important special case of the Gaussian distribution, we discuss properties of the ASL and derive a semi-analytical form of the ASL that avoids manual selection of the search interval and numerical optimization to find the optimal step-length, and consequently improves computational efficiency. We show competitive behavior of the proposed approaches compared to penalized maximum likelihood and boosting with a fixed step-length for GAMLSS models in two simulations and two applications, in particular for cases of large variance and/or more variables than observations. In addition, the idea of the ASL is also applicable to other models with more than one predictor, such as zero-inflated count models, and offers insights into the choice of reasonable defaults for the step-length in the simpler special case of (Gaussian) additive models.

Methodology

Adaptive dose-response studies to establish proof-of-concept in learning-phase clinical trials

In learning-phase clinical trials in drug development, adaptive designs can be efficient and highly informative when used appropriately. In this article, we extend the multiple comparison procedures with modeling techniques (MCP-Mod) procedure with generalized multiple contrast tests (GMCTs) to two-stage adaptive designs for establishing proof-of-concept. The results of an interim analysis of first-stage data are used to adapt the candidate dose-response models in the second stage. GMCTs are used in both stages to obtain stage-wise p-values, which are then combined to determine an overall p-value. An alternative approach is also considered that combines the t-statistics across stages, employing the conditional rejection probability (CRP) principle to preserve the Type I error probability. Simulation studies demonstrate that the adaptive designs are advantageous compared to the corresponding tests in a non-adaptive design if the selection of the candidate set of dose-response models is not well informed by evidence from preclinical and early-phase studies.
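The abstract does not say which rule is used to combine the stage-wise p-values. As one common choice in two-stage adaptive designs, the sketch below shows the weighted inverse-normal combination. The function name and the default equal weights are assumptions for illustration, and the rule presumes one-sided p-values that are independent under the null hypothesis.

```python
from statistics import NormalDist


def inverse_normal_combination(p1, p2, w1=0.5 ** 0.5, w2=0.5 ** 0.5):
    """Combine two one-sided stage-wise p-values with prespecified weights.

    The weights must satisfy w1**2 + w2**2 == 1 so that the combined
    statistic is standard normal under the null hypothesis.
    Both p-values must lie strictly between 0 and 1.
    """
    if abs(w1 * w1 + w2 * w2 - 1.0) > 1e-9:
        raise ValueError("weights must satisfy w1^2 + w2^2 = 1")
    std_normal = NormalDist()
    # Convert each p-value to a z-score, then take the weighted sum.
    z = w1 * std_normal.inv_cdf(1.0 - p1) + w2 * std_normal.inv_cdf(1.0 - p2)
    return 1.0 - std_normal.cdf(z)
```

For example, two stage-wise p-values of 0.05 with equal weights combine to an overall p-value of roughly 0.01. The weights have to be fixed before the interim analysis for the Type I error guarantee to hold.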

Methodology

Adaptive lasso and Dantzig selector for spatial point processes intensity estimation

The lasso and the Dantzig selector are standard procedures that perform variable selection and estimation simultaneously. This paper is concerned with extending these procedures to spatial point process intensity estimation. We propose adaptive versions of these procedures, develop efficient computational methodologies and derive asymptotic results for a large class of spatial point processes under the setting where the number of parameters, i.e., the number of spatial covariates considered, increases with the volume of the observation domain. The two procedures are compared theoretically and in a simulation study.
