Network


Latest external collaboration on country level. Dive into details by clicking on the dots.

Hotspot


Dive into the research topics where Wenjing Zheng is active.

Publication


Featured researches published by Wenjing Zheng.


The International Journal of Biostatistics | 2012

Targeted maximum likelihood estimation of natural direct effects.

Wenjing Zheng; Mark J. van der Laan

In many causal inference problems, one is interested in the direct causal effect of an exposure on an outcome of interest that is not mediated by certain intermediate variables. Robins and Greenland (1992) and Pearl (2001) formalized the definition of two types of direct effects (natural and controlled) under the counterfactual framework. The efficient scores (under a nonparametric model) for the various natural effect parameters and their general robustness conditions, as well as an estimating equation based estimator using the efficient score, are provided in Tchetgen Tchetgen and Shpitser (2011b). In this article, we apply the targeted maximum likelihood framework of van der Laan and Rubin (2006) and van der Laan and Rose (2011) to construct a semiparametric efficient, multiply robust, substitution estimator for the natural direct effect which satisfies the efficient score equation derived in Tchetgen Tchetgen and Shpitser (2011b). We note that the robustness conditions in Tchetgen Tchetgen and Shpitser (2011b) may be weakened, thereby placing less reliance on the estimation of the mediator density. More precisely, the proposed estimator is asymptotically unbiased if either one of the following holds: i) the conditional mean outcome given exposure, mediator, and confounders, and the mediated mean outcome difference are consistently estimated; (ii) the exposure mechanism given confounders, and the conditional mean outcome are consistently estimated; or (iii) the exposure mechanism and the mediator density, or the exposure mechanism and the conditional distribution of the exposure given confounders and mediator, are consistently estimated. If all three conditions hold, then the effect estimate is asymptotically efficient. Extensions to the natural indirect effect are also discussed.


Archive | 2011

Cross-Validated Targeted Minimum-Loss-Based Estimation

Wenjing Zheng; Mark J. van der Laan

In previous chapters, we introduced targeted maximum likelihood estimation in semiparametric models, which incorporates adaptive estimation (e.g., loss-based super learning) of the relevant part of the data-generating distribution and subsequently carries out a targeted bias reduction by maximizing the log-likelihood, or minimizing another loss-specific empirical risk, over a “clever” parametric working model through the initial estimator, treating the initial estimator as offset. This updating process may need to be iterated to convergence. The target parameter of the resulting updated estimator is then evaluated, and is called the targeted minimum-loss- based estimator (also TMLE) of the target parameter of the data-generating distribution. This estimator is, by definition, a substitution estimator, and, under regularity conditions, is a double robust semiparametric efficient estimator.


Journal of Causal Inference | 2013

Estimating the Effect of a Community-Based Intervention with Two Communities

Mark J. van der Laan; Maya L. Petersen; Wenjing Zheng

Abstract Due to the need to evaluate the effectiveness of community-based programs in practice, there is substantial interest in methods to estimate the causal effects of community-level treatments or exposures on individual level outcomes. The challenge one is confronted with is that different communities have different environmental factors affecting the individual outcomes, and all individuals in a community share the same environment and intervention. In practice, data are often available from only a small number of communities, making it difficult if not impossible to adjust for these environmental confounders. In this paper we consider an extreme version of this dilemma, in which two communities each receives a different level of the intervention, and covariates and outcomes are measured on a random sample of independent individuals from each of the two populations; the results presented can be straightforwardly generalized to settings in which more than two communities are sampled. We address the question of what conditions are needed to estimate the causal effect of the intervention, defined in terms of an ideal experiment in which the exposed level of the intervention is assigned to both communities and individual outcomes are measured in the combined population, and then the clock is turned back and a control level of the intervention is assigned to both communities and individual outcomes are measured in the combined population. We refer to the difference in the expectation of these outcomes as the marginal (overall) treatment effect. We also discuss conditions needed for estimation of the treatment effect on the treated community. We apply a nonparametric structural equation model to define these causal effects and to establish conditions under which they are identified. These identifiability conditions provide guidance for the design of studies to investigate community level causal effects and for assessing the validity of causal interpretations when data are only available from a few communities. When the identifiability conditions fail to hold, the proposed statistical parameters still provide nonparametric treatment effect measures (albeit non-causal) whose statistical interpretations do not depend on model specifications. In addition, we study the use of a matched cohort sampling design in which the units of different communities are matched on individual factors. Finally, we provide semiparametric efficient and doubly robust targeted MLE estimators of the community level causal effect based on i.i.d. sampling and matched cohort sampling.


Annals of Statistics | 2017

Targeted sequential design for targeted learning inference of the optimal treatment rule and its mean reward

Antoine Chambaz; Wenjing Zheng; Mark J. van der Laan

This article studies the targeted sequential inference of an optimal treatment rule (TR) and its mean reward in the non-exceptional case, i.e., assuming that there is no stratum of the baseline covariates where treatment is neither beneficial nor harmful, and under a companion margin assumption. Our pivotal estimator, whose definition hinges on the targeted minimum loss estimation (TMLE) principle, actually infers the mean reward under the current estimate of the optimal TR. This data-adaptive statistical parameter is worthy of interest on its own. Our main result is a central limit theorem which enables the construction of confidence intervals on both mean rewards under the current estimate of the optimal TR and under the optimal TR itself. The asymptotic variance of the estimator takes the form of the variance of an efficient influence curve at a limiting distribution, allowing to discuss the efficiency of inference. As a by product, we also derive confidence intervals on two cumulated pseudo-regrets, a key notion in the study of bandits problems. A simulation study illustrates the procedure. One of the corner-stones of the theoretical study is a new maximal inequality for martingales with respect to the uniform entropy integral.


Statistics in Medicine | 2018

Constrained binary classification using ensemble learning: an application to cost‐efficient targeted PrEP strategies

Wenjing Zheng; Laura Balzer; Maya L. Petersen; Mark J. van der Laan

Binary classification problems are ubiquitous in health and social sciences. In many cases, one wishes to balance two competing optimality considerations for a binary classifier. For instance, in resource-limited settings, an human immunodeficiency virus prevention program based on offering pre-exposure prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program. In this article, we consider a general class of constrained binary classification problems wherein the objective function and the constraint are both monotonic with respect to a threshold. These include the minimization of the rate of positive predictions subject to a minimum sensitivity, the maximization of sensitivity subject to a maximum rate of positive predictions, and the Neyman-Pearson paradigm, which minimizes the type II error subject to an upper bound on the type I error. We propose an ensemble approach to these binary classification problems based on the Super Learner methodology. This approach linearly combines a user-supplied library of scoring algorithms, with combination weights and a discriminating threshold chosen to minimize the constrained optimality criterion. We then illustrate the application of the proposed classifier to develop an individualized PrEP targeting strategy in a resource-limited setting, with the goal of minimizing the number of PrEP offerings while achieving a minimum required sensitivity. This proof of concept data analysis uses baseline data from the ongoing Sustainable East Africa Research in Community Health study. Copyright


The International Journal of Biostatistics | 2016

Doubly Robust and Efficient Estimation of Marginal Structural Models for the Hazard Function.

Wenjing Zheng; Maya L. Petersen; Mark J. van der Laan

Abstract In social and health sciences, many research questions involve understanding the causal effect of a longitudinal treatment on mortality (or time-to-event outcomes in general). Often, treatment status may change in response to past covariates that are risk factors for mortality, and in turn, treatment status may also affect such subsequent covariates. In these situations, Marginal Structural Models (MSMs), introduced by Robins (1997. Marginal structural models Proceedings of the American Statistical Association. Section on Bayesian Statistical Science, 1–10), are well-established and widely used tools to account for time-varying confounding. In particular, a MSM can be used to specify the intervention-specific counterfactual hazard function, i. e. the hazard for the outcome of a subject in an ideal experiment where he/she was assigned to follow a given intervention on their treatment variables. The parameters of this hazard MSM are traditionally estimated using the Inverse Probability Weighted estimation Robins (1999. Marginal structural models versus structural nested models as tools for causal inference. In: Statistical models in epidemiology: the environment and clinical trials. Springer-Verlag, 1999:95–134), Robins et al. (2000), (IPTW, van der Laan and Petersen (2007. Causal effect models for realistic individualized treatment and intention to treat rules. Int J Biostat 2007;3:Article 3), Robins et al. (2008. Estimaton and extrapolation of optimal treatment and testing strategies. Statistics in Medicine 2008;27(23):4678–721)). This estimator is easy to implement and admits Wald-type confidence intervals. However, its consistency hinges on the correct specification of the treatment allocation probabilities, and the estimates are generally sensitive to large treatment weights (especially in the presence of strong confounding), which are difficult to stabilize for dynamic treatment regimes. In this paper, we present a pooled targeted maximum likelihood estimator (TMLE, van der Laan and Rubin (2006. Targeted maximum likelihood learning. The International Journal of Biostatistics 2006;2:1–40)) for MSM for the hazard function under longitudinal dynamic treatment regimes. The proposed estimator is semiparametric efficient and doubly robust, offering bias reduction over the incumbent IPTW estimator when treatment probabilities may be misspecified. Moreover, the substitution principle rooted in the TMLE potentially mitigates the sensitivity to large treatment weights in IPTW. We compare the performance of the proposed estimator with the IPTW and a on-targeted substitution estimator in a simulation study.


The International Journal of Biostatistics | 2018

Marginal Structural Models with Counterfactual Effect Modifiers

Wenjing Zheng; Zhehui Luo; Mark J. van der Laan

Abstract In health and social sciences, research questions often involve systematic assessment of the modification of treatment causal effect by patient characteristics. In longitudinal settings, time-varying or post-intervention effect modifiers are also of interest. In this work, we investigate the robust and efficient estimation of the Counterfactual-History-Adjusted Marginal Structural Model (van der Laan MJ, Petersen M. Statistical learning of origin-specific statically optimal individualized treatment rules. Int J Biostat. 2007;3), which models the conditional intervention-specific mean outcome given a counterfactual modifier history in an ideal experiment. We establish the semiparametric efficiency theory for these models, and present a substitution-based, semiparametric efficient and doubly robust estimator using the targeted maximum likelihood estimation methodology (TMLE, e.g. van der Laan MJ, Rubin DB. Targeted maximum likelihood learning. Int J Biostat. 2006;2, van der Laan MJ, Rose S. Targeted learning: causal inference for observational and experimental data, 1st ed. Springer Series in Statistics. Springer, 2011). To facilitate implementation in applications where the effect modifier is high dimensional, our third contribution is a projected influence function (and the corresponding projected TMLE estimator), which retains most of the robustness of its efficient peer and can be easily implemented in applications where the use of the efficient influence function becomes taxing. We compare the projected TMLE estimator with an Inverse Probability of Treatment Weighted estimator (e.g. Robins JM. Marginal structural models. In: Proceedings of the American Statistical Association. Section on Bayesian Statistical Science, 1-10. 1997a, Hernan MA, Brumback B, Robins JM. Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology. 2000;11:561–570), and a non-targeted G-computation estimator (Robins JM. A new approach to causal inference in mortality studies with sustained exposure periods - application to control of the healthy worker survivor effect. Math Modell. 1986;7:1393–1512.). The comparative performance of these estimators is assessed in a simulation study. The use of the projected TMLE estimator is illustrated in a secondary data analysis for the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial where effect modifiers are subject to missing at random.


Statistical Methods in Medical Research | 2018

A new approach to hierarchical data analysis: Targeted maximum likelihood estimation for the causal effect of a cluster-level exposure

Laura Balzer; Wenjing Zheng; Mark J. van der Laan; Maya L. Petersen

We often seek to estimate the impact of an exposure naturally occurring or randomly assigned at the cluster-level. For example, the literature on neighborhood determinants of health continues to grow. Likewise, community randomized trials are applied to learn about real-world implementation, sustainability, and population effects of interventions with proven individual-level efficacy. In these settings, individual-level outcomes are correlated due to shared cluster-level factors, including the exposure, as well as social or biological interactions between individuals. To flexibly and efficiently estimate the effect of a cluster-level exposure, we present two targeted maximum likelihood estimators (TMLEs). The first TMLE is developed under a non-parametric causal model, which allows for arbitrary interactions between individuals within a cluster. These interactions include direct transmission of the outcome (i.e. contagion) and influence of one individual’s covariates on another’s outcome (i.e. covariate interference). The second TMLE is developed under a causal sub-model assuming the cluster-level and individual-specific covariates are sufficient to control for confounding. Simulations compare the alternative estimators and illustrate the potential gains from pairing individual-level risk factors and outcomes during estimation, while avoiding unwarranted assumptions. Our results suggest that estimation under the sub-model can result in bias and misleading inference in an observational setting. Incorporating working assumptions during estimation is more robust than assuming they hold in the underlying causal model. We illustrate our approach with an application to HIV prevention and treatment.


Archive | 2018

Targeting a Simple Statistical Bandit Problem

Antoine Chambaz; Wenjing Zheng; Mark J. van der Laan

An infinite sequence of independent and identically distributed (i.i.d.) random variables (W n , Y n (0), Y n (1))n ≥ 1 drawn from a common law Q0 is to be sequentially and partially disclosed during the course of a controlled experiment. The first component, W n , describes the nth context in which we will have to carry out one action out of two, denoted a = 0 and a = 1. The second and third components, Y n (0) and Y n (1), are the rewards that actions a = 0 and a = 1 would grant. The set \(\mathcal{W}\) of contexts may be high-dimensional. The rewards take their values in ]0, 1[.


Archive | 2018

Mediation Analysis with Time-Varying Mediators and Exposures

Wenjing Zheng; Mark J. van der Laan

An exposure often acts on an outcome of interest directly, or indirectly through the mediation of some intermediate variables. Identifying and quantifying these two types of effects contribute to further understanding of the underlying causal mechanism. Modern developments in formal nonparametric causal inference have produced many advances in causal mediation analysis in nonlongitudinal settings. (e.g., Robins and Greenland 1992; Pearl 2001; van der Laan and Petersen 2008; VanderWeele 2009; Hafeman and VanderWeele 2010; Imai et al. 2010b,a; Pearl 2011; Tchetgen Tchetgen and Shpitser 2011a,b; Zheng and van der Laan 2012b; Lendle and van der Laan 2011).

Collaboration


Dive into the Wenjing Zheng's collaboration.

Top Co-Authors

Avatar
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Antoine Chambaz

Paris Descartes University

View shared research outputs
Top Co-Authors

Avatar

Laura Balzer

University of Massachusetts Amherst

View shared research outputs
Top Co-Authors

Avatar
Top Co-Authors

Avatar

Oleg Sofrygin

University of California

View shared research outputs
Top Co-Authors

Avatar

Zhehui Luo

Michigan State University

View shared research outputs
Researchain Logo
Decentralizing Knowledge