Publication


Featured research published by Marloes H. Maathuis.


Annals of Statistics | 2009

Estimating high-dimensional intervention effects from observational data

Marloes H. Maathuis; Markus Kalisch; Peter Bühlmann

We assume that we have observational data generated from an unknown underlying directed acyclic graph (DAG) model. A DAG is typically not identifiable from observational data, but it is possible to consistently estimate the equivalence class of a DAG. Moreover, for any given DAG, causal effects can be estimated using intervention calculus. In this paper, we combine these two parts. For each DAG in the estimated equivalence class, we use intervention calculus to estimate the causal effects of the covariates on the response. This yields a collection of estimated causal effects for each covariate. We show that the distinct values in this set can be consistently estimated by an algorithm that uses only local information of the graph. This local approach is computationally fast and feasible in high-dimensional problems. We propose to use summary measures of the set of possible causal effects to determine variable importance. In particular, we use the minimum absolute value of this set, since that is a lower bound on the size of the causal effect. We demonstrate the merits of our methods in a simulation study and on a data set about riboflavin production.
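The local idea in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a linear model, takes the directed parents and undirected neighbors of the covariate in the estimated equivalence class as given, and, as a simplifying assumption, accepts every subset of the undirected neighbors as possible extra parents without checking that the orientation is valid. The function names `possible_effects` and `importance` and the toy data are hypothetical.

```python
from itertools import combinations

import numpy as np


def possible_effects(data, x, y, parents, siblings):
    """List possible linear effects of column x on column y.

    parents:  columns with a directed edge into x in the estimated graph
    siblings: columns joined to x by an undirected edge
    Assumption: every subset of siblings is treated as a possible set of
    extra parents; the validity check on the orientation is omitted.
    """
    effects = []
    for k in range(len(siblings) + 1):
        for extra in combinations(siblings, k):
            adj = list(parents) + list(extra)
            if y in adj:
                # y would be a parent of x, so x has no effect on y
                effects.append(0.0)
                continue
            design = np.column_stack([np.ones(len(data)), data[:, [x] + adj]])
            coef, *_ = np.linalg.lstsq(design, data[:, y], rcond=None)
            effects.append(float(coef[1]))  # coefficient of x
    return effects


def importance(effects):
    """Summary measure: minimum absolute value of the possible effects."""
    return min(abs(e) for e in effects)


# Toy linear model (hypothetical): z -> x -> y and z -> y.
rng = np.random.default_rng(0)
n = 5000
zs = rng.normal(size=n)
xs = 0.8 * zs + rng.normal(size=n)
ys = 0.5 * xs + 0.7 * zs + rng.normal(size=n)
data = np.column_stack([zs, xs, ys])

# Pretend the edge between z (column 0) and x (column 1) is undirected.
effects = possible_effects(data, x=1, y=2, parents=[], siblings=[0])
score = importance(effects)
```

In this toy setup the list holds two candidate effects, one per possible parent set of x, and the minimum absolute value serves as the conservative importance score described in the abstract.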


Nature Methods | 2010

Predicting causal effects in large-scale systems from observational data

Marloes H. Maathuis; Diego Colombo; Markus Kalisch; Peter Bühlmann

Supplementary Figure 1 Comparing IDA, Lasso and Elastic-net on the five DREAM4 networks of size 10 with multifactorial data. Supplementary Table 1 Comparing IDA, Lasso and Elastic-net to random guessing on the Hughes et al. data. Supplementary Table 2 Comparing IDA, Lasso and Elastic-net to random guessing on the five DREAM4 networks of size 10, using the multifactorial data as observational data. Supplementary Methods


Annals of Statistics | 2012

Learning high-dimensional directed acyclic graphs with latent and selection variables

Diego Colombo; Marloes H. Maathuis; Markus Kalisch; Thomas S. Richardson

We consider the problem of learning causal information between random variables in directed acyclic graphs (DAGs) when allowing arbitrarily many latent and selection variables. The FCI (Fast Causal Inference) algorithm has been explicitly designed to infer conditional independence and causal information in such settings. However, FCI is computationally infeasible for large graphs. We therefore propose the new RFCI algorithm, which is much faster than FCI. In some situations the output of RFCI is slightly less informative, in particular with respect to conditional independence information. However, we prove that any causal information in the output of RFCI is correct in the asymptotic limit. We also define a class of graphs on which the outputs of FCI and RFCI are identical. We prove consistency of FCI and RFCI in sparse high-dimensional settings, and demonstrate in simulations that the estimation performances of the algorithms are very similar. All software is implemented in the R-package pcalg.


Annals of Statistics | 2008

Current status data with competing risks: Consistency and rates of convergence of the MLE

Piet Groeneboom; Marloes H. Maathuis; Jon A. Wellner

We study nonparametric estimation of the sub-distribution functions for current status data with competing risks. Our main interest is in the nonparametric maximum likelihood estimator (MLE), and for comparison we also consider a simpler “naive estimator.” Both types of estimators were studied by Jewell, van der Laan and Henneman [Biometrika (2003) 90 183–197], but little was known about their large sample properties. We have started to fill this gap, by proving that the estimators are consistent and converge globally and locally at rate $n^{1/3}$. We also show that this local rate of convergence is optimal in a minimax sense. The proof of the local rate of convergence of the MLE uses new methods, and relies on a rate result for the sum of the MLEs of the sub-distribution functions which holds uniformly on a fixed neighborhood of a point. Our results are used in Groeneboom, Maathuis and Wellner [Ann. Statist. (2008) 36 1064–1089] to obtain the local limiting distributions of the estimators.


Biometrika | 2010

Variable selection in high-dimensional linear models: partially faithful distributions and the pc-simple algorithm

Peter Bühlmann; Markus Kalisch; Marloes H. Maathuis

We consider variable selection in high-dimensional linear models where the number of covariates greatly exceeds the sample size. We introduce the new concept of partial faithfulness and use it to infer associations between the covariates and the response. Under partial faithfulness, we develop a simplified version of the pc algorithm (Spirtes et al., 2000), which is computationally feasible even with thousands of covariates and provides consistent variable selection under conditions on the random design matrix that are of a different nature than coherence conditions for penalty-based approaches like the lasso. Simulations and application to real data show that our method is competitive compared to penalty-based approaches. We provide an efficient implementation of the algorithm in the R-package pcalg. Copyright 2010, Oxford University Press.
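A rough sketch of the kind of procedure this abstract describes is given below. It is not the pcalg implementation: it assumes Gaussian data, tests partial correlations with Fisher's z-transform, and updates the active set once per conditioning-set size. The names `partial_corr` and `pc_simple` and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import NormalDist

import numpy as np


def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given columns cond."""
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def pc_simple(data, y, alpha=0.05):
    """Keep covariates whose partial correlation with column y is never
    judged zero, using conditioning sets of growing size m drawn from
    the current active set."""
    n, p = data.shape
    crit = NormalDist().inv_cdf(1 - alpha / 2)

    def dependent(j, cond):
        r = partial_corr(data, y, j, cond)
        z = 0.5 * np.log((1 + r) / (1 - r))  # Fisher z-transform
        return np.sqrt(n - len(cond) - 3) * abs(z) > crit

    active = [j for j in range(p) if j != y]
    m = 0
    while len(active) > m:
        active = [
            j for j in active
            if all(dependent(j, c)
                   for c in combinations([k for k in active if k != j], m))
        ]
        m += 1
    return active


# Toy data (hypothetical): the response depends on covariates 0 and 1 only;
# covariate 2 is correlated with covariate 0 but has no direct role.
rng = np.random.default_rng(2)
n = 500
covs = rng.normal(size=(n, 6))
covs[:, 2] = covs[:, 0] + rng.normal(size=n)
resp = covs[:, 0] + covs[:, 1] + rng.normal(size=n)
data = np.column_stack([covs, resp])
selected = pc_simple(data, y=6, alpha=0.01)
```

Each test only inverts a small correlation matrix over the tested pair and its conditioning set, which is one reason an approach of this kind can remain feasible when the number of covariates exceeds the sample size.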


Bioinformatics | 2012

Causal stability ranking

Daniel Stekhoven; Izabel Moraes; Gardar Sveinbjörnsson; Lars Hennig; Marloes H. Maathuis; Peter Bühlmann

Genotypic causes of a phenotypic trait are typically determined via randomized controlled intervention experiments. Such experiments are often prohibitive with respect to durations and costs, and informative prioritization of experiments is desirable. We therefore consider predicting stable rankings of genes (covariates), according to their total causal effects on a phenotype (response), from observational data. Since causal effects are generally non-identifiable from observational data only, we use a method that can infer lower bounds for the total causal effect under some assumptions. We validated our method, which we call Causal Stability Ranking (CStaR), in two situations. First, we performed knock-out experiments with Arabidopsis thaliana according to a predicted ranking based on observational gene expression data, using flowering time as phenotype of interest. Besides several known regulators of flowering time, we found almost half of the tested top ranking mutants to have a significantly changed flowering time. Second, we compared CStaR to established regression-based methods on a gene expression dataset of Saccharomyces cerevisiae. We found that CStaR outperforms these established methods. Our method allows for efficient design and prioritization of future intervention experiments, and due to its generality it can be used for a broad spectrum of applications.


AIDS | 2012

The causal effect of switching to second-line ART in programmes without access to routine viral load monitoring

Thomas Gsponer; Maya L. Petersen; Matthias Egger; Sam Phiri; Marloes H. Maathuis; Andrew Boulle; Patrick Musonda; Hannock Tweya; Karin Peter; Benjamin H. Chi; Olivia Keiser

Objectives: We examined the effect of switching to second-line antiretroviral therapy (ART) on mortality in patients who experienced immunological failure in ART programmes without access to routine viral load monitoring in sub-Saharan Africa. Design and setting: Collaborative analysis of two ART programmes in Lusaka, Zambia and Lilongwe, Malawi. Methods: We included all adult patients experiencing immunological failure based on WHO criteria. We used Cox proportional hazards models weighted by the inverse probability of switching to compare mortality between patients who switched and patients who did not; and between patients who switched immediately and patients who switched later. Results are expressed as hazard ratios with 95% confidence intervals (95% CI). Results: Among 2411 patients with immunological failure, 324 patients (13.4%) switched to second-line ART during 3932 person-years of follow-up. The median CD4 cell count at start of ART and at failure was lower in patients who switched compared to patients who did not: 80 versus 155 cells/μl (P < 0.001) and 77 versus 146 cells/μl (P < 0.001), respectively. Adjusting for baseline and time-dependent confounders, mortality was lower among patients who switched compared to patients remaining on failing first-line ART: hazard ratio 0.25 (95% CI 0.09–0.72). Mortality was also lower among patients who remained on failing first-line ART for shorter periods: hazard ratio 0.70 (95% CI 0.44–1.09) per 6 months shorter exposure. Conclusion: In ART programmes, switching patients to second-line regimens based on WHO immunological failure criteria appears to reduce mortality, with the greatest benefit in patients switching immediately after immunological failure is diagnosed.


BMC Medical Research Methodology | 2010

Understanding human functioning using graphical models

Markus Kalisch; Bernd A. G. Fellinghauer; Eva Grill; Marloes H. Maathuis; Ulrich Mansmann; Peter Bühlmann; Gerold Stucki

Background: Functioning and disability are universal human experiences. However, our current understanding of functioning from a comprehensive perspective is limited. The development of the International Classification of Functioning, Disability and Health (ICF) on the one hand and recent developments in graphical modeling on the other hand might be combined and open the door to a more comprehensive understanding of human functioning. The objective of our paper therefore is to explore how graphical models can be used in the study of ICF data for a range of applications. Methods: We show the applicability of graphical models on ICF data for different tasks: Visualization of the dependence structure of the data set, dimension reduction and comparison of subpopulations. Moreover, we further developed and applied recent findings in causal inference using graphical models to estimate bounds on intervention effects in an observational study with many variables and without knowing the underlying causal structure. Results: In each field, graphical models could be applied giving results of high face-validity. In particular, graphical models could be used for visualization of functioning in patients with spinal cord injury. The resulting graph consisted of several connected components which can be used for dimension reduction. Moreover, we found that the differences in the dependence structures between subpopulations were relevant and could be systematically analyzed using graphical models. Finally, when estimating bounds on causal effects of ICF categories on general health perceptions among patients with chronic health conditions, we found that the five ICF categories that showed the strongest effect were plausible. Conclusions: Graphical models are a flexible tool and lend themselves to a wide range of applications. In particular, studies involving ICF data seem to be suited for analysis using graphical models.


Journal of Computational and Graphical Statistics | 2005

Reduction Algorithm for the NPMLE for the Distribution Function of Bivariate Interval-Censored Data

Marloes H. Maathuis

This article considers computational aspects of the nonparametric maximum likelihood estimator (NPMLE) for the distribution function of bivariate interval-censored data. The computation of the NPMLE consists of a parameter reduction step and an optimization step. This article focuses on the reduction step and introduces two new reduction algorithms: the Tree algorithm and the HeightMap algorithm. The Tree algorithm is mentioned only briefly. The HeightMap algorithm is discussed in detail and also given in pseudo code. It is a fast and simple algorithm of time complexity $O(n^2)$. This is an order faster than the best known algorithm thus far by Bogaerts and Lesaffre. We compare the new algorithms to earlier algorithms in a simulation study, and demonstrate that the new algorithms are significantly faster. Finally, we discuss how the HeightMap algorithm can be generalized to d-dimensional data with d > 2. Such a multivariate version of the HeightMap algorithm has time complexity $O(n^d)$.


Annals of Statistics | 2011

Asymptotic optimality of the Westfall–Young permutation procedure for multiple testing under dependence

Nicolai Meinshausen; Marloes H. Maathuis; Peter Bühlmann

Test statistics are often strongly dependent in large-scale multiple testing applications. Most corrections for multiplicity are unduly conservative for correlated test statistics, resulting in a loss of power to detect true positives. We show that the Westfall–Young permutation method has asymptotically optimal power for a broad class of testing problems with a block-dependence and sparsity structure among the tests, when the number of tests tends to infinity.
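For readers unfamiliar with the permutation procedure named in this abstract, the following is a minimal sketch of one common variant, the single-step maxT adjustment for a two-group comparison. The choice of variant, the Welch-type statistic, the function name `westfall_young_maxt`, and the toy data are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def westfall_young_maxt(x, labels, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values for a two-group comparison.

    x:      (n_samples, n_tests) array of measurements
    labels: boolean array of length n_samples marking group membership
    """
    rng = np.random.default_rng(seed)

    def stat(lab):
        a, b = x[lab], x[~lab]
        se = np.sqrt(a.var(axis=0, ddof=1) / len(a)
                     + b.var(axis=0, ddof=1) / len(b))
        return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

    observed = stat(labels)
    # Permuting the labels keeps the dependence between tests intact.
    max_null = np.array(
        [stat(rng.permutation(labels)).max() for _ in range(n_perm)]
    )
    # Share of permutations whose maximum statistic reaches the observed one.
    exceed = (max_null[:, None] >= observed[None, :]).sum(axis=0)
    return (1 + exceed) / (1 + n_perm)


# Toy data (hypothetical): 50 tests, only the first has a real group shift.
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 50))
labels = np.arange(20) < 10
x[labels, 0] += 5.0
p_adj = westfall_young_maxt(x, labels)
```

Because the null distribution of the maximum is estimated from the data with its correlations preserved, the adjustment adapts to dependence among the tests instead of assuming the worst case.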

Collaboration


Dive into Marloes H. Maathuis's collaborations.

Top Co-Authors

Jon A. Wellner

University of Washington

Michael G. Hudgens

University of North Carolina at Chapel Hill
