Clark Glymour
Carnegie Mellon University
Publication
Featured research published by Clark Glymour.
Social Science Computer Review | 1991
Peter Spirtes; Clark Glymour
Previous asymptotically correct algorithms for recovering causal structure from sample probabilities have been limited, even in sparse causal graphs, to a few variables. We describe an asymptotically correct algorithm whose complexity for fixed graph connectivity increases polynomially in the number of vertices, and which may in practice recover sparse graphs with several hundred variables. From sample data with n = 20,000, an implementation of the algorithm on a DECstation 3100 recovers the edges in a linear version of the ALARM network with 37 vertices and 46 edges. Fewer than 8% of the undirected edges are incorrectly identified in the output. Without prior ordering information, the program also determines the direction of edges for the ALARM graph with an error rate of 14%. Processing time is less than 10 seconds. Keywords: DAGs, causal modelling.
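To give a concrete flavor of the constraint-based approach the abstract describes, here is a minimal Python sketch of the skeleton-recovery phase, in the spirit of the algorithm: an edge is removed whenever its endpoints test as conditionally independent given some subset of current neighbors. The Fisher-z partial-correlation test, the alpha threshold, and all names are illustrative choices, not the authors' implementation.

    import itertools
    import numpy as np
    from scipy import stats

    def independent(corr, n, i, j, cond, alpha=0.05):
        # Fisher-z test of X_i independent of X_j given X_cond,
        # using the partial correlation from the inverse submatrix.
        idx = [i, j] + list(cond)
        prec = np.linalg.inv(corr[np.ix_(idx, idx)])
        r = -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])
        r = np.clip(r, -0.9999999, 0.9999999)   # guard against log overflow
        z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(n - len(cond) - 3)
        return 2 * stats.norm.sf(abs(z)) > alpha  # large p-value -> independent

    def skeleton(data, alpha=0.05):
        n, p = data.shape
        corr = np.corrcoef(data, rowvar=False)
        adj = {i: set(range(p)) - {i} for i in range(p)}
        depth = 0
        while any(len(adj[i]) - 1 >= depth for i in adj):
            for i, j in itertools.permutations(range(p), 2):
                if j in adj[i] and len(adj[i] - {j}) >= depth:
                    for cond in itertools.combinations(sorted(adj[i] - {j}), depth):
                        if independent(corr, n, i, j, cond, alpha):
                            adj[i].discard(j)
                            adj[j].discard(i)
                            break
            depth += 1
        return adj  # undirected skeleton; orienting edges is a separate step

Because each test conditions only on current neighbors of an endpoint, the number of tests grows polynomially in the number of vertices when graph connectivity is held fixed, which is the complexity property the abstract emphasizes.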
NeuroImage | 2010
Joseph Ramsey; Stephen José Hanson; Catherine Hanson; Yaroslav O. Halchenko; Russell A. Poldrack; Clark Glymour
Neuroimaging (e.g. fMRI) data are increasingly used to attempt to identify not only brain regions of interest (ROIs) that are especially active during perception, cognition, and action, but also the qualitative causal relations among activity in these regions (known as effective connectivity; Friston, 1994). Previous investigations and anatomical and physiological knowledge may somewhat constrain the possible hypotheses, but there often remains a vast space of possible causal structures. To find actual effective connectivity relations, search methods must accommodate indirect measurements of nonlinear time series dependencies, feedback, multiple subjects possibly varying in identified regions of interest, and unknown possible location-dependent variations in BOLD response delays. We describe combinations of procedures that, under these conditions, find feed-forward substructure characteristic of a group of subjects. The method is illustrated with an empirical data set and confirmed with simulated time series generated from random nonlinear effective connectivities with feedback, subject to random differences in BOLD delays, with regions of interest missing at random for some subjects, and measured with noise approximating the signal-to-noise ratio of the empirical data.
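A rough sketch of the multi-subject idea, assuming linear-Gaussian relations for simplicity: score each candidate graph on every subject's data and let the search optimize the average score. The published method, IMaGES, embeds such an averaged score in a greedy equivalence-class search; the function names below are hypothetical.

    import numpy as np

    def bic_score(data, edges):
        # BIC of a linear-Gaussian DAG: one least-squares regression
        # of each node on its parents; edges is a list of (parent, child).
        n, p = data.shape
        total = 0.0
        for child in range(p):
            parents = [a for a, b in edges if b == child]
            X = np.column_stack([np.ones(n)] + [data[:, a] for a in parents])
            beta, *_ = np.linalg.lstsq(X, data[:, child], rcond=None)
            resid = data[:, child] - X @ beta
            total += (-0.5 * n * np.log(resid @ resid / n)
                      - 0.5 * X.shape[1] * np.log(n))
        return total

    def group_score(datasets, edges):
        # Averaging across subjects favors structure shared by the group
        # while tolerating subject-level variation.
        return np.mean([bic_score(d, edges) for d in datasets])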
Data Mining and Knowledge Discovery | 1997
Clark Glymour; David Madigan; Daryl Pregibon; Padhraic Smyth
Data mining is on the interface of Computer Science and Statistics, utilizing advances in both disciplines to make progress in extracting information from large databases. It is an emerging field that has attracted much attention in a very short period of time. This article highlights some statistical themes and lessons that are directly relevant to data mining and attempts to identify opportunities where close cooperation between the statistical and computational communities might reasonably provide synergy for further progress in data analysis.
Communications of The ACM | 1996
Clark Glymour; David Madigan; Daryl Pregibon; Padhraic Smyth
… in a database. For many reasons (encoding errors, measurement errors, unrecorded causes of recorded features) the information in a database is almost always noisy; therefore, inference from databases invites applications of the theory of probability. From a statistical point of view, databases are usually uncontrolled convenience samples; therefore data mining poses a collection of interesting, difficult, sometimes impossible inference problems, raising many issues, some well studied and others unexplored or at least unsettled. Data mining almost always involves a search architecture requiring evaluation of hypotheses at the stages of the search, evaluation of the search output, and appropriate use of the results. Statistics has little to offer in understanding search architectures, but a great deal to offer in evaluating hypotheses in the course of a search, in evaluating the results of a search, and in understanding the appropriate uses of the results.
Multivariate Behavioral Research | 1998
Richard Scheines; Peter Spirtes; Clark Glymour; Christopher Meek; Thomas S. Richardson
The statistical community has brought logical rigor and mathematical precision to the problem of using data to make inferences about a model's parameter values. The TETRAD project, and related work in computer science and statistics, aims to apply those standards to the problem of using data and background knowledge to make inferences about a model's specification. We begin by drawing the analogy between parameter estimation and model specification search. We then describe how the specification of a structural equation model entails familiar constraints on the covariance matrix for all admissible values of its parameters; we survey results on the equivalence of structural equation models, and we discuss search strategies for model specification. We end by presenting several algorithms that are implemented in the TETRAD II program.
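As a concrete instance of a specification entailing covariance constraints, consider a one-factor model X_i = b_i*L + e_i with independent errors: every covariance is b_i*b_j*Var(L), so the tetrad difference S13*S24 - S14*S23 vanishes for all admissible parameter values. The numeric check below is illustrative only, not taken from the article.

    import numpy as np

    rng = np.random.default_rng(0)
    b = rng.uniform(0.5, 1.5, size=4)             # arbitrary factor loadings
    L = rng.normal(size=200_000)                  # latent common cause
    X = np.outer(L, b) + rng.normal(size=(200_000, 4))
    S = np.cov(X, rowvar=False)
    print(S[0, 2] * S[1, 3] - S[0, 3] * S[1, 2])  # ~0 up to sampling error

Testable constraints of this kind are what specification search exploits: a candidate model is rejected when the data violate constraints it entails for every parameter setting.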
Artificial Intelligence in Medicine | 1997
Gregory F. Cooper; Constantin F. Aliferis; Richard Ambrosino; John M. Aronis; Bruce G. Buchanan; Rich Caruana; Michael J. Fine; Clark Glymour; Geoffrey J. Gordon; Barbara H. Hanusa; Janine E. Janosky; Christopher Meek; Tom M. Mitchell; Thomas S. Richardson; Peter Spirtes
This paper describes the application of eight statistical and machine-learning methods to derive computer models for predicting mortality of hospital patients with pneumonia from their findings at initial presentation. The eight models were each constructed from 9847 patient cases and each evaluated on 4352 additional cases. The primary evaluation metric was the error in predicted survival as a function of the fraction of patients predicted to survive. This metric is useful in assessing a model's potential to assist a clinician in deciding whether to treat a given patient in the hospital or at home. We examined the error rates of the models when predicting that a given fraction of patients will survive, for survival fractions between 0.1 and 0.6. Over this range, each model's predictive error rate was within 1% of the error rate of every other model. When predicting that approximately 30% of the patients will survive, all the models have an error rate of less than 1.5%. The models are distinguished more by the number of variables and parameters that they contain than by their error rates; these differences suggest which models may be the most amenable to future implementation as paper-based guidelines.
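A hedged sketch of the evaluation metric as the abstract describes it (not the paper's code): rank patients by predicted survival probability, predict survival for the top fraction, and report the death rate within that fraction.

    import numpy as np

    def error_at_fraction(p_survive, died, fraction):
        # Error rate among the `fraction` of patients most confidently
        # predicted to survive; `died` is a 0/1 array (1 = patient died).
        order = np.argsort(-p_survive)            # likeliest survivors first
        k = int(round(fraction * len(p_survive)))
        return died[order[:k]].mean()

    # Sweeping fraction over 0.1 ... 0.6 produces the kind of curve the
    # paper compares across its eight models.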
Sociological Methods & Research | 1998
Peter Spirtes; Thomas S. Richardson; Christopher Meek; Richard Scheines; Clark Glymour
A linear structural equation model (SEM) without free parameters has two parts: a probability distribution and an associated path diagram corresponding to the causal relations among variables specified by the structural equations and the correlations among the error terms. This article shows how path diagrams can be used to solve a number of important problems in structural equation modeling, for example: How much do sample data underdetermine the correct model specification? Given that there are equivalent models, is it possible to extract the features common to those models? When a modeler draws conclusions about coefficients in an unknown underlying SEM from a multivariate regression, precisely what assumptions are being made about the SEM? The authors explain how the path diagram provides much more than heuristics for special cases; the theory of path diagrams helps to clarify several of the issues just noted.
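To make the underdetermination point concrete, here is a toy simulation (not from the article) in which the models X -> Y and Y -> X, with suitably matched parameters, imply the same covariance matrix and therefore cannot be distinguished by sample data alone.

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200_000
    # Model A: X -> Y with coefficient 0.5 and error variance 0.75.
    x = rng.normal(size=n)
    y = 0.5 * x + rng.normal(scale=np.sqrt(0.75), size=n)
    # Model B: Y -> X with the matching parameters.
    y2 = rng.normal(size=n)
    x2 = 0.5 * y2 + rng.normal(scale=np.sqrt(0.75), size=n)
    print(np.cov(x, y))     # both approximately [[1, 0.5], [0.5, 1]]
    print(np.cov(x2, y2))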
Historical Studies in the Physical Sciences | 1980
John Earman; Clark Glymour
The gravitational deflection of light, verified by photographing a solar eclipse (Campbell, Eddington).
NeuroImage | 2011
Joseph Ramsey; Stephen José Hanson; Clark Glymour
Smith et al. report a large study of the accuracy of 38 search procedures for recovering effective connections in simulations of DCM models under 28 different conditions. Their results are disappointing: no method reliably finds and directs connections without large false negatives, large false positives, or both. Using multiple subject inputs, we apply a previously published search algorithm, IMaGES, and novel orientation algorithms, LOFS, in tandem to all of the simulations of DCM models described by Smith et al. (2011). We find that the procedures accurately identify effective connections in almost all of the conditions that Smith et al. simulated and, in most conditions, direct causal connections with precision greater than 90% and recall greater than 80%.
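The precision and recall figures quoted above are computed over directed edge sets; a minimal illustration (not the authors' code):

    def precision_recall(found, true):
        # found/true are sets of directed edges, e.g. {(0, 1), (2, 1)}.
        found, true = set(found), set(true)
        tp = len(found & true)
        return tp / len(found), tp / len(true)

    # precision_recall({(0, 1), (1, 2)}, {(0, 1), (2, 1)}) -> (0.5, 0.5)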
Trends in Cognitive Sciences | 2003
Clark Glymour
Recent research in cognitive and developmental psychology on acquiring and using causal knowledge uses the causal Bayes net formalism, which simultaneously represents hypotheses about causal relations, probability relations, and effects of interventions. The formalism provides new normative standards for reinterpreting experiments on human judgment, offers a precise interpretation of mechanisms, and allows generalizations of existing theories of causal learning. Combined with hypotheses about learning algorithms, the formalism makes predictions about inferences in many experimental designs beyond the classical, Pavlovian cue → effect design.
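A toy illustration, with hypothetical variables and probabilities, of why the formalism separates observing from intervening: with a hidden common cause, P(effect | cause) and P(effect | do(cause)) differ, and do() is modeled by cutting the arrows into the intervened variable.

    import numpy as np

    rng = np.random.default_rng(2)

    def sample(n, do_smoke=None):
        # Graph: Gene -> Smoke, Gene -> Cancer, Smoke -> Cancer.
        gene = rng.random(n) < 0.5                    # hidden common cause
        if do_smoke is None:
            smoke = rng.random(n) < np.where(gene, 0.8, 0.2)
        else:
            smoke = np.full(n, do_smoke)              # do(): cut Gene -> Smoke
        cancer = rng.random(n) < 0.05 + 0.1 * smoke + 0.1 * gene
        return smoke, cancer

    smoke, cancer = sample(500_000)
    print(cancer[smoke].mean())      # P(cancer | smoke = 1), about 0.23
    smoke_do, cancer_do = sample(500_000, do_smoke=True)
    print(cancer_do.mean())          # P(cancer | do(smoke = 1)), about 0.20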