Dan Geiger

Technion – Israel Institute of Technology

Publication

Featured research published by Dan Geiger.

Machine Learning | 1997

Bayesian Network Classifiers

Nir Friedman; Dan Geiger; Moises Goldszmidt

Recent work in supervised learning has shown that a surprisingly simple Bayesian classifier with strong assumptions of independence among features, called naive Bayes, is competitive with state-of-the-art classifiers such as C4.5. This fact raises the question of whether a classifier with less restrictive assumptions can perform even better. In this paper we evaluate approaches for inducing classifiers from data, based on the theory of learning Bayesian networks. These networks are factored representations of probability distributions that generalize the naive Bayesian classifier and explicitly represent statements about independence. Among these approaches we single out a method we call Tree Augmented Naive Bayes (TAN), which outperforms naive Bayes, yet at the same time maintains the computational simplicity (no search involved) and robustness that characterize naive Bayes. We experimentally tested these approaches, using problems from the University of California at Irvine repository, and compared them to C4.5, naive Bayes, and wrapper methods for feature selection.
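The independence assumption behind naive Bayes can be illustrated with a minimal sketch. It is not the paper's code: the function names, the add-alpha smoothing, and the toy data in the usage note are assumptions made for illustration, and the extra tree edges that TAN adds between features are not shown.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Count statistics for a categorical naive Bayes model.

    alpha is an add-alpha smoothing constant (an assumption of this sketch).
    """
    class_counts = Counter(labels)
    # feature_counts[(feature index, class)][value] = number of occurrences
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values observed for each feature
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the class maximizing log P(class) + sum_i log P(feature_i | class)."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)
        for i, v in enumerate(row):
            count = feature_counts[(i, label)][v]
            # Each feature contributes independently given the class.
            score += math.log((count + alpha) / (n_c + alpha * len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

For example, training on the rows ("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool") with labels "no", "no", "yes", "yes" and then calling predict on ("sunny", "hot") returns "no".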

Machine Learning | 1995

Learning Bayesian Networks: The Combination of Knowledge and Statistical Data

David Heckerman; Dan Geiger; David Maxwell Chickering

We describe a Bayesian approach for learning Bayesian networks from a combination of prior knowledge and statistical data. First and foremost, we develop a methodology for assessing informative priors needed for learning. Our approach is derived from a set of assumptions made previously as well as the assumption of likelihood equivalence, which says that data should not help to discriminate network structures that represent the same assertions of conditional independence. We show that likelihood equivalence, when combined with previously made assumptions, implies that the user's priors for network parameters can be encoded in a single Bayesian network for the next case to be seen (a prior network) and a single measure of confidence for that network. Second, using these priors, we show how to compute the relative posterior probabilities of network structures given data. Third, we describe search methods for identifying network structures with high posterior probabilities. We describe polynomial algorithms for finding the highest-scoring network structures in the special case where every node has at most k = 1 parent. For the general case (k > 1), which is NP-hard, we review heuristic search algorithms including local search, iterative local search, and simulated annealing. Finally, we describe a methodology for evaluating Bayesian-network learning algorithms, and apply this approach to a comparison of various approaches.

Networks | 1990

Identifying independence in Bayesian networks

Dan Geiger; Thomas Verma; Judea Pearl

An important feature of Bayesian networks is that they facilitate explicit encoding of information about independencies in the domain, information that is indispensable for efficient inferencing. This article characterizes all independence assertions that logically follow from the topology of a network and develops a linear time algorithm that identifies these assertions. The algorithm's correctness is based on the soundness of a graphical criterion, called d-separation, and its optimality stems from the completeness of d-separation. An enhanced version of d-separation, called D-separation, is defined, extending the algorithm to networks that encode functional dependencies. Finally, the algorithm is shown to work for a broad class of nonprobabilistic independencies.
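A d-separation query can be sketched as a reachability search over (node, direction) pairs. This is a commonly taught formulation and not necessarily the algorithm given in the article; the function name and the dictionary-of-parents graph encoding are assumptions. It handles plain d-separation only, not the D-separation extension for functional dependencies. Each (node, direction) pair is expanded at most once, so the work grows linearly with the size of the graph.

```python
from collections import deque


def d_separated(parents, x, y, given):
    """Return True if x and y are d-separated by the set `given`.

    parents: dict mapping each node of a DAG to a list of its parents
             (hypothetical encoding chosen for this sketch).
    """
    given = set(given)
    children = {}
    for node, ps in parents.items():
        children.setdefault(node, [])
        for p in ps:
            children.setdefault(p, []).append(node)

    # Phase 1: collect the observed nodes together with all their ancestors.
    ancestors = set()
    stack = list(given)
    while stack:
        node = stack.pop()
        if node not in ancestors:
            ancestors.add(node)
            stack.extend(parents.get(node, []))

    # Phase 2: follow active trails. "up" means the trail arrived from a
    # child of the node; "down" means it arrived from a parent.
    queue = deque([(x, "up")])
    visited = set()
    while queue:
        node, direction = queue.popleft()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node == y and node not in given:
            return False  # an active trail reaches y
        if direction == "up" and node not in given:
            for p in parents.get(node, []):
                queue.append((p, "up"))
            for c in children.get(node, []):
                queue.append((c, "down"))
        elif direction == "down":
            if node not in given:
                # Chain: keep moving down to the children.
                for c in children.get(node, []):
                    queue.append((c, "down"))
            if node in ancestors:
                # Collider that is observed or has an observed descendant.
                for p in parents.get(node, []):
                    queue.append((p, "up"))
    return True
```

In a toy network where "wet" has parents "rain" and "sprinkler" and "slippery" has parent "wet", the two causes are d-separated given nothing, but become dependent once "wet" or its descendant "slippery" is observed.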

Uncertainty in Artificial Intelligence | 1994

Learning Gaussian networks

Dan Geiger; David Heckerman

We describe scoring metrics for learning Bayesian networks from a combination of user knowledge and statistical data. Previous work has concentrated on metrics for domains containing only discrete variables, under the assumption that data represents a multinomial sample. In this paper, we extend this work, developing scoring metrics for domains containing only continuous variables under the assumption that continuous data is sampled from a multivariate normal distribution. Our work extends traditional statistical approaches for identifying vanishing regression coefficients in that we identify two important assumptions, called event equivalence and parameter modularity, that when combined allow the construction of prior distributions for multivariate normal parameters from a single prior Bayesian network specified by a user.

Artificial Intelligence | 1996

Knowledge representation and inference in similarity networks and Bayesian multinets

Dan Geiger; David Heckerman

We examine two representation schemes for uncertain knowledge: the similarity network (Heckerman, 1991) and the Bayesian multinet. These schemes are extensions of the Bayesian network model in that they represent asymmetric independence assertions. We explicate the notion of relevance upon which similarity networks are based and present an efficient inference algorithm that works under the assumption that every event has a nonzero probability. Another inference algorithm is developed that works under no restriction, albeit less efficiently. We show that similarity networks are not inferentially complete; namely, not every query can be answered. Nonetheless, we show that a similarity network can always answer any query of the form: “What is the posterior probability of an hypothesis given evidence?” We call this property diagnostic completeness. Finally, we describe a generalization of similarity networks that can encode more types of asymmetric conditional independence assertions than can ordinary similarity networks.

Annals of Statistics | 2006

On the toric algebra of graphical models

Dan Geiger; Christopher Meek; Bernd Sturmfels

We formulate necessary and sufficient conditions for an arbitrary discrete probability distribution to factor according to an undirected graphical model, or a log-linear model, or other more general exponential models. For decomposable graphical models these conditions are equivalent to a set of conditional independence statements similar to the Hammersley-Clifford theorem; however, we show that for nondecomposable graphical models they are not. We also show that nondecomposable models can have nonrational maximum likelihood estimates. These results are used to give several novel characterizations of decomposable graphical models.

American Journal of Human Genetics | 2005

A Mutation in SNAP29, Coding for a SNARE Protein Involved in Intracellular Trafficking, Causes a Novel Neurocutaneous Syndrome Characterized by Cerebral Dysgenesis, Neuropathy, Ichthyosis, and Palmoplantar Keratoderma

Eli Sprecher; Akemi Ishida-Yamamoto; Mordechai Mizrahi-Koren; Debora Rapaport; Dorit Goldsher; Margarita Indelman; Orit Topaz; Ilana Chefetz; Hanni Keren; Timothy J. O’Brien; Dani Bercovich; Stavit A. Shalev; Dan Geiger; Reuven Bergman; Mia Horowitz; Hanna Mandel

Neurocutaneous syndromes represent a vast, largely heterogeneous group of disorders characterized by neurological and dermatological manifestations, reflecting the common embryonic origin of epidermal and neural tissues. In the present report, we describe a novel neurocutaneous syndrome characterized by cerebral dysgenesis, neuropathy, ichthyosis, and keratoderma (CEDNIK syndrome). Using homozygosity mapping in two large families, we localized the disease gene to 22q11.2 and identified, in all patients, a 1-bp deletion in SNAP29, which codes for a SNARE protein involved in vesicle fusion. SNAP29 expression was decreased in the skin of the patients, resulting in abnormal maturation of lamellar granules and, as a consequence, in mislocation of epidermal lipids and proteases. These data underscore the importance of vesicle trafficking regulatory mechanisms for proper neuroectodermal differentiation.

Molecular and Cellular Biology | 2005

Polyadenylation and Degradation of Human Mitochondrial RNA: the Prokaryotic Past Leaves Its Mark

Shimyn Slomovic; David Laufer; Dan Geiger; Gadi Schuster

RNA polyadenylation serves a purpose in bacteria and organelles opposite from the role it plays in nuclear systems. The majority of nucleus-encoded transcripts are characterized by stable poly(A) tails at their mature 3′ ends, which are essential for stabilization and translation initiation. In contrast, in bacteria, chloroplasts, and plant mitochondria, polyadenylation is a transient feature which promotes RNA degradation. Surprisingly, in spite of their prokaryotic origin, human mitochondrial transcripts possess stable 3′-end poly(A) tails, akin to nucleus-encoded mRNAs. Here we asked whether human mitochondria retain truncated and transiently polyadenylated transcripts in addition to stable 3′-end poly(A) tails, which would be consistent with the preservation of the largely ubiquitous polyadenylation-dependent RNA degradation mechanisms of bacteria and organelles. To this end, using both molecular and bioinformatic methods, we sought and revealed numerous examples of such molecules, dispersed throughout the mitochondrial genome. The broad distribution but low abundance of these polyadenylated truncated transcripts strongly suggests that polyadenylation-dependent RNA degradation occurs in human mitochondria. The coexistence of this system with stable 3′-end polyadenylation, despite their seemingly opposite effects, is so far unprecedented in bacteria and other organelles.

International Conference on Supercomputing | 2008

Efficient computation of sum-products on GPUs through software-managed cache

Mark Silberstein; Assaf Schuster; Dan Geiger; Anjul Patney; John D. Owens

We present a technique for designing memory-bound algorithms with high data reuse on Graphics Processing Units (GPUs) equipped with close-to-ALU software-managed memory. The approach is based on the efficient use of this memory through the implementation of a software-managed cache. We also present an analytical model for performance analysis of such algorithms. We apply this technique to the implementation of the GPU-based solver of the sum-product or marginalize a product of functions (MPF) problem, which arises in a wide variety of real-life applications in artificial intelligence, statistics, image processing, and digital communications. Our motivation to accelerate MPF originated in the context of the analysis of genetic diseases, which in some cases requires years to complete on modern CPUs. Computing MPF is similar to computing the chain matrix product of multi-dimensional matrices, but is more difficult due to a complex data-dependent access pattern, high data reuse, and a low compute-to-memory access ratio. Our GPU-based MPF solver achieves up to 2700-fold speedup on random data and 270-fold on real-life genetic analysis datasets on a GeForce 8800GTX GPU from NVIDIA over the optimized CPU version on an Intel 2.4GHz Core 2 with a 4MB L2 cache.
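The MPF problem itself can be stated in a few lines. The following is a brute-force CPU sketch that only defines the computation; it is not the paper's GPU solver and uses no elimination ordering or software-managed cache. The function name and the table encoding are assumptions. With two factors f(a, b) and g(b, c), summing out b reproduces an ordinary matrix product, which is the analogy the abstract draws.

```python
from itertools import product


def marginalize_product(factors, domains, keep):
    """Sum the product of all factors over every variable not in `keep`.

    factors: list of (variables, table) pairs; table maps a tuple of values,
             ordered as in `variables`, to a number.
    domains: dict mapping each variable name to its list of values.
    keep:    tuple of variables that remain in the result.
    """
    all_vars = list(domains)
    result = {}
    # Enumerate every joint assignment (exponential; for illustration only).
    for assignment in product(*(domains[v] for v in all_vars)):
        env = dict(zip(all_vars, assignment))
        term = 1.0
        for variables, table in factors:
            term *= table[tuple(env[v] for v in variables)]
        key = tuple(env[v] for v in keep)
        result[key] = result.get(key, 0.0) + term
    return result


# Two 2x2 tables: f is indexed by (a, b) and g by (b, c).
f = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}
g = {(0, 0): 5, (0, 1): 6, (1, 0): 7, (1, 1): 8}
domains = {"a": [0, 1], "b": [0, 1], "c": [0, 1]}
out = marginalize_product([(("a", "b"), f), (("b", "c"), g)], domains, ("a", "c"))
```

Here out[(a, c)] equals the (a, c) entry of the matrix product of f and g; for instance out[(0, 0)] is 1*5 + 2*7 = 19.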

Uncertainty in Artificial Intelligence | 1990

d-Separation: From Theorems to Algorithms

Dan Geiger; Thomas Verma; Judea Pearl

An efficient algorithm is developed that identifies all independencies implied by the topology of a Bayesian network. Its correctness and maximality stem from the soundness and completeness of d-separation with respect to probability theory. The algorithm runs in time O(|E|), where E is the number of edges in the network.
