L.C. van der Gaag
Utrecht University
Publication
Featured research published by L.C. van der Gaag.
IEEE Transactions on Knowledge and Data Engineering | 2000
Marek J. Druzdzel; L.C. van der Gaag
Probabilistic networks are now fairly well established as practical representations of knowledge for reasoning under uncertainty, as demonstrated by an increasing number of successful applications in such domains as (medical) diagnosis and prognosis, planning, vision, information retrieval, and natural language processing. A probabilistic network (also referred to as a belief network, Bayesian network, or, somewhat imprecisely, causal network) consists of a graphical structure, encoding a domain’s variables and the qualitative relationships between them, and a quantitative part, encoding probabilities over the variables [29]. Building a probabilistic network for a domain of application involves three tasks. The first of these is to identify the variables that are of importance, along with their possible values. Once the important domain variables have been identified, the second task is to identify the relationships between the variables discerned and to express these in a graphical structure. Eliciting the variables and values of importance, as well as the relationships between them, from domain experts is comparable, to at least some extent, to knowledge engineering for other artificial-intelligence representations and, although it may require significant effort, is generally considered doable. The last task in building a probabilistic network is to obtain the probabilities that are required for its quantitative part. This task often appears more daunting: “Where do the numbers come from?” is a commonly asked question. The three tasks in building a probabilistic network are, in principle, performed one after the other. Building a network, however, often requires a careful trade-off between the desire for a large and rich model to obtain accurate results on the one hand, and the costs of construction and maintenance and the complexity of probabilistic inference on the other hand. 
In practice, therefore, building a probabilistic network is a process that iterates over these tasks until a network results that is deemed requisite. In collaboration with Finn V. Jensen and Max Henrion, we organised in 1995 a workshop devoted to the theme of obtaining the numbers, the most daunting task in building probabilistic networks [14]. The workshop was held in conjunction with the Fourteenth International Joint Conference on Artificial Intelligence (IJCAI’95) and had a programme of presentations of selected contributions and ample slots for flash communications and discussion. Scientists from such disciplines as decision analysis, statistics, and computer science attended the workshop. The interest in the workshop, both during IJCAI’95 and afterwards, prompted us to follow up on the theme. The current issue of IEEE Transactions on Knowledge and Data Engineering is the result.
Artificial Intelligence in Medicine | 2002
L.C. van der Gaag; Silja Renooij; Cilia Witteman; B.M.P. Aleman; Babs G. Taal
With the help of two experts in gastrointestinal oncology from The Netherlands Cancer Institute, Antoni van Leeuwenhoekhuis, a decision-support system is being developed for patient-specific therapy selection for oesophageal cancer. The kernel of the system is a probabilistic network that describes the presentation characteristics of cancer of the oesophagus and the pathophysiological processes of invasion and metastasis. While the construction of the graphical structure of the network was relatively straightforward, probability elicitation with existing methods proved to be a major obstacle. To overcome this obstacle, we designed a new method for eliciting probabilities from experts that combines the ideas of transcribing probabilities as fragments of text and of using a scale with both numerical and verbal anchors for marking assessments. In this paper, we report experiences with our method in eliciting the probabilities required for the oesophagus network. The method allowed us to elicit many probabilities in reasonable time. To gain some insight into the quality of the probabilities obtained, we conducted a preliminary evaluation study of our network, using data from real patients. We found that for 85% of the patients, the network predicted the correct cancer stage.
The Computer Journal | 1996
L.C. van der Gaag
In artificial intelligence research, the belief network framework for automated reasoning with uncertainty is rapidly gaining in popularity. The framework provides a powerful formalism for representing a joint probability distribution on a set of statistical variables. In addition, it offers algorithms for efficient probabilistic inference. At present, more and more knowledge-based systems employing the framework are being developed for various domains of application ranging from probabilistic information retrieval to medical diagnosis. This paper provides a tutorial introduction to the belief network framework and highlights some issues of ongoing research in applying the framework for real-life problem solving.
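As a rough illustration of how a belief network represents a joint probability distribution, the sketch below factorises a joint distribution over three binary variables into local probability tables and answers a query by brute-force enumeration. The network structure (Disease influencing Test and Symptom), the variable names, and all numbers are hypothetical and are not taken from the paper; practical systems use far more efficient inference algorithms than enumeration.

```python
from itertools import product

# Hypothetical network: Disease -> Test, Disease -> Symptom.
# All probabilities below are invented for illustration only.
p_disease = {True: 0.01, False: 0.99}
p_test_given_disease = {
    True: {True: 0.95, False: 0.05},
    False: {True: 0.10, False: 0.90},
}
p_symptom_given_disease = {
    True: {True: 0.80, False: 0.20},
    False: {True: 0.05, False: 0.95},
}


def joint(d, t, s):
    """Joint probability as the product of the local probability tables."""
    return (
        p_disease[d]
        * p_test_given_disease[d][t]
        * p_symptom_given_disease[d][s]
    )


def posterior_disease(test=None, symptom=None):
    """P(Disease=True | evidence), summing the joint over all consistent states."""
    numerator = 0.0
    denominator = 0.0
    for d, t, s in product([True, False], repeat=3):
        if test is not None and t != test:
            continue  # state contradicts the observed test result
        if symptom is not None and s != symptom:
            continue  # state contradicts the observed symptom
        p = joint(d, t, s)
        denominator += p
        if d:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print(posterior_disease())                         # prior, no evidence
    print(posterior_disease(test=True))                # after a positive test
    print(posterior_disease(test=True, symptom=True))  # after both findings
```

Storing three small tables instead of one table over all eight joint states is the point of the factorisation; the saving grows quickly with the number of variables.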
Journal of Dairy Science | 2010
W. Steeneveld; L.C. van der Gaag; W. Ouweltjes; H. Mollenhorst; H. Hogeveen
Automatic milking systems (AMS) generate alert lists reporting cows likely to have clinical mastitis (CM). Dutch farmers indicated that they use non-AMS cow information or the detailed alert information from the AMS to decide whether to check an alerted cow for CM. However, it is not yet known to what extent such information can be used to discriminate between true-positive and false-positive alerts. The overall objective was to investigate whether selection of the alerted cows that need further investigation for CM can be made. For this purpose, non-AMS cow information and detailed alert information were used. During a 2-yr study period, 11,156 alerts for CM, including 159 true-positive alerts, were collected at one farm in The Netherlands. Non-AMS cow information on parity, days in milk, season of the year, somatic cell count history, and CM history was added to each alert. In addition, 6 alert information variables were defined. These were the height of electrical conductivity, the alert origin (electrical conductivity, color, or both), whether or not a color alert for mastitic milk was given, whether or not a color alert for abnormal milk was given, deviation from the expected milk yield, and the number of alerts of the cow in the preceding 12 to 96 h. Subsequently, naive Bayesian networks (NBN) were constructed to compute the posterior probability of an alert being truly positive based only on non-AMS cow information, based on only alert information, or based on both types of information. The NBN including both types of information had the highest area under the receiver operating characteristic curve (AUC; 0.78), followed by the NBN including only alert information (AUC=0.75) and the NBN including only non-AMS cow information (AUC=0.62). 
By combining the 2 types of information and by setting a threshold on the computed probabilities, the number of false-positive alerts on a mastitis alert list was reduced by 35%, and 10% of the true-positive alerts would not be identified. To detect CM cases at a farm with an AMS, checking all alerts is still the best option but would result in a high workload. Checking alerts based on a single alert information variable would result in missing too many true-positive cases. Using a combination of alert information variables, however, is the best way to select cows that need further investigation. The effect of adding non-AMS cow information on making a distinction between true-positive and false-positive alerts would be minor.
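The posterior computation and thresholding described above can be sketched as follows. This is a minimal naive Bayes calculation, not the authors' implementation: the prior uses the alert counts reported in the abstract (159 true-positive alerts out of 11,156), whereas the feature names, the conditional probabilities, and the threshold value are hypothetical placeholders.

```python
import math


def naive_bayes_posterior(prior, likelihoods, evidence):
    """Posterior over classes, assuming features are independent given the class.

    prior:       {class: P(class)}
    likelihoods: {feature: {class: {value: P(value | class)}}}
    evidence:    {feature: observed value}
    """
    log_scores = {}
    for cls, p in prior.items():
        score = math.log(p)
        for feature, value in evidence.items():
            score += math.log(likelihoods[feature][cls][value])
        log_scores[cls] = score
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    unnormalised = {cls: math.exp(s - top) for cls, s in log_scores.items()}
    total = sum(unnormalised.values())
    return {cls: u / total for cls, u in unnormalised.items()}


def keep_on_list(posterior, threshold):
    """Keep an alert for checking only if P(true positive) reaches the threshold."""
    return posterior["true_positive"] >= threshold


# Prior from the counts reported in the abstract.
prior = {"true_positive": 159 / 11156, "false_positive": 1 - 159 / 11156}

# Hypothetical conditional probabilities, for illustration only.
likelihoods = {
    "color_alert": {
        "true_positive": {True: 0.60, False: 0.40},
        "false_positive": {True: 0.05, False: 0.95},
    },
    "cm_history": {
        "true_positive": {True: 0.30, False: 0.70},
        "false_positive": {True: 0.10, False: 0.90},
    },
}

if __name__ == "__main__":
    post = naive_bayes_posterior(
        prior, likelihoods, {"color_alert": True, "cm_history": False}
    )
    print(post, keep_on_list(post, threshold=0.1))
```

Raising the threshold removes more false-positive alerts from the list at the cost of missing more true-positive ones, which is the trade-off the study quantifies.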
International Conference on Knowledge Capture | 2005
Eveline M. Helsper; L.C. van der Gaag; Ad Feelders; W.L.A. Loeffen; Petra L. Geenen; A.R.W. Elbers
Among the tasks involved in building a Bayesian network, obtaining the required probabilities is generally considered the most daunting. Available data collections are often too small to allow for estimating reliable probabilities. Most domain experts, on the other hand, consider assessing the numbers to be quite demanding. Qualitative probabilistic knowledge, however, is provided more easily by experts. We propose a method for obtaining probabilities that uses qualitative expert knowledge to constrain the probabilities learned from a small data collection. A dedicated elicitation technique is designed to support the acquisition of the qualitative knowledge required for this purpose. We demonstrate the application of our method by quantifying part of a network in the field of classical swine fever.
Journal of Dairy Science | 2009
W. Steeneveld; L.C. van der Gaag; Herman W. Barkema; H. Hogeveen
Clinical mastitis (CM) can be caused by a wide variety of pathogens and farmers must start treatment before the actual causal pathogen is known. By providing a probability distribution for the causal pathogen, naive Bayesian networks (NBN) can serve as a management tool for farmers to decide which treatment to use. The advantage of providing a probability distribution for the causal pathogen, rather than only providing the most likely causal pathogen, is that the uncertainty involved is visible and a more informed treatment decision can be made. The objective of this study was to illustrate provision of probability distributions for the gram status and for the causal pathogen for CM cases. For constructing the NBN, data were used from 274 Dutch dairy herds in which the occurrence of CM was recorded over an 18-mo period. The data set contained information on 3,833 CM cases. Two-thirds of the data set was used for the construction process and one-third was retained for validation. One NBN was constructed with the CM cases classified according to their gram status, and another was built with the CM cases classified into streptococci, Staphylococcus aureus, or Escherichia coli. Information usually available at a dairy farm was included in both NBN (parity, month in lactation, season of the year, quarter position, SCC and CM history, being sick or not, and color and texture of the milk). Accuracy was calculated to obtain insight into the quality of the constructed NBN. The accuracy of classifying CM cases into gram-positive or gram-negative pathogens was 73%, while the accuracy of classifying CM cases into streptococci, Staph. aureus, or E. coli was 52%. Because only CM cases with a high probability for a single causal pathogen will be considered for pathogen-specific treatment, accuracies based on only classifying CM cases above a particular probability threshold were determined. 
For instance, for CM cases in which either gram-negative or gram-positive had a probability >0.90, classification according to the gram status reached an accuracy of 97%. We found that the greater the probability for a particular pathogen was for a CM case, the more accurate was the classification of this case as being caused by this pathogen. The probability distributions provided by the NBN and the associated accuracies for varying classification thresholds provide the farmer with considerable insight about the most likely causal pathogen for a CM case and the uncertainty involved.
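The threshold-dependent accuracy described above can be sketched as follows: a case is classified only when its most probable class exceeds the threshold, and accuracy is computed over the classified cases alone. The function name and the toy cases are hypothetical and serve only to show the calculation; they are not data from the study.

```python
def accuracy_above_threshold(cases, threshold):
    """Accuracy restricted to cases whose top posterior exceeds the threshold.

    cases: list of (posterior, true_label), with posterior a {label: probability} dict.
    Returns (accuracy, number_of_classified_cases); accuracy is None if no
    case exceeds the threshold.
    """
    correct = 0
    classified = 0
    for posterior, truth in cases:
        label, prob = max(posterior.items(), key=lambda item: item[1])
        if prob > threshold:
            classified += 1
            if label == truth:
                correct += 1
    accuracy = correct / classified if classified else None
    return accuracy, classified


# Invented validation cases, for illustration only.
toy_cases = [
    ({"gram_pos": 0.95, "gram_neg": 0.05}, "gram_pos"),
    ({"gram_pos": 0.60, "gram_neg": 0.40}, "gram_neg"),
    ({"gram_pos": 0.08, "gram_neg": 0.92}, "gram_neg"),
    ({"gram_pos": 0.55, "gram_neg": 0.45}, "gram_pos"),
]

if __name__ == "__main__":
    for t in (0.5, 0.9):
        print(t, accuracy_above_threshold(toy_cases, t))
```

Reporting the number of classified cases alongside the accuracy matters, because a high threshold typically improves accuracy while leaving fewer cases classified.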
Research in Veterinary Science | 2011
Petra L. Geenen; L.C. van der Gaag; W.L.A. Loeffen; A.R.W. Elbers
For diseases of which the clinical diagnosis is uncertain, naive Bayesian classifiers can be of assistance to the veterinary practitioner. These simple probabilistic models have proven to be very powerful for solving classification problems in a variety of domains, but are not yet widely applied within the veterinary domain. In this paper, naive Bayesian classifiers and methods for their construction are reviewed. We demonstrate how to construct full and selective classifiers from a data set and how to build such classifiers from information in the literature. As a case study, naive Bayesian classifiers to discriminate between classical swine fever (CSF)-infected and non-infected pig herds were constructed from data collected during the 1997/1998 CSF epidemic in the Netherlands. The resulting classifiers were studied in terms of their accuracy and compared with the optimally efficient diagnostic rule that was reported earlier by Elbers et al. (2002). The classifiers were found to have accuracies within the range of 67-70% and performed comparably to or even better than the diagnostic rule on the available data. In contrast with the diagnostic rule, the classifiers had the advantage of taking both the presence and the absence of particular clinical signs into account, which resulted in more discriminative power. These results indicate that naive Bayesian classifiers are promising tools for solving diagnostic problems in the veterinary field.
International Journal of Intelligent Systems | 1998
L.C. van der Gaag; J.-J. Ch. Meyer
The concept of informational independence plays a key role in most knowledge‐based systems. J. Pearl and his co‐researchers analysed the basic properties of the concept and formulated an axiomatic system for informational independence. This axiomatic system focuses on independences among mutually disjoint sets of variables. We show that in the context of probabilistic independence a focus on disjoint sets of variables can hide various interesting properties. To capture these properties, we enhance Pearl's axiomatic system with two additional axioms. We investigate the set of models of the thus enhanced system and show that it provides a better characterization of the concept of probabilistic independence than Pearl's system does. In addition, we observe that both Pearl's axiomatic system and our enhanced system offer inference rules for deriving new independences from an initial set of independence statements and as such allow for a normal form for representing independence. We address the normal forms ensuing from the two axiomatic systems for informational independence.
International Journal of Approximate Reasoning | 1989
L.C. van der Gaag
Most expert knowledge is ill-defined and heuristic. Therefore, many present-day rule-based expert systems include a mechanism for modeling and manipulating imprecise knowledge. For a long time, probability theory has been the primary quantitative approach for handling uncertainty. Other (mathematical) models of uncertainty have been proposed during the last decade, several of which depart from probability theory. In this paper, so-called inference networks are introduced to demonstrate the application of such a model for inexact reasoning in a rule-based top-down reasoning expert system. This approach enables the formulation of a conceptual model for inexact reasoning in rule-based systems. This conceptual model is used to show some inadequacies in the certainty factor model, a model that has been proposed by the authors of the MYCIN system and that has actually been applied in expert systems. A syntactically correct reformulation of the certainty factor model is proposed, and this new formalism is used to discuss some of the model's properties.
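For readers unfamiliar with the certainty factor model, the sketch below shows the combination functions as they are commonly presented in the expert-systems literature (the EMYCIN-style form). It is not the reformulation proposed in the paper, and the function names are hypothetical. Certainty factors are assumed to lie in the interval [-1, 1].

```python
def propagate_cf(cf_rule, cf_evidence):
    """Certainty factor of a rule's conclusion given uncertain evidence.

    A rule contributes only when its evidence is believed to some degree,
    so negative evidence certainty is clipped to zero.
    """
    return cf_rule * max(0.0, cf_evidence)


def combine_cf(x, y):
    """Combine two certainty factors that bear on the same hypothesis."""
    if x >= 0 and y >= 0:
        return x + y * (1 - x)   # both confirm: belief accumulates towards 1
    if x < 0 and y < 0:
        return x + y * (1 + x)   # both disconfirm: disbelief accumulates towards -1
    denominator = 1 - min(abs(x), abs(y))
    if denominator == 0:
        raise ValueError("combination of +1 and -1 is undefined")
    return (x + y) / denominator  # conflicting evidence


if __name__ == "__main__":
    print(combine_cf(0.6, 0.4))    # two confirming rules
    print(combine_cf(0.6, -0.4))   # conflicting rules
    print(propagate_cf(0.8, 0.5))  # rule strength 0.8, evidence certainty 0.5
```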
International Journal of Human-computer Studies / International Journal of Man-machine Studies | 1990
L.C. van der Gaag
In the early years of research into plausible reasoning, several quasi-probabilistic models for handling uncertainty in rule-based expert systems were proposed. These models were computationally feasible but could not be justified mathematically. Although current research in this sub-area of artificial intelligence concentrates on the development of mathematically sound models, the early quasi-probabilistic models are still employed frequently in present-day rule-based expert systems. In this paper we show that two of these models, the certainty factor model developed by E. H. Shortliffe and B. G. Buchanan, and the subjective Bayesian method developed by R. O. Duda, P. E. Hart and N. J. Nilsson, model different notions of uncertainty. We support this statement by pointing out the difference in the interpretation and application of production rules in the respective models.