David W. Opitz
University of Montana
Publication
Featured research published by David W. Opitz.
Journal of Artificial Intelligence Research | 1999
David W. Opitz; Richard Maclin
An ensemble consists of a set of individually trained classifiers (such as neural networks or decision trees) whose predictions are combined when classifying novel instances. Previous research has shown that an ensemble is often more accurate than any of the single classifiers in the ensemble. Bagging (Breiman, 1996c) and Boosting (Freund & Schapire, 1996; Schapire, 1990) are two relatively new but popular methods for producing ensembles. In this paper we evaluate these methods on 23 data sets using both neural networks and decision trees as our classification algorithms. Our results clearly indicate a number of conclusions. First, while Bagging is almost always more accurate than a single classifier, it is sometimes much less accurate than Boosting. On the other hand, Boosting can create ensembles that are less accurate than a single classifier, especially when using neural networks. Analysis indicates that the performance of the Boosting methods is dependent on the characteristics of the data set being examined. In fact, further results show that Boosting ensembles may overfit noisy data sets, thus decreasing their performance. Finally, consistent with previous studies, our work suggests that most of the gain in an ensemble's performance comes in the first few classifiers combined; however, relatively large gains can be seen up to 25 classifiers when Boosting decision trees.
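A minimal sketch of the kind of comparison the abstract describes, using scikit-learn in place of the paper's own Bagging and Boosting implementations; the single data set, base-learner settings, and ensemble size are illustrative assumptions, not the paper's experimental setup.

```python
# Illustrative sketch only: the paper evaluated its own Bagging/AdaBoost-style
# implementations over neural networks and decision trees on 23 data sets;
# here scikit-learn stands in for both, on a single toy data set.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
single = DecisionTreeClassifier(random_state=0)
bagged = BaggingClassifier(DecisionTreeClassifier(), n_estimators=25, random_state=0)
boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=25, random_state=0)

for name, clf in [("single tree", single), ("bagging", bagged), ("boosting", boosted)]:
    scores = cross_val_score(clf, X, y, cv=10)  # 10-fold cross-validated accuracy
    print(f"{name:12s} mean accuracy = {scores.mean():.3f}")
```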
Connection Science | 1996
David W. Opitz; Jude W. Shavlik
A neural network NN ensemble is a very successful technique where the outputs of a set of separately trained NNs are combined to form one unified prediction. An effective ensemble should consist of a set of networks that are not only highly correct, but ones that make their errors on different parts of the input space as well; however, most existing techniques only indirectly address the problem of creating such a set. We present an algorithm called ADDEMUP that uses genetic algorithms to search explicitly for a highly diverse set of accurate trained networks. ADDEMUP works by first creating an initial population, then uses genetic operators to create new networks continually, keeping the set of networks that are highly accurate while disagreeing with each other as much as possible. Experiments on four real-world domains show that ADDEMUP is able to generate a set of trained networks that is more accurate than several existing ensemble approaches. Experiments also show ADDEMUP is able to incorporate prior...
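A rough sketch of the accuracy-plus-diversity idea behind ADDEMUP, not the published algorithm: a small genetic loop keeps networks that score well on a combined fitness of validation accuracy and disagreement with the rest of the population. The data set, fitness weighting, and mutation operator below are assumptions made for illustration.

```python
# Rough sketch of the ADDEMUP idea (not the published algorithm): evolve a
# population of networks, scoring each by accuracy plus diversity.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def train(hidden):
    net = MLPClassifier(hidden_layer_sizes=(hidden,), max_iter=300, random_state=0)
    return net.fit(X_tr, y_tr)

def fitness(pred, others, lam=0.5):
    accuracy = np.mean(pred == y_val)
    diversity = np.mean([np.mean(pred != o) for o in others]) if others else 0.0
    return accuracy + lam * diversity      # trade-off weight lam is an assumption

population = [train(h) for h in (4, 8, 16, 32)]
for generation in range(5):
    preds = [net.predict(X_val) for net in population]
    scores = [fitness(p, preds[:i] + preds[i + 1:]) for i, p in enumerate(preds)]
    # "Mutate" the weakest member: replace it with a network of a new random size.
    worst = int(np.argmin(scores))
    population[worst] = train(int(rng.integers(2, 64)))

# Final ensemble prediction: simple majority vote over the evolved population.
votes = np.stack([net.predict(X_val) for net in population])
ensemble_pred = (votes.mean(axis=0) >= 0.5).astype(int)
print("ensemble accuracy:", np.mean(ensemble_pred == y_val))
```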
Photogrammetric Engineering and Remote Sensing | 2005
C. Kenneth Brewer; J. Chris Winne; Roland L. Redmond; David W. Opitz; Mark Mangrich
This study evaluates six different approaches to classifying and mapping fire severity using multi-temporal Landsat Thematic Mapper data. The six approaches tested include: two based on temporal image differencing and ratioing between pre-fire and post-fire images, two based on principal component analysis of pre- and post-fire imagery, and two based on artificial neural networks, one using just post-fire imagery and the other using both pre- and post-fire imagery. Our results demonstrated the potential value of any of these methods for producing quantitative fire severity maps, but one of the image differencing methods (ND4/7) provided a flexible, robust, and analytically simple approach that could be applied anywhere in the continental U.S. Based on the results of this test, the ND4/7 was implemented operationally to classify and map fire severity over 1.2 million hectares burned in the Northern Rocky Mountains and Northern Great Plains during the 2000 fire season, as well as the 2001 fire season (Gmelin and Brewer, 2002). Approximately the same procedure was adopted in 2001 by the USDA Forest Service, Remote Sensing Applications Center to produce Burned Area Reflectance Classifications for national-level support of Burned Area Emergency Rehabilitation activities (Orlemann, 2002).
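A compact sketch of the band-differencing idea: ND4/7 is the normalized difference of Landsat TM bands 4 and 7, computed for pre- and post-fire scenes and then differenced. The array names and the severity break points below are illustrative assumptions, not the operational values from the study.

```python
# Sketch of ND4/7 differencing: the normalized difference of TM bands 4 and 7
# is computed for pre- and post-fire scenes and then differenced. Break points
# into severity classes are hypothetical, for illustration only.
import numpy as np

def nd47(band4, band7):
    """Normalized difference of Landsat TM bands 4 and 7."""
    b4, b7 = band4.astype(float), band7.astype(float)
    return (b4 - b7) / np.maximum(b4 + b7, 1e-6)   # guard against divide-by-zero

def fire_severity(pre_b4, pre_b7, post_b4, post_b7):
    """Difference the pre- and post-fire indices; larger drops imply higher severity."""
    delta = nd47(pre_b4, pre_b7) - nd47(post_b4, post_b7)
    # Hypothetical break points into unburned / low / moderate / high classes.
    return np.digitize(delta, [0.1, 0.27, 0.44])

# Tiny synthetic example standing in for co-registered pre- and post-fire scenes.
shape = (4, 4)
pre_b4, pre_b7 = np.full(shape, 0.5), np.full(shape, 0.1)
post_b4, post_b7 = np.full(shape, 0.2), np.full(shape, 0.4)
print(fire_severity(pre_b4, pre_b7, post_b4, post_b7))
```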
Journal of Artificial Intelligence Research | 1997
David W. Opitz; Jude W. Shavlik
An algorithm that learns from a set of examples should ideally be able to exploit the available resources of (a) abundant computing power and (b) domain-specific knowledge to improve its ability to generalize. Connectionist theory-refinement systems, which use background knowledge to select a neural network's topology and initial weights, have proven to be effective at exploiting domain-specific knowledge; however, most do not exploit available computing power. This weakness occurs because they lack the ability to refine the topology of the neural networks they produce, thereby limiting generalization, especially when given impoverished domain theories. We present the REGENT algorithm, which uses (a) domain-specific knowledge to help create an initial population of knowledge-based neural networks and (b) genetic operators of crossover and mutation (specifically designed for knowledge-based networks) to continually search for better network topologies. Experiments on three real-world domains indicate that our new algorithm is able to significantly increase generalization compared to a standard connectionist theory-refinement system, as well as our previous algorithm for growing knowledge-based networks.
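A tiny sketch of the knowledge-based initialization step such systems start from, in the KBANN spirit: each propositional rule becomes a hidden unit with large weights on its antecedents and a bias set so the unit fires only when all antecedents hold. The example rule, weight magnitude, and helper names are assumptions; REGENT's crossover and mutation operators over topologies are not reproduced here.

```python
# Sketch of KBANN-style rule-to-network translation, the kind of initialization
# REGENT-like systems use to seed a population of knowledge-based networks.
# The rule, the weight magnitude W, and the inputs below are illustrative.
import numpy as np

W = 4.0  # large weight so the unit approximates the symbolic rule

def rule_unit(antecedents, negated=()):
    """Encode 'head :- antecedents (some possibly negated)' as one sigmoid unit."""
    weights = np.array([-W if a in negated else W for a in antecedents])
    n_positive = len(antecedents) - len(negated)
    bias = -W * (n_positive - 0.5)      # fires only when all antecedents are satisfied
    return weights, bias

def activate(inputs, weights, bias):
    return 1.0 / (1.0 + np.exp(-(inputs @ weights + bias)))

# Hypothetical domain rule: promoter :- contact, conformation.
weights, bias = rule_unit(["contact", "conformation"])
for contact in (0.0, 1.0):
    for conformation in (0.0, 1.0):
        out = activate(np.array([contact, conformation]), weights, bias)
        print(f"contact={contact:.0f} conformation={conformation:.0f} -> {out:.2f}")
```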
Journal of Chemical Information and Computer Sciences | 2000
Subhash C. Basak; Brian D. Gute; Krishnan Balasubramanian; David W. Opitz
Hierarchical quantitative structure-activity relationships (H-QSAR) have been developed as a new approach in constructing models for estimating physicochemical, biomedicinal, and toxicological properties of interest. This approach uses increasingly more complex molecular descriptors in a graduated approach to model building. In this study, statistical and neural network methods have been applied to the development of H-QSAR models for estimating the acute aquatic toxicity (LC50) of 69 benzene derivatives to Pimephales promelas (fathead minnow). Topostructural, topochemical, geometrical, and quantum chemical indices were used as the four levels of the hierarchical method. It is clear from both the statistical and neural network models that topostructural indices alone cannot adequately model this set of congeneric chemicals. Not surprisingly, topochemical indices greatly increase the predictive power of both statistical and neural network models. Quantum chemical indices also add significantly to the modeling of this set of acute aquatic toxicity data.
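A sketch of the hierarchical (tiered) model-building idea in H-QSAR: fit a model on topostructural indices alone, then keep adding descriptor classes and compare the cross-validated fit at each level. The descriptor arrays and response below are random placeholders standing in for the real indices and measured LC50 values, so the printed numbers are meaningless; only the structure of the procedure is illustrated.

```python
# Sketch of hierarchical model building: add descriptor tiers one at a time
# and compare cross-validated fit. Random arrays are placeholders for real
# topostructural/topochemical/geometrical/quantum chemical indices.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_compounds = 69                      # matches the benzene-derivative data set size
tiers = {
    "topostructural": rng.normal(size=(n_compounds, 10)),
    "topochemical":   rng.normal(size=(n_compounds, 12)),
    "geometrical":    rng.normal(size=(n_compounds, 3)),
    "quantum":        rng.normal(size=(n_compounds, 4)),
}
log_lc50 = rng.normal(size=n_compounds)   # placeholder for measured acute toxicity

X, names = None, []
for name, block in tiers.items():
    X = block if X is None else np.hstack([X, block])
    names.append(name)
    r2 = cross_val_score(Ridge(alpha=1.0), X, log_lc50, cv=5, scoring="r2").mean()
    print(" + ".join(names), f"cross-validated R^2 = {r2:.2f}")
```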
National Conference on Artificial Intelligence | 1999
David W. Opitz
Neural Information Processing Systems | 1995
David W. Opitz; Jude W. Shavlik
National Conference on Artificial Intelligence | 1997
Richard Maclin; David W. Opitz
International Joint Conference on Artificial Intelligence | 1993
David W. Opitz; Jude W. Shavlik
International Conference on Machine Learning | 1994
David W. Opitz; Jude W. Shavlik