Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Cory J. Butz is active.

Publication


Featured research published by Cory J. Butz.


Information Sciences | 2007

Rough set based 1-v-1 and 1-v-r approaches to support vector machine multi-classification

Pawan Lingras; Cory J. Butz

Support vector machines (SVMs) are essentially binary classifiers. To improve their applicability, several methods have been suggested for extending SVMs for multi-classification, including one-versus-one (1-v-1), one-versus-rest (1-v-r) and DAGSVM. In this paper, we first describe how binary classification with SVMs can be interpreted using rough sets. A rough set approach to SVM classification removes the necessity of exact classification and is especially useful when dealing with noisy data. Next, by utilizing the boundary region in rough sets, we suggest two new approaches, extensions of 1-v-r and 1-v-1, to SVM multi-classification that allow for an error rate. We explicitly demonstrate how our extended 1-v-r may shorten the training time of the conventional 1-v-r approach. In addition, we show that our 1-v-1 approach may have reduced storage requirements compared to the conventional 1-v-1 and DAGSVM techniques. Our techniques also provide better semantic interpretations of the classification process. The theoretical conclusions are supported by experimental findings involving a synthetic dataset.
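To make the boundary-region idea concrete, here is a minimal sketch (not the authors' algorithm) of a one-versus-rest SVM whose decision values are mapped into rough-set-style regions: samples whose best decision value falls within a small band are placed in the boundary region rather than being classified exactly. The scikit-learn classifier and the `boundary_eps` threshold are assumptions for illustration.

```python
# Minimal sketch (not the paper's algorithm): a one-versus-rest SVM whose
# decision values are interpreted as rough-set regions. A sample whose best
# one-vs-rest decision value lies below a small threshold is assigned to the
# boundary region instead of an exact class.
import numpy as np
from sklearn.svm import SVC

def rough_one_vs_rest(X_train, y_train, X_test, boundary_eps=0.1):
    classes = np.unique(y_train)
    scores = np.empty((len(X_test), len(classes)))
    for j, c in enumerate(classes):
        # One binary SVM per class: class c versus all remaining classes.
        clf = SVC(kernel="rbf", gamma="scale")
        clf.fit(X_train, (y_train == c).astype(int))
        scores[:, j] = clf.decision_function(X_test)

    best = scores.argmax(axis=1)
    labels = classes[best]
    # Boundary region: the winning decision value is too close to zero to
    # commit to an exact classification.
    in_boundary = scores[np.arange(len(X_test)), best] < boundary_eps
    return labels, in_boundary
```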


Systems, Man and Cybernetics | 2000

On the implication problem for probabilistic conditional independency

S. K. M. Wong; Cory J. Butz; Dan Wu

The implication problem is to test whether a given set of independencies logically implies another independency. This problem is crucial in the design of a probabilistic reasoning system. We advocate that Bayesian networks are a generalization of standard relational databases. In contrast, it has been suggested that Bayesian networks differ from relational databases because the implication problems of the two systems do not coincide for some classes of probabilistic independencies. This remark, however, does not take into consideration one important issue, namely, the solvability of the implication problem. In this comprehensive study of the implication problem for probabilistic conditional independencies, it is emphasized that Bayesian networks and relational databases coincide on solvable classes of independencies. The present study suggests that the implication problem for these two closely related systems differs only in unsolvable classes of independencies. This means there is no real difference between Bayesian networks and relational databases, in the sense that only solvable classes of independencies are useful in the design and implementation of these knowledge systems. More importantly, perhaps, these results suggest that many current attempts to generalize Bayesian networks can take full advantage of the generalizations made to standard relational databases.
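For background, the implication problem is usually studied with respect to a fixed set of inference axioms. The following are the standard semi-graphoid axioms for conditional independence statements I(X, Z, Y), read as "X is independent of Y given Z"; they are included here as context and are not necessarily the axiomatization analyzed in the paper.

```latex
% Semi-graphoid axioms (standard background, not the paper's specific axiomatization)
\begin{align*}
\text{Symmetry:}      \quad & I(X, Z, Y) \;\Rightarrow\; I(Y, Z, X) \\
\text{Decomposition:} \quad & I(X, Z, Y \cup W) \;\Rightarrow\; I(X, Z, Y) \\
\text{Weak union:}    \quad & I(X, Z, Y \cup W) \;\Rightarrow\; I(X, Z \cup W, Y) \\
\text{Contraction:}   \quad & I(X, Z, Y) \wedge I(X, Z \cup Y, W) \;\Rightarrow\; I(X, Z, Y \cup W)
\end{align*}
```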


Web Intelligence | 2004

A Web-Based Intelligent Tutoring System for Computer Programming

Cory J. Butz; Shan Hua; R. B. Maguire

Web Intelligence is a direction for scientific research that explores practical applications of Artificial Intelligence to the next generation of Web-empowered systems. In this paper, we present a Web-based intelligent tutoring system for computer programming. The decision making process conducted in our intelligent system is guided by Bayesian networks, which are a formal framework for uncertainty management in Artificial Intelligence based on probability theory. Whereas many tutoring systems are static HTML Web pages of a class textbook or lecture notes, our intelligent system can help a student navigate through the online course materials, recommend learning goals, and generate appropriate reading sequences.


Knowledge Discovery and Data Mining | 1999

On Information-Theoretic Measures of Attribute Importance

Yiyu Yao; S. K. Michael Wong; Cory J. Butz

An attribute is deemed important in data mining if it partitions the database such that previously unknown regularities are observable. Many information-theoretic measures have been applied to quantify the importance of an attribute. In this paper, we summarize and critically analyze these measures.
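As one concrete instance of such a measure, the sketch below estimates the information gain (mutual information) of an attribute with respect to the class from co-occurrence counts; the helper names are illustrative, and the paper surveys several measures beyond this one.

```python
# Illustrative example of one information-theoretic importance measure:
# information gain (mutual information) of an attribute A with respect to a
# class attribute C, estimated from a list of (a, c) value pairs.
from collections import Counter
from math import log2

def entropy(counts):
    total = sum(counts)
    return -sum((n / total) * log2(n / total) for n in counts if n > 0)

def information_gain(pairs):
    class_counts = Counter(c for _, c in pairs)
    h_class = entropy(class_counts.values())

    # Expected class entropy after partitioning the data by the attribute value.
    by_value = Counter(a for a, _ in pairs)
    h_cond = 0.0
    for a, n_a in by_value.items():
        cond_counts = Counter(c for x, c in pairs if x == a)
        h_cond += (n_a / len(pairs)) * entropy(cond_counts.values())
    return h_class - h_cond

# Example: the attribute value perfectly determines the class, so the gain
# equals the class entropy H(C) = 1 bit.
print(information_gain([("x", 0), ("x", 0), ("y", 1), ("y", 1)]))  # 1.0
```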


International Conference on Data Mining | 2002

FD_Mine: discovering functional dependencies in a database using equivalences

Hong Yao; Howard J. Hamilton; Cory J. Butz

The discovery of FDs from databases has recently become a significant research problem. In this paper, we propose a new algorithm, called FD_Mine. FD_Mine takes advantage of the rich theory of FDs to reduce both the size of the dataset and the number of FDs to be checked by using discovered equivalences. We show that the pruning does not lead to loss of information. Experiments on 15 UCI datasets show that FD_Mine can prune more candidates than previous methods.
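Not FD_Mine itself, but a naive baseline showing what is being tested: a functional dependency X → Y holds in a table exactly when no two rows agree on X yet differ on Y. The table contents and the `holds_fd` helper are illustrative.

```python
# Naive check (not the FD_Mine algorithm) that a functional dependency X -> Y
# holds in a table of rows: every combination of X-values must map to a single
# combination of Y-values.
def holds_fd(rows, lhs, rhs):
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # same X-values, different Y-values: FD violated
    return True

table = [
    {"dept": "cs", "building": "A", "head": "Lee"},
    {"dept": "cs", "building": "A", "head": "Lee"},
    {"dept": "math", "building": "B", "head": "Kim"},
]
print(holds_fd(table, ["dept"], ["head"]))  # True: dept -> head
violating = table + [{"dept": "cs", "building": "A", "head": "Park"}]
print(holds_fd(violating, ["dept"], ["head"]))  # False: dept no longer determines head
```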


IEEE International Conference on Fuzzy Systems | 2002

Exploiting contextual independencies in Web search and user profiling

Cory J. Butz

Several researchers have suggested that Bayesian networks be used in web search and user profiling. One advantage of this approach is that Bayesian networks are more general than the probabilistic models previously used in information retrieval. In practice, experimental results demonstrate the effectiveness of the modern Bayesian network approach. On the other hand, since Bayesian networks are defined solely upon the notion of probabilistic conditional independence, these encouraging results do not take advantage of the more general probabilistic independencies recently proposed. In this paper, we show how to exploit contextual independencies in both web search and user profiling. Whereas a conditional independence must hold over all contexts, a contextual independence need only hold for one particular context. For web search applications, it is shown how contextual independencies can be modeled using multiple Bayesian networks. We also point to a more general learning approach for user profiling applications.
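To make the distinction concrete: a conditional independence must hold in every context of the conditioning variables, whereas a contextual (context-specific) independence need only hold for one fixed context. The formulation below is standard background rather than the paper's notation.

```latex
% Standard definitions (background only, not the paper's notation).
\begin{align*}
\text{Conditional independence } X \perp Y \mid Z:&\quad
    P(x \mid y, z) = P(x \mid z) \ \text{for all } x, y, z, \\
\text{Contextual independence in context } Z = z_0:&\quad
    P(x \mid y, z_0) = P(x \mid z_0) \ \text{for all } x, y .
\end{align*}
```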


Information Sciences | 2009

A simple graphical approach for understanding probabilistic inference in Bayesian networks

Cory J. Butz; Shan Hua; Junying Chen; Hong Yao

We present a simple graphical method for understanding exact probabilistic inference in discrete Bayesian networks (BNs). A conditional probability table (conditional) is depicted as a directed acyclic graph involving one or more black vertices and zero or more white vertices. The probability information propagated in a network can then be graphically illustrated by introducing the black variable elimination (BVE) algorithm. We prove the correctness of BVE and establish its polynomial time complexity. Our method possesses two salient characteristics. First, this purely graphical approach can be used as a pedagogical tool to introduce BN inference to beginners. This is important as it is commonly stated that newcomers have difficulty learning BN inference due to intricate mathematical equations and notation. Second, BVE provides a more precise description of BN inference than the state-of-the-art discrete BN inference technique, called LAZY-AR. LAZY-AR propagates potentials, which are not well-defined probability distributions. Our approach only involves conditionals, a special case of potentials.
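For readers unfamiliar with variable elimination, the sketch below shows the standard operation that BVE refines: multiply the factors mentioning a variable and sum that variable out. This is generic variable elimination over tables, not the paper's purely graphical BVE algorithm; the factor representation, the example network, and the `eliminate` helper are assumptions.

```python
# Generic variable elimination over discrete factors (illustration only; the
# paper's BVE algorithm manipulates conditionals depicted as graphs, not
# arbitrary potential tables).
import numpy as np

card = {"A": 2, "B": 2, "C": 2}  # cardinality of each variable in the example

def align(vs, table, order):
    # Transpose the factor's axes into the relative order given by `order`,
    # then insert singleton axes for the variables it does not mention.
    perm = sorted(range(len(vs)), key=lambda i: order.index(vs[i]))
    table = np.transpose(table, perm)
    return table.reshape([card[v] if v in vs else 1 for v in order])

def eliminate(factors, var):
    """Multiply every factor mentioning `var`, then sum `var` out of the product."""
    touching = [f for f in factors if var in f[0]]
    rest = [f for f in factors if var not in f[0]]

    order = []
    for vs, _ in touching:
        for v in vs:
            if v not in order:
                order.append(v)

    product = np.ones([card[v] for v in order])
    for vs, table in touching:
        product = product * align(vs, table, order)

    new_vars = tuple(v for v in order if v != var)
    return rest + [(new_vars, product.sum(axis=order.index(var)))]

# Chain A -> B -> C with conditionals P(A), P(B|A), P(C|B); eliminating A and
# then B leaves the marginal P(C).
factors = [
    (("A",), np.array([0.6, 0.4])),                    # P(A)
    (("A", "B"), np.array([[0.7, 0.3], [0.2, 0.8]])),  # P(B|A), rows indexed by A
    (("B", "C"), np.array([[0.9, 0.1], [0.5, 0.5]])),  # P(C|B), rows indexed by B
]
for v in ("A", "B"):
    factors = eliminate(factors, v)
print(factors)  # remaining factor is the marginal P(C), approximately [0.7, 0.3]
```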


European Journal of Operational Research | 2010

Rough support vector regression

Pawan Lingras; Cory J. Butz

This paper describes the relationship between support vector regression (SVR) and rough (or interval) patterns. SVR is the prediction component of the support vector techniques. Rough patterns are based on the notion of rough values, which consist of upper and lower bounds, and are used to effectively represent a range of variable values. Predictions of rough values in a variety of different forms, within the context of interval algebra and fuzzy theory, are attracting research interest. An extension of SVR, called rough support vector regression (RSVR), is proposed to improve the modeling of rough patterns. In particular, it is argued that the upper and lower bounds should be modeled separately. The proposal is shown to be a more flexible version of the lower possibilistic regression model using ε-insensitivity. Experimental results on the Dow Jones Industrial Average demonstrate the usefulness of the suggested RSVR modeling technique.
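A minimal sketch of the general idea, not the RSVR formulation itself: model the lower and upper bounds of an interval-valued series separately, here with two ordinary ε-insensitive SVR models from scikit-learn. The synthetic data and the `fit_interval_svr` helper are assumptions for illustration.

```python
# Minimal sketch (not the paper's RSVR model): predict an interval-valued target
# by fitting two ordinary epsilon-insensitive SVR models, one to the lower
# bounds and one to the upper bounds, reflecting the argument that the two
# bounds should be modelled separately.
import numpy as np
from sklearn.svm import SVR

def fit_interval_svr(X, y_low, y_high, epsilon=0.1):
    low = SVR(kernel="rbf", epsilon=epsilon).fit(X, y_low)
    high = SVR(kernel="rbf", epsilon=epsilon).fit(X, y_high)
    return low, high

# Toy interval series: daily low/high of a synthetic price signal.
rng = np.random.default_rng(0)
X = np.arange(200, dtype=float).reshape(-1, 1)
mid = 100 + 0.1 * X.ravel() + rng.normal(0, 1, 200)
low_model, high_model = fit_interval_svr(X, mid - 1.0, mid + 1.0)
print(low_model.predict(X[:3]), high_model.predict(X[:3]))
```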


Granular Computing | 2005

On the complexity of probabilistic inference in singly connected Bayesian networks

Dan Wu; Cory J. Butz

In this paper, we revisit the consensus on the computational complexity of exact inference in Bayesian networks. We point out that even in singly connected Bayesian networks, which are conventionally believed to admit efficient inference algorithms, the computational complexity is still NP-hard.


IEEE Annual Meeting of the Fuzzy Information Processing Society (NAFIPS '04) | 2004

Interval set classifiers using support vector machines

Pawan Lingras; Cory J. Butz

Support vector machines and rough set theory are two classification techniques. Support vector machines can use continuous input variables and transform them to higher dimensions so that classes can be linearly separable. A support vector machine attempts to find the hyperplane that maximizes the margin between classes. This paper shows how the classification obtained from a support vector machine can be represented using interval or rough sets. Such a formulation is especially useful for soft margin classifiers.

Collaboration


Dive into Cory J. Butz's collaborations.

Top Co-Authors

Hong Yao, University of Regina
Shan Hua, University of Regina
Dan Wu, University of Windsor