Haym Hirsh
Rutgers University
Publication
Featured research published by Haym Hirsh.
Intelligent Information Systems | 1998
Ronen Feldman; Ido Dagan; Haym Hirsh
Knowledge Discovery in Databases (KDD) focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns within them. While most work on KDD has been concerned with structured databases, there has been little work on handling the huge amount of information that is available only in unstructured textual form. This paper describes the KDT system for Knowledge Discovery in Text, in which documents are labeled by keywords, and knowledge discovery is performed by analyzing the co-occurrence frequencies of the various keywords labeling the documents. We show how this keyword-frequency approach supports a range of KDD operations, providing a suitable foundation for knowledge discovery and exploration for collections of unstructured text.
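The keyword-frequency approach described above can be sketched in a few lines: count how often pairs of keywords label the same document, then keep the pairs whose co-occurrence count clears a threshold. This is a minimal illustration, not the KDT system itself; the document collection and keyword names below are invented.

```python
from collections import Counter
from itertools import combinations

# Hypothetical collection: each document is represented only by its
# labeling keywords, as in the keyword-frequency approach.
docs = [
    {"iran", "oil", "opec"},
    {"iran", "iraq", "oil"},
    {"usa", "oil", "opec"},
    {"usa", "wheat"},
]

# Count how often each keyword pair co-occurs on the same document.
pair_counts = Counter()
for keywords in docs:
    for a, b in combinations(sorted(keywords), 2):
        pair_counts[(a, b)] += 1

# Pairs co-occurring in at least 2 documents are candidate patterns.
frequent = {pair: n for pair, n in pair_counts.items() if n >= 2}
```

The same co-occurrence table supports further KDD operations, such as restricting attention to pairs involving a chosen keyword or comparing distributions across subcollections.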
principles of knowledge representation and reasoning | 1994
William W. Cohen; Haym Hirsh
We present a series of theoretical and experimental results on the learnability of description logics. We first extend previous formal learnability results on simple description logics to C-CLASSIC, a description logic expressive enough to be practically useful. We then experimentally evaluate two extensions of a learning algorithm suggested by the formal analysis. The first extension learns C-CLASSIC descriptions from individuals. (The formal results assume that examples are themselves descriptions.) The second extension learns disjunctions of C-CLASSIC descriptions from individuals. The experiments, which were conducted using several hundred target concepts from a number of domains, indicate that both extensions reliably learn complex natural concepts.
Journal of Artificial Intelligence Research | 2001
Chumki Basu; Haym Hirsh; William W. Cohen; Craig G. Nevill-Manning
The growing need to manage and exploit the proliferation of online data sources is opening up new opportunities for bringing people closer to the resources they need. For instance, consider a recommendation service through which researchers can receive daily pointers to journal papers in their fields of interest. We survey some of the known approaches to the problem of technical paper recommendation and ask how they can be extended to deal with multiple information sources. More specifically, we focus on a variant of this problem, recommending conference paper submissions to reviewing committee members, which offers us a testbed to try different approaches. Using WHIRL, an information integration system, we are able to implement different recommendation algorithms derived from information retrieval principles. We also use a novel autonomous procedure for gathering reviewer interest information from the Web. We evaluate our approach and compare it to other methods using preference data provided by members of the AAAI-98 conference reviewing committee along with data about the actual submissions.
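A recommendation algorithm derived from information retrieval principles, of the general kind the paper evaluates, can be sketched as TF-IDF matching between reviewer interest profiles and submission abstracts. This is a minimal illustration only: the reviewer profiles and paper texts below are invented, and the paper's actual implementation uses WHIRL rather than this hand-rolled scoring.

```python
import math
from collections import Counter

# Hypothetical reviewer interest profiles and submission abstracts
# (the paper gathers real reviewer interests from the Web).
reviewers = {
    "r1": "machine learning version spaces concept learning",
    "r2": "information retrieval text mining keywords",
}
papers = {
    "p1": "learning concept descriptions from examples",
    "p2": "mining keywords from text collections",
}

def tfidf_vectors(texts):
    """TF-IDF vectors over the pooled vocabulary of all texts."""
    tokens = {k: t.split() for k, t in texts.items()}
    n = len(tokens)
    df = Counter(w for toks in tokens.values() for w in set(toks))
    return {k: {w: tf * math.log(n / df[w]) for w, tf in Counter(toks).items()}
            for k, toks in tokens.items()}

def cosine(u, v):
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Pool reviewers and papers so term weights are comparable, then rank
# the papers for each reviewer by similarity to that reviewer's profile.
vecs = tfidf_vectors({**reviewers, **papers})
rankings = {r: sorted(papers, key=lambda p: cosine(vecs[r], vecs[p]),
                      reverse=True) for r in reviewers}
```

Each reviewer then receives the top-ranked submissions; preference data like that from the AAAI-98 committee can be used to evaluate such rankings against actual reviewer choices.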
Communications of The ACM | 2000
Haym Hirsh; Chumki Basu; Brian D. Davison
A key question in the design of such self-customizing software is what kind of patterns can be recognized by the learning algorithms. At one end, the system may do little more than recognize superficial patterns in a single user's interactions. At the other, the system may exploit deeper knowledge about the user, what tasks the user is performing, as well as information about what other users have previously done. The challenge becomes one of identifying what information is available for the given "learning to personalize" task and what methods are best suited to the available information. When I used the email program on my PC to forward the file of this article to the editor of this magazine, I executed a series of actions that are mostly the same ones I would take to forward any file to another user. I typically click on an item on a menu that pops up a window for the composition of an email message. A fairly routine sequence of actions then follows: I compose the message, select a menu item that creates a pop-up window into which I enter the name of the desired file to be forwarded, and finally complete the task.
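The "superficial patterns in a single user's interactions" end of the spectrum can be sketched as learning, from one user's interaction log, which action most often follows each action, and suggesting that follower next time. This is an illustrative toy, not the article's system; the action names below are invented.

```python
from collections import Counter, defaultdict

# Hypothetical interaction log: one user's sequence of UI actions.
history = ["open_mail", "compose", "attach_file", "send",
           "open_mail", "compose", "attach_file", "send",
           "open_mail", "read", "reply", "send"]

# Tally which action follows each action in the log.
following = defaultdict(Counter)
for prev, nxt in zip(history, history[1:]):
    following[prev][nxt] += 1

def predict_next(action):
    """Suggest the most frequent follower of `action`, if any was seen."""
    return following[action].most_common(1)[0][0] if following[action] else None
```

Deeper personalization would replace the raw action log with knowledge of the user's task and of other users' behavior, as the article discusses.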
Machine Learning | 1994
Haym Hirsh
Although a landmark work, version spaces have proven fundamentally limited by being constrained to only consider candidate classifiers that are strictly consistent with data. This work generalizes version spaces to partially overcome this limitation. The main insight underlying this work is to base learning on version-space intersection, rather than the traditional candidate-elimination algorithm. The resulting learning algorithm, incremental version-space merging (IVSM), allows version spaces to contain arbitrary sets of classifiers, however generated, as long as they can be represented by boundary sets. This extends version spaces by increasing the range of information that can be used in learning; in particular, this paper describes how three examples of very different types of information—ambiguous data, inconsistent data, and background domain theories as traditionally used by explanation-based learning—can each be used by the new version-space approach.
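The core idea, learning by intersecting version spaces rather than by candidate elimination, can be illustrated on a hypothesis space small enough to enumerate outright. IVSM proper represents version spaces compactly by boundary sets; this sketch instead stores each version space as an explicit set of hypotheses, with invented attribute names, purely to show the intersection step.

```python
from itertools import product

# Toy hypothesis space: conjunctions over two attributes, where each slot
# is a specific value or the wildcard "?".
SIZES = ["small", "large"]
COLORS = ["red", "blue"]
HYPOTHESES = set(product(SIZES + ["?"], COLORS + ["?"]))

def matches(hyp, instance):
    return all(h in ("?", x) for h, x in zip(hyp, instance))

def version_space(example, label):
    """All hypotheses consistent with a single labeled example."""
    return {h for h in HYPOTHESES if matches(h, example) == label}

# Incremental merging: fold in one example's version space at a time
# by set intersection.
vs = HYPOTHESES
for example, label in [(("small", "red"), True), (("large", "blue"), False)]:
    vs = vs & version_space(example, label)
```

Because any information source that yields a set of candidate classifiers can be folded in the same way, the intersection view accommodates ambiguous data, inconsistent data, and background domain theories, as the paper describes.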
International Conference on Artificial Intelligence | 1997
Khaled Rasheed; Haym Hirsh; Andrew Gelsey
Genetic algorithms (GAs) have been extensively used as a means for performing global optimization in a simple yet reliable manner. However, in some realistic engineering design optimization domains the simple, classical implementation of a GA based on binary encoding and bit mutation and crossover is often inefficient and unable to reach the global optimum. In this paper we describe a GA for continuous design space optimization that uses new GA operators and strategies tailored to the structure and properties of engineering design domains. Empirical results in the domains of supersonic transport aircraft and supersonic missile inlets demonstrate that the newly formulated GA can be significantly better than the classical GA in both efficiency and reliability.
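A GA operating directly on continuous design variables, rather than on binary encodings with bit mutation and crossover, can be sketched as follows. This is a generic real-coded GA on a stand-in objective, not the paper's tailored operators; the paper's objectives are engineering simulations (supersonic aircraft and missile inlets), for which the sphere function below is only a placeholder.

```python
import random

def objective(x):
    """Stand-in design objective to minimize (sphere function)."""
    return sum(v * v for v in x)

def evolve(dim=3, pop_size=20, generations=60, bounds=(-5.0, 5.0), seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=objective)
        survivors = pop[: pop_size // 2]  # elitist truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Blend (arithmetic) crossover: interpolate real-valued parents
            # instead of splicing bit strings.
            w = rng.random()
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            # Gaussian mutation, clipped to the design-space bounds.
            child = [min(hi, max(lo, v + rng.gauss(0, 0.1))) for v in child]
            children.append(child)
        pop = survivors + children
    return min(pop, key=objective)
```

Operating on real values lets crossover and mutation respect the geometry of the design space, which is one motivation for the specialized operators the paper develops.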
Book | 1990
Haym Hirsh
1 Overview
1.1 Background: Version Spaces and the Candidate-Elimination Algorithm
1.2 Contributions
1.2.1 Incremental Version-Space Merging
1.2.2 Learning from Inconsistent Data
1.2.3 Combining Empirical and Analytical Learning
1.3 Reader's Guide
2 Incremental Version-Space Merging
2.1 Generalizing Version Spaces
2.2 Version-Space Merging
2.3 Incremental Version-Space Merging
3 The Candidate-Elimination Algorithm: Emulation and Extensions
3.1 Learning Example
3.2 The Candidate-Elimination Algorithm
3.3 Candidate-Elimination Algorithm Emulation
3.4 Formal Proof of Equivalence
3.5 Ambiguous Training Data
3.6 Summary
4 Learning from Data with Bounded Inconsistency
4.1 Bounded Inconsistency
4.2 The Approach
4.3 Searching the Version Space
4.4 Example
4.4.1 Problem
4.4.2 Method
4.4.3 Results
4.5 Comparison to Related Work
4.6 Discussion
4.7 Formal Results
4.8 Summary
5 Combining Empirical and Analytical Learning
5.1 Explanation-Based Generalization
5.2 Combining EBG with Incremental Version-Space Merging
5.3 Examples
5.3.1 Cup Example
5.3.2 Can_put_on_table Example
5.4 Perspectives
5.4.1 Imperfect Domain Theories
5.4.2 Biasing Search
5.5 Constraints on the Concept Description Language
5.6 Related Work
5.7 Summary
6 Incremental Batch Learning
6.1 RL
6.2 The Approach
6.3 Example
6.3.1 Domain
6.3.2 Method
6.3.3 Results
6.3.4 Analysis
6.4 Summary
7 Computational Complexity
7.1 Version-Space Formation
7.2 Version-Space Merging
7.3 Incremental Version-Space Merging
7.3.1 Exponential Boundary-Set Growth
7.3.2 Intractable Domains
7.4 Summary
8 Theoretical Underpinnings
8.1 Terminology and Notation
8.2 Generalizing Version Spaces
8.2.1 Closure
8.2.2 Boundedness
8.2.3 Version Spaces
8.3 Version-Space Intersections
8.4 Version-Space Unions
8.5 Summary
9 Conclusions
9.1 Results
9.2 Analysis
9.3 Open Problems
9.4 Summary
A IVSM Program Listing
Intelligent Information Systems | 1997
Ronen Feldman; Haym Hirsh
This paper describes the FACT system for knowledge discovery from text. It discovers associations (patterns of co-occurrence) amongst keywords labeling the items in a collection of textual documents. In addition, when background knowledge is available about the keywords labeling the documents, FACT is able to use this information in its discovery process. FACT takes a query-centered view of knowledge discovery, in which a discovery request is viewed as a query over the implicit set of possible results supported by a collection of documents, and where background knowledge is used to specify constraints on the desired results of this query process. Execution of a knowledge-discovery query is structured so that these background-knowledge constraints can be exploited in the search for possible results. Finally, rather than requiring a user to specify an explicit query expression in the knowledge-discovery query language, FACT presents the user with a simple-to-use graphical interface to the query language, with the language providing a well-defined semantics for the discovery actions performed by a user through the interface.
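The interplay of association discovery and background-knowledge constraints can be sketched as mining binary keyword rules by support and confidence, with a constraint restricting which keywords may appear on a rule's right-hand side. This is a minimal stand-in for FACT's query mechanism; the documents, keyword categories, and thresholds below are all invented.

```python
from itertools import combinations

# Hypothetical labeled documents and background knowledge assigning
# each keyword to a category.
docs = [
    {"iran", "oil"}, {"iran", "oil"}, {"iran", "opec"},
    {"usa", "wheat"}, {"usa", "oil"},
]
background = {"iran": "country", "usa": "country", "oil": "commodity",
              "opec": "organization", "wheat": "commodity"}

def support(itemset):
    """Fraction of documents labeled with every keyword in `itemset`."""
    return sum(itemset <= d for d in docs) / len(docs)

def associations(min_support=0.2, min_conf=0.6, rhs_category=None):
    """Binary rules {lhs} -> {rhs}; rhs_category is a background-knowledge
    constraint on the right-hand side, in the spirit of a FACT query."""
    keywords = sorted(set().union(*docs))
    rules = []
    for a, b in combinations(keywords, 2):
        for lhs, rhs in ((a, b), (b, a)):
            if rhs_category and background.get(rhs) != rhs_category:
                continue  # constraint prunes this candidate outright
            s = support({lhs, rhs})
            if s >= min_support and s / support({lhs}) >= min_conf:
                rules.append((lhs, rhs, s))
    return rules

rules = associations(rhs_category="commodity")
```

Note that the constraint is checked before support is computed, mirroring FACT's point that background-knowledge constraints can be exploited during the search rather than used only to filter finished results.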
Conference on Learning Theory | 1994
William W. Cohen; Haym Hirsh
Although there is an increasing amount of experimental research on learning concepts expressed in first-order logic, there are still relatively few formal results on the polynomial learnability of first-order representations from examples. Most previous analyses in the pac-model have focused on subsets of Prolog, and only a few highly restricted subsets have been shown to be learnable. In this paper, we will study instead the learnability of the restricted first-order logics known as “description logics”, also sometimes called “terminological logics” or “KL-ONE-type languages”. Description logics are also subsets of predicate calculus, but are expressed using a different syntax, allowing a different set of syntactic restrictions to be explored. We first define a simple description logic, summarize some results on its expressive power, and then analyze its learnability. It is shown that the full logic cannot be tractably learned. However, syntactic restrictions exist that enable tractable learning from positive examples alone, independent of the size of the vocabulary used to describe examples. The learnable sublanguage appears to be incomparable in expressive power to any subset of first-order logic previously known to be learnable.
Machine Learning | 1990
Derek H. Sleeman; Haym Hirsh; Ian Ellery; In-Yung Kim
By its very nature, artificial intelligence is concerned with investigating topics that are ill-defined and ill-understood. This paper describes two approaches to expanding a good but incomplete theory of a domain. The first uses the domain theory as far as possible and fills in specific gaps in the reasoning process, generalizing the suggested missing steps and adding them to the domain theory. The second takes existing operators of the domain theory and applies perturbations to form new plausible operators for the theory. The specific domain to which these techniques have been applied is high-school algebra problems. The domain theory is represented as operators corresponding to algebraic manipulations, and the problem of expanding the domain theory becomes one of discovering new algebraic operators. The general framework used is one of generate and test—generating new operators for the domain and using tests to filter out unreasonable ones. The paper compares two algorithms, INFER and MALGEN, examining their performance on actual data collected in two Scottish schools and concluding with a critical discussion of the two methods.
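The generate-and-test framework the paper uses can be illustrated on a deliberately tiny stand-in domain: equations a*x + b = c are states (a, b, c), candidate operators are generated from parameterized templates, and observed solution steps filter out the implausible ones. The templates, constants, and trace below are invented for illustration and are not the INFER or MALGEN algorithms themselves.

```python
from fractions import Fraction as F

# Operator templates acting on both sides of a*x + b = c, i.e. on (a, b, c).
def add_const(k):
    return lambda s: (s[0], s[1] + k, s[2] + k)

def mul_const(k):
    return lambda s: (s[0] * k, s[1] * k, s[2] * k)

# Generate: plausible operators from the two templates and small constants.
candidates = {
    (name, k): make(k)
    for name, make in [("add", add_const), ("mul", mul_const)]
    for k in [F(n, d) for n in range(-3, 4) for d in (1, 2)] if k != 0
}

# Test: keep only operators consistent with every observed solution step.
# Here the (invented) trace shows 2x + 3 = 7 rewritten to 2x + 0 = 4.
observed_steps = [((F(2), F(3), F(7)), (F(2), F(0), F(4)))]
consistent = [
    key for key, op in candidates.items()
    if all(op(before) == after for before, after in observed_steps)
]
```

The surviving candidates are the newly hypothesized operators; in the paper's setting these are then generalized and evaluated against further student data rather than accepted outright.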