Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Raymond J. Mooney is active.

Publication


Featured research published by Raymond J. Mooney.


Machine Learning | 1986

Explanation-Based Learning: An Alternative View

Gerald DeJong; Raymond J. Mooney

In the last issue of this journal, Mitchell, Keller, and Kedar-Cabelli presented a unifying framework for the explanation-based approach to machine learning. While it works well for a number of systems, the framework does not adequately capture certain aspects of the systems under development by the explanation-based learning group at Illinois. The primary inadequacies arise in the treatment of concept operationality, organization of knowledge into schemata, and learning from observation. This paper outlines six specific problems with the previously proposed framework and presents an alternative generalization method to perform explanation-based learning of new concepts.


Knowledge Discovery and Data Mining | 2003

Adaptive duplicate detection using learnable string similarity measures

Mikhail Bilenko; Raymond J. Mooney

The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied on generic or manually tuned distance metrics for estimating the similarity of potential duplicates. In this paper, we present a framework for improving duplicate detection using trainable measures of textual similarity. We propose to employ learnable text distance functions for each database field, and show that such measures are capable of adapting to the specific notion of similarity that is appropriate for the field's domain. We present two learnable text similarity measures suitable for this task: an extended variant of learnable string edit distance, and a novel vector-space-based measure that employs a Support Vector Machine (SVM) for training. Experimental results on a range of datasets show that our framework can improve duplicate detection accuracy over traditional techniques.
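
The sketch below illustrates the vector-space half of this idea in scikit-learn: character n-gram TF-IDF cosine similarity and a length feature for each pair of field values, with an SVM learning the duplicate/non-duplicate boundary. The training pairs and features are made-up stand-ins, and the paper's learnable string edit distance is not reproduced.

    # Sketch (not the paper's exact method): an SVM over simple similarity
    # features decides whether two field values are duplicates.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import SVC

    # Hypothetical labeled pairs: (value_a, value_b, is_duplicate)
    pairs = [("Proc. of SIGKDD 2003", "Proceedings of SIGKDD 2003", 1),
             ("R. J. Mooney", "Raymond J. Mooney", 1),
             ("data cleaning", "string edit distance", 0),
             ("ACM SIGMOD", "VLDB Journal", 0)]

    vec = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
    vec.fit([s for a, b, _ in pairs for s in (a, b)])

    def pair_features(a, b):
        va, vb = vec.transform([a]), vec.transform([b])
        cosine = va.multiply(vb).sum()   # rows are L2-normalised, so dot = cosine
        length_gap = abs(len(a) - len(b)) / max(len(a), len(b))
        return [cosine, length_gap]

    X = [pair_features(a, b) for a, b, _ in pairs]
    y = [label for _, _, label in pairs]
    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict([pair_features("R J Mooney", "Raymond Mooney")]))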


Knowledge Discovery and Data Mining | 2004

A probabilistic framework for semi-supervised clustering

Sugato Basu; Mikhail Bilenko; Raymond J. Mooney

Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different clusters. In recent years, a number of algorithms have been proposed for enhancing clustering quality by employing such supervision. Such methods use the constraints either to modify the objective function or to learn the distance measure. We propose a probabilistic model for semi-supervised clustering based on Hidden Markov Random Fields (HMRFs) that provides a principled framework for incorporating supervision into prototype-based clustering. The model generalizes a previous approach that combines constraints and Euclidean distance learning, and allows the use of a broad range of clustering distortion measures, including Bregman divergences (e.g., Euclidean distance and I-divergence) and directional similarity measures (e.g., cosine similarity). We present an algorithm that performs partitional semi-supervised clustering of data by minimizing an objective function derived from the posterior energy of the HMRF model. Experimental results on several text data sets demonstrate the advantages of the proposed framework.
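
A minimal sketch of the kind of constrained assignment this framework motivates: each point is assigned to the centroid that minimizes squared Euclidean distortion plus penalties for violated must-link and cannot-link constraints, followed by a standard centroid update. The toy data, penalty weights, and update schedule are assumptions; the paper's full HMRF-KMeans algorithm (constraint-based initialization, learned distortion measures, EM derivation) is not shown.

    # Sketch of pairwise-constrained k-means in the spirit of the HMRF view:
    # assignment cost = distortion + penalties for violated constraints.
    import numpy as np

    X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])
    must_link, cannot_link = [(0, 1)], [(1, 2)]      # hypothetical constraints
    w_ml, w_cl, k = 10.0, 10.0, 2

    rng = np.random.default_rng(0)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)

    for _ in range(10):
        for i in range(len(X)):
            costs = np.sum((centroids - X[i]) ** 2, axis=1)
            for h in range(k):
                costs[h] += sum(w_ml for a, b in must_link
                                if i in (a, b) and labels[a + b - i] != h)
                costs[h] += sum(w_cl for a, b in cannot_link
                                if i in (a, b) and labels[a + b - i] == h)
            labels[i] = int(np.argmin(costs))
        for h in range(k):                            # recompute centroids
            if np.any(labels == h):
                centroids[h] = X[labels == h].mean(axis=0)

    print(labels)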


International Conference on Machine Learning | 2004

Integrating constraints and metric learning in semi-supervised clustering

Mikhail Bilenko; Sugato Basu; Raymond J. Mooney

Semi-supervised clustering employs a small amount of labeled data to aid unsupervised learning. Previous work in the area has utilized supervised data in one of two approaches: 1) constraint-based methods that guide the clustering algorithm towards a better grouping of the data, and 2) distance-function learning methods that adapt the underlying similarity metric used by the clustering algorithm. This paper provides new methods for both approaches and presents a new semi-supervised clustering algorithm that integrates the two techniques in a uniform, principled framework. Experimental results demonstrate that the unified approach produces better clusters than both individual approaches and previously proposed semi-supervised clustering algorithms.
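
To illustrate the metric-learning half in isolation, the sketch below learns a diagonal weighting of dimensions from the current cluster assignment, giving more weight to dimensions along which clusters are tight. This is a simplified stand-in for the paper's metric update, with hypothetical data, not the MPCK-Means derivation itself.

    # Simplified illustration of learning a diagonal metric alongside
    # constrained clustering: dimensions with low within-cluster spread get
    # larger weights, so the learned distance stretches informative axes.
    import numpy as np

    def diagonal_metric_weights(X, labels, eps=1e-6):
        """Per-dimension weights inversely proportional to the average
        within-cluster squared deviation along each dimension."""
        spread = np.zeros(X.shape[1])
        for h in np.unique(labels):
            members = X[labels == h]
            spread += ((members - members.mean(axis=0)) ** 2).sum(axis=0)
        weights = 1.0 / (spread / len(X) + eps)
        return weights / weights.sum()        # normalise so weights sum to 1

    def weighted_distance(x, y, weights):
        return np.sqrt(np.sum(weights * (x - y) ** 2))

    X = np.array([[0.0, 3.0], [0.1, -2.0], [5.0, 0.5], [5.1, 4.0]])
    labels = np.array([0, 0, 1, 1])           # e.g. from a constrained E-step
    w = diagonal_metric_weights(X, labels)
    print(w, weighted_distance(X[0], X[2], w))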


Empirical Methods in Natural Language Processing | 2005

A Shortest Path Dependency Kernel for Relation Extraction

Razvan C. Bunescu; Raymond J. Mooney

We present a novel approach to relation extraction, based on the observation that the information required to assert a relationship between two named entities in the same sentence is typically captured by the shortest path between the two entities in the dependency graph. Experiments on extracting top-level relations from the ACE (Automated Content Extraction) newspaper corpus show that the new shortest path dependency kernel outperforms a recent approach based on dependency tree kernels.
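
A small sketch of the idea on a toy dependency graph: a breadth-first search recovers the path between the two entities, and a kernel scores two paths by multiplying the number of shared features at each position, returning zero when the paths differ in length. The sentences, edges, and feature sets below are hypothetical.

    # Sketch of a shortest-path dependency kernel on toy data.
    from collections import deque

    def shortest_path(edges, start, goal):
        """Breadth-first search over an undirected dependency graph."""
        adj = {}
        for a, b in edges:
            adj.setdefault(a, set()).add(b)
            adj.setdefault(b, set()).add(a)
        queue, parent = deque([start]), {start: None}
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in adj.get(node, ()):
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None

    def path_kernel(path_x, path_y, features):
        if path_x is None or path_y is None or len(path_x) != len(path_y):
            return 0
        score = 1
        for a, b in zip(path_x, path_y):
            score *= len(features[a] & features[b])   # shared features per position
        return score

    # Toy sentences: "protesters stormed stations" vs. "troops raided churches"
    edges = [("protesters", "stormed"), ("stormed", "stations"),
             ("troops", "raided"), ("raided", "churches")]
    features = {"protesters": {"NNS", "Noun", "PERSON"},
                "stormed": {"VBD", "Verb"}, "stations": {"NNS", "Noun", "FACILITY"},
                "troops": {"NNS", "Noun", "PERSON"},
                "raided": {"VBD", "Verb"}, "churches": {"NNS", "Noun", "FACILITY"}}
    p1 = shortest_path(edges, "protesters", "stations")
    p2 = shortest_path(edges, "troops", "churches")
    print(path_kernel(p1, p2, features))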


IEEE Intelligent Systems | 2003

Adaptive name matching in information integration

Mikhail Bilenko; Raymond J. Mooney; William W. Cohen; Pradeep Ravikumar; Stephen E. Fienberg

Identifying approximately duplicate database records that refer to the same entity is essential for information integration. The authors compare and describe methods for combining and learning textual similarity measures for name matching.
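
A minimal sketch of combining and learning similarity measures for name matching: two hand-crafted scores (token Jaccard overlap and a character-level edit ratio) are weighted by a logistic regression trained on labeled pairs. The training pairs are toy data, and the specific measures are illustrative choices rather than the ones compared in the article.

    # Sketch: learn how to combine several string similarity scores.
    import difflib
    from sklearn.linear_model import LogisticRegression

    def name_scores(a, b):
        ta, tb = set(a.lower().split()), set(b.lower().split())
        token_jaccard = len(ta & tb) / len(ta | tb)
        edit_ratio = difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
        return [token_jaccard, edit_ratio]

    pairs = [("William W. Cohen", "W. Cohen", 1),        # hypothetical labels
             ("Raymond J. Mooney", "R. Mooney", 1),
             ("Stephen E. Fienberg", "Pradeep Ravikumar", 0),
             ("Mikhail Bilenko", "William Cohen", 0)]
    X = [name_scores(a, b) for a, b, _ in pairs]
    y = [label for _, _, label in pairs]
    model = LogisticRegression().fit(X, y)
    print(model.predict_proba([name_scores("Wm. Cohen", "William Cohen")])[0, 1])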


Artificial Intelligence in Medicine | 2005

Comparative experiments on learning information extractors for proteins and their interactions

Razvan C. Bunescu; Ruifang Ge; Rohit J. Kate; Edward M. Marcotte; Raymond J. Mooney; Arun K. Ramani; Yuk Wah Wong

OBJECTIVE: Automatically extracting information from biomedical text holds the promise of easily consolidating large amounts of biological knowledge in computer-accessible form. This strategy is particularly attractive for extracting data relevant to genes of the human genome from the 11 million abstracts in Medline. However, extraction efforts have been frustrated by the lack of conventions for describing human genes and proteins. We have developed and evaluated a variety of learned information extraction systems for identifying human protein names in Medline abstracts and subsequently extracting information on interactions between the proteins.

METHODS AND MATERIAL: We used a variety of machine learning methods to automatically develop information extraction systems for extracting information on gene/protein name, function and interactions from Medline abstracts. We present cross-validated results on identifying human proteins and their interactions by training and testing on a set of approximately 1000 manually-annotated Medline abstracts that discuss human genes/proteins.

RESULTS: We demonstrate that machine learning approaches using support vector machines and maximum entropy are able to identify human proteins with higher accuracy than several previous approaches. We also demonstrate that various rule induction methods are able to identify protein interactions with higher precision than manually-developed rules.

CONCLUSION: Our results show that it is promising to use machine learning to automatically build systems for extracting information from biomedical text. The results also give a broad picture of the relative strengths of a wide variety of methods when tested on a reasonably large human-annotated corpus.
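
As a rough illustration of the maximum-entropy style of tagger mentioned above, the sketch below classifies each token as protein name or not using a logistic regression over simple orthographic features. The features, example sentence, and labels are invented for illustration and do not reflect the systems evaluated in the paper.

    # Sketch of token-level protein-name tagging with a maxent-style
    # (logistic regression) classifier over simple orthographic features.
    from sklearn.feature_extraction import DictVectorizer
    from sklearn.linear_model import LogisticRegression

    def token_features(tok):
        return {"lower": tok.lower(),
                "has_digit": any(c.isdigit() for c in tok),
                "all_caps": tok.isupper(),
                "mixed_case": any(c.islower() for c in tok) and any(c.isupper() for c in tok),
                "suffix3": tok[-3:].lower()}

    # Tiny hypothetical training sentence with per-token labels
    tokens = ["TRAF2", "binds", "to", "CD40", "in", "activated", "B", "cells"]
    labels = ["PROT", "O", "O", "PROT", "O", "O", "O", "O"]

    vec = DictVectorizer()
    X = vec.fit_transform([token_features(t) for t in tokens])
    clf = LogisticRegression(max_iter=1000).fit(X, labels)

    test = ["RIP", "interacts", "with", "TRAF1"]
    print(list(zip(test, clf.predict(vec.transform([token_features(t) for t in test])))))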


Readings in Knowledge Acquisition and Learning | 1993

Symbolic and neural learning algorithms: an experimental comparison

Jude W. Shavlik; Raymond J. Mooney; Geoffrey G. Towell

Although many symbolic and neural network (connectionist) learning algorithms address the same problem of learning from classified examples, very little is known regarding their comparative strengths and weaknesses. Experiments comparing the ID3 symbolic learning algorithm with the perceptron and backpropagation neural learning algorithms have been performed using five large, real-world data sets. Overall, backpropagation performs slightly better than the other two algorithms in terms of classification accuracy on new examples, but takes much longer to train. Experimental results suggest that backpropagation can work significantly better on data sets containing numerical data. Also analyzed empirically are the effects of (1) the amount of training data, (2) imperfect training examples, and (3) the encoding of the desired outputs. Backpropagation occasionally outperforms the other two systems when given relatively small amounts of training data. It is slightly more accurate than ID3 when examples are noisy or incompletely specified. Finally, backpropagation more effectively utilizes a “distributed” output encoding.
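
A present-day sketch of this kind of comparison, using scikit-learn stand-ins: an entropy-based decision tree in place of ID3, a perceptron, and a backpropagation-trained multilayer perceptron, all cross-validated on a bundled dataset rather than the paper's five real-world data sets.

    # Sketch of an ID3-vs-perceptron-vs-backpropagation style comparison
    # with scikit-learn stand-ins and 5-fold cross-validation.
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.linear_model import Perceptron
    from sklearn.neural_network import MLPClassifier

    X, y = load_breast_cancer(return_X_y=True)
    models = {
        "decision tree (ID3-like)": DecisionTreeClassifier(criterion="entropy"),
        "perceptron": make_pipeline(StandardScaler(), Perceptron(max_iter=1000)),
        "backpropagation (MLP)": make_pipeline(StandardScaler(),
                                               MLPClassifier(hidden_layer_sizes=(20,),
                                                             max_iter=2000)),
    }
    for name, model in models.items():
        acc = cross_val_score(model, X, y, cv=5).mean()
        print(f"{name}: {acc:.3f}")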


International Conference on Computer Vision | 2015

Sequence to Sequence -- Video to Text

Subhashini Venugopalan; Marcus Rohrbach; Jeffrey Donahue; Raymond J. Mooney; Trevor Darrell; Kate Saenko

Real-world videos often have complex dynamics; methods for generating open-domain video descriptions should be sensitive to temporal structure and allow both input (sequence of frames) and output (sequence of words) of variable length. To approach this problem we propose a novel end-to-end sequence-to-sequence model to generate captions for videos. For this we exploit recurrent neural networks, specifically LSTMs, which have demonstrated state-of-the-art performance in image caption generation. Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip. Our model is naturally able to learn the temporal structure of the sequence of frames as well as the sequence model of the generated sentences, i.e. a language model. We evaluate several variants of our model that exploit different visual features on a standard set of YouTube videos and two movie description datasets (M-VAD and MPII-MD).
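
A minimal PyTorch sketch of the sequence-to-sequence idea: one LSTM encodes a variable-length sequence of frame features and a second LSTM decodes a word sequence conditioned on the encoder's final state. The dimensions, vocabulary, and loss setup are toy assumptions, and S2VT's stacked-LSTM architecture and padding scheme are simplified away.

    # Sketch of an LSTM encoder-decoder for video captioning.
    import torch
    import torch.nn as nn

    class VideoCaptioner(nn.Module):
        def __init__(self, feat_dim=2048, hidden=256, vocab_size=1000, embed=128):
            super().__init__()
            self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
            self.embed = nn.Embedding(vocab_size, embed)
            self.decoder = nn.LSTM(embed, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab_size)

        def forward(self, frames, captions):
            # frames: (batch, n_frames, feat_dim); captions: (batch, n_words)
            _, state = self.encoder(frames)          # keep final (h, c) only
            emb = self.embed(captions)               # (batch, n_words, embed)
            dec_out, _ = self.decoder(emb, state)    # condition on video state
            return self.out(dec_out)                 # (batch, n_words, vocab)

    model = VideoCaptioner()
    frames = torch.randn(2, 30, 2048)                # 2 clips, 30 frames each
    captions = torch.randint(0, 1000, (2, 8))        # 8-word toy captions
    logits = model(frames, captions)
    # Unshifted teacher forcing, purely for brevity in this sketch
    loss = nn.CrossEntropyLoss()(logits.reshape(-1, 1000), captions.reshape(-1))
    print(logits.shape, float(loss))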


North American Chapter of the Association for Computational Linguistics | 2015

Translating Videos to Natural Language Using Deep Recurrent Neural Networks

Subhashini Venugopalan; Huijuan Xu; Jeff Donahue; Marcus Rohrbach; Raymond J. Mooney; Kate Saenko

Solving the visual symbol grounding problem has long been a goal of artificial intelligence. The field appears to be advancing closer to this goal with recent breakthroughs in deep learning for natural language grounding in static images. In this paper, we propose to translate videos directly to sentences using a unified deep neural network with both convolutional and recurrent structure. Described video datasets are scarce, and most existing methods have been applied to toy domains with a small vocabulary of possible words. By transferring knowledge from 1.2M+ images with category labels and 100,000+ images with captions, our method is able to create sentence descriptions of open-domain videos with large vocabularies. We compare our approach with recent work using language generation metrics, subject, verb, and object prediction accuracy, and a human evaluation.
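
A minimal PyTorch sketch of the mean-pooling approach described here: frame-level CNN features are averaged into a single video vector, which initializes an LSTM that generates the caption word by word. Feature dimensions, vocabulary size, and training details are assumptions.

    # Sketch: mean-pool CNN frame features, project to the LSTM hidden size,
    # and use the result as the initial state of a caption-generating LSTM.
    import torch
    import torch.nn as nn

    class MeanPoolCaptioner(nn.Module):
        def __init__(self, feat_dim=4096, hidden=256, vocab_size=1000, embed=128):
            super().__init__()
            self.project = nn.Linear(feat_dim, hidden)
            self.embed = nn.Embedding(vocab_size, embed)
            self.lstm = nn.LSTM(embed, hidden, batch_first=True)
            self.out = nn.Linear(hidden, vocab_size)

        def forward(self, frame_feats, captions):
            video = self.project(frame_feats.mean(dim=1))   # (batch, hidden)
            h0 = video.unsqueeze(0)                         # (1, batch, hidden)
            c0 = torch.zeros_like(h0)
            dec_out, _ = self.lstm(self.embed(captions), (h0, c0))
            return self.out(dec_out)                        # word logits per step

    model = MeanPoolCaptioner()
    feats = torch.randn(2, 40, 4096)                        # 40 frames per clip
    caps = torch.randint(0, 1000, (2, 6))
    print(model(feats, caps).shape)                         # torch.Size([2, 6, 1000])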

Collaboration


Dive into Raymond J. Mooney's collaborations.

Top Co-Authors

Jesse Thomason
University of Texas at Austin

John M. Zelle
University of Texas at Austin

Dirk Ourston
University of Texas at Austin

Rohit J. Kate
University of Wisconsin–Milwaukee

Peter Stone
University of Texas at Austin