
Publication


Featured research published by James G. Shanahan.


Computing Attitude and Affect in Text | 2006

Validating the Coverage of Lexical Resources for Affect Analysis and Automatically Classifying New Words along Semantic Axes

Gregory Grefenstette; Yan Qu; David A. Evans; James G. Shanahan

In addition to factual content, many texts contain an emotional dimension. This emotive, or affect, dimension has not received a great amount of attention in computational linguistics until recently. However, now that messages (including spam) have become more prevalent than edited texts (such as newswire), recognizing this emotive dimension of written text is becoming more important. One resource needed for identifying affect in text is a lexicon of words with emotion-conveying potential. Starting from an existing affect lexicon and lexical patterns that invoke affect, we gathered a large quantity of text to measure the coverage of our existing lexicon. This chapter reports on our methods for identifying new candidate affect words and on our evaluation of our current affect lexicons. We describe how our affect lexicon can be extended based on results from these experiments.
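As a rough illustration of the coverage-measurement step described above, the sketch below counts how many corpus tokens an existing affect lexicon already covers, and treats words captured by an affect-invoking lexical pattern but absent from the lexicon as candidates for extension. The lexicon, pattern, and corpus here are invented for illustration; the chapter's actual resources are far larger.

```python
# Hedged sketch of affect-lexicon coverage measurement and candidate
# discovery. All lexicon entries, patterns, and corpus lines are made up.
from collections import Counter
import re

affect_lexicon = {"happy", "sad", "angry", "wonderful", "terrible"}
# A lexical pattern that tends to invoke affect, e.g. "feel so X".
affect_pattern = re.compile(r"\bfeel(?:s|ing)? (?:so|very) (\w+)")

corpus = [
    "I feel so ecstatic about the results",
    "She was happy but he felt sad",
    "They feel very gloomy today",
]

# Coverage: fraction of corpus tokens already in the lexicon.
tokens = [w.lower() for line in corpus for w in re.findall(r"\w+", line)]
hits = sum(1 for w in tokens if w in affect_lexicon)
coverage = hits / len(tokens)  # here 2/19: only "happy" and "sad" hit

# Words matched by an affect-invoking pattern but missing from the
# lexicon become candidates for extending it.
candidates = Counter()
for line in corpus:
    for m in affect_pattern.finditer(line.lower()):
        word = m.group(1)
        if word not in affect_lexicon:
            candidates[word] += 1  # here: "ecstatic" and "gloomy"
```

In a realistic setting the candidate counts would be aggregated over a large crawl and filtered by frequency before being offered to a lexicographer.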


European Conference on Machine Learning | 2003

Improving SVM text classification performance through threshold adjustment

James G. Shanahan; Norbert Roma

In general, support vector machines (SVMs), when applied to text classification, provide excellent precision but poor recall. One means of customizing SVMs to improve recall is to adjust the threshold associated with an SVM. We describe an automatic process for adjusting the thresholds of a generic SVM which incorporates a user utility model, an integral part of an information management system. By using thresholds based on utility models and the ranking properties of classifiers, it is possible to overcome the precision bias of SVMs and ensure robust recall across a wide variety of topics, even when training data are sparse. Evaluations on TREC data show that our proposed threshold-adjustment algorithm boosts the performance of baseline SVMs by at least 20% on standard information retrieval measures.
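The threshold-adjustment idea can be sketched in a few lines: instead of the SVM's default decision threshold of 0, pick the score cutoff that maximises a utility function on held-out data. The linear utility and the tiny score/label arrays below are illustrative assumptions, not the paper's TREC setup or utility model.

```python
# Hedged sketch: utility-driven threshold tuning for a precision-biased
# classifier. utility = 2*TP - FP is an illustrative choice only.

def tune_threshold(scores, labels, utility=lambda tp, fp: 2 * tp - fp):
    """Return the score cutoff maximising utility over (scores, labels)."""
    best_t, best_u = float("inf"), utility(0, 0)  # baseline: accept nothing
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        u = utility(tp, fp)
        if u > best_u:
            best_t, best_u = t, u
    return best_t

scores = [0.9, 0.4, -0.2, -0.6, -1.0]  # SVM decision values, held-out docs
labels = [1, 1, 1, 0, 0]               # true relevance
t = tune_threshold(scores, labels)
# The default cutoff of 0 accepts only two of the three positives
# (recall 2/3); the tuned cutoff of -0.2 accepts all three with no
# false positives (recall 3/3).
```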


IEEE International Conference on Fuzzy Systems | 1997

Modelling with words using Cartesian granule features

James F. Baldwin; Trevor P. Martin; James G. Shanahan

We present the Cartesian granule feature, a new multidimensional feature formed over the cross product of fuzzy partition labels. Traditional fuzzy modelling approaches mainly use flat (one-dimensional) features and consequently suffer from decomposition error when modelling systems in which there are dependencies between the input variables. Cartesian granule features help reduce (if not eliminate) the error due to the decompositional usage of features. In the approach taken here, we label the fuzzy subsets which partition the various universes and incorporate these labels, in the form of Cartesian granules, into our modelling process. Fuzzy sets defined in terms of these Cartesian granules are extracted automatically from statistical data using the theory of mass assignments and are incorporated into fuzzy rules. Consequently, we not only compute with words, we also model with words. Due to the interpolative nature of fuzzy sets, this approach can be used to model both classification and prediction problems. Overall, Cartesian granule features incorporated into fuzzy rules yield glass-box models and, when demonstrated on the ellipse classification problem, achieve a classification accuracy of 98%, outperforming standard modelling approaches such as neural networks and the data browser.
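The core construction can be illustrated compactly: each base feature gets a linguistic partition (words with fuzzy membership functions), and a point's membership in each granule over the cross product of those words combines its word memberships across features. The triangular partitions and the product t-norm below are illustrative choices, not the paper's exact mass-assignment formulation.

```python
# Illustrative sketch of a Cartesian granule over two base features,
# each partitioned into the words small/medium/large on [0, 10].

def tri(x, a, b, c):
    """Triangular membership with peak at b and support (a, c)."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

partition = {
    "small":  lambda x: tri(x, -5, 0, 5),
    "medium": lambda x: tri(x, 0, 5, 10),
    "large":  lambda x: tri(x, 5, 10, 15),
}

def cartesian_granule(x, y):
    """Membership of (x, y) in each granule (word_x, word_y),
    combined here with a product t-norm (an illustrative choice)."""
    return {
        (wx, wy): mx(x) * my(y)
        for wx, mx in partition.items()
        for wy, my in partition.items()
    }

g = cartesian_granule(2.0, 7.0)
# e.g. g[("small", "medium")] = tri(2; -5,0,5) * tri(7; 0,5,10)
#                             = 0.6 * 0.6 = 0.36
```

Because each partition's memberships sum to one at every point, the granule memberships also sum to one, which is what makes them usable as a probability-like summary in rules.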


Journal of Intelligent and Robotic Systems | 2000

A Soft Computing Approach to Road Classification

James G. Shanahan; Barry T. Thomas; Majid Mirmehdi; Trevor P. Martin; Neill W. Campbell; James F. Baldwin

Current learning approaches to computer vision have mainly focussed on low-level image processing and object recognition, while tending to ignore high-level processing such as understanding. Here we propose an approach to object recognition that facilitates the transition from recognition to understanding. The proposed approach embraces the synergistic spirit of soft computing, exploiting the global search powers of genetic programming to determine fuzzy probabilistic models. It begins by segmenting the images into regions using standard image processing approaches, which are subsequently classified using a discovered fuzzy Cartesian granule feature classifier. Understanding is made possible through the transparent and succinct nature of the discovered models. The recognition of roads in images is taken as an illustrative problem in the vision domain. The discovered fuzzy models, while providing high levels of accuracy (97%), also provide understanding of the problem domain through the transparency of the learnt models. The learning step in the proposed approach is compared with other techniques such as decision trees, naïve Bayes and neural networks using a variety of performance criteria such as accuracy, understandability and efficiency.


International Joint Conference on Artificial Intelligence | 1997

System Identification of Fuzzy Cartesian Granules Feature Models Using Genetic Programming

James F. Baldwin; Trevor P. Martin; James G. Shanahan

A Cartesian granule feature is a multidimensional feature formed over the cross product of words drawn from the linguistic partitions of the constituent input features. Systems can be quite naturally described in terms of Cartesian granule features incorporated into additive models (if-then rules with weighted antecedents), where each Cartesian granule feature focuses on modelling the interactions of a subset of input variables. This can often lead to models that reduce, if not eliminate, decomposition error, while enhancing the model's generalisation powers and transparency. Within a machine learning context, the system identification of good, parsimonious additive Cartesian granule feature models is an exponential search problem. In this paper we present the G_DACG constructive induction algorithm as a means of automatically identifying additive Cartesian granule feature models from example data. G_DACG combines the powerful optimisation capabilities of genetic programming with a rather novel and cheap fitness function which relies on the semantic separation of concepts expressed in terms of Cartesian granule fuzzy sets in identifying these additive models. G_DACG helps avoid many of the problems of traditional approaches to system identification that arise from feature selection and feature abstraction, such as local minima. G_DACG has been applied in the system identification of additive Cartesian granule feature models on a variety of artificial and real-world problems. Here we present a sample of those results, including those for the benchmark Pima Diabetes problem. A classification accuracy of 79.7% was achieved on this dataset, outperforming previous bests of 78% (generally from black-box modelling approaches such as neural nets and oblique decision trees).
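The flavour of a semantic-separation-style fitness can be sketched as follows: summarise each class as a fuzzy set over the same Cartesian granules and score how little the class concepts overlap, which is far cheaper than running a full train/test cycle inside the genetic-programming loop. The overlap measure and the toy class concepts below are illustrative assumptions, not G_DACG's exact definition.

```python
# Hedged sketch of a semantic-separation fitness over class concepts,
# each represented as a fuzzy set {granule: membership}.
from itertools import combinations

def overlap(fs_a, fs_b):
    """Fuzzy overlap: sum of min memberships / sum of max memberships."""
    granules = set(fs_a) | set(fs_b)
    num = sum(min(fs_a.get(g, 0.0), fs_b.get(g, 0.0)) for g in granules)
    den = sum(max(fs_a.get(g, 0.0), fs_b.get(g, 0.0)) for g in granules)
    return num / den if den else 0.0

def semantic_separation(class_fuzzy_sets):
    """Mean pairwise (1 - overlap) across all class concepts; higher
    means the candidate feature set discriminates the classes better."""
    pairs = list(combinations(class_fuzzy_sets.values(), 2))
    return sum(1.0 - overlap(a, b) for a, b in pairs) / len(pairs)

# Toy class concepts over granules of a two-feature partition.
concepts = {
    "class_A": {("small", "small"): 0.9, ("small", "large"): 0.1},
    "class_B": {("large", "large"): 0.8, ("small", "large"): 0.2},
}
fitness = semantic_separation(concepts)  # close to 1: well separated
```

A GP search would evaluate this fitness for each candidate combination of features and granularities, favouring parsimonious feature subsets whose class concepts barely overlap.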


International Journal of Approximate Reasoning | 1999

Controlling with words using automatically identified fuzzy Cartesian granule feature models

James F. Baldwin; Trevor P. Martin; James G. Shanahan

We present a new approach to representing and acquiring controllers based upon Cartesian granule features – multidimensional features formed over the cross product of words drawn from the linguistic partitions of the constituent input features – incorporated into additive models. Controllers expressed in terms of Cartesian granule features enable the paradigm "controlling with words" by translating process data into words that are subsequently used to interrogate a rule base, which ultimately results in a control action. The system identification of good, parsimonious additive Cartesian granule feature models is an exponential search problem. In this paper we present the G_DACG constructive induction algorithm as a means of automatically identifying additive Cartesian granule feature models from example data. G_DACG combines the powerful optimisation capabilities of genetic programming with a novel and cheap fitness function, which relies on the semantic separation of concepts expressed in terms of Cartesian granule fuzzy sets, in identifying these additive models. We illustrate the approach on a variety of problems including the modelling of a dynamical process and a chemical plant controller.


IEEE International Conference on Fuzzy Systems | 1998

Automatic fuzzy Cartesian granule feature discovery using genetic programming in image understanding

James F. Baldwin; Trevor P. Martin; James G. Shanahan

Variables defined over Cartesian granule feature universes can be viewed as multidimensional linguistic variables. These variable universes are formed over the cross product of words drawn from the fuzzy partitions of the constituent base features. Here we present a constructive induction algorithm which identifies not only the Cartesian granule feature model but also the concepts/variables in which the model is expressed. The presented constructive induction algorithm combines the genetic programming search paradigm with a rather novel and cheap fitness function based upon semantic discrimination analysis. Parsimony is promoted in this model discovery process, thereby leading to models with better generalisation power and transparency. The approach is demonstrated on an image understanding problem, an area that has traditionally been dominated by quantitative and black-box modelling techniques. Overall, the discovered Cartesian granule feature models, when demonstrated on a large test set of outdoor images, provide highly accurate image interpretation using four input features, with over 78% of the image area labelled correctly.


Lecture Notes in Computer Science | 2003

Agentized, Contextualized Filters for Information Management

David A. Evans; Gregory Grefenstette; Yan Qu; James G. Shanahan; Victor M. Sheftel

When people read or write documents, they spontaneously generate new information needs: for example, to understand the text they are reading, or to find additional information related to the points they are making in their drafts. Simultaneously, each Information Object (IO) (i.e., word, entity, term, concept, phrase, proposition, sentence, paragraph, section, document, collection, etc.) someone reads or writes also creates context for the other IOs in the same discourse. We present a conceptual model of Agentized, Contextualized Filters (ACFs): agents that identify an appropriate context for an information object and then actively fetch and filter relevant information concerning the information object in other information sources the user has access to. We illustrate the use of ACFs in a prototype knowledge management system called ViviDocs.


International Joint Conference on Artificial Intelligence | 1995

Structure Cognition from Images

Anca L. Ralescu; James G. Shanahan

Inference of structures in an image, based on a fuzzy logic approach to perceptual organization, is presented. Fuzzy sets and fuzzy logic are useful for representing organization properties and for inference. The emphasis here is on an iterative scheme of inference of structures from lower-level tokens, rather than on search. Learning to perform perceptual organization is also discussed.


Conference on Information and Knowledge Management | 2010

Exploiting sequential relationships for familial classification

Lee S. Jensen; James G. Shanahan

The pervasive nature of the internet has caused a significant transformation in the field of genealogical research. This has impacted not only how research is conducted, but has also dramatically increased the number of people discovering their family history. Recent market research (Maritz Marketing 2000, Harris Interactive 2009) indicates that general interest in the United States has increased from 45% in 1996, to 60% in 2000, and 87% in 2009. This increased popularity has created a dramatic need for improvements in algorithms for extracting, accessing, and processing genealogical data for use in building family trees. This paper presents one approach to algorithmic improvement in the family history domain, where we infer the familial relationships of households found in human-transcribed United States census data. By applying advances made in natural language processing, exploiting the sequential nature of the census, and using state-of-the-art machine learning algorithms, we were able to decrease the error by 35% over a hand-coded baseline system. The resulting system is immediately applicable to hundreds of millions of other genealogical records where families are represented but the familial relationships are missing.
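The sequential intuition can be sketched as follows: each census line is labelled with features drawn from both the current person and the previous line in the household, since census enumeration order (head first, then spouse, then children) is itself informative. All field names and the toy rule-based "classifier" below are purely illustrative stand-ins for the learned model described in the paper.

```python
# Hedged sketch of sequential features for familial classification.
# The fields, thresholds, and rules are invented for illustration.

def features(household, i):
    """Features for person i, including sequential context from line i-1."""
    person = household[i]
    prev = household[i - 1] if i > 0 else None
    return {
        "position": i,  # heads are usually enumerated first
        "sex": person["sex"],
        "age_gap_prev": (person["age"] - prev["age"]) if prev else 0,
        "same_surname_prev": prev is not None
        and person["surname"] == prev["surname"],
    }

def classify(household):
    """Toy stand-in for the learned classifier: position + age gaps."""
    labels = []
    for i in range(len(household)):
        f = features(household, i)
        if f["position"] == 0:
            labels.append("head")
        elif f["position"] == 1 and abs(f["age_gap_prev"]) < 15:
            labels.append("spouse")
        else:
            labels.append("child")
    return labels

household = [
    {"surname": "Doe", "sex": "M", "age": 40},
    {"surname": "Doe", "sex": "F", "age": 38},
    {"surname": "Doe", "sex": "F", "age": 12},
]
result = classify(household)  # ['head', 'spouse', 'child']
```

A real system would feed features like these into a sequence-aware learner rather than hand-written rules, but the feature-extraction step shows where the sequential signal enters.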

Collaboration


Top co-authors of James G. Shanahan:

David A. Evans (Carnegie Mellon University)

Gregory Grefenstette (Seattle Children's Research Institute)

Abdur Chowdhury (Illinois Institute of Technology)