
Publication


Featured research published by Craig Saunders.


Journal of Machine Learning Research | 2002

Text classification using string kernels

Huma Lodhi; Craig Saunders; John Shawe-Taylor; Nello Cristianini; Chris Watkins

We introduce a novel kernel for comparing two text documents. The kernel is an inner product in the feature space consisting of all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text, though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor of their full length in the text, hence emphasising those occurrences that are close to contiguous. A direct computation of this feature vector would involve a prohibitive amount of computation even for modest values of k, since the dimension of the feature space grows exponentially with k. The paper describes how, despite this fact, the inner product can be efficiently evaluated by a dynamic programming technique. A preliminary experimental comparison of the kernel with a standard word feature space kernel [6] shows encouraging results.
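
As an illustration of the dynamic programming evaluation, the sketch below computes the subsequence kernel K_k(s, t) with decay factor lambda using the recursive definition from the paper. It is a minimal, memoised implementation rather than the optimised version described there; the function names are ours.

```python
from functools import lru_cache

def ssk(s, t, k, lam=0.5):
    """String subsequence kernel K_k(s, t) with decay factor lam."""

    @lru_cache(maxsize=None)
    def k_prime(i, m, n):
        # auxiliary kernel K'_i evaluated on the prefixes s[:m], t[:n]
        if i == 0:
            return 1.0
        if min(m, n) < i:
            return 0.0
        x = s[m - 1]
        total = lam * k_prime(i, m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += k_prime(i - 1, m - 1, j - 1) * lam ** (n - j + 2)
        return total

    @lru_cache(maxsize=None)
    def k_full(i, m, n):
        # kernel K_i evaluated on the prefixes s[:m], t[:n]
        if min(m, n) < i:
            return 0.0
        x = s[m - 1]
        total = k_full(i, m - 1, n)
        for j in range(1, n + 1):
            if t[j - 1] == x:
                total += k_prime(i - 1, m - 1, j - 1) * lam ** 2
        return total

    return k_full(k, len(s), len(t))

# e.g. the un-normalised value for the classic "cat"/"car" example with k = 2 is lam**4
print(ssk("cat", "car", k=2, lam=0.5))   # 0.0625
# in practice the kernel is normalised: K(s, t) / sqrt(K(s, s) * K(t, t))
```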


International Conference on Machine Learning | 2005

Learning hierarchical multi-category text classification models

Juho Rousu; Craig Saunders; Sandor Szedmak; John Shawe-Taylor

We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. Experiments show that the algorithm can feasibly optimize training sets of thousands of examples and classification hierarchies consisting of hundreds of nodes. The algorithm's predictive accuracy is competitive with other recently introduced hierarchical multi-category or multilabel classification learning algorithms.
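
The optimisation machinery of the paper is involved, but the hierarchical multilabel constraint itself is easy to picture: a node may only be predicted positive if its parent is positive. The sketch below applies only that constraint to per-node scores from an arbitrary classifier; it is not the paper's Maximum Margin Markov Network training, and the label tree, scores, and names are ours.

```python
# A minimal sketch of hierarchy-consistent multilabel prediction on a label tree.

def consistent_prediction(scores, parent, threshold=0.0):
    """scores: dict node -> real-valued score; returns the set of predicted labels."""
    predicted = set()
    for node in topological_order(parent):        # parents are visited before children
        if scores[node] > threshold and (parent[node] is None or parent[node] in predicted):
            predicted.add(node)
    return predicted

def topological_order(parent):
    # order nodes by depth so that every parent precedes its children
    def depth(n):
        d = 0
        while parent[n] is not None:
            n, d = parent[n], d + 1
        return d
    return sorted(parent, key=depth)

# toy hierarchy: root -> {sports, politics}, sports -> {football}
parent = {"root": None, "sports": "root", "politics": "root", "football": "sports"}
scores = {"root": 1.0, "sports": 0.4, "politics": -0.2, "football": 0.9}
print(consistent_prediction(scores, parent))      # {'root', 'sports', 'football'}
```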


NeuroImage | 2011

Kernel regression for fMRI pattern prediction

Carlton Chu; Yizhao Ni; Geoffrey Tan; Craig Saunders; John Ashburner

This paper introduces two kernel-based regression schemes to decode or predict brain states from functional brain scans as part of the Pittsburgh Brain Activity Interpretation Competition (PBAIC) 2007, in which our team was awarded first place. Our procedure involved image realignment, spatial smoothing, detrending of low-frequency drifts, and application of multivariate linear and non-linear kernel regression methods: namely kernel ridge regression (KRR) and relevance vector regression (RVR). RVR is based on a Bayesian framework, which automatically determines a sparse solution through maximization of marginal likelihood. KRR is the dual-form formulation of ridge regression, which solves regression problems with high-dimensional data in a computationally efficient way. Feature selection based on prior knowledge about human brain function was also used. Post-processing by constrained deconvolution and re-convolution was used to furnish the prediction. This paper also contains a detailed description of how prior knowledge was used to fine-tune predictions of specific “feature ratings,” which we believe is one of the key factors in our prediction accuracy. The impact of pre-processing was also evaluated, demonstrating that different pre-processing may lead to significantly different accuracies. Although the original work was aimed at the PBAIC, many of the techniques described in this paper can be applied generally to fMRI decoding work to increase prediction accuracy.
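
Kernel ridge regression in particular has a compact closed-form dual solution, which is what makes it attractive for high-dimensional voxel data: only an n x n system over the n training scans needs to be solved. The following is a minimal sketch with a linear kernel and toy data; the variable names and parameter values are ours, not the paper's.

```python
import numpy as np

# Dual-form ridge regression (KRR): with kernel matrix K over n training scans,
# the dual coefficients are alpha = (K + lam*I)^{-1} y, so the cost depends on the
# number of scans rather than on the (much larger) number of voxels.

rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 100, 10, 20000
X_train = rng.normal(size=(n_train, n_voxels))        # training feature vectors (toy)
y_train = rng.normal(size=n_train)                    # continuous "rating" targets (toy)
X_test = rng.normal(size=(n_test, n_voxels))

lam = 10.0                                            # ridge penalty
K = X_train @ X_train.T                               # linear kernel, n x n
alpha = np.linalg.solve(K + lam * np.eye(n_train), y_train)

K_test = X_test @ X_train.T                           # kernel between test and training scans
y_pred = K_test @ alpha                               # predicted ratings
```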


Advanced Data Mining and Applications | 2006

A correlation approach for automatic image annotation

David R. Hardoon; Craig Saunders; Sandor Szedmak; John Shawe-Taylor

The automatic annotation of images presents a particularly complex problem for machine learning researchers. In this work we experiment with semantic models and multi-class learning for the automatic annotation of query images. We represent the images using scale-invariant transformation descriptors in order to account for similar objects appearing at slightly different scales and transformations. The resulting descriptors are utilised as visual terms for each image. We first aim to annotate query images by retrieving images that are similar to the query image; this approach relies on the assumption that similar images are annotated similarly. We then propose an image annotation method that learns a direct mapping from image descriptors to keywords. We compare the semantics-based methods of Latent Semantic Indexing and Kernel Canonical Correlation Analysis (KCCA), as well as a recently proposed vector-label-based learning method known as Maximum Margin Robot.
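
To give a concrete picture of the correlation approach, the sketch below runs regularised canonical correlation analysis between image descriptors and keyword indicator vectors observed for the same training images. The paper uses the kernelised variant (KCCA); the linear case shown here only illustrates the shared-subspace idea, and the toy data, parameter values, and names are ours.

```python
import numpy as np

def cca(X, Y, kappa=1e-3, n_components=2):
    """Regularised linear CCA between two views observed on the same n samples."""
    X = X - X.mean(0)
    Y = Y - Y.mean(0)
    n = X.shape[0]
    Cxx = X.T @ X / n + kappa * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + kappa * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # whiten each view; the SVD of the whitened cross-covariance gives the correlations
    Rx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Ry = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    U, S, Vt = np.linalg.svd(Rx.T @ Cxy @ Ry)
    Wx = Rx @ U[:, :n_components]        # projection for image descriptors
    Wy = Ry @ Vt.T[:, :n_components]     # projection for keyword vectors
    return Wx, Wy, S[:n_components]

# toy usage: 200 images, 128-dim visual-term histograms, 20 possible keywords
rng = np.random.default_rng(0)
X = rng.random((200, 128))                       # image descriptors
Y = (rng.random((200, 20)) < 0.1).astype(float)  # keyword indicator vectors
Wx, Wy, corrs = cca(X, Y)
```

In a retrieval-style setting, for instance, a query image could be projected with Wx into the shared space and keywords propagated from the nearest training images there.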


Archive | 2006

Subspace, Latent Structure and Feature Selection

Craig Saunders; Marko Grobelnik; Steve R. Gunn; John Shawe-Taylor

Invited Contributions:
- Discrete Component Analysis
- Overview and Recent Advances in Partial Least Squares
- Random Projection, Margins, Kernels, and Feature-Selection
- Some Aspects of Latent Structure Analysis
- Feature Selection for Dimensionality Reduction

Contributed Papers:
- Auxiliary Variational Information Maximization for Dimensionality Reduction
- Constructing Visual Models with a Latent Space Approach
- Is Feature Selection Still Necessary?
- Class-Specific Subspace Discriminant Analysis for High-Dimensional Data
- Incorporating Constraints and Prior Knowledge into Factorization Algorithms - An Application to 3D Recovery
- A Simple Feature Extraction for High Dimensional Image Representations
- Identifying Feature Relevance Using a Random Forest
- Generalization Bounds for Subspace Selection and Hyperbolic PCA
- Less Biased Measurement of Feature Selection Benefits


Indian Conference on Computer Vision, Graphics and Image Processing | 2010

Learning moods and emotions from color combinations

Gabriela Csurka; Sandra Skaff; Luca Marchesotti; Craig Saunders

In this paper, we tackle the problem of associating combinations of colors with abstract categories (e.g. capricious, classic, cool, delicate, etc.). Since such concepts are difficult to distinguish using single colors, we consider combinations of colors, or color palettes. We leverage two novel databases of color palettes and learn categorization models using low- and high-level descriptors. Preliminary results show that the Fisher representation based on GMMs is the most rewarding strategy in terms of classification performance over a baseline model. We also suggest a process for cleaning weakly annotated data whilst preserving the visual coherence of categories. Finally, we demonstrate how abstract categories learned on color palettes can be used in applications such as color transfer, personalization, and image re-ranking.
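
As a concrete illustration of a Fisher representation based on a GMM, the sketch below fits a diagonal-covariance GMM to a pool of colours and encodes a palette by the gradient statistics with respect to the GMM means and variances, with the usual power and L2 normalisation. The toy data, component count, and names are ours, not the paper's setup.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
pool = rng.random((5000, 3))                      # toy pool of RGB colours in [0, 1]
gmm = GaussianMixture(n_components=8, covariance_type="diag", random_state=0).fit(pool)

def fisher_vector(palette, gmm):
    """palette: (T, 3) array of colours -> gradient statistics w.r.t. GMM means and variances."""
    T = palette.shape[0]
    q = gmm.predict_proba(palette)                # (T, K) posteriors gamma_t(k)
    mu, var, w = gmm.means_, gmm.covariances_, gmm.weights_
    sigma = np.sqrt(var)
    diff = (palette[:, None, :] - mu[None, :, :]) / sigma[None, :, :]       # (T, K, D)
    g_mu = (q[:, :, None] * diff).sum(0) / (T * np.sqrt(w)[:, None])
    g_sigma = (q[:, :, None] * (diff ** 2 - 1)).sum(0) / (T * np.sqrt(2 * w)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sigma.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))        # power normalisation
    return fv / max(np.linalg.norm(fv), 1e-12)    # L2 normalisation

fv = fisher_vector(rng.random((5, 3)), gmm)       # a 5-colour palette -> one feature vector
```

A standard classifier trained on such vectors then gives the categorization model; the paper's evaluation compares this against lower-level baselines.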


Algorithmic Learning Theory | 2000

Computationally Efficient Transductive Machines

Craig Saunders; Alexander Gammerman; Volodya Vovk

In this paper we propose a new algorithm for providing confidence and credibility values for predictions on a multi-class pattern recognition problem, implemented using Support Vector Machines. Previous algorithms proposed to achieve this are very computationally intensive and are only practical for small data sets. We present here a method which overcomes these limitations and can deal with larger data sets (such as the US Postal Service database). The measures of confidence and credibility given by the algorithm are shown empirically to reflect the quality of the predictions obtained by the algorithm, and are comparable to those given by the less computationally efficient method. In addition, the overall performance of the algorithm is shown to be comparable to other techniques (such as standard Support Vector Machines), which simply give flat predictions and do not provide the extra confidence/credibility measures.
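
To make the confidence and credibility measures concrete: for each candidate label of a new example one computes a p-value from nonconformity scores; the prediction is the label with the largest p-value, credibility is that largest p-value, and confidence is one minus the second largest. The sketch below uses a simple nearest-neighbour ratio as a stand-in nonconformity measure rather than the SVM-based machinery of the paper; the data and names are ours.

```python
import numpy as np

def nn_nonconformity(x, y, X, Y):
    """Distance to nearest same-class example divided by distance to nearest other-class example."""
    d = np.linalg.norm(X - x, axis=1)
    same = d[(Y == y) & (d > 0)]                 # exclude the example itself
    other = d[Y != y]
    return same.min() / max(other.min(), 1e-12)

def predict_with_confidence(X_train, y_train, x, labels):
    p = {}
    for label in labels:
        # nonconformity of training examples and of the new example under the candidate label
        a_train = np.array([nn_nonconformity(xi, yi, X_train, y_train)
                            for xi, yi in zip(X_train, y_train)])
        a_new = nn_nonconformity(x, label, X_train, y_train)
        p[label] = (np.sum(a_train >= a_new) + 1) / (len(a_train) + 1)
    ranked = sorted(p.values(), reverse=True)
    prediction = max(p, key=p.get)
    return prediction, 1.0 - ranked[1], ranked[0]   # (prediction, confidence, credibility)

# toy usage with two Gaussian classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
Y = np.array([0] * 50 + [1] * 50)
print(predict_with_confidence(X, Y, np.array([2.8, 3.1]), labels=[0, 1]))
```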


Proceedings of SPIE | 2012

Image simulation for automatic license plate recognition

Raja Bala; Yonghui Zhao; Aaron Michael Burry; Vladimir Kozitsky; Claude S. Fillion; Craig Saunders; Jose A. Rodriguez-Serrano

Automatic license plate recognition (ALPR) is an important capability for traffic surveillance applications, including toll monitoring and detection of different types of traffic violations. ALPR is a multi-stage process comprising plate localization, character segmentation, optical character recognition (OCR), and identification of originating jurisdiction (i.e. state or province). Training of an ALPR system for a new jurisdiction typically involves gathering vast amounts of license plate images and associated ground truth data, followed by iterative tuning and optimization of the ALPR algorithms. The substantial time and effort required to train and optimize the ALPR system can result in excessive operational cost and overhead. In this paper we propose a framework to create an artificial set of license plate images for accelerated training and optimization of ALPR algorithms. The framework comprises two steps: the synthesis of license plate images according to the design and layout for a jurisdiction of interest; and the modeling of imaging transformations and distortions typically encountered in the image capture process. Distortion parameters are estimated by measurements of real plate images. The simulation methodology is successfully demonstrated for training of OCR.
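
The two-step framework can be pictured with a toy example: render a clean plate-like image for a chosen layout, then apply capture distortions whose parameters would, in the paper's setting, be estimated from measurements of real plate images. The sketch below uses placeholder fonts, geometry, and distortion values of our own choosing and is not the paper's simulation model.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def synthesize_plate(text, size=(240, 80)):
    # step 1: synthesise a clean plate image for the desired layout
    plate = Image.new("L", size, color=230)                          # light background
    draw = ImageDraw.Draw(plate)
    draw.rectangle([2, 2, size[0] - 3, size[1] - 3], outline=0, width=3)
    draw.text((20, 20), text, fill=0, font=ImageFont.load_default())  # placeholder font
    return plate

def apply_distortions(plate, blur_sigma=1.2, angle=3.0, noise_std=8.0, seed=0):
    # step 2: model typical capture distortions (values here are arbitrary placeholders)
    img = plate.rotate(angle, resample=Image.BILINEAR, fillcolor=230)   # small in-plane rotation
    img = img.filter(ImageFilter.GaussianBlur(radius=blur_sigma))       # optics / motion blur proxy
    arr = np.asarray(img, dtype=float)
    arr += np.random.default_rng(seed).normal(0.0, noise_std, arr.shape)  # sensor noise
    return Image.fromarray(np.clip(arr, 0, 255).astype(np.uint8))

sample = apply_distortions(synthesize_plate("ABC 1234"))
sample.save("synthetic_plate.png")        # one artificial training image for the OCR stage
```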


The Visual Computer | 2011

Building look & feel concept models from color combinations: With applications in image classification, retrieval, and color transfer

Gabriela Csurka; Sandra Skaff; Luca Marchesotti; Craig Saunders

In this paper, we tackle the problem of associating combinations of colors with abstract concepts (e.g. capricious, classic, cool, delicate, etc.). Since such concepts are difficult to represent using single colors, we consider combinations of colors, or color palettes. We leverage two novel databases of color palettes, and learn categorization models using both low- and high-level descriptors. It is shown that the Bag of Colors and Fisher Vectors are the most rewarding descriptors for palette categorization and retrieval. A simple but novel and efficient method for cleaning weakly annotated data, whilst preserving the visual coherence of categories, is also given. Finally, we demonstrate that abstract category models learned on color palettes can be used in different applications such as image personalization, concept-based palette and image retrieval, and color transfer.
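
The Fisher-vector encoding is sketched after the companion conference paper above; for completeness, the following is a minimal sketch of the other descriptor mentioned here, a bag of colors: palette colours are quantised against a k-means codebook learned from a pool of colours, and the palette is represented by the resulting normalised histogram. The codebook size and toy data are arbitrary choices of ours.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
pool = rng.random((5000, 3))                         # toy pool of RGB colours in [0, 1]
codebook = KMeans(n_clusters=32, n_init=10, random_state=0).fit(pool)

def bag_of_colors(palette):
    """palette: (T, 3) array of colours -> 32-bin normalised histogram over the codebook."""
    words = codebook.predict(palette)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / hist.sum()

descriptor = bag_of_colors(rng.random((5, 3)))       # one palette -> one descriptor
```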


Meeting of the Association for Computational Linguistics | 2009

Handling phrase reorderings for machine translation

Yizhao Ni; Craig Saunders; Sandor Szedmak; Mahesan Niranjan

We propose a distance phrase reordering model (DPR) for statistical machine translation (SMT), where the aim is to capture phrase reorderings using a structure learning framework. On both the reordering classification and a Chinese-to-English translation task, we show improved performance over a baseline SMT system.
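
To illustrate the reordering-classification view, the sketch below treats each (source phrase, target phrase) pair as an instance with a reordering-orientation label and trains an off-the-shelf linear classifier on simple lexical features. This is only a stand-in for the paper's structure-learning DPR model; the features, labels, and toy examples are ours.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

def features(src_phrase, tgt_phrase):
    # simple lexical features of the phrase pair
    f = {"src=" + w: 1 for w in src_phrase.split()}
    f.update({"tgt=" + w: 1 for w in tgt_phrase.split()})
    f["src_len"] = len(src_phrase.split())
    f["tgt_len"] = len(tgt_phrase.split())
    return f

# toy training pairs with reordering-orientation labels (invented for illustration)
train = [
    (("hong kong", "xiang gang"), "monotone"),
    (("economic development", "jingji fazhan"), "monotone"),
    (("development of economy", "jingji de fazhan"), "swap"),
    (("meeting yesterday", "zuotian de huiyi"), "swap"),
]
X = [features(s, t) for (s, t), _ in train]
y = [label for _, label in train]

model = make_pipeline(DictVectorizer(), LinearSVC())
model.fit(X, y)
print(model.predict([features("economic growth", "jingji zengzhang")]))
```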

Collaboration


Dive into Craig Saunders's collaborations.

Top Co-Authors

Sandor Szedmak

Hong Kong University of Science and Technology

Yizhao Ni

University of Southampton

Steve R. Gunn

University of Southampton

Juho Rousu

University of Helsinki

Kitsuchart Pasupa

King Mongkut's Institute of Technology Ladkrabang
