Publication


Featured research published by Christopher J. C. Burges.


Data Mining and Knowledge Discovery | 1998

A Tutorial on Support Vector Machines for Pattern Recognition

Christopher J. C. Burges

The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.
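The linear, separable case the tutorial works through can be sketched in a few lines. The following is a toy primal subgradient descent on the regularized hinge loss — an illustrative stand-in, not the tutorial's dual quadratic program; the data and hyperparameters are made up:

```python
import numpy as np

# Toy linearly separable data, labels in {-1, +1} (made-up points).
X = np.array([[2.0, 2.0], [1.5, 2.5], [-2.0, -1.5], [-1.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = np.zeros(2)
b = 0.0
lam = 0.01   # regularization strength (illustrative choice)
lr = 0.1     # learning rate (illustrative choice)

for epoch in range(200):
    for xi, yi in zip(X, y):
        margin = yi * (w @ xi + b)
        if margin < 1:                 # point inside the margin: hinge loss active
            w = w - lr * (lam * w - yi * xi)
            b = b + lr * yi
        else:
            w = w - lr * lam * w       # only the regularizer contributes

pred = np.sign(X @ w + b)              # all four training points classified correctly
```

The regularizer `lam * w` is what drives margin maximization: it shrinks `w` whenever a point is safely outside the margin, so only the smallest `w` consistent with margins of 1 survives.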


IEEE Transactions on Neural Networks | 1999

Input space versus feature space in kernel-based methods

Bernhard Schölkopf; Sebastian Mika; Christopher J. C. Burges; Phil Knirsch; Klaus-Robert Müller; Gunnar Rätsch; Alexander J. Smola

This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use it to reduce the computational complexity of SV decision functions; second, we combine it with the Kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.
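For Gaussian RBF kernels, the preimage problem discussed above admits a simple fixed-point iteration (each step moves `z` to a kernel-weighted mean of the expansion points). A sketch under made-up assumptions — the expansion points, coefficients `alpha`, and width `gamma` below are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))       # points whose feature-space images we expand over
alpha = rng.random(10)             # expansion coefficients (hypothetical)
gamma = 0.5                        # RBF kernel width (hypothetical)

def rbf(A, z):
    return np.exp(-gamma * np.sum((A - z) ** 2, axis=-1))

# Fixed-point iteration for an approximate preimage z of the expansion
# Psi = sum_i alpha_i * phi(x_i) under the Gaussian kernel.
z = X.mean(axis=0)                 # initialize at the data mean
for _ in range(200):
    k = alpha * rbf(X, z)          # kernel-weighted coefficients at current z
    z = (k @ X) / k.sum()          # move z to the weighted mean of the x_i
```

Since each iterate is a convex combination of the expansion points, `z` always stays inside their convex hull, which is one reason this scheme is well behaved for denoising-style applications.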


International Conference on Artificial Neural Networks | 1996

Incorporating Invariances in Support Vector Learning Machines

Bernhard Schölkopf; Christopher J. C. Burges; Vladimir Vapnik

Developed only recently, support vector learning machines achieve high generalization ability by minimizing a bound on the expected test error; however, so far there existed no way of adding knowledge about invariances of a classification problem at hand. We present a method of incorporating prior knowledge about transformation invariances by applying transformations to support vectors, the training examples most critical for determining the classification boundary.
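The virtual-support-vector idea — apply the known invariance transformation to the support vectors, keep their labels, and retrain — reduces to a data-augmentation step. A sketch in which the 4x4 "images", labels, and one-pixel-shift transformation are all illustrative assumptions:

```python
import numpy as np

# Suppose these are the support vectors of an already-trained classifier,
# each a flattened 4x4 "image" (made-up data for illustration).
support_vectors = np.random.default_rng(1).random((3, 16))
sv_labels = np.array([1, -1, 1])

def shift_right(img_flat, size=4):
    """The assumed invariance transformation: shift the image one pixel right."""
    img = img_flat.reshape(size, size)
    shifted = np.zeros_like(img)
    shifted[:, 1:] = img[:, :-1]
    return shifted.ravel()

# Virtual support vectors: transformed copies keep their original labels.
virtual = np.array([shift_right(sv) for sv in support_vectors])
augmented_X = np.vstack([support_vectors, virtual])
augmented_y = np.concatenate([sv_labels, sv_labels])
# A second training pass on (augmented_X, augmented_y) then yields a machine
# that is locally invariant to the transformation.
```

Augmenting only the support vectors, rather than the whole training set, keeps the second training pass cheap, since the support vectors are exactly the examples that determine the boundary.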


International Conference on Artificial Neural Networks | 1996

Comparison of View-Based Object Recognition Algorithms Using Realistic 3D Models

Volker Blanz; Bernhard Schölkopf; Heinrich H. Bülthoff; Christopher J. C. Burges; Vladimir Vapnik; Thomas Vetter

Two view-based object recognition algorithms are compared: (1) a heuristic algorithm based on oriented filters, and (2) a support vector learning machine trained on low-resolution images of the objects. Classification performance is assessed using a high number of images generated by a computer graphics system under precisely controlled conditions. Training- and test-images show a set of 25 realistic three-dimensional models of chairs from viewing directions spread over the upper half of the viewing sphere. The percentage of correct identification of all 25 objects is measured.


Mustererkennung 1998, 20. DAGM-Symposium | 1998

Fast Approximation of Support Vector Kernel Expansions, and an Interpretation of Clustering as Approximation in Feature Spaces

Bernhard Schölkopf; Alexander J. Smola; Phil Knirsch; Christopher J. C. Burges

Kernel-based learning methods provide their solutions as expansions in terms of a kernel. We consider the problem of reducing the computational complexity of evaluating these expansions by approximating them using fewer terms. As a by-product, we point out a connection between clustering and approximation in reproducing kernel Hilbert spaces generated by a particular class of kernels.
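One way to see the clustering connection: for an RBF expansion with uniform coefficients, replacing the expansion points by cluster centers (each carrying its cluster's summed coefficient) yields a reduced expansion with far fewer terms. A toy sketch under those assumptions, with a hand-rolled k-means and made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))          # expansion points (made-up)
alpha = np.ones(40) / 40              # uniform coefficients, summing to 1
gamma = 0.3                           # RBF width (illustrative)

def rbf(A, z):
    return np.exp(-gamma * ((A - z) ** 2).sum(axis=-1))

def expansion(points, coeffs, z):
    return coeffs @ rbf(points, z)

# Tiny k-means (k = 4) to pick reduced-set centers.
k = 4
centers = X[:k].copy()
for _ in range(25):
    d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d.argmin(axis=1)
    for j in range(k):
        if (assign == j).any():
            centers[j] = X[assign == j].mean(axis=0)

# Each center carries the summed coefficient of its cluster.
beta = np.array([alpha[assign == j].sum() for j in range(k)])

z = np.array([0.1, -0.2])             # arbitrary evaluation point
full = expansion(X, alpha, z)         # 40-term expansion
reduced = expansion(centers, beta, z) # 4-term approximation
```

Evaluating the reduced expansion costs 4 kernel evaluations instead of 40, which is the speed-up the paper is after; the approximation quality depends on how tightly the clusters concentrate relative to the kernel width.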


International Conference on Acoustics, Speech and Signal Processing | 1999

Distinctive feature detection using support vector machines

Partha Niyogi; Christopher J. C. Burges; Padma Ramesh

An important aspect of distinctive feature based approaches to automatic speech recognition is the formulation of a framework for robust detection of these features. We discuss the application of the support vector machines (SVM) that arise when the structural risk minimization principle is applied to such feature detection problems. In particular, we describe the problem of detecting stop consonants in continuous speech and discuss an SVM framework for detecting these sounds. In this paper we use both linear and nonlinear SVMs for stop detection and present experimental results to show that they perform better than a cepstral features based hidden Markov model (HMM) system, on the same task.


Journal of the Acoustical Society of America | 2003

Discriminative Gaussian mixture models for speaker verification

Christopher J. C. Burges

Speaker identification is performed using a single Gaussian mixture model (GMM) for multiple speakers—referred to herein as a Discriminative Gaussian mixture model (DGMM). A likelihood sum of the single GMM is factored into two parts, one of which depends only on the Gaussian mixture model, and the other of which is a discriminative term. The discriminative term allows for the use of a binary classifier, such as a support vector machine (SVM). In one embodiment of the invention, a voice messaging system incorporates a DGMM to identify the speaker who generated a message, if that speaker is a member of a chosen list of target speakers, or to identify the speaker as a “non-target” otherwise.


International Symposium on Neural Networks | 1992

Shortest path segmentation: a method for training a neural network to recognize character strings

Christopher J. C. Burges; Ofer Matan; Y. Le Cun; John S. Denker; Lawrence D. Jackel; Charles E. Stenard; Craig R. Nohl; Jan Ben

The authors describe a method which combines dynamic programming and a neural network recognizer for segmenting and recognizing character strings. The method selects the optimal consistent combination of cuts from a set of candidate cuts generated using heuristics. The optimal segmentation is found by representing the image, the candidate segments, and their scores as a graph in which the shortest path corresponds to the optimal interpretation. The scores are given by neural net outputs for each segment. A significant advantage of the method is that the labor required to segment images manually is eliminated. The system was trained on approximately 7000 unsegmented handwritten zip codes provided by the United States Postal Service. The system has achieved a per-zip-code raw recognition rate of 81% on a 2368 handwritten zip-code test set.
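The shortest-path construction can be sketched as a dynamic program over cut positions: nodes are positions between characters, each edge is a candidate segment, and edge weights come from the recognizer. The cost function below is an illustrative stand-in for the neural-net scores:

```python
import math

# Candidate cut positions 0..n split a string into candidate segments.
string = "1234"
n = len(string)

def segment_cost(i, j):
    """Stand-in for the recognizer score of string[i:j] as one character
    (as a negative log-likelihood): single characters are cheap here,
    longer candidate segments expensive. Purely illustrative."""
    return 0.1 if j - i == 1 else 2.0 * (j - i)

# Dynamic program: best[j] = cheapest way to segment string[:j].
best = [math.inf] * (n + 1)
back = [0] * (n + 1)
best[0] = 0.0
for j in range(1, n + 1):
    for i in range(j):
        c = best[i] + segment_cost(i, j)
        if c < best[j]:
            best[j], back[j] = c, i

# Recover the optimal cut sequence: the shortest path through the graph.
cuts, j = [], n
while j > 0:
    cuts.append((back[j], j))
    j = back[j]
cuts.reverse()
```

With the toy costs above the optimal path keeps every single-character segment, `[(0, 1), (1, 2), (2, 3), (3, 4)]`. The same graph also supports the training trick in the abstract: the shortest path both segments the image and supplies labeled segments to the network.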


IEEE Computer | 1992

Reading handwritten digits: a ZIP code recognition system

Ofer Matan; Henry S. Baird; Jane Bromley; Christopher J. C. Burges; John S. Denker; Lawrence D. Jackel; Y. Le Cun; Edwin P. D. Pednault; W.D. Satterfield; Charles E. Stenard; T.J. Thompson

A neural network algorithm-based system that reads handwritten ZIP codes appearing on real US mail is described. The system uses a recognition-based segmenter that is a hybrid of connected-components analysis (CCA), vertical cuts, and a neural network recognizer. Connected components that are single digits are handled by CCA. CCs that are combined or dissected digits are handled by the vertical-cut segmenter. The four main stages of processing are preprocessing, in which noise is removed and the digits are deslanted; CCA segmentation and recognition; vertical-cut-point estimation and segmentation; and directory lookup. The system was trained and tested on approximately 10,000 images of five- and nine-digit ZIP code fields taken from real mail.


International Journal of Pattern Recognition and Artificial Intelligence | 1993

Off-Line Recognition of Handwritten Postal Words Using Neural Networks

Christopher J. C. Burges; Jan Ben; John S. Denker; Yann LeCun; Craig R. Nohl

We describe a method, “Shortest Path Segmentation” (SPS), which combines dynamic programming and a neural net recognizer for segmenting and recognizing character strings. We describe the application of this method to two problems: recognition of handwritten ZIP Codes, and recognition of handwritten words. For the ZIP Codes, we also used the method to automatically segment the images during training: the dynamic programming stage both performs the segmentation and provides inputs and desired outputs to the neural network. Results are reported for a test set of 2642 unsegmented handwritten 212 dpi binary ZIP Code (5- and 9-digit) images. For handwritten word recognition, we combined SPS with a “Space Displacement Neural Network” approach, in which a single-character-recognition network is extended over the entire word image, and in which SPS techniques are then used to rank order a given lexicon. We report results on a test set of 3000 300 ppi gray scale word images, extracted from images of live mail pieces, for lexicons of size 10, 100, and 1000. Representing the problem as a graph as proposed in this paper has advantages beyond the efficient finding of the final optimal segmentation, or the automatic segmentation of images during training. We can also easily extend the technique to generate K “runner up” answers (for example, by finding the K shortest paths). This paper will also describe applications of some of these ideas.
