Publications


Featured research published by Nigel Duffy.


Bioinformatics | 2000

Support vector machine classification and validation of cancer tissue samples using microarray expression data

Terrence S. Furey; Nello Cristianini; Nigel Duffy; David W. Bednarski; Michèl Schummer; David Haussler

MOTIVATION: DNA microarray experiments, which generate thousands of gene expression measurements, are being used to gather information from tissue and cell samples regarding gene expression differences that will be useful in diagnosing disease. We have developed a new method to analyse this kind of data using support vector machines (SVMs). This analysis consists of both classification of the tissue samples and an exploration of the data for mislabeled or questionable tissue results. RESULTS: We demonstrate the method in detail on samples consisting of ovarian cancer tissues, normal ovarian tissues, and other normal tissues. The dataset consists of expression experiment results for 97,802 cDNAs for each tissue. Through computational analysis, a tissue sample is discovered and confirmed to be wrongly labeled. Upon correction of this mistake and the removal of an outlier, perfect classification of tissues is achieved, but not with high confidence. We identify and analyse a subset of genes from the ovarian dataset whose expression is highly differentiated between the types of tissues. To show the robustness of the SVM method, two previously published datasets from other types of tissues or cells are analysed. The results are comparable to those previously obtained. We show that other machine learning methods also perform comparably to the SVM on many of these datasets. AVAILABILITY: The SVM software is available at http://www.cs.columbia.edu/~bgrundy/svm.
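
As a rough illustration of the workflow this abstract describes, here is a minimal sketch that classifies a placeholder expression matrix with a linear SVM, selects a small set of highly differentiated genes, and flags samples misclassified under leave-one-out prediction as candidates for mislabeling. It uses scikit-learn rather than the authors' software, and the data, gene count, and feature-selection step are illustrative assumptions, not the paper's exact pipeline.

```python
# Hypothetical sketch of SVM tissue classification on microarray data.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_tissues, n_genes = 31, 97802                 # cDNA count taken from the abstract
X = rng.normal(size=(n_tissues, n_genes))      # expression matrix (random placeholder)
y = rng.integers(0, 2, size=n_tissues)         # tumor vs. normal labels (placeholder)

# Select a small subset of highly differentiated genes, then fit a linear SVM.
clf = make_pipeline(SelectKBest(f_classif, k=50), SVC(kernel="linear"))

# Leave-one-out predictions; consistently misclassified samples are
# candidates for mislabeled tissues or outliers, as in the paper.
preds = cross_val_predict(clf, X, y, cv=LeaveOneOut())
print("possible mislabeled samples:", np.flatnonzero(preds != y))
```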


Machine Learning | 2002

Boosting Methods for Regression

Nigel Duffy; David P. Helmbold

In this paper we examine ensemble methods for regression that leverage or “boost” base regressors by iteratively calling them on modified samples. The most successful leveraging algorithm for classification is AdaBoost, an algorithm that requires only modest assumptions on the base learning method for its strong theoretical guarantees. We present several gradient descent leveraging algorithms for regression and prove AdaBoost-style bounds on their sample errors using intuitive assumptions on the base learners. We bound the complexity of the regression functions produced in order to derive PAC-style bounds on their generalization errors. Experiments validate our theoretical results.
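
As a concrete instance of the gradient-descent leveraging scheme the abstract describes, the sketch below uses squared loss: each round fits a base regressor to the current residuals (the negative gradient) and adds it to the ensemble with a small step size. The tree base learner, step size, and round count are illustrative assumptions, and the sketch omits the paper's complexity bounds and PAC-style analysis.

```python
# Minimal gradient-descent leveraging for squared-loss regression.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def leverage_regression(X, y, rounds=100, step=0.1, depth=2):
    pred = np.zeros_like(y, dtype=float)
    ensemble = []
    for _ in range(rounds):
        residual = y - pred                              # negative gradient of squared loss
        base = DecisionTreeRegressor(max_depth=depth).fit(X, residual)
        pred += step * base.predict(X)                   # step along the base hypothesis
        ensemble.append(base)
    return ensemble

# Usage on synthetic data:
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
models = leverage_regression(X, y)
```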


Theoretical Computer Science | 2002

A geometric approach to leveraging weak learners

Nigel Duffy; David P. Helmbold

AdaBoost is a popular and effective leveraging procedure for improving the hypotheses generated by weak learning algorithms. AdaBoost and many other leveraging algorithms can be viewed as performing a constrained gradient descent over a potential function. At each iteration the distribution over the sample given to the weak learner is proportional to the direction of steepest descent. We introduce a new leveraging algorithm based on a natural potential function. For this potential function, the direction of steepest descent can have negative components. Therefore, we provide two techniques for obtaining suitable distributions from these directions of steepest descent. The resulting algorithms have bounds that are incomparable to AdaBoost's. The analysis suggests that our algorithm is likely to perform better than AdaBoost on noisy data and with weak learners returning low-confidence hypotheses. Modest experiments confirm that our algorithm can perform better than AdaBoost in these situations.
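
To make the gradient-descent view concrete: under AdaBoost's exponential potential, sum_i exp(-y_i F(x_i)), the weight handed to the weak learner for example i is proportional to the potential's negative gradient in the margin y_i F(x_i), so it is automatically nonnegative. The paper's own potential differs, and its steepest-descent direction can have negative components, which is what the two proposed techniques repair; the sketch below only illustrates the shared reweighting framework, not the paper's potential.

```python
# Reweighting as steepest descent on AdaBoost's exponential potential.
import numpy as np

def reweight(margins):
    # margins[i] = y_i * F(x_i) for the current combined hypothesis F.
    grad = np.exp(-margins)       # per-example magnitude of the potential's gradient
    return grad / grad.sum()      # renormalize into a distribution for the weak learner

# Examples with large positive margins receive little weight:
print(reweight(np.array([2.0, 0.5, -1.0])))
```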


European Conference on Computational Learning Theory | 1999

A Geometric Approach to Leveraging Weak Learners

Nigel Duffy; David P. Helmbold

AdaBoost is a popular and effective leveraging procedure for improving the hypotheses generated by weak learning algorithms. AdaBoost and many other leveraging algorithms can be viewed as performing a constrained gradient descent over a potential function. At each iteration the distribution over the sample given to the weak learner is the direction of steepest descent. We introduce a new leveraging algorithm based on a natural potential function. For this potential function, the direction of steepest descent can have negative components. Therefore, we provide two transformations for obtaining suitable distributions from these directions of steepest descent. The resulting algorithms have bounds that are incomparable to AdaBoost's, and their empirical performance is similar to AdaBoost's.


International Symposium on Neural Networks | 1999

Using multiplicative algorithms to build cascade correlation networks

Nigel Duffy

Cascade correlation has been shown to learn effectively, producing small networks with low generalization error. However, difficulties remain with this approach: cascade correlation can produce networks with large depth and large fan-in. We propose the use of a multiplicative learning algorithm to address these problems. Experimental results indicate that these algorithms may produce sparse weight vectors, and theoretical results indicate that they behave substantially differently from the usual additive algorithms such as gradient descent and Quickprop. It is hoped that combining these two approaches will yield an effective neural network algorithm; we attempt to validate this and motivate further research.
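
For readers less familiar with the additive/multiplicative distinction the abstract draws, the sketch below contrasts a plain gradient-descent step with an exponentiated-gradient step on a single linear unit. Multiplicative updates of this kind tend to drive many weights toward zero, which is the sparsity the abstract alludes to; the learning rate and the simplex normalization are assumptions for illustration, not the paper's exact update.

```python
# Additive vs. multiplicative weight updates on a linear unit with squared error.
import numpy as np

def additive_step(w, x, y, lr=0.01):
    # Plain gradient descent: w <- w - lr * grad.
    return w - lr * (w @ x - y) * x

def multiplicative_step(w, x, y, lr=0.01):
    # Exponentiated gradient: scale each weight by exp(-lr * grad_i),
    # then renormalize to keep the weights on the probability simplex.
    w = w * np.exp(-lr * (w @ x - y) * x)
    return w / w.sum()

w = np.full(5, 0.2)                 # uniform initial weights
x, t = np.arange(5.0), 1.0          # one input/target pair
print(additive_step(w, x, t))
print(multiplicative_step(w, x, t))
```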


Neural Information Processing Systems | 2001

Convolution Kernels for Natural Language

Michael Collins; Nigel Duffy


Meeting of the Association for Computational Linguistics | 2002

New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron

Michael Collins; Nigel Duffy


Neural Information Processing Systems | 1999

Potential Boosters

Nigel Duffy; David P. Helmbold


Conference on Learning Theory | 2000

Leveraging for Regression

Nigel Duffy; David P. Helmbold


Archive | 2001

Parsing with a Single Neuron: Convolution Kernels for Natural Language Problems

Michael Collins; Nigel Duffy

Collaboration


Dive into Nigel Duffy's collaborations.

Top Co-Authors

Arun Jagota
University of California

David Haussler
University of California

Terrence S. Furey
University of North Carolina at Chapel Hill