Publication


Featured research published by Ryan Rifkin.


Nature | 2002

Prediction of central nervous system embryonal tumour outcome based on gene expression

Scott L. Pomeroy; Pablo Tamayo; Michelle Gaasenbeek; Lisa Marie Sturla; Michael Angelo; Margaret McLaughlin; John Kim; Liliana Goumnerova; Peter McL. Black; Ching Lau; Jeffrey C. Allen; David Zagzag; James M. Olson; Tom Curran; Jaclyn A. Biegel; Tomaso Poggio; Sayan Mukherjee; Ryan Rifkin; Gustavo Stolovitzky; David N. Louis; Jill P. Mesirov; Eric S. Lander; Todd R. Golub

Embryonal tumours of the central nervous system (CNS) represent a heterogeneous group of tumours about which little is known biologically, and whose diagnosis, on the basis of morphologic appearance alone, is controversial. Medulloblastomas, for example, are the most common malignant brain tumour of childhood, but their pathogenesis is unknown, their relationship to other embryonal CNS tumours is debated, and patients’ response to therapy is difficult to predict. We approached these problems by developing a classification system based on DNA microarray gene expression data derived from 99 patient samples. Here we demonstrate that medulloblastomas are molecularly distinct from other brain tumours including primitive neuroectodermal tumours (PNETs), atypical teratoid/rhabdoid tumours (AT/RTs) and malignant gliomas. Previously unrecognized evidence supporting the derivation of medulloblastomas from cerebellar granule cells through activation of the Sonic Hedgehog (SHH) pathway was also revealed. We show further that the clinical outcome of children with medulloblastomas is highly predictable on the basis of the gene expression profiles of their tumours at diagnosis.


Proceedings of the National Academy of Sciences of the United States of America | 2001

Multiclass cancer diagnosis using tumor gene expression signatures

Sridhar Ramaswamy; Pablo Tamayo; Ryan Rifkin; Sayan Mukherjee; Chen-Hsiang Yeang; Michael Angelo; Christine Ladd; Michael R. Reich; Eva Latulippe; Jill P. Mesirov; Tomaso Poggio; William L. Gerald; Massimo Loda; Eric S. Lander; Todd R. Golub

The optimal treatment of patients with cancer depends on establishing accurate diagnoses by using a complex combination of clinical and histopathological data. In some instances, this task is difficult or impossible because of atypical clinical presentation or histopathology. To determine whether the diagnosis of multiple common adult malignancies could be achieved purely by molecular classification, we subjected 218 tumor samples, spanning 14 common tumor types, and 90 normal tissue samples to oligonucleotide microarray gene expression analysis. The expression levels of 16,063 genes and expressed sequence tags were used to evaluate the accuracy of a multiclass classifier based on a support vector machine algorithm. Overall classification accuracy was 78%, far exceeding the accuracy of random classification (9%). Poorly differentiated cancers resulted in low-confidence predictions and could not be accurately classified according to their tissue of origin, indicating that they are molecularly distinct entities with dramatically different gene expression patterns compared with their well differentiated counterparts. Taken together, these results demonstrate the feasibility of accurate, multiclass molecular cancer classification and suggest a strategy for future clinical implementation of molecular cancer diagnostics.
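
A common way to realize the multiclass SVM scheme the abstract describes is to combine one-vs-all binary classifiers. The following is a minimal sketch, not the authors' code: it uses scikit-learn and random numbers standing in for the 16,063-gene expression matrix, and every size and parameter here is illustrative.

    # Hypothetical sketch: one-vs-all SVM multiclass classification,
    # with random data standing in for the tumor expression matrix.
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    n_samples, n_genes, n_classes = 218, 500, 14    # 500 genes instead of 16,063
    X = rng.normal(size=(n_samples, n_genes))       # stand-in expression levels
    y = rng.integers(0, n_classes, size=n_samples)  # stand-in tumor-type labels

    # One binary SVM per tumor type; the predicted class is the one
    # whose SVM returns the largest decision value.
    clf = OneVsRestClassifier(LinearSVC(C=1.0, max_iter=10000))
    scores = cross_val_score(clf, X, y, cv=5)
    print("cross-validated accuracy: %.2f" % scores.mean())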


Nature | 2004

General conditions for predictivity in learning theory

Tomaso Poggio; Ryan Rifkin; Sayan Mukherjee; Partha Niyogi

Developing theoretical foundations for learning is a key step towards understanding intelligence. ‘Learning from examples’ is a paradigm in which systems (natural or artificial) learn a functional relationship from a training set of examples. Within this paradigm, a learning algorithm is a map from the space of training sets to the hypothesis space of possible functional solutions. A central question for the theory is to determine conditions under which a learning algorithm will generalize from its finite training set to novel examples. A milestone in learning theory was a characterization of conditions on the hypothesis space that ensure generalization for the natural class of empirical risk minimization (ERM) learning algorithms that are based on minimizing the error on the training set. Here we provide conditions for generalization in terms of a precise stability property of the learning process: when the training set is perturbed by deleting one example, the learned hypothesis does not change much. This stability property stipulates conditions on the learning map rather than on the hypothesis space, subsumes the classical theory for ERM algorithms, and is applicable to more general algorithms. The surprising connection between stability and predictivity has implications for the foundations of learning theory and for the design of novel algorithms, and provides insights into problems as diverse as language learning and inverse problems in physics and engineering.
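
The stability property is easy to probe numerically. A toy sketch, assuming ridge regression as the learning algorithm and synthetic data (nothing below comes from the paper): delete one training point at a time, refit, and measure how far the learned predictions move.

    # Illustrative check of leave-one-out stability: refit after deleting
    # each training point and measure how much the learned function moves.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(1)
    X = rng.normal(size=(100, 5))
    y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=100)
    X_test = rng.normal(size=(50, 5))

    full = Ridge(alpha=1.0).fit(X, y).predict(X_test)
    changes = []
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # training set with point i removed
        loo = Ridge(alpha=1.0).fit(X[mask], y[mask]).predict(X_test)
        changes.append(np.max(np.abs(loo - full)))
    print("largest prediction change under deletion: %.4f" % max(changes))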


Journal of Computational Biology | 2003

Estimating Dataset Size Requirements for Classifying DNA Microarray Data

Sayan Mukherjee; Pablo Tamayo; Simon Rogers; Ryan Rifkin; Anna Engle; Colin Campbell; Todd R. Golub; Jill P. Mesirov

A statistical methodology for estimating dataset size requirements for classifying microarray data using learning curves is introduced. The goal is to use existing classification results to estimate dataset size requirements for future classification experiments and to evaluate the gain in accuracy and significance of classifiers built with additional data. The method is based on fitting inverse power-law models to construct empirical learning curves. It also includes a permutation test procedure to assess the statistical significance of classification performance for a given dataset size. This procedure is applied to several molecular classification problems representing a broad spectrum of levels of complexity.
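
A minimal sketch of the curve-fitting step, assuming the common inverse power-law form e(n) = a·n^(-alpha) + b and made-up error rates; the paper's exact model family and data are not reproduced here.

    # Hypothetical sketch: fit an inverse power-law learning curve to
    # empirical error rates, then extrapolate to a larger dataset size.
    # All data points below are made up.
    import numpy as np
    from scipy.optimize import curve_fit

    def power_law(n, a, alpha, b):
        return a * n ** (-alpha) + b

    sizes = np.array([10, 20, 40, 80, 160])           # training-set sizes
    errors = np.array([0.35, 0.27, 0.21, 0.17, 0.15]) # observed test error rates

    params, _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.5, 0.1))
    a, alpha, b = params
    print("predicted error at n=500: %.3f" % power_law(500, a, alpha, b))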


Advances in Computational Mathematics | 2006

Learning theory: stability is sufficient for generalization and necessary and sufficient for consistency of empirical risk minimization

Sayan Mukherjee; Partha Niyogi; Tomaso Poggio; Ryan Rifkin

Solutions of learning problems by Empirical Risk Minimization (ERM) – and almost-ERM when the minimizer does not exist – need to be consistent, so that they may be predictive. They also need to be well-posed in the sense of being stable, so that they might be used robustly. We propose a statistical form of stability, defined as leave-one-out (LOO) stability. We prove that for bounded loss classes LOO stability is (a) sufficient for generalization, that is, convergence in probability of the empirical error to the expected error, for any algorithm satisfying it, and (b) necessary and sufficient for consistency of ERM. Thus LOO stability is a weak form of stability that represents a sufficient condition for generalization for symmetric learning algorithms while subsuming the classical conditions for consistency of ERM. In particular, we conclude that a certain form of well-posedness and consistency are equivalent for ERM.
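
Schematically, the condition has roughly the following shape (a paraphrase, not a verbatim quote, assuming S is a training set of n examples, S^i is S with example z_i removed, f_S is the hypothesis learned from S, and V is a bounded loss):

    % Leave-one-out stability, schematically: deleting one training point
    % changes the loss at that point by at most beta_n, with probability
    % at least 1 - delta_n, where beta_n and delta_n vanish as n grows.
    \mathbb{P}_S\left\{ \bigl| V(f_{S^i}, z_i) - V(f_S, z_i) \bigr| \le \beta_n \right\} \ge 1 - \delta_n,
    \qquad \beta_n \to 0, \quad \delta_n \to 0 .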


International Conference on Acoustics, Speech, and Signal Processing | 2000

Using the Fisher kernel method for Web audio classification

Pedro J. Moreno; Ryan Rifkin

As the multimedia content of the Web increases, techniques to automatically classify this content become more important. We present a system to classify audio files collected from the Web. The system classifies any audio file as belonging to one of three categories: speech, music, and other. To classify the audio files, we use the technique of Fisher kernels. The technique as proposed by Jaakkola (1998) assumes a probabilistic generative model for the data, in our case a Gaussian mixture model (GMM). A discriminative classifier then uses the GMM as an intermediate step to produce appropriate feature vectors; support vector machines are our choice of discriminative classifier. We present classification results on a collection of more than 173 hours of randomly collected Web audio. We believe our results represent one of the first realistic studies of audio classification performance on found data. Our final system yielded a classification rate of 81.8%.
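
A rough sketch of the recipe, not the authors' system: it uses scikit-learn, random vectors in place of real audio features, and the common simplification of taking gradients with respect to the component means only (the paper follows Jaakkola's fuller formulation).

    # Hypothetical sketch of the Fisher-kernel recipe: fit a GMM, map each
    # point to the gradient of its log-likelihood with respect to the
    # component means, and feed those vectors to an SVM.
    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.svm import SVC

    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 13))     # stand-in for audio feature vectors
    y = rng.integers(0, 3, size=300)   # stand-in speech/music/other labels

    gmm = GaussianMixture(n_components=8, covariance_type="diag",
                          random_state=0).fit(X)

    def fisher_scores(X):
        # Gradient of log p(x) w.r.t. each component mean:
        # resp_k(x) * (x - mu_k) / var_k, flattened into one vector.
        resp = gmm.predict_proba(X)               # (n, K) responsibilities
        diff = X[:, None, :] - gmm.means_[None]   # (n, K, d)
        grad = resp[:, :, None] * diff / gmm.covariances_[None]
        return grad.reshape(len(X), -1)

    clf = SVC(kernel="linear").fit(fisher_scores(X), y)
    print("training accuracy:", clf.score(fisher_scores(X), y))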


SIAM Review | 2003

An Analytical Method for Multiclass Molecular Cancer Classification

Ryan Rifkin; Sayan Mukherjee; Pablo Tamayo; Sridhar Ramaswamy; Chen-Hsiang Yeang; Michael Angelo; Michael R. Reich; Tomaso Poggio; Eric S. Lander; Todd R. Golub; Jill P. Mesirov



Archive | 2003

Regression and Classification with Regularization

Sayan Mukherjee; Ryan Rifkin; Tomaso Poggio

The purpose of this chapter is to present a theoretical framework for the problem of learning from examples. Learning from examples can be regarded [13] as the problem of approximating a multivariate function from sparse data. The function can be real valued, as in regression, or binary valued, as in classification. The problem of approximating a function from sparse data is ill-posed, and a classical solution is regularization theory [19].
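
In its standard Tikhonov form, sketched here rather than in the chapter's exact notation, regularization trades data fit against smoothness in a reproducing kernel Hilbert space:

    % Tikhonov regularization: minimize empirical loss plus an RKHS penalty.
    f^{*} = \arg\min_{f \in \mathcal{H}_K} \;
            \frac{1}{n} \sum_{i=1}^{n} V\bigl(y_i, f(x_i)\bigr) + \lambda \|f\|_K^{2}
    % By the representer theorem the minimizer is a kernel expansion,
    % f^{*}(x) = \sum_{i=1}^{n} c_i K(x, x_i); for the square loss the
    % coefficients solve the linear system (K + \lambda n I) c = y.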


International Conference on Acoustics, Speech, and Signal Processing | 2007

Noise Robust Phonetic Classification with Linear Regularized Least Squares and Second-Order Features

Ryan Rifkin; Ken Schutte; Michelle Saad; Jake V. Bouvrie; James R. Glass

We perform phonetic classification with an architecture whose elements are binary classifiers trained via linear regularized least squares (RLS). RLS is a simple yet powerful regularization algorithm with the desirable property that a good value of the regularization parameter can be found efficiently by minimizing leave-one-out error on the training set. Our system achieves state-of-the-art single classifier performance on the TIMIT phonetic classification task, (slightly) beating other recent systems. We also show that in the presence of additive noise, our model is much more robust than a well-trained Gaussian mixture model.
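
The efficient leave-one-out computation rests on a classical identity for regularized least squares: the LOO residual at point i equals (y_i - ŷ_i) / (1 - S_ii), where S is the smoother matrix. A sketch of the idea, assuming a linear kernel and synthetic data (the paper's features and setup are not reproduced): with the kernel matrix eigendecomposed once, sweeping the regularization parameter is cheap.

    # Hypothetical sketch of closed-form leave-one-out error for RLS:
    # decompose K = Q diag(s) Q^T once, then get LOO residuals for any
    # lambda from the smoother-matrix identity.
    import numpy as np

    rng = np.random.default_rng(3)
    X = rng.normal(size=(200, 10))
    y = X @ rng.normal(size=10) + 0.1 * rng.normal(size=200)

    K = X @ X.T                      # linear kernel
    s, Q = np.linalg.eigh(K)         # eigendecompose once
    Qty = Q.T @ y

    for lam in [1e-3, 1e-1, 1e1, 1e3]:
        # Smoother matrix S = K (K + lam I)^{-1}; only its diagonal
        # and the fitted values are needed for the LOO residuals.
        shrink = s / (s + lam)               # eigenvalues of S
        y_hat = Q @ (shrink * Qty)           # fitted values S y
        S_diag = (Q ** 2) @ shrink           # diagonal of S
        loo_resid = (y - y_hat) / (1.0 - S_diag)
        print("lambda=%8.3f  LOO MSE=%.4f" % (lam, np.mean(loo_resid ** 2)))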


Journal of Computational Biology | 2003

Bayesian estimation of transcript levels using a general model of array measurement noise

Ron O. Dror; Jonathan G. Murnick; Nicola J. Rinaldi; Voichita D. Marinescu; Ryan Rifkin; Richard A. Young

Gene arrays demonstrate a promising ability to characterize expression levels across the entire genome but suffer from significant levels of measurement noise. We present a rigorous new approach to estimate transcript levels and ratios from one or more gene array experiments, given a model of measurement noise and available prior information. The Bayesian estimation of array measurements (BEAM) technique provides a principled method to identify changes in expression level, combine repeated measurements, or deal with negative expression level measurements. BEAM is more flexible than existing techniques, because it does not assume a specific functional form for noise and prior models. Instead, it relies on computational techniques that apply to a broad range of models. We use Affymetrix yeast chip data to illustrate the process of developing accurate noise and prior models from existing experimental data. The resulting noise model includes novel features such as heavy-tailed additive noise and a gene-specific bias term. We also verify that the resulting noise and prior models fit data from an Affymetrix human chip set.
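
As a toy illustration of this style of estimation, and not BEAM itself: the sketch below computes a posterior over a single transcript level on a grid, with a Student-t distribution standing in for the heavy-tailed noise model and a log-normal standing in for the empirical prior. Every number here is made up.

    # Hypothetical sketch of grid-based Bayesian estimation: posterior
    # over a transcript level t given one noisy measurement, with no
    # conjugate form assumed for the noise or prior models.
    import numpy as np
    from scipy import stats

    measurement = 120.0                     # observed array intensity
    grid = np.linspace(1.0, 1000.0, 5000)   # candidate transcript levels t

    # Heavy-tailed additive noise: y ~ t + Student-t(df=3) * scale.
    likelihood = stats.t.pdf(measurement - grid, df=3, scale=20.0)
    # Broad log-normal prior standing in for one fit to past chip data.
    prior = stats.lognorm.pdf(grid, s=1.0, scale=100.0)

    posterior = likelihood * prior
    dx = grid[1] - grid[0]
    posterior /= posterior.sum() * dx       # normalize numerically
    mean = (grid * posterior).sum() * dx
    print("posterior mean transcript level: %.1f" % mean)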

Collaboration


Ryan Rifkin's top co-authors:

Tomaso Poggio, Massachusetts Institute of Technology
Pablo Tamayo, University of California
Ross A. Lippert, Massachusetts Institute of Technology
Michael Angelo, Massachusetts Institute of Technology