Publication


Featured research published by Cynthia Rudin.


Meeting of the Association for Computational Linguistics | 2008

Arabic Morphological Tagging, Diacritization, and Lemmatization Using Lexeme Models and Feature Ranking

Ryan M. Roth; Owen Rambow; Nizar Habash; Mona T. Diab; Cynthia Rudin

We investigate the tasks of general morphological tagging, diacritization, and lemmatization for Arabic. We show that for all tasks we consider, both modeling the lexeme explicitly, and retuning the weights of individual classifiers for the specific task, improve the performance.


The Annals of Applied Statistics | 2015

Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model

Benjamin Letham; Cynthia Rudin; Tyler H. McCormick; David Madigan

We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if...then... statements (e.g., if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called Bayesian Rule Lists that yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS₂ score, actively used in clinical practice for estimating the risk of stroke in patients that have atrial fibrillation. Our model is as interpretable as CHADS₂, but more accurate.


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2012

Machine Learning for the New York City Power Grid

Cynthia Rudin; David L. Waltz; Roger N. Anderson; Albert Boulanger; Ansaf Salleb-Aouissi; Maggie Chow; Haimonti Dutta; Philip Gross; Bert Huang; Steve Ierome; Delfina Isaac; Arthur Kressner; Rebecca J. Passonneau; Axinia Radeva; Leon Wu

Power companies can benefit from the use of knowledge discovery methods and statistical machine learning for preventive maintenance. We introduce a general process for transforming historical electrical grid data into models that aim to predict the risk of failures for components and systems. These models can be used directly by power companies to assist with prioritization of maintenance and repair work. Specialized versions of this process are used to produce (1) feeder failure rankings, (2) cable, joint, terminator, and transformer rankings, (3) feeder Mean Time Between Failure (MTBF) estimates, and (4) manhole events vulnerability rankings. The process in its most general form can handle diverse, noisy sources that are historical (static), semi-real-time, or real-time, incorporates state-of-the-art machine learning algorithms for prioritization (supervised ranking or MTBF), and includes an evaluation of results via cross-validation and blind test. Above and beyond the ranked lists and MTBF estimates are business management interfaces that allow the prediction capability to be integrated directly into corporate planning and decision support; such interfaces rely on several important properties of our general modeling approach: that machine learning features are meaningful to domain experts, that the processing of data is transparent, and that prediction results are accurate enough to support sound decision making. We discuss the challenges in working with historical electrical grid data that were not designed for predictive purposes. The “rawness” of these data contrasts with the accuracy of the statistical models that can be obtained from the process; these models are sufficiently accurate to assist in maintaining New York City's electrical grid.


Conference on Learning Theory | 2005

Margin-based ranking meets boosting in the middle

Cynthia Rudin; Corinna Cortes; Mehryar Mohri; Robert E. Schapire

We present several results related to ranking. We give a general margin-based bound for ranking based on the L∞ covering number of the hypothesis space. Our bound suggests that algorithms that maximize the ranking margin generalize well. We then describe a new algorithm, Smooth Margin Ranking, that precisely converges to a maximum ranking-margin solution. The algorithm is a modification of RankBoost, analogous to Approximate Coordinate Ascent Boosting. We also prove a remarkable property of AdaBoost: under very natural conditions, AdaBoost maximizes the exponentiated loss associated with the AUC and achieves the same AUC as RankBoost. This explains the empirical observations made by Cortes and Mohri, and Caruana and Niculescu-Mizil, about the excellent performance of AdaBoost as a ranking algorithm, as measured by the AUC.


Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing | 2006

Re-Ranking Algorithms for Name Tagging

Heng Ji; Cynthia Rudin; Ralph Grishman

Integrating information from different stages of an NLP processing pipeline can yield significant error reduction. We demonstrate how re-ranking can improve name tagging in a Chinese information extraction system by incorporating information from relation extraction, event extraction, and coreference. We evaluate three state-of-the-art re-ranking algorithms (MaxEnt-Rank, SVMRank, and p-Norm Push Ranking), and show the benefit of multi-stage re-ranking for cross-sentence and cross-document inference.


Conference on Learning Theory | 2006

Ranking with a p-norm push

Cynthia Rudin

We are interested in supervised ranking with the following twist: our goal is to design algorithms that perform especially well near the top of the ranked list, and are only required to perform sufficiently well on the rest of the list. Towards this goal, we provide a general form of convex objective that gives high-scoring examples more importance. This push near the top of the list can be chosen to be arbitrarily large or small. We choose p-norms to provide a specific type of push; as p becomes large, the algorithm concentrates harder near the top of the list. We derive a generalization bound based on the p-norm objective. We then derive a corresponding boosting-style algorithm, and illustrate the usefulness of the algorithm through experiments on UCI data. We prove that the minimizer of the objective is unique in a specific sense.


Machine Learning | 2010

A process for predicting manhole events in Manhattan

Cynthia Rudin; Rebecca J. Passonneau; Axinia Radeva; Haimonti Dutta; Steve Ierome; Delfina Isaac


Machine Learning | 2014

Machine learning for science and society

Cynthia Rudin; Kiri L. Wagstaff


International Conference on Computer Vision | 2009

Online coordinate boosting

Raphael Pelossof; Michael J. Jones; Ilia Vovsha; Cynthia Rudin


JAMA Psychiatry | 2017

The World Health Organization Adult Attention-Deficit/Hyperactivity Disorder Self-Report Screening Scale for DSM-5.

Berk Ustun; Lenard A. Adler; Cynthia Rudin; Stephen V. Faraone; Thomas J. Spencer; Patricia Berglund; Michael J. Gruber; Ronald C. Kessler

Collaboration


Top Co-Authors


Benjamin Letham

Massachusetts Institute of Technology


Berk Ustun

Massachusetts Institute of Technology


Theja Tulabandhula

Massachusetts Institute of Technology