Katherine A. Heller | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Katherine A. Heller is active.

Explore More

Publication

Featured researches published by Katherine A. Heller.

international conference on machine learning | 2005

Bayesian hierarchical clustering

Katherine A. Heller; Zoubin Ghahramani

We present a novel algorithm for agglomerative hierarchical clustering based on evaluating marginal likelihoods of a probabilistic model. This algorithm has several advantages over traditional distance-based agglomerative clustering algorithms. (1) It defines a probabilistic model of the data which can be used to compute the predictive distribution of a test point and the probability of it belonging to any of the existing clusters in the tree. (2) It uses a model-based criterion to decide on merging clusters rather than an ad-hoc distance metric. (3) Bayesian hypothesis testing is used to decide which merges are advantageous and to output the recommended depth of the tree. (4) The algorithm can be interpreted as a novel fast bottom-up approximate inference method for a Dirichlet process (i.e. countably infinite) mixture model (DPM). It provides a new lower bound on the marginal likelihood of a DPM by summing over exponentially many clusterings of the data in polynomial time. We describe procedures for learning the model hyperpa-rameters, computing the predictive distribution, and extensions to the algorithm. Experimental results on synthetic and real-world data sets demonstrate useful properties of the algorithm.

Workshop on Data Mining for Computer Security (DMSEC), Melbourne, FL, November 19, 2003 | 2003

One Class Support Vector Machines for Detecting Anomalous Windows Registry Accesses

Katherine A. Heller; Krysta M. Svore; Angelos D. Keromytis; Salvatore J. Stolfo

We present a new Host-based Intrusion Detection System (IDS) that monitors accesses to the Microsoft Windows Registry using Registry Anomaly Detection (RAD). Our system uses a one class Support Vector Machine (OCSVM) to detect anomalous registry behavior by training on a dataset of normal registry accesses. It then uses this model to detect outliers in new (unclassified) data generated from the same system. Given the success of OCSVMs in other applications, we apply them to the Windows Registry anomaly detection problem. We compare our system to the RAD system using the Probabilistic Anomaly Detection (PAD) algorithm on the same dataset. Surprisingly, we find that PAD outperforms our OCSVM system due to properties of the hierarchical prior incorporated in the PAD algorithm. In the future, these properties may be used to develop an improved kernel and increase the performance of the OCSVM system.

Journal of Computer Security | 2005

A comparative evaluation of two algorithms for Windows Registry Anomaly Detection

Salvatore J. Stolfo; Frank Apap; Eleazar Eskin; Katherine A. Heller; Shlomo Hershkop; Andrew Honig; Krysta M. Svore

We present a component anomaly detector for a host-based intrusion detection system (IDS) for Microsoft Windows. The core of the detector is a learning-based anomaly detection algorithm that detects attacks on a host machine by looking for anomalous accesses to the Windows Registry. We present and compare two anomaly detection algorithms for use in our IDS system and evaluate their performance. One algorithm called PAD, for Probabilistic Anomaly Detection, is based upon a probability density estimation while the second uses the Support Vector Machine framework. The key idea behind the detector is to first train a model of normal Registry behavior on a Windows host, even when noise may be present in the training data, and use this model to detect abnormal Registry accesses. At run-time the model is used to check each access to the Registry in real-time to determine whether or not the behavior is abnormal and possibly corresponds to an attack. The system is effective in detecting the actions of malicious software while maintaining a low rate of false alarms. We show that the probabilistic anomaly detection algorithm exhibits better performance in accuracy and in computational complexity over the support vector machine implementation under three different kernel functions.

international conference on machine learning | 2008

Statistical models for partial membership

Katherine A. Heller; Sinead A. Williamson; Zoubin Ghahramani

We present a principled Bayesian framework for modeling partial memberships of data points to clusters. Unlike a standard mixture model which assumes that each data point belongs to one and only one mixture component, or cluster, a partial membership model allows data points to have fractional membership in multiple clusters. Algorithms which assign data points partial memberships to clusters can be useful for tasks such as clustering genes based on microarray data (Gasch & Eisen, 2002). Our Bayesian Partial Membership Model (BPM) uses exponential family distributions to model each cluster, and a product of these distibtutions, with weighted parameters, to model each datapoint. Here the weights correspond to the degree to which the datapoint belongs to each cluster. All parameters in the BPM are continuous, so we can use Hybrid Monte Carlo to perform inference and learning. We discuss relationships between the BPM and Latent Dirichlet Allocation, Mixed Membership models, Exponential Family PCA, and fuzzy clustering. Lastly, we show some experimental results and discuss nonparametric extensions to our model.

BMC Bioinformatics | 2009

R/BHC: fast Bayesian hierarchical clustering for microarray data

Richard S. Savage; Katherine A. Heller; Yang Xu; Zoubin Ghahramani; William Truman; Murray Grant; Katherine J. Denby; David L. Wild

BackgroundAlthough the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained.ResultsWe present an R/Bioconductor port of a fast novel algorithm for Bayesian agglomerative hierarchical clustering and demonstrate its use in clustering gene expression microarray data. The method performs bottom-up hierarchical clustering, using a Dirichlet Process (infinite mixture) to model uncertainty in the data and Bayesian model selection to decide at each step which clusters to merge.ConclusionBiologically plausible results are presented from a well studied data set: expression profiles of A. thaliana subjected to a variety of biotic and abiotic stresses. Our method avoids several limitations of traditional methods, for example how many clusters there should be and how to choose a principled distance metric.

Nature Communications | 2015

An insula-frontostriatal network mediates flexible cognitive control by adaptively predicting changing control demands

Jiefeng Jiang; Jeffrey M. Beck; Katherine A. Heller; Tobias Egner

The anterior cingulate and lateral prefrontal cortices have been implicated in implementing context-appropriate attentional control, but the learning mechanisms underlying our ability to flexibly adapt the control settings to changing environments remain poorly understood. Here we show that human adjustments to varying control demands are captured by a reinforcement learner with a flexible, volatility-driven learning rate. Using model-based functional magnetic resonance imaging, we demonstrate that volatility of control demand is estimated by the anterior insula, which in turn optimizes the prediction of forthcoming demand in the caudate nucleus. The caudates prediction of control demand subsequently guides the implementation of proactive and reactive attentional control in dorsal anterior cingulate and dorsolateral prefrontal cortices. These data enhance our understanding of the neuro-computational mechanisms of adaptive behaviour by connecting the classic cingulate-prefrontal cognitive control network to a subcortical control-learning mechanism that infers future demands by flexibly integrating remote and recent past experiences.

Data Mining and Knowledge Discovery | 2013

Growing a list

Benjamin Letham; Cynthia Rudin; Katherine A. Heller

It is easy to find expert knowledge on the Internet on almost any topic, but obtaining a complete overview of a given topic is not always easy: information can be scattered across many sources and must be aggregated to be useful. We introduce a method for intelligently growing a list of relevant items, starting from a small seed of examples. Our algorithm takes advantage of the wisdom of the crowd, in the sense that there are many experts who post lists of things on the Internet. We use a collection of simple machine learning components to find these experts and aggregate their lists to produce a single complete and meaningful list. We use experiments with gold standards and open-ended experiments without gold standards to show that our method significantly outperforms the state of the art. Our method uses the ranking algorithm Bayesian Sets even when its underlying independence assumption is violated, and we provide a theoretical generalization bound to motivate its use.

knowledge discovery and data mining | 2015

Hierarchical Graph-Coupled HMMs for Heterogeneous Personalized Health Data

Kai Fan; Marisa C. Eisenberg; Alison Walsh; Allison E. Aiello; Katherine A. Heller

The purpose of this study is to leverage modern technology (mobile or web apps) to enrich epidemiology data and infer the transmission of disease. We develop hierarchical Graph-Coupled Hidden Markov Models (hGCHMMs) to simultaneously track the spread of infection in a small cell phone community and capture person-specific infection parameters by leveraging a link prior that incorporates additional covariates. In this paper we investigate two link functions, the beta-exponential link and sigmoid link, both of which allow the development of a principled Bayesian hierarchical framework for disease transmission. The results of our model allow us to predict the probability of infection for each persons on each day, and also to infer personal physical vulnerability and the relevant association with covariates. We demonstrate our approach theoretically and experimentally on both simulation data and real epidemiological records.

The Annals of Applied Statistics | 2010

RANKING RELATIONS USING ANALOGIES IN BIOLOGICAL AND INFORMATION NETWORKS

Ricardo Silva; Katherine A. Heller; Zoubin Ghahramani; Edoardo M. Airoldi

Analogical reasoning depends fundamentally on the ability to learn and generalize about relations between objects. We develop an approach to relational learning which, given a set of pairs of objects S = {A(1) : B(1), A(2) : B(2), …, A(N) : B(N)}, measures how well other pairs A : B fit in with the set S. Our work addresses the following question: is the relation between objects A and B analogous to those relations found in S? Such questions are particularly relevant in information retrieval, where an investigator might want to search for analogous pairs of objects that match the query set of interest. There are many ways in which objects can be related, making the task of measuring analogies very challenging. Our approach combines a similarity measure on function spaces with Bayesian analysis to produce a ranking. It requires data containing features of the objects of interest and a link matrix specifying which relationships exist; no further attributes of such relationships are necessary. We illustrate the potential of our method on text analysis and information networks. An application on discovering functional interactions between pairs of proteins is discussed in detail, where we show that our approach can work in practice even if a small set of protein pairs is provided.

The Journal of Neuroscience | 2018

A Shared Vision for Machine Learning in Neuroscience

Mai-Anh T. Vu; Tülay Adali; Demba Ba; György Buzsáki; David E. Carlson; Katherine A. Heller; Conor Liston; Cynthia Rudin; Vikaas S. Sohal; Alik S. Widge; Helen S. Mayberg; Guillermo Sapiro; Kafui Dzirasa

With ever-increasing advancements in technology, neuroscientists are able to collect data in greater volumes and with finer resolution. The bottleneck in understanding how the brain works is consequently shifting away from the amount and type of data we can collect and toward what we actually do with the data. There has been a growing interest in leveraging this vast volume of data across levels of analysis, measurement techniques, and experimental paradigms to gain more insight into brain function. Such efforts are visible at an international scale, with the emergence of big data neuroscience initiatives, such as the BRAIN initiative (Bargmann et al., 2014), the Human Brain Project, the Human Connectome Project, and the National Institute of Mental Healths Research Domain Criteria initiative. With these large-scale projects, much thought has been given to data-sharing across groups (Poldrack and Gorgolewski, 2014; Sejnowski et al., 2014); however, even with such data-sharing initiatives, funding mechanisms, and infrastructure, there still exists the challenge of how to cohesively integrate all the data. At multiple stages and levels of neuroscience investigation, machine learning holds great promise as an addition to the arsenal of analysis tools for discovering how the brain works.

Explore More