Ayshwarya Subramanian
Carnegie Mellon University
Publication
Featured research published by Ayshwarya Subramanian.
Bioinformatics | 2010
David Tolliver; Charalampos E. Tsourakakis; Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Motivation: Tumorigenesis is an evolutionary process by which tumor cells acquire sequences of mutations leading to increased growth, invasiveness and eventually metastasis. It is hoped that by identifying the common patterns of mutations underlying major cancer sub-types, we can better understand the molecular basis of tumor development and identify new diagnostics and therapeutic targets. This goal has motivated several attempts to apply evolutionary tree reconstruction methods to assays of tumor state. Inference of tumor evolution is in principle aided by the fact that tumors are heterogeneous, retaining remnant populations of different stages along their development along with contaminating healthy cell populations. In practice, though, this heterogeneity complicates interpretation of tumor data because distinct cell types are conflated by common methods for assaying the tumor state. We previously proposed a method to computationally infer cell populations from measures of tumor-wide gene expression through a geometric interpretation of mixture type separation, but this approach deals poorly with noisy and outlier data. Results: In the present work, we propose a new method to perform tumor mixture separation efficiently and with robustness to experimental error. The method builds on the prior geometric approach but uses a novel objective function allowing for robust fits that greatly reduce the sensitivity to noise and outliers. We further develop an efficient gradient optimization method to optimize this ‘soft geometric unmixing’ objective for measurements of tumor DNA copy numbers assessed by array comparative genomic hybridization (aCGH) data. We show, on a combination of semi-synthetic and real data, that the method yields fast and accurate separation of tumor states.
Conclusions: We have shown a novel objective function and optimization method for the robust separation of tumor sub-types from aCGH data and have shown that the method provides fast, accurate reconstruction of tumor states from mixed samples. Better solutions to this problem can be expected to improve our ability to accurately identify genetic abnormalities in primary tumor samples and to infer patterns of tumor evolution. Supplementary information: Supplementary data are available at Bioinformatics online.
BioMed Research International | 2012
Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Tumorigenesis can in principle result from many combinations of mutations, but only a few roughly equivalent sequences of mutations, or “progression pathways,” seem to account for most human tumors. Phylogenetics provides a promising way to identify common progression pathways and markers of those pathways. This approach, however, can be confounded by the high heterogeneity within and between tumors, which makes it difficult to identify conserved progression stages or organize them into robust progression pathways. To tackle this problem, we previously developed methods for inferring progression stages from heterogeneous tumor profiles through computational unmixing. In this paper, we develop a novel pipeline for building trees of tumor evolution from the unmixed tumor data. The pipeline implements a statistical approach for identifying robust progression markers from unmixed tumor data and calling those markers in inferred cell states. The result is a set of phylogenetic characters and their assignments in progression states to which we apply maximum parsimony phylogenetic inference to infer tumor progression pathways. We demonstrate the full pipeline on simulated and real comparative genomic hybridization (CGH) data, validating its effectiveness and making novel predictions of major progression pathways and ancestral cell states in breast cancers.
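The maximum parsimony step mentioned in this abstract can be illustrated with a minimal sketch, assuming binary markers (1 = amplified, 0 = normal) and a fixed rooted binary tree. It uses Fitch's small-parsimony algorithm to count the minimum number of state changes a given tree requires. The tree shape, state names, and marker calls are hypothetical, and the sketch scores one tree rather than searching over trees as a full phylogenetic inference would.

```python
from typing import Dict, Tuple, Union

# A tree is either a leaf name (str) or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, leaf_states: Dict[str, int]) -> Tuple[set, int]:
    """Return (candidate states at this node, minimum changes in its subtree)."""
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], leaf_states)
    right_set, right_cost = fitch_score(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # The children can share a state, so no extra change is needed here.
        return common, left_cost + right_cost
    # The children disagree, so one change is required below this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, characters: Dict[str, Dict[str, int]]) -> int:
    """Sum the Fitch scores over all characters (markers)."""
    return sum(fitch_score(tree, states)[1] for states in characters.values())


# Hypothetical example: four inferred cell states and two binary markers.
tree = (("normal", "stage_a"), ("stage_b", "stage_c"))
characters = {
    "marker_1": {"normal": 0, "stage_a": 0, "stage_b": 1, "stage_c": 1},
    "marker_2": {"normal": 0, "stage_a": 1, "stage_b": 0, "stage_c": 1},
}
total = parsimony_score(tree, characters)
print(total)
```

Here marker_1 fits the tree with a single change, while marker_2 needs two, so a tree search would prefer topologies that group states sharing marker calls.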
Nature microbiology | 2018
Raaj S. Mehta; David A. Drew; Jason Lloyd-Price; Ayshwarya Subramanian; Paul Lochhead; Amit Joshi; Kerry L. Ivey; Hamed Khalili; Gordon T. Brown; Casey DuLong; Mingyang Song; Long H. Nguyen; Himel Mallick; Eric B. Rimm; Jacques Izard; Curtis Huttenhower; Andrew T. Chan
Characterizing the stability of the gut microbiome is important to exploit it as a therapeutic target and diagnostic biomarker. We metagenomically and metatranscriptomically sequenced the faecal microbiomes of 308 participants in the Health Professionals Follow-Up Study. Participants provided four stool samples: one pair collected 24–72 h apart and a second pair ~6 months later. Within-person taxonomic and functional variation was consistently lower than between-person variation over time. In contrast, metatranscriptomic profiles were comparably variable within and between subjects due to higher within-subject longitudinal variation. Metagenomic instability accounted for ~74% of corresponding metatranscriptomic instability. The rest was probably attributable to sources such as regulation. Among the pathways that were differentially regulated, most were consistently over- or under-transcribed at each time point. Together, these results suggest that a single measurement of the faecal microbiome can provide long-term information regarding organismal composition and functional potential, but repeated or short-term measures may be necessary for dynamic features identified by metatranscriptomics.
Metagenomic and metatranscriptomic analyses of stool samples from 308 individuals over time indicate that longitudinal sampling is important for detecting dynamic functional features of the gut microbiome.
IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2013
Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Computational cancer phylogenetics seeks to enumerate the temporal sequences of aberrations in tumor evolution, thereby delineating the evolution of possible tumor progression pathways, molecular subtypes, and mechanisms of action. We previously developed a pipeline for constructing phylogenies describing evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of the phylogenies, however, depend on the identification of accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate different subtypes and stages of progression. Here, we present a novel hidden Markov model (HMM) scheme for the problem of inferring such phylogenetically significant markers through joint segmentation and calling of multisample tumor data. Our method classifies sets of genome-wide DNA copy number measurements into a partitioning of samples into normal (diploid) or amplified at each probe. It differs from other similar HMM methods in its design specifically for the needs of tumor phylogenetics, by seeking to identify robust markers of progression conserved across a set of copy number profiles. We show an analysis of our method in comparison to other methods on both synthetic and real tumor data, which confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances.
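As a rough illustration of HMM-based segmentation and calling, the following sketch runs the Viterbi algorithm on a two-state HMM (normal or amplified) with Gaussian emissions over one copy number profile. It is a single-sample simplification, not the multi-sample joint model the abstract describes. The emission means, standard deviation, and transition probability are assumed values chosen for the example.

```python
import math
from typing import List, Sequence

STATES = (0, 1)  # 0 = normal (diploid), 1 = amplified


def log_gauss(x: float, mean: float, sd: float) -> float:
    """Log density of a Gaussian with the given mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_calls(
    log_ratios: Sequence[float],
    means: Sequence[float] = (0.0, 1.0),  # assumed emission means per state
    sd: float = 0.3,                      # assumed shared emission spread
    stay_prob: float = 0.95,              # assumed probability of keeping the state
) -> List[int]:
    """Return the most likely normal/amplified state at each probe."""
    if not log_ratios:
        return []
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)
    # score[s] is the best log-probability of any path ending in state s.
    score = [math.log(0.5) + log_gauss(log_ratios[0], means[s], sd) for s in STATES]
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for s in STATES:
            candidates = [
                score[p] + (log_stay if p == s else log_switch) for p in STATES
            ]
            best_prev = max(STATES, key=lambda p: candidates[p])
            pointers.append(best_prev)
            new_score.append(candidates[best_prev] + log_gauss(x, means[s], sd))
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


profile = [0.0, 0.1, -0.1, 1.0, 0.9, 1.1, 0.0, 0.05]
calls = viterbi_calls(profile)
print(calls)
```

The transition penalty is what makes this segmentation rather than per-probe thresholding: an isolated moderately elevated probe is absorbed into the surrounding normal segment, while a run of elevated probes is called amplified.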
international symposium on bioinformatics research and applications | 2012
Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Computational cancer phylogenetics seeks to enumerate the temporal sequence of aberrations in tumor evolution, thereby delineating the evolution of possible tumor progression pathways, molecular subtypes and mechanisms of action. We previously developed a pipeline for constructing phylogenies describing evolution between major recurring cell types computationally inferred from whole-genome tumor profiles. The accuracy and detail of the phylogenies, however, depend on the identification of accurate, high-resolution molecular markers of progression, i.e., reproducible regions of aberration that robustly differentiate different subtypes and stages of progression. Here we present a novel hidden Markov model (HMM) scheme for the problem of inferring such phylogenetically significant markers through joint segmentation and calling of multi-sample tumor data. Our method classifies sets of genome-wide DNA copy number measurements into a partitioning of samples into normal (diploid) or amplified at each probe. It differs from other similar HMM methods in its design specifically for the needs of tumor phylogenetics, by seeking to identify robust markers of progression conserved across a set of copy number profiles. We show an analysis of our method in comparison to other methods on both synthetic and real tumor data, which confirms its effectiveness for tumor phylogeny inference and suggests avenues for future advances.
international conference on computational advances in bio and medical sciences | 2014
Ayshwarya Subramanian; Russell Schwartz
Effective management and treatment of cancer is greatly complicated by the rapid evolution and resulting heterogeneity of tumors. In prior work, we showed that phylogenetic study of cell populations in single tumors provides a way to make sense of this heterogeneity and identify robust features of evolutionary processes of single tumors. The introduction of single-cell sequencing has shown great promise for advancing single-tumor phylogenetics, but the volume and high noise of these data present many challenges for studying tumor evolution, especially with regard to the chromosome abnormalities that typically dominate tumor evolution. We propose a reference-free approach to mining genome sequence reads to allow predictive classification of tumors into heterogeneous types and reconstruct models of their evolution. The approach extracts k-mer counts from single-cell tumor sequences, using differences in normalized k-mer frequencies as a proxy for overall evolutionary distance between distinct cells. The approach is computationally more efficient in time and space than standard protocols for deriving phylogenetic markers, which rely on first aligning sequence reads to a reference genome and then processing the data downstream to extract meaningful progression markers and use them to construct phylogenetic trees. The approach also provides a way to bypass some of the challenges that massive genome rearrangement typical of tumor genomes present for reference-based methods. To handle the unique challenges of single-cell sequencing data, we have applied a series of noise correction measures intended to account for biases due to the sequencing technology. We illustrate the method using publicly available tumor single cell sequencing data. Phylogenies built from these k-mer spectrum distance matrices yield splits that are statistically significant when tested for their ability to partition cells at different stages of cancer.
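The core idea of a reference-free k-mer comparison can be sketched as follows. The abstract does not specify the distance measure, so the L1 distance between normalized k-mer spectra used here is an assumption, as are the function names and the toy reads. The noise correction steps mentioned in the abstract are not modeled.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, List, Tuple


def kmer_frequencies(reads: Iterable[str], k: int) -> Dict[str, float]:
    """Count k-mers across all reads of one cell and normalize to frequencies."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):  # skip k-mers with ambiguous bases
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def spectrum_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """L1 distance between two normalized k-mer spectra (an assumed metric)."""
    return sum(abs(p.get(m, 0.0) - q.get(m, 0.0)) for m in set(p) | set(q))


def distance_matrix(
    cells: Dict[str, List[str]], k: int = 3
) -> Dict[Tuple[str, str], float]:
    """Pairwise spectrum distances between cells, with no reference genome."""
    spectra = {name: kmer_frequencies(reads, k) for name, reads in cells.items()}
    names = sorted(spectra)
    dist = {(a, a): 0.0 for a in names}
    for a, b in combinations(names, 2):
        d = spectrum_distance(spectra[a], spectra[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


# Hypothetical toy reads for three cells.
cells = {
    "cell_1": ["ACGTACGT"],
    "cell_2": ["ACGTACGT"],
    "cell_3": ["AAAAAAAA"],
}
dist = distance_matrix(cells, k=3)
print(dist[("cell_1", "cell_2")], dist[("cell_1", "cell_3")])
```

A matrix like this could then be passed to any distance-based tree builder, such as neighbor joining. Because no alignment is performed, the cost is dominated by a single pass over the reads.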
Cancer Research | 2014
Salim A. Chowdhury; Ayshwarya Subramanian; Alejandro A. Schäffer; Stanley E. Shackney; Darawalee Wangsa; Kerstin Heselmeyer-Haddad; Thomas Ried; Russell Schwartz
Proceedings: AACR Annual Meeting 2014; April 5-9, 2014; San Diego, CA
We describe computational methods to compute likely evolutionary histories from tumor single-cell copy number data and next generation sequencing data and apply the methods to data collected from diverse types of tumors. Experimental techniques for assessing heterogeneity in tumor cell populations have undergone great advances, but these improvements have created a great need for more sophisticated computer algorithms capable of making sense of these data sources in terms of coherent models of tumor evolution. We have addressed this problem by developing computer algorithms for building phylogenetic trees describing evolution of individual tumors based on copy numbers of fluorescence in situ hybridization (FISH) probes from single cells in these tumors. These algorithms reconstruct evolutionary trees for observed cell populations so as to heuristically minimize the number of mutational events needed to explain the observed combinations of probe counts by evolution from a common diploid ancestral cell. We have extended this work from initial simple models of evolution by single copy number changes to account for distinct mechanisms of evolution at the gene, chromosome, or whole-genome scale, with potentially different rates of evolution by mutation type.
We have applied these algorithms to several FISH data sets, including cervical cancers probed for four genes (LAMP3, PROX1, PRKAA1 and CCND1) measured for up to 250 cells of paired primary and metastatic samples from 16 patients, head-and-neck cancers probed for four genes (TERC, CCND1, EGFR and TP53) measured on up to 250 cells per patient for 65 patients at four tumor stages, prostate cancers probed for six genes (TBL1XR1, CTTNBP2, MYC, PTEN, MEN1 and PDGFB) measured for up to 407 cells in 6 non-progressive and 7 progressive carcinomas, and breast cancers probed for eight genes (COX-2, MYC, CCND1, HER-2, ZNF217, DBC2, CDH1 and TP53) measured on up to 220 cells of paired ductal carcinoma in situ and invasive ductal carcinoma samples from 13 patients. We have then applied statistical and machine learning analysis to examine the ability of these trees to classify tumors by stage or potential for progression. The evolutionary tree models reveal robust features of evolutionary processes distinguishing progression stages and predicting future progression that lead to improved classification accuracy relative to predictions from cellular heterogeneity data alone. Our software is freely available at ftp://ftp.ncbi.nlm.nih.gov/pub/FISHtrees. In continuing work, we are exploring extension of these approaches to better modeling and analysis of tumor evolution using single-cell sequencing data and to more detailed models of tumor evolution. Citation Format: Salim A. Chowdhury, Ayshwarya Subramanian, Alejandro A. Schaffer, Stanley E. Shackney, Darawalee Wangsa, Kerstin Heselmeyer-Haddad, Thomas Ried, Russell Schwartz. Inferring evolutionary models of tumor progression from single-cell heterogeneity data. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 5338. doi:10.1158/1538-7445.AM2014-5338
Archive | 2013
Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Tumor phylogenetics is a strategy for interpreting the evolution of tumors using computer algorithms for phylogenetics, i.e., the inference of evolutionary trees. The approach takes advantage of a large body of phylogenetic theory and algorithms, developed primarily for inferring evolution among species, to interpret complex tumor data sets as evidence for evolutionary processes. The result is a tumor phylogeny, or phylogenetic tree, a reconstruction of the sequences of mutations that cells within a tumor or class of tumors accumulate over the course of their progression. The goals of finding such trees are to better interpret heterogeneity within and among tumors, identify and classify tumor subtypes, learn markers of progression for key steps in tumor evolution, and enable predictive modeling of likely tumor progression steps that may ultimately assist in diagnosis and treatment. With the rise of whole-genome sequencing data, the need for sophisticated models and algorithms that can make sense of such data has never been more acute. In this chapter, we cover the fundamentals of reconstructing tumor phylogenies with a special focus on next-generation sequencing data and discuss recent research, current trends, and challenges and opportunities currently facing the field.
Cancer Research | 2013
Ayshwarya Subramanian; Stanley E. Shackney; Russell Schwartz
Proceedings: AACR 104th Annual Meeting 2013; Apr 6-10, 2013; Washington, DC
Understanding tumors as evolutionary systems is an important area of study with far-reaching implications in diagnostic and treatment paradigms. Computational phylogenetics is a valuable method for inferring tumor evolution in terms of evolutionary trees, or phylogenies, where paths in a tree correspond to possible tumor progression pathways. The locations of specific cell-types and patient samples in the tree provide information on tumor sub-types and development of heterogeneity. We previously developed a tumor phylogeny inference pipeline for array comparative genome hybridization (aCGH)-based tumor copy number profiles. Steps in the pipeline included extraction of robust progression markers from the data, which could differentiate stages of tumor evolution or the different paths in the tree, and assigning amplification states to the inferred markers in those stages. We introduced a novel multi-sample model for amplicon identification and calling, HMMCNA, which jointly extracted markers from and assigned amplification states to small sets of tumor aCGH profiles. HMMCNA employs a Hidden Markov Model (HMM), a probabilistic model, to classify data into normal and amplified states based on an underlying distribution for the two copy number states and a hidden state space of possible amplification states. We assumed two possible amplification states per sample: normal (0) or amplified (1). Joint segmentation and calling is performed by identifying a most likely sequence of amplification states across all genomic sites (probes) and samples. This approach limits the number of samples the HMM can handle since the number of possible hidden amplification states increases exponentially with the number of samples. Here, we present an extension of the approach to handle large datasets.
We incorporate a heuristic prior to the HMM classification to reduce the hidden state space by first screening out amplification states not strongly supported at any individual genome coordinates. The introduction of this heuristic reduces the state space on average by 99%. We further reduce the set of possible amplification states based on the frequency of occurrence of the states by only allowing those states occurring at multiple aCGH probes or array genome coordinates. This step accounts for the presence of random noise in the data and gives a further reduction of 80%. We demonstrate the method on a breast tumor aCGH dataset comprising copy number profiles derived from sectioned biopsy samples (NCBI GEO GSE16672, Navin et al., 2010). Our method was able to quickly segment the data into sets of robust normal and amplified segments suitable for downstream phylogeny building. The amplicons inferred carried several known markers of tumor progression. Further steps include tuning the parameters of the HMM to handle noise-levels across different datasets. Citation Format: Ayshwarya Subramanian, Stanley Shackney, Russell Schwartz. Inference of tumor phylogenetic markers from large copy number datasets. [abstract]. In: Proceedings of the 104th Annual Meeting of the American Association for Cancer Research; 2013 Apr 6-10; Washington, DC. Philadelphia (PA): AACR; Cancer Res 2013;73(8 Suppl):Abstract nr 5133. doi:10.1158/1538-7445.AM2013-5133
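The state-space reduction described in this abstract can be sketched loosely as follows. With n samples and two states each, the joint hidden state space has 2^n members. The sketch keeps only joint states that are actually observed at some probe and that recur at a minimum number of probes. Hard thresholding of log-ratios, the threshold value, and the function name are assumptions made for illustration. The abstract does not say how support at a coordinate is measured.

```python
from collections import Counter
from typing import Sequence, Set, Tuple

State = Tuple[int, ...]  # one 0/1 amplification call per sample


def candidate_states(
    profiles: Sequence[Sequence[float]],
    threshold: float = 0.5,  # assumed cutoff for calling a probe amplified
    min_probes: int = 2,     # keep only joint states seen at several probes
) -> Set[State]:
    """profiles[s][p] is the log-ratio of sample s at probe p."""
    n_probes = len(profiles[0])
    # Joint state observed at each probe: one thresholded call per sample.
    observed = Counter(
        tuple(int(sample[p] > threshold) for sample in profiles)
        for p in range(n_probes)
    )
    # Drop joint states that occur at too few probes (likely noise).
    return {state for state, n in observed.items() if n >= min_probes}


# Hypothetical data: 3 samples by 6 probes, so 2**3 = 8 possible joint states.
profiles = [
    [0.0, 0.1, 0.9, 1.0, 0.0, 0.8],
    [0.1, 0.0, 1.1, 0.9, 0.0, 0.1],
    [0.0, 0.0, 0.0, 0.1, 0.1, 0.0],
]
kept = candidate_states(profiles)
print(sorted(kept))
```

In this toy case only 2 of the 8 possible joint states survive, and an HMM restricted to the surviving states has a correspondingly smaller transition matrix.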
Cancer Research | 2012
Ayshwarya Subramanian; Russell Schwartz; Stanley E. Shackney
Computational cancer phylogenetics can play an important role in delineating possible tumor progression pathways and identifying molecular subtypes and mechanisms of action. We previously developed a pipeline for constructing tumor phylogenies from recurring cell types computationally inferred from whole genome copy number data. The accuracy and detail of these tumor phylogenies, however, depend on the identification of accurate and high-resolution molecular markers of progression, i.e., reproducible regions of copy number variation that can be used to robustly differentiate different subtypes and stages of progression. Here we present a new method for the problem using hidden Markov models (HMMs) to derive robust, high resolution progression markers from sets of tumor samples. We demonstrate our method on a publicly available array comparative genome hybridization (aCGH) dataset (NCBI GEO GSE16672, Navin et al., 2010) from sectioned primary ductal breast tumors. Our method uses an HMM, a class of probabilistic models, to classify sets of aCGH data into a partitioning of samples into normal (diploid) or amplified at each copy number probe. It differs from other similar HMM methods primarily in seeking a parsimonious set of combinations of amplification states able to explain all aCGH profiles simultaneously in order to identify robust markers of progression across samples. The model learns frequencies with which different combinations of amplifications are observed across the samples by modeling individual probes as Gaussian random variables with either normal or tetraploid means, with data more consistent with tetraploid being classified as amplified and those more consistent with diploid classified as normal.
To handle a combinatorial explosion in combinations of amplification states with increasing numbers of samples, the method introduces a Gibbs sampling algorithm to learn a parsimonious model of the most frequently occurring combinations of amplification states. We applied our methods to a previously constructed set of inferred cell types derived from the Navin et al. data (Tolliver et al., 2010) and to a comparison set of 9 random samples from the raw aCGH data. We validated our model relative to manual labeling of amplicons on the same data. In both experiments, the HMM method was able to pick up significantly larger numbers of robustly amplified segments per chromosome than did prior methods or manual analysis. The resulting segments can be directly fed into downstream analysis routines for phylogeny inference or other predictions. In future work, the HMM method may be improved by fine-tuning the underlying model for copy number variation. Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr 3964. doi:1538-7445.AM2012-3964