Michael Schnall-Levin
Massachusetts Institute of Technology
Publication
Featured research published by Michael Schnall-Levin.
Nature Communications | 2017
Grace X. Y. Zheng; Jessica M. Terry; Phillip Belgrader; Paul Ryvkin; Zachary Bent; Ryan Wilson; Solongo B. Ziraldo; Tobias Daniel Wheeler; Geoff McDermott; Junjie Zhu; Mark T. Gregory; Joe Shuga; Luz Montesclaros; Jason Underwood; Donald A. Masquelier; Stefanie Y. Nishimura; Michael Schnall-Levin; Paul Wyatt; Christopher M. Hindson; Rajiv Bharadwaj; Alexander Wong; Kevin Ness; Lan Beppu; H. Joachim Deeg; Christopher McFarland; Keith R. Loeb; William J. Valente; Nolan G. Ericson; Emily A. Stevens; Jerald P. Radich
Characterizing the transcriptome of individual cells is fundamental to understanding complex biological systems. We describe a droplet-based system that enables 3′ mRNA counting of tens of thousands of single cells per sample. Cell encapsulation, of up to 8 samples at a time, takes place in ∼6 min, with ∼50% cell capture efficiency. To demonstrate the system's technical performance, we collected transcriptome data from ∼250k single cells across 29 samples. We validated the sensitivity of the system and its ability to detect rare populations using cell lines and synthetic RNAs. We profiled 68k peripheral blood mononuclear cells to demonstrate the system's ability to characterize large immune populations. Finally, we used sequence variation in the transcriptome data to determine host and donor chimerism at single-cell resolution from bone marrow mononuclear cells isolated from transplant patients.
Proceedings of the National Academy of Sciences of the United States of America | 2010
Michael Schnall-Levin; Yong Zhao; Norbert Perrimon; Bonnie Berger
MicroRNAs (miRNAs) are a class of short noncoding RNAs that regulate protein-coding genes posttranscriptionally. In animals, most known miRNA targeting occurs within the 3′UTR of mRNAs, but the extent of biologically relevant targeting in the ORF or 5′UTR of mRNAs remains unknown. Here, we develop an algorithm (MinoTar—miRNA ORF Targets) to identify conserved regulatory motifs within protein-coding regions and use it to estimate the number of preferentially conserved miRNA-target sites in ORFs. We show that, in Drosophila, preferentially conserved miRNA targeting in ORFs is as widespread as it is in 3′UTRs and that, while far less abundant, conserved targets in Drosophila 5′UTRs number in the hundreds. Using our algorithm, we predicted a set of high-confidence ORF targets and selected seven miRNA-target pairs from among these for experimental validation. We observed down-regulation by the miRNA in five out of seven cases, indicating our approach can recover functional sites with high confidence. Additionally, we observed additive targeting by multiple sites within a single ORF. Altogether, our results demonstrate that the scale of biologically important miRNA targeting in ORFs is extensive and that computational tools such as ours can aid in the identification of such targets. Further evidence suggests that our results extend to mammals, but that the extent of ORF and 5′UTR targeting relative to 3′UTR targeting may be greater in Drosophila.
Nature Communications | 2015
Tudor A. Fulga; Elizabeth M. McNeill; Richard Binari; Julia Yelick; Alexandra Blanche; Matthew Booker; Bruno R. Steinkraus; Michael Schnall-Levin; Yong Zhao; Todd DeLuca; Fernando Bejarano; Zhe Han; Eric C. Lai; Dennis P. Wall; Norbert Perrimon; David Van Vactor
Although the impact of microRNAs (miRNAs) in development and disease is well established, understanding the function of individual miRNAs remains challenging. Development of competitive inhibitor molecules such as miRNA sponges has allowed the community to address individual miRNA function in vivo. However, the application of these loss-of-function strategies has been limited. Here we offer a comprehensive library of 141 conditional miRNA sponges targeting well-conserved miRNAs in Drosophila. Ubiquitous miRNA sponge delivery and consequent systemic miRNA inhibition uncovers a relatively small number of miRNA families underlying viability and gross morphogenesis, with false discovery rates in the 4–8% range. In contrast, tissue-specific silencing of muscle-enriched miRNAs reveals a surprisingly large number of novel miRNA contributions to the maintenance of adult indirect flight muscle structure and function. A strong correlation between miRNA abundance and physiological relevance is not observed, underscoring the importance of unbiased screens when assessing the contributions of miRNAs to complex biological processes.
International Conference on Machine Learning | 2008
Michael Schnall-Levin; Leonid Chindelevitch; Bonnie Berger
Probabilistic grammatical formalisms such as hidden Markov models (HMMs) and stochastic context-free grammars (SCFGs) have been extensively studied and widely applied in a number of fields. Here, we introduce a new algorithmic problem on HMMs and SCFGs that arises naturally from protein and RNA design, and which has not been previously studied. The problem can be viewed as an inverse to the one solved by the Viterbi algorithm on HMMs or by the CKY algorithm on SCFGs. We study this problem theoretically and obtain the first algorithmic results. We prove that the problem is NP-complete, even for a 3-letter emission alphabet, via a reduction from 3-SAT, a result that has implications for the hardness of RNA secondary structure design. We then develop a number of approaches for making the problem tractable. In particular, for HMMs we develop a branch-and-bound algorithm, which can be shown to have fixed-parameter tractable worst-case running time, exponential in the number of states of the HMM but linear in the length of the structure. We also show how to cast the problem as a Mixed Integer Linear Program.
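To make the inverse problem concrete, here is a minimal brute-force sketch in Python. It assumes one reading of the abstract: given a target state path, find an emission sequence whose Viterbi decoding is exactly that path. The toy two-state HMM, the function names, and the exhaustive search are illustrative assumptions, not the branch-and-bound or Mixed Integer Linear Program approaches described in the paper. The search is exponential in the sequence length, consistent with the hardness result.

```python
from itertools import product
import math


def viterbi(obs, states, start, trans, emit):
    """Return the most probable state path for obs (log-space Viterbi)."""
    def lg(p):
        return math.log(p) if p > 0 else float("-inf")

    score = {s: lg(start[s]) + lg(emit[s][obs[0]]) for s in states}
    back = []  # one back-pointer table per position after the first
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + lg(trans[r][s]))
            score[s] = prev[best] + lg(trans[best][s]) + lg(emit[s][o])
            ptr[s] = best
        back.append(ptr)
    last = max(states, key=lambda s: score[s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def inverse_viterbi(target, alphabet, states, start, trans, emit):
    """Exhaustively search for an emission sequence that decodes to target.

    Returns the first matching sequence in enumeration order, or None.
    """
    for obs in product(alphabet, repeat=len(target)):
        if viterbi(obs, states, start, trans, emit) == list(target):
            return list(obs)
    return None


# Hypothetical toy HMM: two "sticky" states with opposite emission preferences.
STATES = ["H", "L"]
ALPHABET = ["a", "b"]
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.7, "L": 0.3}, "L": {"H": 0.3, "L": 0.7}}
EMIT = {"H": {"a": 0.9, "b": 0.1}, "L": {"a": 0.1, "b": 0.9}}

designed = inverse_viterbi(["H", "L", "H"], ALPHABET, STATES, START, TRANS, EMIT)
print(designed)
```

For this toy model the search returns a sequence that decodes back to the requested path. A practical method would prune the search rather than enumerate every sequence.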
PLOS ONE | 2013
Clemens Bergwitz; Mark J. Wee; Sumi Sinha; Joanne Hyunjung Huang; Charles DeRobertis; Lawrence B. Mensah; Jonathan Brewer Cohen; Adam Friedman; Meghana M. Kulkarni; Yanhui Hu; Arunachalam Vinayagam; Michael Schnall-Levin; Bonnie Berger; Lizabeth A. Perkins; Stephanie E. Mohr; Norbert Perrimon
Phosphate is required for many important cellular processes and having too little phosphate or too much can cause disease and reduce life span in humans. However, the mechanisms underlying homeostatic control of extracellular phosphate levels and cellular effects of phosphate are poorly understood. Here, we establish Drosophila melanogaster as a model system for the study of phosphate effects. We found that Drosophila larval development depends on the availability of phosphate in the medium. Conversely, life span is reduced when adult flies are cultured on high phosphate medium or when hemolymph phosphate is increased in flies with impaired Malpighian tubules. In addition, RNAi-mediated inhibition of MAPK-signaling by knockdown of Ras85D, phl/D-Raf or Dsor1/MEK affects larval development, adult life span and hemolymph phosphate, suggesting that some in vivo effects involve activation of this signaling pathway by phosphate. To identify novel genetic determinants of phosphate responses, we used Drosophila hemocyte-like cultured cells (S2R+) to perform a genome-wide RNAi screen using MAPK activation as the readout. We identified a number of candidate genes potentially important for the cellular response to phosphate. Evaluation of 51 genes in live flies revealed some that affect larval development, adult life span and hemolymph phosphate levels.
bioRxiv | 2017
Patrick Marks; Sarah Garcia; Alvaro Martinez Barrio; Kamila Belhocine; Jorge Bernate; Rajiv Bharadwaj; Keith Bjornson; Claudia Catalanotti; Josh Delaney; Adrian Fehr; Brendan Galvin; Jill Herschleb; Christopher M. Hindson; Esty Holt; Cassandra Jabara; Susanna Jett; Nikka Keivanfar; Sofia Kyriazopoulou-Panagiotopoulou; Monkol Lek; Bill Lin; Adam J. Lowe; Shazia Mahamdallie; Shamoni Maheshwari; Tony Makarewicz; Jamie Marshall; Francesca Meschi; Chris O'keefe; Heather Ordonez; Pranav Patel; A J Price
Large-scale population-based analyses coupled with advances in technology have demonstrated that the human genome is more diverse than originally thought. To date, this diversity has largely been uncovered using short-read whole genome sequencing. However, standard short-read approaches, used primarily due to accuracy, throughput and costs, fail to give a complete picture of a genome. They struggle to identify large, balanced structural events, cannot access repetitive regions of the genome and fail to resolve the human genome into its two haplotypes. Here we describe an approach that retains long-range information while harnessing the advantages of short reads. Starting from only ∼1 ng of DNA, we produce barcoded short-read libraries. The use of novel informatic approaches allows for the barcoded short reads to be associated with the long molecules of origin, producing a novel datatype known as ‘Linked-Reads’. This approach allows for simultaneous detection of small and large variants from a single Linked-Read library. We have previously demonstrated the utility of whole genome Linked-Reads (lrWGS) for performing diploid, de novo assembly of individual genomes (Weisenfeld et al. 2017). In this manuscript, we show the advantages of Linked-Reads over standard short-read approaches for reference-based analysis. We demonstrate the ability of Linked-Reads to reconstruct megabase-scale haplotypes and to recover parts of the genome that are typically inaccessible to short reads, including phenotypically important genes such as STRC, SMN1 and SMN2. We demonstrate the ability of both lrWGS and Linked-Read Whole Exome Sequencing (lrWES) to identify complex structural variations, including balanced events, single exon deletions, and single exon duplications. The data presented here show that Linked-Reads provide a scalable approach for comprehensive genome analysis that is not possible using short reads alone.
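The idea of associating barcoded short reads with their long molecules of origin can be illustrated with a small Python sketch. It is a simplified assumption-laden example, not the informatic approach used in the paper: reads are hypothetical `(barcode, chrom, pos)` tuples that are already aligned, and reads sharing a barcode and chromosome are merged into one inferred molecule whenever consecutive positions lie within an arbitrary `max_gap` threshold.

```python
from collections import defaultdict


def infer_molecules(reads, max_gap=50_000):
    """Group aligned reads into inferred source molecules.

    reads: iterable of (barcode, chrom, pos) tuples (hypothetical input format).
    max_gap: assumed maximum distance between neighbouring reads of one molecule.
    Returns a list of (barcode, chrom, start, end, n_reads) tuples.
    """
    by_key = defaultdict(list)
    for barcode, chrom, pos in reads:
        by_key[(barcode, chrom)].append(pos)

    molecules = []
    for (barcode, chrom), positions in sorted(by_key.items()):
        positions.sort()
        start = prev = positions[0]
        count = 1
        for pos in positions[1:]:
            if pos - prev > max_gap:
                # Gap too large: close the current molecule and start a new one.
                molecules.append((barcode, chrom, start, prev, count))
                start, count = pos, 0
            prev = pos
            count += 1
        molecules.append((barcode, chrom, start, prev, count))
    return molecules


example_reads = [
    ("AAAC", "chr1", 1_000),
    ("AAAC", "chr1", 21_000),
    ("AAAC", "chr1", 60_000),
    ("AAAC", "chr1", 900_000),  # far away: treated as a second molecule
    ("GGTT", "chr1", 5_000),
    ("GGTT", "chr2", 7_000),
]
molecules = infer_molecules(example_reads)
for molecule in molecules:
    print(molecule)
```

Each inferred molecule links reads that are individually short but jointly span tens of kilobases, which is the long-range information that phasing and structural variant calling rely on.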
bioRxiv | 2018
Noemi Andor; Billy Lau; Claudia Catalanotti; Vijay Kumar; Anuja Sathe; Kamila Belhocine; Tobias Daniel Wheeler; Andrew D. Price; Maengseok Song; David Stafford; Zachary Bent; Laura DeMare; Lance Hepler; Susana Jett; Bill Lin; Shamoni Maheshwari; Anthony J Makarewicz; Mohammad Rahimi; Sanjam Sawhney; Martin Sauzade; Joe Shuga; Katrina Sullivan-Bibee; Adam Weinstein; Wei Yang; Yifeng Yin; Matthew Kubit; Jiamin Chen; Susan M. Grimes; Carlos Suárez; George A. Poultsides
Sequencing the genomes of individual cancer cells provides the highest resolution of intratumoral heterogeneity. To enable high throughput single cell DNA-Seq across thousands of individual cells per sample, we developed a droplet-based, automated partitioning technology for whole genome sequencing. We applied this approach to a set of gastric cancer cell lines and a primary gastric tumor. In parallel, we conducted a separate single cell RNA-Seq analysis on these same cancers and used copy number to compare results. This joint study, covering thousands of single cell genomes and transcriptomes, revealed extensive cellular diversity based on distinct copy number changes, numerous subclonal populations and, in the case of the primary tumor, subclonal gene expression signatures. We found genomic evidence of positive selection, where the percentage of replicating cells per clone is higher than expected, indicating ongoing tumor evolution. Our study demonstrates that joining single cell genomic DNA and transcriptomic features provides novel insights into cancer heterogeneity and biology. Significance: We conducted a massively parallel DNA sequencing analysis on a set of gastric cancer cell lines and a primary gastric tumor in combination with a joint single cell RNA-Seq analysis. This joint study, covering thousands of single cell genomes and transcriptomes, revealed extensive cellular diversity based on distinct copy number changes, numerous subclonal populations and, in the case of the primary tumor, subclonal gene expression signatures. We found genomic evidence of positive selection, where the percentage of replicating cells per clone is higher than expected, indicating ongoing tumor evolution. Our study demonstrates that combining single cell genomic DNA and transcriptomic features provides novel insights into cancer heterogeneity and biology.
Cancer Research | 2018
Claudia Catalanotti; Sarah Garcia; Kamila Belhocine; Vijay Kumar; Zeljko Dzakula; A J Price; Shamoni Maheshwar; Yifeng Yin; Michael Schnall-Levin; Rajiv Bharadwaj; Sara Agee Le; Deanna M. Church
Cancer genomes are highly unstable with new genetic variations emerging even within a single metastatic site, making it difficult to track the causal changes that drive metastasis and treatment resistance. Here we present a two-pronged approach for analyzing the full spectrum of genetic variations present in cancer samples. The first approach allows for comprehensive and high-resolution characterization of a broad range of variant types on bulk tumor samples, while the second approach characterizes structural variation at the level of the single cell, allowing for the exploration of tumor clonality and heterogeneity. At the core of our approaches is a microfluidics platform that enables the production of hundreds of thousands to millions of partitioned barcoded reactions. This platform can partition high-molecular weight DNA or single cells. Together, these complementary approaches provide a more complete picture of the genomic variation and clonal structure present in a tumor. For bulk tumor analysis, we obtained high molecular weight DNA from known cancer cell lines and used the 10x Chromium Genome solution to produce Illumina-ready sequencing libraries. In this workflow, partitioning of a limited amount of genomic DNA allows for haplotype-level dilution of genome equivalents, which are then barcoded to create a novel data type referred to as “Linked-Reads”. These molecular barcodes are used to identify reads originating from the same input molecule, providing long-range information on highly accurate short reads. In addition to highly accurate SNP calling, this further enables identification of complex structural rearrangements in tumor genomes. To gain insight into tumor heterogeneity and clonal structure, we performed single cell DNA sequencing and analysis using the 10x Chromium scDNA solution. This platform integrates single cell encapsulation, cell lysis and DNA barcoding into a streamlined workflow.
Molecular barcodes are used to associate reads with individual cells, allowing for copy number variant (CNV) detection. We applied our scDNA sequencing method to a variety of cancer cell lines, revealing their clonal structure, as identified by CNVs, with the capability to identify as few as 10 cells in a sample size of one thousand cells. Using cluster analyses we were able to detect 100 kb scale events, and by aggregating reads in large clones we were able to confidently identify smaller CNV events down to tens of kilobases. Using whole genome bulk sequencing we identified more than 500 large structural variants in HCC1954, including balanced and unbalanced events. In this presentation, we will integrate this Linked-Read data with single cell genome analysis on the same samples, and compare the genetic variation revealed by these two approaches. We will further explore the power of combining these data types for a more complete picture of tumor genome dynamics. Citation Format: Claudia Catalanotti, Sarah Garcia, Kamila Belhocine, Vijay Kumar, Zeljko Dzakula, Andrew Price, Shamoni Maheshwar, Yifeng Yin, Michael Schnall-Levin, Rajiv Bharadwaj, Sara Agee Le, Deanna M. Church. Characterizing genomic variation and tumor heterogeneity in cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2018; 2018 Apr 14-18; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2018;78(13 Suppl):Abstract nr 3400.
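The step of associating reads with individual cells and then calling copy number can be sketched roughly as follows. This Python example is only an illustration under strong assumptions and is not the 10x Chromium scDNA pipeline: reads are hypothetical `(cell_barcode, chrom, pos)` tuples, the genome is cut into fixed-size bins, and each cell's median bin count is assumed to correspond to the baseline ploidy of 2.

```python
from collections import Counter
from statistics import median


def copy_number_profiles(reads, bin_size=1_000_000, ploidy=2):
    """Estimate integer copy number per genomic bin for each cell.

    reads: iterable of (cell_barcode, chrom, pos) tuples (hypothetical format).
    Assumes the median bin count of a cell reflects the baseline ploidy.
    Bins without any reads are absent from the result, so complete losses
    are not reported by this sketch.
    """
    counts = {}
    for cell, chrom, pos in reads:
        counts.setdefault(cell, Counter())[(chrom, pos // bin_size)] += 1

    profiles = {}
    for cell, bins in counts.items():
        baseline = median(bins.values())
        profiles[cell] = {
            key: round(ploidy * n / baseline) for key, n in bins.items()
        }
    return profiles


# Synthetic single cell: three neutral bins, one gained bin, one lost bin.
example_reads = [
    ("cell1", "chr1", b * 1_000_000 + i)
    for b, n in enumerate([10, 10, 10, 15, 5])
    for i in range(n)
]
profiles = copy_number_profiles(example_reads)
print(profiles["cell1"])
```

With real data the per-cell counts are sparse and noisy, which is why aggregating reads across the cells of a clone, as the abstract describes, helps resolve smaller events.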
Cancer Research | 2016
Sofia Kyriazopoulou-Panagiotopoulou; Patrick Marks; Heather Ordonez; Kristina Giorda; Cassandra Jabara; Billy Lau; John M. Bell; Michael Schnall-Levin; Hanlee P. Ji
Studies have shown that somatic structural variation (SV) plays a key role in the oncogenic process. Traditionally SVs in the cancer genome have been detected using low resolution cytogenetic approaches, such as FISH, or microarray-based techniques. More recently, next-generation sequencing (NGS)-based technologies have been employed to detect SVs, including indels and translocations. However, both short- and long-read NGS-based approaches are limited in their ability to accurately identify SV events and delineate their breakpoints due to the limitations inherent in assembly of billions of short-read sequences across a heterogeneous cancer sample, as well as the costly and burdensome laboratory infrastructure associated with long-read sequencers. We utilized a novel technology that combines microfluidics and molecular barcoding to generate libraries that are sequenced with an Illumina system. Open-source bioinformatics software produces linked-reads that maintain long-range information and single molecule sensitivity. Cell lines and cancer samples were obtained from commercial sources, and genomic DNA was extracted. DNA sample indexing and partitioning was performed using the 10X Genomics GemCode instrument. One ng of sample DNA was used as input for each reaction, and DNA molecules were partitioned into droplets to fragment the DNA and introduce molecular barcodes. Following barcoding, droplets were fractured, and library DNA was purified and sequenced on Illumina sequencers. The GemCode Long Ranger software suite was used to map sequencing reads back to original long molecules of DNA, generating reads linked to partition barcodes. Thus we can generate phased sequences covering many 10s to 100s of kilobases.
We first benchmarked the ability to call multiple SV types using a well-characterized germline HapMap sample (NA12878) as well as two recently characterized haploid hydatidiform moles (CHM1 and CHM13) that have been studied with multiple orthogonal technologies. Regions with evidence for structural variation were reassembled into distinct haplotypes. The barcode information allowed us to both phase the structural variants we detected and disambiguate calls within highly repetitive regions, such as segmental duplications. We demonstrated high concordance with alternative approaches across all major classes of SVs, including long insertions and deletions as well as copy-neutral events. In cancer cell lines, we detected well-annotated gene fusions, such as the EML4/ALK and ALK/PTPN3 fusions in the lung cancer cell line NCI-H2228, and the SLC26A/PRKAR2A fusion in the triple negative breast cancer cell line HCC38. Citation Format: Sofia Kyriazopoulou-Panagiotopoulou, Patrick Marks, Haynes Heaton, Heather Ordonez, Kristina Giorda, Cassandra Jabara, Billy Lau, John M. Bell, Michael Schnall-Levin, Hanlee P. Ji. Linked-Reads enable detailed, phased resolution of structural variation in the cancer genome. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 3602.
Cancer Research | 2016
Grace X. Y. Zheng; Tarjei S. Mikkelsen; Jessica M. Terry; Phillip Belgrader; Paul Ryvkin; Ryan Wilson; Tobias Daniel Wheeler; Zachary Bent; Geoff McDermott; Solongo B. Ziraldo; Alex K. Wong; Michael Schnall-Levin; Ben Hindson
Advances in single cell RNA quantification techniques have enabled comprehensive study of subpopulations of cells within a heterogeneous population. The application of single cell quantification techniques to oncology is helping to elucidate the complex variability in genetic and epigenetic interactions that occur within tumor cells and their microenvironment. However, current single cell RNA-sequencing methods are limited by their reliance on costly infrastructure and laborious experimental protocols. We developed the GemCode Platform, which combines microfluidics with molecular barcoding and custom bioinformatics software to enable 3′ mRNA counting from thousands of single cells. Here we utilized the GemCode Platform to profile primary cells from healthy donors and cancer patients. Cell lines and cancer samples were obtained from commercial sources. Single cells, reagents and a single gel bead containing barcoded oligonucleotides were encapsulated into picoliter-sized droplets using the 10X Genomics GemCode Platform. The platform achieved extremely high cell loading efficiency (> 50%), enabling the creation of libraries from precious samples. Lysis and barcoded reverse transcription of RNAs from single cells were performed inside each droplet. High quality next generation sequencing libraries were finished in a single bulk reaction. The GemCode software suite was utilized for processing, interactive analysis and visualization of single cell gene expression data. We demonstrated single cell behavior through mouse and human cell mixing experiments with a low doublet rate of [...] 40,000 peripheral blood mononuclear cells from healthy donors and detected all major subpopulations (i.e., B cells, CD4+ T cells, CD8+ T cells, NK cells, dendritic cells, monocytes) in similar proportions to those previously reported in the literature.
Notably, the high-throughput nature of the platform enabled resolution of finer sub-structures such as CD4+ effector memory cells and CD4+ central memory cells. Experiments comparing cells isolated from patients with hematologic malignancies (such as CLL, AML and CML) with whole bone marrow from healthy donors further demonstrate the power of single cell profiling for characterizing disease-associated changes in complex tissues. We demonstrate the ability to perform high-throughput gene expression profiling of mRNAs in single cells. The high-throughput platform enables detection of rare cells in a heterogeneous tumor population. Moreover, efficient cell loading enables analysis of clinically relevant sample types with limited cell input. An integrated single cell mRNA analysis will lead to novel insights into the molecular characteristics of individual cancer cells and provide targets for therapeutic intervention. Citation Format: Grace X.Y. Zheng, Tarjei Mikkelsen, Jessica Terry, Phillip Belgrader, Paul Ryvkin, Ryan Wilson, Tobias D. Wheeler, Zachary Bent, Geoff McDermott, Solongo Ziraldo, Alexander Wong, Michael Schnall-Levin, Ben Hindson. Single cell mRNA quantification from 1000s of cells in healthy and malignant tumor samples using a high-throughput droplet-based system. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 150.