Publication


Featured research published by Samuel L. Wolock.


Nature | 2018

Clonal analysis of lineage fate in native haematopoiesis

Alejo Rodriguez-Fraticelli; Samuel L. Wolock; Caleb Weinreb; Riccardo Panero; Sachin Patel; Maja Jankovic; Jianlong Sun; Raffaele Calogero; Allon M. Klein; Fernando D. Camargo

Haematopoiesis, the process of mature blood and immune cell production, is functionally organized as a hierarchy, with self-renewing haematopoietic stem cells and multipotent progenitor cells sitting at the very top. Multiple models have been proposed as to what the earliest lineage choices are in these primitive haematopoietic compartments, the cellular intermediates, and the resulting lineage trees that emerge from them. Given that the bulk of studies addressing lineage outcomes have been performed in the context of haematopoietic transplantation, current models of lineage branching are more likely to represent roadmaps of lineage potential than native fate. Here we use transposon tagging to clonally trace the fates of progenitors and stem cells in unperturbed haematopoiesis. Our results describe a distinct clonal roadmap in which the megakaryocyte lineage arises largely independently of other haematopoietic fates. Our data, combined with single-cell RNA sequencing, identify a functional hierarchy of unilineage- and oligolineage-producing clones within the multipotent progenitor population. Finally, our results demonstrate that traditionally defined long-term haematopoietic stem cells are a significant source of megakaryocyte-restricted progenitors, suggesting that the megakaryocyte lineage is the predominant native fate of long-term haematopoietic stem cells. Our study provides evidence for a substantially revised roadmap for unperturbed haematopoiesis, and highlights unique properties of multipotent progenitors and haematopoietic stem cells in situ.


Bioinformatics | 2018

SPRING: a kinetic interface for visualizing high dimensional single-cell expression data

Caleb Weinreb; Samuel L. Wolock; Allon M. Klein

Motivation: Single-cell gene expression profiling technologies can map the cell states in a tissue or organism. As these technologies become more common, there is a need for computational tools to explore the data they produce. In particular, visualizing continuous gene expression topologies can be improved, since current tools tend to fragment gene expression continua or capture only limited features of complex population topologies.
Results: Force-directed layouts of k-nearest-neighbor graphs can visualize continuous gene expression topologies in a manner that preserves high-dimensional relationships and captures complex population topologies. We describe SPRING, a pipeline for data filtering, normalization and visualization using force-directed layouts and show that it reveals more detailed biological relationships than existing approaches when applied to branching gene expression trajectories from hematopoietic progenitor cells and cells of the upper airway epithelium. Visualizations from SPRING are also more reproducible than those of stochastic visualization methods such as tSNE, a state-of-the-art tool. We provide SPRING as an interactive web-tool with an easy-to-use GUI.
Availability and implementation: https://kleintools.hms.harvard.edu/tools/spring.html, https://github.com/AllonKleinLab/SPRING/.
Supplementary information: Supplementary data are available at Bioinformatics online.
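
The idea of laying out a k-nearest-neighbor graph with a force-directed method can be illustrated with a minimal sketch. This is not the SPRING implementation: the function names `knn_edges` and `force_layout`, the Fruchterman-Reingold-style force rules, and the toy one-dimensional "cells" are all assumptions chosen for illustration, using only the Python standard library.

```python
import math
import random


def knn_edges(points, k):
    """Undirected edges linking each point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to point i.
        ranked = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in ranked[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def force_layout(n_nodes, edges, n_iter=200, seed=0):
    """2-D layout in which all node pairs repel and graph edges attract."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n_nodes)]
    ideal = 1.0 / math.sqrt(n_nodes)  # preferred edge length (illustrative choice)
    for step in range(n_iter):
        disp = [[0.0, 0.0] for _ in range(n_nodes)]
        # Repulsion between every pair of nodes.
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9
                push = ideal * ideal / d
                disp[i][0] += dx / d * push
                disp[i][1] += dy / d * push
                disp[j][0] -= dx / d * push
                disp[j][1] -= dy / d * push
        # Attraction along the k-nearest-neighbor edges.
        for i, j in edges:
            dx, dy = pos[i][0] - pos[j][0], pos[i][1] - pos[j][1]
            d = math.hypot(dx, dy) or 1e-9
            pull = d * d / ideal
            disp[i][0] -= dx / d * pull
            disp[i][1] -= dy / d * pull
            disp[j][0] += dx / d * pull
            disp[j][1] += dy / d * pull
        # Move each node, capped by a step size that shrinks over time.
        max_step = 0.1 * (1.0 - step / n_iter)
        for i in range(n_nodes):
            d = math.hypot(disp[i][0], disp[i][1]) or 1e-9
            scale = min(d, max_step) / d
            pos[i][0] += disp[i][0] * scale
            pos[i][1] += disp[i][1] * scale
    return pos


# Toy data: two groups of three "cells" described by a single feature.
cells = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
edges = knn_edges(cells, k=1)
layout = force_layout(len(cells), edges)
```

In this toy case the graph splits into two chains, one per group, and the layout returns one 2-D coordinate per cell. A fixed seed makes the result repeatable, which is only a stand-in for the reproducibility claim in the abstract.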


Nature | 2018

Population snapshots predict early haematopoietic and erythroid hierarchies

Betsabeh Khoramian Tusi; Samuel L. Wolock; Caleb Weinreb; Yung Hwang; Daniel Hidalgo; Rapolas Zilionis; Ari Waisman; Jun R. Huh; Allon M. Klein; Merav Socolovsky

The formation of red blood cells begins with the differentiation of multipotent haematopoietic progenitors. Reconstructing the steps of this differentiation represents a general challenge in stem-cell biology. Here we used single-cell transcriptomics, fate assays and a theory that allows the prediction of cell fates from population snapshots to demonstrate that mouse haematopoietic progenitors differentiate through a continuous, hierarchical structure into seven blood lineages. We uncovered coupling between the erythroid and the basophil or mast cell fates, a global haematopoietic response to erythroid stress and novel growth factor receptors that regulate erythropoiesis. We defined a flow cytometry sorting strategy to purify early stages of erythroid differentiation, completely isolating classically defined burst-forming and colony-forming progenitors. We also found that the cell cycle is progressively remodelled during erythroid development and during a sharp transcriptional switch that ends the colony-forming progenitor stage and activates terminal differentiation. Our work showcases the utility of linking transcriptomic data to predictive fate models, and provides insights into lineage development in vivo.


Proceedings of the National Academy of Sciences of the United States of America | 2018

Fundamental limits on dynamic inference from single-cell snapshots

Caleb Weinreb; Samuel L. Wolock; Betsabeh Khoramian Tusi; Merav Socolovsky; Allon M. Klein

Significance: Seeing a snapshot of individuals at different stages of a dynamic process can reveal what the process would look like for a single individual over time. Biologists apply this principle to infer temporal sequences of gene expression states in cells from measurements made at a single moment in time. However, the sparsity and high dimensionality of single-cell data have made inference difficult using formal approaches. Here, we apply recent innovations in spectral graph theory to devise a simple and asymptotically exact algorithm for inferring the unique dynamic solution under defined approximations and apply it to data from bone marrow stem cells.
Abstract: Single-cell expression profiling reveals the molecular states of individual cells with unprecedented detail. Because these methods destroy cells in the process of analysis, they cannot measure how gene expression changes over time. However, some information on dynamics is present in the data: the continuum of molecular states in the population can reflect the trajectory of a typical cell. Many methods for extracting single-cell dynamics from population data have been proposed. However, all such attempts face a common limitation: for any measured distribution of cell states, there are multiple dynamics that could give rise to it, and by extension, multiple possibilities for underlying mechanisms of gene regulation. Here, we describe the aspects of gene expression dynamics that cannot be inferred from a static snapshot alone and identify assumptions necessary to constrain a unique solution for cell dynamics from static snapshots. We translate these constraints into a practical algorithmic approach, population balance analysis (PBA), which makes use of a method from spectral graph theory to solve a class of high-dimensional differential equations. We use simulations to show the strengths and limitations of PBA, and then apply it to single-cell profiles of hematopoietic progenitor cells (HPCs). Cell state predictions from this analysis agree with HPC fate assays reported in several papers over the past two decades. By highlighting the fundamental limits on dynamic inference faced by any method, our framework provides a rigorous basis for dynamic interpretation of a gene expression continuum and clarifies best experimental designs for trajectory reconstruction from static snapshot measurements.


Blood | 2018

A single cell hematopoietic landscape resolves eight lineage trajectories and defects in Kit mutant mice

Joakim S. Dahlin; Fiona Hamey; Blanca Pijuan-Sala; Mairi Shepherd; Winnie Wing Lau; Sonia Nestorowa; Caleb Weinreb; Samuel L. Wolock; Rebecca Hannah; Evangelia Diamanti; David G. Kent; Berthold Göttgens; Nicola K. Wilson

Hematopoietic stem and progenitor cells (HSPCs) maintain the adult blood system, and their dysregulation causes a multitude of diseases. However, the differentiation journeys toward specific hematopoietic lineages remain ill defined, and system-wide disease interpretation remains challenging. Here, we have profiled 44 802 mouse bone marrow HSPCs using single-cell RNA sequencing to provide a comprehensive transcriptional landscape with entry points to 8 different blood lineages (lymphoid, megakaryocyte, erythroid, neutrophil, monocyte, eosinophil, mast cell, and basophil progenitors). We identified a common basophil/mast cell bone marrow progenitor and characterized its molecular profile at the single-cell level. Transcriptional profiling of 13 815 HSPCs from the c-Kit mutant (W41/W41) mouse model revealed the absence of a distinct mast cell lineage entry point, together with global shifts in cell type abundance. Proliferative defects were accompanied by reduced Myc expression. Potential compensatory processes included upregulation of the integrated stress response pathway and downregulation of proapoptotic gene expression in erythroid progenitors, thus providing a template of how large-scale single-cell transcriptomic studies can bridge between molecular phenotypes and quantitative population changes.


Journal of Child Psychology and Psychiatry | 2013

Gene × smoking interactions on human brain gene expression: finding common mechanisms in adolescents and adults

Samuel L. Wolock; Andrew R. Yates; Stephen A. Petrill; Jason W. Bohland; Clancy Blair; Ning Li; Raghu Machiraju; Kun Huang; Christopher W. Bartlett

BACKGROUND: Numerous studies have examined gene × environment interactions (G × E) in cognitive and behavioral domains. However, these studies have been limited in that they have not been able to directly assess differential patterns of gene expression in the human brain. Here, we assessed G × E interactions using two publicly available datasets to assess if DNA variation is associated with post-mortem brain gene expression changes based on smoking behavior, a biobehavioral construct that is part of a complex system of genetic and environmental influences.
METHODS: We conducted an expression quantitative trait locus (eQTL) study on two independent human brain gene expression datasets assessing G × E for selected psychiatric genes and smoking status. We employed linear regression to model the significance of the Gene × Smoking interaction term, followed by meta-analysis across datasets.
RESULTS: Overall, we observed that the effect of DNA variation on gene expression is moderated by smoking status. Expression of 16 genes was significantly associated with single nucleotide polymorphisms that demonstrated G × E effects. The strongest finding (p = 1.9 × 10⁻¹¹) was neurexin 3-alpha (NRXN3), a synaptic cell-cell adhesion molecule involved in maintenance of neural connections (such as the maintenance of smoking behavior). Other significant G × E associations include four glutamate genes.
CONCLUSIONS: This is one of the first studies to demonstrate G × E effects within the human brain. In particular, this study implicated NRXN3 in the maintenance of smoking. The effect of smoking on NRXN3 expression and downstream behavior is different based upon SNP genotype, indicating that DNA profiles based on SNPs could be useful in understanding the effects of smoking behaviors. These results suggest that better measurement of psychiatric conditions and the environment in post-mortem brain studies may yield an important avenue for understanding the biological mechanisms of G × E interactions in psychiatry.
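
A linear model with a Gene × Smoking interaction term can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis: the additive genotype coding (0, 1, 2), the binary smoking indicator, the synthetic noise-free expression values, and the helper names `solve_linear_system` and `fit_interaction_model` are all hypothetical. The sketch only estimates coefficients by ordinary least squares and does not compute the significance test or the meta-analysis described in the abstract.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_interaction_model(genotype, smoking, expression):
    """Least-squares fit of expression ~ 1 + G + S + G*S; returns the 4 coefficients."""
    design = [[1.0, g, s, g * s] for g, s in zip(genotype, smoking)]
    p = 4
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, expression)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic example: expression depends on genotype more strongly in smokers.
genotype = [0, 1, 2, 0, 1, 2]      # minor-allele count (assumed coding)
smoking = [0, 0, 0, 1, 1, 1]       # 0 = non-smoker, 1 = smoker
expression = [1.0 + 0.5 * g + 0.2 * s + 0.8 * g * s for g, s in zip(genotype, smoking)]
intercept, beta_g, beta_s, beta_gxs = fit_interaction_model(genotype, smoking, expression)
```

A nonzero `beta_gxs` means the slope of expression on genotype differs between smokers and non-smokers, which is the sense in which smoking status moderates the effect of DNA variation.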


Human Heredity | 2015

Molecular Genetic Evidence for Shared Etiology of Autism and Prodigy

Joanne Ruthsatz; Stephen A. Petrill; Ning Li; Samuel L. Wolock; Christopher W. Bartlett

Child prodigies are rare individuals with an exceptional working memory and unique attentional skills that may facilitate the attainment of professional skill levels at an age well before what is observed in the general population. Some characteristics of prodigy have been observed to be quantitatively similar to those observed in autism spectrum disorder (ASD), suggesting possible shared etiology, though objectively validated prodigies are so rare that evidence has been sparse. We performed a family-based genome-wide linkage analysis on 5 nuclear and extended families to search for genetic loci that influence the presence of both prodigy and ASD, assuming that the two traits have the same genetic etiology in the analysis model in order to find shared loci. A shared locus on chromosome 1p31-q21 reached genome-wide significance with two extended family-based linkage methods consisting of the Bayesian PPL method and the LOD score maximized over the trait parameters (i.e., MOD), yielding a simulation-based empirical significance of p = 0.000742 and p = 0.000133, respectively. Within linkage regions, we performed association analysis and assessed if copy number variants could account for the linkage signal. No evidence of specificity for either the prodigy or the ASD trait was observed. This finding suggests that a locus on chromosome 1 increases the likelihood of both prodigy and autism in these families.


Cancer Informatics | 2014

StickWRLD as an Interactive Visual Pre-Filter for Canceromics-Centric Expression Quantitative Trait Locus Data

Robert Wolfgang Rumpf; Samuel L. Wolock; William C. Ray

As datasets increase in complexity, the time required for analysis (both computational and human domain-expert) increases. One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. Simple tables of summary statistics rarely provide an adequate picture of the patterns and details of the dataset to enable researchers to make well-informed decisions about the adequacy of the models they are constructing. We have developed a tool, StickWRLD, which allows the user to visually browse through their data, displaying all possible correlations. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria – effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In this study, we applied StickWRLD to a semi-synthetic dataset constructed from two published human datasets. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene–SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.


bioRxiv | 2018

Scrublet: computational identification of cell doublets in single-cell transcriptomic data

Samuel L. Wolock; Romain Lopez; Allon M. Klein

Single-cell RNA-sequencing has become a widely used, powerful approach for studying cell populations. However, these methods often generate multiplet artifacts, where two or more cells receive the same barcode, resulting in a hybrid transcriptome. In most experiments, multiplets account for several percent of transcriptomes and can confound downstream data analysis. Here, we present Scrublet (Single-Cell Remover of Doublets), a framework for predicting the impact of multiplets in a given analysis and identifying problematic multiplets. Scrublet avoids the need for expert knowledge or cell clustering by simulating multiplets from the data and building a nearest neighbor classifier. To demonstrate the utility of this approach, we test Scrublet on several datasets that include independent knowledge of cell multiplets.
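
The approach of simulating multiplets from the data and classifying with nearest neighbors can be sketched in a few lines. This is a simplified illustration, not the Scrublet implementation: the function name `doublet_scores`, the use of raw summed counts without normalization or dimensionality reduction, and the score defined as the plain fraction of simulated doublets among the k nearest neighbors are all assumptions made for the example.

```python
import math
import random


def doublet_scores(cells, n_sim, k, seed=0):
    """Score each observed cell by the share of simulated doublets among its k nearest neighbors."""
    rng = random.Random(seed)
    simulated = []
    for _ in range(n_sim):
        # A simulated doublet is the sum of two randomly chosen observed profiles.
        i, j = rng.sample(range(len(cells)), 2)
        simulated.append([a + b for a, b in zip(cells[i], cells[j])])
    # Label 0 = observed transcriptome, 1 = simulated doublet.
    labelled = [(c, 0) for c in cells] + [(s, 1) for s in simulated]
    scores = []
    for idx, cell in enumerate(cells):
        # Rank all other profiles by Euclidean distance to this cell.
        ranked = sorted(
            (math.dist(cell, other), label)
            for pos, (other, label) in enumerate(labelled)
            if pos != idx
        )
        scores.append(sum(label for _, label in ranked[:k]) / k)
    return scores


# Toy data: two cell types with disjoint marker genes, plus one mixed profile.
type_a = [[10, 10, 0, 0] for _ in range(10)]
type_b = [[0, 0, 10, 10] for _ in range(10)]
mixed = [[10, 10, 10, 10]]
scores = doublet_scores(type_a + type_b + mixed, n_sim=42, k=5)
```

In this toy setting the singlets are surrounded by identical observed cells and score zero, while the mixed profile sits next to simulated doublets built from one cell of each type and therefore scores higher.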


BMC Bioinformatics | 2014

Addressing the unmet need for visualizing conditional random fields in biological data

William C. Ray; Samuel L. Wolock; Nicholas Callahan; Min Dong; Qingshun Quinn Li; Chun Liang; Thomas J. Magliery; Christopher W. Bartlett

Background: The biological world is replete with phenomena that appear to be ideally modeled and analyzed by one archetypal statistical framework - the Graphical Probabilistic Model (GPM). The structure of GPMs is a uniquely good match for biological problems that range from aligning sequences to modeling the genome-to-phenome relationship. The fundamental questions that GPMs address involve making decisions based on a complex web of interacting factors. Unfortunately, while GPMs ideally fit many questions in biology, they are not an easy solution to apply. Building a GPM is not a simple task for an end user. Moreover, applying GPMs is also impeded by the insidious fact that the “complex web of interacting factors” inherent to a problem might be easy to define and also intractable to compute upon.
Discussion: We propose that the visualization sciences can contribute to many domains of the bio-sciences, by developing tools to address archetypal representation and user interaction issues in GPMs, and in particular a variety of GPM called a Conditional Random Field (CRF). CRFs bring additional power, and additional complexity, because the CRF dependency network can be conditioned on the query data.
Conclusions: In this manuscript we examine the shared features of several biological problems that are amenable to modeling with CRFs, highlight the challenges that existing visualization and visual analytics paradigms induce for these data, and document an experimental solution called StickWRLD which, while leaving room for improvement, has been successfully applied in several biological research projects.
Software and tutorials are available at http://www.stickwrld.org/

Collaboration


Dive into Samuel L. Wolock's collaborations.

Top Co-Authors

Betsabeh Khoramian Tusi

University of Massachusetts Medical School

Merav Socolovsky

University of Massachusetts Medical School

Yung Hwang

University of Massachusetts Medical School

Christopher W. Bartlett

The Research Institute at Nationwide Children's Hospital

Daniel Hidalgo

University of Massachusetts Medical School

Ning Li

Nationwide Children's Hospital
