Publication


Featured research published by Zichen Wang.


BMC Bioinformatics | 2013

Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

Edward Y. Chen; Christopher M. Tan; Yan Kou; Qiaonan Duan; Zichen Wang; Gabriela Vaz Meirelles; Neil R. Clark; Avi Ma’ayan

Background: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement.
Results: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios.
Conclusions: Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool that provides various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.
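The core idea of computing enrichment between an input gene list and a gene-set library can be illustrated with a small sketch. The code below uses a standard right-tailed hypergeometric test, which is a common choice for overlap enrichment; it is not Enrichr's own scoring, since the abstract only says that Enrichr offers an alternative ranking approach. The function names, the toy library, and the background size are all assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_pvalue(input_genes, gene_set, background_size):
    """Return (overlap, p) for a right-tailed hypergeometric test.

    p is the probability of seeing at least the observed overlap when
    len(input_genes) genes are drawn at random from a background of
    background_size genes, of which len(gene_set) belong to the term.
    """
    input_genes = set(input_genes)
    gene_set = set(gene_set)
    k = len(input_genes & gene_set)  # observed overlap
    n = len(input_genes)             # size of the input list
    K = len(gene_set)                # size of the library term
    N = background_size              # assumed size of the gene universe
    total = comb(N, n)
    tail = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(k, min(n, K) + 1)
    )
    return k, tail / total


def rank_terms(input_genes, library, background_size):
    """Score every term in a {term: genes} library; smallest p-value first."""
    results = []
    for term, genes in library.items():
        overlap, p = hypergeom_enrichment_pvalue(input_genes, genes, background_size)
        results.append((term, overlap, p))
    return sorted(results, key=lambda r: r[2])
```

Calling `rank_terms(my_genes, my_library, background_size)` returns one `(term, overlap, p_value)` tuple per library term. In practice the p-values would also need a multiple-testing correction, which this sketch omits.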


Nucleic Acids Research | 2016

Enrichr: a comprehensive gene set enrichment analysis web server 2016 update

Maxim V. Kuleshov; Matthew R. Jones; Andrew D. Rouillard; Nicolas F. Fernandez; Qiaonan Duan; Zichen Wang; Simon Koplev; Sherry L. Jenkins; Kathleen M. Jagodnik; Alexander Lachmann; Michael G. McDermott; Caroline D. Monteiro; Gregory W. Gundersen; Avi Ma'ayan

Enrichment analysis is a popular method for analyzing gene sets generated by genome-wide experiments. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for analysis and download. In total, Enrichr currently contains 180,184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr, including the ability to submit fuzzy sets and upload BED files, an improved application programming interface, and visualization of the results as clustergrams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biological knowledge for further biological discoveries. Enrichr is freely available at: http://amp.pharm.mssm.edu/Enrichr.


Database | 2016

The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins

Andrew D. Rouillard; Gregory W. Gundersen; Nicolas F. Fernandez; Zichen Wang; Caroline D. Monteiro; Michael G. McDermott; Avi Ma’ayan

Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene–gene and attribute–attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. 
The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.


Nature Neuroscience | 2016

Polycomb repressive complex 2 (PRC2) silences genes responsible for neurodegeneration

Melanie von Schimmelmann; Philip Feinberg; Josefa M. Sullivan; Stacy M. Ku; Ana Badimon; Mary Kaye Duff; Zichen Wang; Alexander Lachmann; Scott Dewell; Avi Ma'ayan; Ming-Hu Han; Alexander Tarakhovsky; Anne Schaefer

Normal brain function depends on the interaction between highly specialized neurons that operate within anatomically and functionally distinct brain regions. Neuronal specification is driven by transcriptional programs that are established during early neuronal development and remain in place in the adult brain. The fidelity of neuronal specification depends on the robustness of the transcriptional program that supports the neuron type-specific gene expression patterns. Here we show that polycomb repressive complex 2 (PRC2), which supports neuron specification during differentiation, contributes to the suppression of a transcriptional program that is detrimental to adult neuron function and survival. We show that PRC2 deficiency in striatal neurons leads to the de-repression of selected, predominantly bivalent PRC2 target genes that are dominated by self-regulating transcription factors normally suppressed in these neurons. The transcriptional changes in PRC2-deficient neurons lead to progressive and fatal neurodegeneration in mice. Our results point to a key role of PRC2 in protecting neurons against degeneration.


Developmental Cell | 2015

An Integrated Transcriptome Atlas of Embryonic Hair Follicle Progenitors, Their Niche, and the Developing Skin.

Rachel Sennett; Zichen Wang; Amélie Rezza; Laura Grisanti; Nataly Roitershtein; Cristina Sicchio; Ka Wai Mok; Nicholas Heitman; Carlos Clavel; Avi Ma’ayan; Michael Rendl

Defining the unique molecular features of progenitors and their niche requires a genome-wide, whole-tissue approach with cellular resolution. Here, we co-isolate embryonic hair follicle (HF) placode and dermal condensate cells, precursors of adult HF stem cells and the dermal papilla/sheath niche, along with lineage-related keratinocytes and fibroblasts, Schwann cells, melanocytes, and a population inclusive of all remaining skin cells. With next-generation RNA sequencing, we define gene expression patterns in the context of the entire embryonic skin, and through transcriptome cross-comparisons, we uncover hundreds of enriched genes in cell-type-specific signatures. Axon guidance signaling and many other pathway genes are enriched in multiple signatures, implicating these factors in driving the large-scale cellular rearrangements necessary for HF formation. Finally, we share all data in an interactive, searchable companion website. Our study provides an overarching view of signaling within the entire embryonic skin and captures a molecular snapshot of HF progenitors and their niche.


Nature Communications | 2014

Histone H3.3 and its proteolytically processed form drive a cellular senescence programme

Luis F. Duarte; Andrew J. Young; Zichen Wang; Hsan-Au Wu; Taniya Panda; Yan Kou; Avnish Kapoor; Dan Hasson; Nicholas R. Mills; Avi Ma'ayan; Masashi Narita; Emily Bernstein

The process of cellular senescence generates a repressive chromatin environment; however, the role of histone variants and histone proteolytic cleavage in senescence remains unclear. Using models of oncogene-induced and replicative senescence, here we report novel histone H3 tail cleavage events mediated by the protease Cathepsin L. We find that cleaved forms of H3 are nucleosomal and the histone variant H3.3 is the preferred cleaved form of H3. Ectopic expression of H3.3 and its cleavage product (H3.3cs1), which lacks the first twenty-one amino acids of the H3 tail, is sufficient to induce senescence. Further, H3.3cs1 chromatin incorporation is mediated by the HUCA histone chaperone complex. Genome-wide transcriptional profiling revealed that H3.3cs1 facilitates transcriptional silencing of cell cycle regulators including RB/E2F target genes, likely via the permanent removal of H3K4me3. Collectively, our study identifies histone H3.3 and its proteolytically processed forms as key regulators of cellular senescence.


Nature Communications | 2016

Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.

Zichen Wang; Caroline D. Monteiro; Kathleen M. Jagodnik; Nicolas F. Fernandez; Gregory W. Gundersen; Andrew D. Rouillard; Sherry L. Jenkins; Axel S Feldmann; Kevin Hu; Michael G. McDermott; Qiaonan Duan; Neil R. Clark; Matthew R. Jones; Yan Kou; Troy Goff; Holly Woodland; Fabio M R. Amaral; Gregory L. Szeto; Oliver Fuchs; Sophia Miryam Schüssler-Fiorenza Rose; Shvetank Sharma; Uwe Schwartz; Xabier Bengoetxea Bausela; Maciej Szymkiewicz; Vasileios Maroulis; Anton Salykin; Carolina M. Barra; Candice D. Kruth; Nicholas J. Bongio; Vaibhav Mathur

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but requires further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.


npj Systems Biology and Applications | 2016

L1000CDS2: LINCS L1000 characteristic direction signatures search engine

Qiaonan Duan; St. Patrick Reid; Neil R. Clark; Zichen Wang; Nicolas F. Fernandez; Andrew D. Rouillard; Ben Readhead; Sarah R. Tritsch; Rachel Hodos; Marc Hafner; Mario Niepel; Peter K. Sorger; Joel T. Dudley; Sina Bavari; Rekha G. Panchal; Avi Ma’ayan

The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves signal-to-noise compared with the MODZ method currently used to compute L1000 signatures. The CD processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the Gene Expression Omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2 we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines.
In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
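The cosine similarity mentioned in the abstract for target prediction can be sketched in a few lines. The example below compares a query expression signature against a dictionary of reference signatures and sorts them so that either the most similar (mimic) or the most opposite (reverse) signatures come first. Using cosine similarity for the mimic/reverse ranking is an assumption made here for illustration; the abstract states only that two methods are used and does not describe them. The function names and the data layout are likewise hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero signatures
    return dot / (norm_a * norm_b)


def rank_signatures(query, references, reverse_mode=False):
    """Rank a {name: vector} dict of reference signatures against a query.

    reverse_mode=False puts the most similar signatures first (mimic);
    reverse_mode=True puts the most anti-correlated first (reverse).
    """
    scored = [(name, cosine_similarity(query, sig)) for name, sig in references.items()]
    return sorted(scored, key=lambda s: s[1], reverse=not reverse_mode)
```

Each vector is assumed to hold expression changes for the same genes in the same order, so the query and reference signatures must be aligned before they are compared.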


Bioinformatics | 2016

Drug-induced adverse events prediction with the LINCS L1000 data.

Zichen Wang; Neil R. Clark; Avi Ma'ayan

Motivation: Adverse drug reactions (ADRs) are a central consideration during drug development. Here we present a machine learning classifier to prioritize ADRs for approved drugs and pre-clinical small-molecule compounds by combining chemical structure (CS) and gene expression (GE) features. The GE data is from the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 dataset that measured changes in GE before and after treatment of human cells with over 20,000 small-molecule compounds including most of the FDA-approved drugs. Using various benchmarking methods, we show that the integration of GE data with the CS of the drugs can significantly improve the predictability of ADRs. Moreover, transforming GE features to enrichment vectors of biological terms further improves the predictive capability of the classifiers. The most predictive biological-term features can assist in understanding the drug mechanisms of action. Finally, we applied the classifier to all >20,000 small molecules profiled, and developed a web portal for browsing and searching predictive small-molecule/ADR connections.
Availability and implementation: The interface for the adverse event predictions for the >20,000 LINCS compounds is available at http://maayanlab.net/SEP-L1000/.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.


F1000Research | 2016

An open RNA-Seq data analysis pipeline tutorial with an example of reprocessing data from a recent Zika virus study

Zichen Wang; Avi Ma'ayan

RNA-seq analysis is becoming a standard method for global gene expression profiling. However, open and standard pipelines to perform RNA-seq analysis by non-experts remain challenging due to the large size of the raw data files and the hardware requirements for running the alignment step. Here we introduce a reproducible open source RNA-seq pipeline delivered as an IPython notebook and a Docker image. The pipeline uses state-of-the-art tools and can run on various platforms with minimal configuration overhead. The pipeline enables the extraction of knowledge from typical RNA-seq studies by generating interactive principal component analysis (PCA) and hierarchical clustering (HC) plots, performing enrichment analyses against over 90 gene set libraries, and obtaining lists of small molecules that are predicted to either mimic or reverse the observed changes in mRNA expression. We apply the pipeline to a recently published RNA-seq dataset collected from human neuronal progenitors infected with the Zika virus (ZIKV). In addition to confirming the presence of cell cycle genes among the genes that are downregulated by ZIKV, our analysis uncovers significant overlap with upregulated genes that when knocked out in mice induce defects in brain morphology. This result potentially points to the molecular processes associated with the microcephaly phenotype observed in newborns from pregnant mothers infected with the virus. In addition, our analysis predicts small molecules that can either mimic or reverse the expression changes induced by ZIKV. The IPython notebook and Docker image are freely available at: http://nbviewer.jupyter.org/github/maayanlab/Zika-RNAseq-Pipeline/blob/master/Zika.ipynb and https://hub.docker.com/r/maayanlab/zika/.

Collaboration


Zichen Wang's top co-authors.

Top Co-Authors

Avi Ma'ayan, Icahn School of Medicine at Mount Sinai
Neil R. Clark, Icahn School of Medicine at Mount Sinai
Qiaonan Duan, Icahn School of Medicine at Mount Sinai
Andrew D. Rouillard, Icahn School of Medicine at Mount Sinai
Nicolas F. Fernandez, Icahn School of Medicine at Mount Sinai
Alexander Lachmann, Icahn School of Medicine at Mount Sinai
Caroline D. Monteiro, Icahn School of Medicine at Mount Sinai
Yan Kou, Icahn School of Medicine at Mount Sinai
Gregory W. Gundersen, Icahn School of Medicine at Mount Sinai