
Publication


Featured research published by Neil R. Clark.


BMC Bioinformatics | 2013

Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool

Edward Y. Chen; Christopher M. Tan; Yan Kou; Qiaonan Duan; Zichen Wang; Gabriela Vaz Meirelles; Neil R. Clark; Avi Ma’ayan

Background: System-wide profiling of genes and proteins in mammalian cells produces lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set library databases have been developed, there is still room for improvement.
Results: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of upregulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interleukin signaling in K562 cells when compared with normal myeloid CD33+ cells. Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios.
Conclusions: Enrichr is an easy-to-use, intuitive, web-based enrichment analysis tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.
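The idea of computing enrichment of an input gene list against gene-set libraries can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test on the overlap between the input list and each library term; this is a common choice and is not claimed to be Enrichr's own ranking method. The function names, the toy gene symbols and the background size are hypothetical.

```python
from math import comb

def hypergeom_enrichment(input_genes, term_genes, background_size):
    """Return (overlap, p) where p = P(overlap >= observed) under random sampling.

    Assumes every gene comes from one background of `background_size` genes.
    """
    input_genes, term_genes = set(input_genes), set(term_genes)
    k = len(input_genes & term_genes)   # observed overlap
    n = len(input_genes)                # size of the input list
    K = len(term_genes)                 # size of the library term
    N = background_size
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / comb(N, n)

def rank_terms(input_genes, library, background_size):
    """Score every term in a {term: genes} library; smallest p-value first."""
    scored = []
    for term, genes in library.items():
        overlap, p = hypergeom_enrichment(input_genes, genes, background_size)
        scored.append((term, overlap, p))
    return sorted(scored, key=lambda item: item[2])

# Toy data (hypothetical symbols, not real annotations).
toy_library = {
    "term_a": ["G1", "G2", "G3", "G7"],
    "term_b": ["G4", "G8", "G9", "G10", "G11"],
}
for term, overlap, p in rank_terms(["G1", "G2", "G3", "G4"], toy_library, 20):
    print(f"{term}: overlap={overlap}, p={p:.4f}")
```

The p-value alone favours terms with large overlaps relative to their size; a production tool would also correct for multiple testing across the library.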


BMC Bioinformatics | 2014

The characteristic direction: a geometrical approach to identify differentially expressed genes

Neil R. Clark; Kevin Hu; Axel S Feldmann; Yan Kou; Edward Y. Chen; Qiaonan Duan; Avi Ma’ayan

Background: Identifying differentially expressed genes (DEG) is a fundamental step in studies that perform genome-wide expression profiling. Typically, DEG are identified by univariate approaches such as Significance Analysis of Microarrays (SAM) or Linear Models for Microarray Data (LIMMA) for processing cDNA microarrays, and differential gene expression analysis based on the negative binomial distribution (DESeq) or Empirical analysis of Digital Gene Expression data in R (edgeR) for RNA-seq profiling.
Results: Here we present a new geometrical multivariate approach to identify DEG called the Characteristic Direction. We demonstrate that the Characteristic Direction method is significantly more sensitive than existing methods for identifying DEG in the context of transcription factor (TF) and drug perturbation responses over a large number of microarray experiments. We also benchmarked the Characteristic Direction method using synthetic data, as well as RNA-Seq data. A large collection of microarray expression data from TF perturbations (73 experiments) and drug perturbations (130 experiments) extracted from the Gene Expression Omnibus (GEO), as well as an RNA-Seq study that profiled genome-wide gene expression and STAT3 DNA binding in two subtypes of diffuse large B-cell lymphoma, were used for benchmarking the method using real data. ChIP-Seq data identifying DNA binding sites of the perturbed TFs, as well as known drug targets of the perturbing drugs, were used as prior knowledge silver-standard for validation. In all cases the Characteristic Direction DEG calling method outperformed other methods. We find that when drugs are applied to cells in various contexts, the proteins that interact with the drug targets are differentially expressed and more of the corresponding genes are discovered by the Characteristic Direction method. In addition, we show that the Characteristic Direction conceptualization can be used to perform improved gene set enrichment analyses when compared with the gene-set enrichment analysis (GSEA) and the hypergeometric test.
Conclusions: The application of the Characteristic Direction method may shed new light on relevant biological mechanisms that would have remained undiscovered by the current state-of-the-art DEG methods. The method is freely accessible via various open source code implementations using four popular programming languages: R, Python, MATLAB and Mathematica, all available at: http://www.maayanlab.net/CD.


Nature Communications | 2016

Extraction and analysis of signatures from the Gene Expression Omnibus by the crowd.

Zichen Wang; Caroline D. Monteiro; Kathleen M. Jagodnik; Nicolas F. Fernandez; Gregory W. Gundersen; Andrew D. Rouillard; Sherry L. Jenkins; Axel S Feldmann; Kevin Hu; Michael G. McDermott; Qiaonan Duan; Neil R. Clark; Matthew R. Jones; Yan Kou; Troy Goff; Holly Woodland; Fabio M R. Amaral; Gregory L. Szeto; Oliver Fuchs; Sophia Miryam Schüssler-Fiorenza Rose; Shvetank Sharma; Uwe Schwartz; Xabier Bengoetxea Bausela; Maciej Szymkiewicz; Vasileios Maroulis; Anton Salykin; Carolina M. Barra; Candice D. Kruth; Nicholas J. Bongio; Vaibhav Mathur

Gene expression data are accumulating exponentially in public repositories. Reanalysis and integration of themed collections from these studies may provide new insights, but require further human curation. Here we report a crowdsourcing project to annotate and reanalyse a large number of gene expression profiles from Gene Expression Omnibus (GEO). Through a massive open online course on Coursera, over 70 participants from over 25 countries identify and annotate 2,460 single-gene perturbation signatures, 839 disease versus normal signatures, and 906 drug perturbation signatures. All these signatures are unique and are manually validated for quality. Global analysis of these signatures confirms known associations and identifies novel associations between genes, diseases and drugs. The manually curated signatures are used as a training set to develop classifiers for extracting similar signatures from the entire GEO repository. We develop a web portal to serve these signatures for query, download and visualization.


npj Systems Biology and Applications | 2016

L1000CDS2: LINCS L1000 characteristic direction signatures search engine

Qiaonan Duan; St. Patrick Reid; Neil R. Clark; Zichen Wang; Nicolas F. Fernandez; Andrew D. Rouillard; Ben Readhead; Sarah R. Tritsch; Rachel Hodos; Marc Hafner; Mario Niepel; Peter K. Sorger; Joel T. Dudley; Sina Bavari; Rekha G. Panchal; Avi Ma’ayan

The library of integrated network-based cellular signatures (LINCS) L1000 data set currently comprises over a million gene expression profiles of chemically perturbed human cell lines. Through several unique intrinsic and extrinsic benchmarking schemes, we demonstrate that processing the L1000 data with the characteristic direction (CD) method significantly improves the signal-to-noise ratio compared with the MODZ method currently used to compute L1000 signatures. The CD-processed L1000 signatures are served through a state-of-the-art web-based search engine application called L1000CDS2. The L1000CDS2 search engine provides prioritization of thousands of small-molecule signatures, and their pairwise combinations, predicted to either mimic or reverse an input gene expression signature using two methods. The L1000CDS2 search engine also predicts drug targets for all the small molecules profiled by the L1000 assay that we processed. Targets are predicted by computing the cosine similarity between the L1000 small-molecule signatures and a large collection of signatures extracted from the Gene Expression Omnibus (GEO) for single-gene perturbations in mammalian cells. We applied L1000CDS2 to prioritize small molecules that are predicted to reverse expression in 670 disease signatures also extracted from GEO, and prioritized small molecules that can mimic expression of 22 endogenous ligand signatures profiled by the L1000 assay. As a case study, to further demonstrate the utility of L1000CDS2, we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 min. Querying these signatures with L1000CDS2, we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing cellular toxicity in human cell lines. In summary, the L1000CDS2 tool can be applied in many biological and biomedical settings, while improving the extraction of knowledge from the LINCS L1000 resource.
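The abstract names cosine similarity as the measure used to compare small-molecule signatures with single-gene perturbation signatures. A minimal sketch of that measure follows, assuming each signature is a list of per-gene values over the same ordered set of genes. The ranking helper, the "mimic"/"reverse" ordering rule and the toy signature names are hypothetical illustrations, not the L1000CDS2 search algorithm.

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length signature vectors."""
    if len(u) != len(v):
        raise ValueError("signatures must cover the same genes in the same order")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero signature carries no direction
    return dot / (norm_u * norm_v)

def rank_signatures(query, library, mode="reverse"):
    """Order library signatures by cosine similarity to the query.

    mode="mimic" puts the most similar first; mode="reverse" puts the
    most anti-correlated first (assumed ordering rule for illustration).
    """
    scored = [(name, cosine_similarity(query, sig)) for name, sig in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=(mode == "mimic"))

# Toy signatures over three genes (hypothetical values).
toy_signatures = {
    "compound_a": [2.0, -2.0, 0.0],
    "compound_b": [-1.0, 1.0, 0.0],
    "compound_c": [0.0, 0.0, 1.0],
}
print(rank_signatures([1.0, -1.0, 0.0], toy_signatures, mode="reverse"))
```

Because cosine similarity ignores vector length, it compares the direction of expression change rather than its magnitude.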


Bioinformatics | 2013

Network2Canvas: network visualization on a canvas with enrichment analysis

Christopher M. Tan; Edward Y. Chen; Ruth Dannenfelser; Neil R. Clark; Avi Ma’ayan

Motivation: Networks are vital to computational systems biology research, but visualizing them is a challenge. For networks larger than ∼100 nodes and ∼200 links, ball-and-stick diagrams fail to convey much information. To address this, we developed Network2Canvas (N2C), a web application that provides an alternative way to view networks. N2C visualizes networks by placing nodes on a square toroidal canvas. The network nodes are clustered on the canvas using simulated annealing to maximize local connections, where a node's brightness is made proportional to its local fitness. The interactive canvas is implemented in HyperText Markup Language (HTML)5 with the JavaScript library Data-Driven Documents (D3). We applied N2C to visualize 30 canvases made from human and mouse gene-set libraries and 6 canvases made from the Food and Drug Administration (FDA)-approved drug-set libraries. Given lists of genes or drugs, enriched terms are highlighted on the canvases, and their degree of clustering is computed. Because N2C produces visual patterns of enriched terms on canvases, a trained eye can detect signatures instantly. In summary, N2C provides a new flexible method to visualize large networks and can be used to perform and visualize gene-set and drug-set enrichment analyses.
Availability: N2C is freely available at http://www.maayanlab.net/N2C and is open source.
Supplementary information: Supplementary data are available at Bioinformatics online.
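Placing nodes on a square toroidal canvas with simulated annealing can be sketched as follows. This is an illustrative sketch under stated assumptions, not the N2C implementation: it assumes a four-cell neighbourhood, a swap move, a geometric cooling schedule, and a precomputed node-by-node similarity matrix `sim`. All function and parameter names are hypothetical.

```python
import math
import random

def torus_neighbors(row, col, side):
    """Return the four neighbours of a cell; indices wrap around the edges."""
    return [((row - 1) % side, col), ((row + 1) % side, col),
            (row, (col - 1) % side), (row, (col + 1) % side)]

def layout_fitness(grid, sim):
    """Sum node-to-neighbour similarity over every cell of the canvas."""
    side = len(grid)
    return sum(sim[grid[r][c]][grid[nr][nc]]
               for r in range(side) for c in range(side)
               for nr, nc in torus_neighbors(r, c, side))

def anneal_layout(sim, side, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a layout with high local similarity; return (grid, fitness)."""
    rng = random.Random(seed)
    nodes = list(range(side * side))
    rng.shuffle(nodes)
    grid = [nodes[r * side:(r + 1) * side] for r in range(side)]
    current = layout_fitness(grid, sim)
    best, best_grid = current, [row[:] for row in grid]
    for step in range(steps):
        # Geometric cooling from t_start towards t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / steps)
        r1, c1 = rng.randrange(side), rng.randrange(side)
        r2, c2 = rng.randrange(side), rng.randrange(side)
        grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
        # Full recomputation keeps the sketch short; a real tool would
        # update only the cells affected by the swap.
        candidate = layout_fitness(grid, sim)
        delta = candidate - current
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
            if current > best:
                best, best_grid = current, [row[:] for row in grid]
        else:
            # Rejected move: undo the swap.
            grid[r1][c1], grid[r2][c2] = grid[r2][c2], grid[r1][c1]
    return best_grid, best
```

The wrap-around in `torus_neighbors` is what makes the canvas toroidal: a node on the left edge is adjacent to the node on the right edge of the same row, so no cell is penalised for sitting on a border.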


Journal of The American Society of Nephrology | 2013

Renoprotective Effect of Combined Inhibition of Angiotensin-Converting Enzyme and Histone Deacetylase

Yifei Zhong; Edward Y. Chen; Ruijie Liu; Peter Y. Chuang; Sandeep K. Mallipattu; Christopher M. Tan; Neil R. Clark; Yueyi Deng; Paul E. Klotman; Avi Ma'ayan; John Cijiang He

The Connectivity Map database contains microarray signatures of gene expression derived from approximately 6000 experiments that examined the effects of approximately 1300 single drugs on several human cancer cell lines. We used these data to prioritize pairs of drugs expected to reverse the changes in gene expression observed in the kidneys of a mouse model of HIV-associated nephropathy (Tg26 mice). We predicted that the combination of an angiotensin-converting enzyme (ACE) inhibitor and a histone deacetylase inhibitor would maximally reverse the disease-associated expression of genes in the kidneys of these mice. Testing the combination of these inhibitors in Tg26 mice revealed an additive renoprotective effect, as suggested by reduction of proteinuria, improvement of renal function, and attenuation of kidney injury. Furthermore, we observed the predicted treatment-associated changes in the expression of selected genes and pathway components. In summary, these data suggest that the combination of an ACE inhibitor and a histone deacetylase inhibitor could have therapeutic potential for various kidney diseases. In addition, this study provides proof-of-concept that drug-induced expression signatures have potential use in predicting the effects of combination drug therapy.


BMC Bioinformatics | 2012

Genes2FANs: connecting genes through functional association networks.

Ruth Dannenfelser; Neil R. Clark; Avi Ma'ayan

Background: Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent.
Results: Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories.
Conclusions: Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and helping to form hypotheses for further experimentation. Our finding that disease genes in many cancers are mostly connected through PPIs, whereas other complex diseases, such as autism and type-2 diabetes, are mostly connected through FANs without PPIs, can guide better strategies for disease gene discovery. Genes2FANs is available at: http://actin.pharm.mssm.edu/genes2FANs.


Bioinformatics | 2016

Drug-induced adverse events prediction with the LINCS L1000 data.

Zichen Wang; Neil R. Clark; Avi Ma'ayan

Motivation: Adverse drug reactions (ADRs) are a central consideration during drug development. Here we present a machine learning classifier to prioritize ADRs for approved drugs and pre-clinical small-molecule compounds by combining chemical structure (CS) and gene expression (GE) features. The GE data are from the Library of Integrated Network-based Cellular Signatures (LINCS) L1000 dataset that measured changes in GE before and after treatment of human cells with over 20 000 small-molecule compounds including most of the FDA-approved drugs. Using various benchmarking methods, we show that the integration of GE data with the CS of the drugs can significantly improve the predictability of ADRs. Moreover, transforming GE features to enrichment vectors of biological terms further improves the predictive capability of the classifiers. The most predictive biological-term features can assist in understanding the drug mechanisms of action. Finally, we applied the classifier to all >20 000 small molecules profiled, and developed a web portal for browsing and searching predictive small-molecule/ADR connections.
Availability and implementation: The interface for the adverse event predictions for the >20 000 LINCS compounds is available at http://maayanlab.net/SEP-L1000/
Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Systems Biology | 2012

Sets2Networks: network inference from repeated observations of sets

Neil R. Clark; Ruth Dannenfelser; Christopher M. Tan; Michael E Komosinski; Avi Ma'ayan

Background: The skeleton of complex systems can be represented as networks where vertices represent entities, and edges represent the relations between these entities. Often it is impossible, or expensive, to determine the network structure by experimental validation of the binary interactions between every vertex pair. It is usually more practical to infer the network from surrogate observations. Network inference is the process by which an underlying network of relations between entities is determined from indirect evidence. While many algorithms have been developed to infer networks from quantitative data, less attention has been paid to methods which infer networks from repeated co-occurrence of entities in related sets. This type of data is ubiquitous in the field of systems biology and in other areas of complex systems research. Hence, such methods would be of great utility and value.
Results: Here we present a general method for network inference from repeated observations of sets of related entities. Given experimental observations of such sets, we infer the underlying network connecting these entities by generating an ensemble of networks consistent with the data. The frequency of occurrence of a given link throughout this ensemble is interpreted as the probability that the link is present in the underlying real network conditioned on the data. Exponential random graphs are used to generate and sample the ensemble of consistent networks, and we take an algorithmic approach to numerically execute the inference method. The effectiveness of the method is demonstrated on synthetic data before employing this inference approach to problems in systems biology and systems pharmacology, as well as to construct a co-authorship collaboration network. We predict direct protein-protein interactions from high-throughput mass-spectrometry proteomics, integrate data from ChIP-seq and loss-of-function/gain-of-function followed by expression data to infer a network of associations between pluripotency regulators, extract a network that connects 53 cancer drugs to each other and to 34 severe adverse events by mining the FDA’s Adverse Event Reporting System (AERS), and construct a co-authorship network that connects Mount Sinai School of Medicine investigators. The predicted networks and online software to create networks from entity-set libraries are provided online at http://www.maayanlab.net/S2N.
Conclusions: The network inference method presented here can be applied to resolve different types of networks in current systems biology and systems pharmacology as well as in other fields of research.


Science Signaling | 2011

Introduction to Statistical Methods to Analyze Large Data Sets: Principal Components Analysis

Neil R. Clark; Avi Ma'ayan

This Teaching Resource provides lecture notes, slides, and a problem set for a series of lectures from a course entitled “Systems Biology: Biomedical Modeling.” The materials accompany a lecture introducing the mathematical concepts behind principal components analysis (PCA). The lecture describes how to handle large data sets with correlation methods and unsupervised clustering using PCA.
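The core computation behind PCA can be shown with a short sketch: center the data, form the sample covariance matrix, and find its leading eigenvector. The sketch below is not taken from the lecture materials. It assumes plain Python lists as input and uses power iteration to find the first principal component; the function name and the toy data are hypothetical.

```python
from math import sqrt

def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the first principal component.

    rows: list of samples, each a list of d numeric features.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration; assumes the start vector is not orthogonal to the
    # leading eigenvector, which holds for typical data.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # zero covariance: the data have no variance
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along v.
    variance = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, variance

# Toy data lying exactly on the line y = 2x.
direction, variance = first_principal_component([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
print(direction, variance)
```

For the toy data every point lies on one line, so the first component points along that line and captures all of the variance; real expression data would need further components, found for example by deflating the covariance matrix and repeating.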

Collaboration


Top Co-Authors

Avi Ma’ayan, Icahn School of Medicine at Mount Sinai
Zichen Wang, Icahn School of Medicine at Mount Sinai
Qiaonan Duan, Icahn School of Medicine at Mount Sinai
Yan Kou, Icahn School of Medicine at Mount Sinai
Christopher M. Tan, Icahn School of Medicine at Mount Sinai
Edward Y. Chen, Icahn School of Medicine at Mount Sinai
Andrew D. Rouillard, Icahn School of Medicine at Mount Sinai
Kevin Hu, Icahn School of Medicine at Mount Sinai
Ruth Dannenfelser, Icahn School of Medicine at Mount Sinai