Pankaj Agarwal
GlaxoSmithKline
Publication
Featured research published by Pankaj Agarwal.
Nature Biotechnology | 2012
Philippe Sanseau; Pankaj Agarwal; Michael R. Barnes; Tomi Pastinen; J. Brent Richards; Lon R. Cardon; Vincent Mooser
Use of genome-wide association studies for drug repositioning
… self-sustaining and does not rely on outside funding. The most popular plasmids in the collection are empty backbones created for specific gene expression or knockdown experiments, control plasmids, and constructs used for generating lentiviruses and retroviruses. A quick look at Addgene’s most requested plasmids, according to laboratory (Table 1), reveals a collection of vectors that can be used in various applications across multiple disciplines. If a BRC like Addgene were not archiving and distributing these valuable reagents, they would be far less accessible to the scientific community (ref. 3). Indeed, many researchers, especially those outside the discipline of the contributing laboratory, might not even realize that some of these powerful tools exist. Addgene has become a global repository, sending out approximately half of its requests to scientists outside the United States. Addgene now distributes genomic resources for large-scale projects, such as the Zinc Finger Consortium (http://www.zincfingers.org/), the Structural Genomics Consortium (http://www.thesgc.org/) and the Center for Genomic Engineering (http://www.cge.umn.edu/). Moving forward, Addgene hopes to collaborate with additional groups to help support their archival and distribution efforts. In addition to archiving and distributing a physical reagent, Addgene also plays a crucial role by archiving information about these reagents and making it accessible to all potential users through an online database. Addgene’s website receives an average of 35,000 page views per weekday. Having clone information available helps with reproducibility and future use, especially because checking the accuracy of this information is often an onerous task for many laboratories. Similar to other BRCs, Addgene can handle large volumes of samples and data, which facilitates the development of efficient, large-scale processes for standardizing quality control and maintaining comprehensive databases of information. Currently, Addgene sequences key regions of all incoming constructs, which helps maintain a standardized bar for accuracy throughout the repository. Addgene has developed one of the first electronic material transfer agreement (MTA) systems, which has helped expedite the MTA process. Over the past few decades, there has been an increase in the use of MTAs for transferring reagents between academic and nonprofit organizations. Although MTAs may be a practical means of maintaining control of reagents by the institution, they can often cause long delays for the researcher looking to obtain these reagents. Addgene has streamlined the technology transfer process by (i) using the universal biological material transfer agreement (UBMTA) as the basis for all transfers, (ii) making the agreements as consistent as possible across all institutions and (iii) allowing for electronic signatures from institutions that both contribute and request materials. This system has been used for >80,000 orders from >2,500 institutions worldwide. As more technology transfer offices have adapted to this system, the time required for MTA approval has been halved, with the median time now <36 h. Moving forward, it would be more efficient for institutions to implement a similar electronic MTA system for all academic resource transfers. Ultimately, BRCs like Addgene will be important for guiding academic laboratories into a new age of high-throughput research and corporate funding. We are seeing a paradigm shift in the pharmaceutical industry toward greater collaborations with academia …
1. Weaver, T., Maurer, J. & Hayashizaki, Y. Nat. Rev. Genet. 5, 861–866 (2004).
2. Fan, M., Tsai, J., Chen, B., Fan, K. & LaBaer, J. Science 307, 1877 (2005).
3. Campbell, E.G. et al. J. Am. Med. Assoc. 287, 473–480 (2002).
Clinical Pharmacology & Therapeutics | 2013
Mark R. Hurle; Lun Yang; Qing Xie; Deepak K. Rajpal; Philippe Sanseau; Pankaj Agarwal
Traditionally, most drugs have been discovered using phenotypic or target‐based screens. Subsequently, their indications are often expanded on the basis of clinical observations, providing additional benefit to patients. This review highlights computational techniques for systematic analysis of transcriptomics (Connectivity Map, CMap), side effects, and genetics (genome‐wide association study, GWAS) data to generate new hypotheses for additional indications. We also discuss data domains such as electronic health records (EHRs) and phenotypic screening that we consider promising for novel computational repositioning methods.
PLOS ONE | 2011
Lun Yang; Pankaj Agarwal
Drug repositioning helps fully explore indications for marketed drugs and clinical candidates. Here we show that clinical side effects (SEs) provide a human phenotypic profile for a drug, and that this profile can suggest additional disease indications. We extracted 3,175 SE-disease relationships by combining the SE-drug relationships from drug labels with the drug-disease relationships from PharmGKB. Many relationships provide explicit repositioning hypotheses; for example, drugs causing hypoglycemia are potential candidates for diabetes. We built Naïve Bayes models to predict indications for 145 diseases using the SEs as features. The AUC was above 0.8 in 92% of these models. The method was extended to predict indications for clinical compounds; 36% of these models achieved an AUC above 0.7. This suggests that closer attention should be paid to the SEs observed in trials, not just to evaluate harmful effects but also to rationally explore repositioning potential based on this “clinical phenotypic assay”.
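To make the modelling step concrete, here is a minimal sketch of a Bernoulli Naïve Bayes classifier that scores drugs for one disease from binary side-effect features and reports an AUC. It is an illustration only: the function names, the Laplace smoothing, and the six-drug toy table are assumptions, not the authors' implementation or data.

```python
import math

def train_bernoulli_nb(rows, labels, n_features):
    """Fit a Bernoulli Naive Bayes model with Laplace smoothing."""
    model = {}
    for cls in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == cls]
        prior = len(members) / len(rows)
        # P(side effect present | class), smoothed to avoid zero probabilities
        probs = [(sum(r[j] for r in members) + 1) / (len(members) + 2)
                 for j in range(n_features)]
        model[cls] = (prior, probs)
    return model

def log_odds(model, row):
    """Log posterior odds that a drug with these side effects is indicated."""
    score = 0.0
    for cls, sign in ((1, 1.0), (0, -1.0)):
        prior, probs = model[cls]
        log_lik = math.log(prior)
        for present, p in zip(row, probs):
            log_lik += math.log(p if present else 1.0 - p)
        score += sign * log_lik
    return score

def auc(scores, labels):
    """Probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy table: columns are [hypoglycemia, nausea, dizziness];
# label 1 means the drug is indicated for diabetes.
drugs = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1], [0, 1, 1]]
treats_diabetes = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(drugs, treats_diabetes, n_features=3)
scores = [log_odds(model, d) for d in drugs]
print(auc(scores, treats_diabetes))
```

The AUC here is computed on the training drugs purely to keep the sketch short; a real evaluation would use held-out drugs or cross-validation.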
PLOS ONE | 2009
Yong Li; Pankaj Agarwal
It is increasingly evident that human diseases are not isolated from each other. Understanding how different diseases are related to each other based on the underlying biology could provide new insights into disease etiology, classification, and shared biological mechanisms. We have taken a computational approach to studying disease relationships through 1) systematic identification of disease-associated genes by literature mining, 2) associating diseases with biological pathways in which disease genes are enriched, and 3) linking diseases together based on shared pathways. We identified 4,195 candidate disease-associated genes for 1,028 diseases. On average, about 50% of the genes associated with a disease are statistically mapped to pathways. We generated a disease network that consists of 591 diseases and 6,931 disease relationships. We examined properties of this network and provided examples of novel disease relationships that cannot be readily captured through simple literature search or gene overlap analysis. Our results could potentially provide insights into the design of novel, pathway-guided therapeutic interventions for diseases.
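Steps 2 and 3 of this approach can be sketched as follows. The abstract does not say which enrichment statistic was used, so the hypergeometric test, the 0.05 cutoff, the function names, and the toy gene sets below are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb

def hypergeom_pvalue(overlap, pathway_size, n_disease_genes, n_background):
    """P(X >= overlap) when n_disease_genes are drawn from the background."""
    upper = min(pathway_size, n_disease_genes)
    total = comb(n_background, n_disease_genes)
    tail = sum(comb(pathway_size, k)
               * comb(n_background - pathway_size, n_disease_genes - k)
               for k in range(overlap, upper + 1))
    return tail / total

def enriched_pathways(disease_genes, pathways, n_background, alpha=0.05):
    """Pathways in which the disease genes are over-represented."""
    hits = set()
    for name, members in pathways.items():
        overlap = len(disease_genes & members)
        if overlap and hypergeom_pvalue(
                overlap, len(members), len(disease_genes), n_background) < alpha:
            hits.add(name)
    return hits

def disease_network(disease_to_genes, pathways, n_background):
    """Link two diseases when they share at least one enriched pathway."""
    enriched = {d: enriched_pathways(g, pathways, n_background)
                for d, g in disease_to_genes.items()}
    edges = {}
    for a, b in combinations(sorted(enriched), 2):
        shared = enriched[a] & enriched[b]
        if shared:
            edges[(a, b)] = shared
    return edges

# Hypothetical toy input; gene and pathway names are placeholders.
pathways = {"pathway_1": {"g1", "g2", "g3", "g4"},
            "pathway_2": {"g5", "g6", "g7", "g8"}}
disease_to_genes = {"disease_A": {"g1", "g2", "g3"},
                    "disease_B": {"g2", "g3", "g9"},
                    "disease_C": {"g5", "g6", "g10"}}

network = disease_network(disease_to_genes, pathways, n_background=100)
print(network)
```

In this toy input, disease_A and disease_B are linked through pathway_1 even though they share only two genes, while disease_C maps to a different pathway and stays unconnected.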
Bioinformatics | 2008
Yong Li; Pankaj Agarwal; Dilip Rajagopalan
MOTIVATION: Given the complex nature of biological systems, pathways often need to function in a coordinated fashion in order to produce appropriate physiological responses to both internal and external stimuli. Therefore, understanding the interaction and crosstalk between pathways is important for understanding the function of both cells and more complex systems. RESULTS: We have developed a computational approach to detect crosstalk among pathways based on protein interactions between the pathway components. We built a global mammalian pathway crosstalk network that includes 580 pathways (covering 4753 genes) with 1815 edges between pathways. This crosstalk network follows a power-law distribution, $P(k) \sim k^{-\gamma}$ with $\gamma = 1.45$, where $P(k)$ is the number of pathways with $k$ neighbors; thus pathway interactions may exhibit the same scale-free phenomenon that has been documented for protein interaction networks. We further used this network to understand colorectal cancer progression to metastasis based on transcriptomic data. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
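The core idea, linking two pathways when their components interact, can be sketched as below. This is a simplified illustration under stated assumptions: a plain count threshold (`min_links`) stands in for whatever significance test the authors applied, proteins shared by both pathways are ignored, and all names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def crosstalk_edges(pathways, interactions, min_links=1):
    """Link two pathways when proteins unique to each side interact."""
    ppi = {frozenset(pair) for pair in interactions}
    edges = {}
    for a, b in combinations(sorted(pathways), 2):
        only_a = pathways[a] - pathways[b]
        only_b = pathways[b] - pathways[a]
        links = sum(1 for p in only_a for q in only_b
                    if frozenset((p, q)) in ppi)
        if links >= min_links:
            edges[(a, b)] = links
    return edges

def degree_counts(edges):
    """Map k to the number of pathways that have k crosstalk neighbors."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

# Hypothetical toy input: protein "s" belongs to both A and B.
pathways = {"A": {"a1", "a2", "s"},
            "B": {"b1", "b2", "s"},
            "C": {"c1", "c2"}}
interactions = [("a1", "b1"), ("a2", "b2"), ("s", "b1"), ("c1", "c2")]

edges = crosstalk_edges(pathways, interactions)
print(edges)
print(degree_counts(edges))
```

The output of `degree_counts` is the empirical counterpart of P(k); on a real network its log-log plot is what one would inspect for power-law behavior.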
Bioinformatics | 2005
Dilip Rajagopalan; Pankaj Agarwal
MOTIVATION: A number of omic technologies such as transcriptional profiling, proteomics, literature searches, genetic association, etc. help in the identification of sets of important genes. A subset of these genes may act in a coordinated manner, possibly because they are part of the same biological pathway. Interpreting such gene lists and relating them to pathways is a challenging task. Databases of biological relationships between thousands of mammalian genes can help in deciphering omics data. The relationships between genes can be assembled into a biological network with each protein as a node and each relationship as an edge between two proteins (or nodes). This network may then be searched for subnetworks consisting largely of interesting genes from the omics experiment. The subset of genes in the subnetwork along with the web of relationships between them helps to decipher the underlying pathways. Finding such subnetworks that maximally include all proteins from the query set but few others is the focus for this paper. RESULTS: We present a heuristic algorithm and a scoring function that work well both on simulated data and on data from known pathways. The scoring function is an extension of a previous study for a single biological experiment. We use a simple set of heuristics that provide a more efficient solution than the simulated annealing method. We find that our method works on reasonably complex curated networks containing approximately 9000 biological entities (genes and metabolites), and approximately 30,000 biological relationships. We also show that our method can pick up a pathway signal from a query list including a moderate number of genes unrelated to the pathway. In addition, we quantify the sensitivity and specificity of the technique.
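The search problem described here, covering as many query genes as possible while adding few others, can be illustrated with a simple greedy heuristic. The sketch below is not the authors' algorithm or scoring function: it repeatedly attaches the nearest remaining query gene by breadth-first search and stops when a connection would need more than `max_linkers` non-query nodes. All names and the toy network are hypothetical.

```python
from collections import deque

def build_graph(edge_list):
    """Undirected adjacency map from a list of (node, node) pairs."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def shortest_path(graph, sources, targets):
    """BFS from a set of source nodes to the nearest node in targets."""
    parent = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def greedy_subnetwork(graph, query, max_linkers=2):
    """Grow a connected subnetwork around the query genes with few extras."""
    remaining = set(query) & set(graph)
    if not remaining:
        return set()
    seed = min(remaining)
    subnet = {seed}
    remaining.discard(seed)
    while remaining:
        path = shortest_path(graph, sorted(subnet), remaining)
        # Interior nodes of the path are the non-query linkers it would add.
        if path is None or len(path) - 2 > max_linkers:
            break
        subnet.update(path)
        remaining -= subnet
    return subnet

# Hypothetical toy network: g* are query genes, x* are other entities.
graph = build_graph([("g1", "g2"), ("g2", "x1"), ("x1", "g3"), ("g3", "g4"),
                     ("g4", "x4"), ("x4", "x5"), ("x5", "x6"), ("x6", "g9")])
subnet = greedy_subnetwork(graph, {"g1", "g2", "g3", "g4", "g9"})
print(sorted(subnet))
```

In the toy network, g9 is left out because reaching it would require three linker nodes, which mimics a query gene that is unrelated to the pathway.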
The EMBO Journal | 2001
Matthew J. Betts; Roderic Guigó; Pankaj Agarwal; Robert B. Russell
The evolutionary significance of introns remains a mystery. The current availability of several complete eukaryotic genomes permits new studies to probe the possible function of these peculiar genomic features. Here we investigate the degree to which gene structure (intron position, phase and length) is conserved between homologous protein domains. We find that for certain extracellular‐signalling and nuclear domains, gene structures are similar even when protein sequence similarity is low or not significant and sequences can only be aligned with a knowledge of protein tertiary structure. In contrast, other domains, including most intracellular signalling modules, show little gene structure conservation. Intriguingly, many domains with conserved gene structures, such as cytokines, are involved in similar biological processes, such as the immune response. This suggests that gene structure conservation may be a record of key events in evolution, such as the origin of the vertebrate immune system or the duplication of nuclear receptors in nematodes. The results suggest ways to detect new and potentially very remote homologues, and to construct phylogenies for proteins with limited sequence similarity.
Nature Reviews Drug Discovery | 2009
Pankaj Agarwal; David B. Searls
Drug discovery must be guided not only by medical need and commercial potential, but also by the areas in which new science is creating therapeutic opportunities, such as target identification and the understanding of disease mechanisms. To systematically identify such areas of high scientific activity, we use bibliometrics and related data-mining methods to analyse over half a terabyte of data, including PubMed abstracts, literature citation data and patent filings. These analyses reveal trends in scientific activity related to disease studied at varying levels, down to individual genes and pathways, and provide methods to monitor areas in which scientific advances are likely to create new therapeutic opportunities.
Genome Medicine | 2014
Jie Cheng; Lun Yang; Vinod Kumar; Pankaj Agarwal
Background: Connectivity map data and associated methodologies have become a valuable tool in understanding drug mechanism of action (MOA) and discovering new indications for drugs. One of the key ideas of connectivity map (CMAP) is to measure the connectivity between disease gene expression signatures and compound-induced gene expression profiles. Despite multiple impressive anecdotal validations, only a few systematic evaluations have assessed the accuracy of this aspect of CMAP, and most of these utilize drug-to-drug matching to transfer indications across the two drugs. Methods: To assess CMAP methodologies in a more direct setting, namely the power of classifying known drug-disease relationships, we evaluated three CMAP-based methods on their prediction performance against a curated dataset of 890 true drug-indication pairs. The disease signatures were generated using the Gene Logic BioExpress™ system and the compound profiles were derived from the Connectivity Map database (CMAP, build 02, http://www.broadinstitute.org/CMAP/). Results: The similarity scoring algorithm called eXtreme Sum (XSum) performs better than the standard Kolmogorov-Smirnov (KS) statistic in terms of the area under curve and can achieve a four-fold enrichment at 0.01 false positive rate level, with AUC = 2.2E-4, P value = 0.0035. Conclusion: Connectivity map can significantly enrich true positive drug-indication pairs given an effective matching algorithm.
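As a rough illustration of an extreme-sum style score, the sketch below restricts a compound profile to its `top_n` most up-regulated and `top_n` most down-regulated genes, then subtracts the summed changes of disease-down genes from those of disease-up genes. The abstract does not define XSum, so this formula, the function name, and the toy values are assumptions; the exact definition in the paper may differ.

```python
def xsum(compound_profile, disease_up, disease_down, top_n):
    """Extreme-sum connectivity score (illustrative reading, see lead-in).

    compound_profile maps gene -> expression change induced by the compound.
    A strongly negative score suggests the compound reverses the disease
    signature.
    """
    ranked = sorted(compound_profile, key=compound_profile.get)
    # Keep only the genes the compound changes most, in either direction.
    extreme = set(ranked[:top_n]) | set(ranked[-top_n:])
    up_sum = sum(compound_profile[g] for g in disease_up if g in extreme)
    down_sum = sum(compound_profile[g] for g in disease_down if g in extreme)
    return up_sum - down_sum

# Hypothetical toy data: the compound lowers genes that the disease raises.
profile = {"g1": -2.0, "g2": -1.5, "g3": 0.1, "g4": -0.2, "g5": 1.0, "g6": 2.5}
score = xsum(profile,
             disease_up={"g1", "g2", "g3"},
             disease_down={"g4", "g6"},
             top_n=2)
print(score)
```

Unlike a rank-based KS statistic, a score of this kind uses the magnitude of the expression changes and ignores genes outside the extremes of the compound profile.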
European Conference on Machine Learning | 2013
Ping Zhang; Pankaj Agarwal; Zoran Obradovic
Drug repositioning helps identify new indications for marketed drugs and clinical candidates. In this study, we proposed an integrative computational framework to predict novel drug indications for both approved drugs and clinical molecules by integrating chemical, biological and phenotypic data sources. We defined different similarity measures for each of these data sources and utilized a weighted k-nearest neighbor algorithm to transfer similarities of nearest neighbors to prediction scores for a given compound. A large margin method was used to combine individual metrics from multiple sources into a global metric. A large-scale study was conducted to repurpose 1007 drugs against 719 diseases. Experimental results showed that the proposed algorithm outperformed similar previously developed computational drug repositioning approaches. Moreover, the new algorithm also ranked drug information sources based on their contributions to the prediction, thus paving the way for prioritizing multiple data sources and building more reliable drug repositioning models.
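The similarity-transfer step can be sketched as a weighted k-nearest neighbor score over a combined similarity. This is a simplified illustration: the source weights are fixed by hand here, whereas the abstract describes learning the combination with a large margin method, and all names and numbers are hypothetical.

```python
def combined_similarity(per_source, weights):
    """Weighted sum of per-source similarities to the query compound."""
    return sum(weights[s] * per_source[s] for s in weights)

def knn_indication_score(similarity, has_indication, k):
    """Similarity-weighted vote of the k most similar known drugs."""
    neighbors = sorted(similarity, key=similarity.get, reverse=True)[:k]
    total = sum(similarity[d] for d in neighbors)
    if total == 0:
        return 0.0
    return sum(similarity[d] * has_indication[d] for d in neighbors) / total

# Hypothetical weights and similarities of three known drugs to a query.
weights = {"chemical": 0.5, "biological": 0.25, "phenotypic": 0.25}
per_drug = {
    "drug_A": {"chemical": 1.0, "biological": 0.5, "phenotypic": 0.5},
    "drug_B": {"chemical": 0.5, "biological": 0.0, "phenotypic": 0.0},
    "drug_C": {"chemical": 0.0, "biological": 0.5, "phenotypic": 0.0},
}
has_indication = {"drug_A": 1, "drug_B": 0, "drug_C": 1}

similarity = {d: combined_similarity(s, weights) for d, s in per_drug.items()}
score = knn_indication_score(similarity, has_indication, k=2)
print(score)
```

Because each source contributes through its weight, inspecting the learned weights is one way such a framework could rank data sources by their contribution to the prediction.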