Publication


Featured research published by Anna Gaulton.


Nucleic Acids Research | 2012

ChEMBL: a large-scale bioactivity database for drug discovery

Anna Gaulton; Louisa J. Bellis; A. Patrícia Bento; Jon Chambers; Mark Davies; Anne Hersey; Yvonne Light; Shaun McGlinchey; David Michalovich; Bissan Al-Lazikani; John P. Overington

ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.


Nucleic Acids Research | 2014

The ChEMBL bioactivity database: an update

A. Patrícia Bento; Anna Gaulton; Anne Hersey; Louisa J. Bellis; Jon Chambers; Mark Davies; Felix A. Kruger; Yvonne Light; Lora Mak; Shaun McGlinchey; Michał Nowotka; George Papadatos; Rita Santos; John P. Overington

ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services.


Nucleic Acids Research | 2003

PRINTS and its automatic supplement, prePRINTS

Terri K. Attwood; Paul Bradley; Darren R. Flower; Anna Gaulton; Neil Maudling; Alex L. Mitchell; G. Moulton; A. Nordle; Kelly Paine; Paul D. Taylor; A. Uddin; Christianna Zygouri

The PRINTS database houses a collection of protein fingerprints. These may be used to assign uncharacterised sequences to known families and hence to infer tentative functions. The September 2002 release (version 36.0) includes 1800 fingerprints, encoding approximately 11 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here the development of an automatic supplement, prePRINTS, designed to increase the coverage of the resource and reduce some of the manual burdens inherent in its maintenance. The databases are accessible for interrogation and searching at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/.


Nature Reviews Drug Discovery | 2017

A comprehensive map of molecular drug targets

Rita Santos; Oleg Ursu; Anna Gaulton; A. Patrícia Bento; R. S. Donadi; Cristian G. Bologa; Anna Karlsson; Bissan Al-Lazikani; Anne Hersey; Tudor I. Oprea; John P. Overington

The success of mechanism-based drug discovery depends on the definition of the drug target. This definition becomes even more important as we try to link drug response to genetic variation, understand stratified clinical efficacy and safety, rationalize the differences between drugs in the same therapeutic class and predict drug utility in patient subgroups. However, drug targets are often poorly defined in the literature, both for launched drugs and for potential therapeutic agents in discovery and development. Here, we present an updated comprehensive map of molecular targets of approved drugs. We curate a total of 893 human and pathogen-derived biomolecules through which 1,578 US FDA-approved drugs act. These biomolecules include 667 human-genome-derived proteins targeted by drugs for human disease. Analysis of these drug targets indicates the continued dominance of privileged target families across disease areas, but also the growth of novel first-in-class mechanisms, particularly in oncology. We explore the relationships between bioactivity class and clinical success, as well as the presence of orthologues between human and animal models and between pathogen and human genomes. Through the collaboration of three independent teams, we highlight some of the ongoing challenges in accurately defining the targets of molecular therapeutics and present conventions for deconvoluting the complexities of molecular pharmacology and drug efficacy.


Nature Methods | 2011

PSICQUIC and PSISCORE: accessing and scoring molecular interactions

Bruno Aranda; Hagen Blankenburg; Samuel Kerrien; Fiona S. L. Brinkman; Arnaud Ceol; Emilie Chautard; Jose M. Dana; Javier De Las Rivas; Marine Dumousseau; Eugenia Galeota; Anna Gaulton; Johannes Goll; Robert E. W. Hancock; Ruth Isserlin; Rafael C. Jimenez; Jules Kerssemakers; Jyoti Khadake; David J. Lynn; Magali Michaut; Gavin O'Kelly; Keiichiro Ono; Sandra Orchard; Carlos Tejero Prieto; Sabry Razick; Olga Rigina; Lukasz Salwinski; Milan Simonovic; Sameer Velankar; Andrew Winter; Guanming Wu

To study proteins in the context of a cellular system, it is essential that the molecules with which a protein interacts are identified and the functional consequence of each interaction is understood. A plethora of resources now exist to capture molecular interaction data from the many laboratories generating…


Nucleic Acids Research | 2002

PRINTS and PRINTS-S shed light on protein ancestry

Terri K. Attwood; Martin J. Blythe; Darren R. Flower; Anna Gaulton; J. E. Mabey; Neil Maudling; L. McGregor; Alex L. Mitchell; G. Moulton; Kelly Paine; Philip Scordis

The PRINTS database houses a collection of protein fingerprints. These may be used to make family and tentative functional assignments for uncharacterised sequences. The September 2001 release (version 32.0) includes 1600 fingerprints, encoding approximately 10 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here its use as a source of annotation in the InterPro resource, and the use of its relational cousin, PRINTS-S, to model relationships between families, including those beyond the reach of conventional sequence analysis approaches. The database is accessible for BLAST, fingerprint and text searches at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/.


Nucleic Acids Research | 2017

The ChEMBL database in 2017

Anna Gaulton; Anne Hersey; Michał Nowotka; A. Patrícia Bento; Jon Chambers; David Mendez; Prudence Mutowo; Francis Atkinson; Louisa J. Bellis; Elena Cibrián-Uhalte; Mark Davies; Nathan Dedman; Anneli Karlsson; María Paula Magariños; John P. Overington; George Papadatos; Ines Smit; Andrew R. Leach

ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services.


Nature Reviews Drug Discovery | 2011

Minimum information about a bioactive entity (MIABE)

Sandra Orchard; Bissan Al-Lazikani; Steve Bryant; Dominic Clark; Elizabeth Calder; Ian Dix; Ola Engkvist; Mark J. Forster; Anna Gaulton; Michael Gilson; Robert Glen; Martin Grigorov; Kim E. Hammond-Kosack; Lee Harland; Andrew Hopkins; Christopher Larminie; Nick Lynch; Romeena K. Mann; Peter Murray-Rust; Elena Lo Piparo; Christopher Southan; Christoph Steinbeck; David Wishart; Henning Hermjakob; John P. Overington; Janet M. Thornton

Bioactive molecules such as drugs, pesticides and food additives are produced in large numbers by many commercial and academic groups around the world. Enormous quantities of data are generated on the biological properties and quality of these molecules. Access to such data — both on licensed and commercially available compounds, and also on those that fail during development — is crucial for understanding how improved molecules could be developed. For example, computational analysis of aggregated data on molecules that are investigated in drug discovery programmes has led to a greater understanding of the properties of successful drugs. However, the information required to perform these analyses is rarely published, and when it is made available it is often missing crucial data or is in a format that is inappropriate for efficient data-mining. Here, we propose a solution: the definition of reporting guidelines for bioactive entities — the Minimum Information About a Bioactive Entity (MIABE) — which has been developed by representatives of pharmaceutical companies, data resource providers and academic groups.


Nucleic Acids Research | 2016

SureChEMBL: a large-scale, chemically annotated patent document database

George Papadatos; Mark Davies; Nathan Dedman; Jon Chambers; Anna Gaulton; James Siddle; Richard Koks; Sean A. Irvine; Joe Pettersson; Nicholas T. Goncharoff; Anne Hersey; John P. Overington

SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/.


Journal of Cheminformatics | 2013

UniChem: a unified chemical structure cross-referencing and identifier tracking system

Jon Chambers; Mark Davies; Anna Gaulton; Anne Hersey; Sameer Velankar; Robert Petryszak; Janna Hastings; Louisa J. Bellis; Shaun McGlinchey; John P. Overington

UniChem is a freely available compound identifier mapping service on the internet, designed to optimize the efficiency with which structure-based hyperlinks may be built and maintained between chemistry-based resources. In the past, the creation and maintenance of such links at EMBL-EBI, where several chemistry-based resources exist, has required independent efforts by each of the separate teams. These efforts were complicated by the different data models, release schedules, and differing business rules for compound normalization and identifier nomenclature that exist across the organization. UniChem, a large-scale, non-redundant database of Standard InChIs with pointers between these structures and chemical identifiers from all the separate chemistry resources, was developed as a means of efficiently sharing the maintenance overhead of creating these links. Thus, for each source represented in UniChem, all links to and from all other sources are automatically calculated and immediately available for all to use. Updated mappings are immediately available upon loading of new data releases from the sources. Web services in UniChem provide users with a single simple automatable mechanism for maintaining all links from their resource to all other sources represented in UniChem. In addition, functionality to track changes in identifier usage allows users to monitor which identifiers are current, and which are obsolete. Lastly, UniChem has been deliberately designed to allow additional resources to be included with minimal effort. Indeed, the recent inclusion of data sources external to EMBL-EBI has provided a simple means of providing users with an even wider selection of resources with which to link to, all at no extra cost, while at the same time providing a simple mechanism for external resources to link to all EMBL-EBI chemistry resources.
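
The cross-referencing idea in this abstract (one non-redundant structure key per compound, with pointers to each source's identifiers) can be illustrated with a small in-memory sketch. This is not UniChem's implementation or API: the class `CrossRefIndex`, its methods, the source names and the placeholder structure keys are all hypothetical, chosen only to show how links between every pair of sources fall out of a shared structure key.

```python
from collections import defaultdict


class CrossRefIndex:
    """Toy cross-reference index keyed by a structure string (hypothetical)."""

    def __init__(self):
        # structure key -> {source name -> set of identifiers in that source}
        self._by_structure = defaultdict(lambda: defaultdict(set))
        # (source name, identifier) -> structure key
        self._by_identifier = {}

    def load(self, source, records):
        """Register (identifier, structure_key) pairs from one source release."""
        for identifier, structure_key in records:
            self._by_structure[structure_key][source].add(identifier)
            self._by_identifier[(source, identifier)] = structure_key

    def links(self, source, identifier):
        """Return identifiers in all *other* sources that share the structure."""
        key = self._by_identifier.get((source, identifier))
        if key is None:
            return {}
        return {
            other: sorted(ids)
            for other, ids in self._by_structure[key].items()
            if other != source
        }


# Placeholder data: "KEY1" and "KEY2" stand in for Standard InChI strings.
index = CrossRefIndex()
index.load("source_a", [("A1", "KEY1"), ("A2", "KEY2")])
index.load("source_b", [("B9", "KEY1")])
print(index.links("source_a", "A1"))
```

Because every source is indexed against the same structure key, loading a new source makes its links to all existing sources available without any pairwise mapping work, which is the maintenance saving the abstract describes.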

Collaboration


Dive into Anna Gaulton's collaborations.

Top Co-Authors

John P. Overington (European Bioinformatics Institute)

Anne Hersey (European Bioinformatics Institute)

George Papadatos (European Bioinformatics Institute)

Mark Davies (European Bioinformatics Institute)

A. Patrícia Bento (European Bioinformatics Institute)

Bissan Al-Lazikani (Institute of Cancer Research)

Jon Chambers (European Bioinformatics Institute)

Louisa J. Bellis (European Bioinformatics Institute)