Publication


Featured research published by Sayoni Das.


Nucleic Acids Research | 2015

CATH: comprehensive structural and functional annotations for genome sequences

Ian Sillitoe; Tony E. Lewis; Alison L. Cuff; Sayoni Das; Paul Ashford; Natalie L. Dawson; Nicholas Furnham; Roman A. Laskowski; David A. Lee; Jonathan G. Lees; Sonja Lehtinen; Romain A. Studer; Janet M. Thornton; Christine A. Orengo

The latest version of the CATH-Gene3D protein structure classification database (4.0, http://www.cathdb.info) provides annotations for over 235 000 protein domain structures and includes 25 million domain predictions. This article provides an update on the major developments in the 2 years since the last publication in this journal including: significant improvements to the predictive power of our functional families (FunFams); the release of our ‘current’ putative domain assignments (CATH-B); a new, strictly non-redundant data set of CATH domains suitable for homology benchmarking experiments (CATH-40) and a number of improvements to the web pages.


Nucleic Acids Research | 2014

Gene3D: Multi-domain annotations for protein sequence and comparative genome analysis

Jonathan G. Lees; David A. Lee; Romain A. Studer; Natalie L. Dawson; Ian Sillitoe; Sayoni Das; Corin Yeats; Benoit H. Dessailly; Robert Rentzsch; Christine A. Orengo

Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of protein domain structure annotations for protein sequences. Domains are predicted using a library of profile HMMs from 2738 CATH superfamilies. Gene3D assigns domain annotations to Ensembl and UniProt sequence sets including >6000 cellular genomes and >20 million unique protein sequences. This represents an increase of 45% in the number of protein sequences since our last publication. Thanks to improvements in the underlying data and pipeline, we see large increases in the domain coverage of sequences. We have expanded this coverage by integrating Pfam and SUPERFAMILY domain annotations, and we now resolve domain overlaps to provide highly comprehensive composite multi-domain architectures. To make these data more accessible for comparative genome analyses, we have developed novel search algorithms for searching genomes to identify related multi-domain architectures. In addition to providing domain family annotations, we have now developed a pipeline for 3D homology modelling of domains in Gene3D. This has been applied to the human genome and will be rolled out to other major organisms over the next year.


Nucleic Acids Research | 2017

CATH: an expanded resource to predict protein function through structure and sequence

Natalie L. Dawson; Tony E. Lewis; Sayoni Das; Jonathan G. Lees; David A. Lee; Paul Ashford; Christine A. Orengo; Ian Sillitoe

The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the number of predicted protein domains in the previous version. The daily-updated CATH-B, which contains our very latest domain assignment data, provides putative classifications for over 100 000 additional protein domains. This article describes developments to the CATH-Gene3D resource over the last two years since the publication in 2015, including: significant increases to our structural and sequence coverage; expansion of the functional families in CATH; building a support vector machine (SVM) to automatically assign domains to superfamilies; improved search facilities to return alignments of query sequences against multiple sequence alignments; the redesign of the web pages and download site.


Nucleic Acids Research | 2016

Gene3D: expanding the utility of domain assignments

Su Datt Lam; Natalie L. Dawson; Sayoni Das; Ian Sillitoe; Paul Ashford; David A. Lee; Sonja Lehtinen; Christine A. Orengo; Jonathan G. Lees

Gene3D (http://gene3d.biochem.ucl.ac.uk) is a database of domain annotations of Ensembl and UniProtKB protein sequences. Domains are predicted using a library of profile HMMs representing 2737 CATH superfamilies. Gene3D has previously featured in the Database issue of NAR and here we report updates to the website and database. The current Gene3D (v14) release has expanded its domain assignments to ∼20 000 cellular genomes and over 43 million unique protein sequences, more than doubling the number of protein sequences since our last publication. Amongst other updates, we have improved our Functional Family annotation method. We have also improved the quality and coverage of our 3D homology modelling pipeline of predicted CATH domains. Additionally, the structural models have been expanded to include an extra model organism (Drosophila melanogaster). We also document a number of additional visualization tools in the Gene3D website.


Nucleic Acids Research | 2015

CATH FunFHMMer web server: protein functional annotations using functional family assignments

Sayoni Das; Ian Sillitoe; David A. Lee; Jonathan G. Lees; Natalie L. Dawson; John M. Ward; Christine A. Orengo

The widening function annotation gap in protein databases and the increasing number and diversity of the proteins being sequenced present new challenges to protein function prediction methods. Multidomain proteins complicate the protein sequence–structure–function relationship further as new combinations of domains can expand the functional repertoire, creating new proteins and functions. Here, we present the FunFHMMer web server, which provides Gene Ontology (GO) annotations for query protein sequences based on the functional classification of the domain-based CATH-Gene3D resource. Our server also provides valuable information for the prediction of functional sites. The predictive power of FunFHMMer has been validated on a set of 95 proteins where FunFHMMer performs better than BLAST, Pfam and CDD. Recent validation by an independent international competition ranks FunFHMMer as one of the top function prediction methods in predicting GO annotations for both the Biological Process and Molecular Function Ontology. The FunFHMMer web server is available at http://www.cathdb.info/search/by_funfhmmer.


Current Opinion in Genetics & Development | 2015

Diversity in protein domain superfamilies

Sayoni Das; Natalie L. Dawson; Christine A. Orengo

Whilst ∼93% of domain superfamilies appear to be relatively structurally and functionally conserved based on the available data from the CATH-Gene3D domain classification resource, the remainder are much more diverse. In this review, we consider how domains in some of the most ubiquitous and promiscuous superfamilies have evolved, in particular the plasticity in their functional sites and surfaces which expands the repertoire of molecules they interact with and actions performed on them. To what extent can we identify a core function for these superfamilies which would allow us to develop a ‘domain grammar of function’ whereby a protein's biological role can be proposed from its constituent domains? Clearly the first step is to understand the extent to which these components vary and how changes in their molecular make-up modify function.


PLOS Computational Biology | 2016

Novel computational protocols for functionally classifying and characterising serine beta-lactamases

David A. Lee; Sayoni Das; Natalie L. Dawson; Dragana Dobrijevic; John M. Ward; Christine A. Orengo

Beta-lactamases represent the main bacterial mechanism of resistance to beta-lactam antibiotics and are a significant challenge to modern medicine. We have developed an automated classification and analysis protocol that exploits structure- and sequence-based approaches and which allows us to propose a grouping of serine beta-lactamases that more consistently captures and rationalizes the existing three classification schemes: Classes (A, C and D, which vary in their implementation of the mechanism of action); Types (that largely reflect evolutionary distance measured by sequence similarity); and Variant groups (which largely correspond with the Bush-Jacoby clinical groups). Our analysis platform exploits a suite of in-house and public tools to identify Functional Determinants (FDs), i.e. residue sites, responsible for conferring different phenotypes between different classes, different types and different variants. We focused on Class A beta-lactamases, the most highly populated and clinically relevant class, to identify FDs implicated in the distinct phenotypes associated with different Class A Types and Variants. We show that our FunFHMMer method can separate the known beta-lactamase classes and identify those positions likely to be responsible for the different implementations of the mechanism of action in these enzymes. Two novel algorithms, ASSP and SSPA, allow detection of FD sites likely to contribute to the broadening of the substrate profiles. Using our approaches, we recognise 151 Class A types in UniProt. Finally, we used our beta-lactamase FunFams and ASSP profiles to detect 4 novel Class A types in microbiome samples. Our platforms have been validated by literature studies, in silico analysis and some targeted experimental verification. Although developed for the serine beta-lactamases they could be used to classify and analyse any diverse protein superfamily where sub-families have diverged over both long and short evolutionary timescales.


Methods | 2016

Protein function annotation using protein domain family resources

Sayoni Das; Christine A. Orengo

As a result of the genome sequencing and structural genomics initiatives, we have a wealth of protein sequence and structural data. However, only about 1% of these proteins have experimental functional annotations. As a result, computational approaches that can predict protein functions are essential in bridging this widening annotation gap. This article reviews the current approaches of protein function prediction using structure and sequence based classification of protein domain family resources with a special focus on functional families in the CATH-Gene3D resource.


Acta Crystallographica Section D Structural Biology | 2017

An overview of comparative modelling and resources dedicated to large-scale modelling of genome sequences

Su Datt Lam; Sayoni Das; Ian Sillitoe; Christine Orengo

This paper reviews the recent advances in computational template-based structural modelling and proposes the subclustering of protein domain superfamilies to guide the template-selection process.


Biochemical Journal | 2018

Protein CoAlation and antioxidant function of coenzyme A in prokaryotic cells

Yugo Tsuchiya; Alexander Zhyvoloup; Jovana Baković; Naam Thomas; Bess Yi Kun Yu; Sayoni Das; Christine Orengo; Clare Newell; John M. Ward; Giorgio Saladino; Federico Comitani; Francesco Luigi Gervasio; Oksana Malanchuk; A. I. Khoruzhenko; Valeriy Filonenko; Sew Yeu Peak-Chew; Mark Skehel; Ivan Gout

In all living organisms, coenzyme A (CoA) is an essential cofactor with a unique design allowing it to function as an acyl group carrier and a carbonyl-activating group in diverse biochemical reactions. It is synthesized in a highly conserved process in prokaryotes and eukaryotes that requires pantothenic acid (vitamin B5), cysteine and ATP. CoA and its thioester derivatives are involved in major metabolic pathways, allosteric interactions and the regulation of gene expression. A novel unconventional function of CoA in redox regulation has been recently discovered in mammalian cells and termed protein CoAlation. Here, we report for the first time that protein CoAlation occurs at a background level in exponentially growing bacteria and is strongly induced in response to oxidizing agents and metabolic stress. Over 12% of Staphylococcus aureus gene products were shown to be CoAlated in response to diamide-induced stress. In vitro CoAlation of S. aureus glyceraldehyde-3-phosphate dehydrogenase was found to inhibit its enzymatic activity and to protect the catalytic cysteine 151 from overoxidation by hydrogen peroxide. These findings suggest that in exponentially growing bacteria, CoA functions to generate metabolically active thioesters, while it also has the potential to act as a low-molecular-weight antioxidant in response to oxidative and metabolic stress.

Collaboration


Dive into Sayoni Das's collaborations.

Top Co-Authors

Ian Sillitoe
University College London

David A. Lee
Queen Mary University of London

Paul Ashford
University College London

John M. Ward
University College London

Su Datt Lam
University College London