Abu Z. Dayem Ullah
Queen Mary University of London
Publication
Featured research published by Abu Z. Dayem Ullah.
Nucleic Acids Research | 2012
Abu Z. Dayem Ullah; Nicholas R. Lemoine; Claude Chelala
Broader functional annotation of single nucleotide variations is a valuable means for prioritizing targets in further disease studies and large-scale genotyping projects. We originally developed SNPnexus to assess the potential significance of known and novel SNPs on the major transcriptome, proteome, regulatory and structural variation models in order to identify the phenotypically important variants. Being committed to providing continuous support to the scientific community, we have substantially improved SNPnexus over time by incorporating a broader range of variations such as insertions/deletions, block substitutions, IUPAC codes submission and region-based analysis, expanding the query size limit, and most importantly including additional categories for the assessment of functional impact. SNPnexus provides a comprehensive set of annotations for genomic variation data by characterizing related functional consequences at the transcriptome/proteome levels of seven major annotation systems with in-depth analysis of potential deleterious effects, inferring physical and cytogenetic mapping, reporting information on HapMap genotype/allele data, finding overlaps with potential regulatory elements, structural variations and conserved elements, and retrieving links with previously reported genetic disease studies. SNPnexus has a user-friendly web interface with an improved query structure, enhanced functional annotation categories and flexible output presentation making it practically useful for biologists. SNPnexus is freely available at http://www.snp-nexus.org.
Briefings in Bioinformatics | 2013
Abu Z. Dayem Ullah; Nicholas R. Lemoine; Claude Chelala
Broader functional annotation of known as well as putative genetic variations is a valuable means for prioritizing targets in disease studies and large-scale genotyping projects. In this article, we present a practical guide to SNPnexus, a web-based tool that provides an aggregate set of functional annotations for genomic variation data by characterizing related consequences at the transcriptome/proteome levels with in-depth analysis of potential deleterious effects, inferring physical and cytogenetic mapping, reporting related HapMap data, finding overlaps with potential regulatory, structural as well as conserved elements and retrieving links with previously reported genetic disease studies. We focus on the SNPnexus query system, its annotation categories and the biological interpretation of results.
Workshop on Algorithms in Bioinformatics | 2008
Hans-Joachim Böckenhauer; Abu Z. Dayem Ullah; Leonidas Kapsokalivas; Kathleen Steinhöfel
The HP model is one of the most popular discretized models for the protein folding problem, i.e., for computationally predicting the three-dimensional structure of a protein from its amino acid sequence. This model considers the interactions between hydrophobic amino acids to be the driving force in the folding process. Thus, it distinguishes between polar and hydrophobic amino acids only and asks for an embedding of the amino acid sequence into a rectangular grid lattice which maximizes the number of neighboring pairs (contacts) of hydrophobic amino acids in the lattice. In this paper, we consider an HP-like model which uses a more appropriate grid structure, namely the 2D triangular grid and the face-centered cubic lattice in 3D. We consider a local-search approach for finding an optimal embedding. For defining the local-search neighborhood, we design a move set, the so-called pull moves, and prove its reversibility and completeness. We then use these moves for a tabu search algorithm which is experimentally shown to lead to optimum energy configurations and to improve the current best results for several sequences in 2D and 3D.
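The objective described above (counting contacts between hydrophobic residues in a lattice embedding) can be illustrated with a minimal sketch for the 2D triangular grid. The axial coordinate convention and the function names `is_valid_walk` and `hp_contacts` are assumptions made for illustration; they are not taken from the paper, and the pull moves and tabu search themselves are not shown.

```python
from typing import List, Tuple

Coord = Tuple[int, int]

# Six neighbour offsets of the 2D triangular lattice, written in axial
# coordinates (an assumed convention, not the paper's notation).
TRI_NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def is_valid_walk(coords: List[Coord]) -> bool:
    """Check that the embedding is self-avoiding and chain-connected."""
    if len(set(coords)) != len(coords):
        return False  # two residues share a lattice site
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if (x2 - x1, y2 - y1) not in TRI_NEIGHBOURS:
            return False  # consecutive residues are not lattice neighbours
    return True


def hp_contacts(sequence: str, coords: List[Coord]) -> int:
    """Count H-H contacts between residues that are not chain neighbours."""
    if len(sequence) != len(coords) or not is_valid_walk(coords):
        raise ValueError("sequence and coordinates do not form a valid embedding")
    position = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in TRI_NEIGHBOURS:
            j = position.get((x + dx, y + dy))
            # j > i + 1 skips chain neighbours and counts each pair once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                contacts += 1
    return contacts
```

For example, `hp_contacts("HHH", [(0, 0), (1, 0), (0, 1)])` folds three hydrophobic residues into a triangle, so the first and last residue form one contact; a local search would try to maximize this count over valid embeddings.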
BMC Bioinformatics | 2010
Abu Z. Dayem Ullah; Kathleen Steinhöfel
Background: The protein folding problem remains one of the most challenging open problems in computational biology. Simplified models in terms of lattice structure and energy function have been proposed to ease the computational hardness of this optimization problem. Heuristic search algorithms and constraint programming are two common techniques to approach this problem. The present study introduces a novel hybrid approach to simulate the protein folding problem using a constraint programming technique integrated within local search. Results: Using the face-centered-cubic lattice model and a 20 amino acid pairwise interactions energy function for the protein folding problem, a constraint programming technique has been applied to generate the neighbourhood conformations that are to be used in a generic local search procedure. Experiments have been conducted for a few small and medium sized proteins. Results have been compared with both a pure constraint programming approach and local search using a well-established local move set. Substantial improvements have been observed in terms of final energy values within acceptable runtime using the hybrid approach. Conclusion: Constraint programming approaches usually provide optimal results but become slow as the problem size grows. Local search approaches are usually faster but do not guarantee optimal solutions and tend to get stuck in local minima. The encouraging results obtained on the small proteins show that these two approaches can be combined efficiently to obtain better quality solutions within acceptable time. They also encourage future researchers to adopt hybrid techniques to solve other hard optimization problems.
Nucleic Acids Research | 2014
Abu Z. Dayem Ullah; Rosalind J. Cutts; Millika Ghetia; Emanuela Gadaleta; Stephan A. Hahn; Tatjana Crnogorac-Jurcevic; Nicholas R. Lemoine; Claude Chelala
The Pancreatic Expression Database (PED, http://www.pancreasexpression.org) is the only device currently available for mining of pancreatic cancer literature data. It brings together the largest collection of multidimensional pancreatic data from the literature including genomic, proteomic, microRNA, methylomic and transcriptomic profiles. PED allows the user to ask specific questions on the observed levels of deregulation among a broad range of specimen/experimental types including healthy/patient tissue and body fluid specimens, cell lines and murine models as well as related treatments/drugs data. Here we provide an update to PED, which has been previously featured in the Database issue of this journal. Briefly, PED data content has been substantially increased and expanded to cover methylomics studies. We introduced an extensive controlled vocabulary that records specific details on the samples and added data from large-scale meta-analysis studies. The web interface has been improved/redesigned with a quick search option to rapidly extract information about a gene/protein of interest and an upload option allowing users to add their own data to PED. We added a user guide and implemented integrated graphical tools to overlay and visualize retrieved information. Interoperability with biomart-compatible data sets was significantly improved to allow integrative queries with pancreatic cancer data.
Nucleic Acids Research | 2015
Rosalind J. Cutts; José Afonso Guerra-Assunção; Emanuela Gadaleta; Abu Z. Dayem Ullah; Claude Chelala
BCCTBbp (http://bioinformatics.breastcancertissuebank.org) was initially developed as the data-mining portal of the Breast Cancer Campaign Tissue Bank (BCCTB), a vital resource of breast cancer tissue for researchers to support and promote cutting-edge research. BCCTBbp is dedicated to maximising research on patient tissues by initially storing genomics, methylomics, transcriptomics, proteomics and microRNA data that has been mined from the literature and linking to pathways and mechanisms involved in breast cancer. Currently, the portal holds 146 datasets comprising over 227 795 expression/genomic measurements from various breast tissues (e.g. normal, malignant or benign lesions), cell lines and body fluids. BCCTBbp can be used to build on breast cancer knowledge and maximise the value of existing research. By recording a large number of annotations on samples and studies, and linking to other databases, such as NCBI, Ensembl and Reactome, a wide variety of different investigations can be carried out. Additionally, BCCTBbp has a dedicated analytical layer allowing researchers to further analyse stored datasets. A future important role for BCCTBbp is to make available all data generated on BCCTB tissues thus building a valuable resource of information on the tissues in BCCTB that will save repetition of experiments and expand scientific knowledge.
Nucleic Acids Research | 2012
Rosalind J. Cutts; Abu Z. Dayem Ullah; Ajanthah Sangaralingam; Emanuela Gadaleta; Nicholas R. Lemoine; Claude Chelala
High-throughput profiling has generated massive amounts of data across basic, clinical and translational research fields. However, open source comprehensive web tools for analysing data obtained from different platforms and technologies are still lacking. To fill this gap and the unmet computational needs of ongoing research projects, we developed O-miner, a rapid, comprehensive, efficient web tool that covers all the steps required for the analysis of both transcriptomic and genomic data starting from raw image files through in-depth bioinformatics analysis and annotation to biological knowledge extraction. O-miner was developed from a biologist end-user perspective. Hence, it is as simple to use as possible within the confines of the complexity of the data being analysed. It provides a strong analytical suite able to overlay and harness large, complicated, raw and heterogeneous sets of profiles with biological/clinical data. Biologists can use O-miner to analyse and integrate different types of data and annotations to build knowledge of relevant altered mechanisms and pathways in order to identify and prioritize novel targets for further biological validation. Here we describe the analytical workflows currently available using O-miner and present examples of use. O-miner is freely available at www.o-miner.org.
International Journal of Bioinformatics Research and Applications | 2012
Abu Z. Dayem Ullah; Sudhakar Sahoo; Kathleen Steinhöfel; Andreas Alexander Albrecht
In the present study, we define derivative scoring functions from PITA and STarMir predictions. The scoring functions are evaluated for up to five selected miRNAs with a relatively large number of validated targets reported by TarBase and miRecords. The average ranking of validated targets returned by PITA and STarMir is compared to the average ranking produced by the new derivative scores. We obtain an average improvement of 13.6% (STD∼5.7%) relative to the average ranking of validated targets produced by PITA and STarMir.
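The evaluation described above compares the average rank of validated targets under two scoring functions. The following is a minimal sketch of that comparison; the function names, the toy data and the exact definition of relative improvement are assumptions for illustration, since the abstract does not spell them out, and the derivative scoring functions themselves are not reproduced here.

```python
from typing import Dict, Set


def average_rank(scores: Dict[str, float], validated: Set[str],
                 higher_is_better: bool = True) -> float:
    """Mean 1-based rank of the validated targets in the score-ordered list."""
    ranked = sorted(scores, key=scores.get, reverse=higher_is_better)
    rank_of = {target: rank for rank, target in enumerate(ranked, start=1)}
    ranks = [rank_of[t] for t in validated if t in rank_of]
    if not ranks:
        raise ValueError("no validated target appears among the predictions")
    return sum(ranks) / len(ranks)


def relative_improvement(baseline_rank: float, new_rank: float) -> float:
    """Assumed definition: fractional reduction of the average rank."""
    return (baseline_rank - new_rank) / baseline_rank


# Hypothetical toy scores for four candidate targets of one miRNA.
baseline = {"geneA": 0.9, "geneB": 0.8, "geneC": 0.7, "geneD": 0.6}
derived = {"geneA": 0.5, "geneB": 0.9, "geneC": 0.4, "geneD": 0.8}
validated_targets = {"geneB", "geneD"}

base_avg = average_rank(baseline, validated_targets)   # ranks 2 and 4
new_avg = average_rank(derived, validated_targets)     # ranks 1 and 2
improvement = relative_improvement(base_avg, new_avg)
```

A lower average rank means validated targets appear nearer the top of the prediction list; set `higher_is_better=False` for scores where smaller values indicate stronger predictions.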
Nucleic Acids Research | 2018
Abu Z. Dayem Ullah; Jorge Oscanoa; Jun Wang; Ai Nagano; Nicholas R. Lemoine; Claude Chelala
Broader functional annotation of genetic variation is a valuable means for prioritising phenotypically-important variants in further disease studies and large-scale genotyping projects. We developed SNPnexus to meet this need by assessing the potential significance of known and novel SNPs on the major transcriptome, proteome, regulatory and structural variation models. Since its previous release in 2012, we have made significant improvements to the annotation categories and updated the query and data viewing systems. The most notable changes include broader functional annotation of noncoding variants and expanding annotations to the most recent human genome assembly GRCh38/hg38. SNPnexus has now integrated rich resources from ENCODE and Roadmap Epigenomics Consortium to map and annotate the noncoding variants onto different classes of regulatory regions and noncoding RNAs as well as providing their predicted functional impact from eight popular non-coding variant scoring algorithms and computational methods. A novel functionality offered now is the support for neo-epitope predictions from leading tools to facilitate its use in immunotherapeutic applications. These updates to SNPnexus are in preparation for its future expansion towards a fully comprehensive computational workflow for disease-associated variant prioritization from sequencing data, placing its users at the forefront of translational research. SNPnexus is freely available at http://www.snp-nexus.org.
Nucleic Acids Research | 2018
Jun Wang; Abu Z. Dayem Ullah; Claude Chelala
The vast majority of germline and somatic variations occur in the noncoding part of the genome, only a small fraction of which are believed to be functional. From the tens of thousands of noncoding variations detectable in each genome, identifying and prioritizing driver candidates with putative functional significance is challenging. To address this, we implemented IW-Scoring, a new Integrative Weighted Scoring model to annotate and prioritise functionally relevant noncoding variations. We evaluate 11 scoring methods, and apply an unsupervised spectral approach for subsequent selective integration into two linear weighted functional scoring schemas for known and novel variations. IW-Scoring produces stable high-quality performance as the best predictors for three independent data sets. We demonstrate the robustness of IW-Scoring in identifying recurrent functional mutations in the TERT promoter, as well as disease SNPs in proximity to consensus motifs and with gene regulatory effects. Using follicular lymphoma as a paradigmatic cancer model, we apply IW-Scoring to locate 11 recurrently mutated noncoding regions in 14 follicular lymphoma genomes, and validate 9 of these regions in an extension cohort, including the promoter and enhancer regions of PAX5. Overall, IW-Scoring demonstrates greater versatility in identifying trait- and disease-associated noncoding variants. Scores from IW-Scoring as well as other methods are freely available from http://www.snp-nexus.org/IW-Scoring/.
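The general idea of a linear weighted scoring schema, as mentioned above, can be sketched as a normalized weighted sum of per-method scores. This is only an illustration under stated assumptions: the method names, the weights and the function `weighted_score` are hypothetical, the component scores are assumed to be already rescaled to a common range, and the unsupervised spectral step that the authors use to select and weight methods is not reproduced.

```python
from typing import Mapping


def weighted_score(component_scores: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Weighted average of per-method scores for a single variant."""
    missing = set(weights) - set(component_scores)
    if missing:
        raise KeyError(f"no score available for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(weights[m] * component_scores[m] for m in weights)
    return weighted_sum / total_weight


# Hypothetical weights and scores for one variant (not values from the paper).
example_weights = {"method_1": 2.0, "method_2": 1.0}
example_scores = {"method_1": 0.9, "method_2": 0.3}
combined = weighted_score(example_scores, example_weights)
```

Dividing by the total weight keeps the combined score on the same scale as its inputs, which makes variants scored with different subsets of methods easier to compare.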