Publication


Featured research published by Fabian A. Buske.


Nucleic Acids Research | 2009

MEME Suite: tools for motif discovery and searching

Timothy L. Bailey; Mikael Bodén; Fabian A. Buske; Martin C. Frith; Charles E. Grant; Luca Clementi; Jingyuan Ren; Wilfred W. Li; William Stafford Noble

The MEME Suite web server provides a unified portal for online discovery and analysis of sequence motifs representing features such as DNA binding sites and protein interaction domains. The popular MEME motif discovery algorithm is now complemented by the GLAM2 algorithm which allows discovery of motifs containing gaps. Three sequence scanning algorithms—MAST, FIMO and GLAM2SCAN—allow scanning numerous DNA and protein sequence databases for motifs discovered by MEME and GLAM2. Transcription factor motifs (including those discovered using MEME) can be compared with motifs in many popular motif databases using the motif database scanning algorithm Tomtom. Transcription factor motifs can be further analyzed for putative function by association with Gene Ontology (GO) terms using the motif-GO term association tool GOMO. MEME output now contains sequence LOGOS for each discovered motif, as well as buttons to allow motifs to be conveniently submitted to the sequence and motif database scanning algorithms (MAST, FIMO and Tomtom), or to GOMO, for further analysis. GLAM2 output similarly contains buttons for further analysis using GLAM2SCAN and for rerunning GLAM2 with different parameters. All of the motif-based tools are now implemented as web services via Opal. Source code, binaries and a web server are freely available for noncommercial use at http://meme.nbcr.net.


RNA Biology | 2011

Potential in vivo roles of nucleic acid triple-helices

Fabian A. Buske; John S. Mattick; Timothy L. Bailey

The ability of double-stranded DNA to form a triple-helical structure by hydrogen bonding with a third strand is well established, but the biological functions of these structures remain largely unknown. There is considerable albeit circumstantial evidence for the existence of nucleic triplexes in vivo and their potential participation in a variety of biological processes including chromatin organization, DNA repair, transcriptional regulation, and RNA processing has been investigated in a number of studies to date. There is also a range of possible mechanisms to regulate triplex formation through differential expression of triplex-forming RNAs, alteration of chromatin accessibility, sequence unwinding and nucleotide modifications. With the advent of next generation sequencing technology combined with targeted approaches to isolate triplexes, it is now possible to survey triplex formation with respect to their genomic context, abundance and dynamical changes during differentiation and development, which may open up new vistas in understanding genome biology and gene regulation.


Bioinformatics | 2010

Assigning roles to DNA regulatory motifs using comparative genomics

Fabian A. Buske; Mikael Bodén; Denis C. Bauer; Timothy L. Bailey

Motivation: Transcription factors (TFs) are crucial during the lifetime of the cell. Their functional roles are defined by the genes they regulate. Uncovering these roles not only sheds light on the TF at hand but puts it into the context of the complete regulatory network. Results: Here, we present an alignment- and threshold-free comparative genomics approach for assigning functional roles to DNA regulatory motifs. We incorporate our approach into the Gomo algorithm, a computational tool for detecting associations between a user-specified DNA regulatory motif [expressed as a position weight matrix (PWM)] and Gene Ontology (GO) terms. Incorporating multiple species into the analysis significantly improves Gomo's ability to identify GO terms associated with the regulatory targets of TFs. Including three comparative species in the process of predicting TF roles in Saccharomyces cerevisiae and Homo sapiens increases the number of significant predictions by 75 and 200%, respectively. The predicted GO terms are also more specific, yielding deeper biological insight into the role of the TF. Adjusting motif (binding) affinity scores for individual sequence composition proves to be essential for avoiding false positive associations. We describe a novel DNA sequence-scoring algorithm that compensates a thermodynamic measure of DNA-binding affinity for individual sequence base composition. Gomo's prediction accuracy proves to be relatively insensitive to how promoters are defined. Because Gomo uses a threshold-free form of gene set analysis, there are no free parameters to tune. Biologists can investigate the potential roles of DNA regulatory motifs of interest using Gomo via the web (http://meme.nbcr.net). Supplementary information: Supplementary data are available at Bioinformatics online.
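The abstract above mentions a threshold-free form of gene set analysis but does not name the statistic. As a rough sketch only, the following uses a rank-sum (Mann-Whitney) z-score, which is one standard threshold-free choice and is an assumption here, not a description of Gomo's implementation. The gene names, scores and the functions `average_ranks` and `rank_sum_z` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ascending ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(scores, members):
    """Z-score of the rank-sum statistic for genes annotated with one GO term.

    scores:  dict mapping gene -> motif score for its promoter (hypothetical input)
    members: set of genes annotated with the GO term
    Uses the normal approximation without a tie correction to the variance.
    A positive value means annotated genes tend to score higher; no score
    cutoff is applied at any point.
    """
    genes = list(scores)
    ranks = average_ranks([scores[g] for g in genes])
    n1 = sum(1 for g in genes if g in members)
    n2 = len(genes) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need genes both inside and outside the set")
    r1 = sum(r for g, r in zip(genes, ranks) if g in members)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u1 - mean_u) / sd_u


# Illustrative data: six genes, three of them annotated with the term.
example_scores = {"g1": 5.0, "g2": 4.0, "g3": 3.0, "g4": 2.0, "g5": 1.0, "g6": 0.0}
z_top = rank_sum_z(example_scores, {"g1", "g2", "g3"})
z_bottom = rank_sum_z(example_scores, {"g4", "g5", "g6"})
```

Because every gene contributes through its rank, the only inputs are the scores and the annotation; there is no binding-site threshold to tune.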


Genome Research | 2012

Triplexator: Detecting nucleic acid triple helices in genomic and transcriptomic data

Fabian A. Buske; Denis C. Bauer; John S. Mattick; Timothy L. Bailey

Double-stranded DNA is able to form triple-helical structures by accommodating a third nucleotide strand in its major groove. This sequence-specific process offers a potent mechanism for targeting genomic loci of interest that is of great value for biotechnological and gene-therapeutic applications. It is likely that nature has leveraged this addressing system for gene regulation, because computational studies have uncovered an abundance of putative triplex target sites in various genomes, with enrichment particularly in gene promoters. However, to draw a more complete picture of the in vivo role of triplexes, not only the putative targets but also the sequences acting as the third strand and their capability to pair with the predicted target sites need to be studied. Here we present Triplexator, the first computational framework that integrates all aspects of triplex formation, and showcase its potential by discussing research examples for which the different aspects of triplex formation are important. We find that chromatin-associated RNAs have a significantly higher fraction of sequence features able to form triplexes than expected at random, suggesting their involvement in gene regulation. We furthermore identify hundreds of human genes that contain sequence features in their promoter predicted to be able to form a triplex with a target within the same promoter, suggesting the involvement of triplexes in feedback-based gene regulation. With focus on biotechnological applications, we screen mammalian genomes for high-affinity triplex target sites that can be used to target genomic loci specifically and find that triplex formation offers a resolution of ~1300 nt.


Bioinformatics | 2012

Epigenetic priors for identifying active transcription factor binding sites

Gabriel Cuellar-Partida; Fabian A. Buske; Robert C. McLeay; Tom Whitington; William Stafford Noble; Timothy L. Bailey

Motivation: Accurate knowledge of the genome-wide binding of transcription factors in a particular cell type or under a particular condition is necessary for understanding transcriptional regulation. Using epigenetic data such as histone modification and DNase I accessibility data has been shown to improve motif-based in silico methods for predicting such binding, but this approach has not yet been fully explored. Results: We describe a probabilistic method for combining one or more tracks of epigenetic data with a standard DNA sequence motif model to improve our ability to identify active transcription factor binding sites (TFBSs). We convert each data type into a position-specific probabilistic prior and combine these priors with a traditional probabilistic motif model to compute a log-posterior odds score. Our experiments, using histone modifications H3K4me1, H3K4me3, H3K9ac and H3K27ac, as well as DNase I sensitivity, show conclusively that the log-posterior odds score consistently outperforms a simple binary filter based on the same data. We also show that our approach performs competitively with a more complex method, CENTIPEDE, and suggest that the relative simplicity of the log-posterior odds scoring method makes it an appealing and very general method for identifying functional TFBSs on the basis of DNA and epigenetic evidence. Availability and implementation: FIMO, part of the MEME Suite software toolkit, now supports log-posterior odds scoring using position-specific priors for motif search. A web server and source code are available at http://meme.nbcr.net. Utilities for creating priors are at http://research.imb.uq.edu.au/t.bailey/SD/Cuellar2011. Supplementary information: Supplementary data are available at Bioinformatics online.
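The scoring idea in this abstract can be illustrated with Bayes' rule: posterior odds equal the likelihood ratio times the prior odds, so the log-posterior odds score is the usual motif log-likelihood ratio plus the log of the prior odds at that position. The sketch below is a minimal illustration under that assumption; the motif, background, prior values and function names are hypothetical and are not taken from FIMO.

```python
import math

# Hypothetical 3-column motif: per-position base probabilities (illustrative values).
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
# Assumed uniform zero-order background model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_likelihood_ratio(site, motif=MOTIF, background=BACKGROUND):
    """Log of P(site | motif) / P(site | background) for one candidate site."""
    return sum(math.log(col[base] / background[base]) for col, base in zip(motif, site))


def log_posterior_odds(site, prior, motif=MOTIF, background=BACKGROUND):
    """Log-likelihood ratio plus log prior odds; prior must lie strictly in (0, 1)."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    return log_likelihood_ratio(site, motif, background) + math.log(prior / (1.0 - prior))


def scan(sequence, priors, motif=MOTIF, background=BACKGROUND):
    """Score every window; priors[i] is the prior that a site starts at position i."""
    width = len(motif)
    return [
        (i, log_posterior_odds(sequence[i:i + width], priors[i], motif, background))
        for i in range(len(sequence) - width + 1)
    ]


# Illustrative use: the middle window has both the best sequence match
# and the highest (hypothetical) epigenetic prior.
hits = scan("AAGCA", [0.1, 0.9, 0.1])
best_start = max(hits, key=lambda hit: hit[1])[0]
```

Unlike a binary filter, which would discard every window whose prior falls below a cutoff, this additive form lets a strong sequence match partly offset a weak prior and vice versa.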


Genome Research | 2016

Three-dimensional disorganization of the cancer genome occurs coincident with long-range genetic and epigenetic alterations.

Phillippa C. Taberlay; Joanna Achinger-Kawecka; Aaron T. L. Lun; Fabian A. Buske; Kenneth S. Sabir; Cathryn M. Gould; Elena Zotenko; Saul A. Bert; Katherine A. Giles; Denis C. Bauer; Gordon K. Smyth; Clare Stirzaker; Seán I. O'Donoghue; Susan J. Clark

A three-dimensional chromatin state underpins the structural and functional basis of the genome by bringing regulatory elements and genes into close spatial proximity to ensure proper, cell-type-specific gene expression profiles. Here, we performed Hi-C chromosome conformation capture sequencing to investigate how three-dimensional chromatin organization is disrupted in the context of copy-number variation, long-range epigenetic remodeling, and atypical gene expression programs in prostate cancer. We find that cancer cells retain the ability to segment their genomes into megabase-sized topologically associated domains (TADs); however, these domains are generally smaller due to establishment of additional domain boundaries. Interestingly, a large proportion of the new cancer-specific domain boundaries occur at regions that display copy-number variation. Notably, a common deletion on 17p13.1 in prostate cancer spanning the TP53 tumor suppressor locus results in bifurcation of a single TAD into two distinct smaller TADs. Change in domain structure is also accompanied by novel cancer-specific chromatin interactions within the TADs that are enriched at regulatory elements such as enhancers, promoters, and insulators, and associated with alterations in gene expression. We also show that differential chromatin interactions across regulatory regions occur within long-range epigenetically activated or silenced regions of concordant gene activation or repression in prostate cancer. Finally, we present a novel visualization tool that enables integrated exploration of Hi-C interaction data, the transcriptome, and epigenome. This study provides new insights into the relationship between long-range epigenetic and genomic dysregulation and changes in higher-order chromatin interactions in cancer.


Nature Methods | 2015

Aquaria: simplifying discovery and insight from protein structures

Seán I. O'Donoghue; Kenneth S. Sabir; Maria Kalemanov; Christian Stolte; Benjamin Wellmann; Vivian Ho; Manfred Roos; Nelson Perdigão; Fabian A. Buske; Julian Heinrich; Burkhard Rost; Andrea Schafferhans

To the Editor: Since the discovery of the DNA double helix, biologists have been aware that atomic-scale three-dimensional (3D) structures can provide significant insight. The Protein Data Bank [1] (PDB) contains a wealth of structural information, but few biologists take full advantage of it [2]. Thus, we developed Aquaria (http://aquaria.ws), a publicly available web resource that streamlines and simplifies the process of gleaning insight from protein structures. In contrast to most molecular graphics tools (for example, Astex [3] or Chimera [4]), the user interface of Aquaria is organized primarily by protein sequence, not structure (Fig. 1). A user starts by specifying a protein of interest by name and organism (Supplementary Fig. 1), by identifier or by URL (for example, http://aquaria.ws/P04637); Aquaria then generates a concise visual summary of all related PDB structures (Fig. 1 and Supplementary Methods), using a precalculated all-against-all comparison of Swiss-Prot [5] and PDB [1] sequences (updated monthly). The related structures are grouped first by alignment to the specified sequence and second by oligomeric state. Structures are then ranked, in both groupings, by sequence similarity to the specified protein. Users can quickly review all known structural information for a protein and find the structures most relevant to them (Supplementary Video 1). Initially, 3D structures are colored to highlight amino acid differences from the specified protein sequence, with bright, saturated colors indicating identical residues and with slightly dark and very dark coloring indicating conserved and nonconserved substitutions, respectively (Fig. 1). Aquaria also allows mapping of InterPro [6] and UniProt [5] sequence features (for example, domains, single-nucleotide polymorphisms or posttranslational modifications) onto 3D structures: a simple yet effective way to gain insight into molecular function [2] (Supplementary Figs. 2 and 3).
Aquaria is designed for biologists; its user interface creates clear and useful default views that show only the most relevant structural information tightly integrated with sequence, features and text that provide biological context. Aquaria uses a minimal set of mouse-based controls that are intuitive yet powerful [7]. For example, its “Autofocus” feature allows exploration of large complexes by focusing on one molecule at a time. Aquaria can also be controlled via hand gestures using the Leap Motion [8]. Currently, Aquaria contains 46 million precalculated sequence-to-structure alignments, resulting in at least one matching structure for 87% of Swiss-Prot proteins and a median of 35 structures per protein; this provides a depth of sequence-to-structure information currently not available from other resources.


Intelligent Systems in Molecular Biology | 2011

Sorting the nuclear proteome

Denis C. Bauer; Kai Willadsen; Fabian A. Buske; Kim-Anh Lê Cao; Timothy L. Bailey; Graham Dellaire; Mikael Bodén

Motivation: Quantitative experimental analyses of the nuclear interior reveal a morphologically structured yet dynamic mix of membraneless compartments. Major nuclear events depend on the functional integrity and timely assembly of these intra-nuclear compartments. Yet, unknown drivers of protein mobility ensure that they are in the right place at the time when they are needed. Results: This study investigates determinants of associations between eight intra-nuclear compartments and their proteins in heterogeneous genome-wide data. We develop a model based on a range of candidate determinants, capable of mapping the intra-nuclear organization of proteins. The model integrates protein interactions, protein domains, post-translational modification sites and protein sequence data. The predictions of our model are accurate with a mean AUC (over all compartments) of 0.71. We present a complete map of the association of 3567 mouse nuclear proteins with intra-nuclear compartments. Each decision is explained in terms of essential interactions and domains, and qualified with a false discovery assessment. Using this resource, we uncover the collective role of transcription factors in each of the compartments. We create diagrams illustrating the outcomes of a Gene Ontology enrichment analysis. Associated with an extensive range of transcription factors, the analysis suggests that PML bodies coordinate regulatory immune responses. Supplementary information: Supplementary data are available at Bioinformatics online.


BMC Bioinformatics | 2010

Dual-functioning transcription factors in the developmental gene network of Drosophila melanogaster

Denis C. Bauer; Fabian A. Buske; Timothy L. Bailey

Background: Quantitative models for transcriptional regulation have shown great promise for advancing our understanding of the biological mechanisms underlying gene regulation. However, all of the models to date assume a transcription factor (TF) to have either activating or repressing function towards all the genes it is regulating. Results: In this paper we demonstrate, on the example of the developmental gene network in D. melanogaster, that the data-fit can be improved by up to 40% if the model allows certain TFs to have dual function, that is, acting as activator for some genes and as repressor for others. We demonstrate that the improvement is not due to additional flexibility in the model but rather derived from the data itself. We also found no evidence for the involvement of other known site-specific TFs in regulating this network. Finally, we propose SUMOylation as a candidate biological mechanism allowing TFs to switch their role when a small ubiquitin-like modifier (SUMO) is covalently attached to the TF. We strengthen this hypothesis by demonstrating that the TFs predicted to have dual function also contain the known SUMO consensus motif, while TFs predicted to have only one role lack this motif. Conclusions: We argue that a SUMOylation-dependent mechanism allowing TFs to have dual function represents a promising area for further research and might be another step towards uncovering the biological mechanisms underlying transcriptional regulation.


Bioinformatics | 2014

NGSANE: a lightweight production informatics framework for high-throughput data analysis

Fabian A. Buske; Hugh French; Martin A. Smith; Susan J. Clark; Denis C. Bauer

Summary: The initial steps in the analysis of next-generation sequencing data can be automated by way of software ‘pipelines’. However, individual components depreciate rapidly because of the evolving technology and analysis methods, often rendering entire versions of production informatics pipelines obsolete. Constructing pipelines from Linux bash commands enables the use of hot swappable modular components as opposed to the more rigid program call wrapping by higher level languages, as implemented in comparable published pipelining systems. Here we present Next Generation Sequencing ANalysis for Enterprises (NGSANE), a Linux-based, high-performance-computing-enabled framework that minimizes overhead for set up and processing of new projects, yet maintains full flexibility of custom scripting when processing raw sequence data. Availability and implementation: NGSANE is implemented in bash and publicly available under BSD (3-Clause) licence via GitHub at https://github.com/BauerLab/ngsane. Supplementary information: Supplementary data are available at Bioinformatics online.

Collaboration

Top Co-Authors

Denis C. Bauer
Commonwealth Scientific and Industrial Research Organisation

Mikael Bodén
University of Queensland

John S. Mattick
Garvan Institute of Medical Research

Susan J. Clark
Garvan Institute of Medical Research

Cathryn M. Gould
Peter MacCallum Cancer Centre

Clare Stirzaker
Garvan Institute of Medical Research

Kenneth S. Sabir
Garvan Institute of Medical Research

Seán I. O'Donoghue
Commonwealth Scientific and Industrial Research Organisation