Publication


Featured research published by Ben Busby.


Environmental Microbiology | 2013

Contribution of phage‐derived genomic islands to the virulence of facultative bacterial pathogens

Ben Busby; David M. Kristensen; Eugene V. Koonin

Facultative pathogens have extremely dynamic pan-genomes, to a large extent derived from bacteriophages and other mobile elements. We developed a simple approach to identify phage-derived genomic islands and apply it to show that pathogens from diverse bacterial genera are significantly enriched in clustered phage-derived genes compared with related benign strains. These findings show that genome expansion by integration of prophages containing virulence factors is a major route of evolution of facultative bacterial pathogens.
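The abstract does not spell out the detection method, so the following is only a minimal sketch of the general idea of finding clustered phage-derived genes. It assumes genes are listed in chromosomal order with a boolean flag marking phage-like hits. The function name `find_phage_clusters` and the `min_size` threshold are illustrative assumptions, not the authors' parameters.

```python
from typing import List, Tuple


def find_phage_clusters(is_phage: List[bool], min_size: int = 3) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs for runs of consecutive
    phage-flagged genes containing at least `min_size` genes.

    Hypothetical helper: the flags and the threshold are assumptions
    made for illustration only.
    """
    clusters = []
    run_start = None  # index where the current run of flagged genes began
    for i, flag in enumerate(is_phage):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start >= min_size:
                clusters.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the last gene.
    if run_start is not None and len(is_phage) - run_start >= min_size:
        clusters.append((run_start, len(is_phage) - 1))
    return clusters
```

A per-genome count of such clusters could then be compared between pathogenic and benign strains, which is the kind of enrichment comparison the abstract reports.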


Genetics | 2014

Effect of Domestication on the Spread of the [PIN+] Prion in Saccharomyces cerevisiae

Amy C. Kelly; Ben Busby; Reed B. Wickner

Prions (infectious proteins) cause fatal neurodegenerative diseases in mammals. In the yeast Saccharomyces cerevisiae, many toxic and lethal variants of the [PSI+] and [URE3] prions have been identified in laboratory strains, although some commonly studied variants do not seem to impair cell growth. Phylogenetic analysis has revealed four major clades of S. cerevisiae that share histories of two prion proteins and largely correspond to different ecological niches of yeast. The [PIN+] prion was most prevalent in commercialized niches, infrequent among wine/vineyard strains, and not observed in ancestral isolates. As previously reported, the [PSI+] and [URE3] prions are not found in any of these strains. Patterns of heterozygosity revealed genetic mosaicism and indicated extensive outcrossing among divergent strains in commercialized environments. In contrast, ancestral isolates were all homozygous and wine/vineyard strains were closely related to each other and largely homozygous. Cellular growth patterns were highly variable within and among clades, although ancestral isolates were the most efficient sporulators and domesticated strains showed greater tendencies for flocculation. [PIN+]-infected strains had a significantly higher likelihood of polyploidy, showed a higher propensity for flocculation compared to uninfected strains, and had higher sporulation efficiencies compared to domesticated, uninfected strains. Extensive phenotypic variability among strains from different environments suggests that S. cerevisiae is a niche generalist and that most wild strains are able to switch from asexual to sexual and from unicellular to multicellular growth in response to environmental conditions. Our data suggest that outbreeding and multicellular growth patterns adapted for domesticated environments are ecological risk factors for the [PIN+] prion in wild yeast.


F1000Research | 2016

MetaNetVar: Pipeline for applying network analysis tools for genomic variants analysis

Eric Moyer; Megan H. Hagenauer; Matthew Lesko; Felix Francis; Oscar Rodriguez; Vijayaraj Nagarajan; Vojtech Huser; Ben Busby

Network analysis can make variant analysis better. There are existing tools like HotNet2 and dmGWAS that can provide various analytical methods. We developed a prototype of a pipeline called MetaNetVar that allows execution of multiple tools. The code is published at https://github.com/NCBI-Hackathons/Network_SNPs. A working prototype is published as an Amazon Machine Image (ami-4510312f).


F1000Research | 2016

Closing gaps between open software and public data in a hackathon setting: User-centered software prototyping

Ben Busby; Matthew Lesko; August and January Hackathon participants; Lisa Federer

In genomics, bioinformatics and other areas of data science, gaps exist between extant public datasets and the open-source software tools built by the community to analyze similar data types. The purpose of biological data science hackathons is to assemble groups of genomics or bioinformatics professionals and software developers to rapidly prototype software to address these gaps. The only two rules for the NCBI-assisted hackathons run so far are that 1) data either must be housed in public data repositories or be deposited to such repositories shortly after the hackathon’s conclusion, and 2) all software comprising the final pipeline must be open-source or open-use. Proposed topics, as well as suggested tools and approaches, are distributed to participants at the beginning of each hackathon and refined during the event. Software, scripts, and pipelines are developed and published on GitHub, a web service providing publicly available, free-usage tiers for collaborative software development. The code resulting from each hackathon is published at https://github.com/NCBI-Hackathons/ with separate directories or repositories for each team.


BMC Cancer | 2016

Mitogen-activated protein kinase signaling causes malignant melanoma cells to differentially alter extracellular matrix biosynthesis to promote cell survival

Anna Afasizheva; Alexus Devine; Heather Tillman; King Leung Fung; Wilfred D. Vieira; Benjamin H. Blehm; Yorihisa Kotobuki; Ben Busby; Emily I. Chen; Kandice Tanner

Background: Intrinsic and acquired resistance to drug therapies remains a challenge for malignant melanoma patients. Intratumoral heterogeneities within the tumor microenvironment contribute additional complexity to the determinants of drug efficacy and acquired resistance. Methods: We use 3D biomimetic platforms to understand dynamics in extracellular matrix (ECM) biogenesis following pharmaceutical intervention against mitogen-activated protein kinase (MAPK) signaling. We further determined temporal evolution of secreted ECM components by isogenic melanoma cell clones. Results: We found that the cell clones differentially secrete and assemble a myriad of ECM molecules into dense fibrillar and globular networks. We show that cells can modulate their ECM biosynthesis in response to external insults. Fibronectin (FN) is one of the key architectural components, modulating the efficacy of a broad spectrum of drug therapies. Stable cell lines engineered to secrete minimal levels of FN showed a concomitant increase in secretion of Tenascin-C and became sensitive to BRAFV600E and ERK inhibition as clonally derived 3D tumor aggregates. These cells failed to assemble exogenous FN despite maintaining the integrin machinery to facilitate cell-ECM cross-talk. We determined that only clones that increased FN production via p38 MAPK and β1 integrin survived drug treatment. Conclusions: These data suggest that tumor cells engineer drug resistance by altering their ECM biosynthesis. Therefore, drug treatment may induce ECM biosynthesis, contributing to de novo resistance.


bioRxiv | 2015

Building Genomic Analysis Pipelines in a Hackathon Setting with Bioinformatician Teams: DNA-seq, Epigenomics, Metagenomics and RNA-seq

Ben Busby; Allissa Dillman; Claire L. Simpson; Ian Fingerman; Sijung Yun; David M. Kristensen; Lisa Federer; Naisha Shah; Matthew C. LaFave; Laura Jimenez-Barron; Manusha Pande; Wen Luo; Brendan Miller; Cem Mayden; Dhruva Chandramohan; Kipper Fletez-Brant; Paul W. Bible; Sergej Nowoshilow; Alfred Chan; Eric Jc Galvez; Jeremy F. Chignell; Joseph N. Paulson; Manoj Kandpal; Suhyeon Yoon; Esther Asaki; Abhinav Nellore; Adam Stine; Robert D. Sanders; Jesse Becker; Matt Lesko

We assembled teams of genomics professionals to assess whether we could rapidly develop pipelines to answer biological questions commonly asked by biologists and others new to bioinformatics by facilitating analysis of high-throughput sequencing data. In January 2015, teams were assembled on the National Institutes of Health (NIH) campus to address questions in the DNA-seq, epigenomics, metagenomics and RNA-seq subfields of genomics. The only two rules for this hackathon were that either the data used were housed at the National Center for Biotechnology Information (NCBI) or would be submitted there by a participant in the next six months, and that all software going into the pipeline was open-source or open-use. Questions proposed by organizers, as well as suggested tools and approaches, were distributed to participants a few days before the event and were refined during the event. Pipelines were published on GitHub, a web service providing publicly available, free-usage tiers for collaborative software development (https://github.com/features/). The code was published at https://github.com/DCGenomics/ with separate repositories for each team, starting with hackathon_v001.


F1000Research | 2017

DangerTrack: A scoring system to detect difficult-to-assess regions

Igor Dolgalev; Fritz J. Sedlazeck; Ben Busby

Over recent years, multiple groups have shown that a large number of structural variants, repeats, or problems with the underlying genome assembly have dramatic effects on the mapping, calling, and overall reliability of single nucleotide polymorphism calls. This project endeavored to develop an easy-to-use track for looking at structural variant and repeat regions. This track, DangerTrack, can be displayed alongside the existing Genome Reference Consortium assembly tracks to warn clinicians and biologists when variants of interest may be incorrectly called, of dubious quality, or on an insertion or copy number expansion. While mapping and variant calling can be automated, it is our opinion that when these regions are of interest to a particular clinical or research group, they warrant a careful examination, potentially involving localized reassembly. DangerTrack is available at https://github.com/DCGenomics/DangerTrack.
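The abstract does not describe how the DangerTrack score is computed, so the following is purely an illustrative sketch of one way a per-region "difficulty" track could be built. It assumes problematic features (for example structural variant or repeat intervals) are given as half-open `(start, end)` coordinates on one chromosome, and it scales per-window overlap counts to 0-100. The function name `window_scores`, the 5000 bp window, and the scaling are all assumptions, not the published method.

```python
from typing import List, Tuple


def window_scores(
    features: List[Tuple[int, int]],
    chrom_length: int,
    window: int = 5000,
) -> List[float]:
    """Score fixed-size windows by the number of overlapping features,
    scaled so the busiest window gets 100.

    Hypothetical sketch: intervals are half-open [start, end) with
    0 <= start < end; the window size and scaling are assumptions.
    """
    n_windows = (chrom_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for start, end in features:
        first = start // window
        last = min((end - 1) // window, n_windows - 1)
        for w in range(first, last + 1):
            counts[w] += 1
    peak = max(counts, default=0)
    if peak == 0:
        return [0.0] * n_windows
    return [100.0 * c / peak for c in counts]
```

Windows with high scores would be the ones flagged for closer manual inspection, in the spirit of the warning track described above.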


F1000Research | 2017

PubRunner: A light-weight framework for updating text mining results

Kishore R. Anekalla; Jean-Paul Courneya; Nicolas Fiorini; Jake Lever; Michael Muchow; Ben Busby

Biomedical text mining promises to assist biologists in quickly navigating the combined knowledge in their domain. This would allow improved understanding of the complex interactions within biological systems and faster hypothesis generation. New biomedical research articles are published daily and text mining tools are only as good as the corpus from which they work. Many text mining tools are underused because their results are static and do not reflect the constantly expanding knowledge in the field. In order for biomedical text mining to become an indispensable tool used by researchers, this problem must be addressed. To this end, we present PubRunner, a framework for regularly running text mining tools on the latest publications. PubRunner is lightweight, simple to use, and can be integrated with an existing text mining tool. The workflow involves downloading the latest abstracts from PubMed, executing a user-defined tool, pushing the resulting data to a public FTP, and publicizing the location of these results on the public PubRunner website. This shows a proof of concept that we hope will encourage text mining developers to build tools that truly will aid biologists in exploring the latest publications.
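The four-step workflow described above (download, run the tool, push results, publicize the location) can be sketched as a small driver function. This is a minimal sketch, not PubRunner's actual code or API: the function `run_update_cycle` and its four callable parameters are hypothetical names, and each step is passed in as a plain Python callable so the sketch stays independent of any real PubMed or FTP client.

```python
from typing import Callable, List


def run_update_cycle(
    fetch_abstracts: Callable[[], List[str]],
    run_tool: Callable[[List[str]], List[str]],
    upload: Callable[[List[str]], str],
    announce: Callable[[str], None],
) -> str:
    """Run one update cycle and return the location of the results.

    All four callables are hypothetical stand-ins for the steps named
    in the abstract; none of them is a real PubRunner interface.
    """
    abstracts = fetch_abstracts()   # step 1: get the latest abstracts
    results = run_tool(abstracts)   # step 2: execute the user-defined tool
    location = upload(results)      # step 3: push results to public storage
    announce(location)              # step 4: publicize where the results live
    return location
```

Scheduling this function to run at a fixed interval (for example with a cron job) would give the regular refresh of results that the abstract argues static text mining tools lack.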


bioRxiv | 2018

Magic-BLAST, an accurate DNA and RNA-seq aligner for long and short reads

Grzegorz M Boratyn; Jean Thierry-Mieg; Danielle Thierry-Mieg; Ben Busby; Thomas L. Madden

Next-generation sequencing technologies can produce tens of millions of reads, often paired-end, from transcripts or genomes. But few programs can align RNA on the genome and accurately discover introns, especially with long reads. To address these issues, we introduce Magic-BLAST, a new aligner based on ideas from the Magic pipeline. It uses innovative techniques that include the optimization of a spliced alignment score and selective masking during seed selection. We evaluate the performance of Magic-BLAST to accurately map short or long sequences and its ability to discover introns on real RNA-seq data sets from PacBio, Roche and Illumina runs, and on six benchmarks, and compare it to other popular aligners. Additionally, we look at alignments of human idealized RefSeq mRNA sequences perfectly matching the genome. We show that Magic-BLAST is the best at intron discovery over a wide range of conditions. It is versatile and robust to high levels of mismatches or extreme base composition and works well with very long reads. It is reasonably fast. It can align reads to a BLAST database or a FASTA file. It can accept a FASTQ file as input or automatically retrieve an accession from the SRA repository at the NCBI.


bioRxiv | 2018

GeneHummus: A pipeline to define gene families and their expression in legumes and beyond

Jose V. Die; Moamen Elmassry; Kimberly H. LeBlanc; Olaitan Awe; Allissa Dillman; Ben Busby

During the last decade, plant biotechnological laboratories have sparked a monumental revolution with the rapid development of next-generation sequencing technologies at affordable prices. Soon, these sequencing technologies and assembling of whole genomes will extend beyond the plant computational biologists and become commonplace within the plant biology disciplines. The current availability of large-scale genomic resources for non-traditional plant model systems (the so-called ‘orphan crops’) is enabling the construction of high-density integrated physical and genetic linkage maps with potential applications in plant breeding. The newly available fully sequenced plant genomes represent an incredible opportunity for comparative analyses that may reveal new aspects of genome biology and evolution. Analysis of the expansion and evolution of gene families across species is a common approach to infer biological functions. To date, the extent and role of gene families in plants have only been partially addressed and many gene families remain to be investigated. Manual identification of gene families is highly time-consuming and laborious, requiring an iterative process of manual and computational analysis to identify members of a given family, typically combining numerous BLAST searches and manually cleaning data. Due to the increasing abundance of genome sequences and the agronomical interest in plant gene families, the field needs a clear, automated annotation tool. Here, we present the GeneHummus pipeline, a step-by-step R-based pipeline for the identification, characterization and expression analysis of plant gene families. The impact of this pipeline comes from a reduction in hands-on annotation time combined with high specificity and sensitivity in extracting only proteins from the RefSeq database and providing the conserved domain architectures based on SPARCLE.
As a case study we focused on the auxin response factor (ARF) gene family in Cicer arietinum (chickpea) and other legumes. We anticipate that our pipeline should be suitable for any plant gene family, and likely other gene families, vastly improving the speed and ease of genomic data processing.

Collaboration

Top Co-Authors

Allissa Dillman (Uniformed Services University of the Health Sciences)
Lisa Federer (National Institutes of Health)
Satyajeet Raje (National Institutes of Health)
Claire L. Simpson (National Institutes of Health)
David M. Kristensen (National Institutes of Health)
Laura Jimenez-Barron (Cold Spring Harbor Laboratory)
Liz Amos (National Institutes of Health)
Matthew C. LaFave (National Institutes of Health)
Matthew Lesko (National Institutes of Health)