Publication


Featured research published by Carol Lushbough.


Nucleic Acids Research | 2007

PlantGDB: a resource for comparative plant genomics

Jon Duvick; Ann Fu; Usha K. Muppirala; Mukul Sabharwal; Matthew D. Wilkerson; Carolyn J. Lawrence; Carol Lushbough; Volker Brendel

PlantGDB (http://www.plantgdb.org/) is a genomics database encompassing sequence data for green plants (Viridiplantae). PlantGDB provides annotated transcript assemblies for >100 plant species, with transcripts mapped to their cognate genomic context where available, integrated with a variety of sequence analysis tools and web services. For 14 plant species with emerging or complete genome sequence, PlantGDB's genome browsers (xGDB) serve as a graphical interface for viewing, evaluating and annotating transcript and protein alignments to chromosome or bacterial artificial chromosome (BAC)-based genome assemblies. Annotation is facilitated by the integrated yrGATE module for community curation of gene models. Novel web services at PlantGDB include Tracembler, an iterative alignment tool that generates contigs from GenBank trace file data, and BioExtract Server, a web-based server for executing custom sequence analysis workflows. PlantGDB also hosts a plant genomics research outreach portal (PGROP) that facilitates access to a large number of resources for research and training.


Plant Physiology | 2005

Comparative Plant Genomics Resources at PlantGDB

Qunfeng Dong; Carolyn J. Lawrence; Shannon D. Schlueter; Matthew D. Wilkerson; Stefan Kurtz; Carol Lushbough; Volker Brendel

PlantGDB (http://www.plantgdb.org/) is a database of plant molecular sequences. Expressed sequence tag (EST) sequences are assembled into contigs that represent tentative unique genes. EST contigs are functionally annotated with information derived from known protein sequences that are highly similar to the putative translation products. Tentative Gene Ontology terms are assigned to match those of the similar sequences identified. Genome survey sequences are assembled similarly. The resulting genome survey sequence contigs are matched to ESTs and conserved protein homologs to identify putative full-length open reading frame-containing genes, which are subsequently provisionally classified according to established gene family designations. For Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa), the exon-intron boundaries for gene structures are annotated by spliced alignment of ESTs and full-length cDNAs to their respective complete genome sequences. Unique genome browsers have been developed to present all available EST and cDNA evidence for current transcript models (for Arabidopsis, see the AtGDB site at http://www.plantgdb.org/AtGDB/; for rice, see the OsGDB site at http://www.plantgdb.org/OsGDB/). In addition, a number of bioinformatic tools have been integrated at PlantGDB that enable researchers to carry out sequence analyses on-site using both their own data and data residing within the database.


Nucleic Acids Research | 2011

The BioExtract Server: a web-based bioinformatic workflow platform

Carol Lushbough; Douglas Jennewein; Volker Brendel

The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet.
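The record-then-replay idea described in this abstract can be sketched in a few lines. This is a minimal illustration only, not the BioExtract Server's implementation: the class `RecordedWorkflow`, its methods, the toy tasks, and the assumption that steps chain linearly (each output feeding the next step) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One recorded task: a label, the callable, and the parameters used."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any]


@dataclass
class RecordedWorkflow:
    """Hypothetical recorder: runs tasks immediately and remembers them."""
    steps: List[Step] = field(default_factory=list)

    def run_and_record(self, name, func, data, **params):
        """Execute a task now and store it for later replay."""
        self.steps.append(Step(name, func, dict(params)))
        return func(data, **params)

    def replay(self, data, overrides=None):
        """Re-run the recorded steps in order, feeding each output forward.

        `overrides` maps a step name to parameters that replace the
        recorded ones, mirroring "modified inputs and parameter settings".
        """
        overrides = overrides or {}
        for step in self.steps:
            params = {**step.params, **overrides.get(step.name, {})}
            data = step.func(data, **params)
        return data


# Toy tasks standing in for "query a data source" and "run an analytic tool".
def filter_by_length(seqs, min_len):
    return [s for s in seqs if len(s) >= min_len]


def reverse_all(seqs):
    return [s[::-1] for s in seqs]


workflow = RecordedWorkflow()
sequences = ["ACGT", "AC", "ACGTAC"]
kept = workflow.run_and_record("filter", filter_by_length, sequences, min_len=4)
result = workflow.run_and_record("reverse", reverse_all, kept)

# Replay the same workflow with a modified parameter setting.
rerun = workflow.replay(sequences, overrides={"filter": {"min_len": 2}})
```

Storing the parameters alongside each callable is what makes the saved series reproducible: the same steps can be re-executed unchanged, or with selected settings overridden.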


International Journal of Plant Genomics | 2011

POPcorn: An Online Resource Providing Access to Distributed and Diverse Maize Project Data

Ethalinda K. S. Cannon; Scott M. Birkett; Bremen L. Braun; Sateesh Kumar Kodavali; Douglas Jennewein; Alper Yilmaz; Valentin Antonescu; Corina Antonescu; Lisa C. Harper; Jack M. Gardiner; Mary L. Schaeffer; Darwin A. Campbell; Carson M. Andorf; Destri Andorf; Damon Lisch; Karen E. Koch; Donald R. McCarty; John Quackenbush; Erich Grotewold; Carol Lushbough; Taner Z. Sen; Carolyn J. Lawrence

The purpose of the online resource presented here, POPcorn (Project Portal for corn), is to enhance accessibility of maize genetic and genomic resources for plant biologists. Currently, many online locations are difficult to find, some are best searched independently, and individual project websites often degrade over time, sometimes disappearing entirely. The POPcorn site makes available (1) a centralized, web-accessible resource to search and browse descriptions of ongoing maize genomics projects, (2) a single, stand-alone tool that uses web services and minimal data warehousing to search for sequence matches in online resources of diverse offsite projects, and (3) a set of tools that enables researchers to migrate their data to the long-term model organism database for maize genetic and genomic information: MaizeGDB. Examples demonstrating POPcorn's utility are provided herein.


Concurrency and Computation: Practice and Experience | 2015

Life science data analysis workflow development using the BioExtract Server leveraging the iPlant Collaborative cyberinfrastructure

Carol Lushbough; Etienne Z. Gnimpieba; Rion Dooley

In order to handle the vast quantities of biological data generated by high-throughput experimental technologies, the BioExtract Server (bioextract.org) has leveraged iPlant Collaborative (www.iplantcollaborative.org) functionality to help address big data storage and analysis issues in the bioinformatics field. The BioExtract Server is a Web-based, workflow-enabling system that offers researchers a flexible environment for analyzing genomic data. It provides researchers with the ability to save a series of BioExtract Server tasks (e.g., query a data source, save a data extract, and execute an analytic tool) as a workflow and the opportunity for researchers to share their data extracts, analytic tools, and workflows with collaborators. The iPlant Collaborative is a community of researchers, educators, and students working to enrich science through the development of cyberinfrastructure (the physical computing resources, collaborative environment, virtual machine resources, and interoperable analysis software and data services) that are essential components of modern biology. The iPlant AGAVE Advanced Programming Interface, developed through the iPlant Collaborative, is a hosted, Software-as-a-Service resource providing access to a collection of high performance computing and cloud resources. Leveraging AGAVE, the BioExtract Server gives researchers easy access to multiple high performance computers and delivers computation and storage as dynamically allocated resources via the Internet.


International Conference on Bioinformatics | 2014

Automatic biosystems comparison using semantic and name similarity

Mathialakan Thavappiragasam; Carol Lushbough; Etienne Z. Gnimpieba

With the growth of bio-systems model development, automatic approaches are needed to support systems biologists in model similarity evaluation. Several algorithms have been proposed, but they lack efficiency. We have developed an efficient, intuitive approach using name and semantic similarity checking. Individual components in two given SBML models are compared by their names using ParaABioS (a heuristic Parallelizable Algorithm for Similarity Based Biosystems Comparison) and by their meaning using annotated URIs (Uniform Resource Identifiers). We developed a tool, SBMLcompare, an implementation of this approach for automatic bio-systems model comparison in SBML format. This implementation has been embedded into a web portal for small biosystems comparison and also integrated into the BioExtract Server (bioextract.org) so that it can be used within workflows designed to address eScience challenges. SBMLcompare has been successfully used on FOCM (Folate One Carbon Metabolite) models and two genome-scale yeast metabolic models, iND750 and iFF708. The similarity result showed a significant improvement compared to existing related work (over 10%).


International Conference on Bioinformatics | 2014

Heuristic parallelizable algorithm for similarity based biosystems comparison

Mathialakan Thavappiragasam; Carol Lushbough; Etienne Z. Gnimpieba

Biosystem comparison plays a major role in systems biology. Similar biosystems are identified based on the similarity of species naming. Since species naming does not follow a standard nomenclature, similarity is not easy to formalize. A single metabolite can have different name strings that vary slightly in pattern. Several algorithms have been designed to find similarity between two species using different measures. However, these algorithms failed to achieve good performance in biological species similarity checking because they do not account for important facts about biological name analysis. We developed ParaABioS, a heuristic, intuitive algorithm for biosystem similarity evaluation. This algorithm integrates sub-name analysis and a symbol management strategy that conserves species name specifications. ParaABioS checks the similarity between two names in O(k!) time in the worst case, where k is the number of sub-names. It is implemented in Java, parallelized on a Texas Advanced Computing Center (TACC) high performance computing (HPC) server, and accessed through the iPlant Collaborative Foundation API in order to compare large models. It is available online and also through the BioExtract Server workflow management system (WMS).
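The factorial worst case mentioned in this abstract can be illustrated with a minimal sketch. This is not ParaABioS itself (which is implemented in Java and not reproduced here): the splitting rule, the scoring with `difflib.SequenceMatcher`, and the function names `split_subnames` and `name_similarity` are assumptions made for illustration. The sketch only shows why trying every ordering of k sub-names costs up to k! comparisons.

```python
import re
from difflib import SequenceMatcher
from itertools import permutations


def split_subnames(name):
    """Split a species name into lowercase sub-names on non-alphanumeric symbols.

    Hypothetical rule: the actual symbol management strategy is not
    described in enough detail here to reproduce it.
    """
    return [part for part in re.split(r"[^0-9a-z]+", name.lower()) if part]


def name_similarity(a, b):
    """Best similarity ratio between b and any reordering of a's sub-names.

    All k! orderings of the k sub-names of `a` may be enumerated, which is
    where a factorial worst case comes from.
    """
    parts_a = split_subnames(a)
    target = " ".join(split_subnames(b))
    if not parts_a or not target:
        return 0.0
    best = 0.0
    for ordering in permutations(parts_a):
        candidate = " ".join(ordering)
        best = max(best, SequenceMatcher(None, candidate, target).ratio())
        if best == 1.0:  # exact match found, no need to try more orderings
            break
    return best


# Two spellings of the same metabolite that differ in sub-name order and symbols.
score = name_similarity("5,10-methylene-THF", "THF 5,10 methylene")
```

Because metabolite names rarely have more than a handful of sub-names, k stays small in practice, which is what makes an exhaustive ordering search tolerable in a sketch like this.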


International Conference on Bioinformatics | 2014

RNA-seq gene and transcript expression analysis using the BioExtract Server and iPlant Collaborative

Etienne Z. Gnimpieba; Abalo Chango; Carol Lushbough

Background: The development of Next Generation Sequencing (NGS) technology provides great opportunities to study gene expression, gene spliced transcripts, post-transcriptional changes, and gene fusion mutations/SNPs. The large amount of data being generated from these approaches presents many challenges. For example, how can we manage and analyze these vast datasets in order to extract new knowledge? Aims: This paper provides an integrated, adaptable, and scalable scenario to guide researchers through a complex data analysis process using the iPlant Collaborative AGAVE RESTful API through the BioExtract Server. In three modules, we show how a High Performance Cluster (HPC) can be leveraged in a Workflow Management System (WMS) by following simple analytic steps. Results: A workflow has been developed in the BioExtract Server to analyze RNA-Seq data. Running this workflow on a 21.6GB dataset provides reliable gene and transcript expression results. Compared with an existing manual workflow on the same dataset, the BioExtract Server workflow shows an ≈800% improvement in execution time (from ≈18h to ≈2h10min). Additionally, there are several qualitative improvements, such as automation, reproducibility, sharability, and scalability. (Note: the performance was not compared to the workflow installed at Galaxy, https://usegalaxy.org/, due to extensive wait times on their public site.) Our workflow execution provides analysis results from input datasets and shows, at a 0.05 false discovery rate (FDR), that 342 genes, 228 isoforms, 270 TSS, 47 CDS and 23 promoters are significantly differentially expressed. Conclusion: Having the ability to easily create and execute workflows leveraging the robust iPlant cyberinfrastructure to analyze NGS data represents one more step in improving eScience initiatives. It considerably improves the ability of life science researchers to apply NGS tools. However, enhancements to this approach remain important as everyday improvements in HPC and WMS technology, techniques, and software continue. Our coming challenge will be to follow that evolution in order to minimize the gap between researchers and these powerful resources. Availability: Tools used here are freely available at the referenced links. Additional data analysis from our workflow execution is available on request. Our workflow is available on MyExperiment under a Creative Commons (CC) license (http://www.myexperiment.org/workflows/3895.html?version=1).
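The execution-time figures quoted in this abstract can be checked with a short calculation. The sketch below uses only the two approximate durations given above (≈18h and ≈2h10min); the function name `speedup_summary` is a hypothetical helper, and reading "≈800%" as a ratio of roughly eight times is an interpretation, not a statement from the paper.

```python
def speedup_summary(before_minutes, after_minutes):
    """Return (speedup factor, fraction of time saved) for two durations."""
    speedup = before_minutes / after_minutes
    time_saved = 1 - after_minutes / before_minutes
    return speedup, time_saved


manual_minutes = 18 * 60         # about 18 h for the manual workflow
workflow_minutes = 2 * 60 + 10   # about 2 h 10 min on the BioExtract Server

speedup, time_saved = speedup_summary(manual_minutes, workflow_minutes)
print(f"speedup: {speedup:.1f}x, time saved: {time_saved:.0%}")
```

With these inputs the workflow is a little over eight times faster, which corresponds to roughly 88% less wall-clock time.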


International Conference on Cluster Computing | 2013

BioExtract Server, a Web-based workflow enabling system, leveraging iPlant Collaborative resources

Carol Lushbough; Etienne Z. Gnimpieba; Rion Dooley

In order to handle the vast quantities of biological data generated by high-throughput experimental technologies, the BioExtract Server (bioextract.org) has leveraged iPlant Collaborative (www.iplantcollaborative.org) functionality to help address big data storage and analysis issues in the bioinformatics field. The BioExtract Server is a Web-based, workflow-enabling system that offers researchers a flexible environment for analyzing genomic data. It provides researchers with the ability to save a series of BioExtract Server tasks (e.g. query a data source, save a data extract, and execute an analytic tool) as a workflow and the opportunity for researchers to share their data extracts, analytic tools and workflows with collaborators. The iPlant Collaborative is a community of researchers, educators, and students working to enrich science through the development of cyberinfrastructure - the physical computing resources, collaborative environment, virtual machine resources, and interoperable analysis software and data services - that are essential components of modern biology. The iPlant Agave API (Agave), developed through the iPlant Collaborative, is a hosted, Software-as-a-Service resource providing access to a collection of High Performance Computing (HPC) and Cloud resources [6]. Leveraging Agave, the BioExtract Server gives researchers easy access to multiple high performance computers and delivers computation and storage as dynamically allocated resources via the Internet.


Nucleic Acids Research | 2017

Bio-TDS: bioscience query tool discovery system

Etienne Z. Gnimpieba; Menno S. VanDiermen; Shayla M. Gustafson; Bill Conn; Carol Lushbough

Bioinformatics and computational biology play a critical role in bioscience and biomedical research. As researchers design their experimental projects, one major challenge is to find the most relevant bioinformatics toolkits that will lead to new knowledge discovery from their data. The Bio-TDS (Bioscience Query Tool Discovery Systems, http://biotds.org/) has been developed to assist researchers in retrieving the most applicable analytic tools by allowing them to formulate their questions as free text. The Bio-TDS is a flexible retrieval system that affords users from multiple bioscience domains (e.g. genomic, proteomic, bio-imaging) the ability to query over 15 000 analytic tool descriptions integrated from well-established, community repositories. One of the primary components of the Bio-TDS is the ontology and natural language processing workflow for annotation, curation, query processing, and evaluation. The Bio-TDS’s scientific impact was evaluated using sample questions posed by researchers retrieved from Biostars, a site focusing on biological data analysis. The Bio-TDS was compared to five similar bioscience analytic tool retrieval systems, with the Bio-TDS outperforming the others in terms of relevance and completeness. The Bio-TDS offers researchers the capacity to associate their bioscience question with the most relevant computational toolsets required for the data analysis in their knowledge discovery process.

Collaboration


Dive into Carol Lushbough's collaborations.

Top Co-Authors

Douglas Jennewein

University of South Dakota

Doreen Ware

Cold Spring Harbor Laboratory

Liya Wang

Cold Spring Harbor Laboratory

Bill Conn

University of South Dakota

Joe Reynoldson

University of South Dakota
