Publication


Featured research published by Doreen Main.


Genome Biology | 2014

Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies

David B. Neale; Jill L. Wegrzyn; Kristian A. Stevens; Aleksey V. Zimin; Daniela Puiu; Marc W. Crepeau; Charis Cardeno; Maxim Koriabine; Ann Holtz-Morris; John D. Liechty; Pedro J. Martínez-García; Hans A. Vasquez-Gross; Brian Y. Lin; Jacob J. Zieve; William M. Dougherty; Sara Fuentes-Soriano; Le Shin Wu; Don Gilbert; Guillaume Marçais; Michael Roberts; Carson Holt; Mark Yandell; John M. Davis; Katherine E. Smith; Jeffrey F. D. Dean; W. Walter Lorenz; Ross W. Whetten; Ronald R. Sederoff; Nicholas Wheeler; Patrick E. McGuire

Background: The size and complexity of conifer genomes has, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly was used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.


Genetics | 2014

Unique Features of the Loblolly Pine (Pinus taeda L.) Megagenome Revealed Through Sequence Annotation

Jill L. Wegrzyn; John D. Liechty; Kristian A. Stevens; Le Shin Wu; Carol A. Loopstra; Hans A. Vasquez-Gross; William M. Dougherty; Brian Y. Lin; Jacob J. Zieve; Pedro J. Martínez-García; Carson Holt; Mark Yandell; Aleksey V. Zimin; James A. Yorke; Marc W. Crepeau; Daniela Puiu; Pieter J. de Jong; Keithanne Mockaitis; Doreen Main; Charles H. Langley; David B. Neale

The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20–40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.


Database | 2011

Tripal: a construction toolkit for online genome databases

Stephen P. Ficklin; Lacey-Anne Sanderson; Chun-Huai Cheng; Margaret Staton; Taein Lee; Il-Hyung Cho; Sook Jung; Kirstin E. Bett; Doreen Main

As the availability, affordability and magnitude of genomics and genetics research increases so does the need to provide online access to resulting data and analyses. Availability of a tailored online database is the desire for many investigators or research communities; however, managing the Information Technology infrastructure needed to create such a database can be an undesired distraction from primary research or potentially cost prohibitive. Tripal provides simplified site development by merging the power of Drupal, a popular web Content Management System with that of Chado, a community-derived database schema for storage of genomic, genetic and other related biological data. Tripal provides an interface that extends the content management features of Drupal to the data housed in Chado. Furthermore, Tripal provides a web-based Chado installer, genomic data loaders, web-based editing of data for organisms, genomic features, biological libraries, controlled vocabularies and stock collections. Also available are Tripal extensions that support loading and visualizations of NCBI BLAST, InterPro, Kyoto Encyclopedia of Genes and Genomes and Gene Ontology analyses, as well as an extension that provides integration of Tripal with GBrowse, a popular GMOD tool. An Application Programming Interface is available to allow creation of custom extensions by site developers, and the look-and-feel of the site is completely customizable through Drupal-based PHP template files. Addition of non-biological content and user-management is afforded through Drupal. Tripal is an open source and freely available software package found at http://tripal.sourceforge.net


Standards in Genomic Sciences | 2011

Complete genome of the onion pathogen Enterobacter cloacae EcWSU1

Jodi L. Humann; Mark R. Wildung; Chun-Huai Cheng; Taein Lee; Jane E. Stewart; Jennifer C. Drew; Eric W. Triplett; Doreen Main; Brenda K. Schroeder

Previous studies have shown that the members of the Enterobacter cloacae complex are difficult to differentiate with biochemical tests and in phylogenetic studies using multilocus sequence analysis, strains of the same species separate into numerous clusters. There are only a few complete E. cloacae genome sequences and very little knowledge about the mechanism of pathogenesis of E. cloacae on plants and humans. Enterobacter cloacae EcWSU1 causes Enterobacter bulb decay in stored onions (Allium cepa). The EcWSU1 genome consists of a 4,734,438 bp chromosome and a mega-plasmid of 63,653 bp. The chromosome has 4,632 protein coding regions, 83 tRNA sequences, and 8 rRNA operons.


Tree Genetics & Genomes | 2012

Uniform standards for genome databases in forest and fruit trees

Jill L. Wegrzyn; Doreen Main; B. Figueroa; M. Choi; J. Yu; David B. Neale; Sook Jung; Taein Lee; M. Stanton; Ping Zheng; Stephen P. Ficklin; Il-Hyung Cho; Cameron Peace; Kate Evans; Gayle M. Volk; Nnadozie Oraguzie; Chunxian Chen; Mercy A. Olmstead; G. Gmitter; A. G. Abbott

TreeGenes and tree fruit Genome Database Resources serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and recovery of this information in order to enable comparative genomics research. Large-scale genotype and phenotype projects have recently spawned the development of independent tools and interfaces within these repositories to deliver information to both geneticists and breeders. The increase in next generation sequencing projects has increased the amount of data as well as the scale of analysis that can be performed. These two repositories are now working towards a similar goal of archiving the diverse, independent data sets generated from genotype/phenotype experiments. This is achieved through focused development on data input standards (templates), pipelines for the storage and automated curation, and consistent annotation efforts through the application of widely accepted ontologies to improve the extraction and exchange of the data for comparative analysis. Efforts towards standardization are not limited to genotype/phenotype experiments but are also being applied to other data types to improve gene prediction and annotation for de novo sequencing projects. The resources developed towards these goals represent the first large-scale coordinated effort in plant databases to add informatics value to diverse genotype/phenotype experiments.


Database | 2018

AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture

Lisa C. Harper; Jacqueline D. Campbell; Ethalinda K. S. Cannon; Sook Jung; Monica Poelchau; Ramona L. Walls; Carson M. Andorf; Elizabeth Arnaud; Tanya Z. Berardini; Clayton Birkett; Steve Cannon; James A. Carson; Bradford Condon; Laurel Cooper; Nathan Dunn; Christine G. Elsik; Andrew D. Farmer; Stephen P. Ficklin; David Grant; Emily Grau; Nic Herndon; Zhi-Liang Hu; Jodi L. Humann; Pankaj Jaiswal; Clement Jonquet; Marie-Angélique Laporte; Pierre Larmande; Gerard R. Lazo; Fiona M. McCarthy; Naama Menda

The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.


Database | 2018

Growing and cultivating the forest genomics database, TreeGenes

Taylor Falk; Nic Herndon; Emily Grau; Sean Buehler; Peter Richter; Sumaira Zaman; Eliza M Baker; Risharde Ramnath; Stephen P. Ficklin; Margaret Staton; Frank Alex Feltus; Sook Jung; Doreen Main; Jill L. Wegrzyn

Forest trees are valued sources of pulp, timber and biofuels, and serve a role in carbon sequestration, biodiversity maintenance and watershed stability. Examining the relationships among genetic, phenotypic and environmental factors for these species provides insight on the areas of concern for breeders and researchers alike. The TreeGenes database is a web-based repository that is home to 1790 tree species and over 1500 registered users. The database provides a curated archive for high-throughput genomics, including reference genomes, transcriptomes, genetic maps and variant data. These resources are paired with extensive phenotypic information and environmental layers. TreeGenes recently migrated to Tripal, an integrated and open-source database schema and content management system. This migration enabled developments focused on data exchange, data transfer and improved analytical capacity, as well as providing TreeGenes the opportunity to communicate with the following partner databases: Hardwood Genomics Web, Genome Database for Rosaceae, and the Citrus Genome Database. Recent development in TreeGenes has focused on coordinating information for georeferenced accessions, including metadata acquisition and ontological frameworks, to improve integration across studies combining genetic, phenotypic and environmental data. This focus was paired with the development of tools to enable comparative genomics and data visualization. By combining advanced data importers, relevant metadata standards and integrated analytical frameworks, TreeGenes provides a platform for researchers to store, submit and analyze forest tree data.


Theoretical and Applied Genetics | 2005

Candidate gene database and transcript map for peach, a model species for fruit trees

Renate Horn; Anne-Claire Lecouls; Ann Callahan; Abhaya M. Dandekar; Lilibeth Garay; Per McCord; Werner Howad; Helen Chan; Ignazio Verde; Doreen Main; Sook Jung; Laura L. Georgi; Sam Forrest; J. Mook; Tatyana Zhebentyayeva; Yeisoo Yu; Hye Ran Kim; Christopher Jesudurai; Bryon Sosinski; Pere Arús; Vance Baird; Dan E. Parfitt; Gregory L. Reighard; Ralph Scorza; Jeffrey Tomkins; Rod A. Wing; A. G. Abbott


Tree Genetics & Genomes | 2008

A framework physical map for peach, a model Rosaceae species

Tetyana Zhebentyayeva; G. A. Swire-Clark; Laura L. Georgi; L. Garay; Sook Jung; S. Forrest; A. V. Blenda; Barbara Blackmon; J. Mook; Renate Horn; Werner Howad; Pere Arús; Doreen Main; Jeffrey Tomkins; Bryon Sosinski; W. V. Baird; Gregory L. Reighard; A. G. Abbott


Crop Science | 2002

Construction and characterization of a deep-coverage bacterial artificial chromosome library for maize

Jeffrey Tomkins; Georgia L. Davis; Doreen Main; Young-Sun Yim; Ngozi A. Duru; Theresa A. Musket; Jose Luis Goicoechea; David Frisch; Edward H. Coe; Rod A. Wing

Collaboration


Doreen Main's most frequent collaborators.

Top Co-Authors

Sook Jung
Washington State University

Stephen P. Ficklin
Washington State University

Jill L. Wegrzyn
University of Connecticut

Taein Lee
Washington State University

Chun-Huai Cheng
Washington State University

David B. Neale
University of California

Il-Hyung Cho
Saginaw Valley State University