P. Kerr Wall | Researchain

Publication

Featured research published by P. Kerr Wall.

BMC Plant Biology | 2005

Floral gene resources from basal angiosperms for comparative genomics research

Victor A. Albert; Douglas E. Soltis; John E. Carlson; William G. Farmerie; P. Kerr Wall; Daniel C. Ilut; Teri M Solow; Lukas A. Mueller; Lena Landherr; Yi Hu; Matyas Buzgo; Sangtae Kim; Mi-Jeong Yoo; Michael W. Frohlich; Rafael Perl-Treves; Scott E. Schlarbaum; Barbara J Bliss; Xiaohong Zhang; Steven D. Tanksley; David G. Oppenheimer; Pamela S. Soltis; Hong Ma; Claude W. dePamphilis; Jim Leebens-Mack

Background: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants.

Results: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms.

Conclusion: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways.

Nucleic Acids Research | 2007

PlantTribes: a gene and gene family resource for comparative genomics in plants

P. Kerr Wall; Jim Leebens-Mack; Kai Müller; Dawn Field; Naomi Altman; Claude W. dePamphilis

The PlantTribes database (http://fgp.huck.psu.edu/tribe.html) is a plant gene family database based on the inferred proteomes of five sequenced plant species: Arabidopsis thaliana, Carica papaya, Medicago truncatula, Oryza sativa and Populus trichocarpa. We used the graph-based clustering algorithm MCL [Van Dongen (Technical Report INS-R0010 2000) and Enright et al. (Nucleic Acids Res. 2002; 30: 1575–1584)] to classify all of these species’ protein-coding genes into putative gene families, called tribes, using three clustering stringencies (low, medium and high). For all tribes, we have generated protein and DNA alignments and maximum-likelihood phylogenetic trees. A parallel database of microarray experimental results is linked to the genes, which lets researchers identify groups of related genes and their expression patterns. Unified nomenclatures were developed, and tribes can be related to traditional gene families and conserved domain identifiers. SuperTribes, constructed through a second iteration of MCL clustering, connect distant, but potentially related gene clusters. The global classification of nearly 200 000 plant proteins was used as a scaffold for sorting ∼4 million additional cDNA sequences from over 200 plant species. All data and analyses are accessible through a flexible interface allowing users to explore the classification, to place query sequences within the classification, and to download results for further study.
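The abstract above names MCL as the clustering step but does not spell out how it works. As a rough illustration only, the following is a minimal pure-Python sketch of the two alternating MCL operations (expansion and inflation) on a toy similarity graph. The six-node graph, the function names, the default `inflation` value and the iteration count are all assumptions made for this example and are not taken from PlantTribes; a real analysis would use the published MCL program on BLAST-derived similarity scores. The inflation parameter controls cluster granularity, and it is an assumption here that the database's low, medium and high stringencies correspond to different settings of it.

```python
def normalize_columns(matrix):
    """Scale each column so that it sums to 1 (column-stochastic)."""
    n = len(matrix)
    result = [[0.0] * n for _ in range(n)]
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        for i in range(n):
            result[i][j] = matrix[i][j] / total if total else 0.0
    return result


def expand(matrix):
    """Expansion: square the matrix (two-step random-walk probabilities)."""
    n = len(matrix)
    return [
        [sum(matrix[i][k] * matrix[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def inflate(matrix, power):
    """Inflation: raise entries to a power, then renormalize each column."""
    return normalize_columns([[value ** power for value in row] for row in matrix])


def mcl(adjacency, inflation=2.0, iterations=50, threshold=1e-6):
    """Toy Markov clustering; returns clusters as sorted lists of node indices."""
    n = len(adjacency)
    # Add self-loops before normalizing, as is conventional for MCL.
    matrix = [
        [float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    matrix = normalize_columns(matrix)
    for _ in range(iterations):
        matrix = inflate(expand(matrix), inflation)
    # Each row that retains mass lists the members of one cluster.
    clusters = set()
    for row in matrix:
        members = frozenset(j for j, value in enumerate(row) if value > threshold)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)


# Hypothetical graph: two groups of three mutually similar proteins (0-2 and 3-5).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency)
print(clusters)
```

On this toy input the two disconnected groups come out as separate clusters. The fixed iteration count is a simplification; a fuller version would stop when the matrix no longer changes.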

Nucleic Acids Research | 2006

ChloroplastDB: the Chloroplast Genome Database

Liying Cui; Narayanan Veeraraghavan; Alexander Richter; P. Kerr Wall; Robert K. Jansen; Jim Leebens-Mack; Izabela Makalowska; Claude W. dePamphilis

The Chloroplast Genome Database (ChloroplastDB) is an interactive, web-based database for fully sequenced plastid genomes, containing genomic, protein, DNA and RNA sequences, gene locations, RNA-editing sites, putative protein families and alignments. With recent technical advances, the rate of generating new organelle genomes has increased dramatically. However, the established ontology for chloroplast genes and gene features has not been uniformly applied to all chloroplast genomes available in the sequence databases. For example, annotations for some published genome sequences have not evolved with gene naming conventions. ChloroplastDB provides unified annotations, gene name search, BLAST and download functions for chloroplast-encoded genes and genomic sequences. A user can retrieve all orthologous sequences with one search regardless of gene names in GenBank. This feature alone greatly facilitates comparative research on sequence evolution including changes in gene content, codon usage, gene structure and post-transcriptional modifications such as RNA editing. Orthologous protein sets are classified by TribeMCL and each set is assigned a standard gene name. Over the next few years, as the number of sequenced chloroplast genomes increases rapidly, the tools available in ChloroplastDB will allow researchers to easily identify and compile target data for comparative analysis of chloroplast genes and genomes.

Molecular Biology and Evolution | 2009

Evolution of Plant MADS Box Transcription Factors: Evidence for Shifts in Selection Associated with Early Angiosperm Diversification and Concerted Gene Duplications

Hongyan Shan; Laura M. Zahn; Stéphane Guindon; P. Kerr Wall; Hongzhi Kong; Hong Ma; Claude W. dePamphilis; Jim Leebens-Mack

Phylogenomic analyses show that gene and genome duplication events have led to the diversification of transcription factor gene families throughout the evolutionary history of land plants and that gene duplications have played an important role in shaping regulatory networks influencing key phenotypic characters including floral development and flowering time. A molecular evolutionary investigation of the mode and tempo of selection acting on the angiosperm MADS box AP1/SQUA, AP3/PI, AG/AGL11, and SEP gene subfamilies revealed site-specific patterns of shifting evolutionary constraint throughout angiosperm history. Specific positions in the four canonical MADS box gene regions, especially K domains and C-terminal regions of all four of these MADS box gene subfamilies, exhibited clade-specific shifts in selective constraint following concerted duplication events. Moreover, the frequency of site-specific shifts in constraint was correlated with gene duplications and early angiosperm diversification. We hypothesize that coevolution among interacting MADS box proteins may be responsible for simultaneous increases in the ratio of nonsynonymous to synonymous substitutions (dN/dS = ω) early in angiosperm history and following concerted duplication events.

Genome Biology | 2008

The Amborella genome: an evolutionary reference for plant biology

Douglas E. Soltis; Victor A. Albert; Jim Leebens-Mack; Jeffrey D. Palmer; Rod A. Wing; Claude W. dePamphilis; Hong Ma; John E. Carlson; Naomi Altman; Sangtae Kim; P. Kerr Wall; Andrea Zuccolo; Pamela S. Soltis

The nuclear genome sequence of Amborella trichopoda, the sister species to all other extant angiosperms, will be an exceptional resource for plant genomics.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | 2011

The Impact of Multiple Protein Sequence Alignment on Phylogenetic Estimation

Li-San Wang; Jim Leebens-Mack; P. Kerr Wall; Kevin Beckmann; Claude W. dePamphilis; Tandy J. Warnow

Multiple sequence alignment is typically the first step in estimating phylogenetic trees, with the assumption being that as alignments improve, so will phylogenetic reconstructions. Over the last decade or so, new multiple sequence alignment methods have been developed to improve comparative analyses of protein structure, but these new methods have not been typically used in phylogenetic analyses. In this paper, we report on a simulation study that we performed to evaluate the consequences of using these new multiple sequence alignment methods in terms of the resultant phylogenetic reconstruction. We find that while alignment accuracy is positively correlated with phylogenetic accuracy, the amount of improvement in phylogenetic estimation that results from an improved alignment can range from quite small to substantial. We observe that phylogenetic accuracy is most highly correlated with alignment accuracy when sequences are most difficult to align, and that variation in alignment accuracy can have little impact on phylogenetic accuracy when alignment error rates are generally low. We discuss these observations and implications for future work.
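The abstract above refers to "alignment accuracy" without defining it. One common way to quantify it is a sum-of-pairs (SP) score: the fraction of residue pairs aligned together in a reference alignment that are also aligned together in the estimated alignment. The sketch below is an illustration under that assumption; it is not claimed to be the exact measure used in the study, and the function names and the two tiny alignments are hypothetical.

```python
def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length strings using '-' for gaps.
    Each residue is identified as (sequence index, ungapped position).
    """
    pairs = set()
    positions = [0] * len(alignment)  # next ungapped position per sequence
    for column in range(len(alignment[0])):
        residues = []
        for seq_index, row in enumerate(alignment):
            if row[column] != "-":
                residues.append((seq_index, positions[seq_index]))
                positions[seq_index] += 1
        for a in range(len(residues)):
            for b in range(a + 1, len(residues)):
                pairs.add((residues[a], residues[b]))
    return pairs


def sp_score(reference, estimated):
    """Fraction of reference residue pairs recovered by the estimated alignment."""
    reference_pairs = aligned_pairs(reference)
    if not reference_pairs:
        return 0.0
    return len(reference_pairs & aligned_pairs(estimated)) / len(reference_pairs)


# Hypothetical two-sequence example: the estimate misplaces one gap.
reference = ["AC-GT", "ACAGT"]
estimated = ["ACG-T", "ACAGT"]
score = sp_score(reference, estimated)
print(score)
```

Here three of the four reference pairs are recovered, so the score is 0.75. In a simulation study the reference alignment is known from the simulated evolutionary history, which is what makes a score of this kind computable.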

Plant Molecular Biology | 2006

EST database for early flower development in California poppy (Eschscholzia californica Cham., Papaveraceae) tags over 6000 genes from a basal eudicot

John E. Carlson; Jim Leebens-Mack; P. Kerr Wall; Laura M. Zahn; Lukas A. Mueller; Lena Landherr; Yi Hu; Daniel C. Ilut; Jennifer M. Arrington; Stephanie Choirean; Annette Becker; Dawn Field; Steven D. Tanksley; Hong Ma; Claude W. dePamphilis

The Floral Genome Project (FGP) selected California poppy (Eschscholzia californica Cham. ssp. californica) to help identify new florally-expressed genes related to floral diversity in basal eudicots. A large, non-normalized cDNA library was constructed from premeiotic and meiotic floral buds and sequenced to generate a database of 9079 high quality Expressed Sequence Tags (ESTs). These sequences clustered into 5713 unigenes, including 1414 contigs and 4299 singletons. Homologs of genes regulating many aspects of flower development were identified, including those for organ identity and development, cell and tissue differentiation, cell cycle control, and secondary metabolism. Over 5% of the transcriptome consisted of homologs to known floral gene families. Most are the first representatives of their respective gene families in basal eudicots and their conservation suggests they are important for floral development and/or function. Approximately 10% of the transcripts encoded transcription factors and other regulatory genes, including nine genes from the seven major lineages of the important MADS-box family of developmental regulators. Homologs of alkaloid pathway genes were also recovered, providing opportunities to explore adaptive evolution in secondary products. Furthermore, comparison of the poppy ESTs with the Arabidopsis genome provided support for putative Arabidopsis genes that previously lacked annotation. Finally, over 1800 unique sequences had no observable homology in the public databases. The California poppy EST database and library will help bridge our understanding of flower initiation and development among higher eudicot and monocot model plants and provide new opportunities for comparative analysis of gene families across angiosperm species.

Tree Genetics & Genomes | 2008

An EST database for Liriodendron tulipifera L. floral buds: the first EST resource for functional and comparative genomics in Liriodendron

Haiying Liang; John E. Carlson; Jim Leebens-Mack; P. Kerr Wall; Lukas A. Mueller; Matyas Buzgo; Lena Landherr; Yi Hu; D. Scott DiLoreto; Daniel C. Ilut; Dawn Field; Steven D. Tanksley; Hong Ma; Claude W. dePamphilis

Liriodendron tulipifera L. was selected by the Floral Genome Project for identification of new genes related to floral diversity in basal angiosperms. A large, non-normalized cDNA library was constructed from premeiotic and meiotic floral buds and sequenced to generate a database of 9,531 high-quality expressed sequence tags. These sequences clustered into 6,520 unigenes, of which 5,251 were singletons, and 1,269 were in contigs. Homologs of genes regulating many aspects of flower development were identified, including those for organ identity and development, cell and tissue differentiation, and cell-cycle control. Almost 5% of the transcriptome consisted of homologs to known floral gene families. Homologs of most of the genes involved in cell-wall construction were also recovered. This provides a new opportunity for comparative studies in lignin biosynthesis, a trait of key importance in the evolution of land plants and in the utilization of fiber from economically important tree species, such as Liriodendron. Also of note is that 1,089 unigenes did not match any sequence in the public databases, including the complete genomes of Arabidopsis, rice, and Populus. Some of these novel genes might be unique in basal angiosperm species and, when better characterized, may be informative for understanding the origins of diverged gene families. Thus, the Liriodendron expressed sequence tag database and library will help bridge our understanding of the mechanisms of flower initiation and development that are shared among basal angiosperms, eudicots, and monocots, and provide new opportunities for comparative analysis of gene families across angiosperm species.

BMC Bioinformatics | 2005

Gene capture prediction and overlap estimation in EST sequencing from one or multiple libraries

Ji Ping Wang; Bruce G. Lindsay; Liying Cui; P. Kerr Wall; Josh Marion; Jiaxuan Zhang; Claude W. dePamphilis

Background: In expressed sequence tag (EST) sequencing, we are often interested in how many genes we can capture in an EST sample of a targeted size. This information provides insights into sequencing efficiency in experimental design, as well as clues to the diversity of expressed genes in the tissue from which the library was constructed.

Results: We propose a compound Poisson process model that can accurately predict the gene capture in a future EST sample based on an initial EST sample. It also allows estimation of the number of expressed genes in one cDNA library or co-expressed in two cDNA libraries. The superior performance of the new prediction method over an existing approach is established by a simulation study. Our analysis of four Arabidopsis thaliana EST sets suggests that the number of expressed genes present in four different cDNA libraries of Arabidopsis thaliana varies from 9155 (root) to 12005 (silique). An observed fraction of co-expressed genes in two different EST sets as low as 25% can correspond to an actual overlap fraction greater than 65%.

Conclusion: The proposed method provides a convenient tool for gene capture prediction and cDNA library property diagnosis in EST sequencing.
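The compound Poisson model described above is not reproduced here. As a much simpler stand-in that conveys the same kind of question (how much of a library has a sample already captured?), the sketch below uses the classical Good-Turing sample coverage estimate, 1 minus the number of singletons divided by the number of reads. The function name is hypothetical, and applying the estimate to the California poppy counts listed earlier on this page (9079 ESTs, 4299 singletons) is an illustration made here, not a result reported in any of the papers; it also assumes each singleton unigene corresponds to exactly one EST.

```python
def good_turing_coverage(n_ests, n_singletons):
    """Good-Turing estimate of sample coverage.

    Coverage is the estimated probability that the next EST sequenced
    comes from a gene already seen in the sample.
    """
    if n_ests <= 0:
        raise ValueError("n_ests must be positive")
    return 1.0 - n_singletons / n_ests


# Counts taken from the California poppy abstract above (illustrative use only).
poppy_coverage = good_turing_coverage(n_ests=9079, n_singletons=4299)
print(round(poppy_coverage, 3))
```

Under these assumptions the estimate is roughly 0.53, meaning that about half of further ESTs from the same library would be expected to hit genes already sampled. Unlike the model in the abstract, this estimate says nothing about the total number of expressed genes or about overlap between two libraries.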

BMC Plant Biology | 2013

Characterization of the basal angiosperm Aristolochia fimbriata: a potential experimental system for genetic studies

Barbara J Bliss; Stefan Wanke; Abdelali Barakat; Saravanaraj Ayyampalayam; Norman J. Wickett; P. Kerr Wall; Yuannian Jiao; Lena Landherr; Paula E. Ralph; Yi Hu; Christoph Neinhuis; Jim Leebens-Mack; Kathiravetpilla Arumuganathan; Sandra W. Clifton; Siela N. Maximova; Hong Ma; Claude W. dePamphilis

Background: Previous studies in basal angiosperms have provided insight into the diversity within the angiosperm lineage and helped to polarize analyses of flowering plant evolution. However, there is still not an experimental system for genetic studies among basal angiosperms to facilitate comparative studies and functional investigation. It would be desirable to identify a basal angiosperm experimental system that possesses many of the features found in existing plant model systems (e.g., Arabidopsis and Oryza).

Results: We have considered all basal angiosperm families for general characteristics important for experimental systems, including availability to the scientific community, growth habit, and membership in a large basal angiosperm group that displays a wide spectrum of phenotypic diversity. Most basal angiosperms are woody or aquatic, thus are not well-suited for large scale cultivation, and were excluded. We further investigated members of Aristolochiaceae for ease of culture, life cycle, genome size, and chromosome number. We demonstrated self-compatibility for Aristolochia elegans and A. fimbriata, and transformation with a GFP reporter construct for Saruma henryi and A. fimbriata. Furthermore, A. fimbriata was easily cultivated with a life cycle of just three months, could be regenerated in a tissue culture system, and had one of the smallest genomes among basal angiosperms. An extensive multi-tissue EST dataset was produced for A. fimbriata that includes over 3.8 million 454 sequence reads.

Conclusions: Aristolochia fimbriata has numerous features that facilitate genetic studies and is suggested as a potential model system for use with a wide variety of technologies. Emerging genetic and genomic tools for A. fimbriata and closely related species can aid the investigation of floral biology, developmental genetics, biochemical pathways important in plant-insect interactions as well as human health, and various other features present in early angiosperms.
