Publication


Featured research published by Catherine E. Welsh.


Nature Genetics | 2011

Subspecific origin and haplotype diversity in the laboratory mouse

Hyuna Yang; Jeremy R. Wang; John P. Didion; Ryan J. Buus; Timothy A. Bell; Catherine E. Welsh; François Bonhomme; Alex Hon-Tsen Yu; Michael W. Nachman; Jaroslav Piálek; Priscilla K. Tucker; Pierre Boursot; Leonard McMillan; Gary A. Churchill; Fernando Pardo-Manuel de Villena

Here we provide a genome-wide, high-resolution map of the phylogenetic origin of the genome of most extant laboratory mouse inbred strains. Our analysis is based on the genotypes of wild-caught mice from three subspecies of Mus musculus. We show that classical laboratory strains are derived from a few fancy mice with limited haplotype diversity. Their genomes are overwhelmingly Mus musculus domesticus in origin, and the remainder is mostly of Japanese origin. We generated genome-wide haplotype maps based on identity by descent from fancy mice and show that classical inbred strains have limited and non-randomly distributed genetic diversity. In contrast, wild-derived laboratory strains represent a broad sampling of diversity within M. musculus. Intersubspecific introgression is pervasive in these strains, and contamination by laboratory stocks has played a role in this process. The subspecific origin, haplotype diversity and identity by descent maps can be visualized using the Mouse Phylogeny Viewer (see URLs).


Genetics | 2012

High-Resolution Genetic Mapping Using the Mouse Diversity Outbred Population

Karen L. Svenson; Daniel M. Gatti; William Valdar; Catherine E. Welsh; Riyan Cheng; Elissa J. Chesler; Abraham A. Palmer; Leonard McMillan; Gary A. Churchill

The JAX Diversity Outbred population is a new mouse resource derived from partially inbred Collaborative Cross strains and maintained by randomized outcrossing. As such, it segregates the same allelic variants as the Collaborative Cross but embeds these in a distinct population architecture in which each animal has a high degree of heterozygosity and carries a unique combination of alleles. Phenotypic diversity is striking and often divergent from phenotypes seen in the founder strains of the Collaborative Cross. Allele frequencies and recombination density in early generations of Diversity Outbred mice are consistent with expectations based on simulations of the mating design. We describe analytical methods for genetic mapping using this resource and demonstrate the power and high mapping resolution achieved with this population by mapping a serum cholesterol trait to a 2-Mb region on chromosome 3 containing only 11 genes. Analysis of the estimated allele effects in conjunction with complete genome sequence data of the founder strains reduced the pool of candidate polymorphisms to seven SNPs, five of which are located in an intergenic region upstream of the Foxo1 gene.


Nature Genetics | 2015

Analyses of allele-specific gene expression in highly divergent mouse crosses identifies pervasive allelic imbalance

James J. Crowley; Vasyl Zhabotynsky; Wei Sun; Shunping Huang; Isa Kemal Pakatci; Yunjung Kim; Jeremy R. Wang; Andrew P. Morgan; John D. Calaway; David L. Aylor; Zaining Yun; Timothy A. Bell; Ryan J. Buus; Mark Calaway; John P. Didion; Terry J. Gooch; Stephanie D. Hansen; Nashiya N. Robinson; Ginger D. Shaw; Jason S. Spence; Corey R. Quackenbush; Cordelia J. Barrick; Randal J. Nonneman; Kyungsu Kim; James Xenakis; Yuying Xie; William Valdar; Alan B. Lenarcic; Wei Wang; Catherine E. Welsh

Complex human traits are influenced by variation in regulatory DNA through mechanisms that are not fully understood. Because regulatory elements are conserved between humans and mice, a thorough annotation of cis regulatory variants in mice could aid in further characterizing these mechanisms. Here we provide a detailed portrait of mouse gene expression across multiple tissues in a three-way diallel. Greater than 80% of mouse genes have cis regulatory variation. Effects from these variants influence complex traits and usually extend to the human ortholog. Further, we estimate that at least one in every thousand SNPs creates a cis regulatory effect. We also observe two types of parent-of-origin effects, including classical imprinting and a new global allelic imbalance in expression favoring the paternal allele. We conclude that, as with humans, pervasive regulatory variation influences complex genetic traits in mice and provide a new resource toward understanding the genetic control of transcription in mammals.


Mammalian Genome | 2012

Status and access to the Collaborative Cross population

Catherine E. Welsh; Darla R. Miller; Kenneth F. Manly; Jeremy Wang; Leonard McMillan; Grant Morahan; Richard Mott; Fuad A. Iraqi; David W. Threadgill; Fernando Pardo-Manuel de Villena

The Collaborative Cross (CC) is a panel of recombinant inbred lines derived from eight genetically diverse laboratory inbred strains. Recently, the genetic architecture of the CC population was reported based on the genotype of a single male per line, and other publications reported incompletely inbred CC mice that have been used to map a variety of traits. The three breeding sites, in the US, Israel, and Australia, are actively collaborating to accelerate the inbreeding process through marker-assisted inbreeding and to expedite community access of CC lines deemed to have reached defined thresholds of inbreeding. Plans are now being developed to provide access to this novel genetic reference population through distribution centers. Here we provide a description of the distribution efforts by the University of North Carolina Systems Genetics Core, Tel Aviv University, Israel and the University of Western Australia.


G3: Genes, Genomes, Genetics | 2016

The Mouse Universal Genotyping Array: From Substrains to Subspecies

Andrew P. Morgan; Chen Ping Fu; Chia Yu Kao; Catherine E. Welsh; John P. Didion; Liran Yadgary; Leeanna Hyacinth; Martin T. Ferris; Timothy A. Bell; Darla R. Miller; Paola Giusti-Rodriguez; Randal J. Nonneman; Kevin D. Cook; Jason K. Whitmire; Lisa E. Gralinski; Mark P. Keller; Alan D. Attie; Gary A. Churchill; Petko M. Petkov; Patrick F. Sullivan; J. Brennan; Leonard McMillan; Fernando Pardo-Manuel de Villena

Genotyping microarrays are an important resource for genetic mapping, population genetics, and monitoring of the genetic integrity of laboratory stocks. We have developed the third generation of the Mouse Universal Genotyping Array (MUGA) series, GigaMUGA, a 143,259-probe Illumina Infinium II array for the house mouse (Mus musculus). The bulk of the content of GigaMUGA is optimized for genetic mapping in the Collaborative Cross and Diversity Outbred populations, and for substrain-level identification of laboratory mice. In addition to 141,090 single nucleotide polymorphism probes, GigaMUGA contains 2006 probes for copy number concentrated in structurally polymorphic regions of the mouse genome. The performance of the array is characterized in a set of 500 high-quality reference samples spanning laboratory inbred strains, recombinant inbred lines, outbred stocks, and wild-caught mice. GigaMUGA is highly informative across a wide range of genetically diverse samples, from laboratory substrains to other Mus species. In addition to describing the content and performance of the array, we provide detailed probe-level annotation and recommendations for quality control.


Mammalian Genome | 2015

Informatics resources for the Collaborative Cross and related mouse populations

Andrew P. Morgan; Catherine E. Welsh

Relatedness: Relatedness in the genetic sense refers to the proportion of alleles shared between two individuals. The degree to which two individuals are genetically related depends on the number of common ancestors they share and the number of generations which have elapsed since they shared them. A pedigree describes the expected relatedness between individuals: first-degree relatives (parents or siblings) share, on average, half of their alleles; second-degree relatives (grandparents) one-fourth; and so on. With dense genotype data, we can instead compute realized relatedness as the proportion of shared, unlinked alleles. Using dense genotypes, we can define relatedness both at the genome-wide and at the local scale. In the presence of admixture or introgression (see below), local relatedness in different regions of the genome may deviate from the genome-wide average.

Population structure: A population is “structured” when it has experienced deviations from random mating, or equivalently, when it is divided into subpopulations with restricted genetic exchange between them. In a structured population, some groups of individuals are more closely related to (share more alleles with) each other than with other groups. Geography and mating behavior generate at least some degree of structure in most natural populations. Population structure in laboratory mouse strains is widespread: for instance, the 129 and C57BL strain groups form a genetic cluster distinct from so-called “Swiss mice” including FVB/NJ, the NOD substrains, and ICR outbred stock (Beck et al. 2000). Failure to account for population structure can lead to false-positive QTL in genetic mapping of complex traits.

Linkage disequilibrium (LD): Two loci are said to be in LD if the frequencies of pairwise genotypes depart from those expected if alleles were sampled randomly at each locus. LD is decreased by recombination, and therefore generally decreases with time and with physical distance between loci. Unlinked markers are expected to be in linkage equilibrium, but non-random mating can produce “long-range” LD between unlinked loci in structured populations.

Haplotype block: A haplotype block is a chromosomal segment in which there is no evidence for recombination during the history of a sample of individuals. Within a block, individuals in a population can be collapsed into one of a small (relative to the population size) number of ancestral haplotypes (Wall et al. 2003). LD is relatively high between loci within a block, but relatively low between loci in adjacent blocks. Although many schemes have been proposed for defining haplotype blocks, the one discussed in this review is the four-gamete test (Hudson et al. 1985). Consider two loci A and B with alleles A,a and B,b, respectively. There are four possible haploid genotypes (gametes): AB, aB, Ab, and ab. If all four are observed in a sample, recombination between A and B must have occurred at least once in the past. Haplotype blocks are a useful means of investigating patterns of genetic diversity at intermediate timescales since a common ancestor, such as among classical inbred strains of mice (Yang et al. 2011). But because recombination events accumulate and LD decreases with time, haplotype blocks shared between two individuals with a common ancestor far in the past (for example, a wild-derived inbred strain and a classical laboratory strain) will be very short. For this reason, haplotype blocks were not inferred for the wild mice and wild-derived strains in Yang et al. (2011).

Identity by descent (IBD): A chromosomal segment is shared identical-by-descent between two individuals if it was inherited from their common ancestor without recombination. The notion of IBD is closely related to the haplotype block.

Admixture: Admixture refers to inter-breeding between individuals from populations which were previously genetically isolated from one another. Admixture facilitates gene flow between populations, and in the process creates heterogeneity of relatedness across the genome.

Introgression: Introgression refers to the introduction of a chromosomal segment from one population into a separate, genetically distinct population. It is often used to describe gene flow between species or subspecies which can still form fertile hybrids. Unlike admixture, which describes ongoing inter-breeding, introgression describes events which are episodic in nature. In this review, we refer to genetic exchange between mouse subspecies, which do not interbreed in the wild except at narrow hybrid zones (Ursin 1952), as introgression.

Ancestry inference: Broadly speaking, an ancestry-inference procedure steps along the genome of an individual and attempts to assign each segment to one of a few ancestral clusters. These clusters may represent ancestral population groups, for samples from natural populations, or founder haplotypes in laboratory populations. Examples of ancestry inference discussed in this review include assignment of subspecific origin in wild mice (Yang et al. 2011), which labels genomic regions with one of three subspecies; and haplotype reconstruction on the CC and DO (Fu et al. 2012), which assigns genomic regions to one of those populations’ 8 founder strains.

Hidden Markov model (HMM): A hidden Markov model is a probabilistic model which describes how an observed sequence can be generated from an underlying, unknown sequence of “hidden states” (Baum and Petrie 1966; Rabiner 1989). Efficient algorithms can be used to “decode” the sequence of hidden states given an observed sequence. In this review, we discuss HMMs in which the observed sequences are genotypes along a chromosome, and the hidden states are founder haplotypes.
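The four-gamete test described in the haplotype block entry can be illustrated with a short Python sketch. It assumes phased, biallelic haplotypes encoded as strings of "0" and "1"; the function names and the greedy left-to-right partitioning are illustrative choices, not the procedure used in any of the cited papers.

```python
def four_gamete_violation(haplotypes, i, j):
    """True if all four gametes are observed at loci i and j.

    Assumes biallelic loci, so at most four distinct allele pairs exist.
    """
    observed = {(h[i], h[j]) for h in haplotypes}
    return len(observed) == 4


def haplotype_blocks(haplotypes):
    """Greedily split loci into runs with no four-gamete violation.

    Returns a list of (start, end) locus indices, both inclusive.
    This greedy scan is a simplifying assumption for illustration.
    """
    n_loci = len(haplotypes[0])
    blocks = []
    start = 0
    for k in range(1, n_loci):
        # Close the current block if locus k conflicts with any locus in it.
        if any(four_gamete_violation(haplotypes, i, k) for i in range(start, k)):
            blocks.append((start, k - 1))
            start = k
    blocks.append((start, n_loci - 1))
    return blocks


# Hypothetical sample of four phased haplotypes over three loci.
sample = ["000", "010", "100", "111"]
blocks = haplotype_blocks(sample)
```

In this toy sample, loci 0 and 1 show all four gametes, so a block boundary is placed between them, while loci 1 and 2 show only three gametes and stay in one block.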


Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine | 2012

Inferring ancestry in admixed populations using microarray probe intensities

Chen Ping Fu; Catherine E. Welsh; Fernando Pardo-Manuel de Villena; Leonard McMillan

Numerous methods exist for inferring the ancestry mosaic of an admixed individual based on its genotypes and those of its ancestors. These methods rely on biallelic SNPs obtained from genotype calling algorithms, which classify each marker as belonging to one of four states (reference allele, alternate allele, heterozygous, or no call) based on probe hybridization intensity signals. We demonstrate that this conversion of probe intensities to discrete genotypes can lead to a loss of information and introduce errors via incorrect genotype calls. We propose a method that directly infers ancestry from probe intensities by minimizing the intensity difference between a target individual and one or more of its ancestors. We demonstrate our method on mice from the developing Collaborative Cross (CC) genetic reference population, which are admixtures of a common set of eight ancestors. Our samples were genotyped using a 7.8K-marker Illumina Infinium platform called the Mouse Universal Genotyping Array (MUGA). We compare our reconstructions with a standard genotype-based method and validate our results using DNA sequencing data. Our algorithm is able to use information not captured by genotype calls and avoid errors due to incorrect calls.
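The idea of choosing an ancestor by minimizing intensity differences can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the authors' algorithm: it assumes each marker yields a pair of probe intensities, assigns a single founder per fixed-size window, and uses a squared Euclidean distance. The function names, the window scheme, and the data are all hypothetical.

```python
def window_cost(target, reference, start, stop):
    """Sum of squared intensity differences over markers in [start, stop)."""
    return sum(
        (target[m][0] - reference[m][0]) ** 2
        + (target[m][1] - reference[m][1]) ** 2
        for m in range(start, stop)
    )


def nearest_founder_by_window(target, founders, window):
    """Assign each window of markers to the founder with the closest intensities.

    target   : list of (x, y) probe intensity pairs, one per marker
    founders : dict mapping founder name -> list of (x, y) pairs, same length
    window   : number of consecutive markers per window (an assumed parameter)
    """
    calls = []
    for start in range(0, len(target), window):
        stop = min(start + window, len(target))
        best = min(
            founders,
            key=lambda name: window_cost(target, founders[name], start, stop),
        )
        calls.append(best)
    return calls


# Made-up intensities for two founders and one target over four markers.
founders = {"A": [(1.0, 0.0)] * 4, "B": [(0.0, 1.0)] * 4}
target = [(0.9, 0.1), (0.8, 0.0), (0.1, 0.9), (0.0, 1.1)]
calls = nearest_founder_by_window(target, founders, window=2)
```

Here the first window of the target sits closer to founder A and the second closer to founder B, so the sketch reports one ancestry switch. Handling heterozygous regions would require comparing against pairs of founders, which this sketch omits.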


G3: Genes, Genomes, Genetics | 2012

Accelerating the Inbreeding of Multi-Parental Recombinant Inbred Lines Generated By Sibling Matings

Catherine E. Welsh; Leonard McMillan

Inbred model organisms are powerful tools for genetic studies because they provide reproducible genomes for use in mapping and genetic manipulation. Generating inbred lines via sibling matings, however, is a costly undertaking that requires many successive generations of breeding, during which time many lines fail. We evaluated several approaches for accelerating inbreeding, including the systematic use of back-crosses and marker-assisted breeder selection, which we contrasted with randomized sib-matings. Using simulations, we explored several alternative breeder-selection methods and monitored the gain and loss of genetic diversity, measured by the number of recombination-induced founder intervals, as a function of generation. For each approach we simulated 100,000 independent lines to estimate distributions of generations to achieve full-fixation as well as to achieve a mean heterozygosity level equal to 20 generations of randomized sib-mating. Our analyses suggest that the number of generations to fully inbred status can be substantially reduced with minimal impact on genetic diversity through combinations of parental backcrossing and marker-assisted inbreeding. Although simulations do not consider all confounding factors underlying the inbreeding process, such as a loss of fecundity, our models suggest many viable alternatives for accelerating the inbreeding process.
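The kind of simulation the abstract describes can be illustrated, in a heavily simplified form, with a single-locus model of randomized sib-mating. The sketch below is an assumption-laden toy, not the authors' simulator: it tracks one locus only, ignores recombination, sex, fecundity and marker-assisted selection, and starts from two parents carrying four distinct alleles. All names are hypothetical.

```python
import random


def generations_to_fixation(rng, max_generations=1000):
    """Generations of sib-mating until one allele is fixed at a single locus.

    Each generation, two offspring are drawn from the current pair and become
    the next breeding pair. Returns None if fixation is not reached in time.
    """
    mother, father = ("A", "B"), ("C", "D")
    for generation in range(1, max_generations + 1):
        # Each offspring inherits one randomly chosen allele from each parent.
        mother, father = (
            (rng.choice(mother), rng.choice(father)),
            (rng.choice(mother), rng.choice(father)),
        )
        # Fixation: all four allele copies in the breeding pair are identical.
        if len(set(mother + father)) == 1:
            return generation
    return None


def simulate_lines(n_lines, seed=0):
    """Simulate independent lines with a seeded generator for repeatability."""
    rng = random.Random(seed)
    return [generations_to_fixation(rng) for _ in range(n_lines)]


results = simulate_lines(200)
```

Collecting the returned values over many lines gives an empirical distribution of generations to fixation at one locus; whole-genome fixation, which is what the paper studies, takes longer because every segregating interval must fix.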


International Conference on Bioinformatics | 2013

Fine-Scale Recombination Mapping of High-Throughput Sequence Data

Catherine E. Welsh; Chen Ping Fu; Fernando Pardo-Manuel de Villena; Leonard McMillan

In this paper, we contrast the resolution and accuracy of determining recombination boundaries using genotyping arrays compared to high-throughput sequencing. In addition, we consider the impacts of sequence coverage and genetic diversity on localizing recombination boundaries. We developed a hidden Markov model for estimating recombination breakpoints based on variant observations seen in the read coverage spanning uniformly sized genomic windows. Our model includes 36 states representing all combinations of 8 genomes, and estimates a founder mosaic that is consistent with the variants observed in the aligned sequences. At HMM transition locations we consider the most likely founder-pair and refine the recombination breakpoints down to an interval spanning two informative variants. We compare this solution to alternate solutions based on microarrays that we have estimated. At 30x coverage the recombination mapping accuracy far exceeds the resolution attainable by any microarray. Even at coverages of 1x and below we are generally able to estimate recombination breakpoints with comparable accuracy.
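The count of 36 states is consistent with taking every unordered pair of the 8 founder genomes, including pairing a founder with itself: 8 homozygous states plus 28 heterozygous ones. The Python sketch below enumerates such a state space; the single-letter founder labels are placeholders, and reading the abstract's "all combinations" as unordered pairs is an interpretation rather than something the abstract spells out.

```python
from itertools import combinations_with_replacement

# Placeholder labels for the eight founder genomes (hypothetical names).
FOUNDERS = "ABCDEFGH"


def diplotype_states(founders):
    """All unordered founder pairs, including a founder paired with itself."""
    return list(combinations_with_replacement(founders, 2))


states = diplotype_states(FOUNDERS)
homozygous = [s for s in states if s[0] == s[1]]
heterozygous = [s for s in states if s[0] != s[1]]
```

With 8 founders this yields 36 states in total, matching the number quoted in the abstract.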


Nature Genetics | 2015

Erratum: Analyses of allele-specific gene expression in highly divergent mouse crosses identifies pervasive allelic imbalance (Nature Genetics (2015) 47 (353-360))

James J. Crowley; Vasyl Zhabotynsky; Wei Sun; Shunping Huang; Isa Kemal Pakatci; Yunjung Kim; Jeremy R. Wang; Andrew P. Morgan; John D. Calaway; David L. Aylor; Zaining Yun; Timothy A. Bell; Ryan J. Buus; Mark Calaway; John P. Didion; Terry J. Gooch; Stephanie D. Hansen; Nashiya N. Robinson; Ginger D. Shaw; Jason S. Spence; Corey R. Quackenbush; Cordelia J. Barrick; Randal J. Nonneman; Kyungsu Kim; James Xenakis; Yuying Xie; William Valdar; Alan B. Lenarcic; Wei Wang; Catherine E. Welsh

Nat. Genet. 47, 353–360 (2015); published online 2 March 2015; corrected after print 16 April 2015.

In the version of this article initially published, an accession number was not provided for RNA-seq data sets. The RNA-seq data sets that passed quality control are available at the Sequence Read Archive (SRA) under accession SRP056236.

Collaboration

Top Co-Authors

Leonard McMillan, University of North Carolina at Chapel Hill
Fernando Pardo-Manuel de Villena, University of North Carolina at Chapel Hill
Andrew P. Morgan, University of North Carolina at Chapel Hill
John P. Didion, University of North Carolina at Chapel Hill
Timothy A. Bell, University of North Carolina at Chapel Hill
Chen Ping Fu, University of North Carolina at Chapel Hill
Darla R. Miller, University of North Carolina at Chapel Hill
Jeremy R. Wang, University of North Carolina at Chapel Hill
Randal J. Nonneman, University of North Carolina at Chapel Hill