Marian Thomson
University of Edinburgh
Publication
Featured research published by Marian Thomson.
Genome Research | 2009
Peter D. Keightley; Urmi Trivedi; Marian Thomson; Fiona Oliver; Sujai Kumar; Mark Blaxter
We inferred the rate and properties of new spontaneous mutations in Drosophila melanogaster by carrying out whole-genome shotgun sequencing-by-synthesis of three mutation accumulation (MA) lines that had been maintained by close inbreeding for an average of 262 generations. We tested for the presence of new mutations by generating alignments of each MA line to the D. melanogaster reference genome sequence and then compared these alignments base by base. We determined empirically that at least five reads at a site within each line are required for accurate single nucleotide mutation calling. We mapped a total of 174 single-nucleotide mutations, giving a single nucleotide mutation rate of 3.5 × 10^-9 per site per generation. There were no false positives in a random sample of 40 of these mutations checked by Sanger sequencing. Variation in the numbers of mutations among the MA lines was small and nonsignificant. Numbers of transition and transversion mutations were 86 and 88, respectively, implying that transition mutation rate is close to 2× the transversion rate. We observed 1.5× as many G or C → A or T as A or T → G or C mutations, implying that the G or C → A or T mutation rate is close to 2× the A or T → G or C mutation rate. The base composition of the genome is therefore not at an equilibrium determined solely by mutation. The predicted G + C content at mutational equilibrium (33%) is similar to that observed in transposable element remnants. Nearest-neighbor mutational context dependencies are nonsignificant, suggesting that this is a weak phenomenon in Drosophila. We also saw nonsignificant differences in the mutation rate between transcribed and untranscribed regions, implying that any transcription-coupled repair process is weak. Of seven short indel mutations confirmed, six were deletions, consistent with the deletion bias that is thought to exist in Drosophila.
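The two ratio arguments in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative and is not the authors' analysis. It assumes that each base has one possible transition and two possible transversions, treats the genome as two site classes (G/C and A/T), and uses a genome G + C fraction of 0.43, which is an assumed value not stated in the abstract. All function names are hypothetical.

```python
def ts_tv_rate_ratio(n_transitions, n_transversions):
    """Per-pathway transition:transversion rate ratio.

    Each base can mutate by 1 transition or 2 transversions, so equal
    per-pathway rates would give twice as many transversions as transitions.
    """
    return (n_transitions / 1) / (n_transversions / 2)


def gc_to_at_rate_ratio(count_ratio, gc_fraction):
    """Convert a ratio of mutation counts into a ratio of per-site rates.

    count_ratio: (G/C -> A/T count) / (A/T -> G/C count)
    gc_fraction: fraction of sites that are G or C (an assumed input)
    """
    return count_ratio * (1 - gc_fraction) / gc_fraction


def equilibrium_gc(rate_ratio):
    """Equilibrium G+C fraction under mutation alone.

    rate_ratio is (G/C -> A/T rate) / (A/T -> G/C rate). At equilibrium
    g * u = (1 - g) * v, so g = 1 / (1 + u / v).
    """
    return 1 / (1 + rate_ratio)


if __name__ == "__main__":
    # Counts taken from the abstract: 86 transitions, 88 transversions.
    print(round(ts_tv_rate_ratio(86, 88), 2))
    # 1.5 is the count ratio from the abstract; 0.43 is an assumption.
    r = gc_to_at_rate_ratio(1.5, 0.43)
    print(round(r, 2), round(equilibrium_gc(r), 2))
```

Under these assumptions both rate ratios come out close to 2, and the equilibrium G + C fraction comes out at about one third, in line with the 33% quoted in the abstract.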
Molecular Ecology | 2013
Mathieu Gautier; Julien Foucaud; Karim Gharbi; Timothee Cezard; Maxime Galan; Anne Loiseau; Marian Thomson; Pierre Pudlo; Carole Kerdelhué; Arnaud Estoup
Molecular markers produced by next‐generation sequencing (NGS) technologies are revolutionizing genetic research. However, the costs of analysing large numbers of individual genomes remain prohibitive for most population genetics studies. Here, we present results based on mathematical derivations showing that, under many realistic experimental designs, NGS of DNA pools from diploid individuals allows allele frequencies at single nucleotide polymorphisms (SNPs) to be estimated with at least the same accuracy as individual‐based analyses, for considerably lower library construction and sequencing efforts. These findings remain true when taking into account the possibility of substantially unequal contributions of each individual to the final pool of sequence reads. We propose the intuitive notion of effective pool size to account for unequal pooling and derive a Bayesian hierarchical model to estimate this parameter directly from the data. We provide a user‐friendly application assessing the accuracy of allele frequency estimation from both pool‐ and individual‐based NGS population data under various sampling, sequencing depth and experimental error designs. We illustrate our findings with theoretical examples and real data sets corresponding to SNP loci obtained using restriction site–associated DNA (RAD) sequencing in pool‐ and individual‐based experiments carried out on the same population of the pine processionary moth (Thaumetopoea pityocampa). NGS of DNA pools might not be optimal for all types of studies but provides a cost‐effective approach for estimating allele frequencies for very large numbers of SNPs. It thus allows comparison of genome‐wide patterns of genetic variation for large numbers of individuals in multiple populations.
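The accuracy comparison between pooled and individual-based designs can be illustrated with a simple two-stage binomial sampling model. This is a minimal sketch and not the Bayesian hierarchical model or the application described in the abstract. It assumes equal contributions from every individual, no sequencing error, and error-free genotypes in the individual-based design. The function names and the example design values are hypothetical.

```python
def pool_variance(p, n_individuals, depth):
    """Variance of a pooled allele-frequency estimate at one SNP.

    Stage 1: 2n chromosomes are sampled from the population (binomial).
    Stage 2: `depth` reads are sampled from the pool (binomial).
    Assumes equal individual contributions and no sequencing error.
    """
    two_n = 2 * n_individuals
    from_individuals = p * (1 - p) / two_n
    from_reads = p * (1 - p) * (1 - 1 / two_n) / depth
    return from_individuals + from_reads


def individual_variance(p, n_individuals):
    """Variance when n diploid individuals are genotyped without error."""
    return p * (1 - p) / (2 * n_individuals)


if __name__ == "__main__":
    p = 0.5  # illustrative population allele frequency
    print(pool_variance(p, n_individuals=50, depth=100))
    print(individual_variance(p, n_individuals=20))
```

In this toy comparison, a pool of 50 individuals sequenced to a depth of 100 reads gives a smaller variance than genotyping 20 individuals separately, because the larger sample of chromosomes more than offsets the extra read-sampling noise.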
BMC Genomics | 2012
Ross Houston; John W. Davey; Stephen Bishop; Natalie R. Lowe; J. C. Mota-Velasco; Alastair Hamilton; Derrick R Guy; A. E. Tinch; Marian Thomson; Mark Blaxter; Karim Gharbi; James E. Bron; John B. Taggart
Background: Restriction site-associated DNA sequencing (RAD-Seq) is a genome complexity reduction technique that facilitates large-scale marker discovery and genotyping by sequencing. Recent applications of RAD-Seq have included linkage and QTL mapping with a particular focus on non-model species. In the current study, we have applied RAD-Seq to two Atlantic salmon families from a commercial breeding program. The offspring from these families were classified into resistant or susceptible based on survival/mortality in an Infectious Pancreatic Necrosis (IPN) challenge experiment, and putative homozygous resistant or susceptible genotype at a major IPN-resistance QTL. From each family, the genomic DNA of the two heterozygous parents and seven offspring of each IPN phenotype and genotype was digested with the SbfI enzyme and sequenced in multiplexed pools. Results: Sequence was obtained from approximately 70,000 RAD loci in both families and a filtered set of 6,712 segregating SNPs were identified. Analyses of genome-wide RAD marker segregation patterns in the two families suggested SNP discovery on all 29 Atlantic salmon chromosome pairs, and highlighted the dearth of male recombination. The use of pedigreed samples allowed us to distinguish segregating SNPs from putative paralogous sequence variants resulting from the relatively recent genome duplication of salmonid species. Of the segregating SNPs, 50 were linked to the QTL. A subset of these QTL-linked SNPs were converted to a high-throughput assay and genotyped across large commercial populations of IPNV-challenged salmon fry. Several SNPs showed highly significant linkage and association with resistance to IPN, and population linkage-disequilibrium-based SNP tests for resistance were identified. Conclusions: We used RAD-Seq to successfully identify and characterise high-density genetic markers in pedigreed aquaculture Atlantic salmon. These results underline the effectiveness of RAD-Seq as a tool for rapid and efficient generation of QTL-targeted and genome-wide marker data in a large complex genome, and its possible utility in farmed animal selection programs.
Molecular Ecology | 2013
Nian Wang; Marian Thomson; William J. A. Bodles; R. M. M. Crawford; Harriet V. Hunt; Alan Watson Featherstone; Jaume Pellicer; Richard J. A. Buggs
New sequencing technologies allow development of genome‐wide markers for any genus of ecological interest, including plant genera such as Betula (birch) that have previously proved difficult to study due to widespread polyploidy and hybridization. We present a de novo reference genome sequence assembly, from 66× short read coverage, of Betula nana (dwarf birch) – a diploid that is the keystone woody species of subarctic scrub communities but of conservation concern in Britain. We also present 100 bp PstI RAD markers for B. nana and closely related Betula tree species. Assembly of RAD markers in 15 individuals by alignment to the reference B. nana genome yielded 44–86k RAD loci per individual, whereas de novo RAD assembly yielded 64–121k loci per individual. Of the loci assembled by the de novo method, 3k homologous loci were found in all 15 individuals studied, and 35k in 10 or more individuals. Matching of RAD loci to RAD locus catalogues from the B. nana individual used for the reference genome showed similar numbers of matches from both methods of RAD locus assembly but indicated that the de novo RAD assembly method may overassemble some paralogous loci. In 12 individuals hetero‐specific to B. nana 37–47k RAD loci matched a catalogue of RAD loci from the B. nana individual used for the reference genome, whereas 44–60k RAD loci aligned to the B. nana reference genome itself. We present a preliminary study of allele sharing among species, demonstrating the utility of the data for introgression studies and for the identification of species‐specific alleles.
Journal of General Virology | 2010
Charles Cunningham; Derek Gatherer; Birgitta Hilfrich; Katarina Baluchova; Derrick J. Dargan; Marian Thomson; Paul D. Griffiths; Gavin William Grahame Wilkinson; Thomas F. Schulz; Andrew J. Davison
We have assessed two approaches to sequencing complete human cytomegalovirus (HCMV) genomes (236 kbp) in DNA extracted from infected cell cultures (strains 3157, HAN13, HAN20 and HAN38) or clinical specimens (strains JP and 3301). The first approach involved amplifying genomes from the DNA samples as overlapping PCR products, sequencing these by the Sanger method, acquiring reads from a capillary instrument and assembling these using the Staden programs. The second approach involved generating sequence data from the DNA samples by using an Illumina Genome Analyzer (IGA), processing the filtered reads by reference-independent (de novo) assembly, utilizing the resulting sequence to direct reference-dependent assembly of the same data and finishing by limited PCR sequencing. Both approaches were successful. In particular, the investigation demonstrated the utility of IGA data for efficiently sequencing genomes from clinical samples containing as little as 3 % HCMV DNA. Analysis of the genome sequences obtained showed that each of the strains grown in cell culture was a mutant. Certain of the mutations were shared among strains from independent clinical sources, thus suggesting that they may have arisen in a common ancestor during natural infection. Moreover, one of the strains (JP) sequenced directly from a clinical specimen was mutated in two genes, one of which encodes a proposed immune-evasion function, viral interleukin-10. These observations imply that HCMV mutants exist in human infections.
Biological Psychiatry | 2001
Daniel Souery; Sophie Van Gestel; Isabelle Massat; Sylvie Blairy; Rolf Adolfsson; Douglas Blackwood; Jurgen Del-Favero; Dimitris Dikeos; Miro Jakovljević; Radka Kaneva; Enrico Lattuada; Bernard Lerer; Roberta Lilli; Vihbra Milanova; Walter J. Muir; Markus M. Nöthen; Lilijana Oruč; George N. Papadimitriou; Peter Propping; Thomas G. Schulze; Alessandro Serretti; Baruch Shapira; Enrico Smeraldi; Costas N. Stefanis; Marian Thomson; Christine Van Broeckhoven; Julien Mendlewicz
BACKGROUND Being the rate-limiting enzyme in the biosynthesis of serotonin, the tryptophan hydroxylase gene (TPH) has been considered a possible candidate gene in bipolar and unipolar affective disorders (BPAD and UPAD). Several studies have investigated the possible role of TPH polymorphisms in affective disorders and suicidal behavior. METHODS The TPH A218C polymorphism has been investigated in 927 patients (527 BPAD and 400 UPAD) and their matched healthy control subjects collected within the European Collaborative Project on Affective Disorders. RESULTS No difference of genotype distribution or allele distribution was found in BPAD or UPAD. No statistically significant difference was observed for allele frequency and genotype counts. In a genotype per genotype analysis in UPAD patients with a personal history of suicide attempt, the frequency of the C-C genotype (homozygosity for the short allele) was lower in UPAD patients (24%) than in control subjects (43%) (χ² = 4.67, p = .03). There was no difference in allele or genotype frequency between patients presenting violent suicidal behavior (n = 48) and their matched control subjects. CONCLUSIONS We failed to detect an association between the A218C polymorphism of the TPH gene and BPAD and UPAD in a large European sample. Homozygosity for the short allele is significantly less frequent in a subgroup of UPAD patients with a history of suicide attempt than in control subjects.
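The reported p-value can be checked against the reported chi-square statistic. The sketch below assumes the test has one degree of freedom, as for a 2×2 comparison of C-C genotype carriers against non-carriers in patients and controls. The abstract does not state the degrees of freedom, so this is an assumption, and the function name is hypothetical.

```python
import math


def chi2_pvalue_1df(chi2):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With 1 df, the statistic is the square of a standard normal variable,
    so P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # 4.67 is the statistic reported in the abstract; 1 df is an assumption.
    print(round(chi2_pvalue_1df(4.67), 3))
```

Under the one-degree-of-freedom assumption, the statistic of 4.67 corresponds to a p-value of roughly .03, matching the value given in the abstract.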
Bioinformatics | 2015
Mick Watson; Marian Thomson; Judith Risse; Richard Talbot; Javier Santoyo-Lopez; Karim Gharbi; Mark Blaxter
Motivation: The Oxford Nanopore MinION device represents a unique sequencing technology. As a mobile sequencing device powered by the USB port of a laptop, the MinION has huge potential applications. To enable these applications, the bioinformatics community will need to design and build a suite of tools specifically for MinION data. Results: Here we present poRe, a package for R that enables users to manipulate, organize, summarize and visualize MinION nanopore sequencing data. As a package for R, poRe has been tested on Windows, Linux and MacOSX. Crucially, the Windows version allows users to analyse MinION data on the Windows laptop attached to the device. Availability and implementation: poRe is released as a package for R at http://sourceforge.net/projects/rpore/. A tutorial and further information are available at https://sourceforge.net/p/rpore/wiki/Home/. Supplementary information: Supplementary data are available at Bioinformatics online.
BMC Genomics | 2010
Paul Hunt; Axel Martinelli; Katarzyna Modrzynska; Sofia T. Borges; Alison M. Creasey; Louise Rodrigues; Dario Beraldi; Laurence Loewe; Richard Fawcett; Sujai Kumar; Marian Thomson; Urmi Trivedi; Thomas D. Otto; Arnab Pain; Mark Blaxter; Pedro Cravo
Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum. Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (Illumina® Solexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme. Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations.
GigaScience | 2015
Judith Risse; Marian Thomson; Sheila Patrick; Garry W. Blakely; Georgios Koutsovoulos; Mark Blaxter; Mick Watson
Background: Second and third generation sequencing technologies have revolutionised bacterial genomics. Short-read Illumina reads result in cheap but fragmented assemblies, whereas longer reads are more expensive but result in more complete genomes. The Oxford Nanopore MinION device is a revolutionary mobile sequencer that can produce thousands of long, single molecule reads. Results: We sequenced Bacteroides fragilis strain BE1 using both the Illumina MiSeq and Oxford Nanopore MinION platforms. We were able to assemble a single chromosome of 5.18 Mb, with no gaps, using publicly available software and commodity computing hardware. We identified gene rearrangements and the state of invertible promoters in the strain. Conclusions: The single chromosome assembly of Bacteroides fragilis strain BE1 was achieved using only modest amounts of data, publicly available software and commodity computing hardware. This combination of technologies offers the possibility of ultra-cheap, high quality, finished bacterial genomes.
Journal of Endocrinology | 2017
Jethro S. Johnson; Monica N Opiyo; Marian Thomson; Karim Gharbi; Jonathan R. Seckl; Andreas Heger; Karen E. Chapman
The enzyme 11β-hydroxysteroid dehydrogenase (11β-HSD) interconverts active glucocorticoids and their intrinsically inert 11-keto forms. The type 1 isozyme, 11β-HSD1, predominantly reactivates glucocorticoids in vivo and can also metabolise bile acids. 11β-HSD1-deficient mice show altered inflammatory responses and are protected against the adverse metabolic effects of a high-fat diet. However, the impact of 11β-HSD1 on the composition of the gut microbiome has not previously been investigated. We used high-throughput 16S rDNA amplicon sequencing to characterise the gut microbiome of 11β-HSD1-deficient and C57Bl/6 control mice, fed either a standard chow diet or a cholesterol- and fat-enriched ‘Western’ diet. 11β-HSD1 deficiency significantly altered the composition of the gut microbiome, and did so in a diet-specific manner. On a Western diet, 11β-HSD1 deficiency increased the relative abundance of the family Bacteroidaceae, and on a chow diet, it altered the relative abundance of the family Prevotellaceae. Our results demonstrate that (i) genetic effects on host–microbiome interactions can depend upon diet and (ii) alterations in the composition of the gut microbiome may contribute to aspects of the metabolic and/or inflammatory phenotype observed with 11β-HSD1 deficiency.