Publication


Featured research published by Pekka Marttinen.


Molecular Ecology | 2006

Bayesian identification of admixture events using multilocus molecular markers

Jukka Corander; Pekka Marttinen

Bayesian statistical methods for the estimation of hidden genetic structure of populations have gained considerable popularity in recent years. Utilizing molecular marker data, Bayesian mixture models attempt to identify a hidden population structure by clustering individuals into genetically divergent groups, whereas admixture models aim to separate the ancestral sources of the alleles observed in different individuals. We discuss the difficulties involved in the simultaneous estimation of the number of ancestral populations and the levels of admixture in studied individuals’ genomes. To resolve this issue, we introduce a computationally efficient method for the identification of admixture events in the population history. Our approach is illustrated by analyses of several challenging real and simulated data sets. The software (BAPS), implementing the methods introduced here, is freely available at http://www.rni.helsinki.fi/~jic/bapspage.html.


Bioinformatics | 2004

BAPS 2: enhanced possibilities for the analysis of genetic population structure

Jukka Corander; Patrik Waldmann; Pekka Marttinen; Mikko J. Sillanpää

Bayesian statistical methods based on simulation techniques have recently been shown to provide powerful tools for the analysis of genetic population structure. We have previously developed a Markov chain Monte Carlo (MCMC) algorithm for characterizing genetically divergent groups based on molecular markers and geographical sampling design of the dataset. However, for large-scale datasets such algorithms may get stuck in local maxima in the parameter space. Therefore, we have modified our earlier algorithm to support multiple parallel MCMC chains, with enhanced features that enable considerably faster and more reliable estimation compared to the earlier version of the algorithm. We also consider a hierarchical tree representation, from which a Bayesian model-averaged structure estimate can be extracted. The algorithm is implemented in a computer program that features a user-friendly interface and built-in graphics. The enhanced features are illustrated by analyses of simulated data and an extensive human molecular dataset. Availability: Freely available at http://www.rni.helsinki.fi/~jic/bapspage.html.
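The idea of running several chains to avoid local maxima can be illustrated with a toy example. The sketch below is not the BAPS algorithm: it runs independent random-walk Metropolis chains on a hypothetical one-dimensional bimodal target (a minor mode at -3 and a major mode at +3, both invented for illustration) from dispersed starting points, and keeps the best state any chain visits. The chains run one after another here; all function names and parameter values are assumptions.

```python
import math
import random


def log_target(x):
    """Log-density (up to a constant) of a toy bimodal target.

    Hypothetical example: minor mode at -3 (weight 0.3), major mode at +3
    (weight 0.7). Computed with log-sum-exp to avoid underflow.
    """
    a = math.log(0.3) - 0.5 * (x + 3.0) ** 2
    b = math.log(0.7) - 0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def run_chain(start, n_steps, step_size, rng):
    """Random-walk Metropolis; returns the best (state, log-density) visited."""
    x, lp = start, log_target(start)
    best_x, best_lp = x, lp
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        lp_prop = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        if rng.random() < math.exp(min(0.0, lp_prop - lp)):
            x, lp = proposal, lp_prop
            if lp > best_lp:
                best_x, best_lp = x, lp
    return best_x, best_lp


def run_parallel_chains(starts, n_steps=2000, step_size=0.5, seed=0):
    """Run one independent chain per start and return (overall best, all results)."""
    results = []
    for i, start in enumerate(starts):
        rng = random.Random(seed + i)  # separate stream per chain
        results.append(run_chain(start, n_steps, step_size, rng))
    best = max(results, key=lambda r: r[1])
    return best, results


best, per_chain = run_parallel_chains([-6.0, -2.0, 2.0, 6.0])
```

A single chain started near -3 may spend a long time around the minor mode, whereas taking the maximum over dispersed chains makes it much more likely that the major mode is found.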


Science | 2014

Single-cell genomics reveals hundreds of coexisting subpopulations in wild Prochlorococcus.

Nadav Kashtan; Sara E. Roggensack; Sébastien Rodrigue; Jessie W. Thompson; Steven J. Biller; Allison Coe; Huiming Ding; Pekka Marttinen; Rex R. Malmstrom; Roman Stocker; Michael J. Follows; Ramunas Stepanauskas; Sallie W. Chisholm

Cyanobacterial Diversity

What does it mean to be a global species? The marine cyanobacterium Prochlorococcus is ubiquitous and, arguably, the most abundant and productive of all living organisms. Although to our eyes the seas look uniform, to a bacterium the ocean's bulk is a plethora of microhabitats, and by large-scale single-cell genomic analysis of uncultured cells, Kashtan et al. (p. 416; see the Perspective by Bowler and Scanlan) reveal that Prochlorococcus has diversified to match. This “species” constitutes a mass of subpopulations, each with million-year ancestry, that vary seasonally in abundance. The subpopulations in turn have clades nested within that show covariation between sets of core alleles and variable gene content, indicating flexibility of responses to rapid environmental changes. Large sets of coexisting populations could be a general feature of other free-living bacterial species living in highly mixed habitats. Covariation between the core alleles and flexible gene content of a marine cyanobacterium underpins vast diversity.

Extensive genomic diversity within coexisting members of a microbial species has been revealed through selected cultured isolates and metagenomic assemblies. Yet, the cell-by-cell genomic composition of wild uncultured populations of co-occurring cells is largely unknown. In this work, we applied large-scale single-cell genomics to study populations of the globally abundant marine cyanobacterium Prochlorococcus. We show that they are composed of hundreds of subpopulations with distinct “genomic backbones,” each backbone consisting of a different set of core gene alleles linked to a small distinctive set of flexible genes. These subpopulations are estimated to have diverged at least a few million years ago, suggesting ancient, stable niche partitioning. Such a large set of coexisting subpopulations may be a general feature of free-living bacterial species with huge populations in highly mixed habitats.


Nature Genetics | 2014

Dense genomic sampling identifies highways of pneumococcal recombination

Claire Chewapreecha; Simon R. Harris; Nicholas J. Croucher; Claudia Turner; Pekka Marttinen; Lu Cheng; Alberto Pessia; David M. Aanensen; Alison E. Mather; Andrew J. Page; Susannah J. Salter; David J. Harris; François Nosten; David Goldblatt; Jukka Corander; Julian Parkhill; Paul Turner; Stephen D. Bentley

Evasion of clinical interventions by Streptococcus pneumoniae occurs through selection of non-susceptible genomic variants. We report whole-genome sequencing of 3,085 pneumococcal carriage isolates from a 2.4-km² refugee camp. This sequencing provides unprecedented resolution of the process of recombination and its impact on population evolution. Genomic recombination hotspots show remarkable consistency between lineages, indicating common selective pressures acting at certain loci, particularly those associated with antibiotic resistance. Temporal changes in antibiotic consumption are reflected in changes in recombination trends, demonstrating rapid spread of resistance when selective pressure is high. The highest frequencies of receipt and donation of recombined DNA fragments were observed in non-encapsulated lineages, implying that this largely overlooked pneumococcal group, which is beyond the reach of current vaccines, may have a major role in genetic exchange and the adaptation of the species as a whole. These findings advance understanding of pneumococcal population dynamics and provide information for the design of future intervention strategies.


Nucleic Acids Research | 2012

Detection of recombination events in bacterial genomes from large population samples

Pekka Marttinen; William P. Hanage; Nicholas J. Croucher; Thomas Richard Connor; Simon R. Harris; Stephen D. Bentley; Jukka Corander

Analysis of important human pathogen populations is currently under transition toward whole-genome sequencing of growing numbers of samples collected on a global scale. Since recombination in bacteria is often an important factor shaping their evolution by enabling resistance elements and virulence traits to rapidly transfer from one evolutionary lineage to another, it is highly beneficial to have access to tools that can detect recombination events. Multiple advanced statistical methods exist for such purposes; however, they are typically limited either to only a few samples or to data from relatively short regions of a total genome. By harnessing the power of recent advances in Bayesian modeling techniques, we introduce here a method for detecting homologous recombination events from whole-genome sequence data for bacterial population samples on a large scale. Our statistical approach can efficiently handle hundreds of whole genome sequenced population samples and identify separate origins of the recombinant sequence, offering an enhanced insight into the diversification of bacterial clones at the level of the whole genome. A data set of 241 whole genome sequences from an important pandemic lineage of Streptococcus pneumoniae is used together with multiple simulated data sets to demonstrate the potential of our approach.


PLOS Genetics | 2014

Comprehensive Identification of Single Nucleotide Polymorphisms Associated with Beta-lactam Resistance within Pneumococcal Mosaic Genes

Claire Chewapreecha; Pekka Marttinen; Nicholas J. Croucher; Susannah J. Salter; Simon R. Harris; Alison E. Mather; William P. Hanage; David Goldblatt; François Nosten; Claudia Turner; Paul Turner; Stephen D. Bentley; Julian Parkhill

Traditional genetic association studies are very difficult in bacteria, as the generally limited recombination leads to large linked haplotype blocks, confounding the identification of causative variants. Beta-lactam antibiotic resistance in Streptococcus pneumoniae arises readily as the bacteria can quickly incorporate DNA fragments encompassing variants that make the transformed strains resistant. However, the causative mutations themselves are embedded within larger recombined blocks, and previous studies have only analysed a limited number of isolates, leading to the description of “mosaic genes” as being responsible for resistance. By comparing a large number of genomes of beta-lactam susceptible and non-susceptible strains, the high frequency of recombination should break up these haplotype blocks and allow the use of genetic association approaches to identify individual causative variants. Here, we performed a genome-wide association study to identify single nucleotide polymorphisms (SNPs) and indels that could confer beta-lactam non-susceptibility using 3,085 Thai and 616 USA pneumococcal isolates as independent datasets for the variant discovery. The large sample sizes allowed us to narrow the source of beta-lactam non-susceptibility from long recombinant fragments down to much smaller loci comprised of discrete or linked SNPs. While some loci appear to be universal resistance determinants, contributing equally to non-susceptibility for at least two classes of beta-lactam antibiotics, some play a larger role in resistance to particular antibiotics. All of the identified loci have a highly non-uniform distribution in the populations. They are enriched not only in vaccine-targeted, but also non-vaccine-targeted lineages, which may raise clinical concerns. Identification of single nucleotide polymorphisms underlying resistance will be essential for future use of genome sequencing to predict antibiotic sensitivity in clinical microbiology.


Genome Biology | 2012

Phylogeographic variation in recombination rates within a global clone of methicillin-resistant Staphylococcus aureus

Santiago Castillo-Ramírez; Jukka Corander; Pekka Marttinen; Mona Aldeljawi; William P. Hanage; Henrik Westh; Kit Boye; Zeynep Gülay; Stephen D. Bentley; Julian Parkhill; Matthew T. G. Holden; Edward J. Feil

Background: Next-generation sequencing (NGS) is a powerful tool for understanding both patterns of descent over time and space (phylogeography) and the molecular processes underpinning genome divergence in pathogenic bacteria. Here, we describe a synthesis between these perspectives by employing a recently developed Bayesian approach, BRATNextGen, for detecting recombination on an expanded NGS dataset of the globally disseminated methicillin-resistant Staphylococcus aureus (MRSA) clone ST239. Results: The data confirm strong geographical clustering at continental, national and city scales and demonstrate that the rate of recombination varies significantly between phylogeographic sub-groups representing independent introductions from Europe. These differences are most striking when mobile non-core genes are included, but remain apparent even when only considering the stable core genome. The monophyletic ST239 sub-group corresponding to isolates from South America shows heightened recombination, the sub-group predominantly from Asia shows an intermediate level, and a very low level of recombination is noted in a third sub-group representing a large collection from Turkey. Conclusions: We show that the rapid global dissemination of a single pathogenic bacterial clone results in local variation in measured recombination rates. Possible explanatory variables include the size and time since emergence of each defined sub-population (as determined by the sampling frame), variation in transmission dynamics due to host movement, and changes in the bacterial genome affecting the propensity for recombination.


Molecular Ecology | 2014

Cryptic ecology among host generalist Campylobacter jejuni in domestic animals

Samuel K. Sheppard; Lu Cheng; Guillaume Méric; Caroline P. A. de Haan; Ann-Katrin Llarena; Pekka Marttinen; Ana Vidal; A.M. Ridley; F. A. Clifton-Hadley; Thomas Richard Connor; Norval J. C. Strachan; Ken J. Forbes; Frances M. Colles; Keith A. Jolley; Stephen D. Bentley; Martin C. J. Maiden; Marja-Liisa Hänninen; Julian Parkhill; William P. Hanage; Jukka Corander

Homologous recombination between bacterial strains is theoretically capable of preventing the separation of daughter clusters, and producing cohesive clouds of genotypes in sequence space. However, numerous barriers to recombination are known. Barriers may be essential, such as adaptive incompatibility, or ecological, that is, associated with the opportunities for recombination in the natural habitat. Campylobacter jejuni is a gut colonizer of numerous animal species and a major human enteric pathogen. We demonstrate that the two major generalist lineages of C. jejuni do not show evidence of recombination with each other in nature, despite having a high degree of host niche overlap and recombining extensively with specialist lineages. However, transformation experiments show that the generalist lineages readily recombine with one another in vitro. This suggests ecological rather than essential barriers to recombination, caused by a cryptic niche structure within the hosts.


Nature Communications | 2016

Sequence element enrichment analysis to determine the genetic basis of bacterial phenotypes

John A. Lees; Minna Vehkala; Niko Välimäki; Simon R. Harris; Claire Chewapreecha; Nicholas J. Croucher; Pekka Marttinen; Mark R. Davies; Andrew C. Steer; Stephen Y.C. Tong; Antti Honkela; Julian Parkhill; Stephen D. Bentley; Jukka Corander

Bacterial genomes vary extensively in terms of both gene content and gene sequence. This plasticity hampers the use of traditional SNP-based methods for identifying all genetic associations with phenotypic variation. Here we introduce a computationally scalable and widely applicable statistical method (SEER) for the identification of sequence elements that are significantly enriched in a phenotype of interest. SEER is applicable to tens of thousands of genomes by counting variable-length k-mers using a distributed string-mining algorithm. Robust options are provided for association analysis that also correct for the clonal population structure of bacteria. Using large collections of genomes of the major human pathogens Streptococcus pneumoniae and Streptococcus pyogenes, SEER identifies relevant previously characterized resistance determinants for several antibiotics and discovers potential novel factors related to the invasiveness of S. pyogenes. We thus demonstrate that our method can answer important biologically and medically relevant questions.
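The notion of a sequence element being enriched in a phenotype can be made concrete with a small sketch. This is not SEER itself: it uses fixed-length k-mers rather than variable-length ones, a one-sided Fisher exact test with a Bonferroni correction as a stand-in association test, and no correction for population structure. The function names, the toy sequences, and the threshold are all assumptions made for illustration.

```python
from math import comb


def kmers(seq, k):
    """Set of all length-k substrings present in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: phenotype positive / negative. Columns: k-mer present / absent.
    Returns P(X >= a) under the hypergeometric null.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x)
               for x in range(a, upper + 1))
    return tail / comb(n, col1)


def enriched_kmers(genomes, phenotypes, k, alpha=0.05):
    """Return (k-mer, p-value) pairs enriched among phenotype-positive genomes."""
    presence = [kmers(g, k) for g in genomes]
    n_cases = sum(1 for y in phenotypes if y)
    n_controls = len(phenotypes) - n_cases
    all_kmers = set().union(*presence)
    hits = []
    for kmer in sorted(all_kmers):
        a = sum(1 for p, y in zip(presence, phenotypes) if y and kmer in p)
        c = sum(1 for p, y in zip(presence, phenotypes) if not y and kmer in p)
        pval = fisher_exact_greater(a, n_cases - a, c, n_controls - c)
        # Bonferroni correction over every k-mer tested.
        if pval * len(all_kmers) <= alpha:
            hits.append((kmer, pval))
    return hits


# Toy data (invented): six "resistant" genomes share an insertion that the
# six "susceptible" genomes lack.
toy_genomes = ["ACGTTTGACA"] * 6 + ["ACGTCAGACA"] * 6
toy_phenotypes = [1] * 6 + [0] * 6
toy_hits = enriched_kmers(toy_genomes, toy_phenotypes, k=4)
```

In real bacterial collections, identical genomes from one clone would inflate such a naive test, which is why the abstract stresses correcting for clonal population structure.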


Molecular Biology and Evolution | 2011

Reconstructing Population Histories from Single Nucleotide Polymorphism Data

Jukka Sirén; Pekka Marttinen; Jukka Corander

Population genetics encompasses a strong theoretical and applied research tradition on the multiple demographic processes that shape genetic variation present within a species. When several distinct populations exist in the current generation, it is often natural to consider the pattern of their divergence from a single ancestral population in terms of a binary tree structure. Inference about such population histories based on molecular data has been an intensive research topic in recent years. The most common approach uses coalescent theory to model genealogies of individuals sampled from the current populations. Such methods are able to compare several different evolutionary scenarios and to estimate demographic parameters. However, their major limitation is the enormous computational complexity associated with the indirect modeling of the demographies, which limits the application to small data sets. Here, we propose a novel Bayesian method for inferring population histories from unlinked single nucleotide polymorphisms, which is applicable also to data sets harboring large numbers of individuals from distinct populations. We use an approximation to the neutral Wright-Fisher diffusion to model random fluctuations in allele frequencies. The population histories are modeled as binary rooted trees that represent the historical order of divergence of the different populations. A combination of analytical, numerical, and Monte Carlo integration techniques is utilized for the inferences. A particularly important feature of our approach is that it provides intuitive measures of statistical uncertainty related to the estimates computed, which may be entirely lacking for the alternative methods in this context. The potential of our approach is illustrated by analyses of both simulated and real data sets.
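The random fluctuations in allele frequencies mentioned above can be illustrated by simulating neutral drift directly. The sketch below is not the paper's diffusion approximation or its inference method: it simulates the discrete neutral Wright-Fisher model by binomial resampling and compares the spread of final frequencies with the standard drift variance x0(1 - x0)[1 - (1 - 1/(2N))^t]. Function names and parameter values are assumptions chosen for illustration.

```python
import random


def wright_fisher_final_frequency(x0, n_diploid, generations, rng):
    """Simulate neutral drift and return the allele frequency after t generations.

    Each generation, 2N gene copies are drawn independently, each carrying
    the allele with probability equal to the current frequency.
    """
    copies = 2 * n_diploid
    x = x0
    for _ in range(generations):
        count = sum(1 for _ in range(copies) if rng.random() < x)
        x = count / copies
    return x


def expected_drift_variance(x0, n_diploid, generations):
    """Variance of the allele frequency around x0 after t generations of drift."""
    return x0 * (1.0 - x0) * (1.0 - (1.0 - 1.0 / (2 * n_diploid)) ** generations)


def simulate_drift(x0=0.5, n_diploid=50, generations=10, replicates=1000, seed=1):
    """Return (final frequencies, empirical mean squared deviation from x0)."""
    rng = random.Random(seed)
    finals = [wright_fisher_final_frequency(x0, n_diploid, generations, rng)
              for _ in range(replicates)]
    empirical = sum((x - x0) ** 2 for x in finals) / replicates
    return finals, empirical
```

Calling `simulate_drift()` and `expected_drift_variance(0.5, 50, 10)` side by side shows how closely the simulated spread follows the closed-form variance; smaller populations or more generations increase both.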

Collaboration


Top Co-Authors

Jussi Gillberg

Helsinki Institute for Information Technology

Julian Parkhill

Wellcome Trust Sanger Institute

Stephen D. Bentley

Wellcome Trust Sanger Institute
