Publication


Featured research published by Aaron Petkau.


Emerging Infectious Diseases | 2011

Comparative Genomics of Vibrio cholerae from Haiti, Asia, and Africa

Aleisha R. Reimer; Gary Van Domselaar; Steven Stroika; Matthew Walker; Heather Kent; Cheryl L. Tarr; Deborah F. Talkington; Lori A. Rowe; Melissa Olsen-Rasmussen; Michael Frace; Scott Sammons; Georges Dahourou; Jacques Boncy; Anthony M. Smith; Philip Mabon; Aaron Petkau; Morag Graham; Matthew W. Gilmour; Peter Gerner-Smidt

A strain from Haiti shares genetic ancestry with those from Asia and Africa.


mBio | 2013

Evolutionary dynamics of Vibrio cholerae O1 following a single-source introduction to Haiti

Lee S. Katz; Aaron Petkau; John Beaulaurier; Shaun Tyler; Elena S. Antonova; Maryann Turnsek; Yan Guo; Susana Wang; Ellen E. Paxinos; Fabini D. Orata; Lori Gladney; Steven Stroika; Jason P. Folster; Lori A. Rowe; Molly M. Freeman; Natalie Knox; Mike Frace; Jacques Boncy; Morag Graham; Brian K. Hammer; Yan Boucher; Ali Bashir; William P. Hanage; Gary Van Domselaar; Cheryl L. Tarr

ABSTRACT Prior to the epidemic that emerged in Haiti in October of 2010, cholera had not been documented in this country. After its introduction, a strain of Vibrio cholerae O1 spread rapidly throughout Haiti, where it caused over 600,000 cases of disease and >7,500 deaths in the first two years of the epidemic. We applied whole-genome sequencing to a temporal series of V. cholerae isolates from Haiti to gain insight into the mode and tempo of evolution in this isolated population of V. cholerae O1. Phylogenetic and Bayesian analyses supported the hypothesis that all isolates in the sample set diverged from a common ancestor within a time frame that is consistent with epidemiological observations. A pangenome analysis showed nearly homogeneous genomic content, with no evidence of gene acquisition among Haiti isolates. Nine nearly closed genomes assembled from continuous-long-read data showed evidence of genome rearrangements and supported the observation of no gene acquisition among isolates. Thus, intrinsic mutational processes can account for virtually all of the observed genetic polymorphism, with no demonstrable contribution from horizontal gene transfer (HGT). Consistent with this, the 12 Haiti isolates tested by laboratory HGT assays were severely impaired for transformation, although unlike previously characterized noncompetent V. cholerae isolates, each expressed hapR and possessed a functional quorum-sensing system. Continued monitoring of V. cholerae in Haiti will illuminate the processes influencing the origin and fate of genome variants, which will facilitate interpretation of genetic variation in future epidemics. IMPORTANCE Vibrio cholerae is the cause of substantial morbidity and mortality worldwide, with over three million cases of disease each year. An understanding of the mode and rate of evolutionary change is critical for proper interpretation of genome sequence data and attribution of outbreak sources. 
The Haiti epidemic provides an unprecedented opportunity to study an isolated, single-source outbreak of Vibrio cholerae O1 over an established time frame. By using multiple approaches to assay genetic variation, we found no evidence that the Haiti strain has acquired any genes by horizontal gene transfer, an observation that led us to discover that it is also poorly transformable. We have found no evidence that environmental strains have played a role in the evolution of the outbreak strain.


Bioinformatics | 2010

Interactive microbial genome visualization with GView

Aaron Petkau; Matthew Stuart-Edwards; Paul Stothard; Gary Van Domselaar

Summary: GView is a Java application for viewing and examining prokaryotic genomes in a circular or linear context. It accepts standard sequence file formats and an optional style specification file to generate customizable, publication quality genome maps in bitmap and scalable vector graphics formats. GView features an interactive pan-and-zoom interface, a command-line interface for incorporation in genome analysis pipelines, and a public Application Programming Interface for incorporation in other Java applications. Availability: GView is a freely available application licensed under the GNU Public License. The application, source code, documentation, file specifications, tutorials and image galleries are available at http://gview.ca.


Frontiers in Microbiology | 2017

A Comparative Analysis of the Lyve-SET Phylogenomics Pipeline for Genomic Epidemiology of Foodborne Pathogens

Lee S. Katz; Taylor Griswold; Darlene Wagner; Aaron Petkau; Cameron Sieffert; Gary Van Domselaar; Xiangyu Deng; Heather A. Carleton

Modern epidemiology of foodborne bacterial pathogens in industrialized countries relies increasingly on whole genome sequencing (WGS) techniques. As opposed to profiling techniques such as pulsed-field gel electrophoresis, WGS requires a variety of computational methods. Since 2013, United States agencies responsible for food safety including the CDC, FDA, and USDA, have been performing whole-genome sequencing (WGS) on all Listeria monocytogenes found in clinical, food, and environmental samples. Each year, more genomes of other foodborne pathogens such as Escherichia coli, Campylobacter jejuni, and Salmonella enterica are being sequenced. Comparing thousands of genomes across an entire species requires a fast method with coarse resolution; however, capturing the fine details of highly related isolates requires a computationally heavy and sophisticated algorithm. Most L. monocytogenes investigations employing WGS depend on being able to identify an outbreak clade whose inter-genomic distances are less than an empirically determined threshold. When the difference between a few single nucleotide polymorphisms (SNPs) can help distinguish between genomes that are likely outbreak-associated and those that are less likely to be associated, we require a fine-resolution method. To achieve this level of resolution, we have developed Lyve-SET, a high-quality SNP pipeline. We evaluated Lyve-SET by retrospectively investigating 12 outbreak data sets along with four other SNP pipelines that have been used in outbreak investigation or similar scenarios. To compare these pipelines, several distance and phylogeny-based comparison methods were applied, which collectively showed that multiple pipelines were able to identify most outbreak clusters and strains. 
Currently in the US PulseNet system, whole genome multi-locus sequence typing (wgMLST) is the preferred primary method for foodborne WGS cluster detection and outbreak investigation due to its ability to name standardized genomic profiles, its central database, and its ability to be run in a graphical user interface. However, creating a functional wgMLST scheme requires extended up-front development and subject-matter expertise. When a scheme does not exist or when the highest resolution is needed, SNP analysis is used. Using three Listeria outbreak data sets, we demonstrated the concordance between Lyve-SET SNP typing and wgMLST. Availability: Lyve-SET can be found at https://github.com/lskatz/Lyve-SET.
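
The idea of an outbreak clade defined by inter-genomic distances below a threshold can be illustrated with a small sketch. This is not Lyve-SET's algorithm: the function names, the toy sequences, and the use of single-linkage grouping are assumptions chosen for illustration, and real pipelines apply read mapping, site filtering, and phylogenetic inference before any distances are compared.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count sites that differ between two aligned sequences.

    Sites where either sequence is not an unambiguous base (gap, N)
    are ignored. Both sequences must have the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b)
               if x != y and x in "ACGT" and y in "ACGT")


def cluster_by_threshold(seqs, threshold):
    """Group isolates by single linkage (an illustrative assumption).

    Two isolates join the same cluster when their pairwise SNP
    distance is strictly less than `threshold`.
    """
    parent = {name: name for name in seqs}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy alignment, not real isolate data.
toy = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACGTTCGA",
    "far":  "TTTTACGG",
}
clusters = cluster_by_threshold(toy, threshold=2)
```

With single linkage, iso1 and iso3 end up together even though they differ at two sites, because each is within one SNP of iso2. Whether that chaining behaviour is desirable depends on the investigation, which is one reason the threshold is described above as empirically determined.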


bioRxiv | 2017

SNVPhyl: a single nucleotide variant phylogenomics pipeline for microbial genomic epidemiology

Aaron Petkau; Philip Mabon; Cameron Sieffert; Natalie Knox; Jennifer Cabral; Mariam Iskander; Mark Iskander; Kelly Weedmark; Rahat Zaheer; Lee S. Katz; Celine Nadon; Aleisha Reimer; Eduardo N. Taboada; Robert G. Beiko; William C. Hsiao; Fiona S. L. Brinkman; Morag Graham; Gary Van Domselaar

The recent widespread application of whole-genome sequencing (WGS) for microbial disease investigations has spurred the development of new bioinformatics tools, including a notable proliferation of phylogenomics pipelines designed for infectious disease surveillance and outbreak investigation. Transitioning the use of WGS data out of the research laboratory and into the front lines of surveillance and outbreak response requires user-friendly, reproducible and scalable pipelines that have been well validated. Single Nucleotide Variant Phylogenomics (SNVPhyl) is a bioinformatics pipeline for identifying high-quality single-nucleotide variants (SNVs) and constructing a whole-genome phylogeny from a collection of WGS reads and a reference genome. Individual pipeline components are integrated into the Galaxy bioinformatics framework, enabling data analysis in a user-friendly, reproducible and scalable environment. We show that SNVPhyl can detect SNVs with high sensitivity and specificity, and identify and remove regions of high SNV density (indicative of recombination). SNVPhyl is able to correctly distinguish outbreak from non-outbreak isolates across a range of variant-calling settings, sequencing-coverage thresholds or in the presence of contamination. SNVPhyl is available as a Galaxy workflow, Docker and virtual machine images, and a Unix-based command-line application. SNVPhyl is released under the Apache 2.0 license and available at http://snvphyl.readthedocs.io/ or at https://github.com/phac-nml/snvphyl-galaxy.
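
The step of identifying and removing regions of high SNV density can be sketched as a sliding-window filter. The code below is a minimal illustration, not SNVPhyl's implementation: the function names, the default window size, and the default threshold are assumptions, and the positions in the example are made up.

```python
from bisect import bisect_left


def high_density_regions(positions, window=500, threshold=2):
    """Return merged (start, end) intervals of dense SNVs.

    A region is flagged when at least `threshold` SNV positions fall
    inside a window of `window` bases starting at an SNV. Overlapping
    flagged windows are merged. Parameter values are illustrative.
    """
    pos = sorted(set(positions))
    regions = []
    for i, start in enumerate(pos):
        # Index one past the last SNV inside [start, start + window).
        j = bisect_left(pos, start + window, lo=i)
        if j - i >= threshold:
            end = pos[j - 1]
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


def filter_snvs(positions, window=500, threshold=2):
    """Drop SNV positions that fall inside any high-density region."""
    regions = high_density_regions(positions, window, threshold)
    return [p for p in sorted(set(positions))
            if not any(s <= p <= e for s, e in regions)]


# Hypothetical SNV coordinates on a reference genome.
snvs = [100, 120, 150, 5000, 9000, 9100, 20000]
dense = high_density_regions(snvs, window=200, threshold=3)
kept = filter_snvs(snvs, window=200, threshold=3)
```

In this example the three SNVs between positions 100 and 150 are treated as a possible recombinant region and excluded, while the isolated SNVs are kept for phylogeny construction.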


Clinical Microbiology Reviews | 2016

A Primer on Infectious Disease Bacterial Genomics

Tarah Lynch; Aaron Petkau; Natalie Knox; Morag Graham; Gary Van Domselaar

SUMMARY The number of large-scale genomics projects is increasing due to the availability of affordable high-throughput sequencing (HTS) technologies. The use of HTS for bacterial infectious disease research is attractive because one whole-genome sequencing (WGS) run can replace multiple assays for bacterial typing, molecular epidemiology investigations, and more in-depth pathogenomic studies. The computational resources and bioinformatics expertise required to accommodate and analyze the large amounts of data pose new challenges for researchers embarking on genomics projects for the first time. Here, we present a comprehensive overview of a bacterial genomics project from beginning to end, with a particular focus on the planning and computational requirements for HTS data, and provide a general understanding of the analytical concepts to develop a workflow that will meet the objectives and goals of HTS projects.


BMC Genomics | 2016

Genomic insights from whole genome sequencing of four clonal outbreak Campylobacter jejuni assessed within the global C. jejuni population

Clifford G. Clark; Chrystal Berry; Matthew Walker; Aaron Petkau; Dillon O. R. Barker; Cai Guan; Aleisha Reimer; Eduardo N. Taboada

Background: Whole genome sequencing (WGS) is useful for determining clusters of human cases, investigating outbreaks, and defining the population genetics of bacteria. It also provides information about other aspects of bacterial biology, including classical typing results, virulence, and adaptive strategies of the organism. Cell culture invasion and protein expression patterns of four related multilocus sequence type 21 (ST21) C. jejuni isolates from a significant Canadian water-borne outbreak were previously associated with the presence of a CJIE1 prophage. Whole genome sequencing was used to examine the genetic diversity among these isolates and confirm that previous observations could be attributed to differential prophage carriage. Moreover, we sought to determine the presence of genome sequences that could be used as surrogate markers to delineate outbreak-associated isolates. Results: Differential carriage of the CJIE1 prophage was identified as the major genetic difference among the four outbreak isolates. High quality single-nucleotide variant (hqSNV) and core genome multilocus sequence typing (cgMLST) clustered these isolates within expanded datasets consisting of additional C. jejuni strains. The number and location of homopolymeric tract regions was identical in all four outbreak isolates but differed from all other C. jejuni examined. Comparative genomics and PCR amplification enabled the identification of large chromosomal inversions of approximately 93 kb and 388 kb within the outbreak isolates associated with transducer-like proteins containing long nucleotide repeat sequences. 
The 93-kb inversion was characteristic of the outbreak-associated isolates, and the gene content of this inverted region displayed high synteny with the reference strain. Conclusions: The four outbreak isolates were clonally derived and differed mainly in the presence of the CJIE1 prophage, validating earlier findings linking the prophage to phenotypic differences in virulence assays and protein expression. The identification of large, genetically syntenous chromosomal inversions in the genomes of outbreak-associated isolates provided a unique method for discriminating outbreak isolates from the background population. Transducer-like proteins appear to be associated with the chromosomal inversions. CgMLST and hqSNV analysis also effectively delineated the outbreak isolates within the larger C. jejuni population structure.


Bioinformatics | 2016

Sequence database versioning for command line and Galaxy bioinformatics servers

Damion Dooley; Aaron Petkau; Gary Van Domselaar; William W. L. Hsiao

Motivation: There are various reasons for rerunning bioinformatics tools and pipelines on sequencing data, including reproducing a past result, validation of a new tool or workflow using a known dataset, or tracking the impact of database changes. For identical results to be achieved, regularly updated reference sequence databases must be versioned and archived. Database administrators have tried to fill the requirements by supplying users with one-off versions of databases, but these are time consuming to set up and are inconsistent across resources. Disk storage and data backup performance has also discouraged maintaining multiple versions of databases since databases such as NCBI nr can consume 50 Gb or more disk space per version, with growth rates that parallel Moore's law. Results: Our end-to-end solution combines our own Kipper software package (a simple key-value large file versioning system) with BioMAJ (software for downloading sequence databases), and Galaxy (a web-based bioinformatics data processing platform). Available versions of databases can be recalled and used by command-line and Galaxy users. The Kipper data store format makes publishing curated FASTA databases convenient since in most cases it can store a range of versions into a file marginally larger than the size of the latest version. Availability and implementation: Kipper v1.0.0 and the Galaxy Versioned Data tool are written in Python and released as free and open source software available at https://github.com/Public-Health-Bioinformatics/kipper and https://github.com/Public-Health-Bioinformatics/versioned_data, respectively; detailed setup instructions can be found at https://github.com/Public-Health-Bioinformatics/versioned_data/blob/master/doc/setup.md. Supplementary information: Supplementary data are available at Bioinformatics online.
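
To show why a key-value versioning store can hold many versions in a file only marginally larger than the latest one, here is a minimal in-memory sketch. It is an assumption-laden illustration and not Kipper's file format or API: the class name, the method names, and the row layout (created version, deleted version, key, value) are hypothetical.

```python
class VersionedStore:
    """Toy key-value store that keeps every committed version.

    Each row records the version in which a key/value pair appeared
    and, if applicable, the version in which it was removed or
    replaced. Unchanged records are stored only once.
    """

    def __init__(self):
        self.rows = []      # [created, deleted_or_None, key, value]
        self.version = 0

    def commit(self, snapshot):
        """Store `snapshot` (dict of key -> value) as the next version."""
        self.version += 1
        v = self.version
        live = {row[2]: row for row in self.rows if row[1] is None}
        # Close out records that vanished or changed in this version.
        for key, row in live.items():
            if key not in snapshot or snapshot[key] != row[3]:
                row[1] = v
        # Add records that are new or changed in this version.
        for key, value in snapshot.items():
            if key not in live or live[key][3] != value:
                self.rows.append([v, None, key, value])
        return v

    def checkout(self, v):
        """Rebuild the full key -> value mapping as of version `v`."""
        return {key: value
                for created, deleted, key, value in self.rows
                if created <= v and (deleted is None or deleted > v)}


# Hypothetical FASTA-like records keyed by sequence identifier.
store = VersionedStore()
v1 = store.commit({"seqA": "ACGT", "seqB": "TTGA"})
v2 = store.commit({"seqA": "ACGA", "seqC": "GGCC"})
v3 = store.commit({"seqA": "ACGA", "seqC": "GGCC"})
```

Committing an unchanged database (version 3 here) adds no rows, so storage grows with the number of changed records rather than with the number of versions.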


bioRxiv | 2018

The Integrated Rapid Infectious Disease Analysis (IRIDA) Platform

Thomas Matthews; Franklin Bristow; Emma J. Griffiths; Aaron Petkau; Josh Adam; Damion Dooley; Peter Kruczkiewicz; John Curatcha; Jennifer Cabral; Dan Fornika; Geoffrey L. Winsor; Mélanie Courtot; Claire Bertelli; Ataollah Roudgar; Pedro Feijao; Philip Mabon; Eric Enns; Joel Thiessen; Alexander Keddy; Judith L. Isaac-Renton; Jennifer L. Gardy; Patrick Tang; João A. Carriço; Leonid Chindelevitch; Cedric Chauve; Morag Graham; Andrew G. McArthur; Eduardo N. Taboada; Robert G. Beiko; Fiona S. L. Brinkman

Whole genome sequencing (WGS) is a powerful tool for public health infectious disease investigations owing to its higher resolution, greater efficiency, and cost-effectiveness over traditional genotyping methods. Implementation of WGS in routine public health microbiology laboratories is impeded by a lack of user-friendly automated and semi-automated pipelines, restrictive jurisdictional data sharing policies, and the proliferation of non-interoperable analytical and reporting systems. To address these issues, we developed the Integrated Rapid Infectious Disease Analysis (IRIDA) platform (irida.ca), a user-friendly, decentralized, open-source bioinformatics and analytical web platform to support real-time infectious disease outbreak investigations using WGS data. Instances can be independently installed on local high-performance computing infrastructure, enabling private and secure data management and analyses according to organizational policies and governance. IRIDA’s data management capabilities enable secure upload, storage and sharing of all WGS data and metadata. The core platform currently includes pipelines for quality control, assembly, annotation, variant detection, phylogenetic analysis, in silico serotyping, multi-locus sequence typing, and genome distance calculation. Analysis pipeline results can be visualized within the platform through dynamic line lists and integrated phylogenomic clustering for research and discovery, and for enhancing decision-making support and hypothesis generation in epidemiological investigations. Communication and data exchange between instances are provided through customizable access controls. IRIDA complements centralized systems, empowering local analytics and visualizations for genomics-based microbial pathogen investigations. IRIDA is currently transforming the Canadian public health ecosystem and is freely available at https://github.com/phac-nml/irida and www.irida.ca. 
Impact Statement Whole genome sequencing (WGS) is revolutionizing infectious disease analysis and surveillance due to its cost effectiveness, utility, and improved analytical power. To date, no “one-size-fits-all” genomics platform has been universally adopted, owing to differences in national (and regional) health information systems, data sharing policies, computational infrastructures, lack of interoperability and prohibitive costs. The Integrated Rapid Infectious Disease Analysis (IRIDA) platform is a user-friendly, decentralized, open-source bioinformatics and analytical web platform developed to support real-time infectious disease outbreak investigations using WGS data. IRIDA empowers public health, regulatory and clinical microbiology laboratory personnel to better incorporate WGS technology into routine operations by shielding them from the computational and analytical complexities of big data genomics. IRIDA is now routinely used as part of a validated suite of tools to support outbreak investigations in Canada. While IRIDA was designed to serve the needs of the Canadian public health system, it is generally applicable to any public health and multi-jurisdictional environment. IRIDA enables localized analyses but provides mechanisms and standard outputs to enable data sharing. This approach can help overcome pervasive challenges in real-time global infectious disease surveillance, investigation and control, resulting in faster responses, and ultimately, better public health outcomes. DATA SUMMARY Data used to generate some of the figures in this manuscript can be found in the NCBI BioProject PRJNA305824.


F1000Research | 2016

Outbreak surveillance and investigation using IRIDA and SNVPhyl

Aaron Petkau; Franklin Bristow; Thomas Matthews; Josh Adam; Philip Mabon; Cameron Sieffert; Eric Enns; Jennifer Cabral; Joel Thiessen; Natalie Knox; Damion Dooley; Aleisha Reimer; Eduardo N. Taboada; Alex Keddy; Robert G. Beiko; William C. Hsiao; Morag Graham; Gary Van Domselaar; Fiona S. L. Brinkman

Collaboration


Dive into Aaron Petkau's collaborations.

Top Co-Authors

Gary Van Domselaar, Public Health Agency of Canada
Morag Graham, Public Health Agency of Canada
Damion Dooley, University of British Columbia
Eduardo N. Taboada, Public Health Agency of Canada
Natalie Knox, Public Health Agency of Canada
Philip Mabon, Public Health Agency of Canada
Aleisha Reimer, Public Health Agency of Canada
Cameron Sieffert, Public Health Agency of Canada