Art F. Y. Poon
University of Western Ontario
Publication
Featured research published by Art F. Y. Poon.
Journal of Virology | 2009
Samantha Lycett; Melissa J. Ward; Fraser Lewis; Art F. Y. Poon; S. L. Kosakovsky Pond; A. J. Leigh Brown
Highly pathogenic avian influenza (HPAI) virus H5N1 infects water and land fowl and can infect and cause mortality in mammals, including humans. However, HPAI H5N1 strains are not equally virulent in mammals, and some strains have been shown to cause only mild symptoms in experimental infections. Since most experimental studies of the basis of virulence in mammals have been small in scale, we undertook a meta-analysis of available experimental studies and used Bayesian graphical models (BGM) to increase the power of inference. We applied text-mining techniques to identify 27 individual studies that experimentally determined pathogenicity in HPAI H5N1 strains comprising 69 complete genome sequences. Amino acid sequence data in all 11 genes were coded as binary data for the presence or absence of mutations related to virulence in mammals or nonconsensus residues. Sites previously implicated as virulence determinants were examined for association with virulence in mammals in this data set, and the sites with the most significant association were selected for further BGM analysis. The analyses show that virulence in mammals is a complex genetic trait directly influenced by mutations in polymerase basic 1 (PB1) and PB2, nonstructural 1 (NS1), and hemagglutinin (HA) genes. Several intra- and intersegment correlations were also found, and we postulate that there may be two separate virulence mechanisms involving particular combinations of polymerase and NS1 mutations or of NS1 and HA mutations.
Molecular Biology and Evolution | 2017
Oliver Ratmann; Emma B. Hodcroft; Michael Pickles; Anne Cori; Matthew Hall; Samantha Lycett; Caroline Colijn; Bethany Lorna Dearlove; Xavier Didelot; Simon D. W. Frost; As Md Mukarram Hossain; Jeffrey B. Joy; Michelle Kendall; Denise Kühnert; Gabriel E. Leventhal; Richard H. Liang; Giacomo Plazzotta; Art F. Y. Poon; David A. Rasmussen; Tanja Stadler; Erik M. Volz; Caroline Weis; Andrew J. Brown; Christophe Fraser
Viral phylogenetic methods contribute to understanding how HIV spreads in populations, and thereby help guide the design of prevention interventions. So far, most analyses have been applied to well-sampled concentrated HIV-1 epidemics in wealthy countries. To direct the use of phylogenetic tools to where the impact of HIV-1 is greatest, the Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium generates full-genome viral sequences from across sub-Saharan Africa. Analyzing these data presents new challenges, since epidemics are principally driven by heterosexual transmission and a smaller fraction of cases is sampled. Here, we show that viral phylogenetic tools can be adapted and used to estimate epidemiological quantities of central importance to HIV-1 prevention in sub-Saharan Africa. We used a community-wide methods comparison exercise on simulated data, where participants were blinded to the true dynamics they were inferring. Two distinct simulations captured generalized HIV-1 epidemics, before and after a large community-level intervention that reduced infection levels. Five research groups participated. Structured coalescent modeling approaches were most successful: phylogenetic estimates of HIV-1 incidence, incidence reductions, and the proportion of transmissions from individuals in their first 3 months of infection correlated with the true values (Pearson correlation > 90%), with small bias. However, on some simulations, true values were markedly outside reported confidence or credibility intervals. The blinded comparison revealed current limits and strengths in using HIV phylogenetics in challenging settings, provided benchmarks for future methods’ development, and supports using the latest generation of phylogenetic tools to advance HIV surveillance and prevention.
Virus Evolution | 2016
Art F. Y. Poon
For infectious diseases, a genetic cluster is a group of closely related infections that is usually interpreted as representing a recent outbreak of transmission. Genetic clustering methods are becoming increasingly popular for molecular epidemiology, especially in the context of HIV where there is now considerable interest in applying these methods to prioritize groups for public health resources such as pre-exposure prophylaxis. To date, genetic clustering has generally been performed with ad hoc algorithms, only some of which have since been encoded and distributed as free software. These algorithms have seldom been validated on simulated data where clusters are known, and their interpretation and similarities are not transparent to users outside of the field. Here, I provide a brief overview on the development and inter-relationships of genetic clustering methods, and an evaluation of six methods on data simulated under an epidemic model in a risk-structured population. The simulation analysis demonstrates that the majority of clustering methods are systematically biased to detect variation in sampling rates among subpopulations, not variation in transmission rates. I discuss these results in the context of previous work and the implications for public health applications of genetic clustering.
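As a rough illustration of what a nonparametric genetic clustering method can look like, the sketch below groups aligned sequences whose pairwise distance falls at or below a cutoff, using single linkage. The abstract does not name the six methods that were evaluated, so this should not be read as any one of them; the function names, the simple proportion-of-differences distance, and the example cutoff are all assumptions made for illustration.

```python
def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def threshold_clusters(seqs, cutoff):
    """Single-linkage clusters of sequences with pairwise distance <= cutoff.

    `seqs` maps a (hypothetical) sample name to its aligned sequence.
    Only clusters with two or more members are returned.
    """
    names = list(seqs)
    parent = {n: n for n in names}  # union-find forest

    def find(n):
        # Follow parent links to the root, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if p_distance(seqs[a], seqs[b]) <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), set()).add(n)
    return [g for g in groups.values() if len(g) > 1]


example = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAA",  # one difference from s1
    "s3": "TTTTTTTTTT",  # distant from both
}
clusters = threshold_clusters(example, cutoff=0.1)
```

A rule of this kind responds only to how similar the sampled sequences are. It has no information about why they are similar, which is one way to see how a clustering method could pick up variation in sampling rates as well as variation in transmission rates.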
PLOS Computational Biology | 2017
Rosemary M. McCloskey; Art F. Y. Poon
Clustering infections by genetic similarity is a popular technique for identifying potential outbreaks of infectious disease, in part because sequences are now routinely collected for clinical management of many infections. A diverse number of nonparametric clustering methods have been developed for this purpose. These methods are generally intuitive, rapid to compute, and readily scale with large data sets. However, we have found that nonparametric clustering methods can be biased towards identifying clusters of diagnosis—where individuals are sampled sooner post-infection—rather than the clusters of rapid transmission that are meant to be potential foci for public health efforts. We develop a fundamentally new approach to genetic clustering based on fitting a Markov-modulated Poisson process (MMPP), which represents the evolution of transmission rates along the tree relating different infections. We evaluated this model-based method alongside five nonparametric clustering methods using both simulated and actual HIV sequence data sets. For simulated clusters of rapid transmission, the MMPP clustering method obtained higher mean sensitivity (85%) and specificity (91%) than the nonparametric methods. When we applied these clustering methods to published sequences from a study of HIV-1 genetic clusters in Seattle, USA, we found that the MMPP method categorized about half (46%) as many individuals to clusters compared to the other methods. Furthermore, the mean internal branch lengths that approximate transmission rates were significantly shorter in clusters extracted using MMPP, but not by other methods. We determined that the computing time for the MMPP method scaled linearly with the size of trees, requiring about 30 seconds for a tree of 1,000 tips and about 20 minutes for 50,000 tips on a single computer. 
This new approach to genetic clustering has significant implications for the application of pathogen sequence analysis to public health, where it is critical to robustly and accurately identify clusters for the most cost-effective deployment of outbreak management and prevention resources.
Virus Research | 2017
Chanson J. Brumme; Art F. Y. Poon
Genetic sequencing (genotyping) plays a critical role in the modern clinical management of HIV infection. This virus evolves rapidly within patients because of its error-prone reverse transcriptase and short generation time. Consequently, HIV variants with mutations that confer resistance to one or more antiretroviral drugs can emerge during sub-optimal treatment. There are now multiple HIV drug resistance interpretation algorithms that take the region of the HIV genome encoding the major drug targets as inputs; expert use of these algorithms can significantly improve clinical outcomes in HIV treatment. Next-generation sequencing has the potential to revolutionize HIV resistance genotyping by lowering the threshold at which rare but clinically significant HIV variants can be detected reproducibly, and by conferring improved cost-effectiveness in high-throughput scenarios. In this review, we discuss the relative merits and challenges of deploying the Illumina MiSeq instrument for clinical HIV genotyping.
Archive | 2019
Mariano Avino; Art F. Y. Poon
The comparative study of homologous proteins can provide abundant information about the functional and structural constraints on protein evolution. For example, an amino acid substitution that is deleterious may become permissive in the presence of another substitution at a second site of the protein. A popular approach for detecting coevolving residues is by looking for correlated substitution events on branches of the molecular phylogeny relating the protein-coding sequences. Here we describe a machine learning method (Bayesian graphical models) implemented in the open-source phylogenetic software package HyPhy, http://hyphy.org , for extracting a network of coevolving residues from a sequence alignment.
bioRxiv | 2018
Abayomi S Olabode; Mariano Avino; Tammy Ng; Faisal Abu-Sardanah; David W Dick; Art F. Y. Poon
Reconstructing the early dynamics of the HIV-1 pandemic can provide crucial insights into the socioeconomic drivers of emerging infectious diseases in human populations, including the roles of urbanization and transportation networks. Current evidence indicates that the global pandemic, which consists almost entirely of HIV-1/M, originated around the 1920s in central Africa. However, these estimates are based on molecular clock estimates that are assumed to apply uniformly across the virus genome. There is growing evidence that recombination has played a significant role in the early history of the HIV-1 pandemic, such that different regions of the HIV-1 genome have different evolutionary histories. In this study, we have conducted a dated-tip analysis of all near full-length HIV-1/M genome sequences that were published in the GenBank database. We used a sliding window approach similar to the ‘bootscanning’ method for detecting breakpoints in intersubtype recombinant sequences. We found evidence of substantial variation in estimated root dates among windows, with an estimated mean time to the most recent common ancestor (tMRCA) of 1922. Estimates were significantly autocorrelated, which was more consistent with an early recombination event than with stochastic error variation in phylogenetic reconstruction and dating analyses. A piecewise regression analysis supported the existence of at least one recombination breakpoint in the HIV-1/M genome with interval-specific means around 1929 and 1913, respectively. This analysis demonstrates that a sliding window approach can accommodate early recombination events outside the established nomenclature of HIV-1/M subtypes, although it is difficult to incorporate the earliest available samples due to their limited genome coverage.
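The sliding window step described in this abstract could be sketched roughly as follows. This is only an illustration: the window width, step size, function names, and toy alignment are hypothetical, and the abstract does not state the settings the authors used. Each sub-alignment would then go to a separate tree-building and dating analysis, which is not shown here.

```python
def sliding_windows(length, width, step):
    """Yield half-open (start, end) column ranges that tile an alignment."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    # The last window starts no later than length - width, so every
    # window is full width whenever the alignment is at least that long.
    for start in range(0, max(length - width, 0) + 1, step):
        yield start, min(start + width, length)


def window_alignment(alignment, width, step):
    """Yield ((start, end), sub_alignment) for each window.

    `alignment` maps a sequence name to an aligned sequence; all
    sequences are assumed to have the same length.
    """
    length = len(next(iter(alignment.values())))
    for start, end in sliding_windows(length, width, step):
        yield (start, end), {name: seq[start:end] for name, seq in alignment.items()}


toy = {"a": "ACGTACGTAC", "b": "ACGTTCGTAC"}
windows = list(window_alignment(toy, width=4, step=3))
```

Because neighbouring windows overlap, estimates from adjacent windows share data, which is worth keeping in mind when interpreting autocorrelation among window-specific results.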
bioRxiv | 2018
Jasper C Ho; Garway T Ng; Mathias S Renaud; Art F. Y. Poon
Genotypic resistance interpretation systems for the prediction and interpretation of HIV-1 antiretroviral resistance are an important part of the clinical management of HIV-1 infection. Current interpretation systems are generally hosted on remote webservers that enable clinical laboratories to generate resistance predictions easily and quickly from patient HIV-1 sequences encoding the primary targets of modern antiretroviral therapy. However, they also potentially compromise a health provider’s ethical, professional, and legal obligations to data security, patient information confidentiality, and data provenance. Furthermore, reliance on web-based algorithms makes the clinical management of HIV-1 dependent on a network connection. Here, we describe the development and validation of sierra-local, an open-source implementation of the Stanford HIVdb genotypic resistance interpretation system for local execution, which aims to resolve the ethical, legal, and infrastructure issues associated with remote computing. This package reproduces the HIV-1 resistance scoring by the web-based Stanford HIVdb algorithm with a high degree of concordance (99.997%) and a higher level of performance than current methods of accessing HIVdb programmatically.
bioRxiv | 2018
Stephen Solis-Reyes; Mariano Avino; Art F. Y. Poon; Lila Kari
For many disease-causing virus species, global diversity is clustered into a taxonomy of subtypes with clinical significance. In particular, the classification of infections among the subtypes of human immunodeficiency virus type 1 (HIV-1) is a routine component of clinical management, and there are now many classification algorithms available for this purpose. Although several of these algorithms are similar in accuracy and speed, the majority are proprietary and require laboratories to transmit HIV-1 sequence data over the network to remote servers. This potentially exposes sensitive patient data to unauthorized access, and makes it impossible to determine how classifications are made and to maintain the data provenance of clinical bioinformatic workflows. We propose an open-source supervised and alignment-free subtyping method (Kameris) that operates on k-mer frequencies in HIV-1 sequences. We performed a detailed study of the accuracy and performance of subtype classification in comparison to four state-of-the-art programs. Based on our testing data set of manually curated real-world HIV-1 sequences (n = 2,784), Kameris obtained an overall accuracy of 97%, which matches or exceeds all other tested software, with a processing rate of over 1,500 sequences per second. Furthermore, our fully standalone general-purpose software provides key advantages in terms of data security and privacy, transparency and reproducibility. Finally, we show that our method is readily adaptable to subtype classification of other viruses including dengue, influenza A, and hepatitis B and C virus.
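To make the idea of alignment-free k-mer features concrete, here is a minimal sketch that turns a nucleotide sequence into a fixed-length vector of k-mer frequencies. It is not the Kameris implementation: the function name, the default k, and the choice to skip k-mers containing ambiguous bases are assumptions made for illustration. A supervised classifier would then be trained on such vectors, and that step is omitted.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGT"):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Windows containing characters outside `alphabet` (for example N)
    are ignored, so the frequencies sum to 1 over valid k-mers only.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        # Sequence too short or entirely ambiguous: return a zero vector.
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


freqs = kmer_frequencies("ACGT", k=2)
```

Since every sequence maps to a vector of the same length (4^k for nucleotides), sequences of different lengths can be compared without first aligning them.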
bioRxiv | 2018
Mariano Avino; Garway T Ng; YiYing He; Mathias S Renaud; Bradley R. Jones; Art F. Y. Poon
Cophylogeny is the congruence of phylogenetic relationships between two different groups of organisms due to their long-term interaction, such as between host and pathogen species. Discordance between host and pathogen phylogenies may occur due to pathogen host-switch events, pathogen speciation within a host species, and extinction. Here, we investigated the use of tree shape distance measures to quantify the degree of cophylogeny for the comparative analysis of host-pathogen interactions across taxonomic groups. We first implemented a coalescent model to simulate pathogen phylogenies within a fixed host tree, given the cospeciation probability, migration rate between hosts, and pathogen speciation rate within hosts. Next, we used simulations from this model to evaluate 13 distance metrics between these trees and the host tree, including Robinson-Foulds distance and two kernel distances that we developed for labeled and unlabeled trees, which use branch lengths and can accommodate trees of different sizes. Finally, we used these distance metrics to revisit actual datasets from published cophylogenetic studies across all taxonomic groups, where authors described the observed associations as representing a high or low degree of cophylogeny. Our simulation analyses demonstrated that some metrics are more informative than others with respect to specific coevolution parameters. For example, the Sim metric was the most responsive to variation in coalescence rates, whereas the unlabeled kernel metric was the most responsive to cospeciation probabilities. We also determined that distance metrics were more informative about the model parameters when the underlying parameter values did not assume extreme values, e.g., rapid host switching. When applied to real datasets, projection of these trees’ associations into a parameter space defined by the 13 distance metrics revealed some clustering of studies reporting low concordance.
This suggested that different investigators are describing concordance in a consistent way across biological systems, and that these expert subjective assessments can be at least partly quantified using distance metrics. Our results support the hypothesis that tree distance measures can be useful for quantifying host and pathogen cophylogeny. This motivates the usage of distance metrics in the field of coevolution and supports the development of simulation-based methods, i.e., approximate Bayesian computation, to estimate coevolutionary parameters from the discordant shapes of host and pathogen trees. [tree shape; cophylogeny; codivergence; coevolution; host switching; tree metrics; kernel]
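Of the distance metrics named in this abstract, the Robinson-Foulds distance is the simplest to illustrate. The sketch below computes a rooted variant that counts the clades (leaf sets under internal nodes) found in one tree but not the other. It assumes both trees carry the same leaf labels and are written as nested tuples, which is a representation chosen here for brevity. It ignores branch lengths and is not the authors' code.

```python
def clades(tree):
    """Collect the leaf set under every internal node of a nested-tuple tree.

    Leaves are strings; internal nodes are tuples of child subtrees.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Rooted Robinson-Foulds distance: number of clades unique to either tree."""
    return len(clades(tree_a) ^ clades(tree_b))


host = (("A", "B"), ("C", "D"))
pathogen = (("A", "C"), ("B", "D"))
distance = rf_distance(host, pathogen)
```

In the toy example the two trees share only the root clade, so each contributes two unmatched clades. A metric of this kind requires matching labels on both trees, which is one reason distances that work on unlabeled trees or trees of different sizes are of interest in this setting.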