Michael F. Chou
Harvard University
Publication
Featured research published by Michael F. Chou.
The Lancet | 2010
Euan A. Ashley; Atul J. Butte; Matthew T. Wheeler; Rong Chen; Teri E. Klein; Frederick E. Dewey; Joel T. Dudley; Kelly E. Ormond; Aleksandra Pavlovic; Alexander A. Morgan; Dmitry Pushkarev; Norma F. Neff; Louanne Hudgins; Li Gong; Laura M. Hodges; Dorit S. Berlin; Caroline F. Thorn; Joan M. Hebert; Mark Woon; Hersh Sagreiya; Ryan Whaley; Joshua W. Knowles; Michael F. Chou; Joseph V. Thakuria; Abraham M. Rosenbaum; Alexander Wait Zaranek; George M. Church; Henry T. Greely; Stephen R. Quake; Russ B. Altman
BACKGROUND The cost of genomic information has fallen steeply, but the clinical translation of genetic risk estimates remains unclear. We aimed to undertake an integrated analysis of a complete human genome in a clinical context. METHODS We assessed a patient with a family history of vascular disease and early sudden death. Clinical assessment included analysis of this patient's full genome sequence, risk prediction for coronary artery disease, screening for causes of sudden cardiac death, and genetic counselling. Genetic analysis included the development of novel methods for the integration of whole genome and clinical risk. Disease and risk analysis focused on prediction of genetic risk of variants associated with mendelian disease, recognised drug responses, and pathogenicity for novel variants. We queried disease-specific mutation databases and pharmacogenomics databases to identify genes and mutations with known associations with disease and drug response. We estimated post-test probabilities of disease by applying likelihood ratios derived from integration of multiple common variants to age-appropriate and sex-appropriate pre-test probabilities. We also accounted for gene-environment interactions and conditionally dependent risks. FINDINGS Analysis of 2.6 million single nucleotide polymorphisms and 752 copy number variations showed increased genetic risk for myocardial infarction, type 2 diabetes, and some cancers. We discovered rare variants in three genes that are clinically associated with sudden cardiac death: TMEM43, DSP, and MYBPC3. A variant in LPA was consistent with a family history of coronary artery disease. The patient had a heterozygous null mutation in CYP2C19 suggesting probable clopidogrel resistance, several variants associated with a positive response to lipid-lowering therapy, and variants in CYP4F2 and VKORC1 that suggest he might have a low initial dosing requirement for warfarin. Many variants of uncertain importance were reported.
INTERPRETATION Although challenges remain, our results suggest that whole-genome sequencing can yield useful and clinically relevant information for individual patients. FUNDING National Institute of General Medical Sciences; National Heart, Lung, and Blood Institute; National Human Genome Research Institute; Howard Hughes Medical Institute; National Library of Medicine; Lucile Packard Foundation for Children's Health; Hewlett Packard Foundation; Breetwor Family Foundation.
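The post-test probability step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the likelihood ratios are independent and can simply be multiplied, which is a simplification: the study itself reports accounting for conditionally dependent risks. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
from math import prod


def post_test_probability(pre_test_prob, likelihood_ratios):
    """Convert a pre-test probability into a post-test probability.

    Assumption of this sketch: the likelihood ratios are independent,
    so the combined ratio is their product.
    """
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    # Probability -> odds, scale by the combined likelihood ratio, odds -> probability.
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * prod(likelihood_ratios)
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical values for illustration only.
print(post_test_probability(0.10, [1.2, 0.9, 1.5]))
```

Working in odds rather than probabilities keeps the update a single multiplication per variant, and an empty list of ratios leaves the pre-test probability unchanged.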
Proceedings of the National Academy of Sciences of the United States of America | 2010
Sladjana Prisic; Selasi Dankwa; Daniel K. Schwartz; Michael F. Chou; Jason W. Locasale; Choong-Min Kang; Guy Bemis; George M. Church; Hanno Steen; Robert N. Husson
The Mycobacterium tuberculosis genome encodes 11 serine/threonine protein kinases (STPKs) that are structurally related to eukaryotic kinases. To gain insight into the role of Ser/Thr phosphorylation in this major global pathogen, we used a phosphoproteomic approach to carry out an extensive analysis of protein phosphorylation in M. tuberculosis. We identified more than 500 phosphorylation events in 301 proteins that are involved in a broad range of functions. Bioinformatic analysis of quantitative in vitro kinase assays on peptides containing a subset of these phosphorylation sites revealed a dominant motif shared by six of the M. tuberculosis STPKs. Kinase assays on a second set of peptides incorporating targeted substitutions surrounding the phosphoacceptor validated this motif and identified additional residues preferred by individual kinases. Our data provide insight into processes regulated by STPKs in M. tuberculosis and create a resource for understanding how specific phosphorylation events modulate protein activity. The results further provide the potential to predict likely cognate STPKs for newly identified phosphoproteins.
Current protocols in human genetics | 2011
Michael F. Chou; D. A. Schwartz
The Web-based motif-x program provides a simple interface to extract statistically significant motifs from large data sets, such as MS/MS post-translational modification data and groups of proteins that share a common biological function. Users upload data files and download results using common Web browsers on essentially any Web-compatible computer. Once submitted, data analyses are performed rapidly on an associated high-speed computer cluster and they produce both syntactic and image-based motif results and statistics. The protocols presented demonstrate the use of motif-x in three common user scenarios.
Proceedings of the National Academy of Sciences of the United States of America | 2012
Madeleine Ball; Joseph V. Thakuria; Alexander Wait Zaranek; Tom Clegg; Abraham M. Rosenbaum; Xiaodi Wu; Misha Angrist; Jong Bhak; Jason Bobe; Matthew J. Callow; Carlos Cano; Michael F. Chou; Wendy K. Chung; Shawn M. Douglas; Preston W. Estep; Athurva Gore; Peter J. Hulick; Alberto Labarga; Je-Hyuk Lee; Jeantine E. Lunshof; Byung Chul Kim; Jong-Il Kim; Zhe Li; Michael F. Murray; Geoffrey B. Nilsen; Brock A. Peters; Anugraha M. Raman; Hugh Y. Rienhoff; Kimberly Robasky; Matthew T. Wheeler
Rapid advances in DNA sequencing promise to enable new diagnostics and individualized therapies. Achieving personalized medicine, however, will require extensive research on highly reidentifiable, integrated datasets of genomic and health information. To assist with this, participants in the Personal Genome Project choose to forgo privacy via our institutional review board-approved “open consent” process. The contribution of public data and samples facilitates both scientific discovery and standardization of methods. We present our findings after enrollment of more than 1,800 participants, including whole-genome sequencing of 10 pilot participant genomes (the PGP-10). We introduce the Genome-Environment-Trait Evidence (GET-Evidence) system. This tool automatically processes genomes and prioritizes both published and novel variants for interpretation. In the process of reviewing the presumed healthy PGP-10 genomes, we find numerous literature references implying serious disease. Although it is sometimes impossible to rule out a late-onset effect, stringent evidence requirements can address the high rate of incidental findings. To that end we develop a peer production system for recording and organizing variant evaluations according to standard evidence guidelines, creating a public forum for reaching consensus on interpretation of clinically relevant variants. Genome analysis becomes a two-step process: using a prioritized list to record variant evaluations, then automatically sorting reviewed variants using these annotations. Genome data, health and trait information, participant samples, and variant interpretations are all shared in the public domain; we invite others to review our results using our participant samples and contribute to our interpretations. We offer our public resource and methods to further personalized medical research.
Nature Methods | 2013
Joseph Patrick O'Shea; Michael F. Chou; Saad A Quader; James K Ryan; George M. Church; D. A. Schwartz
Methods for visualizing protein or nucleic acid motifs have traditionally relied upon residue frequencies to graphically scale character heights. We describe the pLogo, a motif visualization in which residue heights are scaled relative to their statistical significance. A pLogo generation tool is publicly available at http://plogo.uconn.edu/ and supports real-time conditional probability calculations and visualizations.
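To make the contrast between frequency-scaled and significance-scaled heights concrete, here is a minimal sketch of one plausible significance score. It assumes a binomial background model and uses the log-odds of over- versus under-representation as the height; this is an assumption of the sketch, not a statement of the exact formula used by the tool, and the function names and example counts are hypothetical.

```python
from math import comb, log10


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def significance_height(k, n, p):
    """Signed height for a residue seen k times among n aligned sequences.

    p is the residue's background frequency (0 < p < 1). The height is
    positive when the residue is over-represented and negative when it
    is under-represented, under the assumed binomial model.
    """
    pr_at_least = sum(binomial_pmf(i, n, p) for i in range(k, n + 1))
    pr_at_most = sum(binomial_pmf(i, n, p) for i in range(0, k + 1))
    return -log10(pr_at_least / pr_at_most)


# Hypothetical counts: a residue seen 40 times in 100 sequences,
# against a background frequency of 5%.
print(significance_height(40, 100, 0.05))
```

Under this scoring, a residue observed at exactly its expected rate sits near zero height regardless of how common it is, which a frequency-scaled logo cannot express.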
Molecular & Cellular Proteomics | 2009
D. A. Schwartz; Michael F. Chou; George M. Church
Protein post-translational modifications are an important biological regulatory mechanism, and the rate of their discovery using high throughput techniques is rapidly increasing. To make use of this wealth of sequence data, we introduce a new general strategy designed to predict a variety of post-translational modifications in several organisms. We used the motif-x program to determine phosphorylation motifs in yeast, fly, mouse, and man and lysine acetylation motifs in man. These motifs were then scanned against proteomic sequence data using a newly developed tool called scan-x to globally predict other potential modification sites within these organisms. 10-fold cross-validation was used to determine the sensitivity and minimum specificity for each set of predictions, all of which showed improvement over other available tools for phosphoprediction. New motif discovery is a byproduct of this approach, and the phosphorylation motif analyses provide strong evidence of evolutionary conservation of both known and novel kinase motifs.
Expert Review of Molecular Diagnostics | 2014
Shidong Jia; Davide Zocco; Michael L. Samuels; Michael F. Chou; Roger Chammas; Johan Skog; Natasa Zarovni; Fatemeh Momen-Heravi; Winston Patrick Kuo
Extracellular vesicles (EVs), including exosomes and microvesicles, have been shown to carry a variety of biomacromolecules including mRNA, microRNA and other non-coding RNAs. Within the past 5 years, EVs have emerged as a promising minimally invasive novel source of material for molecular diagnostics. Although EVs can be easily identified and collected from biological fluids, further research and proper validation is needed in order for them to be useful in the clinical setting. In addition, innovative and more efficient means of nucleic acid profiling are needed to facilitate investigations into the cellular and molecular mechanisms of EV function and to establish their potential as useful clinical biomarkers and therapeutic tools. In this article, we provide an overview of recent technological improvements in both upstream EV isolation and downstream analytical technologies, including digital PCR and next generation sequencing, highlighting future prospects for EV-based molecular diagnostics.
Genome Medicine | 2014
Madeleine Ball; Jason Bobe; Michael F. Chou; Tom Clegg; Preston W. Estep; Jeantine E. Lunshof; Ward Vandewege; Alexander Wait Zaranek; George M. Church
BACKGROUND Since its initiation in 2005, the Harvard Personal Genome Project has enrolled thousands of volunteers interested in publicly sharing their genome, health and trait data. Because these data are highly identifiable, we use an ‘open consent’ framework that purposefully excludes promises about privacy and requires participants to demonstrate comprehension prior to enrollment. DISCUSSION Our model of non-anonymous, public genomes has led us to a highly participatory model of researcher-participant communication and interaction. The participants, who are highly committed volunteers, self-pursue and donate research-relevant datasets, and are actively engaged in conversations with both our staff and other Personal Genome Project participants. We have quantitatively assessed these communications and donations, and report our experiences with returning research-grade whole genome data to participants. We also observe some of the community growth and discussion that has occurred related to our project. SUMMARY We find that public non-anonymous data is valuable and leads to a participatory research model, which we encourage others to consider. The implementation of this model is greatly facilitated by web-based tools and methods and participant education. Project results are long-term proactive participant involvement and the growth of a community that benefits both researchers and participants.
Genetics | 2008
Charleston W. K. Chiang; Adnan Derti; D. A. Schwartz; Michael F. Chou; Joel N. Hirschhorn; C.-ting Wu
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.
PLOS ONE | 2012
Michael F. Chou; Sladjana Prisic; Joshua M. Lubner; George M. Church; Robert N. Husson; D. A. Schwartz
The identification of protein kinase targets remains a significant bottleneck for our understanding of signal transduction in normal and diseased cellular states. Kinases recognize their substrates in part through sequence motifs on substrate proteins, which, to date, have most effectively been elucidated using combinatorial peptide library approaches. Here, we present and demonstrate the ProPeL method for easy and accurate discovery of kinase specificity motifs through the use of native bacterial proteomes that serve as in vivo libraries for thousands of simultaneous phosphorylation reactions. Using recombinant kinases expressed in E. coli followed by mass spectrometry, the approach accurately recapitulated the well-established motif preferences of human basophilic (Protein Kinase A) and acidophilic (Casein Kinase II) kinases. These motifs, derived for PKA and CK II using only bacterial sequence data, were then further validated by utilizing them in conjunction with the scan-x software program to computationally predict known human phosphorylation sites with high confidence.