Publication


Featured research published by Collin Tokheim.


Nature | 2015

The genomic landscape of response to EGFR blockade in colorectal cancer

Andrea Bertotti; Eniko Papp; Siân Jones; Vilmos Adleff; Valsamo Anagnostou; Barbara Lupo; Mark Sausen; Jillian Phallen; Carolyn Hruban; Collin Tokheim; Noushin Niknafs; Monica Nesselbush; Karli Lytle; Francesco Sassi; Francesca Cottino; Giorgia Migliardi; Eugenia Rosalinda Zanella; Dario Ribero; Nadia Russolillo; Alfredo Mellano; Andrea Muratore; Gianluca Paraluppi; Mauro Salizzoni; Silvia Marsoni; Michael Kragh; Johan Lantto; Andrea Cassingena; Qing Kay Li; Rachel Karchin; Robert B. Scharpf

Colorectal cancer is the third most common cancer worldwide, with 1.2 million patients diagnosed annually. In late-stage colorectal cancer, the most commonly used targeted therapies are the monoclonal antibodies cetuximab and panitumumab, which prevent epidermal growth factor receptor (EGFR) activation. Recent studies have identified alterations in KRAS and other genes as likely mechanisms of primary and secondary resistance to anti-EGFR antibody therapy. Despite these efforts, additional mechanisms of resistance to EGFR blockade are thought to be present in colorectal cancer and little is known about determinants of sensitivity to this therapy. To examine the effect of somatic genetic changes in colorectal cancer on response to anti-EGFR antibody therapy, here we perform complete exome sequence and copy number analyses of 129 patient-derived tumour grafts and targeted genomic analyses of 55 patient tumours, all of which were KRAS wild-type. We analysed the response of tumours to anti-EGFR antibody blockade in tumour graft models and in clinical settings and functionally linked therapeutic responses to mutational data. In addition to previously identified genes, we detected mutations in ERBB2, EGFR, FGFR1, PDGFRA, and MAP2K1 as potential mechanisms of primary resistance to this therapy. Novel alterations in the ectodomain of EGFR were identified in patients with acquired resistance to EGFR blockade. Amplifications and sequence changes in the tyrosine kinase receptor adaptor gene IRS2 were identified in tumours with increased sensitivity to anti-EGFR therapy. Therapeutic resistance to EGFR blockade could be overcome in tumour graft models through combinatorial therapies targeting actionable genes. These analyses provide a systematic approach to evaluating response to targeted therapies in human cancer, highlight new mechanisms of responsiveness to anti-EGFR therapies, and delineate new avenues for intervention in managing colorectal cancer.


Annals of Oncology | 2015

Genomic alterations in head and neck squamous cell carcinoma determined by cancer gene-targeted sequencing

Christine H. Chung; Violeta Beleva Guthrie; David L. Masica; Collin Tokheim; Hyunseok Kang; Jeremy D. Richmon; Nishant Agrawal; Carole Fakhry; Harry Quon; Rathan M. Subramaniam; Z. Zuo; Tanguy Y. Seiwert; Zachary R. Chalmers; Garrett Michael Frampton; Siraj M. Ali; R. Yelensky; Philip J. Stephens; Vincent A. Miller; Rachel Karchin; Justin A. Bishop

BACKGROUND: To determine genomic alterations in head and neck squamous cell carcinoma (HNSCC) using formalin-fixed, paraffin-embedded (FFPE) tumors obtained through routine clinical practice, selected cancer-related genes were evaluated and compared with alterations seen in frozen tumors obtained through research studies. PATIENTS AND METHODS: DNA samples obtained from 252 FFPE HNSCC were analyzed using a next-generation sequencing (NGS)-based clinical assay to determine sequence and copy number variations in 236 cancer-related genes plus 47 introns from 19 genes frequently rearranged in cancer. Human papillomavirus (HPV) status was determined by presence of the HPV DNA sequence in all samples and corroborated with high-risk HPV in situ hybridization (ISH) and p16 immunohistochemical (IHC) staining in a subset of tumors. Sequencing data from 399 frozen tumors in The Cancer Genome Atlas and University of Chicago public datasets were analyzed for comparison. RESULTS: Among 252 FFPE HNSCC, 84 (33%) were HPV positive and 168 (67%) were HPV negative by sequencing. A subset of 40 tumors with HPV ISH and p16 IHC results showed complete concordance with NGS-derived HPV status. The most common genes with genomic alterations were PIK3CA and PTEN in HPV-positive tumors and TP53 and CDKN2A/B in HPV-negative tumors. In the pathway analysis, the PI3K pathway in HPV-positive tumors and DNA repair-p53 and cell cycle pathways in HPV-negative tumors were frequently altered. The HPV-positive oropharynx and HPV-positive nasal cavity/paranasal sinus carcinoma shared similar mutational profiles. CONCLUSION: The genomic profile of FFPE HNSCC tumors obtained through routine clinical practice is comparable with frozen tumors studied in a research setting, demonstrating the feasibility of comprehensive genomic profiling in a clinical setting. However, the clinical significance of these genomic alterations requires further investigation through application of these genomic profiles as integral biomarkers in clinical trials.


Proceedings of the National Academy of Sciences of the United States of America | 2016

Evaluating the evaluation of cancer driver genes

Collin Tokheim; Nickolas Papadopoulos; Kenneth W. Kinzler; Bert Vogelstein; Rachel Karchin

Significance: Modern large-scale sequencing of human cancers seeks to comprehensively discover mutated genes that confer a selective advantage to cancer cells. Key to this effort has been development of computational algorithms to find genes that drive cancer based on their patterns of mutation in large patient cohorts. Because there is no generally accepted gold standard of driver genes, it has been difficult to quantitatively compare these methods. We present a machine-learning–based method for driver gene prediction and a protocol to evaluate and compare prediction methods. Our results suggest that most current methods do not adequately account for heterogeneity in the number of mutations expected by chance and consequently yield many false-positive calls, particularly in cancers with high mutation rate.

Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, that is, bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied to driver gene prediction methods. We used this framework to compare the performance of eight such methods. One of these methods, described here, incorporated a machine-learning–based ratiometric approach. We show that the driver genes predicted by each of the eight methods vary widely. Moreover, the P values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false-positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future.
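The remark about P values departing from the expected uniform distribution can be illustrated with a small check. The sketch below is not the evaluation protocol from the paper; it swaps in a plain Kolmogorov-Smirnov distance to Uniform(0, 1), and the function name and toy inputs are assumptions made for illustration.

```python
def ks_uniform_statistic(p_values):
    """Largest gap between the empirical CDF of p_values and the Uniform(0, 1) CDF."""
    if not p_values:
        raise ValueError("need at least one p-value")
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # The ECDF jumps from i/n to (i+1)/n at x; the uniform CDF at x is x itself.
        d = max(d, (i + 1) / n - x, x - i / n)
    return d


# Toy inputs (hypothetical): evenly spread p-values versus p-values piled up near zero.
calibrated = [(i + 0.5) / 100 for i in range(100)]
skewed = [p ** 3 for p in calibrated]

d_calibrated = ks_uniform_statistic(calibrated)
d_skewed = ks_uniform_statistic(skewed)
```

A small distance suggests the reported P values behave as expected under the null for most genes, while a large one hints that a method's background model is miscalibrated. In real data a handful of true drivers will always sit near zero, so the statistic is a rough diagnostic rather than a verdict.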


Cell | 2018

Comprehensive Characterization of Cancer Driver Genes and Mutations

Matthew Bailey; Collin Tokheim; Eduard Porta-Pardo; Sohini Sengupta; Denis Bertrand; Amila Weerasinghe; Antonio Colaprico; Michael C. Wendl; Jaegil Kim; Brendan Reardon; Patrick Kwok Shing Ng; Kang Jin Jeong; Song Cao; Zixing Wang; Jianjiong Gao; Qingsong Gao; Fang Wang; Eric Minwei Liu; Loris Mularoni; Carlota Rubio-Perez; Niranjan Nagarajan; Isidro Cortes-Ciriano; Daniel Cui Zhou; Wen-Wei Liang; Julian Hess; Venkata Yellapantula; David Tamborero; Abel Gonzalez-Perez; Chayaporn Suphavilai; Jia Yu Ko

Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.


Cancer Prevention Research | 2016

Whole-Genome Sequencing of Salivary Gland Adenoid Cystic Carcinoma

Eleni M. Rettig; C. Conover Talbot; Mark Sausen; Sian Jones; Justin A. Bishop; Laura D. Wood; Collin Tokheim; Noushin Niknafs; Rachel Karchin; Elana J. Fertig; Sarah J. Wheelan; Luigi Marchionni; Michael Considine; Carole Fakhry; Nickolas Papadopoulos; Kenneth W. Kinzler; Bert Vogelstein; Patrick K. Ha; Nishant Agrawal

Adenoid cystic carcinomas (ACC) of the salivary glands are challenging to understand, treat, and cure. To better understand the genetic alterations underlying the pathogenesis of these tumors, we performed comprehensive genome analyses of 25 fresh-frozen tumors, including whole-genome sequencing and expression and pathway analyses. In addition to the well-described MYB–NFIB fusion that was found in 11 tumors (44%), we observed five different rearrangements involving the NFIB transcription factor gene in seven tumors (28%). Taken together, NFIB translocations occurred in 15 of 25 samples (60%, 95% CI, 41%–77%). In addition, mRNA expression analysis of 17 tumors revealed overexpression of NFIB in ACC tumors compared with normal tissues (P = 0.002). There was no difference in NFIB mRNA expression in tumors with NFIB fusions compared with those without. We also report somatic mutations of genes involved in the axonal guidance and Rho family signaling pathways. Finally, we confirm previously described alterations in genes related to chromatin regulation and Notch signaling. Our findings suggest a separate role for NFIB in ACC oncogenesis and highlight important signaling pathways for future functional characterization and potential therapeutic targeting. Cancer Prev Res; 9(4); 265–74. ©2016 AACR.
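For readers who want to check the reported interval for 15 of 25 samples, here is a minimal sketch. The abstract does not say which interval method the authors used; the Wilson score interval is an assumption here, chosen because it happens to reproduce the stated bounds. The function name is illustrative.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z defaults to the 95% level)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p_hat = successes / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(15, 25)
print(f"{15 / 25:.0%} (95% CI, {low:.0%}-{high:.0%})")  # prints: 60% (95% CI, 41%-77%)
```

With only 25 tumors the interval spans roughly 36 percentage points, which is worth keeping in mind when reading the 60% figure.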


Cancer Research | 2016

Exome-scale discovery of hotspot mutation regions in human cancer using 3D protein structure

Collin Tokheim; Rohit Bhattacharya; Noushin Niknafs; Derek M. Gygax; Rick Kim; Michael C. Ryan; David L. Masica; Rachel Karchin

The impact of somatic missense mutation on cancer etiology and progression is often difficult to interpret. One common approach for assessing the contribution of missense mutations in carcinogenesis is to identify genes mutated with statistically nonrandom frequencies. Even given the large number of sequenced cancer samples currently available, this approach remains underpowered to detect drivers, particularly in less studied cancer types. Alternative statistical and bioinformatic approaches are needed. One approach to increase power is to focus on localized regions of increased missense mutation density or hotspot regions, rather than a whole gene or protein domain. Detecting missense mutation hotspot regions in three-dimensional (3D) protein structure may also be beneficial because linear sequence alone does not fully describe the biologically relevant organization of codons. Here, we present a novel and statistically rigorous algorithm for detecting missense mutation hotspot regions in 3D protein structures. We analyzed approximately 3 × 10^5 mutations from The Cancer Genome Atlas (TCGA) and identified 216 tumor-type-specific hotspot regions. In addition to experimentally determined protein structures, we considered high-quality structural models, which increase genomic coverage from approximately 5,000 to more than 15,000 genes. We provide new evidence that 3D mutation analysis has unique advantages. It enables discovery of hotspot regions in many more genes than previously shown and increases sensitivity to hotspot regions in tumor suppressor genes (TSG). Although hotspot regions have long been known to exist in both TSGs and oncogenes, we provide the first report that they have different characteristic properties in the two types of driver genes. We show how cancer researchers can use our results to link 3D protein structure and the biologic functions of missense mutations in cancer, and to generate testable hypotheses about driver mechanisms. 
Our results are included in a new interactive website for visualizing protein structures with TCGA mutations and associated hotspot regions. Users can submit new sequence data, facilitating the visualization of mutations in a biologically relevant context. Cancer Res; 76(13); 3719-31. ©2016 AACR.
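The general idea of testing for elevated missense mutation density in 3D space can be sketched as follows. This is a simplified illustration, not the published algorithm: the 10 Å radius, the uniform-placement null model, the toy structure (residues on a straight line), and all function names are assumptions.

```python
import math
import random


def neighbor_lists(coords, radius):
    """For each residue, indices of residues within `radius` of it (itself included)."""
    return [
        [j for j, other in enumerate(coords) if math.dist(center, other) <= radius]
        for center in coords
    ]


def local_density(counts, neighbors):
    """Sum mutation counts over each residue's 3D neighborhood."""
    return [sum(counts[j] for j in nbrs) for nbrs in neighbors]


def hotspot_p_values(coords, counts, radius=10.0, n_perm=1000, seed=0):
    """Empirical p-value per residue: how often a random placement of the same
    number of mutations gives a neighborhood density at least as high."""
    rng = random.Random(seed)
    neighbors = neighbor_lists(coords, radius)
    observed = local_density(counts, neighbors)
    n_res, total = len(coords), sum(counts)
    exceed = [0] * n_res
    for _ in range(n_perm):
        shuffled = [0] * n_res
        for pos in rng.choices(range(n_res), k=total):
            shuffled[pos] += 1
        null = local_density(shuffled, neighbors)
        for i in range(n_res):
            if null[i] >= observed[i]:
                exceed[i] += 1
    # Add-one correction keeps every p-value strictly positive.
    return [(e + 1) / (n_perm + 1) for e in exceed]


# Toy structure (hypothetical): 30 residues spaced 4 angstroms apart on a line,
# with 8 of 10 mutations clustered at residues 10-12.
coords = [(4.0 * i, 0.0, 0.0) for i in range(30)]
counts = [0] * 30
counts[0], counts[10], counts[11], counts[12], counts[25] = 1, 3, 3, 2, 1
p_values = hotspot_p_values(coords, counts)
```

In this toy case the clustered residues get very small p-values and the isolated mutation at residue 25 does not. A real analysis would also need a multiple-testing correction across residues and a null model that respects the gene's actual mutation rate.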


Cancer Cell | 2018

Systematic Functional Annotation of Somatic Mutations in Cancer

Patrick Kwok Shing Ng; Jun Li; Kang Jin Jeong; Shan Shao; Hu Chen; Yiu Huen Tsang; Sohini Sengupta; Zixing Wang; Venkata Hemanjani Bhavana; Richard Tran; Stephanie Soewito; Darlan Conterno Minussi; Daniela Moreno; Kathleen Kong; Turgut Dogruluk; Hengyu Lu; Jianjiong Gao; Collin Tokheim; Daniel Cui Zhou; Amber Johnson; Jia Zeng; Carman Ka Man Ip; Zhenlin Ju; Matthew Wester; Shuangxing Yu; Yongsheng Li; Christopher P. Vellano; Nikolaus Schultz; Rachel Karchin; Li Ding

The functional impact of the vast majority of cancer somatic mutations remains unknown, representing a critical knowledge gap for implementing precision oncology. Here, we report the development of a moderate-throughput functional genomic platform consisting of efficient mutant generation, sensitive viability assays using two growth factor-dependent cell models, and functional proteomic profiling of signaling effects for select aberrations. We apply the platform to annotate >1,000 genomic aberrations, including gene amplifications, point mutations, indels, and gene fusions, potentially doubling the number of driver mutations characterized in clinically actionable genes. Further, the platform is sufficiently sensitive to identify weak drivers. Our data are accessible through a user-friendly, public data portal. Our study will facilitate biomarker discovery, prediction algorithm improvement, and drug development.


bioRxiv | 2017

Evaluation of machine learning methods to predict peptide binding to MHC Class I proteins

Rohit Bhattacharya; Ashok Sivakumar; Collin Tokheim; Violeta Beleva Guthrie; Valsamo Anagnostou; Victor E. Velculescu; Rachel Karchin

Prediction of antigens likely to be recognized by the immune system is a fundamental challenge for development of immune therapy approaches. We explore the utility of deep learning for in silico prediction of peptide binding affinity to major histocompatibility complex Type I molecules (pMHC-I binding). This process is a critical step in the immune system’s response to cancer cells, which may present highly specific neoantigen peptides bound to MHC proteins at the cell surface. With the advent of high-throughput sequencing and the recognition that somatic mutations in the exome can produce neoantigens, fast in silico prediction of these affinities has become increasingly relevant to precision cancer immunotherapy. We have developed five machine learning methods and use a benchmark from the Immune Epitope Database of experimental pMHC-I binding affinities to compare them to existing machine learning approaches. All methods were used to score, rank, and classify pMHC-I pairs. The best six methods, which include three of our own, were identified and found to make highly correlated predictions, even for individual pMHC-I pairs. The most effective deep learning methods were a gated recurrent unit and a long short-term memory neural network, enhanced by transfer learning. These methods can handle peptides of any length without the need for artificial lengthening or shortening and were substantially faster than the most widely used standard neural networks.

Major findings: The best in silico predictors of peptide major histocompatibility complex binding must be identified for application in precision cancer immunotherapy. We design and test a variety of machine learning methods for this purpose. We identify six best-in-class methods, three of our own design. Surprisingly, the best deep and standard machine learning methods make highly correlated predictions. Several standard methods run significantly slower and may have less utility as high-throughput sequence analysis for precision immunotherapy becomes more common. Performance of all methods varies by MHC allele, and most of this variance can be explained by data-driven, rather than biological properties. Increasing the quantity of publicly available experimental data has the potential to improve all machine learning methods applied to this problem, and in particular deep learning methods.

Binding of peptides to Major Histocompatibility Complex (MHC) proteins is a critical step in immune response. Peptides bound to MHCs are recognized by CD8+ (MHC Class I) and CD4+ (MHC Class II) T-cells. Successful prediction of which peptides will bind to specific MHC alleles would benefit many cancer immunotherapy applications. Currently, supervised machine learning is the leading computational approach to predict peptide-MHC binding, and a number of methods, trained using results of binding assays, have been published. Many clinical researchers are dissatisfied with the sensitivity and specificity of currently available methods and the limited number of alleles for which they can be applied. We evaluated several recent methods to predict peptide-MHC Class I binding affinities and a new method of our own design (MHCnuggets). We used a high-quality benchmark set of 51 alleles, which has been applied previously. The neural network methods NetMHC, NetMHCpan, MHCflurry, and MHCnuggets achieved similar best-in-class prediction performance in our testing, and of these methods MHCnuggets was significantly faster. MHCnuggets is a gated recurrent neural network, and the only method to our knowledge which can handle peptides of any length, without artificial lengthening and shortening. Seventeen alleles were problematic for all tested methods. Prediction difficulties could be explained by deficiencies in the training and testing examples in the benchmark, suggesting that biological differences in allele-specific binding properties are not as important as previously claimed. Advances in accuracy and speed of computational methods to predict peptide-MHC affinity are urgently needed. These methods will be at the core of pipelines to identify patients who will benefit from immunotherapy, based on tumor-derived somatic mutations. Machine learning methods, such as MHCnuggets, which efficiently handle peptides of any length will be increasingly important for the challenges of predicting immunogenic response for MHC Class II alleles.
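To see why a gated recurrent network needs no padding or trimming of peptides, consider the minimal sketch below. It is not MHCnuggets and uses untrained random weights; the hidden size, one-hot amino acid encoding, and all names are assumptions. The point is only that the recurrence consumes one residue at a time, so any peptide length yields a state vector of the same fixed size.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def make_params(hidden, seed=0):
    """Random (untrained) weights for the update (z), reset (r), and candidate (h) parts."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        name: {"W": matrix(hidden, len(AMINO_ACIDS)), "U": matrix(hidden, hidden),
               "b": [0.0] * hidden}
        for name in ("z", "r", "h")
    }


def gate(p, aa_index, state, activation):
    """activation(W x + U state + b), where x is one-hot so W x is a column lookup."""
    return [
        activation(p["W"][i][aa_index] + p["b"][i]
                   + sum(p["U"][i][j] * state[j] for j in range(len(state))))
        for i in range(len(state))
    ]


def encode_peptide(peptide, params, hidden):
    """Run a GRU over the peptide, one residue per step, and return the final state."""
    h = [0.0] * hidden
    for aa in peptide:
        k = AMINO_ACIDS.index(aa)
        z = gate(params["z"], k, h, sigmoid)          # update gate
        r = gate(params["r"], k, h, sigmoid)          # reset gate
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        cand = gate(params["h"], k, reset_h, math.tanh)
        h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
    return h


HIDDEN = 8
params = make_params(HIDDEN)
state_8mer = encode_peptide("SIINFEKL", params, HIDDEN)
state_11mer = encode_peptide("ACDEFGHIKLM", params, HIDDEN)
```

Both peptides map to an 8-dimensional state despite their different lengths. In a trained predictor, a final layer would turn that state into a binding affinity estimate.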


Cancer Research | 2017

CRAVAT 4: Cancer-Related Analysis of Variants Toolkit

David L. Masica; Christopher Douville; Collin Tokheim; Rohit Bhattacharya; RyangGuk Kim; Kyle Moad; Michael C. Ryan; Rachel Karchin

Cancer sequencing studies are increasingly comprehensive and well powered, returning long lists of somatic mutations that can be difficult to sort and interpret. Diligent analysis and quality control can require multiple computational tools of distinct utility and producing disparate output, creating additional challenges for the investigator. The Cancer-Related Analysis of Variants Toolkit (CRAVAT) is an evolving suite of informatics tools for mutation interpretation that includes mutation mapping and quality control, impact prediction and extensive annotation, gene- and mutation-level interpretation, including joint prioritization of all nonsilent mutation consequence types, and structural and mechanistic visualization. Results from CRAVAT submissions are explored in an interactive, user-friendly web environment with dynamic filtering and sorting designed to highlight the most informative mutations, even in the context of very large studies. CRAVAT can be run on a public web portal, in the cloud, or downloaded for local use, and is easily integrated with other methods for cancer omics analysis. Cancer Res; 77(21); e35-38. ©2017 AACR.


bioRxiv | 2018

Enhanced context reveals the scope of somatic missense mutations driving human cancers

Collin Tokheim; Rachel Karchin

Large-scale sequencing studies of patient cohorts enable identification of many cancer driver genes. However, not every mutation in a driver gene is necessarily a driver of cancer, so methods are needed to discriminate whether an individual mutation is a driver or a passenger. By completely re-working the CHASM algorithm, CHASMplus leverages multi-scale context to identify driver missense mutations, and consistently outperforms comparable methods across a wide variety of benchmarks, including in vitro experiments, in vivo experiments, and literature curation. Applied to 8,657 samples across 32 cancer types, CHASMplus identifies 3,527 unique driver mutations. Our results support a prominent emerging role for rare driver mutations. To our knowledge, this study is the first to systematically assess variability across cancer types with respect to the spectrum of common, intermediate, and rare frequency driver mutations. We show that the trajectory of driver discovery is systematically different across cancer types, depending on mutational prevalence and diversity.

Collaboration


Dive into Collin Tokheim's collaboration.

Top Co-Authors

Rachel Karchin
Johns Hopkins University

Bert Vogelstein
Howard Hughes Medical Institute

Carole Fakhry
Johns Hopkins University

Mark Sausen
Johns Hopkins University

Justin A. Bishop
University of Texas Southwestern Medical Center