Publication


Featured research published by Chunlin Xiao.


Nature Genetics | 2008

A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures

Andrew J. Sharp; Heather C. Mefford; Kelly Li; Carl Baker; Cindy Skinner; Roger E. Stevenson; Richard J. Schroer; Francesca Novara; Manuela De Gregori; Roberto Ciccone; Adam Broomer; Iris Casuga; Yu Wang; Chunlin Xiao; Catalin Barbacioru; Giorgio Gimelli; Bernardo Dalla Bernardina; Claudia Torniero; Roberto Giorda; Regina Regan; Victoria Murday; Sahar Mansour; Marco Fichera; Lucia Castiglia; Pinella Failla; Mario Ventura; Zhaoshi Jiang; Gregory M. Cooper; Samantha J. L. Knight; Corrado Romano

We report a recurrent microdeletion syndrome causing mental retardation, epilepsy and variable facial and digital dysmorphisms. We describe nine affected individuals, including six probands: two with de novo deletions, two who inherited the deletion from an affected parent and two with unknown inheritance. The proximal breakpoint of the largest deletion is contiguous with breakpoint 3 (BP3) of the Prader-Willi and Angelman syndrome region, extending 3.95 Mb distally to BP5. A smaller 1.5-Mb deletion has a proximal breakpoint within the larger deletion (BP4) and shares the same distal BP5. This recurrent 1.5-Mb deletion contains six genes, including a candidate gene for epilepsy (CHRNA7) that is probably responsible for the observed seizure phenotype. The BP4–BP5 region undergoes frequent inversion, suggesting a possible link between this inversion polymorphism and recurrent deletion. The frequency of these microdeletions in mental retardation cases is ∼0.3% (6/2,082 tested), a prevalence comparable to that of Williams, Angelman and Prader-Willi syndromes.
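The prevalence figure quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts (6 and 2,082) come from the abstract, while the function and variable names are hypothetical and not part of the original study.

```python
# Counts taken from the abstract: 6 microdeletions among 2,082 tested cases.
carriers = 6
tested = 2082


def prevalence_percent(carriers: int, tested: int) -> float:
    """Return carriers / tested expressed as a percentage."""
    return 100.0 * carriers / tested


pct = prevalence_percent(carriers, tested)
print(f"{pct:.2f}%")  # about 0.29%, consistent with the ~0.3% quoted
```

Rounded to one decimal place, the result matches the approximate 0.3% reported.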


Nature Methods | 2012

The 1000 Genomes Project: data management and community access

Laura Clarke; Xiangqun Zheng-Bradley; Richard S. Smith; Eugene Kulesha; Chunlin Xiao; Iliana Toneva; Brendan Vaughan; Don Preuss; Rasko Leinonen; Martin Shumway; Stephen T. Sherry; Paul Flicek

The 1000 Genomes Project was launched as one of the largest distributed data collection and analysis projects ever undertaken in biology. In addition to the primary scientific goals of creating both a deep catalog of human genetic variation and extensive methods to accurately discover and characterize variation using new sequencing technologies, the project makes all of its data publicly available. Members of the project data coordination center have developed and deployed several tools to enable widespread data access.


Cancer Research | 2014

Abstract 5328: GIAB: Genome reference material development resources for clinical sequencing

Chunlin Xiao; Justin M. Zook; Shane Trask; Stephen T. Sherry

Reference materials play important roles in validating the performance of sequencing platforms and enabling regulation of clinical applications. The Genome-in-a-Bottle (GIAB) project is a collaboration between NIST, FDA, NCBI, academic sequencing groups, sequencing technology developers, and clinical laboratories to develop analytical-grade reference genome materials and accompanying performance metrics for the development of regulations and professional standards for clinical sequencing. NCBI serves as the Data Coordination Center (DCC) and repository for the raw sequencing reads, mapped alignments, genotypes, and other details for each sample on a dedicated FTP site (ftp://ftp-trace.ncbi.nih.gov/giab/ftp). Here we describe the processes of data generation and data submission, and how the community can access the data. We are also developing a genome browser for data visualization. The GIAB consortium plans to release data to the public on a regular basis.

Citation Format: Chunlin Xiao, Justin Zook, Shane Trask, Stephen Sherry, the Genome-in-a-Bottle Consortium. GIAB: Genome reference material development resources for clinical sequencing. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 5328. doi:10.1158/1538-7445.AM2014-5328


Cancer Research | 2016

Abstract 5278: NGS-SWIFT: A cloud-based variant analysis framework using control-accessed sequencing data from dbGaP/SRA

Chunlin Xiao; Eugene Yaschenko; Stephen T. Sherry

Genetic variation analysis plays an important role in elucidating the causes of various human diseases. The drastically reduced costs of genome sequencing driven by next-generation sequencing technologies now make it possible to analyze genetic variation in hundreds or thousands of samples simultaneously, but at the cost of ever-increasing local storage requirements. The tera- and petabyte-scale footprint of sequence data imposes significant technical challenges for data management and analysis, including collection, storage, transfer, sharing, and privacy protection. Currently, each analysis group must download all the relevant sequence data into a local file system before variation analysis is initiated. This heavyweight transaction not only slows the pace of analysis, but also creates financial burdens for researchers due to the cost of hardware and the time required to transfer the data over typical academic internet connections. To overcome such limitations and explore the feasibility of analyzing control-accessed sequencing data in a cloud environment while maintaining data privacy and security, we introduce a cloud-based analysis framework that facilitates variation analysis using direct access to the NCBI Sequence Read Archive through the NCBI SRA Toolkit, which allows users to programmatically access data housed within SRA with encryption and decryption capabilities and converts it from the SRA format to the desired format for data analysis. A customized machine image (ngs-swift) with preconfigured tools, including the NCBI SRA Toolkit and NGS Software Development Kit, and resources essential for variant analysis has been created for instantiating an EC2 instance or instance cluster on the Amazon cloud. The performance of this framework has been evaluated using dbGaP study phs000710.v1.p1 (1000Genome Dataset in dbGaP, http://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000710.v1.p1) and compared with that of a traditional analysis pipeline, and security handling in a cloud environment when dealing with control-accessed sequence data has been addressed. We demonstrate that with this framework, it is cost-effective to make variant calls without first transferring the entire set of aligned sequence data into a local storage environment, thereby accelerating variant discovery using control-accessed sequencing data.

Citation Format: Chunlin Xiao, Eugene Yaschenko, Stephen Sherry. NGS-SWIFT: A cloud-based variant analysis framework using control-accessed sequencing data from dbGaP/SRA. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 5278.


Cancer Research | 2012

Abstract 3983: Evaluation of data compression strategies for genetic variation calling using next generation sequencing data in normal and tumor samples

Chunlin Xiao; Eugene Yaschenko; Mikhail Kimelman; Kurt Rodarmer; Stephen T. Sherry

The per-base cost of whole genome sequencing has dropped dramatically due to recent advances in next-generation sequencing (NGS) technologies, and a number of large-scale re-sequencing projects, e.g. 1000 Genomes, ICGC, TCGA, GO-ESP, and CGCI, have been initiated to extend our knowledge of single nucleotide polymorphisms (SNPs), short insertions/deletions (INDELs) and structural variations (SVs), and to relate these variants to human diseases. To date, various compression technologies, including NCBI's cSRA and EBI's CRAM, have been developed to significantly reduce storage footprints and costs. However, whether such compression treatments have any impact on genetic variant calling has not been systematically assessed and validated with large-scale NGS datasets, as different calling algorithms for different variant classes (SNPs, INDELs, or SVs) may have different underlying data requirements. We have developed an integrated analysis framework (VarPipe) to profile genetic mutations using NGS data generated from various sequencing platforms in a uniform manner. This pipeline is also applied to the official data from the 1000Genomes (normal samples) and Cancer Genome Characterization Initiative (tumor samples) projects to evaluate the effects of compression on variation callers. With defined quality metrics, the variant callsets from compression-treated and untreated data are carefully evaluated and compared to the official variation releases of the corresponding projects, so that an optimal compression strategy can be developed and recommended for variation detection using NGS data.

Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 103rd Annual Meeting of the American Association for Cancer Research; 2012 Mar 31-Apr 4; Chicago, IL. Philadelphia (PA): AACR; Cancer Res 2012;72(8 Suppl):Abstract nr 3983. doi:1538-7445.AM2012-3983


Cancer Research | 2015

Abstract 4858: Cloud-based variant analysis solution using control-accessed sequencing data

Chunlin Xiao; Eugene Yaschenko; Stephen T. Sherry


Archive | 2005

Methods and systems for identifying genes, splice variants, and transcripts using an evidence mapping approach

Chunlin Xiao; Valentina Di Francesco; Brian Walenz; Peter Li; Michael J. Campbell; Liliana Florea

Collaboration


Dive into Chunlin Xiao's collaborations.

Top Co-Authors

Stephen T. Sherry
National Institutes of Health

Eugene Yaschenko
National Institutes of Health

Andrew J. Sharp
Icahn School of Medicine at Mount Sinai

Brian Walenz
J. Craig Venter Institute

Carl Baker
University of Washington

Justin M. Zook
National Institute of Standards and Technology