Ruibin Xi | Researchain

Publication

Featured research published by Ruibin Xi.

Nature | 2011

Comprehensive analysis of the chromatin landscape in Drosophila melanogaster

Peter V. Kharchenko; Artyom A. Alekseyenko; Yuri B. Schwartz; Aki Minoda; Nicole C. Riddle; Jason Ernst; Peter J. Sabo; Erica Larschan; Andrey A. Gorchakov; Tingting Gu; Daniela Linder-Basso; Annette Plachetka; Gregory Shanower; Michael Y. Tolstorukov; Lovelace J. Luquette; Ruibin Xi; Youngsook L. Jung; Richard Park; Eric P. Bishop; Theresa P. Canfield; Richard Sandstrom; Robert E. Thurman; David M. MacAlpine; John A. Stamatoyannopoulos; Manolis Kellis; Sarah C. R. Elgin; Mitzi I. Kuroda; Vincenzo Pirrotta; Gary H. Karpen; Peter J. Park

Chromatin is composed of DNA and a variety of modified histones and non-histone proteins, which have an impact on cell differentiation, gene regulation and other key cellular processes. Here we present a genome-wide chromatin landscape for Drosophila melanogaster based on eighteen histone modifications, summarized by nine prevalent combinatorial patterns. Integrative analysis with other data (non-histone chromatin proteins, DNase I hypersensitivity, GRO-Seq reads produced by engaged polymerase, short/long RNA products) reveals discrete characteristics of chromosomes, genes, regulatory elements and other functional domains. We find that active genes display distinct chromatin signatures that are correlated with disparate gene lengths, exon patterns, regulatory functions and genomic contexts. We also demonstrate a diversity of signatures among Polycomb targets that include a subset with paused polymerase. This systematic profiling and integrative analysis of chromatin signatures provides insights into how genomic elements are regulated, and will serve as a resource for future experimental investigations of genome structure and function.

Proceedings of the National Academy of Sciences of the United States of America | 2011

Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion

Ruibin Xi; Angela Hadjipanayis; Lovelace J. Luquette; Tae-Min Kim; Eunjung Lee; Jianhua Zhang; Mark D. Johnson; Donna M. Muzny; David A. Wheeler; Richard A. Gibbs; Raju Kucherlapati; Peter J. Park

DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but the next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs via minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10× coverage, whereas we could only detect large CNVs (> 15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples as well as deriving error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.
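
The abstract above says only that BIC-seq is a read-depth method that finds CNVs by minimizing the Bayesian information criterion. As a rough illustration of that idea, and not the published implementation, the sketch below assumes a binomial model (each read in a segment is a tumor read with some segment-specific probability) and a greedy strategy that keeps merging adjacent bins while a merge lowers the BIC. The function names, the `lam` penalty multiplier, and the greedy merge order are all hypothetical choices made for this example.

```python
import math

def seg_loglik(t, n):
    """Binomial log-likelihood of a segment with t tumor and n normal reads
    (assumed model for this sketch)."""
    total = t + n
    ll = 0.0
    if t > 0:
        ll += t * math.log(t / total)
    if n > 0:
        ll += n * math.log(n / total)
    return ll

def bic_merge(bins, lam=1.0):
    """Greedily merge adjacent bins while a merge lowers the BIC.

    bins: list of (tumor_reads, normal_reads) per genomic bin, in genome order.
    lam:  hypothetical penalty multiplier (larger values give fewer segments).
    """
    segs = [tuple(b) for b in bins]
    total_reads = sum(t + n for t, n in segs)
    if total_reads == 0:
        return segs
    penalty = lam * math.log(total_reads)  # BIC cost of one extra segment
    while len(segs) > 1:
        best_i, best_delta = None, 0.0
        for i in range(len(segs) - 1):
            (t1, n1), (t2, n2) = segs[i], segs[i + 1]
            # Change in BIC if segments i and i+1 are merged:
            # the fit gets worse (first term >= 0) but one parameter is saved.
            delta = -2 * (seg_loglik(t1 + t2, n1 + n2)
                          - seg_loglik(t1, n1) - seg_loglik(t2, n2)) - penalty
            if delta < best_delta:
                best_i, best_delta = i, delta
        if best_i is None:
            break  # no merge lowers the BIC any further
        (t1, n1), (t2, n2) = segs[best_i], segs[best_i + 1]
        segs[best_i:best_i + 2] = [(t1 + t2, n1 + n2)]
    return segs

# Two bins with a ~1:1 tumor/normal ratio followed by two bins with ~2:1.
segments = bic_merge([(500, 500), (520, 480), (1000, 500), (980, 520)])
```

With these toy counts the first two bins collapse into one segment and the last two into another, because merging across the ratio change would cost more in likelihood than the penalty saves.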

Cell | 2013

Diverse Mechanisms of Somatic Structural Variations in Human Cancer Genomes

Lixing Yang; Lovelace J. Luquette; Nils Gehlenborg; Ruibin Xi; Psalm Haseley; Chih Heng Hsieh; Chengsheng Zhang; Xiaojia Ren; Alexei Protopopov; Lynda Chin; Raju Kucherlapati; Charles Lee; Peter J. Park

Identification of somatic rearrangements in cancer genomes has accelerated through analysis of high-throughput sequencing data. However, characterization of complex structural alterations and their underlying mechanisms remains inadequate. Here, applying an algorithm to predict structural variations from short reads, we report a comprehensive catalog of somatic structural variations and the mechanisms generating them, using high-coverage whole-genome sequencing data from 140 patients across ten tumor types. We characterize the relative contributions of different types of rearrangements and their mutational mechanisms, find that ∼20% of the somatic deletions are complex deletions formed by replication errors, and describe the differences between the mutational mechanisms in somatic and germline alterations. Importantly, we provide detailed reconstructions of the events responsible for loss of CDKN2A/B and gain of EGFR in glioblastoma, revealing that these alterations can result from multiple mechanisms even in a single genome and that both DNA double-strand breaks and replication errors drive somatic rearrangements.

Nature | 2015

Hallmarks of pluripotency

Alejandro De Los Angeles; Francesco Ferrari; Ruibin Xi; Yuko Fujiwara; Nissim Benvenisty; Hongkui Deng; Rudolf Jaenisch; Soohyun Lee; Harry G. Leitch; M. William Lensch; Ernesto Lujan; Duanqing Pei; Janet Rossant; Marius Wernig; Peter J. Park; George Q. Daley

Stem cells self-renew and generate specialized progeny through differentiation, but vary in the range of cells and tissues they generate, a property called developmental potency. Pluripotent stem cells produce all cells of an organism, while multipotent or unipotent stem cells regenerate only specific lineages or tissues. Defining stem-cell potency relies upon functional assays and diagnostic transcriptional, epigenetic and metabolic states. Here we describe functional and molecular hallmarks of pluripotent stem cells, propose a checklist for their evaluation, and illustrate how forensic genomics can validate their provenance.

Nature Biotechnology | 2012

Systematic identification of synergistic drug pairs targeting HIV

Xu Tan; Long Hu; Lovelace J. Luquette; Geng Gao; Yifang Liu; Hongjing Qu; Ruibin Xi; Zhi John Lu; Peter J. Park; Stephen J. Elledge

The systematic identification of effective drug combinations has been hindered by the unavailability of methods that can explore the large combinatorial search space of drug interactions. Here we present multiplex screening for interacting compounds (MuSIC), which expedites the comprehensive assessment of pairwise compound interactions. We examined ∼500,000 drug pairs from 1,000 US Food and Drug Administration (FDA)-approved or clinically tested drugs and identified drugs that synergize to inhibit HIV replication. Our analysis reveals an enrichment of anti-inflammatory drugs in drug combinations that synergize against HIV. As inflammation accompanies HIV infection, these findings indicate that inhibiting inflammation could curb HIV propagation. Multiple drug pairs identified in this study, including various glucocorticoids and nitazoxanide (NTZ), synergize by targeting different steps in the HIV life cycle. MuSIC can be applied to a wide variety of disease-relevant screens to facilitate efficient identification of compound combinations.
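
The figure of roughly 500,000 drug pairs is consistent with counting every unordered pair among 1,000 compounds. The short check below assumes that pairs are unordered and that a drug is not paired with itself; the variable names are illustrative only.

```python
import math

n_drugs = 1000
# Number of unordered pairs of distinct drugs: C(1000, 2) = 1000 * 999 / 2.
n_pairs = math.comb(n_drugs, 2)
print(n_pairs)  # 499500
```

The quadratic growth of this count is what makes exhaustive pairwise testing expensive and motivates a multiplexed screen.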

BMC Bioinformatics | 2010

rSW-seq: Algorithm for detection of copy number alterations in deep sequencing data

Tae-Min Kim; Lovelace J. Luquette; Ruibin Xi; Peter J. Park

Background: Recent advances in sequencing technologies have enabled generation of large-scale genome sequencing data. These data can be used to characterize a variety of genomic features, including the DNA copy number profile of a cancer genome. A robust and reliable method for screening chromosomal alterations would allow a detailed characterization of the cancer genome with unprecedented accuracy.

Results: We develop a method for identification of copy number alterations in a tumor genome compared to its matched control, based on application of the Smith-Waterman algorithm to single-end sequencing data. In a performance test with simulated data, our algorithm shows >90% sensitivity and >90% precision in detecting a single copy number change that contains approximately 500 reads for the normal sample. With 100-bp reads, this corresponds to a ~50 kb region for 1X genome coverage of the human genome. We further refine the algorithm to develop rSW-seq (recursive Smith-Waterman-seq), to identify alterations in a complex configuration, which are commonly observed in the human cancer genome. To validate our approach, we compare our algorithm with an existing algorithm using simulated and publicly available datasets. We also compare the sequencing-based profiles to microarray-based results.

Conclusion: We propose rSW-seq as an efficient method for detecting copy number changes in the tumor genome.
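
The abstract above does not spell out how the Smith-Waterman algorithm is applied to read data. One plausible reading, sketched here purely as an assumption, is a local-alignment-style scan: tumor and normal reads are merged and sorted by position, each tumor read adds a positive weight, each normal read subtracts one, and the running score is reset to zero whenever it would drop to zero or below. The highest-scoring run is then a candidate copy number gain. The function name, the `'T'`/`'N'` labels, and the unit weights are hypothetical.

```python
def max_gain_segment(reads, w_tumor=1.0, w_normal=1.0):
    """Find the run of reads with the highest local score.

    reads: list of (position, label) sorted by position; label is 'T' (tumor)
           or 'N' (normal). Weights are assumed values for illustration.
    Returns (best_score, start_pos, end_pos); positions are None if no run
    scores above zero.
    """
    best = (0.0, None, None)
    score, start = 0.0, None
    for pos, label in reads:
        step = w_tumor if label == "T" else -w_normal
        if score + step <= 0:
            # Local-alignment reset: drop the current run and start afresh.
            score, start = 0.0, None
            continue
        if start is None:
            start = pos
        score += step
        if score > best[0]:
            best = (score, start, pos)
    return best

reads = [(10, "N"), (20, "T"), (30, "N"), (40, "T"), (50, "T"),
         (60, "T"), (70, "N"), (80, "T"), (90, "N"), (100, "N")]
best = max_gain_segment(reads)  # (3.0, 40, 60)
```

Swapping the signs of the two weights would score candidate losses instead of gains. With unequal tumor and normal coverage the weights would need rebalancing, which this sketch does not attempt.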

Briefings in Functional Genomics | 2010

Detecting structural variations in the human genome using next generation sequencing

Ruibin Xi; Tae-Min Kim; Peter J. Park

Structural variations are widespread in the human genome and can serve as genetic markers in clinical and evolutionary studies. With the advances in the next-generation sequencing technology, recent methods allow for identification of structural variations with unprecedented resolution and accuracy. They also provide opportunities to discover variants that could not be detected on conventional microarray-based platforms, such as dosage-invariant chromosomal translocations and inversions. In this review, we will describe some of the sequencing-based algorithms for detection of structural variations and discuss the key issues in future development.

Nucleic Acids Research | 2016

Copy number analysis of whole-genome data using BIC-seq2 and its application to detection of cancer susceptibility variants

Ruibin Xi; Semin Lee; Yuchao Xia; Tae-Min Kim; Peter J. Park

Whole-genome sequencing data allow detection of copy number variation (CNV) at high resolution. However, estimation based on read coverage along the genome suffers from bias due to GC content and other factors. Here, we develop an algorithm called BIC-seq2 that combines normalization of the data at the nucleotide level and Bayesian information criterion-based segmentation to detect both somatic and germline CNVs accurately. Analysis of simulation data showed that this method outperforms existing methods. We apply this algorithm to low coverage whole-genome sequencing data from peripheral blood of nearly a thousand patients across eleven cancer types in The Cancer Genome Atlas (TCGA) to identify cancer-predisposing CNV regions. We confirm known regions and discover new ones including those covering KMT2C, GOLPH3, ERBB2 and PLAG1. Analysis of colorectal cancer genomes in particular reveals novel recurrent CNVs including deletions at two chromatin-remodeling genes RERE and NPM2. This method will be useful to many researchers interested in profiling CNVs from whole-genome sequencing data.

Current Protocols in Human Genetics | 2012

A survey of copy-number variation detection tools based on high-throughput sequencing data

Ruibin Xi; Semin Lee; Peter J. Park

Copy‐number variation (CNV) is a major class of genomic variation with potentially important functional consequences in both normal and diseased populations. Remarkable advances in development of next‐generation sequencing (NGS) platforms provide an unprecedented opportunity for accurate, high‐resolution characterization of CNVs. In this unit, we give an overview of available computational tools for detection of CNVs and discuss comparative advantages and disadvantages of different approaches. Curr. Protoc. Hum. Genet. 75:7.19.1‐7.19.15.

Briefings in Bioinformatics | 2016

Evaluation of somatic copy number estimation tools for whole-exome sequencing data

Jae-Young Nam; Nayoung Kim; Sang Cheol Kim; Je-Gun Joung; Ruibin Xi; Semin Lee; Peter J. Park; Woong-Yang Park

Whole-exome sequencing (WES) has become a standard method for detecting genetic variants in human diseases. Although the primary use of WES data has been the identification of single nucleotide variations and indels, these data also offer a possibility of detecting copy number variations (CNVs) at high resolution. However, WES data have uneven read coverage along the genome owing to the target capture step, and the development of a robust WES-based CNV tool is challenging. Here, we evaluate six WES somatic CNV detection tools: ADTEx, CONTRA, Control-FREEC, EXCAVATOR, ExomeCNV and Varscan2. Using WES data from 50 kidney chromophobe, 50 bladder urothelial carcinoma, and 50 stomach adenocarcinoma patients from The Cancer Genome Atlas, we compared the CNV calls from the six tools with a reference CNV set that was identified by both single nucleotide polymorphism array 6.0 and whole-genome sequencing data. We found that these algorithms gave highly variable results: visual inspection reveals significant differences between the WES-based segmentation profiles and the reference profile, as well as among the WES-based profiles. Using a 50% overlap criterion, 13-77% of WES CNV calls were covered by CNVs from the reference set, up to 21% of the copy gains were called as losses or vice versa, and dramatic differences in CNV sizes and CNV numbers were observed. Overall, ADTEx and EXCAVATOR had the best performance with relatively high precision and sensitivity. We suggest that the current algorithms for somatic CNV detection from WES data are limited in their performance and that more robust algorithms are needed.
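
The abstract above reports results under a 50% overlap criterion without defining it in detail. A minimal sketch of one way such a criterion could be computed follows, assuming half-open `(start, end)` intervals on a single chromosome and ignoring whether a call is a gain or a loss. Both function names and the `min_overlap` parameter are hypothetical.

```python
def covered_fraction(call, reference):
    """Fraction of the call interval covered by the union of reference intervals.

    call:      (start, end), half-open, same chromosome assumed.
    reference: iterable of (start, end) reference CNV intervals.
    """
    start, end = call
    length = end - start
    if length <= 0:
        return 0.0
    # Clip each reference interval to the call and drop those that miss it.
    clipped = sorted((max(s, start), min(e, end)) for s, e in reference
                     if min(e, end) > max(s, start))
    covered, cur_end = 0, start
    for s, e in clipped:
        if e > cur_end:
            # Count only the part not already covered by earlier intervals.
            covered += e - max(s, cur_end)
            cur_end = e
    return covered / length

def is_supported(call, reference, min_overlap=0.5):
    """True if at least min_overlap of the call is covered by the reference set."""
    return covered_fraction(call, reference) >= min_overlap

frac = covered_fraction((100, 200), [(50, 130), (120, 160)])  # 60 of 100 bases
```

Merging overlapping reference intervals before summing prevents a base from being counted twice, which would otherwise inflate the covered fraction.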
