Hong Sain Ooi
Agency for Science, Technology and Research
Publication
Featured research published by Hong Sain Ooi.
Nature | 2009
Melissa J. Fullwood; Liu Mh; Pan Yf; Jianjun Liu; Xu H; Mohamed Yb; Yuriy L. Orlov; Velkov S; Ho A; Mei Ph; Chew Eg; Huang Py; Welboren Wj; Yuyuan Han; Hong Sain Ooi; Pramila Ariyaratne; Vinsensius B. Vega; Luo Y; Peck Yean Tan; Choy Py; Wansa Kd; Zhao B; Kar Sian Lim; Leow Sc; Yow Js; Joseph R; Li H; Desai Kv; Thomsen Js; Lee Yk
Genomes are organized into high-level three-dimensional structures, and DNA elements separated by long genomic distances can in principle interact functionally. Many transcription factors bind to regulatory DNA elements distant from gene promoters. Although distal binding sites have been shown to regulate transcription by long-range chromatin interactions at a few loci, chromatin interactions and their impact on transcription regulation have not been investigated in a genome-wide manner. Here we describe the development of a new strategy, chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) for the de novo detection of global chromatin interactions, with which we have comprehensively mapped the chromatin interaction network bound by oestrogen receptor α (ER-α) in the human genome. We found that most high-confidence remote ER-α-binding sites are anchored at gene promoters through long-range chromatin interactions, suggesting that ER-α functions by extensive chromatin looping to bring genes together for coordinated transcriptional regulation. We propose that chromatin interactions constitute a primary mechanism for regulating transcription in mammalian genomes.
Proceedings of the National Academy of Sciences of the United States of America | 2006
Karen I. Zeller; Xiaodong Zhao; Charlie W. H. Lee; Kuo Ping Chiu; Fei Yao; Jason T. Yustein; Hong Sain Ooi; Yuriy L. Orlov; Atif Shahab; How Choong Yong; Yutao Fu; Zhiping Weng; Vladimir A. Kuznetsov; Wing-Kin Sung; Yijun Ruan; Chi V. Dang; Chia-Lin Wei
The protooncogene MYC encodes the c-Myc transcription factor that regulates cell growth, cell proliferation, cell cycle, and apoptosis. Although deregulation of MYC contributes to tumorigenesis, it is still unclear what direct Myc-induced transcriptomes promote cell transformation. Here we provide a snapshot of genome-wide, unbiased characterization of direct Myc binding targets in a model of human B lymphoid tumor using ChIP coupled with pair-end ditag sequencing analysis (ChIP-PET). Myc potentially occupies >4,000 genomic loci with the majority near proximal promoter regions associated frequently with CpG islands. Using gene expression profiles with ChIP-PET, we identified 668 direct Myc-regulated gene targets, including 48 transcription factors, indicating that Myc is a central transcriptional hub in growth and proliferation control. This first global genomic view of Myc binding sites yields insights of transcriptional circuitries and cis regulatory modules involving Myc and provides a substantial framework for our understanding of mechanisms of Myc-induced tumorigenesis.
Genome Biology | 2010
Guoliang Li; Melissa J. Fullwood; Han Xu; Fabianus Hendriyan Mulawadi; Stoyan Velkov; Vinsensius B. Vega; Pramila Ariyaratne; Yusoff Bin Mohamed; Hong Sain Ooi; Chia-Lin Wei; Yijun Ruan; Wing-Kin Sung
Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg).
Nucleic Acids Research | 2006
Patrick Kwok Shing Ng; Jack J.S. Tan; Hong Sain Ooi; Yen Ling Lee; Kuo Ping Chiu; Melissa J. Fullwood; Kandhadayar G. Srinivasan; Clotilde Perbost; Lei Du; Wing-Kin Sung; Chia-Lin Wei; Yijun Ruan
The paired-end ditagging (PET) technique has been shown to be efficient and accurate for large-scale transcriptome and genome analysis. However, as with other DNA tag-based sequencing strategies, it is constrained by the current efficiency of Sanger technology. A recently developed multiplex sequencing method (454-sequencing™) using picolitre-scale reactions has achieved a remarkable advance in efficiency, but suffers from short-read lengths, and a lack of paired-end information. To further enhance the efficiency of PET analysis and at the same time overcome the drawbacks of the new sequencing method, we coupled multiplex sequencing with paired-end ditagging (MS-PET) using modified PET procedures to simultaneously sequence 200 000 to 300 000 dimerized PET (diPET) templates, with an output of nearly half-a-million PET sequences in a single 4 h machine run. We demonstrate the utility and robustness of MS-PET by analyzing the transcriptome of human breast carcinoma cells, and by mapping p53 binding sites in the genome of human colorectal carcinoma cells. This combined sequencing strategy achieved an approximate 100-fold efficiency increase over the current standard for PET analysis, and furthermore enables the short-read-length multiplex sequencing procedure to acquire paired-end information from large DNA fragments.
Nucleic Acids Research | 2009
Hong Sain Ooi; Chia Yee Kwo; Michael Wildpaner; Fernanda L. Sirota; Birgit Eisenhaber; Sebastian Maurer-Stroh; Wing Cheong Wong; Alexander Schleiffer; Frank Eisenhaber; Georg Schneider
Function prediction of proteins with computational sequence analysis requires the use of dozens of prediction tools with a bewildering range of input and output formats. Each of these tools focuses on a narrow aspect and researchers are having difficulty obtaining an integrated picture. ANNIE is the result of years of close interaction between computational biologists and computer scientists and automates an essential part of this sequence analytic process. It brings together over 20 function prediction algorithms that have proven sufficiently reliable and indispensable in daily sequence analytic work and are meant to give scientists a quick overview of possible functional assignments of sequence segments in the query proteins. The results are displayed in an integrated manner using an innovative AJAX-based sequence viewer. ANNIE is available online at: http://annie.bii.a-star.edu.sg. This website is free and open to all users and there is no login requirement.
BMC Genomics | 2010
Fernanda L. Sirota; Hong Sain Ooi; Tobias Gattermayer; Georg Schneider; Frank Eisenhaber; Sebastian Maurer-Stroh
Background: Algorithms designed to predict protein disorder play an important role in structural and functional genomics, as disordered regions have been reported to participate in important cellular processes. Consequently, several methods with different underlying principles for disorder prediction have been independently developed by various groups. For assessing their usability in automated workflows, we are interested in identifying parameter settings and threshold selections, under which the performance of these predictors becomes directly comparable. Results: First, we derived a new benchmark set that accounts for different flavours of disorder complemented with a similar amount of order annotation derived for the same protein set. We show that, using the recommended default parameters, the programs tested are producing a wide range of predictions at different levels of specificity and sensitivity. We identify settings, in which the different predictors have the same false positive rate. We assess conditions when sets of predictors can be run together to derive consensus or complementary predictions. This is useful in the framework of proteome-wide applications where high specificity is required such as in our in-house sequence analysis pipeline and the ANNIE webserver. Conclusions: This work identifies parameter settings and thresholds for a selection of disorder predictors to produce comparable results at a desired level of specificity over a newly derived benchmark dataset that accounts equally for ordered and disordered regions of different lengths.
BMC Bioinformatics | 2006
Kuo Ping Chiu; Chee-Hong Wong; Qiongyu Chen; Pramila Ariyaratne; Hong Sain Ooi; Chia-Lin Wei; Wing-Kin Sung; Yijun Ruan
Background: We recently developed the Paired End diTag (PET) strategy for efficient characterization of mammalian transcriptomes and genomes. The paired end nature of short PET sequences derived from long DNA fragments raised a new set of bioinformatics challenges, including how to extract PETs from raw sequence reads, and correctly yet efficiently map PETs to reference genome sequences. To accommodate and streamline data analysis of the large volume PET sequences generated from each PET experiment, an automated PET data process pipeline is desirable. Results: We designed an integrated computation program package, PET-Tool, to automatically process PET sequences and map them to the genome sequences. The Tool was implemented as a web-based application composed of four modules: the Extractor module for PET extraction; the Examiner module for analytic evaluation of PET sequence quality; the Mapper module for locating PET sequences in the genome sequences; and the ProjectManager module for data organization. The performance of PET-Tool was evaluated through the analyses of 2.7 million PET sequences. It was demonstrated that PET-Tool is accurate and efficient in extracting PET sequences and removing artifacts from large volume dataset. Using optimized mapping criteria, over 70% of quality PET sequences were mapped specifically to the genome sequences. With a 2.4 GHz LINUX machine, it takes approximately six hours to process one million PETs from extraction to mapping. Conclusion: The speed, accuracy, and comprehensiveness have proved that PET-Tool is an important and useful component in PET experiments, and can be extended to accommodate other related analyses of paired-end sequences. The Tool also provides user-friendly functions for data quality check and system for multi-layer data management.
Methods in Molecular Biology | 2010
Hong Sain Ooi; Georg Schneider; Teng-Ting Lim; Ying-Leong Chan; Birgit Eisenhaber; Frank Eisenhaber
From the database point of view, biomolecular pathways are sets of proteins and other biomacromolecules that represent spatio-temporally organized cascades of interactions with the involvement of low-molecular compounds and are responsible for achieving specific phenotypic biological outcomes. A pathway is usually associated with certain subcellular compartments. In this chapter, we analyze the major public biomolecular pathway databases. Special attention is paid to database scope, completeness, issues of annotation reliability, and pathway classification. In addition, systems for information retrieval, tools for mapping user-defined gene sets onto the information in pathway databases, and their typical research applications are reviewed. Whereas today, pathway databases contain almost exclusively qualitative information, the desired trend is toward quantitative description of interactions and reactions in pathways, which will gradually enable predictive modeling and transform the pathway databases into analytical workbenches.
Methods in Molecular Biology | 2010
Hong Sain Ooi; Georg Schneider; Ying-Leong Chan; Teng-Ting Lim; Birgit Eisenhaber; Frank Eisenhaber
In the current understanding, translation of genomic sequences into proteins is the most important path for realization of genome information. In exercising their intended function, proteins work together through various forms of direct (physical) or indirect interaction mechanisms. For a variety of basic functions, many proteins form a large complex representing a molecular machine or a macromolecular super-structural building block. After several high-throughput techniques for detection of protein-protein interactions had matured, protein interaction data became available in a large scale and curated databases for protein-protein interactions (PPIs) are a new necessity for efficient research. Here, their scope, annotation quality, and retrieval tools are reviewed. In addition, attention is paid to portals that provide unified access to a variety of such databases with added annotation value.
Archive | 2012
Georg Schneider; Westley Sherman; Durga Kuchibhatla; Hong Sain Ooi; Fernanda L. Sirota; Sebastian Maurer-Stroh; Birgit Eisenhaber; Frank Eisenhaber
While very little genomic sequence is interpretable in terms of biological mechanism directly, the chances are much better for protein-coding genes that can be translated into protein sequences. This review considers the different concepts applicable to sequence analysis and function prediction of globular and non-globular protein segments. The publicly accessible ANNOTATOR software environment integrates most of the reliable protein sequence-based function prediction methods, protein domain databases and pathway, and protein–protein interaction collections developed in academia. As application example, the structural and functional domains of mel-28/ELYS, an important nuclear protein, are delineated and are proposed for experimental follow-up in structural biology and functional studies.