Publication


Featured research published by Kwok Pui Choi.


Journal of Computer and System Sciences | 2004

Sensitivity analysis and efficient method for identifying optimal spaced seeds

Kwok Pui Choi; Louxin Zhang

The introduction of the spaced seed idea in the filtration stage of sequence comparison by Ma et al. (Bioinformatics 18 (2002) 440) has greatly increased the sensitivity of homology search without compromising search speed. Finding optimal spaced seeds is of great importance both theoretically and for designing better search tools for sequence comparison. In this paper, we study the computational aspects of calculating the hitting probability of spaced seeds and, based on these results, propose an efficient algorithm for identifying optimal spaced seeds.
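To make "hitting probability" concrete, here is a minimal sketch. It assumes the common model in which each alignment column is a match independently with probability p; the abstract does not spell the model out. The function name `hit_probability` and the 0/1 string encoding of a seed are illustrative choices, and the dynamic program is a textbook approach, not necessarily the algorithm proposed in the paper.

```python
def hit_probability(seed: str, length: int, p: float) -> float:
    """Probability that `seed` hits a random alignment of `length` columns.

    Assumed model: every column is a match (1) with probability p,
    independently. `seed` is a string over {'1', '0'}, where '1' marks a
    position that must match and '0' a don't-care position.
    """
    span = len(seed)
    if length < span:
        return 0.0
    # Bit mask of required positions; the oldest column is the highest bit.
    required = sum(1 << (span - 1 - i) for i, c in enumerate(seed) if c == "1")
    keep = (1 << (span - 1)) - 1  # mask for the most recent span-1 columns

    # no_hit[state] = probability of reaching `state` without any hit so far,
    # where `state` encodes the last span-1 columns.
    no_hit = {0: 1.0}
    for t in range(1, length + 1):
        nxt = {}
        for state, prob in no_hit.items():
            for bit, q in ((1, p), (0, 1.0 - p)):
                window = (state << 1) | bit  # the last `span` columns
                if t >= span and window & required == required:
                    continue  # the seed hits here: leave the no-hit mass
                new_state = window & keep
                nxt[new_state] = nxt.get(new_state, 0.0) + prob * q
        no_hit = nxt
    return 1.0 - sum(no_hit.values())
```

For example, `hit_probability("11", 3, 0.5)` gives 0.375, since three of the eight binary strings of length 3 contain two consecutive matches. The state space grows as 2^(span-1), so this sketch is practical only for short seeds.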


Bioinformatics | 2004

Good spaced seeds for homology search

Kwok Pui Choi; Fanfan Zeng; Louxin Zhang

Motivation: Filtration is an important technique used to speed up local alignment, as exemplified in the BLAST programs. Recently, Ma et al. discovered that better filtering can be achieved by spacing out the matching positions according to a certain pattern, rather than requiring contiguous matching positions, to trigger a local alignment in their PatternHunter program. Such a match pattern is called a spaced seed. Results: Our numerical computation shows that the ranks of spaced seeds (based on sensitivity) change with sequence similarity. Since homologous sequences may have diverse similarity, we assess the sensitivity of spaced seeds over a range of similarity levels and present a list of good spaced seeds for facilitating homology search in DNA genomic sequences. We validate that the listed spaced seeds are indeed more sensitive using three arbitrarily chosen pairs of DNA genomic sequences.
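The following sketch illustrates what it means for a spaced seed to trigger a hit between two sequences. The function name `spaced_seed_hits` and the 0/1 seed encoding are assumptions made for illustration; this is not code from PatternHunter or from the paper.

```python
from collections import defaultdict


def spaced_seed_hits(query: str, target: str, seed: str) -> list[tuple[int, int]]:
    """Return (i, j) pairs where query[i:] and target[j:] agree at every
    '1' position of `seed`; '0' positions are ignored (don't care)."""
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)

    # Index the target by the letters found at the seed's '1' positions.
    index = defaultdict(list)
    for j in range(len(target) - span + 1):
        key = "".join(target[j + k] for k in care)
        index[key].append(j)

    # Look up every query window in the index.
    hits = []
    for i in range(len(query) - span + 1):
        key = "".join(query[i + k] for k in care)
        for j in index.get(key, []):
            hits.append((i, j))
    return hits
```

With `query = "ACGTACGT"`, `target = "ACCTACGA"` and the seed `"1101"`, the pair (0, 0) is reported even though the two windows differ at their third letter, because that position is a don't-care. The contiguous seed `"111"` does not report (0, 0).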


Proceedings of the American Mathematical Society | 1994

On the medians of gamma distributions and an equation of Ramanujan

Kwok Pui Choi

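For $n \ge 0$, let $\lambda(n)$ denote the median of the $\Gamma(n+1, 1)$ distribution. We prove that $n + \frac{2}{3} < \lambda(n) \le \min\left(n + \log 2,\ n + \frac{2}{3} + (2n+2)^{-1}\right)$. Let $\mathrm{median}(Z_\mu)$ denote the median of a Poisson random variable $Z_\mu$ with mean $\mu$. We show that the bounds on $\lambda(n)$ imply $\mu - \log 2 \le \mathrm{median}(Z_\mu) < \mu + \frac{1}{3}$. This proves a conjecture of Chen and Rubin. These inequalities are sharp.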
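A quick numerical check can make the bounds on the gamma median tangible. The sketch below assumes the bounds take the form n + 2/3 < median ≤ min(n + log 2, n + 2/3 + 1/(2n + 2)) for the Gamma(n + 1, 1) distribution with integer n ≥ 0. The helper names are illustrative, and the median is found by plain bisection on the closed-form CDF.

```python
import math


def gamma_cdf(n: int, x: float) -> float:
    """CDF of Gamma(n + 1, 1) at x for integer n >= 0, using
    P(X <= x) = 1 - exp(-x) * sum_{k=0}^{n} x^k / k!."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k
        total += term
    return 1.0 - math.exp(-x) * total


def gamma_median(n: int) -> float:
    """Median of Gamma(n + 1, 1), located by bisection on the CDF."""
    lo, hi = 0.0, n + 2.0  # the median lies below the mean n + 1
    for _ in range(200):
        mid = (lo + hi) / 2
        if gamma_cdf(n, mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def bounds(n: int) -> tuple[float, float]:
    """Assumed lower and upper bounds on the median of Gamma(n + 1, 1)."""
    lower = n + 2 / 3
    upper = min(n + math.log(2), n + 2 / 3 + 1 / (2 * n + 2))
    return lower, upper
```

For n = 0 the distribution is exponential with rate 1, whose median is exactly log 2, so the upper bound is attained there.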


Journal of Computational Biology | 2005

Nonrandom Clusters of Palindromes in Herpesvirus Genomes

Ming Ying Leung; Kwok Pui Choi; Aihua Xia; Louis H. Y. Chen

Palindromes are symmetrical words of DNA in the sense that they read exactly the same as their reverse complementary sequences. Representing the occurrences of palindromes in a DNA molecule as points on the unit interval, the scan statistics can be used to identify regions of unusually high concentration of palindromes. These regions have been associated with the replication origins on a few herpesviruses in previous studies. However, the use of scan statistics requires the assumption that the points representing the palindromes are independently and uniformly distributed on the unit interval. In this paper, we provide a mathematical basis for this assumption by showing that in randomly generated DNA sequences, the occurrences of palindromes can be approximated by a Poisson process. An easily computable upper bound on the Wasserstein distance between the palindrome process and the Poisson process is obtained. This bound is then used as a guide to choose an optimal palindrome length in the analysis of a collection of 16 herpesvirus genomes. Regions harboring significant palindrome clusters are identified and compared to known locations of replication origins. This analysis brings out a few interesting extensions of the scan statistics that can help formulate an algorithm for more accurate prediction of replication origins.
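As an illustration of the two ingredients described above, the sketch below finds DNA palindromes (words equal to their reverse complement) and then locates the densest window of palindrome positions. The function names and the fixed-width sliding window are assumptions for illustration. The paper works with points scaled to the unit interval and a formal scan statistic, neither of which is reproduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindrome(word: str) -> bool:
    """True if `word` reads the same as its reverse complement."""
    return word == word.translate(COMPLEMENT)[::-1]


def palindrome_positions(genome: str, length: int) -> list[int]:
    """Start positions of all palindromes of the given length."""
    return [
        i
        for i in range(len(genome) - length + 1)
        if is_palindrome(genome[i:i + length])
    ]


def densest_window(positions: list[int], window: int) -> tuple[int, int]:
    """Return (start, count) for the window [start, start + window) that
    begins at a palindrome position and contains the most positions.
    `positions` must be sorted in increasing order."""
    best_start, best_count = 0, 0
    right = 0
    for left, start in enumerate(positions):
        # Advance `right` past every position inside the current window.
        while right < len(positions) and positions[right] < start + window:
            right += 1
        if right - left > best_count:
            best_start, best_count = start, right - left
    return best_start, best_count
```

For instance, `palindrome_positions("TTGAATTCTT", 6)` returns `[2]`, the start of `GAATTC`. Dividing positions by the genome length maps them onto the unit interval used in the abstract.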


Nature Communications | 2013

Counting motifs in the human interactome

Ngoc Hieu Tran; Kwok Pui Choi; Louxin Zhang

Small over-represented motifs in biological networks often form essential functional units of biological processes. A natural question is to gauge whether a motif occurs abundantly or rarely in a biological network. Here we develop an accurate method to estimate the occurrences of a motif in the entire network from noisy and incomplete data, and apply it to eukaryotic interactomes and cell-specific transcription factor regulatory networks. The number of triangles in the human interactome is about 194 times that in the Saccharomyces cerevisiae interactome. A strong positive linear correlation exists between the numbers of occurrences of triad and quadriad motifs in human cell-specific transcription factor regulatory networks. Our findings show that the proposed method is general and powerful for counting motifs and can be applied to any network regardless of its topological structure.
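To illustrate what counting a motif involves, the sketch below counts triangles in an undirected network given as an edge list. The scale-up helper is a deliberately naive illustration and not the estimator developed in the paper. It assumes each node is observed independently with probability `p`, that all edges among observed nodes are seen, and that there are no spurious edges, so that a triangle survives with probability p³. Both function names are hypothetical.

```python
from itertools import combinations


def count_triangles(edges) -> int:
    """Number of triangles in an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    for nbrs in adj.values():
        # Each pair of neighbours that is itself connected closes a triangle.
        for v, w in combinations(nbrs, 2):
            if w in adj[v]:
                total += 1
    return total // 3  # every triangle is found once from each of its corners


def naive_scale_up(observed_count: int, p: float) -> float:
    """Naive estimate of the full-network triangle count under the
    node-sampling assumption described above (illustrative only)."""
    return observed_count / p ** 3
```

A complete graph on four nodes, for example, contains four triangles. Real interactome data also contain false-positive and false-negative edges, which this naive correction ignores.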


Journal of Computational Biology | 2005

Quick, practical selection of effective seeds for homology search

Franco P. Preparata; Louxin Zhang; Kwok Pui Choi

It has been observed that in homology search gapped seeds have better sensitivity than ungapped ones for the same cost (weight). In this paper, we propose a probability leakage model (a dissipative Markov system) to elucidate the mechanism that confers power to spaced seeds. Based on this model, we identify desirable features of gapped search seeds and formulate an extremely efficient procedure for seed design: it samples from the set of spaced seeds exhibiting those features, evaluates their sensitivity, and then selects the best. The sensitivity of the constructed seeds is negligibly less than that of the corresponding known optimal seeds. While the challenging mathematical question of characterizing optimal search seeds remains open, we believe that our eminently efficient and effective approach represents a satisfactory solution from a practitioner's viewpoint.
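The sample, evaluate, select loop can be sketched as follows. Several simplifications are assumed: candidates are drawn uniformly among seeds of a given weight and span that start and end with a required position, rather than from the feature-based set the paper identifies. Sensitivity is estimated by Monte Carlo simulation under an independent-match model with probability `p`. All function names and parameters are hypothetical.

```python
import random


def random_seed_pattern(weight: int, span: int, rng: random.Random) -> str:
    """Random 0/1 seed with `weight` ones whose first and last positions are 1.
    Requires 2 <= weight <= span."""
    inner = rng.sample(range(1, span - 1), weight - 2)
    chosen = {0, span - 1, *inner}
    return "".join("1" if i in chosen else "0" for i in range(span))


def estimate_sensitivity(seed: str, length: int, p: float,
                         trials: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the probability that `seed` hits a random
    alignment of `length` columns, each a match with probability p."""
    care = [i for i, c in enumerate(seed) if c == "1"]
    hits = 0
    for _ in range(trials):
        region = [rng.random() < p for _ in range(length)]
        if any(all(region[s + k] for k in care)
               for s in range(length - len(seed) + 1)):
            hits += 1
    return hits / trials


def pick_best_seed(weight: int, span: int, length: int, p: float,
                   candidates: int = 20, trials: int = 2000,
                   rng_seed: int = 0) -> tuple[str, float]:
    """Sample candidate seeds, score each one, and return the best."""
    rng = random.Random(rng_seed)
    pool = {random_seed_pattern(weight, span, rng) for _ in range(candidates)}
    scored = {s: estimate_sensitivity(s, length, p, trials, rng)
              for s in sorted(pool)}
    best = max(scored, key=scored.get)
    return best, scored[best]
```

Because the scores are simulation estimates, two seeds with nearly equal sensitivity may swap ranks between runs with different `rng_seed` values.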


Transactions of the American Mathematical Society | 1988

Some sharp inequalities for martingale transforms

Kwok Pui Choi

Two sharp inequalities for martingale transforms are proved. These results extend some earlier work of Burkholder. The inequalities are then extended to stochastic integrals and differentially subordinate martingales.


Journal of Bioinformatics and Computational Biology | 2015

A quantum leap in the reproducibility, precision, and sensitivity of gene expression profile analysis even when sample size is extremely small

Kevin Lim; Zhenhua Li; Kwok Pui Choi; Limsoon Wong

Transcript-level quantification is often measured across two groups of patients to aid the discovery of biomarkers and detection of biological mechanisms involving these biomarkers. Statistical tests lack power and the false discovery rate is high when sample size is small. Yet, many experiments have very few samples (≤ 5). This creates the impetus for a method to discover biomarkers and mechanisms under very small sample sizes. We present a powerful method, ESSNet, that is able to identify subnetworks consistently across independent datasets of the same disease phenotypes even under very small sample sizes. The key idea of ESSNet is to fragment large pathways into smaller subnetworks and compute a statistic that discriminates the subnetworks in two phenotypes. We do not greedily select genes to be included based on differential expression but rely on gene-expression-level ranking within a phenotype, which is shown to be stable even under extremely small sample sizes. We test our subnetworks on null distributions obtained by array rotation; this preserves the gene-gene correlation structure and is suitable for datasets with small sample size, allowing us to consistently predict relevant subnetworks even when sample size is small. For most other methods, this consistency drops to less than 10% when we test them on datasets with only two samples from each phenotype, whereas ESSNet is able to achieve an average consistency of 58% (72% when we consider genes within the subnetworks) and continues to be superior when sample size is large. We further show that the subnetworks identified by ESSNet are highly correlated to many references in the biological literature. ESSNet and supplementary material are available at: http://compbio.ddns.comp.nus.edu.sg:8080/essnet


Nucleic Acids Research | 2005

A post-processing method for optimizing synthesis strategy for oligonucleotide microarrays

Kang Ning; Kwok Pui Choi; Hon Wai Leong; Louxin Zhang

The broad applicability of gene expression profiling to genomic analyses has generated huge demand for mass production of microarrays and hence for improving the cost effectiveness of microarray fabrication. We developed a post-processing method for deriving a good synthesis strategy. In this paper, we assessed all the known efficient methods and our post-processing method for reducing the number of synthesis cycles for manufacturing a DNA-chip of a given set of oligos. Our experimental results on both simulated and 52 real datasets show that no single method consistently gives the best synthesis strategy, and post-processing an existing strategy is necessary as it often reduces the number of synthesis cycles further.
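The notion of "synthesis cycles" can be illustrated with a small sketch. It assumes the usual abstraction in which each cycle deposits one nucleotide, so an oligo can be built whenever it is a subsequence of the deposition sequence. The post-processing step shown (dropping cycles that no oligo uses under a leftmost embedding) is a simple illustration of the idea, not the method developed in the paper. All function names are hypothetical, and oligos are assumed to be non-empty strings over ACGT.

```python
def embed(oligo: str, deposition: str):
    """Leftmost embedding of `oligo` in `deposition`: the list of cycle
    indices used, or None if `oligo` is not a subsequence."""
    used, pos = [], 0
    for base in oligo:
        pos = deposition.find(base, pos)
        if pos < 0:
            return None
        used.append(pos)
        pos += 1
    return used


def periodic_deposition(oligos: list[str], period: str = "ACGT") -> str:
    """Shortest prefix of the repeated `period` that can build every oligo."""
    longest = max(len(o) for o in oligos)
    full = period * longest  # long enough for any oligo of that length
    last = max(embed(o, full)[-1] for o in oligos)
    return full[: last + 1]


def drop_unused_cycles(oligos: list[str], deposition: str) -> str:
    """Remove cycles that no oligo uses in its leftmost embedding."""
    used = set()
    for oligo in oligos:
        cycles = embed(oligo, deposition)
        if cycles is None:
            raise ValueError(f"{oligo!r} cannot be built from this strategy")
        used.update(cycles)
    return "".join(b for i, b in enumerate(deposition) if i in used)
```

For the oligos `AAA` and `TTT`, the periodic strategy needs 12 cycles (`ACGTACGTACGT`), while dropping the unused C and G cycles leaves `ATATAT`, which still builds both oligos in 6 cycles.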


Nucleic Acids Research | 2014

Profiling the transcription factor regulatory networks of human cell types

Shihua Zhang; Dechao Tian; Ngoc Hieu Tran; Kwok Pui Choi; Louxin Zhang

Neph et al. (2012) (Circuitry and dynamics of human transcription factor regulatory networks. Cell, 150: 1274–1286) reported the transcription factor (TF) regulatory networks of 41 human cell types using the DNaseI footprinting technique. This provides a valuable resource for uncovering regulation principles in different human cells. In this paper, the architectures of the 41 regulatory networks and the distributions of housekeeping and specific regulatory interactions are investigated. The TF regulatory networks of different human cell types demonstrate similar global three-layer (top, core and bottom) hierarchical architectures, which are greatly different from the yeast TF regulatory network. However, they have distinguishable local organizations, as suggested by the fact that wiring patterns of only a few TFs are enough to distinguish cell identities. The TF regulatory network of human embryonic stem cells (hESCs) is dense and enriched with interactions that are unseen in the networks of other cell types. The examination of specific regulatory interactions suggests that specific interactions play important roles in hESCs.

Collaboration


Dive into Kwok Pui Choi's collaborations.

Top Co-Authors

Louxin Zhang, National University of Singapore
Ngoc Hieu Tran, Nanyang Technological University
Taoyang Wu, University of East Anglia
Ming Ying Leung, University of Texas at El Paso
David S H Chew, National University of Singapore
Juntao Li, National University of Singapore
Limsoon Wong, National University of Singapore
Louis H. Y. Chen, National University of Singapore
Si Li, National University of Singapore
Dechao Tian, National University of Singapore