Hon Nian Chua
National University of Singapore
Publication
Featured research published by Hon Nian Chua.
Bioinformatics | 2006
Hon Nian Chua; Wing-Kin Sung; Limsoon Wong
Motivation: Most approaches to predicting protein function from protein-protein interaction data rely on the observation that a protein often shares functions with the proteins that interact with it (its level-1 neighbours). However, proteins that interact with the same proteins (i.e. level-2 neighbours) may also have a greater likelihood of sharing similar physical or biochemical characteristics. We speculate that functional similarity between a protein and its neighbours from the two different levels arises from two distinct forms of functional association, and that a protein is likely to share functions with its level-1 and/or level-2 neighbours. We are interested in finding out how significant the functional association between level-2 neighbours is and how it can be exploited for protein function prediction. Results: We made a statistical study of recent interaction data and observed that functional association between level-2 neighbours is clearly observable. A substantial number of proteins are observed to share functions with level-2 neighbours but not with level-1 neighbours. We develop an algorithm that predicts the functions of a protein in two steps: (1) assigning a weight to each of its level-1 and level-2 neighbours by estimating its functional similarity with the protein using the local topology of the interaction network as well as the reliability of experimental sources and (2) scoring each function based on its weighted frequency in these neighbours. Using leave-one-out cross validation, we compare the performance of our method against that of several other existing approaches and show that our method performs relatively well.
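The two-step procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weighting formula, so `topology_weight` below is a stand-in (a Jaccard overlap of closed neighbourhoods) and the reliability of experimental sources is ignored. The function names, the toy network and the annotations are all hypothetical.

```python
from collections import defaultdict


def neighbour_levels(graph, protein):
    """Return (level-1, level-2) neighbour sets of `protein`.

    Level-2 neighbours share an interaction partner with `protein`
    but do not interact with it directly.
    """
    level1 = set(graph.get(protein, ())) - {protein}
    level2 = set()
    for partner in level1:
        level2 |= set(graph.get(partner, ()))
    level2 -= level1 | {protein}
    return level1, level2


def topology_weight(graph, u, v):
    """ASSUMED stand-in weight: Jaccard overlap of closed neighbourhoods.

    The published method uses its own topology- and reliability-based
    weight; this is only a placeholder with a similar intent.
    """
    nu = set(graph.get(u, ())) | {u}
    nv = set(graph.get(v, ())) | {v}
    return len(nu & nv) / len(nu | nv)


def score_functions(graph, annotations, protein):
    """Step 1: weight each neighbour. Step 2: sum weights per function."""
    level1, level2 = neighbour_levels(graph, protein)
    scores = defaultdict(float)
    for neighbour in level1 | level2:
        weight = topology_weight(graph, protein, neighbour)
        for function in annotations.get(neighbour, ()):
            scores[function] += weight
    return dict(scores)


# Hypothetical toy network: A-B, A-C, B-D, C-D (D is a level-2 neighbour of A).
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}
# Hypothetical annotations; C is unannotated.
annotations = {"B": {"kinase"}, "D": {"kinase", "transport"}}

scores = score_functions(graph, annotations, "A")
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy network the level-2 neighbour D contributes to the score of both functions even though it does not interact with A directly, which is the effect the abstract describes.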
International Conference on Data Mining | 2006
Hon Nian Chua; Wing-Kin Sung; Limsoon Wong
Bioinformatics | 2009
Guimei Liu; Limsoon Wong; Hon Nian Chua
Motivation: Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large number of protein interactions, which makes it possible to predict protein complexes from protein-protein interaction (PPI) networks. However, protein interaction data produced by high-throughput experiments are often associated with high false positive and false negative rates, which makes it difficult to predict complexes accurately. Results: We use an iterative scoring method to assign weights to protein pairs, where the weight of a protein pair indicates the reliability of the interaction between the two proteins. We develop an algorithm called CMC (clustering-based on maximal cliques) to discover complexes from the weighted PPI network. CMC first generates all the maximal cliques from the PPI network, and then removes or merges highly overlapped clusters based on their interconnectivity. We studied the performance of CMC and the impact of our iterative scoring method on CMC. Our results show that: (i) the iterative scoring method can improve the performance of CMC considerably; (ii) the iterative scoring method can effectively reduce the impact of random noise on the performance of CMC; (iii) the iterative scoring method can also improve the performance of other protein complex prediction methods and reduce the impact of random noise on their performance; and (iv) CMC is an effective approach to protein complex prediction from protein interaction networks. Supplementary information: Supplementary data are available at Bioinformatics online.
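The clique-then-merge idea in this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the published CMC algorithm: it works on an unweighted network (CMC uses weighted interactions), and the overlap measure, the interconnectivity measure and both thresholds are hypothetical choices made for the example.

```python
def build_adjacency(edges):
    """Build a node -> set-of-neighbours map from undirected edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def overlap(a, b):
    """ASSUMED overlap measure: shared members over the smaller cluster."""
    return len(a & b) / min(len(a), len(b))


def interconnectivity(adj, a, b):
    """ASSUMED measure: fraction of possible edges between a-b and b-a."""
    left, right = a - b, b - a
    if not left or not right:
        return 1.0
    present = sum(1 for u in left for v in right if v in adj[u])
    return present / (len(left) * len(right))


def merge_or_prune(adj, cliques, overlap_thres=0.5, merge_thres=0.5):
    """Visit cliques from largest to smallest; a clique that overlaps a kept
    cluster heavily is merged into it if well connected, otherwise dropped."""
    kept = []
    for clique in sorted(cliques, key=len, reverse=True):
        candidate = set(clique)
        for cluster in kept:
            if overlap(candidate, cluster) >= overlap_thres:
                if interconnectivity(adj, candidate, cluster) >= merge_thres:
                    cluster |= candidate  # merge in place
                break  # merged or dropped
        else:
            kept.append(candidate)
    return kept


# Hypothetical toy network: a 4-clique A-D, a node E attached to C and D,
# and a separate triangle F-G-H.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("C", "E"), ("D", "E"),
         ("F", "G"), ("G", "H"), ("F", "H")]
adj = build_adjacency(edges)
cliques = maximal_cliques(adj)
clusters = merge_or_prune(adj, cliques)
```

Here the clique {C, D, E} overlaps the 4-clique heavily, but E has no edges to A or B, so it is dropped rather than merged. Lowering `merge_thres` to 0 would merge it instead.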
Molecular Systems Biology | 2014
Murat Cokol; Hon Nian Chua; Murat Tasan; Beste Mutlu; Zohar B. Weinstein; Yo Suzuki; Mehmet Ercan Nergiz; Michael Costanzo; Anastasia Baryshnikova; Guri Giaever; Corey Nislow; Chad L. Myers; Brenda Andrews; Charles Boone; Frederick P. Roth
Drug synergy allows a therapeutic effect to be achieved with lower doses of component drugs. Drug synergy can result when drugs target the products of genes that act in parallel pathways (‘specific synergy’). Such cases of drug synergy should tend to correspond to synergistic genetic interaction between the corresponding target genes. Alternatively, ‘promiscuous synergy’ can arise when one drug non‐specifically increases the effects of many other drugs, for example, by increased bioavailability. To assess the relative abundance of these drug synergy types, we examined 200 pairs of antifungal drugs in S. cerevisiae. We found 38 antifungal synergies, 37 of which were novel. While 14 cases of drug synergy corresponded to genetic interaction, 92% of the synergies we discovered involved only six frequently synergistic drugs. Although promiscuity of four drugs can be explained under the bioavailability model, the promiscuity of Tacrolimus and Pentamidine was completely unexpected. While many drug synergies correspond to genetic interactions, the majority of drug synergies appear to result from non‐specific promiscuous synergy.
BioEssays | 2012
Alicia A. Bicknell; Can Cenik; Hon Nian Chua; Frederick P. Roth; Melissa J. Moore
Although introns in 5′‐ and 3′‐untranslated regions (UTRs) are found in many protein coding genes, rarely are they considered distinctive entities with specific functions. Indeed, mammalian transcripts with 3′‐UTR introns are often assumed nonfunctional because they are subject to elimination by nonsense‐mediated decay (NMD). Nonetheless, recent findings indicate that 5′‐ and 3′‐UTR intron status is of significant functional consequence for the regulation of mammalian genes. Therefore these features should be ignored no longer.
BMC Bioinformatics | 2007
Hon Nian Chua; Wing-Kin Sung; Limsoon Wong
Background: Protein-protein interaction has been used to complement traditional sequence homology to elucidate protein function. Most existing approaches only make use of direct interactions to infer function, and some have studied the application of indirect interactions for functional inference but are unable to improve prediction performance. We have previously proposed an approach, FS-Weighted Averaging, which uses topological weighting and level-2 indirect interactions (protein pairs connected via two interactions) for predicting protein function from protein interactions and have found that it yields predictions with superior precision on yeast proteins over existing approaches. Here we study the use of this technique to predict functional annotations from the Gene Ontology for seven genomes: Saccharomyces cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Rattus norvegicus, Mus musculus, and Homo sapiens. Results: Our analysis shows that protein-protein interactions provide supplementary coverage over sequence homology in the inference of protein function and are definitely a complement to sequence homology. We also find that FS-Weighted Averaging consistently outperforms two classical approaches, Neighbor Counting and Chi-Square, across the seven genomes for all three categories of the Gene Ontology. By randomly adding and removing interactions from the interaction data, we find that FS-Weighted Averaging is also rather robust against noisy interaction data. Conclusion: We have conducted a comprehensive study over seven genomes. We conclude that FS-Weighted Averaging can effectively make use of indirect interactions to make the inference of protein functions from protein interactions more effective. Furthermore, the technique is general enough to work over a variety of genomes.
Bioinformatics | 2007
Hon Nian Chua; Wing-Kin Sung; Limsoon Wong
Motivation: With the increasing availability of diverse biological information, protein function prediction approaches have converged towards integration of heterogeneous data. Many adapted existing techniques, such as machine-learning and probabilistic methods, which have proven successful on specific data types. However, the impact of these approaches is hindered by a couple of factors. First, there is little comparison between existing approaches. This is in part due to a divergence in the focus adopted by different works, which makes comparison difficult or even fuzzy. Second, there seems to be over-emphasis on the use of computationally demanding machine-learning methods, which runs counter to the surge in biological data. Analogous to the success of BLAST for sequence homology search, we believe that the ability to tap escalating quantity, quality and diversity of biological data is crucial to the success of automated function prediction as a useful instrument for the advancement of proteomic research. We address these problems by: (1) providing useful comparison between some prominent methods; (2) proposing Integrated Weighted Averaging (IWA), a scalable, efficient and flexible function prediction framework that integrates diverse information using simple weighting strategies and a local prediction method. The simplicity of the approach makes it possible to make predictions based on on-the-fly information fusion. Results: In addition to its greater efficiency, IWA performs exceptionally well against existing approaches. In the presence of cross-genome information, which is overwhelming for existing approaches, IWA makes even better predictions. We also demonstrate the significance of appropriate weighting strategies in data integration.
Drug Discovery Today | 2008
Hon Nian Chua; Limsoon Wong
Protein interactions are crucial components of all cellular processes. An in-depth knowledge of the full complement of protein interactions in a cell, therefore, provides insight into the structure, properties and functions of the cell and its components. An accurate and comprehensive protein interaction network is, thus, an invaluable framework to study protein regulation in disease. Although the amount of protein-protein interaction data has grown significantly because of advances in high-throughput experimental techniques, these high-throughput methods are highly susceptible to noise. Therefore, computational techniques for assessing the reliability of a protein-protein interaction are highly desirable. We review here computational techniques for assessing and improving the reliability of protein-protein interaction data from these high-throughput experiments.
PLOS Genetics | 2011
Can Cenik; Hon Nian Chua; Stefan P. Tarnawsky; Abdalla Akef; Adnan Derti; Murat Tasan; Melissa J. Moore; Alexander F. Palazzo; Frederick P. Roth
In higher eukaryotes, messenger RNAs (mRNAs) are exported from the nucleus to the cytoplasm via factors deposited near the 5′ end of the transcript during splicing. The signal sequence coding region (SSCR) can support an alternative mRNA export (ALREX) pathway that does not require splicing. However, most SSCR–containing genes also have introns, so the interplay between these export mechanisms remains unclear. Here we support a model in which the furthest upstream element in a given transcript, be it an intron or an ALREX–promoting SSCR, dictates the mRNA export pathway used. We also experimentally demonstrate that nuclear-encoded mitochondrial genes can use the ALREX pathway. Thus, ALREX can also be supported by nucleotide signals within mitochondrial-targeting sequence coding regions (MSCRs). Finally, we identified and experimentally verified novel motifs associated with the ALREX pathway that are shared by both SSCRs and MSCRs. Our results show strong correlation between 5′ untranslated region (5′UTR) intron presence/absence and sequence features at the beginning of the coding region. They also suggest that genes encoding secretory and mitochondrial proteins share a common regulatory mechanism at the level of mRNA export.
PLOS ONE | 2014
Yujing J. Heng; Craig E. Pennell; Hon Nian Chua; Jonathan Edward Perkins; Stephen J. Lye
Threatened preterm labor (TPTL) is defined as persistent premature uterine contractions between 20 and 37 weeks of gestation and is the most common condition that requires hospitalization during pregnancy. Most of these TPTL women continue their pregnancies to term while only an estimated 5% will deliver a premature baby within ten days. The aim of this work was to study differential whole blood gene expression associated with spontaneous preterm birth (sPTB) within 48 hours of hospital admission. Peripheral blood was collected at point of hospital admission from 154 women with TPTL before any medical treatment. Microarrays were utilized to investigate differential whole blood gene expression between TPTL women who did (n = 48) or did not have a sPTB (n = 106) within 48 hours of admission. Total leukocyte and neutrophil counts were significantly higher (35% and 41% respectively) in women who had sPTB than women who did not deliver within 48 hours (p<0.001). Fetal fibronectin (fFN) test was performed on 62 women. There was no difference in the urine, vaginal and placental microbiology and histopathology reports between the two groups of women. There were 469 significant differentially expressed genes (FDR<0.05); 28 differentially expressed genes were chosen for microarray validation using qRT-PCR and 20 out of 28 genes were successfully validated (p<0.05). An optimal random forest classifier model to predict sPTB was achieved using the top nine differentially expressed genes coupled with peripheral clinical blood data (sensitivity 70.8%, specificity 75.5%). These differentially expressed genes may further elucidate the underlying mechanisms of sPTB and pave the way for future systems biology studies to predict sPTB.