Limsoon Wong

National University of Singapore

Publication

Featured research published by Limsoon Wong.

Cancer Cell | 2002

Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling

Eng Juh Yeoh; Mary E. Ross; Sheila A. Shurtleff; W. Kent Williams; Divyen H. Patel; Rami Mahfouz; Fred G. Behm; Susana C. Raimondi; Mary V. Relling; Anami R. Patel; Cheng Cheng; Dario Campana; Dawn Wilkins; Xiaodong Zhou; Jinyan Li; Huiqing Liu; Ching-Hon Pui; William E. Evans; Clayton W. Naeve; Limsoon Wong; James R. Downing

Treatment of pediatric acute lymphoblastic leukemia (ALL) is based on the concept of tailoring the intensity of therapy to a patient's risk of relapse. To determine whether gene expression profiling could enhance risk assignment, we used oligonucleotide microarrays to analyze the pattern of genes expressed in leukemic blasts from 360 pediatric ALL patients. Distinct expression profiles identified each of the prognostically important leukemia subtypes, including T-ALL, E2A-PBX1, BCR-ABL, TEL-AML1, MLL rearrangement, and hyperdiploid >50 chromosomes. In addition, another ALL subgroup was identified based on its unique expression profile. Examination of the genes comprising the expression signatures provided important insights into the biology of these leukemia subgroups. Further, within some genetic subgroups, expression profiles identified those patients who would eventually fail therapy. Thus, the single platform of expression profiling should enhance the accurate risk stratification of pediatric ALL patients.

Bioinformatics | 2006

Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions

Hon Nian Chua; Wing-Kin Sung; Limsoon Wong

Motivation: Most approaches to predicting protein function from protein-protein interaction data use the observation that a protein often shares functions with the proteins that interact with it (its level-1 neighbours). However, proteins that interact with the same proteins (i.e. level-2 neighbours) may also have a greater likelihood of sharing similar physical or biochemical characteristics. We speculate that functional similarity between a protein and its neighbours from the two different levels arises from two distinct forms of functional association, and that a protein is likely to share functions with its level-1 and/or level-2 neighbours. We are interested in finding out how significant functional association between level-2 neighbours is and how it can be exploited for protein function prediction. Results: We made a statistical study on recent interaction data and observed that functional association between level-2 neighbours is clearly observable. A substantial number of proteins are observed to share functions with level-2 neighbours but not with level-1 neighbours. We develop an algorithm that predicts the functions of a protein in two steps: (1) assign a weight to each of its level-1 and level-2 neighbours by estimating its functional similarity with the protein, using the local topology of the interaction network as well as the reliability of experimental sources, and (2) score each function based on its weighted frequency in these neighbours. Using leave-one-out cross validation, we compare the performance of our method against that of several other existing approaches and show that our method performs relatively well.
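The two-step scheme in the abstract above (weight each level-1 and level-2 neighbour, then score each function by its weighted frequency) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the weight used here is a plain Jaccard overlap of neighbourhoods chosen as a stand-in for the paper's topological weight, the reliability of experimental sources is ignored, and the names `topo_weight`, `score_functions`, `graph`, and `annotations` are all hypothetical.

```python
from collections import defaultdict

def neighbours(graph, p):
    """Level-1 neighbours of p: its direct interaction partners."""
    return set(graph.get(p, ()))

def level2_neighbours(graph, p):
    """Proteins that share at least one interaction partner with p."""
    l2 = set()
    for q in neighbours(graph, p):
        l2 |= neighbours(graph, q)
    l2.discard(p)
    return l2

def topo_weight(graph, p, q):
    """ASSUMPTION: Jaccard overlap of closed neighbourhoods, used only as a
    stand-in for the topological weight described in the abstract."""
    a = neighbours(graph, p) | {p}
    b = neighbours(graph, q) | {q}
    return len(a & b) / len(a | b)

def score_functions(graph, annotations, p):
    """Score each function by its weighted frequency among the level-1
    and level-2 neighbours of p."""
    scores = defaultdict(float)
    candidates = (neighbours(graph, p) | level2_neighbours(graph, p)) - {p}
    for q in candidates:
        w = topo_weight(graph, p, q)
        for f in annotations.get(q, ()):
            scores[f] += w
    return dict(scores)

# Toy network: A interacts with B and C; D interacts with B and C,
# so D is a level-2 neighbour of A.
graph = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
annotations = {"B": {"kinase"}, "C": {"kinase"}, "D": {"transport"}}
scores = score_functions(graph, annotations, "A")
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

In this toy network the function carried only by the level-2 neighbour D still receives a non-zero score for A, which is the effect the abstract argues for.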

discovery science | 1999

CAEP: Classification by Aggregating Emerging Patterns

Guozhu Dong; Xiuzhen Zhang; Limsoon Wong; Jinyan Li

Emerging patterns (EPs) are itemsets whose supports change significantly from one dataset to another; they were recently proposed to capture multi-attribute contrasts between data classes, or trends over time. In this paper we propose a new classifier, CAEP, using the following main ideas based on EPs: (i) Each EP can sharply differentiate the class membership of a (possibly small) fraction of instances containing the EP, due to the big difference between its supports in the opposing classes; we define the differentiating power of the EP in terms of the supports and their ratio, on instances containing the EP. (ii) For each instance t, by aggregating the differentiating power of a fixed, automatically selected set of EPs, a score is obtained for each class. The scores for all classes are normalized and the largest score determines t's class. CAEP is suitable for many applications, even those with large volumes of high (e.g. 45) dimensional data; it does not depend on dimension reduction on data; and it is usually equally accurate on all classes even if their populations are unbalanced. Experiments show that CAEP has consistently good predictive accuracy, and it almost always outperforms C4.5 and CBA. By using efficient, border-based algorithms (developed elsewhere) to discover EPs, CAEP scales up on data volume and dimensionality. Observing that accuracy on the whole dataset is too coarse a description of classifiers, we also used more accurate measures, sensitivity and precision, to better characterize the performance of classifiers. CAEP also performs very well under these measures.
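The aggregation idea in (i) and (ii) above can be illustrated with a small sketch. It assumes one particular way of combining "the supports and their ratio": each EP contained in the instance contributes `growth / (growth + 1)` times its support in the target class, where `growth` is the ratio of supports. That formula, the hand-picked EP lists, and the helper names are assumptions for illustration; score normalization and border-based EP discovery are left out.

```python
def support(pattern, instances):
    """Fraction of instances (sets of items) that contain the pattern."""
    if not instances:
        return 0.0
    return sum(1 for t in instances if pattern <= t) / len(instances)

def growth_rate(pattern, target, other):
    """Ratio of the pattern's support in the target class to the other class."""
    s_t, s_o = support(pattern, target), support(pattern, other)
    if s_o == 0:
        return float("inf") if s_t > 0 else 0.0
    return s_t / s_o

def aggregate_score(instance, eps, target, other):
    """Sum the contributions of the EPs that the instance contains.
    ASSUMPTION: contribution = growth/(growth+1) * support in the target class."""
    total = 0.0
    for e in eps:
        if e <= instance:
            gr = growth_rate(e, target, other)
            strength = 1.0 if gr == float("inf") else gr / (gr + 1.0)
            total += strength * support(e, target)
    return total

# Toy training data: each instance is a set of items.
class_a = [frozenset("xy"), frozenset("xy"), frozenset("x"), frozenset("z")]
class_b = [frozenset("z"), frozenset("zy"), frozenset("xz"), frozenset("z")]

# Hand-picked EPs for each class (in practice these would be mined).
eps_a = [frozenset("x"), frozenset("xy")]
eps_b = [frozenset("z")]

t = frozenset("xy")
score_a = aggregate_score(t, eps_a, class_a, class_b)
score_b = aggregate_score(t, eps_b, class_b, class_a)
print("predicted class:", "a" if score_a > score_b else "b")
```

Because the raw scores are not normalized here, comparing them directly is only reasonable when the classes have EP sets of similar size and strength.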

international conference on data mining | 2006

Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions

Hon Nian Chua; Wing-Kin Sung; Limsoon Wong

MOTIVATION Most approaches to predicting protein function from protein-protein interaction data use the observation that a protein often shares functions with the proteins that interact with it (its level-1 neighbours). However, proteins that interact with the same proteins (i.e. level-2 neighbours) may also have a greater likelihood of sharing similar physical or biochemical characteristics. We speculate that functional similarity between a protein and its neighbours from the two different levels arises from two distinct forms of functional association, and that a protein is likely to share functions with its level-1 and/or level-2 neighbours. We are interested in finding out how significant functional association between level-2 neighbours is and how it can be exploited for protein function prediction. RESULTS We made a statistical study on recent interaction data and observed that functional association between level-2 neighbours is clearly observable. A substantial number of proteins are observed to share functions with level-2 neighbours but not with level-1 neighbours. We develop an algorithm that predicts the functions of a protein in two steps: (1) assign a weight to each of its level-1 and level-2 neighbours by estimating its functional similarity with the protein, using the local topology of the interaction network as well as the reliability of experimental sources, and (2) score each function based on its weighted frequency in these neighbours. Using leave-one-out cross validation, we compare the performance of our method against that of several other existing approaches and show that our method performs relatively well.

Bioinformatics | 2002

Accomplishments and challenges in literature data mining for biology

Lynette Hirschman; Jong C. Park; Jun’ichi Tsujii; Limsoon Wong; Cathy H. Wu

We review recent results in literature data mining for biology and discuss the need and the steps for a challenge evaluation for this field. Literature data mining has progressed from simple recognition of terms to extraction of interaction relationships from complex sentences, and has broadened from recognition of protein interactions to a range of problems such as improving homology search, identifying cellular location, and so on. To encourage participation and accelerate progress in this expanding field, we propose creating challenge evaluations, and we describe two specific applications in this context.

Bioinformatics | 2009

Complex discovery from weighted PPI networks

Guimei Liu; Limsoon Wong; Hon Nian Chua

MOTIVATION Protein complexes are important for understanding principles of cellular organization and function. High-throughput experimental techniques have produced a large number of protein interactions, which makes it possible to predict protein complexes from protein-protein interaction (PPI) networks. However, protein interaction data produced by high-throughput experiments are often associated with high false positive and false negative rates, which makes it difficult to predict complexes accurately. RESULTS We use an iterative scoring method to assign weight to protein pairs, and the weight of a protein pair indicates the reliability of the interaction between the two proteins. We develop an algorithm called CMC (clustering-based on maximal cliques) to discover complexes from the weighted PPI network. CMC first generates all the maximal cliques from the PPI network, and then removes or merges highly overlapped clusters based on their interconnectivity. We studied the performance of CMC and the impact of our iterative scoring method on CMC. Our results show that: (i) the iterative scoring method can improve the performance of CMC considerably; (ii) the iterative scoring method can effectively reduce the impact of random noise on the performance of CMC; (iii) the iterative scoring method can also improve the performance of other protein complex prediction methods and reduce the impact of random noise on their performance; and (iv) CMC is an effective approach to protein complex prediction from protein interaction networks. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
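The first stage described above, generating all maximal cliques of the PPI network, can be sketched with the basic Bron-Kerbosch recursion. This is only an illustration of that one step: the removal and merging of overlapping clusters is not reproduced, and `weighted_density` is a hypothetical scoring helper (average edge weight over all pairs in a clique) introduced here for the example, not a definition taken from the paper.

```python
from itertools import combinations

def maximal_cliques(graph):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).
    graph maps each node to the set of its neighbours."""
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed nodes
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques

def weighted_density(clique, weights):
    """HYPOTHETICAL score: mean weight over all node pairs in the clique.
    weights maps frozenset({u, v}) to an interaction reliability."""
    pairs = list(combinations(sorted(clique), 2))
    if not pairs:
        return 0.0
    return sum(weights.get(frozenset(pair), 0.0) for pair in pairs) / len(pairs)

# Toy network: a triangle A-B-C plus a pendant edge C-D.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
weights = {
    frozenset("AB"): 0.9,
    frozenset("AC"): 0.8,
    frozenset("BC"): 0.7,
    frozenset("CD"): 0.2,
}
cliques = maximal_cliques(graph)
ranked = sorted(cliques, key=lambda c: weighted_density(c, weights), reverse=True)
print([sorted(c) for c in ranked])
```

Enumerating maximal cliques is exponential in the worst case, so on real interaction networks a pivoting variant or a library implementation would be the more practical choice.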

international conference on database theory | 1995

Principles of programming with complex objects and collection types

Peter Buneman; Shamim A. Naqvi; Val Tannen; Limsoon Wong

We present a new principle for the development of database query languages: that the primitive operations should be organized around types. Viewing a relational database as consisting of sets of records, this principle dictates that we should investigate separately operations for records and sets. There are two immediate advantages of this approach, which is partly inspired by basic ideas from category theory. First, it provides a language for structures in which record and set types may be freely combined: nested relations or complex objects. Second, the fundamental operations for sets are closely related to those for other “collection types” such as bags or lists, and this suggests how database languages may be uniformly extended to these new types. The most general operation on sets, that of structural recursion, is one in which not all programs are well-defined. In looking for limited forms of this operation that always give rise to well-defined operations, we find a number of close connections with existing database languages, notably those developed for complex objects. Moreover, even though the general paradigm of structural recursion is shown to be no more expressive than one of the existing languages for complex objects, it possesses certain properties of uniformity that make it a better candidate for an efficient, practical language. Thus rather than developing query languages by extending, for example, relational calculus, we advocate a very powerful paradigm in which a number of well-known languages are to be found as natural sublanguages.
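The remark that not all structural-recursion programs are well-defined can be made concrete with a small sketch. It assumes the usual reading of structural recursion over the union presentation of sets: a function h is given by h({}) = e, h({x}) = f(x), and h(A ∪ B) = u(h(A), h(B)). Since A ∪ A = A, this only makes sense when u is idempotent (as well as associative and commutative, with e as identity). The names `sru`, `set_map`, `flatten`, and `count` are illustrative choices, not notation confirmed by the abstract.

```python
from functools import reduce
import operator

def sru(e, f, u):
    """Build h with h({}) = e, h({x}) = f(x), h(A | B) = u(h(A), h(B)),
    evaluated here by folding u over f applied to each element."""
    def h(s):
        return reduce(u, (f(x) for x in s), e)
    return h

def set_map(g):
    """Well-defined instance: union is associative, commutative, idempotent."""
    return sru(frozenset(), lambda x: frozenset({g(x)}), frozenset.union)

# Another well-defined instance: flatten a set of sets.
flatten = sru(frozenset(), frozenset, frozenset.union)

# Ill-defined instance: addition is not idempotent.
count = sru(0, lambda x: 1, operator.add)

one = frozenset({1})
doubled = set_map(lambda x: 2 * x)

# For set_map, both ways of evaluating h on one | one agree.
print(doubled(one | one) == doubled(one) | doubled(one))

# For count they disagree: h(one | one) is 1, but u(h(one), h(one)) is 2.
print(count(one | one), count(one) + count(one))
```

The fold in `sru` always returns some value, so Python does not flag the problem by itself; the inconsistency only shows up when the defining equation for union is checked, which is why restricted forms that guarantee well-definedness are of interest.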

international conference on database theory | 1992

Naturally Embedded Query Languages

Val Tannen; Peter Buneman; Limsoon Wong

We investigate the properties of a simple programming language whose main computational engine is structural recursion on sets. We describe a progression of sublanguages in this paradigm that (1) have increasing expressive power, and (2) illustrate robust conceptual restrictions thus exhibiting interesting additional properties. These properties suggest that we consider our sublanguages as candidates for “query languages”. Viewing query languages as restrictions of our more general programming language has several advantages. First, there is no “impedance mismatch” problem; the query languages are already there, so they share common semantic foundation with the general language. Second, we suggest a uniform characterization of nested relational and complex-object algebras in terms of some surprisingly simple operators; and we can make comparisons of expressiveness in a general framework. Third, we exhibit differences in expressive power that are not always based on complexity arguments, but use the idea that a query in one language may not be polymorphically expressible in another. Fourth, ideas of category theory can be profitably used to organize semantics and syntax, in particular our minimal (core) language is a well-understood categorical construction: a cartesian category with a strong monad on it. Finally, we bring out an algebraic perspective, that is, our languages come with equational theories, and categorical ideas can be used to derive a number of rather general identities that may serve as optimizations or as techniques for discovering optimizations.

Cancer Cell | 2002

Optimal gene expression analysis by microarrays

Lance D. Miller; Philip M. Long; Limsoon Wong; Sayan Mukherjee; Lisa M. McShane; Edison T. Liu

DNA microarrays make possible the rapid and comprehensive assessment of the transcriptional activity of a cell, and as such have proven valuable in assessing the molecular contributors to biological processes and in the classification of human cancers. The major challenge in using this technology is the analysis of its massive data output, which requires computational means for interpretation and a heightened need for quality data. The optimal analysis requires an accounting and control of the many sources of variance within the system, an understanding of the limitations of the statistical approaches, and the ability to make sense of the results through intelligent database interrogation.

Bioinformatics | 2002

Identifying good diagnostic gene groups from gene expression profiles using the concept of emerging patterns

Jinyan Li; Limsoon Wong

Motivations and Results: Gene groups that are significantly related to a disease can be detected by conducting a series of gene expression experiments. This work is aimed at discovering special types of gene groups that satisfy the following property. In each group, its member genes are found to be one-to-one contained in pre-determined intervals of gene expression level with a large frequency in one class of cells but are never found unanimously in these intervals in the other class of cells. We call these gene groups emerging patterns, to emphasize the patterns’ frequency changes between two classes of cells. We use effective discretization and gene selection methods to obtain the most discriminatory genes. We also use efficient algorithms to derive the patterns from these genes. According to our studies on the ALL/AML dataset and the colon tumor dataset, some patterns, which consist of one or more genes, can reach a high frequency of 90%, or even 100%. In other words, they nearly or fully dominate one class of cells, even though they rarely occur in the other class. The discovered patterns are used to classify new cells with a higher accuracy than other reported methods. Based on these patterns, we also conjecture the possibility of a personalized treatment plan which converts colon tumor cells into normal cells by modulating the expression levels of a few genes.
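The property described above (genes falling in given expression intervals frequently in one class, but never all together in the other) can be illustrated with a brute-force sketch. Everything specific here is an assumption for the example: the per-gene cut points in `cuts`, the two-level low/high discretization, the toy samples, and the exhaustive search, which stands in for the efficient algorithms the abstract mentions.

```python
from itertools import combinations

def discretise(sample, cuts):
    """Turn {gene: expression level} into a set of (gene, 'low'|'high') items,
    using one HYPOTHETICAL cut point per gene."""
    return frozenset(
        (g, "high" if v >= cuts[g] else "low") for g, v in sample.items()
    )

def exclusive_patterns(target, other, max_size=2, min_freq=0.5):
    """Brute-force search for item sets that occur in at least min_freq of the
    target class and in no instance of the other class."""
    items = sorted(set().union(*target))
    found = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            pat = frozenset(combo)
            freq = sum(pat <= t for t in target) / len(target)
            if freq >= min_freq and not any(pat <= o for o in other):
                found[pat] = freq
    return found

cuts = {"g1": 5.0, "g2": 5.0}
tumour = [discretise(s, cuts) for s in (
    {"g1": 8.0, "g2": 2.0},
    {"g1": 9.0, "g2": 1.0},
    {"g1": 7.0, "g2": 6.0},
)]
normal = [discretise(s, cuts) for s in (
    {"g1": 2.0, "g2": 3.0},
    {"g1": 8.0, "g2": 7.0},
    {"g1": 1.0, "g2": 8.0},
)]

patterns = exclusive_patterns(tumour, normal)
for pat, freq in patterns.items():
    print(sorted(pat), round(freq, 2))
```

In this toy data neither gene alone separates the classes, but the pair "g1 high and g2 low" occurs in two of three tumour samples and in no normal sample, which is the kind of multi-gene contrast the abstract describes.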