Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Leonard K. M. Poon is active.

Publication


Featured research published by Leonard K. M. Poon.


Machine Learning | 2015

Greedy learning of latent tree models for multidimensional clustering

Tengfei Liu; Nevin Lianwen Zhang; Peixian Chen; April Hua Liu; Leonard K. M. Poon; Yi Wang

Real-world data are often multifaceted and can be meaningfully clustered in more than one way. There is a growing interest in obtaining multiple partitions of data. In previous work we learnt from data a latent tree model (LTM) that contains multiple latent variables (Chen et al. 2012). Each latent variable represents a soft partition of data, and hence multiple partitions are obtained. The LTM approach can, through model selection, automatically determine how many partitions there should be, what attributes define each partition, and how many clusters there should be for each partition. It has been shown to yield rich and meaningful clustering results. Our previous algorithm EAST for learning LTMs is only efficient enough to handle data sets with dozens of attributes. This paper proposes an algorithm called BI that can deal with data sets with hundreds of attributes. We empirically compare BI with EAST and other more efficient LTM learning algorithms, and show that BI outperforms its competitors on data sets with hundreds of attributes. In terms of clustering results, BI compares favorably with alternative methods that are not based on LTMs.


International Journal of Approximate Reasoning | 2013

Model-based clustering of high-dimensional data: Variable selection versus facet determination

Leonard K. M. Poon; Nevin Lianwen Zhang; Tengfei Liu; April Hua Liu

Variable selection is an important problem for cluster analysis of high-dimensional data. It is also a difficult one. The difficulty originates not only from the lack of class information but also from the fact that high-dimensional data are often multifaceted and can be meaningfully clustered in multiple ways. In such a case the effort to find one subset of attributes that presumably gives the best clustering may be misguided. It makes more sense to identify the various facets of a data set (each being based on a subset of attributes), cluster the data along each one, and present the results to the domain experts for appraisal and selection. In this paper, we propose a generalization of Gaussian mixture models and demonstrate its ability to automatically identify the natural facets of data and to cluster the data along each of those facets simultaneously. We present empirical results to show that facet determination usually leads to better clustering results than variable selection.
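The contrast between variable selection (one attribute subset, one clustering) and facet determination (several attribute subsets, one clustering each) can be sketched as follows. Here the facets are supplied by hand and a plain two-means routine stands in for the paper's mixture components; the actual method learns the facets automatically through model selection, so both the `cluster_by_facets` helper and its fixed facets are illustrative assumptions only.

```python
import numpy as np

def two_means(X, iters=20):
    """Tiny 2-cluster Lloyd iteration, a stand-in for fitting a
    two-component mixture on one facet of the data."""
    # Deterministic init: the two points extreme along the first coordinate
    centers = np.stack([X[np.argmin(X[:, 0])], X[np.argmax(X[:, 0])]]).astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in (0, 1):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def cluster_by_facets(X, facets):
    """Each facet (a subset of attribute indices) gets its own clustering,
    yielding multiple partitions of the same data set."""
    return {name: two_means(X[:, cols]) for name, cols in facets.items()}
```

With data whose attributes split into two crosscutting facets, the two returned partitions disagree with each other but each recovers the structure of its own facet, which a single global clustering could not.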


Artificial Intelligence | 2017

Latent tree models for hierarchical topic detection

Peixian Chen; Nevin Lianwen Zhang; Tengfei Liu; Leonard K. M. Poon; Zhourong Chen; Farhan Khawar

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables, with those at the lowest latent level representing word co-occurrence patterns and those at higher levels representing co-occurrence of patterns at the level below. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. Unlike LDA-based topic models, HLTMs do not refer to a document generation process and use word variables instead of token variables. They use a tree structure to model the relationships between topics and words, which is conducive to the discovery of meaningful topics and topic hierarchies.
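A distinctive detail above is that the bottom level of an HLTM uses binary word-presence variables rather than token counts. A minimal sketch of building those observed variables from raw documents (the helper name `binary_word_matrix` is hypothetical, not from the paper):

```python
def binary_word_matrix(docs, vocab=None):
    """Build the binary presence/absence variables that sit at the
    bottom level of an HLTM (contrast with LDA's token counts)."""
    if vocab is None:
        vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0] * len(vocab)
        for w in set(d.split()):  # set(): repeats do not raise the value
            if w in index:
                row[index[w]] = 1
        rows.append(row)
    return vocab, rows
```

Note that a word occurring several times in one document still yields a 1, which is exactly the word-variable (not token-variable) choice the abstract highlights.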


Neurocomputing | 2014

Latent tree models for rounding in spectral clustering

April Hua Liu; Leonard K. M. Poon; Tengfei Liu; Nevin Lianwen Zhang

In spectral clustering, one defines a similarity matrix for a collection of data points, transforms the matrix to get the so-called Laplacian matrix, finds the eigenvectors of the Laplacian matrix, and obtains a partition of the data points using the leading eigenvectors. The last step is sometimes referred to as rounding, where one needs to decide how many leading eigenvectors to use, to determine the number of clusters, and to partition the data points. In this paper, we propose a novel method using latent tree models for rounding. The method differs from previous rounding methods in three ways. First, we relax the assumption that the number of clusters equals the number of eigenvectors used. Second, when deciding how many leading eigenvectors to use, we not only rely on information contained in the leading eigenvectors themselves, but also make use of the subsequent eigenvectors. Third, our method is model-based and solves all the three subproblems of rounding using latent tree models. We evaluate our method on both synthetic and real-world data. The results show that our method works correctly in the ideal case where between-clusters similarity is 0, and degrades gracefully as one moves away from the ideal case.
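The steps preceding rounding can be sketched directly from the description above: build a similarity matrix, form the normalized Laplacian, and take its leading eigenvectors. This is a generic spectral-embedding sketch (Gaussian similarity with a hand-picked `sigma`), not the paper's own implementation, and it stops exactly where the paper's LTM-based rounding would take over.

```python
import numpy as np

def spectral_embedding(X, sigma=1.0, k=2):
    """Spectral clustering up to the rounding step:
    similarity matrix -> normalized Laplacian -> leading eigenvectors."""
    # Gaussian similarity between all pairs of points, no self-loops
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt
    # Eigenvectors of the k smallest eigenvalues carry the cluster structure
    vals, vecs = np.linalg.eigh(L)
    return vecs[:, :k]
```

Conventional rounding would now run k-means on the embedding rows, implicitly assuming the number of clusters equals the number of eigenvectors used; relaxing that assumption is the first of the three differences the paper claims.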


European Conference on Symbolic and Quantitative Approaches to Reasoning and Uncertainty | 2011

Latent tree classifier

Yi Wang; Nevin Lianwen Zhang; Tao Chen; Leonard K. M. Poon

We propose a novel generative model for classification called latent tree classifier (LTC). An LTC represents each class-conditional distribution of attributes using a latent tree model, and uses Bayes rule to make prediction. Latent tree models can capture complex relationships among attributes. Therefore, LTC can approximate the true distribution behind data well and thus achieve good classification accuracy. We present an algorithm for learning LTC and empirically evaluate it on 37 UCI data sets. The results show that LTC compares favorably to the state-of-the-art. We also demonstrate that LTC can reveal underlying concepts and discover interesting subgroups within each class.
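The Bayes-rule step of a generative classifier like LTC can be made concrete with a sketch. A single multivariate Gaussian per class is substituted here for the paper's latent tree model, purely to isolate the "fit one class-conditional density per class, then take the maximum posterior" structure; the class name and its API are illustrative.

```python
import numpy as np

class BayesRuleClassifier:
    """One generative density model per class plus Bayes' rule.
    LTC fits a latent tree model for each P(attributes | class);
    this sketch uses a Gaussian per class as a stand-in."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_, self.means_, self.covs_ = {}, {}, {}
        for c in self.classes_:
            Xc = X[y == c]
            self.priors_[c] = len(Xc) / len(X)
            self.means_[c] = Xc.mean(axis=0)
            # Regularize the covariance so it is always invertible
            self.covs_[c] = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        return self

    def _log_density(self, X, c):
        d = X.shape[1]
        diff = X - self.means_[c]
        cov_inv = np.linalg.inv(self.covs_[c])
        _, logdet = np.linalg.slogdet(self.covs_[c])
        maha = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

    def predict(self, X):
        # Bayes rule: argmax_c log P(c) + log P(x | c)
        scores = np.column_stack(
            [np.log(self.priors_[c]) + self._log_density(X, c)
             for c in self.classes_]
        )
        return self.classes_[np.argmax(scores, axis=1)]
```

The point of swapping in a latent tree model for the Gaussian is exactly the sentence above: it can capture complex relationships among attributes that a single Gaussian (or naive Bayes) cannot.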


International Conference on Blended Learning | 2017

Learning analytics for monitoring students participation online: Visualizing navigational patterns on learning management system

Leonard K. M. Poon; Siu Cheung Kong; Thomas S. H. Yau; Michael Y. W. Wong; Man Ho Ling

With the increasing use of blended learning approaches in classrooms, various kinds of technologies are incorporated to provide digital teaching and learning resources to support students. These resources are often centralized in learning management systems (LMSs), which also store valuable learning data of students. The data could assist teachers in their pedagogical decision making, but they are often not well utilized. This paper proposes the use of data mining and visualization techniques as learning analytics to provide a more comprehensive overview of students’ learning online based on log data from LMSs. The focus of this study is the discovery of frequent navigational patterns by sequential pattern mining techniques and the demonstration of how presentation of patterns through hierarchical clustering and sunburst visualization could facilitate the interpretation of patterns. The data in this paper were collected from a blended statistics course for undergraduate students.
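The first step described above, mining frequent navigational patterns from LMS logs, can be sketched in toy form: count fixed-length contiguous page subsequences and keep those meeting a minimum support. Real sequential pattern miners (e.g. PrefixSpan-style algorithms) also handle gaps and variable lengths; the function below and its session data are illustrative assumptions, not the paper's method.

```python
from collections import Counter

def frequent_patterns(sessions, length=2, min_support=2):
    """Find contiguous page subsequences of a fixed length that occur
    in at least `min_support` sessions."""
    support = Counter()
    for session in sessions:
        seen = set()
        for i in range(len(session) - length + 1):
            seen.add(tuple(session[i:i + length]))
        support.update(seen)  # count each pattern once per session
    return {p: c for p, c in support.items() if c >= min_support}
```

The resulting pattern counts are the kind of input that the paper then organizes via hierarchical clustering and sunburst visualization for interpretation.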


International Journal of Approximate Reasoning | 2013

LTC: A latent tree approach to classification

Yi Wang; Nevin Lianwen Zhang; Tao Chen; Leonard K. M. Poon

Latent tree models were proposed as a class of models for unsupervised learning, and have been applied to various problems such as clustering and density estimation. In this paper, we study the usefulness of latent tree models in another paradigm, namely supervised learning. We propose a novel generative classifier called latent tree classifier (LTC). An LTC represents each class-conditional distribution of attributes using a latent tree model, and uses Bayes rule to make prediction. Latent tree models can capture complex relationships among attributes. Therefore, LTC is able to approximate the true distribution behind data well and thus achieves good classification accuracy. We present an algorithm for learning LTC and empirically evaluate it on an extensive collection of UCI data. The results show that LTC compares favorably to the state-of-the-art in terms of classification accuracy. We also demonstrate that LTC can reveal underlying concepts and discover interesting subgroups within each class.


International Symposium on Neural Networks | 2017

Clustering with Multidimensional Mixture Models: Analysis on World Development Indicators

Leonard K. M. Poon

Clustering is one of the core problems in machine learning. Many clustering algorithms aim to partition data along a single dimension. This approach may become inappropriate when data is high-dimensional and multifaceted. This paper introduces a class of mixture models with multiple dimensions called pouch latent tree models. We use them to perform cluster analysis on a data set consisting of 75 development indicators for 133 countries. Because multiple latent variables are present, we further propose a method that guides the selection of clustering variables. The analysis results demonstrate that some interesting clusterings of countries can be obtained from mixture models with multiple dimensions but not from those with a single dimension.


International Journal of Approximate Reasoning | 2018

UC-LTM: Unidimensional clustering using latent tree models for discrete data

Leonard K. M. Poon; April Hua Liu; Nevin Lianwen Zhang

This paper is concerned with model-based clustering of discrete data. Latent class models (LCMs) are usually used for this task. An LCM consists of a latent variable and a number of attributes. It makes the overly restrictive assumption that the attributes are conditionally independent given the latent variable. We propose a novel method to relax this assumption. The key idea is to partition the attributes into groups such that correlations among the attributes in each group can be properly modeled by using a single latent variable. The latent variables for the attribute groups are then used to build a number of models, and one of them is chosen to produce the clustering results. The new method produces unidimensional clustering using latent tree models and is named UC-LTM. Extensive empirical studies were conducted to compare UC-LTM with several model-based and distance-based clustering methods. UC-LTM outperforms the alternative methods in most cases, and the differences are often large. Further, analysis of real-world social capital data shows improved results given by UC-LTM over results given by LCMs in a previous study.
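The key idea, partitioning attributes into groups whose internal correlations are strong enough to warrant a shared latent variable, can be illustrated with a crude heuristic: group attribute pairs whose empirical mutual information exceeds a threshold. The actual method chooses the partition through model-based (score-driven) search, so both the greedy loop and the threshold below are illustrative assumptions.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information between two discrete attributes."""
    n = len(x)
    pxy = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in pxy.items():
        mi += (c / n) * np.log(c * n / (px[a] * py[b]))
    return mi

def group_attributes(data, threshold=0.05):
    """Greedy sketch: put two attributes in the same group when their
    mutual information exceeds a threshold, so each group can be
    modeled by its own latent variable."""
    m = data.shape[1]
    groups, assigned = [], set()
    for i in range(m):
        if i in assigned:
            continue
        group = [i]
        assigned.add(i)
        for j in range(i + 1, m):
            if j not in assigned and \
                    mutual_information(data[:, i], data[:, j]) > threshold:
                group.append(j)
                assigned.add(j)
        groups.append(group)
    return groups
```

On data where two attributes are strongly correlated and a third is independent, the heuristic recovers the intended grouping, which is the structure UC-LTM would then model with one latent variable per group.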


Web-Age Information Management | 2017

Topic Browsing System for Research Papers Based on Hierarchical Latent Tree Analysis

Leonard K. M. Poon; Chun Fai Leung; Peixian Chen; Nevin Lianwen Zhang

New academic papers appear rapidly in the literature nowadays. This poses a challenge for researchers who are trying to keep up with a given field, especially those who are new to a field and may not know where to start. To address this kind of problem, we have developed a topic browsing system for research papers where the papers have been automatically categorized by a probabilistic topic model. Rather than using Latent Dirichlet Allocation (LDA) for topic modeling, we use a recently proposed method called hierarchical latent tree analysis, which has been shown to perform better than some state-of-the-art LDA-based methods. The resulting topic model contains a hierarchy of topics so that users can browse topics at different levels. The topic model contains a manageable number of general topics at the top level and allows thousands of fine-grained topics at the bottom level.

Collaboration


Dive into Leonard K. M. Poon's collaborations.

Top Co-Authors

Nevin Lianwen Zhang, Hong Kong University of Science and Technology
April Hua Liu, Hong Kong University of Science and Technology
Tengfei Liu, Hong Kong University of Science and Technology
Peixian Chen, Hong Kong University of Science and Technology
Yi Wang, Hong Kong University of Science and Technology
Tao Chen, Hong Kong University of Science and Technology
Chun Fai Leung, Hong Kong University of Science and Technology