Publication


Featured research published by Daphne Teck Ching Lai.


International Journal of Biomedical Engineering and Technology | 2013

A preliminary study on automatic breast cancer data classification using semi-supervised fuzzy c-means

Daphne Teck Ching Lai; Jonathan M. Garibaldi

Soria et al. have successfully identified six clinically useful and novel subgroups in the Nottingham Tenovus Breast Cancer (NTBC) data set. However, the methodology used is semi-manual, and so far no single clustering method can automatically classify the data set. In this work, two variations of the semi-supervised Fuzzy c-Means (ssFCM) algorithm are explored to classify the NTBC data set into the same six subgroups. Three experiments were conducted using the two ssFCM algorithms, and the results were evaluated using inter-rater agreement measures. The ssFCM algorithms identified the six classes of breast cancer, but in low agreement with Soria’s classification. This, together with high agreement using two clustering algorithms, suggests that the problem may lie in the way we use ssFCM rather than in model correctness. Despite this, we consider the ssFCM results promising and note that further investigation of ssFCM is required.


IEEE International Conference on Fuzzy Systems | 2011

A comparison of distance-based semi-supervised fuzzy c-means clustering algorithms

Daphne Teck Ching Lai; Jonathan M. Garibaldi

There are many issues to be considered in the design of distance-based fuzzy semi-supervised clustering (FSSC) algorithms. To identify these issues, we compare the performance of four such algorithms. We describe the properties of these algorithms, highlighting their key differences, and then experimentally compare their performance on common datasets. Several experimental conditions are investigated. Firstly, two forms of initialisation of the membership values of unlabelled patterns are used: 1/c and 0. Secondly, the algorithms are run with varying proportions of labelled patterns in the datasets, ranging from 2% to 40%. We find that no algorithm outperforms the others on all the datasets. We also observe that small modifications in similar objective functions can improve clustering, and that most of the algorithms perform slightly better with zero initialisation of unlabelled patterns. An interesting observation is that an increase in labelled patterns does not always improve clustering. From these results, we conclude that the number and scale of dimensions in the data set, the initial partition matrix, the distance metrics and the objective functions together affect clustering results. In addition, we conclude that not all initially labelled patterns are good candidates for supervision.
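The two initialisation schemes mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `initial_memberships`, the use of `None` to mark unlabelled patterns, and the crisp (one-hot) memberships for labelled patterns are all assumptions made for illustration.

```python
def initial_memberships(labels, c, unlabelled_value):
    """Build an initial partition matrix as a list of rows, one per pattern.

    labels: class index in range(c) for a labelled pattern, None if unlabelled.
    unlabelled_value: membership assigned to every cluster for an unlabelled
    pattern, for example 1 / c or 0.
    """
    rows = []
    for label in labels:
        if label is None:
            # Unlabelled pattern: same starting membership in every cluster.
            rows.append([unlabelled_value] * c)
        else:
            # Labelled pattern (assumption): full membership in its known class.
            rows.append([1.0 if k == label else 0.0 for k in range(c)])
    return rows


# Hypothetical toy input: two labelled and two unlabelled patterns, c = 3.
labels = [0, None, 2, None]
c = 3
u_uniform = initial_memberships(labels, c, 1 / c)  # 1/c initialisation
u_zero = initial_memberships(labels, c, 0.0)       # zero initialisation
```

With 1/c, each unlabelled row starts as a valid fuzzy membership vector that sums to 1. With 0, an unlabelled row carries no starting preference and only receives memberships after the first update step.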


Computational Intelligence Methods for Bioinformatics and Biostatistics | 2012

Investigating Distance Metrics in Semi-supervised Fuzzy c-Means for Breast Cancer Classification

Daphne Teck Ching Lai; Jonathan M. Garibaldi

In previous work, semi-supervised Fuzzy c-means (ssFCM) was used as an automatic classification technique to classify the Nottingham Tenovus Breast Cancer (NTBC) dataset, as no method to do this currently exists. However, the results were poor when compared with semi-manual classification. It is known that the NTBC data is highly non-normal, and it was suspected that this contributed to the poor results. This motivated a further investigation into alternative distance metrics to explore their effect on classification results. Mahalanobis, Euclidean and kernel-based distance metrics were used on 100 sets of randomly selected labelled data. It was found that ssFCM with Euclidean distance successfully and automatically identified the six classes in close agreement with those of Soria et al. We showed that there is also high agreement in the key features that define the breast cancer classes with those of Soria et al. The superiority of Euclidean distance over Mahalanobis distance for classifying this dataset is unexpected, as Euclidean distance can only generate spherical clusters while Mahalanobis distance can generate hyperellipsoidal clusters, including spherical ones. We expected Mahalanobis distance to generate the hyperellipsoidal clusters that would best fit the NTBC data.
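The geometric difference between the two metrics in the abstract above can be seen in a small sketch. This is an illustration only, restricted to two dimensions; the helper names and the covariance matrix are hypothetical and are not taken from the paper.

```python
import math


def euclidean(x, y):
    """Straight-line distance; equal-distance contours are circles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance for 2-D points given a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx, dy = x[0] - y[0], x[1] - y[1]
    quad = (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.sqrt(quad)


centre = (0.0, 0.0)
cov = [[4.0, 0.0], [0.0, 1.0]]   # hypothetical cluster stretched along axis 1
p1, p2 = (2.0, 0.0), (0.0, 2.0)  # both are Euclidean distance 2 from centre

d_euc = (euclidean(centre, p1), euclidean(centre, p2))
d_mah = (mahalanobis_2d(centre, p1, cov), mahalanobis_2d(centre, p2, cov))
```

Both points are equally far from the centre under Euclidean distance, but the point lying along the stretched axis is closer under Mahalanobis distance. With an identity covariance matrix the two metrics coincide, which is the sense in which spherical clusters are a special case of hyperellipsoidal ones.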


Asia-Pacific Journal of Public Health | 2017

Cross-sectional STEPwise Approach to Surveillance (STEPS) Population Survey of Noncommunicable Diseases (NCDs) and Risk Factors in Brunei Darussalam 2016

Sok King Ong; Daphne Teck Ching Lai; Justin Yun Yaw Wong; Khairil Azhar Si-Ramlee; Lubna Abdul Razak; Norhayati Kassim; Zakaria Kamis; David Koh

This article provides a cross-sectional weighted measurement of the prevalence of noncommunicable diseases (NCDs) and their risk factors among the adult population of Brunei using the WHO STEPS methodology. Two-stage randomized sampling was conducted from August 2015 to April 2016. The three-step surveillance included (1) an interview using a standardized questionnaire, (2) blood pressure and anthropometric measurements, and (3) biochemistry tests. Data weighting was applied. A total of 3808 adults aged 18 to 69 years participated in step 1; 2082 completed the step 2 and step 3 measurements. Adult smoking prevalence was 19.9%, obesity 28.2%, hypertension 28.0%, diabetes 9.7%, prediabetes 2.1%, and 51.3% had a fasting cholesterol level ≥5 mmol/L. The prevalence of inadequate fruit and vegetable consumption was high at 91.7%. Among those aged 40 to 69 years, 8.9% had a 10-year cardiovascular disease (CVD) risk ≥30% or existing CVD. Population strategies and targeted group interventions are required to control NCD risk factors and morbidities.


Asian Conference on Intelligent Information and Database Systems | 2016

An Integrated Pattern Recognition System for Knee Flexion Analysis

Joko Triloka; S. M. N. Arosha Senanayake; Daphne Teck Ching Lai

The purpose of this study is to propose an integrated knee-flexion analysis system (IKAS) as a novel tool for recognising knee muscle patterns in athletes and soldiers, based on neuromuscular signals and a soft tissue deformation parameter. Different types of parameters from multi-sensor integration are combined to analyse the knee motion. EMG data and video frames for each knee flexion angle are fused, having been acquired by synchronising the motion capture system and video cameras interfaced with wireless EMG sensors. The data are pre-processed in order to prepare the pattern set for an intelligent system, based on a custom-developed artificial neural network and a mesh generation technique, that classifies the knee muscle patterns of subjects during walking and squatting activities. Multilayer feed-forward backpropagation networks (FFBPNNs) with different network training algorithms were designed, the correlation coefficient (CC) was used, and their classification results were compared. The newly introduced IKAS approach will provide assistance in making objective and knowledgeable decisions about the recognition of knee muscle patterns.


International Symposium on Neural Networks | 2014

Identifying stable breast cancer subgroups using semi-supervised fuzzy c-means on a reduced panel of biomarkers

Daphne Teck Ching Lai; Jonathan M. Garibaldi

The aim of this work is to identify clinically useful and stable breast cancer subgroups using a reduced panel of biomarkers. First, we investigate the stability of subgroups generated using two different reduced panels of biomarkers on clustering of breast cancer data. The stability of the subgroups found is assessed by comparing agreement levels, using Cohen's Kappa Index, between clustering solutions from ssFCM methodologies, consensus K-means and model-based clustering. The clustering solutions obtained from the feature set that achieves the higher agreement are chosen for further biological and clinical evaluation to establish that the subgroups are clinically useful. Using an ssFCM methodology, we identified seven clinically useful and stable breast cancer subgroups using a reduced panel by Soria et al. So far, the stability of the subgroups identified using the reduced panel of biomarkers has not been investigated.
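As a rough illustration of the agreement measure named in the abstract above, the following Python sketch computes Cohen's kappa between two label assignments. It assumes the cluster labels of the two solutions have already been matched to each other, a step the abstract does not describe; the function name and toy labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    # Observed proportion of patterns given the same label by both solutions.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from the marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        # Both solutions put every pattern in the same single class.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical toy labels from two clustering solutions.
solution_1 = [0, 0, 1, 1]
solution_2 = [0, 0, 1, 0]
kappa = cohens_kappa(solution_1, solution_2)  # 0.5
```

Here three of four patterns agree (observed 0.75) while chance agreement is 0.5, giving a kappa of 0.5. Identical solutions give a kappa of 1.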


IEEE International Conference on Fuzzy Systems | 2014

Investigating distance metric learning in semi-supervised fuzzy c-means clustering

Daphne Teck Ching Lai; Jonathan M. Garibaldi; Jenna Marie Reps

The idea behind distance metric learning (DML) is to accentuate the distance relations found in the training data, maintaining whether the data patterns are similar or dissimilar. In this paper, we investigate the use of DML (GDML, LMNN, MCML and NCA) in semi-supervised Fuzzy c-means clustering and apply it to a real biomedical dataset and to UCI datasets. We used a cross-validation setting with varying amounts of labelled data to test our methodology. Out of eight datasets, statistically significant improvement was found on five datasets using ssFCM with DML. This shows that DML can improve ssFCM clustering for some datasets. Further analysis using 2D PCA projection and the sum of squared distances before and after DML transformation of the original data is carried out. Interestingly, DML was found to worsen ssFCM clustering on the NTBC dataset, which has hierarchical clusters.


Multi-disciplinary Trends in Artificial Intelligence | 2017

Multivariate Time Series Clustering Analysis for Human Balance Data

Owais Ahmed Malik; Daphne Teck Ching Lai

The evaluation of human balance control patterns is an important tool for identifying underlying disorders in the postural control system of individuals and taking appropriate action if required. This study presents the use of multivariate time-series clustering techniques for analysing human balance patterns based on force platform data. Different multivariate time-series clustering techniques were investigated, including partitioning clustering with the Dynamic Time Warping (DTW) measure, Permutation Distribution Clustering (PDC) and k-means for longitudinal data (KmL3D). The cluster solutions were generated using anterior-posterior and medial-lateral center of pressure (COP) displacement data for four balance evaluation conditions, namely eyes open on stable surface (EOS), eyes open on unstable surface (EOU), eyes closed on stable surface (ECS) and eyes closed on unstable surface (ECU). The resulting clusters were evaluated using various cluster validity indexes. Further, suitable association measures were computed between the clustering solutions and demographic (age and body mass index) and qualitative balance test (BEST-T) parameters. The clusters generated by the Partitioning Around Medoids (PAM) DTW technique for the EOS, EOU and ECS balance conditions demonstrated statistically significant associations with all parameters, while for the ECU balance testing condition significant associations were observed only for the age parameter of the participants.
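The DTW measure referred to in the abstract above can be sketched with the classic dynamic-programming recurrence. This is a minimal illustration, not the method used in the study: the Euclidean local cost, the absence of any warping window, and the toy two-channel series are assumptions.

```python
import math


def dtw_distance(s, t):
    """Dynamic Time Warping distance between two multivariate series.

    s, t: sequences of equal-dimension points, e.g. (AP, ML) displacement pairs.
    Uses Euclidean distance as the local cost and no warping-window constraint.
    """
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning s[:i] with t[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(s[i - 1], t[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # advance in s only
                                     cost[i][j - 1],      # advance in t only
                                     cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


# Hypothetical toy series: t repeats one sample of s, so warping absorbs it.
s = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
t = [(0.0, 0.0), (1.0, 1.0), (1.0, 1.0), (2.0, 2.0)]
d = dtw_distance(s, t)  # 0.0
```

A pairwise matrix of such distances is the kind of input a medoid-based partitioning method operates on, since medoids only require dissimilarities between series rather than averages of them.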


International Conference on Intelligent Computing | 2017

A Comparison of Distance Metrics in Semi-supervised Hierarchical Clustering Methods

Abeer Aljohani; Daphne Teck Ching Lai; Paul C. Bell; Eran A. Edirisinghe

The basic idea of semi-supervised hierarchical clustering (ssHC) is to leverage domain knowledge in the form of triple-wise constraints to group data into clusters. In this paper, we perform extensive experiments to evaluate the effects of different distance metrics, linkage measures and constraints on the performance of two ssHC algorithms: IPoptim and UltraTran. The algorithms are run with varying proportions of constraints in the different datasets, ranging from 10% to 60%. We found that IPoptim and UltraTran performed almost equally across the seven datasets. An interesting observation is that an increase in constraints does not always improve ssHC performance. It can also be observed that the inclusion of too many classes degrades clustering performance. The experimental results show that ssHC with Canberra distance performs well, in addition to ssHC with well-known distances such as Euclidean and Standard Euclidean distances. Together with complete linkage and a small proportion of constraints (10%), ssHC can achieve good results, with an F-score close to or above 0.8 for four out of the seven datasets. Moreover, the output of a non-parametric statistical test shows that using the UltraTran algorithm in combination with the Manhattan distance metric and the Ward.D linkage method provides the best results. Furthermore, utilizing IPoptim and UltraTran with the Canberra distance measure performs better for the given datasets.
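Two of the distance metrics compared in the abstract above have simple closed forms, sketched here in Python using their standard definitions. The function names and example points are hypothetical, and treating a 0/0 term as zero in the Canberra distance is a common convention rather than something stated in the abstract.

```python
def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def canberra(x, y):
    """Sum of absolute differences, each scaled by the coordinate magnitudes."""
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom:  # a term with a == b == 0 contributes 0 by convention
            total += abs(a - b) / denom
    return total


# Hypothetical points whose second feature is on a much larger scale.
x, y = (1.0, 100.0), (2.0, 200.0)
d_man = manhattan(x, y)  # 101.0, dominated by the large-scale feature
d_can = canberra(x, y)   # 1/3 + 100/300, both features weigh equally
```

Because each Canberra term is normalised by the magnitudes of the coordinates, features on very different scales contribute comparably, whereas Manhattan distance is dominated by the large-scale feature.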


Computational Intelligence | 2016

On using genetic algorithm for initialising semi-supervised fuzzy c-means clustering

Daphne Teck Ching Lai; Jonathan M. Garibaldi

In previous work, suitable initialisation techniques were incorporated into semi-supervised Fuzzy c-Means clustering (ssFCM) to improve clustering results on a trial-and-error basis. In this work, we present a single fully automatic version of an existing semi-supervised Fuzzy c-Means clustering framework which uses genetically modified prototypes (ssFCMGA). Initial prototypes are generated by a genetic algorithm (GA) to initialise the ssFCM algorithm without experimenting with different initialisation techniques. The framework is tested on a real biomedical dataset, NTBC, and on the Arrhythmia UCI dataset, using varying amounts of labelled data from 10% to 60% of the total data patterns. Different ssFCM threshold values and fitness functions for ssFCMGA are also investigated (sGAs). We used accuracy and NMI to measure class-label agreement, and the internal measures WSS, BSS, CH, CWB, DB and DU to evaluate the cluster quality of the clustering algorithms. Results are compared with those produced by the existing ssFCM. While ssFCMGA and sGAs produced slightly lower agreement with known class labels than ssFCM, based on accuracy and NMI, the other six measures showed improved results in terms of compactness and well-separatedness (cluster quality), particularly when labelled data are low at 10%. Furthermore, cluster quality is shown to improve further using ssFCMGA with a more complex fitness function (sGA2). This demonstrates that applying GA in ssFCM improves cluster quality without exploration of different initialisation techniques.

Collaboration


Top Co-Authors

Joko Triloka

Universiti Brunei Darussalam

Owais Ahmed Malik

Universiti Brunei Darussalam

Umar Yahya

Universiti Brunei Darussalam

David Koh

National University of Singapore

Daniele Soria

Universiti Teknologi MARA