Publication


Featured research published by Saman K. Halgamuge.


IEEE Transactions on Neural Networks | 2000

Dynamic self-organizing maps with controlled growth for knowledge discovery

Damminda Alahakoon; Saman K. Halgamuge; Bala Srinivasan

The growing self-organizing map (GSOM) has been presented as an extended version of the self-organizing map (SOM), which has significant advantages for knowledge discovery applications. In this paper, the GSOM algorithm is presented in detail and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated. The spread factor is independent of the dimensionality of the data and as such can be used as a controlling measure for generating maps with different dimensionality, which can then be compared and analyzed with better accuracy. The spread factor is also presented as a method of achieving hierarchical clustering of a data set with the GSOM. Such hierarchical clustering allows the data analyst to identify significant and interesting clusters at a higher level of the hierarchy, and as such continue with finer clustering of only the interesting clusters. Therefore, only a small map is created in the beginning with a low spread factor, which can be generated for even a very large data set. Further analysis is conducted on selected sections of the data and as such of smaller volume. Therefore, this method facilitates the analysis of even very large data sets.
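How a spread factor can control map growth independently of data dimensionality can be sketched as follows. The abstract does not state the formula; this sketch assumes the growth-threshold relation GT = -D ln(SF) usually associated with the GSOM, where D is the data dimensionality and SF lies strictly between 0 and 1. The function names are hypothetical and chosen only for illustration.

```python
import math


def growth_threshold(dimension: int, spread_factor: float) -> float:
    """Assumed GSOM relation GT = -D * ln(SF); not stated in the abstract."""
    if not 0.0 < spread_factor < 1.0:
        raise ValueError("spread_factor must lie strictly between 0 and 1")
    return -dimension * math.log(spread_factor)


def should_grow(accumulated_error: float, dimension: int, spread_factor: float) -> bool:
    """A node spawns new nodes once its accumulated error exceeds the threshold."""
    return accumulated_error > growth_threshold(dimension, spread_factor)
```

Under this assumption a low spread factor gives a high threshold, so few nodes grow and the initial map stays small; raising the spread factor for a selected subset of the data lowers the threshold and yields the finer clustering described above.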


Fuzzy Sets and Systems | 1994

Neural networks in designing fuzzy systems for real world applications

Saman K. Halgamuge; Manfred Glesner

A special multilayer perceptron architecture known as FuNe I is successfully used for generating fuzzy systems for a number of real world applications. The FuNe I trained with supervised learning can be used to extract fuzzy rules from a given representative input/output data set. Furthermore, optimization of the knowledge base is possible, including the tuning of membership functions. The new method employed to identify the rule relevant nodes before the rules are extracted makes FuNe I suitable for applications with a large number of inputs. Some of the real world applications in areas of state identification and image classification show encouraging results in a shorter development time. Expert knowledge is not compulsory but can be included in the automatically extracted knowledge base. The generated fuzzy system can be implemented in hardware very easily. A flexible prototype board is developed with an FPGA chip in order to run applications with up to 128 inputs and 4 outputs in real time (1.25 million rules per second).


Bioinformatics | 2012

CONTRA: copy number analysis for targeted resequencing.

Jason Li; Richard Lupat; Kaushalya C. Amarasinghe; Ella R. Thompson; Maria A. Doyle; Georgina L. Ryland; Richard W. Tothill; Saman K. Halgamuge; Ian G. Campbell; Kylie L. Gorringe

Motivation: In light of the increasing adoption of targeted resequencing (TR) as a cost-effective strategy to identify disease-causing variants, a robust method for copy number variation (CNV) analysis is needed to maximize the value of this promising technology. Results: We present a method for CNV detection for TR data, including whole-exome capture data. Our method calls copy number gains and losses for each target region based on normalized depth of coverage. Our key strategies include the use of base-level log-ratios to remove GC-content bias, correction for an imbalanced library size effect on log-ratios, and the estimation of log-ratio variations via binning and interpolation. Our methods are made available via CONTRA (COpy Number Targeted Resequencing Analysis), a software package that takes standard alignment formats (BAM/SAM) and outputs in variant call format (VCF4.0), for easy integration with other next-generation sequencing analysis packages. We assessed our methods using samples from seven different target enrichment assays, and evaluated our results using simulated data and real germline data with known CNV genotypes. Availability and implementation: Source code and sample data are freely available under GNU license (GPLv3) at http://contra-cnv.sourceforge.net/ Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
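The idea of calling gains and losses from normalized depth of coverage can be illustrated with a small sketch. This is not the CONTRA implementation: the function name, the pseudocount, and the simple total-depth scaling used as a library-size correction are all assumptions made for illustration, and GC-bias handling and variance estimation are omitted.

```python
import math
from statistics import mean


def region_log_ratio(test_depths, control_depths, test_total, control_total,
                     pseudocount=0.5):
    """Mean base-level log2 ratio for one target region.

    test_depths / control_depths: per-base coverage within the region.
    test_total / control_total: overall sequenced depth of each library,
    used here as a simple (assumed) library-size correction.
    """
    scale = control_total / test_total  # bring the test library to the control's size
    ratios = [
        math.log2((t * scale + pseudocount) / (c + pseudocount))
        for t, c in zip(test_depths, control_depths)
    ]
    return mean(ratios)
```

A region with a mean log-ratio near 0 would be read as copy-neutral, while clearly positive or negative values suggest a gain or a loss; the thresholds for such calls are left out because the abstract does not specify them.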


Congress on Evolutionary Computation | 2003

A comparison of constraint-handling methods for the application of particle swarm optimization to constrained nonlinear optimization problems

G. Coath; Saman K. Halgamuge

We present a comparison of two constraint-handling methods used in the application of particle swarm optimization (PSO) to constrained nonlinear optimization problems (CNOPs). A brief review of constraint-handling techniques for evolutionary algorithms (EAs) is given, followed by a direct comparison of two existing methods of enforcing constraints using PSO. The two methods considered are the application of nonstationary multistage penalty functions and the preservation of feasible solutions. Five benchmark functions are used for the comparison, and the results are examined to assess the performance of each method in terms of accuracy and rate of convergence. Conclusions are drawn and suggestions for the applicability of each method to real-world CNOPs are given.
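The two constraint-handling styles compared above can be sketched side by side. This is a simplified illustration under stated assumptions: constraints are written as g(x) <= 0, the penalty weight grows as the square root of the iteration count, and the two-stage values for `theta` and `gamma` are illustrative placeholders, not the settings used in the paper.

```python
import math


def violations(x, constraints):
    """Amount by which each constraint g(x) <= 0 is violated (0 if satisfied)."""
    return [max(0.0, g(x)) for g in constraints]


def penalized_fitness(f, constraints, x, iteration):
    """Nonstationary multistage penalty: the weight grows with the iteration,
    and the per-constraint penalty depends on how large the violation is."""
    h = math.sqrt(iteration)  # assumed nonstationary weight
    penalty = 0.0
    for q in violations(x, constraints):
        if q == 0.0:
            continue
        theta = 10.0 if q < 1.0 else 100.0  # illustrative stage weights
        gamma = 1.0 if q < 1.0 else 2.0     # illustrative stage exponents
        penalty += theta * q ** gamma
    return f(x) + h * penalty


def update_best_feasible(f, constraints, best, candidate):
    """Preservation of feasible solutions: an infeasible candidate can never
    replace the stored best position."""
    if any(q > 0.0 for q in violations(candidate, constraints)):
        return best
    return candidate if f(candidate) < f(best) else best
```

The penalty variant lets particles explore infeasible regions at a growing cost, whereas the feasibility-preserving variant requires a feasible starting swarm but never stores an infeasible best position.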


BMC Bioinformatics | 2006

Splice site identification using probabilistic parameters and SVM classification

A. K. M. A. Baten; Bill C. H. Chang; Saman K. Halgamuge; Jason Li

Background: Recent advances and automation in DNA sequencing technology have created a vast amount of DNA sequence data. This growth of sequence data demands better and more efficient analysis methods. Identifying genes in this newly accumulated data is an important issue in bioinformatics, and it requires the prediction of the complete gene structure. Accurate identification of splice sites in DNA sequences plays a central role in gene structure prediction in eukaryotes. Effective detection of splice sites requires knowledge of the characteristics, dependencies, and relationships of nucleotides in the region surrounding the splice site. Higher-order Markov models are generally regarded as a useful technique for modeling higher-order dependencies. However, their implementation requires estimating a large number of parameters, which is computationally expensive. Results: The proposed method for splice site detection consists of two stages: a first order Markov model (MM1) is used in the first stage and a support vector machine (SVM) with polynomial kernel is used in the second stage. The MM1 serves as a pre-processing step for the SVM and takes DNA sequences as its input. It models the compositional features and dependencies of nucleotides in terms of probabilistic parameters around splice site regions. The probabilistic parameters are then fed into the SVM, which combines them nonlinearly to predict splice sites. When the proposed MM1-SVM model is compared with other existing standard splice site detection methods, it shows superior performance in all cases. Conclusion: We proposed an effective pre-processing scheme for the SVM and applied it to the identification of splice sites. This is a simple yet effective splice site detection method, which shows better classification accuracy and computational speed than some other more complex methods.
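The first stage, turning a fixed-length DNA window into probabilistic parameters, can be sketched as below. This is a minimal illustration, assuming position-specific first-order transition probabilities with add-one smoothing; the function names and the smoothing are assumptions, and the second-stage SVM with a polynomial kernel is not shown.

```python
from collections import defaultdict

ALPHABET = "ACGT"


def fit_mm1(sequences, pseudocount=1.0):
    """Estimate P(next base | current base) separately at each window position
    from aligned, equal-length training sequences."""
    length = len(sequences[0])
    counts = [defaultdict(float) for _ in range(length - 1)]
    for seq in sequences:
        for i in range(length - 1):
            counts[i][(seq[i], seq[i + 1])] += 1.0
    model = []
    for pos in counts:
        table = {}
        for a in ALPHABET:
            total = sum(pos[(a, b)] for b in ALPHABET) + pseudocount * len(ALPHABET)
            for b in ALPHABET:
                table[(a, b)] = (pos[(a, b)] + pseudocount) / total
        model.append(table)
    return model


def mm1_features(model, seq):
    """Encode a sequence as the transition probabilities it realises;
    this vector would be the input to the second-stage classifier."""
    return [model[i][(seq[i], seq[i + 1])] for i in range(len(seq) - 1)]
```

A window of n bases yields n - 1 features, so the classifier sees a compact numeric vector in place of raw nucleotides.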


Bioinformatics | 2003

An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data

Arthur L. Hsu; Sen-Lin Tang; Saman K. Halgamuge

Motivation: Current self-organizing map (SOM) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates the merits of hierarchical clustering with the robustness against noise known from self-organizing approaches. Results: The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major clusters and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, so it is labelled as an uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data automatically derived two clusters, which is consistent with the number of classes in the data (cancerous and normal). Availability: Java software of the dynamic SOM tree algorithm is available upon request for academic use. Supplementary information: A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf


BMC Bioinformatics | 2008

Binning sequences using very sparse labels within a metagenome

Chon-Kit Kenneth Chan; Arthur L. Hsu; Saman K. Halgamuge; Sen-Lin Tang

Background: In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity. Results: The proposed seeding method is implemented on an unsupervised Growing Self-Organising Map (GSOM), and called Seeded GSOM (S-GSOM). We compared it with four well-known semi-supervised learning methods in a preliminary test, separating random-length prokaryotic sequence fragments sampled from the NCBI genome database. We identified the flanking sequences of the highly conserved 16S rRNA as suitable seeds that could be used to group the sequence fragments according to their species. S-GSOM showed superior performance compared to the semi-supervised methods tested. Additionally, S-GSOM may also be used to visually identify some species that do not have seeds. The proposed method was then applied to simulated metagenomic datasets using two different confidence threshold settings and compared with PhyloPythia, k-mer and BLAST. At the reference taxonomic level Order, S-GSOM outperformed all k-mer and BLAST results and showed comparable results with PhyloPythia for each of the corresponding confidence settings, where S-GSOM performed better than PhyloPythia in the ≥ 10 reads datasets and comparable in the ≥ 8 kb benchmark tests. Conclusion: In the task of binning using semi-supervised learning methods, results indicate S-GSOM to be the best of the methods tested. Most importantly, the proposed method does not require knowledge from known genomes and uses only very few labels (one per species is sufficient in most cases), which are extracted from the metagenome itself. These advantages make it a very attractive binning method. S-GSOM outperformed the binning methods that depend on already-sequenced genomes, and compares well to the current most advanced binning method, PhyloPythia.


BioMed Research International | 2008

Using Growing Self-Organising Maps to Improve the Binning Process in Environmental Whole-Genome Shotgun Sequencing

Chon-Kit Kenneth Chan; Arthur L. Hsu; Sen-Lin Tang; Saman K. Halgamuge

Metagenomic projects using whole-genome shotgun (WGS) sequencing produce many unassembled DNA sequences and small contigs. The step of clustering these sequences, based on biological and molecular features, is called binning. A reported strategy for binning that combines oligonucleotide frequency and self-organising maps (SOM) shows high potential. We improve this strategy by identifying suitable training features, implementing a better clustering algorithm, and defining quantitative measures for assessing results. We investigated the suitability of each of di-, tri-, tetra-, and pentanucleotide frequencies. The results show that dinucleotide frequency is not a sufficiently strong signature for binning 10 kb long DNA sequences, compared to the other three. Furthermore, we observed that increasing the order of oligonucleotide frequency may worsen the assignment result in some cases, which indicates the possible existence of an optimal species-specific oligonucleotide frequency. We replaced SOM with the growing self-organising map (GSOM), which gives comparable results while gaining a 7%–15% speed improvement.
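The training features discussed above, oligonucleotide frequencies of order 2 to 5, can be computed with a short sketch like the following. The function name is hypothetical, and details such as skipping windows that contain ambiguous bases and not merging reverse complements are assumptions rather than choices documented in the abstract.

```python
from itertools import product


def oligo_frequencies(seq, k):
    """Return the relative frequency of every DNA k-mer in a fixed
    lexicographic order (4**k values), using overlapping windows."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with N or other ambiguous symbols
            counts[window] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]
```

Calling `oligo_frequencies(fragment, 4)` gives a 256-dimensional vector per fragment, which is the kind of input a SOM or GSOM would be trained on; the vector length grows as 4**k, which is one practical reason higher orders are not automatically better.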


International Journal of Production Research | 2008

Empirical relationships between some manufacturing practices and performance

M. A. Karim; Alan Smith; Saman K. Halgamuge

Intense global competition, rapid technological changes, advances in manufacturing and information technology and discerning customers are forcing manufacturers to adopt manufacturing practices and competitive priorities that enable them to deliver high quality products in a short period of time. Identifying manufacturers’ competitive priorities and effective manufacturing practices has long been considered one of the key elements in manufacturing strategy research. This paper presents the results of a study conducted to identify some of the effective manufacturing practices that have a significant influence on manufacturing performance. This study also identifies the main competitive objectives of manufacturing industries that participated in the study. The results reported in this paper are based on data collected from a survey using a standard questionnaire administered to 1000 manufacturers in Australia. Evidence indicates that product quality and reliability are the main competitive factors for manufacturers and that price has, surprisingly, become a relatively less important factor. Results show that simultaneous pursuit of advanced quality practices can neutralize the potential negative impacts of manufacturing difficulties and significantly improve product quality and manufacturing performance. Failure mode and effect analysis (FMEA) is shown to be an important tool for improving product quality and on time delivery performance. FMEA practice driven by the intention to improve customer satisfaction is more effective than that practised to fulfil customer requirements. Effective supplier relationships are shown to contribute positively to manufacturing performance. The results also suggest that maintaining a supplier rating system and product data management and regularly updating them with field failure and warranty data are important manufacturing practices.


International Symposium on Neural Networks | 2007

Combining News and Technical Indicators in Daily Stock Price Trends Prediction

Yu Zheng Zhai; Arthur L. Hsu; Saman K. Halgamuge

Stock market prediction has always been one of the hottest topics in research, as well as a great challenge due to its complex and volatile nature. However, most of the existing methods neglect the impact of mass media, which can greatly affect the behavior of investors. In this paper we present a system that combines the information from both related news releases and technical indicators to enhance the predictability of the daily stock price trends. The results show that this system can achieve higher accuracy and return than a single source system.

Collaboration


Dive into Saman K. Halgamuge's collaborations.

Top Co-Authors

Manfred Glesner

Technische Universität Darmstadt

Isaam Saeed

University of Melbourne

Andrew Wirth

University of Melbourne

Jason Li

Peter MacCallum Cancer Centre
