Publication


Featured research published by Arthur L. Hsu.


Bioinformatics | 2003

An unsupervised hierarchical dynamic self-organizing approach to cancer class discovery and marker gene identification in microarray data

Arthur L. Hsu; Sen-Lin Tang; Saman K. Halgamuge

Motivation: Current Self-Organizing Maps (SOMs) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates merits of hierarchical clustering with robustness against noise known from self-organizing approaches.
Results: The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, therefore labelled as uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data has automatically derived two clusters, which is consistent with the number of classes in data (cancerous and normal).
Availability: JAVA software of dynamic SOM tree algorithm is available upon request for academic use.
Supplementary information: A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf


BMC Bioinformatics | 2008

Binning sequences using very sparse labels within a metagenome

Chon-Kit Kenneth Chan; Arthur L. Hsu; Saman K. Halgamuge; Sen-Lin Tang

Background: In metagenomic studies, a process called binning is necessary to assign contigs that belong to multiple species to their respective phylogenetic groups. Most of the current methods of binning, such as BLAST, k-mer and PhyloPythia, involve assigning sequence fragments by comparing sequence similarity or sequence composition with already-sequenced genomes that are still far from comprehensive. We propose a semi-supervised seeding method for binning that does not depend on knowledge of completed genomes. Instead, it extracts the flanking sequences of highly conserved 16S rRNA from the metagenome and uses them as seeds (labels) to assign other reads based on their compositional similarity.
Results: The proposed seeding method is implemented on an unsupervised Growing Self-Organising Map (GSOM), and called Seeded GSOM (S-GSOM). We compared it with four well-known semi-supervised learning methods in a preliminary test, separating random-length prokaryotic sequence fragments sampled from the NCBI genome database. We identified the flanking sequences of the highly conserved 16S rRNA as suitable seeds that could be used to group the sequence fragments according to their species. S-GSOM showed superior performance compared to the semi-supervised methods tested. Additionally, S-GSOM may also be used to visually identify some species that do not have seeds.
The proposed method was then applied to simulated metagenomic datasets using two different confidence threshold settings and compared with PhyloPythia, k-mer and BLAST. At the reference taxonomic level Order, S-GSOM outperformed all k-mer and BLAST results and showed comparable results with PhyloPythia for each of the corresponding confidence settings, where S-GSOM performed better than PhyloPythia in the ≥ 10 reads datasets and comparable in the ≥ 8 kb benchmark tests.
Conclusion: In the task of binning using semi-supervised learning methods, results indicate S-GSOM to be the best of the methods tested. Most importantly, the proposed method does not require knowledge from known genomes and uses only very few labels (one per species is sufficient in most cases), which are extracted from the metagenome itself. These advantages make it a very attractive binning method. S-GSOM outperformed the binning methods that depend on already-sequenced genomes, and compares well to the current most advanced binning method, PhyloPythia.


Cancer Cell | 2014

The Architecture and Evolution of Cancer Neochromosomes

Dale W. Garsed; Owen J. Marshall; Vincent Corbin; Arthur L. Hsu; Leon Di Stefano; Jan Schröder; Jason Li; Zhi-Ping Feng; Bo W. Kim; Mark Kowarsky; Ben Lansdell; Ross Brookwell; Ola Myklebost; Leonardo A. Meza-Zepeda; Andrew J. Holloway; Florence Pedeutour; K.H. Andy Choo; Michael A. Damore; Andrew J. Deans; Anthony T. Papenfuss; David Thomas

We isolated and analyzed, at single-nucleotide resolution, cancer-associated neochromosomes from well- and/or dedifferentiated liposarcomas. Neochromosomes, which can exceed 600 Mb in size, initially arise as circular structures following chromothripsis involving chromosome 12. The core of the neochromosome is amplified, rearranged, and corroded through hundreds of breakage-fusion-bridge cycles. Under selective pressure, amplified oncogenes are overexpressed, while coamplified passenger genes may be silenced epigenetically. New material may be captured during punctuated chromothriptic events. Centromeric corrosion leads to crisis, which is resolved through neocentromere formation or native centromere capture. Finally, amplification terminates, and the neochromosome core is stabilized in linear form by telomere capture. This study investigates the dynamic mutational processes underlying the life history of a special form of cancer mutation.


BioMed Research International | 2008

Using Growing Self-Organising Maps to Improve the Binning Process in Environmental Whole-Genome Shotgun Sequencing

Chon-Kit Kenneth Chan; Arthur L. Hsu; Sen-Lin Tang; Saman K. Halgamuge

Metagenomic projects using whole-genome shotgun (WGS) sequencing produce many unassembled DNA sequences and small contigs. The step of clustering these sequences, based on biological and molecular features, is called binning. A reported strategy for binning that combines oligonucleotide frequency and self-organising maps (SOM) shows high potential. We improve this strategy by identifying suitable training features, implementing a better clustering algorithm, and defining quantitative measures for assessing results. We investigated the suitability of each of di-, tri-, tetra-, and pentanucleotide frequencies. The results show that dinucleotide frequency is not a sufficiently strong signature for binning 10 kb long DNA sequences, compared to the other three. Furthermore, we observed that increased order of oligonucleotide frequency may deteriorate the assignment result in some cases, which indicates the possible existence of optimal species-specific oligonucleotide frequency. We replaced SOM with growing self-organising map (GSOM) where comparable results are obtained while gaining 7%–15% speed improvement.
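
As a rough illustration of the oligonucleotide-frequency features mentioned in this abstract, the sketch below turns a DNA sequence into a normalised k-mer frequency vector (k = 2 for dinucleotides, k = 4 for tetranucleotides, and so on). The function name `kmer_frequencies` is hypothetical, and the handling of ambiguous bases and the absence of reverse-complement merging are simplifying assumptions; the abstract does not specify how the authors built their training features.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Return normalised k-mer frequencies of seq, in lexicographic k-mer order.

    Assumption: windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Each sequence fragment then becomes a vector of length 4^k (16 for dinucleotides, 256 for tetranucleotides), which could be fed to a SOM or GSOM as a training sample.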


International Symposium on Neural Networks | 2007

Combining News and Technical Indicators in Daily Stock Price Trends Prediction

Yu Zheng Zhai; Arthur L. Hsu; Saman K. Halgamuge

Stock market prediction has always been one of the hottest topics in research, as well as a great challenge due to its complex and volatile nature. However, most of the existing methods neglect the impact from mass media that will greatly affect the behavior of investors. In this paper we present a system that combines the information from both related news releases and technical indicators to enhance the predictability of the daily stock price trends. The performance shows that this system can achieve higher accuracy and return than a single source system.


Genome Biology | 2010

Novel venom gene discovery in the platypus

Camilla M. Whittington; Anthony T. Papenfuss; Devin P. Locke; Elaine R. Mardis; Richard Wilson; Sahar Abubucker; Makedonka Mitreva; Emily S. W. Wong; Arthur L. Hsu; Philip W. Kuchel; Katherine Belov; Wesley C. Warren

Background: To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components.
Results: We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation.
Conclusions: This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom.


International Journal of Approximate Reasoning | 2003

Enhancement of topology preservation and hierarchical dynamic self-organising maps for data visualisation

Arthur L. Hsu; Saman K. Halgamuge

The use of self-organising maps (SOM) in unsupervised knowledge discovery has been successful and widely accepted, since the results produced are unbiased and can be visualised. Growing SOM (GSOM), or dynamic SOM that dynamically allocates map size and shape, was proposed to compensate for the static nature of Kohonen’s SOM. GSOM has proven in experiments to decrease the time required to produce a feature map that is of appropriate size for the given data. However, although GSOM usually arrives at similar quantisation error when compared to SOM, it produces considerably higher topographic error. This property has significant influence on the quality of data visualisation and clustering using GSOM, therefore the authors propose an algorithm to enhance topographic quality of GSOM by means of recursive mean directed growing (RMDG) in the growing phase of GSOM while maintaining or even improving its quantisation quality. Furthermore, the authors introduce a dynamic SOM tree model, or hierarchical GSOM, to identify clusters with better accuracy and to visualise cluster separation and merging. Results show improvement of topography preservation when compared to GSOM, and SOM that has similar map size but is not of topologically optimum map aspect ratio. The dynamic SOM tree model demonstrates the ability to allow users to identify clusters interactively and at the same time understand how a larger cluster breaks up into smaller clusters (if it has any) and/or smaller clusters group to form a larger cluster.
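
The two quality measures contrasted in this abstract can be sketched as follows, using commonly cited definitions rather than anything stated in the abstract itself: quantisation error as the mean distance from each sample to its best-matching node, and topographic error as the fraction of samples whose best and second-best nodes are not neighbours on the map. The function name `map_errors`, the dictionary layout, and the 4-neighbour rectangular adjacency rule are all assumptions for illustration; this is not the RMDG algorithm.

```python
import math


def map_errors(nodes, data):
    """Return (quantisation_error, topographic_error) for a trained map.

    nodes: dict mapping an (x, y) grid position to that node's weight vector.
           A dict is used so irregular, grown map shapes can be represented.
    data:  list of input vectors with the same length as the weight vectors.
    Requires at least two nodes and one sample.
    """
    total_dist = 0.0
    non_adjacent = 0
    for sample in data:
        # Rank grid positions by distance between node weight and sample.
        ranked = sorted(nodes, key=lambda pos: math.dist(nodes[pos], sample))
        best, second = ranked[0], ranked[1]
        total_dist += math.dist(nodes[best], sample)
        # Assumed adjacency: Manhattan distance of exactly 1 on the grid.
        if abs(best[0] - second[0]) + abs(best[1] - second[1]) != 1:
            non_adjacent += 1
    n = len(data)
    return total_dist / n, non_adjacent / n
```

Under these definitions a map can quantise the data well and still score poorly on topographic error if nodes with similar weights end up far apart on the grid, which is the kind of trade-off the abstract describes.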


Gut | 2017

Circulating tumour cells from patients with colorectal cancer have cancer stem cell hallmarks in ex vivo culture

Fanny Grillet; Elsa Bayet; Olivia Villeronce; Luke Zappia; Ebba L. Lagerqvist; Sebastian Lunke; Emmanuelle Charafe-Jauffret; Kym Pham; Christina Mølck; Nathalie Rolland; Jean-François Bourgaux; Michel Prudhomme; Claire Philippe; Sophie Bravo; Jean Christophe Boyer; Lucile Canterel-Thouennon; Graham R. Taylor; Arthur L. Hsu; Jean Marc Pascussi; Frédéric Hollande; Julie Pannequin

Objective: Although counting of circulating tumour cells (CTC) has attracted a broad interest as potential markers of tumour progression and treatment response, the lack of functional characterisation of these cells had become a bottleneck in taking these observations to the clinic. Our objective was to culture these cells in order to understand them and exploit their therapeutic potential to the full.
Design: Here, hypothesising that some CTC potentially have cancer stem cell (CSC) phenotype, we generated several CTC lines from the blood of patients with advanced metastatic colorectal cancer (CRC) based on their self-renewal abilities. Multiple standard tests were then employed to characterise these cells.
Results: Our CTC lines self-renew, express CSC markers and have multilineage differentiation ability, both in vitro and in vivo. Patient-derived CTC lines are tumorigenic in subcutaneous xenografts and are also able to colonise the liver after intrasplenic injection. RNA sequencing analyses strikingly demonstrate that drug metabolising pathways represent the most upregulated feature among CTC lines in comparison with primary CRC cells grown under similar conditions. This result is corroborated by the high resistance of the CTC lines to conventional cytotoxic compounds.
Conclusions: Taken together, our results directly demonstrate the existence of patient-derived colorectal CTCs that bear all the functional attributes of CSCs. The CTC culture model described here is simple and takes <1 month from blood collection to drug testing, therefore, routine clinical application could facilitate access to personalised medicine.
Clinical Trial Registration: ClinicalTrial.gov NCT01577511.


Neurocomputing | 2008

Class structure visualization with semi-supervised growing self-organizing maps

Arthur L. Hsu; Saman K. Halgamuge

We present a semi-supervised learning method for the growing self-organising maps (GSOM) that allows fast visualisation of data class structure on the 2D feature map. Instead of discarding data with missing values, the network can be trained from data with up to 60% of their class labels and 25% of attribute values missing, while able to make class prediction with over 90% accuracy for the benchmark datasets used. The proposed algorithm is compared to three variants of semi-supervised K-means learning on four real-world benchmark datasets and showed comparable performance and better generalisation.


Machine Vision and Applications | 2011

Comparing two video-based techniques for driver fatigue detection: classification versus optical flow approach

Rajinda Senaratne; Budi Thomas Jap; Sara Lal; Arthur L. Hsu; Saman K. Halgamuge; Peter Fischer

Lack of concentration in a driver due to fatigue is a major cause of road accidents. This paper investigates approaches that can be used to develop a video-based system to automatically detect driver fatigue and warn the driver, in order to prevent accidents. Ocular cues such as percentage eye closure (PERCLOS) are considered strong fatigue indicators; thus, accurately locating and tracking the driver’s eyes is vital. Tests were carried out based on two approaches to track the eyes and estimate PERCLOS: (1) classification approach and (2) optical flow approach. In the first approach, the eyes are tracked by finding local regions, the state (open or closed) of the eyes in each image frame is estimated using a classifier, and thereby the PERCLOS is calculated. In the second approach, the movement of the upper eyelid is tracked using a newly proposed simple eye model, which captures image velocities based on optical flow, thereby the eye closures and openings are detected, and then the eye states are estimated to calculate PERCLOS. Experiments show that both approaches can detect fatigue with reasonable accuracy, and that the classification approach is more accurate. However, the classification approach requires a large amount of suitable training data. If such data are unavailable, then the optical flow approach would be more practical.
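
Once either approach has produced a per-frame eye state, the PERCLOS calculation itself reduces to a running percentage. The sketch below assumes PERCLOS is taken as the share of closed-eye frames within a sliding window of recent frames; the function name `perclos_series` and the window parameter are hypothetical, and the abstract does not state the window length or any fatigue threshold the authors used.

```python
from collections import deque


def perclos_series(closed_flags, window):
    """Yield PERCLOS (percent of closed-eye frames) after each new frame.

    closed_flags: iterable of truthy/falsy values, one per video frame
                  (truthy means the eyes were judged closed in that frame).
    window:       number of most recent frames to average over (assumed).
    """
    recent = deque(maxlen=window)
    closed_count = 0
    for closed in closed_flags:
        if len(recent) == window:
            # The oldest frame is about to fall out of the window.
            closed_count -= recent[0]
        recent.append(bool(closed))
        closed_count += recent[-1]
        yield 100.0 * closed_count / len(recent)
```

For example, `list(perclos_series([0, 1, 1, 0], 2))` gives `[0.0, 50.0, 100.0, 50.0]`. In practice the window would cover many seconds of video, and the per-frame flags would come from the classifier or from the optical-flow eyelid tracker.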

Collaboration


Dive into Arthur L. Hsu's collaboration.

Top Co-Authors

Saman K. Halgamuge

Australian National University

Anthony T. Papenfuss

Walter and Eliza Hall Institute of Medical Research

Kym Pham

University of Melbourne

Paul Waring

Peter MacCallum Cancer Centre

Alan Smith

University of Melbourne
