Isaam Saeed
University of Melbourne
Publication
Featured research published by Isaam Saeed.
Nucleic Acids Research | 2012
Isaam Saeed; Sen-Lin Tang; Saman K. Halgamuge
An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis.
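Two of the composition features named in this abstract, GC content and tetranucleotide frequency, can be illustrated with a minimal sketch. This is not the authors' binning method: the oligonucleotide frequency-derived error gradient and both clustering tiers are omitted, and the function names `gc_content` and `tetranucleotide_frequency` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def tetranucleotide_frequency(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 possible 4-mers.

    Overlapping windows are counted; windows containing any
    character other than A, C, G or T are ignored.
    """
    seq = seq.upper()
    k = 4
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}
```

Each sequence then maps to a 256-dimensional vector plus one GC value, which is the kind of feature space a compositional binning method would cluster.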
The ISME Journal | 2013
Ching-Hung Tseng; Pei-Wen Chiang; Fuh-Kwo Shiah; Yi-Lung Chen; Jia-Rong Liou; Ting-Chang Hsu; Suhinthan Maheswararajah; Isaam Saeed; Saman K. Halgamuge; Sen-Lin Tang
Extreme climatic activities, such as typhoons, are widely known to disrupt our natural environment. In particular, studies have revealed that typhoon-induced perturbations can result in several long-term effects on various ecosystems. In this study, we have conducted a 2-year metagenomic survey to investigate the microbial and viral community dynamics associated with environmental changes and seasonal variations in an enclosed freshwater reservoir subject to episodic typhoons. We found that the microbial community structure and the associated metagenomes continuously changed, where microbial richness increased after typhoon events and decreased during winter. Among the environmental factors that influenced changes in the microbial community, precipitation was considered to be the most significant. Similarly, the viral community regularly showed higher relative abundances and diversity during summer in comparison to winter, with major variations happening in several viral families including Siphoviridae, Myoviridae, Podoviridae and Microviridae. Interestingly, we also found that the precipitation level was associated with the terrestrial viral abundance in the reservoir. In contrast to the dynamic microbial community (L-divergence 0.73±0.25), we found that microbial metabolic profiles were relatively less divergent (L-divergence 0.24±0.04) at the finest metabolic resolution. This study provides for the first time a glimpse at the microbial and viral community dynamics of a subtropical freshwater ecosystem, adding a comprehensive set of new knowledge to aquatic environments.
BMC Genomics | 2009
Isaam Saeed; Saman K. Halgamuge
Background: The characterisation, or binning, of metagenome fragments is an important first step to further downstream analysis of microbial consortia. Here, we propose a one-dimensional signature, OFDEG, derived from the oligonucleotide frequency profile of a DNA sequence, and show that it is possible to obtain a meaningful phylogenetic signal for relatively short DNA sequences. The one-dimensional signal is essentially a compact representation of higher dimensional feature spaces of greater complexity and is intended to improve on the tetranucleotide frequency feature space preferred by current compositional binning methods.
Results: We compare the fidelity of OFDEG against tetranucleotide frequency in both an unsupervised and semi-supervised setting on simulated metagenome benchmark data. Four tests were conducted using assembler output of Arachne and phrap, and for each, performance was evaluated on contigs which are greater than or equal to 8 kbp in length and contigs which are composed of at least 10 reads. Using G-C content in conjunction with OFDEG gave an average accuracy of 96.75% (semi-supervised) and 95.19% (unsupervised), versus 94.25% (semi-supervised) and 82.35% (unsupervised) for tetranucleotide frequency.
Conclusion: We have presented an observation of an alternative characteristic of DNA sequences. The proposed feature representation has proven to be more beneficial than the existing tetranucleotide frequency space to the metagenome binning problem. We do note, however, that our observation of OFDEG deserves further analysis and investigation. Unsupervised clustering revealed that OFDEG-related features performed better than standard tetranucleotide frequency in representing a relevant organism-specific signal. Further improvement in binning accuracy is given by semi-supervised classification using OFDEG. The emphasis on a feature-driven, bottom-up approach to the problem of binning reveals promising avenues for future development of techniques to characterise short environmental sequences without bias toward cultivable organisms.
PLOS ONE | 2014
Jason Li; Maria A. Doyle; Isaam Saeed; Stephen Q. Wong; Victoria Mar; David L. Goode; Franco Caramia; Ken Doig; Georgina L. Ryland; Ella R. Thompson; Sally M. Hunter; Saman K. Halgamuge; Jason Ellul; Alexander Dobrovic; Ian G. Campbell; Anthony T. Papenfuss; Grant A. McArthur; Richard W. Tothill
Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/.
International Conference on Information and Automation | 2007
Isaam Saeed; Alyoxia Wang; Rajinda Senaratne; Saman K. Halgamuge
Driver fatigue is a major cause of accidents on roads. We present a video-based driver fatigue detection system, which uses the Active Appearance Model as a means to track rigid and non-rigid motion of a face in an input video sequence. We propose an Active Appearance Model specific to the ocular region of the face to place greater focus on the eyes, as fatigue cues based on ocular features, such as blink rate, are considered robust indicators of fatigue. This specific model is used to estimate eye-blinks, and further, secondary cues such as head-nodding. Results show that the proposed model can effectively detect eye-blinks. Finally, we evaluate tracking performance and blink estimation on two test subjects.
BMC Genomics | 2015
Ching-Hung Tseng; Pei-Wen Chiang; Hung-Chun Lai; Fuh-Kwo Shiah; Ting-Chang Hsu; Yi-Lung Chen; Liang-Saw Wen; Chun-Mao Tseng; Wung Yang Shieh; Isaam Saeed; Saman K. Halgamuge; Sen-Lin Tang
Background: Prokaryotic microbes, the most abundant organisms in the ocean, are remarkably diverse. Despite numerous studies of marine prokaryotes, the zonation of their communities in pelagic zones has been poorly delineated. By exploiting the persistent stratification of the South China Sea (SCS), we performed a 2-year, large spatial scale (10, 100, 1000, and 3000 m) survey, which included a pilot study in 2006 and comprehensive sampling in 2007, to investigate the biological zonation of bacteria and archaea using 16S rRNA tag and shotgun metagenome sequencing.
Results: Alphaproteobacteria dominated the bacterial community in the surface SCS, where the abundance of Betaproteobacteria was seemingly associated with climatic activity. Gammaproteobacteria thrived in the deep SCS, where a noticeable amount of Cyanobacteria were also detected. Marine Groups II and III Euryarchaeota were predominant in the archaeal communities in the surface and deep SCS, respectively. Bacterial diversity was higher than archaeal diversity at all sampling depths in the SCS, and peaked at mid-depths, agreeing with the diversity pattern found in global water columns. Metagenomic analysis not only showed differential %GC values and genome sizes between the surface and deep SCS, but also demonstrated depth-dependent metabolic potentials, such as cobalamin biosynthesis at 10 m, osmoregulation at 100 m, signal transduction at 1000 m, and plasmid and phage replication at 3000 m. When compared with other oceans, urease at 10 m and both exonuclease and permease at 3000 m were more abundant in the SCS. Finally, enriched genes associated with nutrient assimilation in the sea surface and transposase in the deep-sea metagenomes exemplified the functional zonation in global oceans.
Conclusions: Prokaryotic communities in the SCS stratified with depth, with maximal bacterial diversity at mid-depth, in accordance with global water columns. The SCS had functional zonation among depths and endemically enriched metabolic potentials at the study site, in contrast to other oceans.
Bioinformatics | 2015
Duleepa Jayasundara; Isaam Saeed; Suhinthan Maheswararajah; Bill C. H. Chang; Sen-Lin Tang; Saman K. Halgamuge
Motivation: The combined effect of a high replication rate and the low fidelity of the viral polymerase in most RNA viruses and some DNA viruses results in the formation of a viral quasispecies. Uncovering information about quasispecies populations significantly benefits the study of disease progression, antiviral drug design, vaccine design and viral pathogenesis. We present a new analysis pipeline called ViQuaS for viral quasispecies spectrum reconstruction using short next-generation sequencing reads. ViQuaS is based on a novel reference-assisted de novo assembly algorithm for constructing local haplotypes. A significantly extended version of an existing global strain reconstruction algorithm is also used.
Results: Benchmarking results showed that ViQuaS outperformed three other previously published methods named ShoRAH, QuRe and PredictHaplo, with improvements of at least 3.1-53.9% in recall, 0-12.1% in precision and 0-38.2% in F-score in terms of strain sequence assembly and improvements of at least 0.006-0.143 in KL-divergence and 0.001-0.035 in root mean-squared error in terms of strain frequency estimation, over the next-best algorithm under various simulation settings. We also applied ViQuaS to a real read set derived from an in vitro human immunodeficiency virus (HIV)-1 population, two independent datasets of foot-and-mouth-disease virus derived from the same biological sample and a real HIV-1 dataset and demonstrated better results than other methods available.
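The evaluation measures named in this abstract can be sketched with their standard textbook definitions. The abstract does not give the benchmarking protocol, so everything specific here is an assumption: a reconstructed strain counts as correct only on an exact sequence match, frequency vectors are aligned index by index, and the function names are hypothetical.

```python
import math


def precision_recall_f(reconstructed: set[str], truth: set[str]) -> tuple[float, float, float]:
    """Precision, recall and F-score of a set of reconstructed strains.

    Assumption: a strain is a true positive only if its sequence
    matches a true strain exactly.
    """
    true_positives = len(reconstructed & truth)
    precision = true_positives / len(reconstructed) if reconstructed else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Kullback-Leibler divergence D(p || q), natural logarithm.

    Assumption: q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rmse(p: list[float], q: list[float]) -> float:
    """Root mean-squared error between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) / len(p))
```

Under these definitions, higher precision, recall and F-score indicate better strain assembly, while lower KL-divergence and root mean-squared error indicate better frequency estimation.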
Metabolomics | 2013
Chalini D. Wijetunge; Zhaoping Li; Isaam Saeed; Jairus Bowne; Arthur L. Hsu; Ute Roessner; Antony Bacic; Saman K. Halgamuge
In order to make sense of the sheer volume of metabolomic data that can be generated using current technology, robust data analysis tools are essential. We propose the use of the growing self-organizing map (GSOM) algorithm and by doing so demonstrate that a deeper analysis of metabolomics data is possible in comparison to the widely used batch-learning self-organizing map, hierarchical cluster analysis and partitioning around medoids algorithms on simulated and real-world time-course metabolomic datasets. We then applied GSOM to a recently published dataset representing metabolome response patterns of three wheat cultivars subject to a field simulated cyclic drought stress. This novel and information rich analysis provided by the proposed GSOM framework can be easily extended to other high-throughput metabolomics studies.
BMC Bioinformatics | 2015
Duleepa Jayasundara; Isaam Saeed; Bill C.H. Chang; Sen-Lin Tang; Saman K. Halgamuge
Background: Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation rely on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are not useful for quasispecies populations where the strains are highly genetically related. The lack of knowledge on the number of different strains in a quasispecies population is observed to hinder the precision of existing Viral Quasispecies Spectrum Reconstruction (QSR) methods due to the uncontrolled reconstruction of a large number of in silico false positives. In this work, we formulated a novel probabilistic method for strain richness estimation specifically targeting viral quasispecies. By using this approach we improved our recently proposed spectrum reconstruction pipeline ViQuaS to achieve higher levels of precision in reconstructed quasispecies spectra without compromising the recall rates. We also discuss how one other existing popular QSR method named ShoRAH can be improved using this new approach.
Results: On benchmark data sets, our estimation method provided accurate richness estimates (< 0.2 median estimation error) and improved the precision of ViQuaS by 2%-13% and F-score by 1%-9% without compromising the recall rates. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5% respectively.
Conclusions: The proposed probabilistic estimation method can be used to estimate the richness of viral populations with a quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors.
Availability: http://sourceforge.net/projects/viquas/
Foundations of Computational Intelligence | 2009
Arthur L. Hsu; Isaam Saeed; Saman K. Halgamuge
In an effort to counter the restrictions enforced by the fixed map size and aspect ratio of a Kohonen Self-Organising Map, many variants to the method have been proposed. As a recent development, the Dynamic Self-Organising Map, also known as the Growing Self-Organising Map (GSOM), provides a balanced performance in topology preservation, data visualisation and computational speed. In this book chapter, a comprehensive description and theory of GSOM is provided, which also includes recent theoretical developments. Methods of clustering and identifying clusters using GSOM are also introduced here together with their related applications and results.
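As a rough illustration of how a growing map escapes a fixed size, the sketch below computes the growth threshold as it is commonly stated in the GSOM literature, GT = -D ln(SF), where D is the data dimensionality and SF is a user-chosen spread factor between 0 and 1. The abstract does not state this formula, so treat it as an assumption about the chapter's content; the function names `growth_threshold` and `should_grow` are hypothetical.

```python
import math


def growth_threshold(dimension: int, spread_factor: float) -> float:
    """Growth threshold GT = -D * ln(SF), assumed from the GSOM literature.

    A smaller spread factor gives a larger threshold and therefore a
    smaller, coarser map.
    """
    if dimension < 1:
        raise ValueError("dimension must be a positive integer")
    if not 0.0 < spread_factor < 1.0:
        raise ValueError("spread_factor must lie strictly between 0 and 1")
    return -dimension * math.log(spread_factor)


def should_grow(accumulated_error: float, threshold: float) -> bool:
    """A boundary node spawns new neighbours once its accumulated
    quantisation error exceeds the growth threshold (simplified rule)."""
    return accumulated_error > threshold
```

For example, with 4-dimensional data a spread factor of 0.5 gives a threshold of about 2.77, while a spread factor of 0.1 raises it to about 9.21, so nodes grow less readily.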