Publication


Featured research published by Thuc Duy Le.


Bioinformatics | 2013

Inferring microRNA–mRNA causal regulatory relationships from expression data

Thuc Duy Le; Lin Liu; Anna Tsykin; Gregory J. Goodall; Bing Liu; Bingyu Sun; Jiuyong Li

MOTIVATION: MicroRNAs (miRNAs) are known to play an essential role in post-transcriptional gene regulation in plants and animals. Several computational approaches have been developed with a shared aim of elucidating miRNA-mRNA regulatory relationships. Although these existing methods discover statistical relationships, such as correlations and associations between miRNAs and mRNAs at the data level, such statistical relationships are not necessarily the real causal regulatory relationships that would ultimately provide useful insights into the causes of gene regulation. The standard method for determining causal relationships is randomized controlled perturbation experiments. In practice, however, such experiments are expensive and time-consuming. Our motivation for this study is to discover miRNA-mRNA causal regulatory relationships from observational data.

RESULTS: We present a causality discovery-based method to uncover the causal regulatory relationships between miRNAs and mRNAs, using expression profiles of miRNAs and mRNAs without taking into consideration prior target information. We apply this method to the epithelial-to-mesenchymal transition (EMT) datasets and validate the computational discoveries by a controlled biological experiment for the miR-200 family. A significant portion of the regulatory relationships discovered in data is consistent with those identified by experiments. In addition, the top genes that are causally regulated by miRNAs are highly relevant to the biological conditions of the datasets. The results indicate that the causal discovery method effectively discovers miRNA regulatory relationships in data. Although computational predictions may not completely replace intervention experiments, accurate and reliable discoveries in data are cost-effective for the design of miRNA experiments and the understanding of miRNA-mRNA regulatory relationships.
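
The gap between association and causation that this abstract describes can be illustrated with a small sketch. It is not the method of the paper: it only shows, on synthetic data invented for this example, how a correlation between a miRNA and an mRNA can disappear once a common regulator is conditioned on, which is the kind of conditional independence check that constraint-based causal discovery builds on. All variable names and the data-generating model are assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys given zs."""
    rxy, rxz, ryz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Synthetic data (an assumption, not from the paper): a common regulator
# drives both a miRNA and an mRNA, which do not affect each other.
rng = random.Random(0)
regulator = [rng.gauss(0, 1) for _ in range(2000)]
mirna = [r + rng.gauss(0, 1) for r in regulator]
mrna = [r + rng.gauss(0, 1) for r in regulator]

marginal = pearson(mirna, mrna)
conditional = partial_corr(mirna, mrna, regulator)
print(f"marginal r = {marginal:.2f}, given regulator = {conditional:.2f}")
```

In this toy setting the marginal correlation is clearly positive while the partial correlation is close to zero, so a purely correlation-based method would report a relationship that a conditional test would reject.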


BMC Bioinformatics | 2013

Inferring microRNA and transcription factor regulatory networks in heterogeneous data

Thuc Duy Le; Lin Liu; Bing Liu; Anna Tsykin; Gregory J. Goodall; Kenji Satou; Jiuyong Li

Background: Transcription factors (TFs) and microRNAs (miRNAs) are primary metazoan gene regulators. Regulatory mechanisms of these two main regulators are of great interest to biologists and may provide insights into the causes of diseases. However, the interplay between miRNAs and TFs in a regulatory network remains unexplored. Currently, it is very difficult to study the regulatory mechanisms that involve both miRNAs and TFs in a biological lab. Even at the data level, a network involving miRNAs, TFs and genes is too complicated to construct. Previous research has been mostly directed at inferring either miRNA or TF regulatory networks from data. However, networks involving a single type of regulator may not fully reveal the complex gene regulatory mechanisms, for instance, the way in which a TF indirectly regulates a gene via a miRNA.

Results: We propose a framework to learn from heterogeneous data the three-component regulatory networks, with the presence of miRNAs, TFs, and mRNAs. This method first utilises Bayesian network structure learning to construct a regulatory network from multiple sources of data: gene expression profiles of miRNAs, TFs and mRNAs, target information based on sequence data, and sample categories. Then, in order to produce more meaningful results for further biological experimentation and research, the method searches the learnt network to identify the interplay between miRNAs and TFs and applies a network motif finding algorithm to further infer the network. We apply the proposed framework to the data sets of epithelial-to-mesenchymal transition (EMT). The results elucidate the complex gene regulatory mechanism for EMT which involves both TFs and miRNAs. Several discovered interactions and molecular functions have been confirmed by literature. In addition, many other discovered interactions and bio-markers are of high statistical significance and thus can be good candidates for validation by experiments. Moreover, the results generated by our method are compact, involving a small number of interactions which have been shown to be highly relevant to EMT.

Conclusions: We have designed a framework to infer gene regulatory networks involving both TFs and miRNAs from multiple sources of data, including gene expression data, target information, and sample categories. Results on the EMT data sets have shown that the proposed approach is able to produce compact and meaningful gene regulatory networks that are highly relevant to the biological conditions of the data sets. This framework has the potential for application to other heterogeneous datasets to reveal the complex gene regulatory relationships.
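
The abstract mentions searching a learnt network for miRNA-TF interplay and applying a motif finding algorithm, but does not say which motifs or which algorithm. As a rough illustration only, the sketch below searches a directed network for one well-known motif, the feed-forward loop in which a TF regulates a miRNA and both regulate the same gene. The function name, the node labels and the toy network are hypothetical.

```python
def find_feed_forward_loops(edges, node_type):
    """Return sorted (tf, mirna, gene) triples with tf->mirna, mirna->gene, tf->gene."""
    targets = {}
    for src, dst in set(edges):
        targets.setdefault(src, set()).add(dst)

    loops = []
    for tf, tf_targets in targets.items():
        if node_type.get(tf) != "TF":
            continue
        for mirna in tf_targets:
            if node_type.get(mirna) != "miRNA":
                continue
            for gene in targets.get(mirna, ()):
                # The loop closes only if the TF also targets the gene directly.
                if node_type.get(gene) == "mRNA" and gene in tf_targets:
                    loops.append((tf, mirna, gene))
    return sorted(loops)


# Toy network with placeholder names (not taken from the paper).
node_type = {
    "TF1": "TF", "TF2": "TF",
    "miR-a": "miRNA",
    "geneX": "mRNA", "geneY": "mRNA",
}
edges = [
    ("TF1", "miR-a"), ("miR-a", "geneX"), ("TF1", "geneX"),
    ("miR-a", "geneY"), ("TF2", "geneY"),
]
loops = find_feed_forward_loops(edges, node_type)
print(loops)
```

Here only geneX is reached both directly from TF1 and indirectly through miR-a, so it is the only gene reported as part of a feed-forward loop.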


Briefings in Bioinformatics | 2016

Computational methods for identifying miRNA sponge interactions

Thuc Duy Le; Lin Liu; Jiuyong Li

Recent findings show that coding genes are not the only targets that miRNAs interact with. In fact, there is a pool of different RNAs competing with each other to attract miRNAs for interactions, thus acting as competing endogenous RNAs (ceRNAs). The ceRNAs indirectly regulate each other via the titration mechanism, i.e. an increasing concentration of one ceRNA decreases the number of miRNAs that are available for interacting with other targets. The cross-talk between ceRNAs, i.e. their interactions mediated by miRNAs, has been identified as a driver in many disease conditions, including cancers. In recent years, some computational methods have emerged for identifying ceRNA-ceRNA interactions. However, there remain great challenges and opportunities for developing computational methods to provide new insights into ceRNA regulatory mechanisms. In this paper, we review the publicly available databases of ceRNA-ceRNA interactions and the computational methods for identifying ceRNA-ceRNA interactions (also known as miRNA sponge interactions). We also conduct a comparison study of the methods with a breast cancer dataset. Our aim is to provide a current snapshot of the advances of the computational methods in identifying miRNA sponge interactions and to discuss the remaining challenges.
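
Because ceRNAs interact through the miRNAs they compete for, a natural first screen for a candidate ceRNA pair is whether the two RNAs share more miRNAs than expected by chance. The abstract does not list the tests used by the reviewed methods, so the following is only a sketch under that assumption: a one-sided hypergeometric test on the number of shared miRNAs. The function name and the example numbers are hypothetical.

```python
from math import comb


def shared_mirna_pvalue(total, n_a, n_b, shared):
    """P(at least `shared` common miRNAs) under a hypergeometric null.

    total  : number of miRNAs considered
    n_a    : miRNAs that target RNA A
    n_b    : miRNAs that target RNA B
    shared : miRNAs that target both
    """
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, i) * comb(total - n_a, n_b - i)
        for i in range(shared, upper + 1)
    )
    return tail / comb(total, n_b)


# Hypothetical counts: 10 miRNAs in total, each RNA is targeted by 3 of
# them, and all 3 are shared.
p_value = shared_mirna_pvalue(total=10, n_a=3, n_b=3, shared=3)
print(f"p = {p_value:.4f}")
```

A small p-value only says that the overlap is unlikely under random targeting; it does not by itself establish a sponge interaction.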


International Conference on Data Mining | 2013

Mining Causal Association Rules

Jiuyong Li; Thuc Duy Le; Lin Liu; Jixue Liu; Zhou Jin; Bingyu Sun

Discovering causal relationships is the ultimate goal of many scientific explorations. Causal relationships can be identified with controlled experiments, but such experiments are often very expensive and sometimes impossible to conduct. On the other hand, the collection of observational data has increased dramatically in recent decades. Therefore it is desirable to find causal relationships from the data directly. Significant progress has been made in the field of discovering causal relationships using the Causal Bayesian Network (CBN) theory. The applications of CBNs, however, are greatly limited due to the high computational complexity. In another direction, association rule mining has been shown to be an efficient data mining means for relationship discovery. However, although causal relationships imply associations, the reverse does not always hold. In this paper we study how to use an efficient association mining approach to discover potential causal rules in observational data. We make use of the idea of retrospective cohort studies, a widely used approach in medical and social research, to detect causal association rules. In comparison with the constraint-based methods within the CBN paradigm, the proposed approach is faster and is capable of finding a cause consisting of combined variables.


PLOS ONE | 2015

Ensemble Methods for MiRNA Target Prediction from Expression Data

Thuc Duy Le; Lin Liu; Jiuyong Li

Background: MicroRNAs (miRNAs) are short regulatory RNAs that are involved in several diseases, including cancers. Identifying miRNA functions is very important in understanding disease mechanisms and determining the efficacy of drugs. An increasing number of computational methods have been developed to explore miRNA functions by inferring miRNA-mRNA regulatory relationships from data. Each of the methods is developed based on some assumptions and constraints, for instance, assuming linear relationships between variables. For such reasons, computational methods are often subject to the problem of inconsistent performance across different datasets. On the other hand, ensemble methods integrate the results from individual methods and have been proved to outperform each of their individual component methods in theory.

Results: In this paper, we investigate the performance of some ensemble methods over the commonly used miRNA target prediction methods. We apply eight different popular miRNA target prediction methods to three cancer datasets, and compare their performance with the ensemble methods which integrate the results from each combination of the individual methods. The validation results using experimentally confirmed databases show that the results of the ensemble methods complement those obtained by the individual methods and the ensemble methods perform better than the individual methods across different datasets. The ensemble method Pearson+IDA+Lasso, which combines methods from different approaches (a correlation method, a causal inference method, and a regression method), is the best-performing ensemble method in this study. Further analysis of the results of this ensemble method shows that it can obtain more targets which could not be found by any of the single methods, and the discovered targets are more statistically significant and functionally enriched. The source code, datasets, miRNA target predictions by all methods, and the ground truth for validation are available in the Supplementary materials.
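
The abstract does not state how the rankings produced by the individual methods are integrated. One standard way to combine ranked lists is a Borda count, which is assumed here purely for illustration: each method gives a candidate target a score based on its rank, and the scores are summed. The function name and the gene labels are hypothetical.

```python
def borda_ensemble(rankings):
    """Combine ranked lists (best first) into one list by summed Borda score."""
    candidates = set().union(*rankings)
    n = len(candidates)
    scores = {c: 0 for c in candidates}
    for ranking in rankings:
        position = {c: i for i, c in enumerate(ranking)}
        for c in candidates:
            # Top rank earns n points; items a method did not rank earn 0.
            scores[c] += (n - position[c]) if c in position else 0
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda c: (-scores[c], c))


# Placeholder rankings standing in for three individual methods.
pearson_rank = ["geneA", "geneB", "geneC"]
ida_rank = ["geneB", "geneA", "geneD"]
lasso_rank = ["geneB", "geneC", "geneA"]

consensus = borda_ensemble([pearson_rank, ida_rank, lasso_rank])
print(consensus)
```

A rank-based combination like this avoids having to put correlation coefficients, causal effects and regression weights on a common scale.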


International Conference on Data Mining | 2012

Discovery of Causal Rules Using Partial Association

Zhou Jin; Jiuyong Li; Lin Liu; Thuc Duy Le; Bingyu Sun; Rujing Wang

Discovering causal relationships in large databases of observational data is challenging. The pioneering work in this area was rooted in the theory of Bayesian network (BN) learning, which, however, is an NP-complete problem. Hence several constraint-based algorithms have been developed to efficiently discover causations in large databases. These methods usually use the idea of BN learning, directly or indirectly, and are focused on causal relationships with single cause variables. In this paper, we propose an approach to mine causal rules in large databases of binary variables. Our method expands the scope of causality discovery to causal relationships with multiple cause variables, and we utilise partial association tests to exclude noncausal associations and so ensure the high reliability of discovered causal rules. Furthermore, an efficient algorithm is designed for the tests in large databases. We assess the method with a set of real-world diagnostic data. The results show that our method can effectively discover interesting causal rules in large databases.
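
A partial association test asks whether two binary variables are still associated after stratifying on other variables. The abstract does not name the specific test, so the sketch below assumes the classical Mantel-Haenszel statistic with continuity correction as one example of such a test. The function name, the counts and the 3.841 threshold (chi-square with one degree of freedom at the 0.05 level) are choices made for this illustration.

```python
def mantel_haenszel_chi2(strata):
    """Continuity-corrected Mantel-Haenszel statistic over 2x2 strata.

    Each stratum is (a, b, c, d):
      a = cause present, effect present    b = cause present, effect absent
      c = cause absent,  effect present    d = cause absent,  effect absent
    """
    diff = 0.0
    var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff += a - row1 * col1 / n  # observed minus expected count
        var += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var == 0:
        return 0.0
    return (abs(diff) - 0.5) ** 2 / var


# Hypothetical counts: within each stratum of a third variable the cause
# and effect are independent, but pooling the strata creates an association.
stratified = [(36, 4, 18, 2), (2, 18, 4, 36)]
pooled = [(38, 22, 22, 38)]

stat_pooled = mantel_haenszel_chi2(pooled)
stat_stratified = mantel_haenszel_chi2(stratified)
print(f"pooled: {stat_pooled:.2f}, stratified: {stat_stratified:.2f}")
```

With these counts the pooled statistic exceeds 3.841 while the stratified one does not, so the association would be excluded as noncausal once the third variable is controlled for.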


Briefings in Bioinformatics | 2015

From miRNA regulation to miRNA–TF co-regulation: computational approaches and challenges

Thuc Duy Le; Lin Liu; Bing Liu; Jiuyong Li

MicroRNAs (miRNAs) are important gene regulators. They control a wide range of biological processes and are involved in several types of cancers. Thus, exploring miRNA functions is important for diagnostics and therapeutics. To date, there are few feasible experimental techniques for discovering miRNA regulatory mechanisms. Alternatively, predictions of miRNA-mRNA regulatory relationships by computational methods have increasingly achieved promising results. Computational approaches are proving to be effective tools for reducing the number of biological experiments that must be conducted and for assisting with the design of the experiments. In this review, we categorize and review different computational approaches to identify miRNA activities and functions, including the co-regulation of miRNAs and transcription factors. Our main focus is on recent approaches that use multiple data types for exploring miRNA functions. We discuss the remaining challenges in the evaluation and selection of models based on the results from a case study. Finally, we analyse the remaining challenges of each computational approach and suggest some future research directions.


ACM Transactions on Intelligent Systems and Technology | 2016

From Observational Studies to Causal Rule Mining

Jiuyong Li; Thuc Duy Le; Lin Liu; Jixue Liu; Zhou Jin; Bingyu Sun; Saisai Ma

Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct RCTs. Therefore, observational studies based on passively observed data are widely accepted as an alternative to RCTs. However, in observational studies, prior knowledge is required to generate the hypotheses about the cause-effect relationships to be tested, and hence they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many datasets are of high dimensionality, which leaves observational studies out of the opportunities for causal discovery from such a wealth of data sources. In another direction, many efficient data mining methods have been developed to identify associations among variables in large datasets. The problem is that causal relationships imply associations, but the reverse is not always true. However, we can see the synergy between the two paradigms here. Specifically, association rule mining can be used to deal with the high-dimensionality problem, whereas observational studies can be utilised to eliminate noncausal associations. In this article, we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large datasets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real-world datasets have demonstrated the effectiveness and efficiency of CR mining. In comparison with the commonly used causal discovery methods, the proposed approach generally is faster and has better or competitive performance in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables—a feature that other causal discovery methods do not possess.
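
The retrospective cohort idea in this abstract can be sketched as follows: records with the candidate cause (exposed) are matched to records without it (unexposed) that have the same values on the control variables, and the association is then measured on the matched pairs only. The code below is a simplified illustration under assumptions, not the published algorithm: the pairing order, the 0.5 correction and all names are choices made for this example.

```python
from collections import defaultdict


def matched_pair_odds_ratio(records, exposure, outcome, controls):
    """Odds ratio from exposed/unexposed records matched on control values."""
    # For each combination of control values keep two lists of outcomes:
    # index 0 for exposed records, index 1 for unexposed records.
    groups = defaultdict(lambda: ([], []))
    for rec in records:
        key = tuple(rec[c] for c in controls)
        groups[key][0 if rec[exposure] else 1].append(rec[outcome])

    only_exposed = only_unexposed = 0
    for exposed, unexposed in groups.values():
        # Pair records in input order; unmatched leftovers are dropped.
        for e_out, u_out in zip(exposed, unexposed):
            if e_out and not u_out:
                only_exposed += 1
            elif u_out and not e_out:
                only_unexposed += 1
    # Only discordant pairs are informative. Adding 0.5 to both counts
    # (an assumption of this sketch) avoids division by zero.
    return (only_exposed + 0.5) / (only_unexposed + 0.5)


# Hypothetical binary data: x = candidate cause, z = control, y = outcome.
rows = [
    (1, 0, 1), (1, 0, 1), (1, 0, 1), (1, 0, 0),
    (0, 0, 0), (0, 0, 0), (0, 0, 1), (0, 0, 0),
    (1, 1, 1), (1, 1, 0),
    (0, 1, 0), (0, 1, 1), (0, 1, 1),
]
records = [{"x": x, "z": z, "y": y} for x, z, y in rows]
odds_ratio = matched_pair_odds_ratio(records, "x", "y", ["z"])
print(f"matched-pair odds ratio = {odds_ratio:.2f}")
```

An odds ratio well above 1 on the matched pairs suggests that the association survives after the control variable is balanced; in practice a confidence interval or significance test would be needed before calling the rule causal.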


BMC Genomics | 2016

Identification of miRNA-mRNA regulatory modules by exploring collective group relationships

S. M. Masud Karim; Lin Liu; Thuc Duy Le; Jiuyong Li

Background: MicroRNAs (miRNAs) play an essential role in post-transcriptional gene regulation in plants and animals. They regulate a wide range of biological processes by targeting messenger RNAs (mRNAs). Evidence suggests that miRNAs and mRNAs interact collectively in gene regulatory networks. The collective relationships between groups of miRNAs and groups of mRNAs may be more readily interpreted than those between individual miRNAs and mRNAs, and thus are useful for gaining insight into gene regulation and cell functions. Several computational approaches have been developed to discover miRNA-mRNA regulatory modules (MMRMs) with a common aim of elucidating miRNA-mRNA regulatory relationships. However, most existing methods do not consider the collective relationships between a group of miRNAs and the group of targeted mRNAs in the process of discovering MMRMs. Our aim is to develop a framework to discover MMRMs and reveal miRNA-mRNA regulatory relationships from heterogeneous expression data based on the collective relationships.

Results: We propose DIscovering COllective group RElationships (DICORE), an effective computational framework for revealing miRNA-mRNA regulatory relationships. We utilize the notion of collective group relationships to build the computational framework. The method computes the collaboration scores of the miRNAs and mRNAs on the basis of their interactions with mRNAs and miRNAs, respectively. Then it determines the groups of miRNAs and groups of mRNAs separately based on their respective collaboration scores. Next, it calculates the strength of the collective relationship between each pair of miRNA group and mRNA group using canonical correlation analysis, and the group pairs with significant canonical correlations are considered as the MMRMs. We applied this method to three gene expression datasets, and validated the computational discoveries.

Conclusions: Analysis of the results demonstrates that a large portion of the regulatory relationships discovered by DICORE is consistent with the experimentally confirmed databases. Furthermore, it is observed that the top mRNAs that are regulated by the miRNAs in the identified MMRMs are highly relevant to the biological conditions of the given datasets. It is also shown that the MMRMs identified by DICORE are more biologically significant and functionally enriched.


Bioinformatics | 2017

CancerSubtypes: An R/Bioconductor package for molecular cancer subtype identification, validation and visualization

Taosheng Xu; Thuc Duy Le; Lin Liu; Ning Su; Rujing Wang; Bingyu Sun; Antonio Colaprico; Gianluca Bontempi; Jiuyong Li

Summary: Identifying molecular cancer subtypes from multi-omics data is an important step in personalized medicine. We introduce CancerSubtypes, an R package for identifying cancer subtypes using multi-omics data, including gene expression, miRNA expression and DNA methylation data. CancerSubtypes integrates four main computational methods which are highly cited for cancer subtype identification and provides a standardized framework for data pre-processing, feature selection, and result follow-up analyses, including results computing, biology validation and visualization. The input and output of each step in the framework are packaged in the same data format, making it convenient to compare different methods. The package is useful for inferring cancer subtypes from an input genomic dataset, comparing the predictions from different well-known methods and testing new subtype discovery methods, as shown with different application scenarios in the Supplementary Material.

Availability and implementation: The package is implemented in R and available under GPL-2 license from the Bioconductor website (http://bioconductor.org/packages/CancerSubtypes/).

Supplementary information: Supplementary data are available at Bioinformatics online.

Collaboration


Top Co-Authors

Jiuyong Li, University of South Australia

Lin Liu, University of South Australia

Saisai Ma, University of South Australia

Bing Liu, University of New South Wales

Bingyu Sun, Chinese Academy of Sciences

Gregory J. Goodall, University of South Australia

Jixue Liu, University of South Australia

Jianfeng He, Kunming University of Science and Technology

Taosheng Xu, Hefei Institutes of Physical Science

Weijia Zhang, University of South Australia