Tu Bao Ho

Japan Advanced Institute of Science and Technology

Publication

Featured research published by Tu Bao Ho.

International Journal of Intelligent Systems | 2002

Nonhierarchical Document Clustering Based on a Tolerance Rough Set Model

Tu Bao Ho; Ngoc Binh Nguyen

Document clustering, the grouping of documents into several clusters, has been recognized as a means for improving efficiency and effectiveness of information retrieval and text mining. With the growing importance of electronic media for storing and exchanging large textual databases, document clustering becomes more significant. Hierarchical document clustering methods, having a dominant role in document clustering, seem inadequate for large document databases as the time and space requirements are typically of order O(N^3) and O(N^2), where N is the number of index terms in a database. In addition, when each document is characterized by only several terms or keywords, clustering algorithms often produce poor results as most similarity measures yield many zero values. In this article we introduce a nonhierarchical document clustering algorithm based on a proposed tolerance rough set model (TRSM). This algorithm contributes two considerable features: (1) it can be applied to large document databases, as the time and space requirements are of order O(N log N) and O(N), respectively; and (2) it can be well adapted to documents characterized by a few terms due to the TRSM's ability of semantic calculation. The algorithm has been evaluated and validated by experiments on test collections.
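
As a rough illustration of how a tolerance rough set model can ease the zero-similarity problem for documents with only a few terms, here is a minimal sketch. It assumes a common formulation that the abstract does not spell out: the tolerance class of a term holds the terms that co-occur with it in at least `theta` documents, and the upper approximation of a document holds every term whose tolerance class intersects the document. All function names, the threshold, and the toy collection are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to the set of terms co-occurring with it in >= theta documents."""
    cooc = Counter()
    for doc in docs:
        for a, b in combinations(sorted(doc), 2):
            cooc[(a, b)] += 1
    terms = set().union(*docs)
    classes = {t: {t} for t in terms}  # every term tolerates itself
    for (a, b), n in cooc.items():
        if n >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Terms whose tolerance class shares at least one term with the document."""
    return {t for t, cls in classes.items() if cls & doc}


def jaccard(a, b):
    """Set overlap used here as a simple stand-in similarity measure."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy collection (hypothetical): each document is a set of index terms.
docs = [
    {"rough", "set"},
    {"rough", "set", "clustering"},
    {"set", "clustering", "retrieval"},
    {"clustering", "retrieval"},
    {"rough"},
    {"clustering"},
]
classes = tolerance_classes(docs, theta=2)
d1, d2 = docs[4], docs[5]  # two one-term documents with no term in common
raw = jaccard(d1, d2)
enriched = jaccard(upper_approximation(d1, classes),
                   upper_approximation(d2, classes))
print(raw, enriched)
```

In this toy collection the two one-term documents share nothing directly, yet their upper approximations overlap on a term that co-occurs with both, so the enriched similarity is no longer zero.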

BMC Bioinformatics | 2008

Finding microRNA regulatory modules in human genome using rule induction

Dang Hung Tran; Kenji Satou; Tu Bao Ho

Background: MicroRNAs (miRNAs) are a class of small non-coding RNA molecules (20–24 nt), which are believed to participate in repression of gene expression. They play important roles in several biological processes (e.g. cell death and cell growth). Both experimental and computational approaches have been used to determine the function of miRNAs in cellular processes. Most efforts have concentrated on identification of miRNAs and their target genes. However, understanding the regulatory mechanism of miRNAs in the gene regulatory network is also essential to the discovery of functions of miRNAs in complex cellular systems. To understand the regulatory mechanism of miRNAs in complex cellular systems, we need to identify the functional modules involved in complex interactions between miRNAs and their target genes. Results: We propose a rule-based learning method to identify groups of miRNAs and target genes that are believed to participate cooperatively in the post-transcriptional gene regulation, so-called miRNA regulatory modules (MRMs). Applying our method to human genes and miRNAs, we found 79 MRMs. The MRMs are produced from multiple information sources, including miRNA-target binding information, gene expression and miRNA expression profiles. Analysis of the first two MRMs shows that these MRMs consist of highly-related miRNAs and their target genes with respect to biological processes. Conclusion: The MRMs found by our method have high correlation in expression patterns of miRNAs as well as mRNAs. The mRNAs included in the same module shared similar biological functions, indicating the ability of our method to detect functionality-related genes. Moreover, review of the literature reveals that miRNAs in a module are involved in several types of human cancer.

Systems, Man, and Cybernetics | 2006

Multiple-attribute decision making under uncertainty: the evidential reasoning approach revisited

Van-Nam Huynh; Yoshiteru Nakamori; Tu Bao Ho; Tetsuya Murai

In multiple-attribute decision making (MADM) problems, one often needs to deal with decision information with uncertainty. During the last decade, Yang and Singh (1994) have proposed and developed an evidential reasoning (ER) approach to deal with such MADM problems. Essentially, this approach is based on an evaluation analysis model and Dempster's rule of combination in the Dempster-Shafer (D-S) theory of evidence. This paper reanalyzes the ER approach explicitly in terms of D-S theory and then proposes a general scheme of attribute aggregation in MADM under uncertainty. In the spirit of such a reanalysis, previous ER algorithms are reviewed and two other aggregation schemes are discussed. Theoretically, it is shown that the new aggregation schemes also satisfy the synthesis axioms, recently proposed by Yang and Xu (2002), which any rational aggregation process should satisfy. A numerical example traditionally examined in published sources on the ER approach is used to illustrate the discussed techniques.
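
For readers unfamiliar with Dempster's rule of combination, the following minimal sketch shows the plain rule applied to two mass functions. It is not the ER algorithm or any of the paper's aggregation schemes, which also involve attribute weights; the attribute names, grades, and mass values are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # product mass that would fall on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the two sources cannot be combined")
    # Renormalize so the remaining masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


GOOD, POOR = frozenset({"good"}), frozenset({"poor"})
EITHER = GOOD | POOR  # mass on the whole frame expresses ignorance

# Hypothetical assessments of one alternative on two attributes.
m_price = {GOOD: 0.6, EITHER: 0.4}
m_quality = {GOOD: 0.5, POOR: 0.3, EITHER: 0.2}
result = dempster_combine(m_price, m_quality)
print(result)
```

Mass assigned to the whole frame (`EITHER`) models incomplete assessments, which is one reason D-S theory suits decision information given under uncertainty.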

Archive | 2008

PRICAI 2008: Trends in Artificial Intelligence

Tu Bao Ho; Zhi-Hua Zhou

Keynotes.- What Shall We Do Next? The Challenges of AI Midway through Its First Century.- Exposing the Causal Structure of Processes by Learning CP-Logic Programs.- Building Structured Web Community Portals Via Extraction, Integration, and Mass Collaboration.- Large Scale Corpus Analysis and Recent Applications.- On the Computability and Complexity Issues of Extended RDF.- Toward Formalizing Common-Sense Psychology: An Analysis of the False-Belief Task.- Computing Stable Skeletons with Particle Filters.- Using Semantic Web Technologies for the Assessment of Open Questions.- Quantifying Commitment.- Temporal Data Mining for Educational Applications.- Dual Properties of the Relative Belief of Singletons.- Alternative Formulations of the Theory of Evidence Based on Basic Plausibility and Commonality Assignments.- Non-negative Sparse Principal Component Analysis for Multidimensional Constrained Optimization.- Sentence Compression by Removing Recursive Structure from Parse Tree.- An ATP of a Relational Proof System for Order of Magnitude Reasoning with Negligibility, Non-closeness and Distance.- A Heuristic Data Reduction Approach for Associative Classification Rule Hiding.- Evolutionary Computation Using Interaction among Genetic Evolution, Individual Learning and Social Learning.- Behavior Learning Based on a Policy Gradient Method: Separation of Environmental Dynamics and State Values in Policies.- Developing Evaluation Model of Topical Term for Document-Level Sentiment Classification.- Learning to Identify Comparative Sentences in Chinese Text.- Efficient Exhaustive Generation of Functional Programs Using Monte-Carlo Search with Iterative Deepening.- Identification of Subject Shareness for Korean-English Machine Translation.- Agent for Predicting Online Auction Closing Price in a Simulated Auction Environment.- Feature Selection Using Mutual Information: An Experimental Study.- Finding Orthogonal Arrays Using Satisfiability Checkers and Symmetry Breaking 
Constraints.- Statistical Model for Japanese Abbreviations.- A Novel Heuristic Algorithm for Privacy Preserving of Associative Classification.- Time-Frequency Analysis of Vietnamese Speech Inspired on Chirp Auditory Selectivity.- Meta-level Control of Multiagent Learning in Dynamic Repeated Resource Sharing Problems.- Ontology-Based Natural Query Retrieval Using Conceptual Graphs.- Optimal Multi-issue Negotiation in Open and Dynamic Environments.- The Density-Based Agglomerative Information Bottleneck.- State-Based Regression with Sensing and Knowledge.- Some Results on the Completeness of Approximation Based Reasoning.- KT and S4 Satisfiability in a Constraint Logic Environment.- Clustering with Feature Order Preferences.- Distributed Memory Bounded Path Search Algorithms for Pervasive Computing Environments.- Using Cost Distributions to Guide Weight Decay in Local Search for SAT.- Fault Resolution in Case-Based Reasoning.- Constrained Sequence Classification for Lexical Disambiguation.- Map Building by Sequential Estimation of Inter-feature Distances.- Document-Based HITS Model for Multi-document Summarization.- External Force for Active Contours: Gradient Vector Convolution.- Representation = Grounded Information.- Learning from the Past with Experiment Databases.- An Argumentation Framework Based on Conditional Priorities.- Knowledge Supervised Text Classification with No Labeled Documents.- Constrained Local Regularized Transducer for Multi-Component Category Classification.- Low Resolution Gait Recognition with High Frequency Super Resolution.- NIIA: Nonparametric Iterative Imputation Algorithm.- Mining Multidimensional Data through Element Oriented Analysis.- Evolutionary Feature Selections for Face Detection System.- A Probabilistic Approach to the Interpretation of Spoken Utterances.- Regular Papers.- Towards Autonomous Robot Operation: Path Map Generation of an Unknown Area by a New Trapezoidal Approximation Method Using a Self Guided Vehicle and Shortest 
Path Calculation by a Proposed SRS Algorithm.- Exploring Combinations of Ontological Features and Keywords for Text Retrieval.- Instance Management Problems in the Role Model of Hozo.- Advancing Topic Ontology Learning through Term Extraction.- Handling Unknown and Imprecise Attribute Values in Propositional Rule Learning: A Feature-Based Approach.- Fuzzy Knowledge Discovery from Time Series Data for Events Prediction.- Evolution of Migration Behavior with Multi-agent Simulation.- Constraint Relaxation Approach for Over-Constrained Agent Interaction.- Structure Extraction from Presentation Slide Information.- Combining Local and Global Resources for Constructing an Error-Minimized Opinion Word Dictionary.- An Improvement of PAA for Dimensionality Reduction in Large Time Series Databases.- Stability Margin for Linear Systems with Fuzzy Parametric Uncertainty.- An Imperative Account of Actions.- Natural Language Interface Construction Using Semantic Grammars.- Exploiting the Role of Named Entities in Query-Oriented Document Summarization.- A Probabilistic Model for Understanding Composite Spoken Descriptions.- Fuzzy Communication Reaching Consensus under Acyclic Condition.- Probabilistic Nogood Store as a Heuristic.- Semantic Filtering for DDL-Based Service Composition.- Prediction of Protein Functions from Protein Interaction Networks: A Naive Bayes Approach.- Multi-class Support Vector Machine Simplification.- A Syntactic-based Word Re-ordering for English-Vietnamese Statistical Machine Translation System.- A Multi-modal Particle Filter Based Motorcycle Tracking System.- Bayesian Inference on Hidden Knowledge in High-Throughput Molecular Biology Data.- Personalized Search Using ODP-based User Profiles Created from User Bookmark.- Domain-Driven Local Exceptional Pattern Mining for Detecting Stock Price Manipulation.- A Graph-Based Method for Combining Collaborative and Content-Based Filtering.- Hierarchical Differential Evolution for Parameter Estimation in Chemical 
Kinetics.- Differential Evolution Based on Improved Learning Strategy.- SalienceGraph: Visualizing Salience Dynamics of Written Discourse by Using Reference Probability and PLSA.- Learning Discriminative Sequence Models from Partially Labelled Data for Activity Recognition.- Feature Selection for Clustering on High Dimensional Data.- Availability of Web Information for Intercultural Communication.- Short Papers.- Mining Weighted Frequent Patterns in Incremental Databases.- Revision of Spatial Information by Containment.- Joint Power Control and Subcarrier Allocation in MC - CDMA Systems - An Intelligent Search Approach.- Domain-Independent Error-Based Simulation for Error-Awareness and Its Preliminary Evaluation.- A Characterization of Sensitivity Communication Robots Based on Mood Transition.- Recommendation Algorithm for Learning Materials That Maximizes Expected Test Scores.- A Hybrid Kansei Design Expert System Using Artificial Intelligence.- Solving the Contamination Minimization Problem on Networks for the Linear Threshold Model.- A Data-Driven Approach for Finding the Threshold Relevant to the Temporal Data Context of an Alarm of Interest.- Branch and Bound Algorithms to Solve Semiring Constraint Satisfaction Problems.- Image Analysis of the Relationship between Changes of Cornea and Postmortem Interval.- Context-Based Term Frequency Assessment for Text Classification.- Outlier Mining on Multiple Time Series Data in Stock Market.- Generating Interactive Facial Expression of Communication Robots Using Simple Recurrent Network.- Effects of Repair Support Agent for Accurate Multilingual Communication.- Towards Adapting XCS for Imbalance Problems.- Personalized Summarization Agent Using Non-negative Matrix Factorization.- Interactive Knowledge Acquisition and Scenario Authoring.- Reconstructing Hard Problems in a Human-Readable and Machine-Processable Way.- Evolving Intrusion Detection Rules on Mobile Ad Hoc Networks.- On the Usefulness of Interactive Computer 
Game Logs for Agent Modelling.- An Empirical Study on the Effect of Different Similarity Measures on User-Based Collaborative Filtering Algorithms.- Using Self-Organizing Maps with Learning Classifier System for Intrusion Detection.- New Particle Swarm Optimization Algorithm for Solving Degree Constrained Minimum Spanning Tree Problem.- Continuous Pitch Contour as an Improvement Feature for Music Information Retrieval by Humming/Singing.- Classification Using Improved Hybrid Wavelet Neural Networks.- Online Classifier Considering the Importance of Attributes.- An Improved Tabu Search Algorithm for 3D Protein Folding Problem.- Transferring Knowledge from Another Domain for Learning Action Models.- Texture and Target Orientation Estimation from Phase Congruency.- Query Classification and Expansion for Translation Mining Via Search Engines.

International Conference on Machine Learning | 2005

An efficient method for simplifying support vector machines

DucDung Nguyen; Tu Bao Ho

In this paper we describe a new method to reduce the complexity of support vector machines by reducing the number of necessary support vectors included in their solutions. The reduction process iteratively selects the two nearest support vectors belonging to the same class and replaces them with a newly constructed vector. Through an analysis of the relation between vectors in the input and feature spaces, we present a construction of new vectors that requires only finding the unique maximum point of a one-variable function on the interval (0, 1), rather than minimizing a function of many variables with local minima as in former reduced set methods. Experimental results on real-life datasets show that the proposed method is effective in reducing the number of support vectors while preserving the machine's generalization performance.
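
The one-variable search can be made concrete with a small sketch under assumptions the abstract does not state: a Gaussian kernel K(x, y) = exp(-gamma * ||x - y||^2), and a replacement vector z = k*x_i + (1-k)*x_j restricted to the segment between the two support vectors. Under those assumptions K(x_i, z) = c^((1-k)^2) and K(x_j, z) = c^(k^2) with c = K(x_i, x_j), so the best z on the segment maximizes h(k) = m*c^((1-k)^2) + (1-m)*c^(k^2), where m = a_i / (a_i + a_j). The function names and the grid search are illustrative choices, not the authors' implementation.

```python
import math


def gaussian_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * squared Euclidean distance)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def merge_pair(x_i, a_i, x_j, a_j, gamma, steps=10_000):
    """Replace two same-class support vectors (coefficients of equal sign) by one.

    Returns (z, k, beta), where z = k*x_i + (1-k)*x_j, k maximizes h on a
    grid strictly inside (0, 1), and beta is the coefficient of the new vector.
    """
    c = gaussian_kernel(x_i, x_j, gamma)
    m = a_i / (a_i + a_j)

    def h(k):
        return m * c ** ((1 - k) ** 2) + (1 - m) * c ** (k ** 2)

    # Plain grid search: it does not rely on h having a single peak.
    k = max((s / steps for s in range(1, steps)), key=h)
    z = [k * p + (1 - k) * q for p, q in zip(x_i, x_j)]
    # Projection of a_i*Phi(x_i) + a_j*Phi(x_j) onto Phi(z), using K(z, z) = 1.
    beta = (a_i + a_j) * h(k)
    return z, k, beta


z_eq, k_eq, beta_eq = merge_pair([0.0, 0.0], 1.0, [1.0, 0.0], 1.0, gamma=0.1)
z_w, k_w, beta_w = merge_pair([0.0, 0.0], 4.0, [1.0, 0.0], 1.0, gamma=0.1)
print(k_eq, z_eq, k_w, z_w)
```

With equal coefficients the new vector lands at the midpoint; a larger coefficient on `x_i` pulls it toward `x_i`.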

International Journal of Approximate Reasoning | 2002

A parametric representation of linguistic hedges in Zadeh's fuzzy logic

Van-Nam Huynh; Tu Bao Ho; Yoshiteru Nakamori

This paper proposes a model for the parametric representation of linguistic hedges in Zadeh's fuzzy logic. In this model each linguistic truth-value, which is generated from a primary term of the linguistic truth variable, is identified by a real number r depending on the primary term. It is shown that the model yields a method of efficiently computing linguistic truth expressions accompanied with a rich algebraic structure of the linguistic truth domain, namely De Morgan algebra. Also, a fuzzy logic based on the parametric representation of linguistic truth-values is introduced.

Knowledge Discovery and Data Mining | 2003

Mining hepatitis data with temporal abstraction

Tu Bao Ho; Trong Dung Nguyen; Saori Kawasaki; Si Quang Le; Dung Duc Nguyen; Hideto Yokoi; Katsuhiko Takabayashi

The hepatitis temporal database collected at Chiba University Hospital between 1982 and 2001 was recently released as a challenge to KDD research. The database is large: each patient corresponds to 983 tests represented as sequences of irregularly timestamped points with different lengths. This paper presents a temporal abstraction approach to mining knowledge from this hepatitis database. Exploiting hepatitis background knowledge and data analysis, we introduce new notions and methods for abstracting short-term changed and long-term changed tests. The abstracted data allow us to apply different machine learning methods for finding knowledge, part of which is considered new and interesting by medical doctors.

Pattern Recognition | 2008

An efficient kernel matrix evaluation measure

Canh Hao Nguyen; Tu Bao Ho

We study the problem of evaluating the goodness of a kernel matrix for a classification task. As kernel matrix evaluation is usually used in other expensive procedures like feature and model selections, the goodness measure must be calculated efficiently. Most previous approaches are not efficient except for kernel target alignment (KTA) that can be calculated in O(n^2) time complexity. Although KTA is widely used, we show that it has some serious drawbacks. We propose an efficient surrogate measure to evaluate the goodness of a kernel matrix based on the data distributions of classes in the feature space. The measure not only overcomes the limitations of KTA but also possesses other properties like invariance, efficiency and an error bound guarantee. Comparative experiments show that the measure is a good indication of the goodness of a kernel matrix.
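
For reference, kernel target alignment in its standard form is the normalized Frobenius inner product between the kernel matrix K and the ideal kernel yy^T. The sketch below computes it in O(n^2) time for labels in {-1, +1}; it illustrates the baseline measure only, not the surrogate measure proposed in the paper, and the function name is hypothetical.

```python
import math


def kernel_target_alignment(K, y):
    """KTA = <K, yy^T>_F / (||K||_F * ||yy^T||_F) for labels y_i in {-1, +1}."""
    n = len(y)
    inner = sum(K[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    norm_k = math.sqrt(sum(K[i][j] ** 2 for i in range(n) for j in range(n)))
    return inner / (norm_k * n)  # ||yy^T||_F equals n when every y_i is +1 or -1


y = [1, 1, -1, -1]
ideal = [[yi * yj for yj in y] for yi in y]                 # K = yy^T
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
kta_ideal = kernel_target_alignment(ideal, y)
kta_identity = kernel_target_alignment(identity, y)
print(kta_ideal, kta_identity)
```

The ideal kernel scores 1, while the identity kernel, which carries no class information, still scores 0.5 on this balanced four-point example.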

European Conference on Principles of Data Mining and Knowledge Discovery | 2000

Hierarchical Document Clustering Based on Tolerance Rough Set Model

Saori Kawasaki; Ngoc Binh Nguyen; Tu Bao Ho

Clustering is a powerful tool for knowledge discovery in text collections. The quality of document clustering depends not only on clustering algorithms but also on document representation models. We develop a hierarchical document clustering algorithm based on a tolerance rough set model (TRSM) for representing documents, which offers a way of considering semantic relatedness between documents. The results of validation and evaluation of this method suggest that this clustering algorithm can be well adapted to text mining.

Pattern Recognition Letters | 2005

An association-based dissimilarity measure for categorical data

Si Quang Le; Tu Bao Ho

In this paper, we propose a novel method to measure the dissimilarity of categorical data. The key idea is to consider the dissimilarity between two categorical values of an attribute as a combination of dissimilarities between the conditional probability distributions of other attributes given these two values. Experiments with real data show that our dissimilarity estimation method improves the accuracy of the popular nearest neighbor classifier.
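
The key idea can be sketched as follows. The abstract does not say which distance between conditional distributions is used or how the per-attribute values are combined, so this sketch assumes total variation distance and a plain average over the other attributes; the function names and the toy table are hypothetical.

```python
from collections import Counter


def conditional_dist(rows, attr, value, other):
    """Empirical distribution of column `other` among rows where column `attr` == value.

    Assumes `value` occurs at least once in column `attr`.
    """
    counts = Counter(r[other] for r in rows if r[attr] == value)
    total = sum(counts.values())
    return {v: n / total for v, n in counts.items()}


def total_variation(p, q):
    """Half the L1 distance between two discrete distributions (assumed choice)."""
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in set(p) | set(q))


def value_dissimilarity(rows, attr, v1, v2):
    """Average, over the other attributes, of the distance between the
    conditional distributions given attr == v1 and given attr == v2."""
    others = [a for a in range(len(rows[0])) if a != attr]
    return sum(
        total_variation(conditional_dist(rows, attr, v1, o),
                        conditional_dist(rows, attr, v2, o))
        for o in others
    ) / len(others)


# Toy table (hypothetical): columns are color, size, shape.
rows = [
    ("red", "big", "round"),
    ("red", "big", "round"),
    ("crimson", "big", "round"),
    ("crimson", "big", "square"),
    ("blue", "small", "square"),
    ("blue", "small", "square"),
]
d_red_crimson = value_dissimilarity(rows, 0, "red", "crimson")
d_red_blue = value_dissimilarity(rows, 0, "red", "blue")
print(d_red_crimson, d_red_blue)
```

Values that appear alongside similar values of the other attributes come out close, even though a simple matching measure would treat every pair of distinct values as equally different.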