Saori Kawasaki

Japan Advanced Institute of Science and Technology

Publication

Featured research published by Saori Kawasaki.

Knowledge Discovery and Data Mining | 2003

Mining hepatitis data with temporal abstraction

Tu Bao Ho; Trong Dung Nguyen; Saori Kawasaki; Si Quang Le; Dung Duc Nguyen; Hideto Yokoi; Katsuhiko Takabayashi

The hepatitis temporal database collected at Chiba University Hospital between 1982 and 2001 was recently provided as a challenge to the KDD research community. The database is large: each patient corresponds to 983 tests, represented as sequences of irregularly timestamped points with different lengths. This paper presents a temporal abstraction approach to mining knowledge from this hepatitis database. Exploiting hepatitis background knowledge and data analysis, we introduce new notions and methods for abstracting short-term changed and long-term changed tests. The abstracted data allow us to apply different machine learning methods to find knowledge, part of which medical doctors consider new and interesting.

European Conference on Principles of Data Mining and Knowledge Discovery | 2000

Hierarchical Document Clustering Based on Tolerance Rough Set Model

Saori Kawasaki; Ngoc Binh Nguyen; Tu Bao Ho

Clustering is a powerful tool for knowledge discovery in text collections. The quality of document clustering depends not only on clustering algorithms but also on document representation models. We develop a hierarchical document clustering algorithm based on a tolerance rough set model (TRSM) for representing documents, which offers a way of considering semantic relatedness between documents. The results of validation and evaluation of this method suggest that this clustering algorithm can be well adapted to text mining.

International Journal on Artificial Intelligence Tools | 2001

Visualization Support for User-Centered Model Selection in Knowledge Discovery and Data Mining

Tu Bao Ho; Trong Dung Nguyen; DucDung Nguyen; Saori Kawasaki

The problem of model selection in knowledge discovery and data mining (the selection of appropriate discovered patterns/models, or of algorithms to achieve such patterns/models) is generally a difficult task for the user, as it requires meta-knowledge of algorithms/models and model performance metrics. Viewing knowledge discovery as a human-centered process that requires effective collaboration between the user and the discovery system, our work aims to make model selection in knowledge discovery easier and more effective. For such a collaboration, our solution is to give the user the ability to easily try various alternatives and to compare competing models quantitatively and qualitatively. The basic idea of our solution is to integrate data and knowledge visualization with the knowledge discovery process in order to support the participation of the user. We introduce the knowledge discovery system D2MS, in which several visualization techniques for data and knowledge are developed and integrated into the steps of the knowledge discovery process. The visualizers in D2MS greatly help the user gain better insight into each step of the knowledge discovery process as well as the relationship between data and discovered knowledge in the whole process.

New Generation Computing | 2007

Exploiting temporal relations in mining hepatitis data

Tu Bao Ho; Canh Hao Nguyen; Saori Kawasaki; Si Quang Le; Katsuhiko Takabayashi

Various data mining methods have been developed in the last few years for hepatitis study using a large temporal and relational database given to the research community. In this work we introduce a novel temporal abstraction method to this study by detecting and exploiting temporal patterns and relations between events in viral hepatitis, such as “event A happened slightly before event B, and B ended simultaneously with event C”. We developed algorithms to first detect significant temporal patterns in temporal sequences and then to identify temporal relations between these temporal patterns. Many findings obtained by applying data mining methods to transactions/graphs of temporal relations were shown to be significant by physician evaluation and by matching against the literature published in Medline.

Intelligent Exploration of the Web | 2003

Documents clustering using tolerance rough set model and its application to information retrieval

Tu Bao Ho; Saori Kawasaki; Ngoc Binh Nguyen

Clustering is a powerful tool for analyzing and finding useful information in text collections. However, document clustering is a difficult clustering problem because of the unstructured form and textual characteristics of documents. As a consequence, the quality of document clustering depends not only on clustering algorithms but also on document representation models. In this work we introduce a tolerance rough set model (TRSM) for representing documents as an alternative way of considering semantic relatedness between documents. Using TRSM we develop two clustering algorithms for documents, one hierarchical and one nonhierarchical, and apply these clustering methods to information retrieval. The TRSM clustering methods and the TRSM cluster-based information retrieval method are carefully evaluated and validated by comparative experiments on test collections.
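
The abstract does not spell out the model, so the following is only a minimal sketch of how a tolerance rough set representation is commonly formulated, not the authors' implementation. It assumes that two terms are tolerant of each other when they co-occur in at least `theta` documents, and that the upper approximation of a document contains every term whose tolerance class overlaps the document. The function names and the `theta` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class (assumed co-occurrence definition).

    docs  : iterable of term collections, one per document
    theta : minimum number of documents two terms must share (assumption)
    """
    docs = [set(d) for d in docs]
    cooccurrence = Counter()
    for d in docs:
        # Count each unordered term pair once per document.
        for a, b in combinations(sorted(d), 2):
            cooccurrence[(a, b)] += 1

    # Every term is tolerant of itself (reflexivity).
    classes = {t: {t} for d in docs for t in d}
    for (a, b), count in cooccurrence.items():
        if count >= theta:
            # Tolerance is symmetric but not necessarily transitive.
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Terms whose tolerance class shares at least one term with the document."""
    doc = set(doc)
    return {t for t, cls in classes.items() if cls & doc}


docs = [
    {"rough", "set"},
    {"rough", "set", "cluster"},
    {"cluster", "text"},
]
classes = tolerance_classes(docs, theta=2)
enriched = upper_approximation({"rough"}, classes)
```

In this toy collection, "rough" and "set" co-occur in two documents, so a document containing only "rough" is enriched with "set". Clustering would then operate on these enriched representations instead of the raw term sets.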

2008 IEEE International Conference on Research, Innovation and Vision for the Future in Computing and Communication Technologies | 2008

Simple but effective methods for combining kernels in computational biology

Hiroaki Tanabe; Tu Bao Ho; Canh Hao Nguyen; Saori Kawasaki

Complex biological data generated from various experiments are stored in diverse data types across multiple datasets. By representing each biological dataset as a kernel matrix and then combining the matrices to solve problems, the kernel-based approach has come into the spotlight in data integration and its applications in bioinformatics and other fields. While the linear combination of unweighted multiple kernels (UMK) is popular, there have been efforts on multiple kernel learning (MKL), where optimal weights are learned by semi-definite programming or sequential minimal optimization (SMO-MKL). These methods achieve high accuracy on biological prediction problems but are very complicated and hard to use, especially for non-experts in optimization. They also usually have a high computational cost and are not suitable for large data sets. In this paper, we propose two simple but effective methods for determining weights for the conic combination of multiple kernels. The first, denoted FSM-MKL, learns optimal weights formulated with FSM, our feature space-based kernel matrix evaluation measure. The second, named proportionally weighted multiple kernels (PWMK), assigns each kernel a weight proportional to the quality of the kernel, as determined by direct cross validation. An experimental comparison of the four methods UMK, SMO-MKL, FSM-MKL and PWMK on the problem of protein-protein interactions shows that our proposed methods are simpler and more efficient but still effective. They achieved performance almost as high as that of MKL and higher than that of UMK.
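
To illustrate the proportional weighting idea, here is a minimal sketch under stated assumptions; it is not the paper's code. It assumes each kernel already has a non-negative quality score (for example a cross-validation accuracy obtained elsewhere), and that weights are normalized to sum to one, which the abstract does not specify. The function names are hypothetical.

```python
def proportional_weights(scores):
    """Turn per-kernel quality scores into non-negative weights summing to 1.

    Normalizing to a sum of 1 is an assumption made for this sketch.
    """
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative for a conic combination")
    total = sum(scores)
    if total == 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


def combine_kernels(kernels, weights):
    """Weighted sum of same-sized square kernel matrices (lists of lists)."""
    n = len(kernels[0])
    return [
        [sum(w * k[i][j] for w, k in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]


# Two toy 2x2 kernel matrices with hypothetical quality scores 3 and 1.
k1 = [[1.0, 0.0], [0.0, 1.0]]
k2 = [[1.0, 1.0], [1.0, 1.0]]
weights = proportional_weights([3, 1])
combined = combine_kernels([k1, k2], weights)
```

Because the weights are non-negative, the combination of positive semi-definite kernels stays positive semi-definite, which is what makes a conic combination usable as a kernel in its own right. Setting all scores equal reduces this sketch to the unweighted combination.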

International Conference on Knowledge-Based and Intelligent Information and Engineering Systems | 2003

Abstraction of Long-Term Changed Tests in Mining Hepatitis Data

Saori Kawasaki; Tu Bao Ho; Dung Trong Nguyen

In mining time-related data, how to handle data sequences is one of the most important problems. We developed a temporal abstraction (TA) method for mining the hepatitis dataset provided as a common challenge by Chiba University Hospital. TA transforms the original data sequences into categorical data, which makes it possible to apply various learning methods to the transformed data. This paper focuses on temporal abstraction for long-term changed tests, introducing the notion of “changes of state” and an algorithm for extracting them.
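
The abstract does not describe the extraction algorithm, so the sketch below shows only one simple way to turn an irregularly sampled test sequence into categorical state changes; it should not be read as the authors' method. It assumes each measurement is mapped to "low", "normal" or "high" by a fixed reference range, and that a change of state is recorded whenever two consecutive measurements fall in different states. The thresholds and function names are hypothetical.

```python
def to_state(value, low, high):
    """Map a numeric test value to a categorical state (assumed reference range)."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


def state_changes(series, low, high):
    """Return (time, previous_state, new_state) for each change of state.

    series : list of (time, value) pairs; sampling may be irregular
    """
    changes = []
    previous = None
    for time, value in sorted(series):  # order measurements by time
        state = to_state(value, low, high)
        if previous is not None and state != previous:
            changes.append((time, previous, state))
        previous = state
    return changes


# Hypothetical measurements: (day, value) with an assumed normal range of 10-40.
series = [(0, 30), (10, 45), (25, 120), (40, 110), (90, 35)]
changes = state_changes(series, low=10, high=40)
```

The resulting list of state changes is categorical, so it can be fed to learners that do not accept raw numeric sequences of varying length.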

IEICE Transactions on Information and Systems | 2007

Integration of Learning Methods, Medical Literature and Expert Inspection in Medical Data Mining

Tu Bao Ho; Saori Kawasaki; Katsuhiko Takabayashi; Canh Hao Nguyen

From lessons learned in medical data mining projects, we show that the integration of advanced computation techniques and human inspection is indispensable in medical data mining. We proposed an integrated approach that merges data mining and text mining methods with visualization support for expert evaluation. We also developed temporal abstraction and text mining methods suited to exploiting the collected data. Furthermore, our visual discovery system D2MS allowed us to work actively and effectively with physicians. Significant findings in the hepatitis study were obtained by the integrated approach.

The International Journal of Fuzzy Logic and Intelligent Systems | 2002

Cluster-based Information Retrieval with Tolerance Rough Set Model

Tu Bao Ho; Saori Kawasaki; Ngoc Binh Nguyen

The objectives of this paper are twofold. The first is to introduce a tolerance rough set model (TRSM) for representing documents with semantic relatedness, using rough sets with tolerance relations instead of equivalence relations. The second is to introduce two document clustering algorithms based on this model, one hierarchical and one nonhierarchical, and TRSM cluster-based information retrieval using these two algorithms. The experimental results show that TRSM offers an alternative approach to text clustering and information retrieval.

Knowledge, Information, and Creativity Support Systems | 2010

Discovering relationship between hepatitis C virus NS5A protein and interferon/ribavirin therapy

Saori Kawasaki; Tu Bao Ho; Tatsuo Kanda; Osamu Yokosuka; Katsuhiko Takabayashi; Nhan Le

Fewer than half of hepatitis C patients respond to the current therapy of peg-interferon combined with ribavirin, and the genomic basis of this drug resistance remains unknown. It is recognized that an emerging challenge in the development of therapies for hepatitis C lies in the new functions and unresolved mysteries of the hepatitis C virus (HCV) nonstructural 5A (NS5A) protein. Unlike current studies, which use small experimental samples to analyze relations between the HCV NS5A region and the response to interferon/ribavirin therapy, we introduce a data mining-based framework that exploits the largest available HCV database. This paper focuses on the methodology and early results of analyzing NS5A data.