Publication


Featured research published by Kazuyo Narita.


Web-Age Information Management | 2008

Outlier Detection for Transaction Databases Using Association Rules

Kazuyo Narita; Hiroyuki Kitagawa

Outlier detection, a data mining technique to detect rare events, deviant objects, and exceptions from data, has drawn increasing attention in recent years. Much existing research targets record data constructed with numerical attributes or a set of points having numeric values. However, very few studies have attempted to detect outliers in data consisting of items. We focus on transaction data and propose a framework for detecting outlier transactions that behave abnormally compared to others. We regard as an outlier a transaction t that lacks many items that would normally be expected, given their strong dependency on itemsets in t. We use information from association rules with high confidence to calculate the outlier degree. In this paper, we first discuss what outlier transactions are, and provide an outlier degree for systematically detecting them. We also propose algorithms for efficiently detecting outlier transactions in transaction databases. We present two devices for faster detection that (i) remove redundant association rules and (ii) prune candidate outlier transactions using maximal frequent itemsets. In experiments using synthetic and real-world data sets, we show that our proposal achieves sufficient detection accuracy and detects outlier transactions faster than a brute-force algorithm.
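The idea of scoring a transaction by the items it "should" contain can be illustrated with a small sketch. This is only one plausible formalization, assumed here for illustration, and not necessarily the definition used in the paper: the transaction is expanded with the consequents of every high-confidence rule whose antecedent it already covers, and the score is the fraction of that expanded set that is missing from the transaction. The names `associative_closure` and `outlier_degree` and the example rules are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (antecedent, consequent); rules are assumed to be pre-filtered
# so that only high-confidence rules remain.
Rule = Tuple[FrozenSet[str], FrozenSet[str]]


def associative_closure(transaction: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Repeatedly add consequents of rules whose antecedent is already covered."""
    closure = set(transaction)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent <= closure and not consequent <= closure:
                closure |= consequent
                changed = True
    return closure


def outlier_degree(transaction: Iterable[str], rules: List[Rule]) -> float:
    """Fraction of expected items (the closure) that the transaction lacks.

    Assumed scoring for illustration: 0.0 means nothing expected is missing;
    values near 1.0 mean most expected items are absent.
    """
    items = set(transaction)
    closure = associative_closure(items, rules)
    if not closure:
        return 0.0
    return len(closure - items) / len(closure)


# Hypothetical high-confidence rules.
rules = [
    (frozenset({"bread"}), frozenset({"butter"})),
    (frozenset({"butter"}), frozenset({"milk"})),
]

complete = outlier_degree({"bread", "butter", "milk"}, rules)
suspicious = outlier_degree({"bread"}, rules)
```

Under these assumptions, a transaction containing only `bread` scores higher than one that also contains `butter` and `milk`, because two of the three expected items are absent.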


Asia-Pacific Web Conference | 2008

Detecting outliers in categorical record databases based on attribute associations

Kazuyo Narita; Hiroyuki Kitagawa

Outlier detection, a data mining technique to detect rare events, deviant objects, and exceptions from data, has been drawing increasing attention in recent years. Most existing outlier detection algorithms focus on numerical data sets. We target categorical record databases and detect records in which many attribute values are not observed even though they should occur in association with other attribute values in the records. To detect such records as outliers, we provide an outlier degree, which demonstrates sufficient detection performance in accuracy-evaluation experiments compared with the probabilistic approach used in a related work. We also propose an efficient algorithm for detecting such outlier records. Experiments using real data sets show that our method detects interesting records as outliers.


Data Warehousing and Knowledge Discovery | 2012

Landmark-join: hash-join based string similarity joins with edit distance constraints

Kazuyo Narita; Shinji Nakadai; Takuya Araki

Parallel data processing complicates string similarity joins because it requires a well-designed data partitioning scheme. Moreover, efficient verification of string pairs is needed to speed up the entire string similarity join process. We propose a novel framework that addresses these requirements through the use of edit distance constraints. The Landmark-Join framework has two functions that reduce two kinds of search spaces. The first, q-bucket partitioning, reduces the number of verifications of dissimilar string pairs and lowers skewness among buckets. The second, local upper bound calculation, prunes the search space of edit distance to speed up each verification. Experimental results show that Landmark-Join has good parallel scalability and that the two proposed functions speed up the entire string similarity join process.
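To make the verification step concrete, the following sketch shows a standard threshold-bounded edit distance check. It is not the paper's local upper bound calculation, whose details the abstract does not give; it is the well-known banded dynamic program that only fills cells within `k` of the diagonal and stops early once no cell can stay within the threshold. The function name `within_edit_distance` is hypothetical.

```python
def within_edit_distance(a: str, b: str, k: int) -> bool:
    """Return True if the edit distance between a and b is at most k.

    Only cells with |i - j| <= k are computed; every value is capped at
    k + 1, which stands for "already over the threshold".
    """
    if k < 0 or abs(len(a) - len(b)) > k:
        return False
    over = k + 1
    # Row 0: distance from the empty prefix of a to b[:j] is j.
    prev = [j if j <= k else over for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        lo = max(1, i - k)
        hi = min(len(b), i + k)
        cur = [over] * (len(b) + 1)
        if i <= k:
            cur[0] = i  # deleting the first i characters of a
        for j in range(lo, hi + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # deletion from a
                cur[j - 1] + 1,      # insertion into a
                over,
            )
        if min(cur) >= over:
            return False  # no alignment can recover: prune this pair
        prev = cur
    return prev[len(b)] <= k
```

For example, `within_edit_distance("kitten", "sitting", 3)` holds while the same pair with `k = 2` is rejected. In a join, a cheap length filter plus this bounded check avoids computing the full distance matrix for most dissimilar pairs.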


Asia-Pacific Web Conference | 2014

MOARLE: Matrix Operation Accelerator Based on Run-Length Encoding

Masafumi Oyamada; Jianquan Liu; Kazuyo Narita; Takuya Araki

Matrix computation is a key technology in various data processing tasks including data mining, machine learning, and information retrieval. The size of matrices has been increasing with the development of computational resources and the dissemination of big data. Huge matrices consume large amounts of memory and computation time. Therefore, reducing the size and computation time of huge matrices is a key challenge in the data processing area. We develop MOARLE, a novel matrix computation framework that saves memory space and computation time. In contrast to conventional matrix computation methods that target sparse matrices, MOARLE can efficiently handle both sparse and dense matrices. Our experimental results show that MOARLE can reduce memory usage to 2% of the original usage and improve computational performance by a factor of 124.
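The abstract does not describe how MOARLE operates on encoded data, so the following is only a generic sketch of why run-length encoding can help: a dot product can be computed directly on the runs of two vectors, touching each run once instead of each element. The helper names `rle_encode` and `rle_dot` are hypothetical.

```python
from typing import List, Sequence, Tuple

Run = Tuple[float, int]  # (value, run length)


def rle_encode(values: Sequence[float]) -> List[Run]:
    """Collapse consecutive equal values into (value, count) runs."""
    runs: List[List[float]] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, int(n)) for v, n in runs]


def rle_dot(x_runs: List[Run], y_runs: List[Run]) -> float:
    """Dot product of two run-length encoded vectors without decompression."""
    if sum(n for _, n in x_runs) != sum(n for _, n in y_runs):
        raise ValueError("vectors must have the same decoded length")
    total = 0.0
    i = j = 0
    x_val = y_val = 0.0
    x_left = y_left = 0
    while True:
        if x_left == 0:
            if i == len(x_runs):
                break
            x_val, x_left = x_runs[i]
            i += 1
        if y_left == 0:
            if j == len(y_runs):
                break
            y_val, y_left = y_runs[j]
            j += 1
        # Both runs are constant over the overlap, so one multiply covers it.
        overlap = min(x_left, y_left)
        total += x_val * y_val * overlap
        x_left -= overlap
        y_left -= overlap
    return total


x = [1, 1, 1, 0, 0, 2]
y = [3, 3, 0, 0, 5, 5]
result = rle_dot(rle_encode(x), rle_encode(y))
```

The work is proportional to the number of runs rather than the number of elements, which is why this representation can pay off for dense data with repeated values as well as for sparse data with long runs of zeros.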


International Conference on Big Data | 2013

Feliss: Flexible distributed computing framework with light-weight checkpointing

Takuya Araki; Kazuyo Narita; Hiroshi Tamano

Current distributed computing frameworks, such as MapReduce and Spark, allow programmers to use only the limited set of operations defined by the framework. Because of this restriction, algorithms that do not fit the framework cannot be expressed efficiently. This restriction arose from the need for fault tolerance. That is, these frameworks recover lost data by recomputing it from available data when a fault occurs. To ensure this mechanism works correctly, only operations provided by the system can be used. On the other hand, there is another fault-tolerance method called checkpointing. Since it achieves fault tolerance by saving memory contents, it places no such limitation on operations. However, the cost of saving a memory image is high. To overcome this trade-off, we propose a light-weight checkpointing method called continuation-based checkpointing, which enables low-overhead fault tolerance without any restriction. It saves only the information that is necessary for restarting, which significantly reduces the cost of checkpointing. We implemented a distributed computing framework called Feliss by using our continuation-based checkpointing method, which includes an improved MapReduce without the above restriction and a message passing interface (MPI) subset. We evaluated Feliss with various applications and showed that order-of-magnitude speedup can be attained with applications that cannot be expressed efficiently with current frameworks.
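The contrast between saving a full memory image and saving only what is needed to restart can be illustrated with a toy, single-process sketch. This is plain application-level checkpointing, assumed here for illustration; it is not Feliss's continuation-based mechanism, which the abstract does not detail. The function `run_sum`, its `fail_at` parameter, and the JSON checkpoint layout are all hypothetical.

```python
import json
import os
import tempfile
from typing import List, Optional


def run_sum(data: List[int], checkpoint_path: str, fail_at: Optional[int] = None) -> int:
    """Sum data, persisting only the restart state after every element.

    The restart state is two numbers (next index, partial sum), not a
    memory image. fail_at simulates a fault before processing that index.
    """
    state = {"next_index": 0, "partial_sum": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)  # resume from the last saved point
    for i in range(state["next_index"], len(data)):
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated fault")
        state["partial_sum"] += data[i]
        state["next_index"] = i + 1
        # Not atomic; a real system would write to a temp file and rename.
        with open(checkpoint_path, "w") as f:
            json.dump(state, f)
    return state["partial_sum"]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "checkpoint.json")
    data = [1, 2, 3, 4, 5]
    try:
        run_sum(data, path, fail_at=3)
        crashed = False
    except RuntimeError:
        crashed = True
    resumed_total = run_sum(data, path)  # continues from index 3
```

After the simulated fault, the second call restarts from the saved index instead of recomputing from the beginning, and the checkpoint stays small regardless of how much other memory the program uses.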


Journal of Information Processing | 2018

Compressed Vector Set: A Fast and Space-Efficient Data Mining Framework

Masafumi Oyamada; Jianquan Liu; Shinji Ito; Kazuyo Narita; Takuya Araki; Hiroyuki Kitagawa

In this paper, we present CVS (Compressed Vector Set), a fast and space-efficient data mining framework that efficiently handles both sparse and dense datasets. CVS holds a set of vectors in a compressed format and conducts primitive vector operations, such as p-norm and dot product, without decompression. By combining these primitive operations, CVS accelerates prominent data mining or machine learning algorithms including k-nearest neighbor algorithm, stochastic gradient descent algorithm on logistic regression, and kernel methods. In contrast to the commonly used sparse matrix/vector representation, which is not effective for dense datasets, CVS efficiently handles sparse datasets and dense datasets in a unified manner. Our experimental results demonstrate that CVS can process both dense datasets and sparse datasets faster than conventional sparse vector representation with smaller memory usage.


Archive | 2011

JOIN PROCESSING DEVICE, DATA MANAGEMENT DEVICE, AND STRING SIMILARITY JOIN SYSTEM

Kazuyo Narita


Physica B: Condensed Matter | 2010

Syntheses of new TTF-based metal complexes for conducting and magnetic systems: Schiff base-type metal complex with partially oxidized TTF moiety

Hiroyuki Nishikawa; Hironori Oshima; Kazuyo Narita; Hiroki Oshio


Archive | 2017

DATA MANAGEMENT APPARATUS, DATA ANALYSIS APPARATUS, DATA ANALYSIS SYSTEM, AND ANALYSIS METHOD

Kazuyo Narita


Archive | 2017

MULTIPLE QUERY OPTIMIZATION IN SQL-ON-HADOOP SYSTEMS

Ting Chen; Kazuyo Narita
