Shinji Nakadai | Researchain

Publication

Featured research published by Shinji Nakadai.

IEEE International Conference on Cloud Computing Technology and Science | 2011

Optimizing Multiple Machine Learning Jobs on MapReduce

Hiroshi Tamano; Shinji Nakadai; Takuya Araki

Recently, MapReduce has been used to parallelize machine learning algorithms. To obtain the best performance from these algorithms, their parameters must be tuned. However, tuning is time-consuming because it requires executing a MapReduce program multiple times with various parameters. Such multiple executions can be assigned to a cluster in various ways, and the execution time varies depending on the assignment. To achieve the shortest execution time, we propose a method for optimizing the assignment of MapReduce jobs to a cluster, assuming a runtime targeted at machine learning. We developed an execution cost model to predict the total execution time of jobs and obtained the optimal assignment by minimizing the cost model. To evaluate the proposed method, we implemented an experimental MapReduce runtime based on Message Passing Interface and executed logistic regression in four cases. The results showed that the proposed method can correctly predict the optimal job assignment. We also confirmed that the optimal assignment reduced execution time by up to 77% compared to the worst assignment.
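
The idea of choosing an assignment by minimizing a cost model can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the cost function `job_time`, its constants, and the scheme of running jobs in equal-sized waves are hypothetical and are not the cost model from the paper.

```python
import math

def job_time(nodes, work=120.0, overhead=2.0):
    """Hypothetical cost of one job on `nodes` nodes: parallel work plus a
    per-node coordination overhead (both constants are made up)."""
    return work / nodes + overhead * nodes

def total_time(num_jobs, cluster_nodes, concurrency):
    """Predicted time when `concurrency` jobs run at once, each on an equal
    share of the cluster, until all jobs have finished."""
    nodes_per_job = cluster_nodes // concurrency
    if nodes_per_job == 0:
        return math.inf  # not enough nodes to give every concurrent job one
    waves = math.ceil(num_jobs / concurrency)
    return waves * job_time(nodes_per_job)

def best_concurrency(num_jobs, cluster_nodes):
    """Return the concurrency level with the smallest predicted total time."""
    candidates = range(1, min(num_jobs, cluster_nodes) + 1)
    return min(candidates, key=lambda c: total_time(num_jobs, cluster_nodes, c))
```

With these made-up constants, `best_concurrency(8, 16)` compares running the eight jobs one after another on all 16 nodes against the more concurrent alternatives and returns the level with the lowest predicted time.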

IEEE International Conference on Cloud Computing Technology and Science | 2012

LoadAtomizer: A locality and I/O load aware task scheduler for MapReduce

Masato Asahara; Shinji Nakadai; Takuya Araki

Data-intensive computing systems like MapReduce and Dryad have emerged as frameworks for leveraging the computing resources of a cluster. I/O bottlenecks need to be eased to improve performance in data-intensive computing systems. State-of-the-art frameworks for data-intensive computing have tackled the issue with a task scheduling policy based on data locality. However, locality-aware scheduling does not always work well to mitigate I/O bottlenecks when jobs with different I/O characteristics run concurrently. This paper presents LoadAtomizer, a locality and I/O load aware task scheduler for MapReduce. LoadAtomizer mitigates the I/O bottlenecks of a cluster with locality and I/O load aware map task assignment and storage selection. LoadAtomizer quickly assigns a slave a map task whose input data is stored in a lightly loaded storage and commands the slave to read the input data from that storage. LoadAtomizer maintains the load information of storages and the network with a topology-aware load tree. A topology-aware load tree enables LoadAtomizer to quickly select a lightly loaded storage that a slave can access through a lightly loaded network path. Experimental results demonstrated that our prototype of LoadAtomizer shortened the completion time of multiple jobs by up to 18.6%.
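
A simplified guess at how a load tree could drive storage selection is sketched below. The `LoadNode` class, the use of the maximum load along a path as the bottleneck measure, and all load values are assumptions for illustration; the abstract does not specify LoadAtomizer's actual data structure or selection rule.

```python
class LoadNode:
    """A switch or host in a hypothetical topology tree, with a load in [0, 1]."""

    def __init__(self, name, parent=None, load=0.0):
        self.name = name
        self.parent = parent
        self.load = load

    def ancestors(self):
        """Return this node followed by its ancestors up to the root."""
        node, chain = self, []
        while node is not None:
            chain.append(node)
            node = node.parent
        return chain

def path_load(slave, storage):
    """Highest load met when `slave` reads from `storage`: the storage host
    itself plus every switch on the tree path between the two."""
    slave_chain = slave.ancestors()
    slave_ids = {id(n) for n in slave_chain}
    path = []
    for node in storage.ancestors():  # climb from the storage to the common ancestor
        path.append(node)
        if id(node) in slave_ids:
            common = node
            break
    else:
        raise ValueError("nodes are not in the same tree")
    for node in slave_chain:  # add the switches between the slave and the common ancestor
        if node is common:
            break
        if node is not slave:
            path.append(node)
    return max(n.load for n in path)

def select_storage(slave, replicas):
    """Pick the replica holder whose access path is least loaded."""
    return min(replicas, key=lambda s: path_load(slave, s))
```

In this sketch a purely locality-based policy would always read from the slave's own disk, whereas `select_storage` may prefer a rack-local replica when the local disk is busy and the rack switch is not.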

Network Operations and Management Symposium | 2008

UTRAN O&M support system with statistical fault identification and customizable rule sets

Yoshinori Watanabe; Yasuhiko Matsunaga; Kosei Kobayashi; Toshio Tonouchi; Tomohiro Igakura; Shinji Nakadai; Kenichirou Kamachi

With the proliferation of mobile-network services, mobile networks have become one of the core social infrastructures and are therefore required to operate stably. Conventional mobile-network-management systems can detect and recover from faults according to previously formulated rules. However, such semi-static rules are vulnerable to large variations in network quality and to sudden increases in traffic, both of which are inherent to radio access networks. Maintenance personnel often had to perform case-by-case analyses of vast numbers of abnormal cases, and such analyses and the creation of rules could take even more time when new faults appeared. In addition, provisioning network elements in a timely manner is essential to maintaining network quality, but it is becoming more difficult as mobile networks become larger and more complicated. To overcome these problems, we have developed a system that enhances the conventional UMTS Terrestrial Radio Access Network (UTRAN) management system of mobile networks and improves stability during network operation. In this paper, we present the main features of the system and its fundamental technologies: fault detection based on statistical reliability, root-cause analysis of faults, a rule editor that makes it easy for maintenance personnel to create and modify fault-analysis rules, and guidance on expansion based on long-term analysis of trends. We tested the system and confirmed, through theoretical calculations and trials in a cellular mobile network, that it could reduce the fault-detection errors of a conventional management system.

Integrated Network Management | 2007

Server Capacity Planning with Priority Allocation for Service Level Management in Heterogeneous Server Clusters

Shinji Nakadai; Kunihiro Taniguchi

Web sites occasionally experience sharp fluctuations in load. The quality of such services can be maintained by allocating servers according to the load. Such autonomic service level management requires server capacity planning. However, existing capacity planning functions cannot appropriately calculate capacity in a heterogeneous server cluster, nor can they facilitate the prioritizing of services. As a result, high-priority services may deteriorate, while the quality of low-priority services remains high. Our approach achieves appropriate capacity planning for a heterogeneous server cluster by solving an integer programming problem induced by a certain status in a system model. The status is specified by considering the weighted round-robin algorithm of a managed load balancer. The priority allocation function is facilitated by fuzzy control.
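
To make the interaction between heterogeneous capacities and weighted round-robin concrete, here is a small sketch. It is not the paper's formulation: the saturation rule in `sustainable_load`, the brute-force subset search standing in for integer programming, and the `(capacity, weight)` tuples are all assumptions for illustration, and the fuzzy-control priority allocation is not modelled.

```python
from itertools import combinations

def sustainable_load(servers):
    """Largest request rate a weighted round-robin balancer can deliver before
    the first server saturates. Each server is a (capacity_rps, weight) pair."""
    total_weight = sum(weight for _, weight in servers)
    # Server i receives load * weight_i / total_weight, which must not exceed capacity_i.
    return min(capacity * total_weight / weight for capacity, weight in servers)

def smallest_allocation(pool, demand_rps):
    """Brute-force the smallest subset of `pool` that sustains `demand_rps`.
    Among equally small subsets, return the one with the most headroom."""
    for size in range(1, len(pool) + 1):
        feasible = [subset for subset in combinations(pool, size)
                    if sustainable_load(subset) >= demand_rps]
        if feasible:
            return max(feasible, key=sustainable_load)
    return None  # even the whole pool cannot carry the demand
```

The sketch shows why homogeneous reasoning fails: two servers with 400 requests per second of combined capacity can sustain less than that if their weights do not match their capacities.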

Data Warehousing and Knowledge Discovery | 2012

Landmark-join: hash-join based string similarity joins with edit distance constraints

Kazuyo Narita; Shinji Nakadai; Takuya Araki

Parallel data processing complicates string similarity joins because it requires a well-designed data partitioning scheme. Moreover, efficient verification of string pairs is needed to speed up the entire string similarity join process. We propose a novel framework that addresses these requirements through the use of edit distance constraints. The Landmark-Join framework has two functions that reduce two kinds of search spaces. The first, q-bucket partitioning, reduces the number of verifications of dissimilar string pairs and lowers skewness among buckets. The second, local upper bound calculation, prunes the search space of edit distance to speed up each verification. Experimental results show that Landmark-Join has good parallel scalability and that the two proposed functions speed up the entire string similarity join process.
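
For readers unfamiliar with verification under an edit distance constraint, the following sketch shows a standard thresholded check with early termination. It is a generic textbook technique, not the q-bucket partitioning or local upper bound calculation of Landmark-Join; the function name `within_edit_distance` is made up for this example.

```python
def within_edit_distance(a, b, k):
    """Return True if the Levenshtein distance between a and b is at most k."""
    if abs(len(a) - len(b)) > k:
        return False  # the length difference alone already needs more than k edits
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # match or substitute
        if min(cur) > k:
            return False  # row minima never decrease, so the result must exceed k
        prev = cur
    return prev[-1] <= k
```

The early exit is what makes a threshold useful: dissimilar pairs are usually rejected after a few rows rather than after filling the whole table.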

Distributed Systems Operations and Management | 2004

Rule-Based CIM Query Facility for Dependency Resolution

Shinji Nakadai; Masato Kudo; Koichi Konishi

A distributed system is composed of various resources that have complicated mutual dependencies. This increases the importance of a dependency resolution facility, which makes it possible to check whether a given dependency exists between resources such as a router, and to determine which resources have given dependencies on other resources. This paper addresses a CIM query facility for dependency resolution. Its main features are ease of query description, bi-directional query execution, and completeness of query capability with respect to CIM. These features are achieved by a rule-based language that enables predicates of interest to be defined declaratively, by unification and backtracking, and by the preparation of predicates corresponding to CIM metamodel elements. To validate this facility, it was applied to servers dynamically allocated to service providers in a data center. The basic behavior of the query facility and the dynamic server allocation was illustrated.
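
The combination of declarative predicates, unification, and backtracking can be sketched in a few lines. This is a toy, not the paper's rule language or any CIM API: the predicate names (`hosted_on`, `connected_to`), the resource names, and the `?`-prefixed variable convention are invented, and `unify` only matches patterns against variable-free facts.

```python
FACTS = [
    ("hosted_on", "web_service", "server1"),
    ("connected_to", "server1", "router1"),
    ("hosted_on", "db_service", "server2"),
    ("connected_to", "server2", "router1"),
]

def is_var(term):
    """Variables are strings that start with '?', for example '?srv'."""
    return isinstance(term, str) and term.startswith("?")

def unify(pattern, fact, bindings):
    """Match one goal against one ground fact; return extended bindings or None."""
    if len(pattern) != len(fact):
        return None
    result = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if result.setdefault(p, f) != f:
                return None  # variable already bound to a different value
        elif p != f:
            return None
    return result

def solve(goals, bindings=None):
    """Yield every binding that satisfies all goals, using depth-first backtracking."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        extended = unify(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, extended)
```

The same two goals answer a query in either direction: fixing the router and leaving `?svc` open lists the services that depend on it, while fixing the service and leaving `?r` open finds the router it depends on.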

Machine Learning and Data Mining in Pattern Recognition | 2018

From Black-Box to White-Box: Interpretable Learning with Kernel Machines

Hao Zhang; Shinji Nakadai; Kenji Fukumizu

We present a novel approach to interpretable learning with kernel machines. Kernel machines have been successfully applied in many real-world learning tasks. However, a common perception is that they are difficult for humans to interpret due to their inherent black-box nature. This restricts the application of kernel machines in domains where model interpretability is essential. In this paper, we propose to construct interpretable kernel machines. Specifically, we design a new kernel function based on random Fourier features (RFF) for scalability, and develop a two-phase learning procedure: in the first phase, we explicitly map pairwise features to a high-dimensional space produced by the designed kernel, and learn a dense linear model; in the second phase, we extract an interpretable data representation from the first phase, and learn a sparse linear model. Finally, we evaluate our approach on benchmark datasets, and demonstrate its usefulness in terms of interpretability by visualization.
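
As background for the RFF building block, the sketch below approximates a Gaussian kernel with random Fourier features using only the standard library. It shows the standard construction, not the paper's pairwise-feature kernel or its two-phase procedure; the function names and the `gamma` parameterization, with kernel exp(-gamma * ||x - y||^2), are choices made for this example.

```python
import math
import random

def make_rff(dim, num_features, gamma, seed=0):
    """Sample frequencies and phases for the kernel exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma)  # frequencies are drawn from N(0, 2 * gamma * I)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return weights, phases

def rff_map(x, weights, phases):
    """Explicit feature map z(x) whose inner products approximate the kernel."""
    scale = math.sqrt(2.0 / len(weights))
    return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)]

def approx_kernel(x, y, weights, phases):
    """Inner product of the two feature vectors."""
    zx, zy = rff_map(x, weights, phases), rff_map(y, weights, phases)
    return sum(a * b for a, b in zip(zx, zy))

def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel, for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))
```

Because the mapped features are explicit vectors, a linear model can be trained on them directly, which is what makes this family of approximations attractive for scalability.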

Pacific-Asia Conference on Knowledge Discovery and Data Mining | 2017

Link Prediction for Isolated Nodes in Heterogeneous Network by Topic-Based Co-clustering

Katsufumi Tomobe; Masafumi Oyamada; Shinji Nakadai

This paper presents a new probabilistic generative model (PGM) that predicts links for isolated nodes in a heterogeneous network using textual data. In conventional PGMs, a link between two nodes is predicted on the basis of the nodes’ other existing links. This makes it difficult to predict links for isolated nodes, a situation that arises when new items are recommended. In this study, we first naturally extend the relational topic model (RTM) to a heterogeneous network (Hetero-RTM). However, this simple extension degrades performance in link prediction for existing nodes. We therefore present a new model called the Grouped Hetero-RTM that has both latent topics and latent clusterings. In intensive experiments that simulate real recommendation problems, the Grouped Hetero-RTM outperforms baseline methods at predicting links for isolated nodes. Furthermore, this model performs as effectively as the stochastic block model in link prediction for existing nodes. We also find that the Grouped Hetero-RTM is effective for various textual data such as item reviews and movie descriptions.

Archive | 2006