Masato Inagi | Researchain

Publication

Featured research published by Masato Inagi.

field-programmable logic and applications | 2009

Globally optimal time-multiplexing in inter-FPGA connections for accelerating multi-FPGA systems

Masato Inagi; Yasuhiro Takashima; Yuichi Nakamura

Multi-FPGA systems are widely used for rapid prototyping and logic verification of VLSIs. To implement a huge logic circuit in a multi-FPGA system, the circuit needs to be partitioned into multiple FPGAs. Because of the limited interconnection resources between FPGAs, time-multiplexed I/Os are used for inter-FPGA connections. Due to the large delay of time-multiplexed I/Os, inter-FPGA connections strongly affect the system performance. In this paper, we extend an ILP-based optimization method for inter-FPGA connections to improve the system performance. Our method uses both normal I/Os and time-multiplexed I/Os, and decides whether each inter-FPGA signal is transferred by a time-multiplexed I/O or not. Our extended method improves the system performance by taking into account how the amount of interconnection resources and the number of inter-FPGA signals vary from one FPGA pair to another. Experiments showed that our method improved the circuit performance on a 4-FPGA system by 26.4% on average compared with a conventional method.

IPSJ Transactions on System LSI Design Methodology | 2010

Globally Optimal Time-multiplexing of Inter-FPGA Connections for Multi-FPGA Prototyping Systems

Masato Inagi; Yasuhiro Takashima; Yuichi Nakamura

Multi-FPGA prototyping systems are widely used to verify logic circuit designs. To implement a large circuit using such a system, the circuit is partitioned into multiple FPGAs. Subsequently, the sub-circuits assigned to the FPGAs are connected using interconnection resources among the FPGAs. Because these resources are limited, time-multiplexed I/Os are used to accommodate all signals at the cost of system speed. In this study, we propose an optimization method for inter-FPGA connections in multi-FPGA systems with time-multiplexed I/Os, which shortens the verification time by accelerating the systems. Our method decides whether each inter-FPGA signal is transferred by a normal I/O or a time-multiplexed I/O, which is slower than a normal I/O but can transfer multiple signals. Our method optimizes inter-FPGA connections not only between a single FPGA pair, but among all the FPGAs. Experiments showed that for four-way partitioned circuits, our method obtains an average system clock period 16.0% shorter than that of a conventional method.

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences | 2008

Optimal Time-Multiplexing in Inter-FPGA Connections for Accelerating Multi-FPGA Prototyping Systems

Masato Inagi; Yasuhiro Takashima; Yuichi Nakamura; Atsushi Takahashi

In multi-FPGA prototyping systems for circuit verification, a serialized time-multiplexed I/O technique is used because of the limited number of I/O pins of an FPGA. The verification time depends on the selection of inter-FPGA signals to be time-multiplexed. In this paper, we propose a method that minimizes the verification time of multi-FPGA systems by finding an optimal selection of inter-FPGA signals to be time-multiplexed. Experiments show that the estimated verification time is improved by 38.2% on average compared with conventional methods.

digital systems design | 2013

A Multithreaded Parallel Global Routing Method with Overlapped Routing Regions

Yasuhiro Shintani; Masato Inagi; Shinobu Nagayama; Shin'ichi Wakabayashi

Routing, which connects previously placed terminals, is one of the time-consuming processes in LSI design. In this study, we propose a multithreaded parallel routing algorithm for LSI design. In the proposed method, threads are first created and the nets of the target netlist are equally distributed to the threads. Sharing the routing regions, each thread searches for a candidate path of a net in parallel without synchronization. Then, each thread exclusively writes a candidate path to the routing regions as a determined path. Although exclusive control is necessary when updating the routing regions, this asynchronous parallel routing reduces the wait time of the threads. If a candidate path of a net does not satisfy the constraints because of the asynchronous parallel routing, the net is re-routed. We experimentally confirmed that our proposed method running on a PC with eight cores was 7.1 times faster than sequential execution. In addition, we confirmed that the routing quality was not degraded compared to sequential execution.
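The search-then-commit scheme described in this abstract can be sketched roughly as follows. This is only an illustration under assumptions not taken from the paper: the routing region is a small square grid, each net has two terminals, paths are found by breadth-first search, and the names `find_path` and `route_nets` are hypothetical.

```python
import threading
from collections import deque


def find_path(occupied, size, src, dst):
    """BFS for a shortest free path; reads the shared grid without locking."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        cell = queue.popleft()
        if cell == dst:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in prev and (nxt == dst or nxt not in occupied):
                prev[nxt] = cell
                queue.append(nxt)
    return None  # no free path exists at the moment


def route_nets(nets, size, num_threads=4):
    """Route two-terminal nets on a size x size grid with several threads."""
    occupied = {}  # shared routing region: cell -> name of the owning net
    for name, (src, dst) in nets.items():
        occupied[src] = occupied[dst] = name  # reserve terminals up front
    lock = threading.Lock()
    routes = {}

    def worker(my_nets):
        for name in my_nets:
            src, dst = nets[name]
            while True:
                path = find_path(occupied, size, src, dst)  # unsynchronized
                if path is None:
                    break  # give up on this net
                with lock:  # exclusive control only while committing
                    if all(occupied.get(c, name) == name for c in path):
                        for c in path:
                            occupied[c] = name
                        routes[name] = path
                        break
                # Another thread took a cell in the meantime: re-route.

    names = list(nets)
    threads = [
        threading.Thread(target=worker, args=(names[i::num_threads],))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return routes


if __name__ == "__main__":
    demo = {f"n{i}": ((0, i), (5, i)) for i in range(6)}
    for name, path in sorted(route_nets(demo, 6).items()):
        print(name, path)
```

The lock is held only while a candidate path is validated and written, so the expensive search runs concurrently. A candidate invalidated by another thread's commit is simply searched again, which mirrors the re-routing step in the abstract.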

field-programmable logic and applications | 2011

An Efficient Hardware Matching Engine for Regular Expression with Nested Kleene Operators

Yoichi Wakaba; Masato Inagi; Shin'ichi Wakabayashi; Shinobu Nagayama

In this paper, we propose a systolic pattern-independent hardware regular expression matching (REM) engine which handles nested Kleene operators used in virus patterns. Pattern-independent systolic REM engines are suitable for network intrusion detection systems because virus patterns can be updated quickly. In the proposed engine, we introduce a compact pattern-independent NFA circuit, which can handle any small regular expression pattern, into a systolic REM engine to handle nested Kleene operators. Experimental results show that the extended engine implemented on an FPGA handles nested Kleene operators with efficient circuit size and high performance (2.17 Gbps).

international symposium on circuits and systems | 2008

ILP-based optimization of time-multiplexed I/O assignment for multi-FPGA systems

Masato Inagi; Yasuhiro Takashima; Yuichi Nakamura; Atsushi Takahashi

Due to the limited device capacity of an FPGA, multi-FPGA systems are used to verify huge state-of-the-art circuits. In this case, the number of I/O signals of each sub-circuit implemented in an FPGA tends to exceed the number of I/O pins of the FPGA. To resolve the problem, time-multiplexed I/Os are used. Each time-multiplexed I/O is shared by multiple I/O signals of a sub-circuit by time division. Since time-multiplexed I/Os introduce a large delay, we propose algorithms which obtain the optimal number of required I/O pins under the given timing constraint by choosing signals to be time-multiplexed.
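To make the pin-count trade-off concrete, here is a deliberately simplified sketch. It is not the ILP formulation from the paper: it assumes every signal has an independent timing slack, that a signal may be multiplexed whenever its slack covers a fixed multiplexing delay, and that each time-multiplexed pin carries a fixed number of signals. The function name `min_pins` and all three parameters are hypothetical.

```python
import math


def min_pins(slacks, mux_delay, ratio):
    """Pins needed when every signal that tolerates mux_delay is multiplexed.

    slacks    -- assumed per-signal timing slack (arbitrary time units)
    mux_delay -- assumed extra delay of a time-multiplexed I/O
    ratio     -- assumed number of signals sharing one time-multiplexed pin
    """
    tolerant = sum(1 for s in slacks if s >= mux_delay)
    critical = len(slacks) - tolerant
    # Critical signals keep a dedicated pin; tolerant ones share pins.
    return critical + math.ceil(tolerant / ratio)


if __name__ == "__main__":
    # Four signals tolerate a delay of 4, two do not: 2 + ceil(4 / 4) pins.
    print(min_pins([5, 1, 7, 9, 2, 8], mux_delay=4, ratio=4))
```

In this toy model, multiplexing every tolerant signal is optimal because moving one more signal onto a shared pin never increases the pin count. Real timing constraints apply to whole signal paths rather than to independent signals, which is presumably why an ILP formulation is needed.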

field-programmable technology | 2006

A performance-driven circuit bipartitioning algorithm for multi-FPGA implementation with time-multiplexed I/Os

Masato Inagi; Yasuhiro Takashima; Yuichi Nakamura; Yoji Kajitani

For multi-FPGA systems, the limited number of FPGA I/O pins is one of the most critical issues. Using time-multiplexed I/Os eases this limitation. However, a signal path through n time-multiplexed I/Os makes the system clock period at most n + 1 times longer. To capture this feature, we introduce a new cost, the total cut-hopcount. Under the total cut-hopcount, we propose a performance-driven bipartitioning method, VIOP. VIOP combines three algorithms, namely i) min-cut partitioning, ii) coarse performance-driven partitioning, and iii) fine performance-driven partitioning. For min-cut and coarse performance-driven partitioning, we employ the well-known bipartitioning algorithms CLIP-FM and DUBA, respectively. For fine performance-driven partitioning, we propose a partitioning algorithm, CAVP. With VIOP, the average cost was improved by 11.5% compared with the state-of-the-art algorithms.
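The n + 1 bound stated in this abstract can be illustrated with a small sketch. It assumes that every crossing of the partition boundary along a signal path goes through one time-multiplexed I/O. The abstract does not define the total cut-hopcount itself, so the sketch only counts crossings per path. The helper names `cut_hops` and `worst_case_period_factor` are hypothetical.

```python
def cut_hops(path, part):
    """Number of times a signal path crosses the partition boundary."""
    return sum(1 for u, v in zip(path, path[1:]) if part[u] != part[v])


def worst_case_period_factor(paths, part):
    """Upper bound on the clock-period multiplier: max crossings plus one."""
    return max(cut_hops(p, part) for p in paths) + 1


if __name__ == "__main__":
    # Cells a, c and d sit on FPGA 0; cell b sits on FPGA 1.
    part = {"a": 0, "b": 1, "c": 0, "d": 0}
    paths = [["a", "b", "c"], ["a", "c", "d"]]
    print([cut_hops(p, part) for p in paths])     # [2, 0]
    print(worst_case_period_factor(paths, part))  # 3
```

The path a, b, c crosses the cut twice, so under the stated assumption the clock period can grow by a factor of up to three, whereas the path a, c, d stays on one FPGA and adds nothing.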

IPSJ Transactions on System LSI Design Methodology | 2014

An Area Efficient Regular Expression Matching Engine Using Partial Reconfiguration for Quick Pattern Updating

Yoichi Wakaba; Shin'ichi Wakabayashi; Shinobu Nagayama; Masato Inagi

This paper proposes a method using partial reconfiguration to realize a compact regular expression matching engine, which can update a pattern quickly. In the proposed method, a set of partial circuits, each of which handles a different class of regular expressions, is provided in advance. When a regular expression pattern is given, a compact matching engine dedicated to the pattern is implemented on an FPGA by combining the partial circuits according to the given pattern using partial reconfiguration. The method can update a pattern quickly, since it does not require redesigning the circuit. Experimental results show that the proposed method reduces the circuit size by 60% compared with the previous method without significantly increasing the pattern updating time.

database and expert systems applications | 2018

An Approximate Nearest Neighbor Search Algorithm Using Distance-Based Hashing

Yuri Itotani; Shin'ichi Wakabayashi; Shinobu Nagayama; Masato Inagi

This paper proposes an approximate nearest neighbor search algorithm for high-dimensional data. The proposed algorithm is based on a distance-based hashing scheme called adaptive flexible distance-based hashing (AFDH). For a given query, AFDH returns a small candidate set of nearest neighbors, and the candidate closest to the query is selected as the final result. The main advantage of the proposed algorithm is that good search results can be obtained without fine-tuning its parameter values. Experimental results show that the proposed algorithm produces satisfactory results in terms of both result quality and execution time.
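The candidate-set idea can be sketched with a generic distance-based hash. This is not AFDH, whose details are not given in the abstract. The sketch assumes Euclidean data, hand-picked pivot pairs, a median threshold per pivot pair, and a fallback to a linear scan when the query's bucket is empty. The class name `DistanceHashIndex` is hypothetical.

```python
import math
from collections import defaultdict


def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def project(x, a, b):
    """Position of x along the line through pivots a and b, from distances only."""
    dab2 = sqdist(a, b)
    return (sqdist(x, a) + dab2 - sqdist(x, b)) / (2 * math.sqrt(dab2))


class DistanceHashIndex:
    def __init__(self, points, pivot_pairs):
        self.points = list(points)
        self.pivot_pairs = list(pivot_pairs)
        # One threshold per pivot pair: the median projection of the data.
        self.thresholds = []
        for a, b in self.pivot_pairs:
            vals = sorted(project(p, a, b) for p in self.points)
            self.thresholds.append(vals[len(vals) // 2])
        self.buckets = defaultdict(list)
        for p in self.points:
            self.buckets[self._key(p)].append(p)

    def _key(self, x):
        """Hash key: one bit per pivot pair."""
        return tuple(
            project(x, a, b) >= t
            for (a, b), t in zip(self.pivot_pairs, self.thresholds)
        )

    def query(self, q):
        """Return the closest point among the candidates in q's bucket."""
        candidates = self.buckets.get(self._key(q)) or self.points
        return min(candidates, key=lambda p: sqdist(p, q))


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (10, 0), (11, 0), (0, 10), (1, 10)]
    index = DistanceHashIndex(pts, [((0, 0), (11, 0))])
    print(index.query((10.4, 0.2)))
```

Only the points that share the query's bucket are compared exactly, which is what keeps the candidate set small. The price is that the true nearest neighbor can fall into a different bucket, hence "approximate".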

field-programmable technology | 2016

An efficient FPGA implementation of Mahalanobis distance-based outlier detection for streaming data

Yuto Arai; Shin'ichi Wakabayashi; Shinobu Nagayama; Masato Inagi

With the recent explosive growth of data in the real world, data mining techniques for extracting characteristics and knowledge from big data are attracting more attention. This paper focuses on a method to detect outliers in streaming data, and proposes a fast FPGA implementation of outlier detection based on the Mahalanobis distance. The proposed circuit is fully pipelined, and in every clock cycle, a given data sample can be judged to be an outlier or not. Experimental evaluation shows that the proposed circuit is 37 times faster than a software implementation of the Mahalanobis distance-based outlier detection.
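For readers unfamiliar with the underlying test, a minimal software sketch of Mahalanobis distance-based outlier detection follows. It models only the arithmetic, not the pipelined circuit. It assumes 2-D samples, a mean and covariance estimated beforehand from reference data, and a fixed distance threshold. The function names are hypothetical.

```python
def fit_2d(samples):
    """Return the mean and inverse covariance of 2-D reference samples."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    return (mx, my), inv


def mahalanobis_sq(p, mean, inv):
    """Squared Mahalanobis distance of point p from the reference data."""
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))


def detect_outliers(stream, mean, inv, threshold):
    """Yield (sample, is_outlier) for each sample as it arrives."""
    limit = threshold ** 2  # compare squared values to avoid a square root
    for p in stream:
        yield p, mahalanobis_sq(p, mean, inv) > limit


if __name__ == "__main__":
    mean, inv = fit_2d([(1, 0), (-1, 0), (0, 1), (0, -1)])
    for sample, flag in detect_outliers([(0.5, 0.5), (2, 0)], mean, inv, 2.0):
        print(sample, "outlier" if flag else "ok")
```

Each sample needs only a handful of multiply-accumulate operations against precomputed constants, and samples are judged independently of one another, which is the kind of computation that lends itself to a fully pipelined circuit.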
