Publication


Featured research published by Takuya Azumi.


international conference on parallel and distributed systems | 2013

Data Transfer Matters for GPU Computing

Yusuke Fujii; Takuya Azumi; Nobuhiko Nishio; Shinpei Kato; Masato Edahiro

Although the expectation maximization (EM)-based 3D computed tomography (CT) reconstruction algorithm lowers radiation exposure, its long execution time hinders practical usage. To accelerate this process, we introduce a novel external memory bandwidth reduction strategy that reuses both the sinogram and the voxel intensity. We also present a customized computing engine based on a field-programmable gate array (FPGA) to increase the effective memory bandwidth. Experiments on actual patient data show that an 85x speedup can be achieved over a single-threaded CPU.

Graphics processing units (GPUs) embrace many-core compute devices where massively parallel compute threads are offloaded from CPUs. This heterogeneous nature of GPU computing raises non-trivial data transfer problems, especially for latency-critical real-time systems. However, even the basic characteristics of data transfers associated with GPU computing are not well studied in the literature. In this paper, we investigate and characterize currently achievable data transfer methods of cutting-edge GPU technology. We implement these methods using open-source software to compare their performance and latency for real-world systems. Our experimental results show that the hardware-assisted direct memory access (DMA) and the I/O read-and-write access methods are usually the most effective, while on-chip microcontrollers inside the GPU are useful for reducing the data transfer latency of multiple concurrent data streams. We also show that CPU priorities can protect the performance of GPU data transfers.


international symposium on object/component/service-oriented real-time distributed computing | 2007

A New Specification of Software Components for Embedded Systems

Takuya Azumi; Masanari Yamamoto; Yasuo Kominami; Nobuhisa Takagi; Hiroshi Oyama; Hiroaki Takada

In the last decade, the size and complexity of the software in embedded systems have increased. The present study attempts to decrease the complexity and difficulty of software development in embedded systems. We herein introduce a new component system that is suitable for embedded systems. Because the proposed system adopts a static configuration, the memory consumption of an entire application can be estimated. In addition, the system is intended for use in several domains of embedded systems because it supports several component granularities. Moreover, the concept of a component for distributed applications is presented.


international symposium on object/component/service-oriented real-time distributed computing | 2010

Wheeled Inverted Pendulum with Embedded Component System: A Case Study

Takuya Azumi; Hiroaki Takada; Takayuki Ukai; Hiroshi Oyama

Software component techniques have been widely used to improve software development and reduce its cost. We herein introduce a component system with a real-time operating system (RTOS). A case study of a two-wheeled inverted pendulum balancing robot built with the component system is presented. The component system can handle RTOS resources, such as tasks and semaphores, as components. Moreover, we introduce a new trace functionality that confirms the state of components and the calls between components without modifying the C source code.


Journal of Information Processing | 2014

TECS Components Providing Functionalities of OSEK Specifications for ITRON OS

Atsushi Ohno; Takuya Azumi; Nobuhiko Nishio

The number of electronic control units (ECUs) has increased to manage complicated vehicle systems. Many kinds of operating systems run on ECUs: ITRON OS, OSEK OS, and so forth. Currently, developers implement system control software according to the ITRON and OSEK specifications independently. Even though these OSes provide similar functionalities, the OSEK specifications differ from the ITRON specifications in several respects, such as scheduling policies (non-preemptive scheduling), alarms, hook routines, and several system calls. Thus, to use legacy software that follows the OSEK specifications on the ITRON OS, developers have to port the software to ITRON OS. This paper presents a component-based framework that fills the gap between the OSEK and ITRON specifications by using TECS (TOPPERS Embedded Component System). TECS is a high-level abstraction component system for enhancing the reusability of software. Our method reduces the work required to port legacy OSEK applications built with TECS components to ITRON applications built for TECS. Examples of the framework's characteristics are: (1) non-preemptive scheduling tasks are implemented by changing the priority of the task to the highest priority; (2) the system works as the OSEK alarm based on a counter value, which is incremented at an arbitrary time interval; (3) OSEK hook routines are also available with a particular timing. Experimental results demonstrate that the overhead of the corresponding system calls compared to the original OSEK system calls is reduced to within 13.58 μsec.
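Characteristics (1) and (2) above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the TECS, OSEK, or ITRON API: the names `Task`, `Counter`, `CounterAlarm`, and `HIGHEST_PRIORITY` are hypothetical, and the convention that a smaller number means a higher priority is assumed here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Assumption for this sketch: a smaller number means a higher priority.
HIGHEST_PRIORITY = 1


@dataclass
class Task:
    """Hypothetical task model, not a real RTOS object."""
    name: str
    base_priority: int
    non_preemptive: bool = False
    current_priority: int = field(init=False)

    def __post_init__(self) -> None:
        self.current_priority = self.base_priority

    def on_start(self) -> None:
        # Characteristic (1): emulate non-preemptive scheduling by raising
        # the task to the highest priority while it runs.
        if self.non_preemptive:
            self.current_priority = HIGHEST_PRIORITY

    def on_terminate(self) -> None:
        # Restore the configured priority when the task finishes.
        self.current_priority = self.base_priority


@dataclass
class CounterAlarm:
    """Hypothetical one-shot alarm that expires at a given counter value."""
    expiry: int
    action: Callable[[], None]
    armed: bool = True


class Counter:
    """Characteristic (2): a counter incremented at some fixed interval."""

    def __init__(self) -> None:
        self.value = 0
        self.alarms: List[CounterAlarm] = []

    def tick(self) -> None:
        # Called once per time interval; fires each armed alarm exactly once.
        self.value += 1
        for alarm in self.alarms:
            if alarm.armed and self.value >= alarm.expiry:
                alarm.armed = False
                alarm.action()
```

In this model, a task marked `non_preemptive` cannot be displaced by any other task while it runs, because nothing has a higher priority, and an alarm with `expiry=3` fires on the third call to `tick()`.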


embedded software | 2016

Exploring the performance of ROS2

Yuya Maruyama; Shinpei Kato; Takuya Azumi

Middleware for robotics development must meet demanding requirements in real-time distributed embedded systems. The Robot Operating System (ROS), an open-source middleware, has been widely used for robotics applications. However, ROS is not suitable for real-time embedded systems because it does not satisfy real-time requirements and runs on only a few OSs. To address this problem, ROS1 will undergo a significant upgrade to ROS2 by utilizing the Data Distribution Service (DDS). DDS is suitable for real-time distributed embedded systems due to its various transport configurations (e.g., deadline and fault tolerance) and scalability. ROS2 must convert data for DDS and abstract DDS from its users; however, this incurs additional overhead, which is examined in this study. Transport latencies between ROS2 nodes vary depending on the use case, data size, configuration, and DDS vendor. We conduct a proof of concept for the DDS approach to ROS and organize DDS characteristics and guidelines drawn from various evaluations. By highlighting the DDS capabilities, we explore and evaluate the potential and constraints of DDS and ROS2.


international symposium on object/component/service-oriented real-time distributed computing | 2015

mruby on TECS: Component-Based Framework for Running Script Program

Takuya Azumi; Yuki Nagahara; Hiroshi Oyama; Nobuhiko Nishio

Scripting languages are attractive for embedded systems due to their high productivity. However, it is difficult to use scripting languages in practical applications because their performance and their libraries for managing embedded devices are immature compared to those of the C programming language. This paper proposes a framework for effectively running an mruby script program on embedded systems based on the TOPPERS Embedded Component System (TECS). TECS generates glue code for invoking legacy C code from mruby programs. It also supports configuration of mruby. Experimental results demonstrate the effectiveness of the proposed framework.


international conference on cyber physical systems | 2014

Reducing Data Copies between GPUs and NICs

Anh Nguyen; Yusuke Fujii; Yuki Iida; Takuya Azumi; Nobuhiko Nishio; Shinpei Kato

Cyber-physical systems (CPS) must perform complex algorithms at very high speed to monitor and control complex real-world phenomena. GPUs, with their large number of cores and highly parallel processing, promise better computation if the data parallelism often found in real-world CPS scenarios can be exploited. Nevertheless, their performance is limited by the latency incurred when data are transferred between GPU memory and I/O devices. This paper describes a method, based on zero-copy processing, for data transmission between GPUs and NICs. The arrangement enables NICs to transfer data directly to and from the GPU. Experimental results show effective data throughput without packet loss.


international symposium on object/component/service-oriented real-time distributed computing | 2013

HR-TECS: Component technology for embedded systems with memory protection

Takuya Ishikawa; Takuya Azumi; Hiroshi Oyama; Hiroaki Takada

Software partitioning has been used to develop safety-critical systems in recent years. In addition, software component technologies that support software partitioning have been developed. This paper describes a new component technology for embedded software that requires memory protection, which is one of the important features for partitioning. HR-TECS is a new component technology based on a real-time operating system that supports a static memory layout. Developers can easily allocate components to partitions in order to protect memory areas. In addition, HR-TECS supports inter-partition communication, so developers can implement components without having to handle inter-partition communication themselves. The evaluation results demonstrate the effectiveness of HR-TECS.


computational science and engineering | 2013

Distributed Intent: Android Framework for Networked Devices Operation

Yuki Nagahara; Hiroshi Oyama; Takuya Azumi; Nobuhiko Nishio

We propose a distributed framework that extends Intent in Android systems to improve collaboration with embedded devices. In the proposed framework, Android applications collaborate with embedded devices by sending serialized Intent messages to networked embedded devices. The applications can collaborate with embedded devices without proxy hardware. We demonstrate the framework's effectiveness by evaluating its execution times and memory usage.
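The idea of sending a serialized Intent-like message over a network can be sketched as follows. The abstract does not describe the wire format, so everything here is an assumption made for illustration: the JSON encoding, the 4-byte length prefix, and the hypothetical helpers `serialize_intent` and `deserialize_intent` are not taken from the paper or from the Android API.

```python
import json
import struct
from typing import Any, Dict, Tuple


def serialize_intent(action: str, extras: Dict[str, Any]) -> bytes:
    """Encode an Intent-like message as a length-prefixed JSON frame.

    Assumed format: 4-byte big-endian payload length, then UTF-8 JSON.
    """
    payload = json.dumps(
        {"action": action, "extras": extras}, sort_keys=True
    ).encode("utf-8")
    return struct.pack("!I", len(payload)) + payload


def deserialize_intent(frame: bytes) -> Tuple[str, Dict[str, Any]]:
    """Decode a frame produced by serialize_intent on the receiving device."""
    (length,) = struct.unpack("!I", frame[:4])
    message = json.loads(frame[4:4 + length].decode("utf-8"))
    return message["action"], message["extras"]
```

A sender would write the returned bytes to a socket connected to the embedded device, and the device would read the 4-byte length first to know how many payload bytes to expect.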


international conference on cyber physical systems | 2014

Connected Smartphones and High-Performance Servers for Remote Object Detection

Yuki Iida; Manato Hirabayashi; Takuya Azumi; Nobuhiko Nishio; Shinpei Kato

Object detection is one of the main challenges of cyber-physical systems for mobile applications. Mobile devices, as embedded systems, might not provide sufficient performance to meet the computational requirements of object detection, yet it is not reasonable to equip such devices with powerful processors. In this paper, we present a prototype system for mobile object detection based on smartphones and high-performance servers equipped with graphics processing units (GPUs). The basic idea in the design of this system is to offload expensive image processing to a remote server over a wireless network, which allows mobile devices to support high-precision object detection. We describe an experimental evaluation using Android smartphones and a high-end workstation with NVIDIA GPUs. Our conclusion is that mobile object detection is achievable at the expense of millisecond latency due to the communication between the mobile device and the remote server.

Collaboration


Dive into Takuya Azumi's collaborations.

Top Co-Authors

Yuki Iida

Ritsumeikan University

Jaeyong Rho

Ritsumeikan University
