Tamiya Onodera | Researchain

Publication

Featured research published by Tamiya Onodera.

Proceedings of the ACM 1999 conference on Java Grande | 1999

Design, implementation, and evaluation of optimizations in a just-in-time compiler

Kazuaki Ishizaki; Motohiro Kawahito; Toshiaki Yasue; Mikio Takeuchi; Takeshi Ogasawara; Toshio Suganuma; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

The Java language incurs a runtime overhead for exception checks and object accesses without an interior pointer in order to ensure safety. It also requires type inclusion tests, dynamic class loading, and dynamic method calls in order to ensure flexibility. A “Just-In-Time” (JIT) compiler generates native code from Java byte code at runtime. It must improve the runtime performance without compromising the safety and flexibility of the Java language. We designed and implemented effective optimizations for the JIT compiler, such as exception check elimination, common subexpression elimination, simple type inclusion tests, method inlining, and resolution of dynamic method calls. We evaluate the performance benefits of these optimizations based on various statistics collected using SPECjvm98 and two JavaSoft applications with byte code sizes ranging from 20000 to 280000 bytes. Each optimization contributes to an improvement in the performance of the programs.

conference on object-oriented programming systems, languages, and applications | 2002

Lock reservation: Java locks can mostly do without atomic operations

Kiyokuni Kawachiya; Akira Koseki; Tamiya Onodera

Because of the built-in support for multi-threaded programming, Java programs perform many lock operations. Although the overhead has been significantly reduced in recent virtual machines, one or more atomic operations are required for acquiring and releasing an object's lock even in the fastest cases. This paper presents a novel algorithm called lock reservation. It exploits thread locality of Java locks, which claims that the locking sequence of a Java lock contains a very long repetition of a specific thread. The algorithm allows locks to be reserved for threads. When a thread attempts to acquire a lock, it can do without any atomic operation if the lock is reserved for the thread. Otherwise, it cancels the reservation and falls back to a conventional locking algorithm. We have evaluated an implementation of lock reservation in IBM's production virtual machine and compiler. The results show that it achieved performance improvements of up to 53% in real Java programs.
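
The saving described in this abstract can be illustrated with a toy cost model. This is a minimal sketch, not the paper's algorithm: the class name `LockReservationCostModel`, the integer thread IDs, and the costs (one atomic operation per conventional acquire/release pair, one to place a reservation, one for the acquire that cancels it) are all assumptions made for illustration. The real cost of cancelling a reservation is not modeled.

```java
final class LockReservationCostModel {
    /** Conventional locking: assume one atomic operation per acquire/release pair. */
    static int conventionalAtomicOps(int[] acquirerIds) {
        return acquirerIds.length;
    }

    /**
     * Reservation: the first acquirer reserves the lock. While the reservation
     * holds, the reserving thread pays nothing. The first acquire by any other
     * thread cancels the reservation, and every later acquire is conventional.
     * Thread IDs are assumed to be non-negative.
     */
    static int reservationAtomicOps(int[] acquirerIds) {
        final int none = -1;
        int reservedFor = none;
        boolean cancelled = false;
        int atomicOps = 0;
        for (int threadId : acquirerIds) {
            if (cancelled) {
                atomicOps++;              // fallback: conventional path
            } else if (reservedFor == none) {
                reservedFor = threadId;   // assumption: reserving costs one atomic op
                atomicOps++;
            } else if (reservedFor != threadId) {
                cancelled = true;         // another thread arrives: cancel, then lock
                atomicOps++;
            }
            // Otherwise the lock is reserved for this thread: no atomic op.
        }
        return atomicOps;
    }
}
```

Under these assumptions, a locking sequence dominated by one thread needs only a handful of atomic operations, while a sequence that alternates between threads gains nothing once the reservation is cancelled.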

Concurrency and Computation: Practice and Experience | 2000

Design, implementation, and evaluation of optimizations in a Java™ Just-In-Time compiler

Kazuaki Ishizaki; Motohiro Kawahito; Toshiaki Yasue; Mikio Takeuchi; Takeshi Ogasawara; Toshio Suganuma; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

The Java language incurs a runtime overhead for exception checks and object accesses, which are executed without an interior pointer in order to ensure safety. It also requires type inclusion test, dynamic class loading, and dynamic method calls in order to ensure flexibility. A ‘Just-In-Time’ (JIT) compiler generates native code from Java byte code at runtime. It must improve the runtime performance without compromising the safety and flexibility of the Java language. We designed and implemented effective optimizations for a JIT compiler, such as exception check elimination, common subexpression elimination, simple type inclusion test, method inlining, and devirtualization of dynamic method call. We evaluate the performance benefits of these optimizations based on various statistics collected using SPECjvm98, its candidates, and two JavaSoft applications with byte code sizes ranging from 23 000 to 280 000 bytes. Each optimization contributes to an improvement in the performance of the programs.

programming language design and implementation | 2003

Stride prefetching by dynamically inspecting objects

Tatsushi Inagaki; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

Software prefetching is a promising technique to hide cache miss latencies, but it remains challenging to effectively prefetch pointer-based data structures because obtaining the memory address to be prefetched requires pointer dereferences. The recently proposed stride prefetching overcomes this problem, but it only exploits inter-iteration stride patterns and relies on an off-line profiling method. We propose a new algorithm for stride prefetching which is intended for use in a dynamic compiler. We exploit both inter- and intra-iteration stride patterns, which we discover using an ultra-lightweight profiling technique, called object inspection. This is a kind of partial interpretation that only a dynamic compiler can perform. During the compilation of a method, the dynamic compiler gathers the profile information by partially interpreting the method using the actual values of parameters and causing no side effects. We evaluated an implementation of our prefetching algorithm in a production-level Java just-in-time compiler. The results show that the algorithm achieved up to an 18.9% and 25.1% speedup in industry-standard benchmarks on the Pentium 4 and the Athlon MP, respectively, while it increased the compilation time by less than 3.0%.
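
To make the notion of an inter-iteration stride pattern concrete, the sketch below looks for a dominant constant difference in a trace of object addresses, one address per loop iteration. It is an illustration only and does not reproduce object inspection: the class `StrideDetector`, the address trace, and the `minShare` threshold are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

final class StrideDetector {
    /**
     * Returns the most frequent non-zero difference between consecutive
     * addresses if it accounts for at least minShare of all differences.
     */
    static OptionalLong dominantStride(long[] addresses, double minShare) {
        if (addresses.length < 2) {
            return OptionalLong.empty();
        }
        // Histogram of differences between consecutive addresses.
        Map<Long, Integer> counts = new HashMap<>();
        for (int i = 1; i < addresses.length; i++) {
            counts.merge(addresses[i] - addresses[i - 1], 1, Integer::sum);
        }
        long best = 0;
        int bestCount = 0;
        for (Map.Entry<Long, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        int deltas = addresses.length - 1;
        boolean dominant = best != 0 && bestCount >= minShare * deltas;
        return dominant ? OptionalLong.of(best) : OptionalLong.empty();
    }
}
```

If such a stride `s` is found, a compiler could emit a prefetch for `address + k * s` a few iterations ahead without dereferencing any pointer, which is the property the abstract attributes to stride prefetching.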

conference on object oriented programming systems languages and applications | 2003

Effectiveness of cross-platform optimizations for a Java just-in-time compiler

Kazuaki Ishizaki; Mikio Takeuchi; Kiyokuni Kawachiya; Toshio Suganuma; Osamu Gohda; Tatsushi Inagaki; Akira Koseki; Kazunori Ogata; Motohiro Kawahito; Toshiaki Yasue; Takeshi Ogasawara; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

This paper describes the system overview of our Java Just-In-Time (JIT) compiler, which is the basis for the latest production version of the IBM Java JIT compiler. That compiler supports a diversity of processor architectures, including both 32-bit and 64-bit modes and CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the cross-platform optimizations that are common across different architectures. We studied the effectiveness of each optimization by selectively disabling it in our JIT compiler on three different platforms: IA-32, IA-64, and PowerPC. Our detailed measurements allowed us to rank the optimizations in terms of the greatest performance improvements with the smallest compilation times. The identified set includes method inlining only for tiny methods, exception check eliminations using forward dataflow analysis and partial redundancy elimination, scalar replacement for instance and class fields using dataflow analysis, optimizations for type inclusion checks, and the elimination of merge points in the control flow graphs. These optimizations can achieve 90% of the peak performance for two industry-standard benchmark programs on these platforms with only 34% of the compilation time compared to the case for using all of the optimizations.

conference on object-oriented programming systems, languages, and applications | 1999

A study of locking objects with bimodal fields

Tamiya Onodera; Kiyokuni Kawachiya

Object locking can be efficiently implemented by bimodal use of a field reserved in an object. The field is used as a lightweight lock in one mode, while it holds a reference to a heavyweight lock in the other mode. A bimodal locking algorithm recently proposed for Java achieves the highest performance in the absence of contention, and is still fast enough when contention occurs. However, mode transitions inherent in bimodal locking have not yet been fully considered. The algorithm requires busy-wait in the transition from the light mode (inflation), and does not make the reverse transition (deflation) at all. We propose a new algorithm that allows both inflation without busy-wait and deflation, but still maintains an almost maximum level of performance in the absence of contention. We also present statistics on the synchronization behavior of real multithreaded Java programs, which indicate that busy-wait in inflation and absence of deflation can be problematic in terms of robustness and performance. Actually, an implementation of our algorithm shows increased robustness, and achieves performance improvements of up to 13.1% in server-oriented benchmarks.
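
The bimodal use of a single field can be sketched as an encoding of a lock word. The layout below is an assumption made for illustration, not the layout used in the paper: the low bit selects the mode, and the remaining bits hold either the owning thread's ID (light mode) or an index into a table of heavyweight monitors (inflated mode). Thread IDs are assumed to start at 1, so that a zero word means unlocked.

```java
final class BimodalLockWord {
    static final int UNLOCKED = 0;
    static final int INFLATED_BIT = 1; // hypothetical: low bit set means inflated mode

    /** Light mode: the word stores the owning thread's ID. */
    static int thin(int ownerThreadId) {
        return ownerThreadId << 1;
    }

    /** Inflated mode: the word stores a reference (here an index) to a heavyweight lock. */
    static int inflated(int monitorIndex) {
        return (monitorIndex << 1) | INFLATED_BIT;
    }

    static boolean isInflated(int word) {
        return (word & INFLATED_BIT) != 0;
    }

    /** Thread ID in light mode, monitor index in inflated mode. */
    static int payload(int word) {
        return word >>> 1;
    }
}
```

In this picture, inflation replaces a `thin` word with an `inflated` one, and deflation writes `UNLOCKED` (or a `thin` word) back once the heavyweight lock is no longer needed. How to make those transitions safely without busy-wait is the subject of the paper and is not shown here.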

international symposium on software testing and analysis | 2008

Finding bugs in Java Native Interface programs

Goh Kondoh; Tamiya Onodera

In this paper, we describe static analysis techniques for finding bugs in programs using the Java Native Interface (JNI). The JNI is both tedious and error-prone because there are many JNI-specific mistakes that are not caught by a native compiler. This paper is focused on four kinds of common mistakes. First, explicit statements to handle a possible exception need to be inserted after a statement calling a Java method. However, such statements tend to be forgotten. We present a typestate analysis to detect this exception-handling mistake. Second, while the native code can allocate resources in a Java VM, those resources must be manually released, unlike Java. Mistakes in resource management cause leaks and other errors. To detect Java resource errors, we used the typestate analysis also used for detecting general memory errors. Third, if a reference to a Java resource lives across multiple native method invocations, it should be converted into a global reference. However, programmers sometimes forget this rule and, for example, store a local reference in a global variable for later use. We provide a syntax checker that detects this bad coding practice. Fourth, no JNI function should be called in a critical region. If called there, the current thread might block and cause a deadlock. Misinterpreting the end of the critical region, programmers occasionally break this rule. We present a simple typestate analysis to detect an improper JNI function call in a critical region. We have implemented our analysis techniques in a bug-finding tool called BEAM, and executed it on open-source software including JNI code. In the experiment, our analysis techniques found 86 JNI-specific bugs without any overhead and increased the total number of bug reports by 76%.
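
The first mistake, a missing exception check after calling a Java method, lends itself to a small typestate illustration. The sketch below is not BEAM's analysis: it tracks a single "exception may be pending" state along one linear trace of events, whereas a static analysis would consider all paths through the native code. The class `JniExceptionTypestate` and its `Event` values are hypothetical stand-ins for JNI calls.

```java
import java.util.ArrayList;
import java.util.List;

final class JniExceptionTypestate {
    /** Hypothetical abstraction of what native code does at each step. */
    enum Event { CALL_JAVA_METHOD, EXCEPTION_CHECK, OTHER_JNI_CALL }

    /** Returns the indices of JNI calls made while an exception may still be pending. */
    static List<Integer> findUncheckedCalls(List<Event> trace) {
        boolean mayBePending = false;
        List<Integer> warnings = new ArrayList<>();
        for (int i = 0; i < trace.size(); i++) {
            Event event = trace.get(i);
            if (event == Event.EXCEPTION_CHECK) {
                mayBePending = false;   // simplification: a check resets the state
                continue;
            }
            if (mayBePending) {
                warnings.add(i);        // JNI call made without checking first
            }
            if (event == Event.CALL_JAVA_METHOD) {
                mayBePending = true;    // the callee may have thrown
            }
        }
        return warnings;
    }
}
```

A trace of a Java method call followed directly by another JNI call is flagged at the second call, while inserting a check between the two yields no warning.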

Proceedings of the 2012 ACM SIGPLAN X10 Workshop | 2012

X10-based massive parallel large-scale traffic flow simulation

Toyotaro Suzumura; Sei Kato; Takashi Imamichi; Mikio Takeuchi; Hiroki Kanezashi; Tsuyoshi Idé; Tamiya Onodera

Optimizing city transportation for smarter cities can have a major impact on the quality of life in urban areas in terms of economic merits and low environmental load. In many cities of the world, transport authorities are facing common challenges such as worsening congestion, insufficient transport infrastructure, increasing carbon emissions, and growing customer needs. To tackle these challenges, it is highly necessary to have fine-grained and large-scale agent simulation for designing smarter cities. In this paper we propose a large-scale traffic simulation platform built on top of X10, a new distributed and parallel programming language. Experimental results demonstrate linearly scalable performance in simulating large-scale traffic flows of the national Japanese road network and a hundred cities of the world using thousands of CPU cores.

virtual execution environments | 2007

Cloneable JVM: a new approach to start isolated Java applications faster

Kiyokuni Kawachiya; Kazunori Ogata; Daniel Silva; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

Java has been successful particularly for writing applications in the server environment. However, isolation of multiple applications has not been efficiently achieved in Java. Many customers require that their applications are guarded by independent OS processes, but starting a Java application with a new process results in a long sequence of initializations being repeated each time. To date, there has been no way to quickly start a new Java application as an isolated OS process. In this paper, we propose a new isolation approach called Cloneable JVM to eliminate this startup overhead in Java. The key idea is to create a new Java application by copying, or cloning, the already-initialized image of the primary JVM process. Since the clone is already initialized, it can begin actual operations immediately as a new isolated process. This cloning abstraction can support new scenarios for Java, such as user isolation and transaction isolation. We implemented a prototype of the Cloneable JVM by modifying a production JVM on Linux, which provides a new API for cloning constructed on the Isolate API defined in JSR 121. Using this cloning API, several Java applications, including a large production J2EE application server, were modified to demonstrate the isolation scenarios. Evaluations using these prototypes showed that new ready-to-serve Java applications can start up as a new process in less than 5 seconds, which is 4 to 170 times faster than starting these applications from scratch.

european conference on object-oriented programming | 2004

Lock Reservation for Java Reconsidered

Tamiya Onodera; Kiyokuni Kawachiya; Akira Koseki

Lock reservation, a powerful optimization for Java locks, is based on the observation that, in Java, each lock tends to be dominantly acquired and released by a specific thread. Reserving a lock for such a dominant thread allows the owner thread of the lock to acquire and release the lock without any atomic read-modify-write instructions.
