Publication


Featured research papers published by Takeshi Ogasawara.


Proceedings of the ACM 1999 conference on Java Grande | 1999

Design, implementation, and evaluation of optimizations in a just-in-time compiler

Kazuaki Ishizaki; Motohiro Kawahito; Toshiaki Yasue; Mikio Takeuchi; Takeshi Ogasawara; Toshio Suganuma; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

The Java language incurs a runtime overhead for exception checks and for object accesses without an interior pointer in order to ensure safety. It also requires type inclusion tests, dynamic class loading, and dynamic method calls in order to ensure flexibility. A “Just-In-Time” (JIT) compiler generates native code from Java byte code at runtime. It must improve the runtime performance without compromising the safety and flexibility of the Java language. We designed and implemented effective optimizations for the JIT compiler, such as exception check elimination, common subexpression elimination, simple type inclusion tests, method inlining, and resolution of dynamic method calls. We evaluate the performance benefits of these optimizations based on various statistics collected using SPECjvm98 and two JavaSoft applications with byte code sizes ranging from 20,000 to 280,000 bytes. Each optimization contributes to an improvement in the performance of the programs.
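
As a rough illustration of the first of these optimizations, exception check elimination, consider the following hand-written Java sketch (ours, not the paper's); the comments mark the checks a JIT compiler can prove redundant.

```java
// A hand-written illustration (not from the paper) of exception check
// elimination: the checks a Java JIT compiler must conceptually perform,
// and which of them it can prove redundant.
public class ExceptionCheckDemo {
    static int sum(int[] a) {
        // The JIT must guard against a == null and out-of-bounds indices.
        // Reading a.length in the loop header already faults if a is null,
        // so it covers every a[i] in the body; and the loop condition
        // i < a.length subsumes the per-access bounds check.
        int s = 0;
        for (int i = 0; i < a.length; i++) { // a.length implies a != null
            s += a[i]; // bounds check provably redundant: 0 <= i < a.length
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3})); // prints 6
    }
}
```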


Concurrency and Computation: Practice and Experience | 2000

Design, implementation, and evaluation of optimizations in a Java™ Just-In-Time compiler

Kazuaki Ishizaki; Motohiro Kawahito; Toshiaki Yasue; Mikio Takeuchi; Takeshi Ogasawara; Toshio Suganuma; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

The Java language incurs a runtime overhead for exception checks and object accesses, which are executed without an interior pointer, in order to ensure safety. It also requires type inclusion tests, dynamic class loading, and dynamic method calls in order to ensure flexibility. A ‘Just-In-Time’ (JIT) compiler generates native code from Java byte code at runtime. It must improve the runtime performance without compromising the safety and flexibility of the Java language. We designed and implemented effective optimizations for a JIT compiler, such as exception check elimination, common subexpression elimination, simple type inclusion tests, method inlining, and devirtualization of dynamic method calls. We evaluate the performance benefits of these optimizations based on various statistics collected using SPECjvm98 and its candidates, and two JavaSoft applications with byte code sizes ranging from 23,000 to 280,000 bytes. Each optimization contributes to an improvement in the performance of the programs.
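
To make the devirtualization of dynamic method calls concrete, here is a hand-written before/after sketch (our example, not code from the paper); the class names are invented. When the compiler predicts the receiver's class, it can replace the virtual dispatch with a cheap class test guarding a direct, inlinable call.

```java
// Illustrative sketch of guarded devirtualization (our example, not code
// from the paper). The JIT predicts that `shape` is usually a Circle and
// guards a direct call with a class test, keeping the virtual call as a
// fallback for other receiver classes.
abstract class Shape { abstract double area(); }

final class Circle extends Shape {
    final double r;
    Circle(double r) { this.r = r; }
    double area() { return Math.PI * r * r; }
}

public class DevirtDemo {
    // What the source says: a dynamic (virtual) call on every invocation.
    static double areaVirtual(Shape shape) {
        return shape.area();
    }

    // What the compiler conceptually generates after devirtualization.
    static double areaDevirtualized(Shape shape) {
        if (shape instanceof Circle) {          // cheap class-test guard
            Circle c = (Circle) shape;
            return Math.PI * c.r * c.r;         // direct call, inlined
        }
        return shape.area();                    // slow path: virtual dispatch
    }

    public static void main(String[] args) {
        System.out.println(areaDevirtualized(new Circle(1.0)));
    }
}
```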


conference on object-oriented programming systems, languages, and applications | 2001

A study of exception handling and its dynamic optimization in Java

Takeshi Ogasawara; Hideaki Komatsu; Toshio Nakatani

Optimizing exception handling is critical for programs that frequently throw exceptions. We observed that there are many such exception-intensive programs in various categories of Java programs. There are two commonly used exception handling techniques: stack unwinding optimizes the normal path, while stack cutting optimizes the exception handling path. However, there has been no single exception handling technique that optimizes both paths.
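
For illustration, the fragment below is our own minimal example (not from the paper) of exception-intensive code: exceptions are thrown on a common input, so the cost of the exception handling path matters as much as that of the normal path.

```java
// Hand-written example of exception-intensive code (not from the paper):
// malformed tokens are common, so NumberFormatException is thrown on the
// hot path and the cost of reaching the handler matters as much as the
// cost of the normal path.
public class ExceptionIntensive {
    static long sumNumericTokens(String[] tokens) {
        long sum = 0;
        for (String t : tokens) {
            try {
                sum += Long.parseLong(t); // normal path
            } catch (NumberFormatException e) {
                // exception path, taken for every non-numeric token
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumNumericTokens(new String[] {"1", "x", "41"}));
        // prints 42
    }
}
```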


conference on object-oriented programming systems, languages, and applications | 2003

Effectiveness of cross-platform optimizations for a Java just-in-time compiler

Kazuaki Ishizaki; Mikio Takeuchi; Kiyokuni Kawachiya; Toshio Suganuma; Osamu Gohda; Tatsushi Inagaki; Akira Koseki; Kazunori Ogata; Motohiro Kawahito; Toshiaki Yasue; Takeshi Ogasawara; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

This paper gives a system overview of our Java Just-In-Time (JIT) compiler, which is the basis for the latest production version of the IBM Java JIT compiler and supports a diversity of processor architectures, including both 32-bit and 64-bit modes and CISC, RISC, and VLIW architectures. In particular, we focus on the design and evaluation of the cross-platform optimizations that are common across the different architectures. We studied the effectiveness of each optimization by selectively disabling it in our JIT compiler on three different platforms: IA-32, IA-64, and PowerPC. Our detailed measurements allowed us to rank the optimizations in terms of the greatest performance improvements for the smallest compilation times. The identified set includes method inlining only for tiny methods, exception check elimination using forward dataflow analysis and partial redundancy elimination, scalar replacement for instance and class fields using dataflow analysis, optimizations for type inclusion checks, and the elimination of merge points in the control flow graphs. These optimizations achieve 90% of the peak performance for two industry-standard benchmark programs on these platforms with only 34% of the compilation time compared to using all of the optimizations.
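
As one concrete example from this set, the following sketch illustrates scalar replacement of an instance field (our illustration, not the paper's code): a repeated field load is replaced by a local variable once dataflow analysis proves nothing in the loop can change the field.

```java
// Illustration (ours, not the paper's) of scalar replacement for instance
// fields: the field load is hoisted into a local once dataflow analysis
// shows nothing in the loop can modify it.
public class ScalarReplacementDemo {
    static class Point { int x; }

    // Before: p.x is loaded from memory on every iteration.
    static int sumBefore(Point p, int n) {
        int s = 0;
        for (int i = 0; i < n; i++) s += p.x;
        return s;
    }

    // After: the compiler keeps p.x in a register-like local; the loop
    // body touches no memory.
    static int sumAfter(Point p, int n) {
        int x = p.x; // single load, legal because the loop never writes p.x
        int s = 0;
        for (int i = 0; i < n; i++) s += x;
        return s;
    }

    public static void main(String[] args) {
        Point p = new Point();
        p.x = 2;
        System.out.println(sumBefore(p, 3) + " " + sumAfter(p, 3)); // 6 6
    }
}
```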


conference on object-oriented programming systems, languages, and applications | 2009

NUMA-aware memory manager with dominant-thread-based copying GC

Takeshi Ogasawara

We propose a novel online method of identifying the preferred NUMA nodes for objects, with negligible overhead at both garbage collection time and object allocation time. Since the number of CPUs (and NUMA nodes) keeps increasing, it is critical for the memory manager of an object-oriented language runtime to exploit the low latency of local memory for high performance. To locate the CPU of a thread that frequently accesses an object, prior research uses runtime information about memory accesses as sampled by the hardware. However, the overhead of this approach is high for a garbage collector. Our approach instead uses information about which thread can exclusively access an object, its Dominant Thread (DoT). The dominant thread of an object is typically the thread that accesses the object most often, so we do not require memory access samples. Our NUMA-aware GC performs DoT-based object copying, which copies each live object to the CPU where its dominant thread was last dispatched before the GC. The dominant thread information is known from the thread stacks and from objects that are locked or reserved by threads, and is propagated through the object reference graph. We demonstrate that our approach can improve the performance of benchmark programs such as SPECpower_ssj2008, SPECjbb2005, and SPECjvm2008. We prototyped a NUMA-aware memory manager on a modified version of the IBM Java VM and tested it on a cc-NUMA POWER6 machine with eight NUMA nodes. Our NUMA-aware GC achieved performance improvements of up to 14.3%, and 2.0% on average, over a JVM that used only the NUMA-aware allocator. The total improvement using both the NUMA-aware allocator and GC is up to 53.1%, and 10.8% on average.
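
The following is a very rough model of the dominant-thread propagation step, reconstructed from the abstract above (it is not the authors' implementation, and all names are ours): objects whose dominant thread is already known, such as those reachable from a thread stack or locked by a thread, seed a traversal that propagates the DoT through the object reference graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Rough model (not the authors' code) of dominant-thread (DoT) propagation.
// Objects whose dominant thread is already known, because they are reachable
// from a thread stack or are locked/reserved by a thread, seed the traversal;
// the DoT then flows along references. A NUMA-aware copying GC would place
// each object on the NUMA node where its dominant thread last ran.
public class DoTPropagationSketch {
    static final int UNKNOWN = -1;

    static class HeapObject {
        int dominantThread = UNKNOWN; // thread id, or UNKNOWN
        final List<HeapObject> refs = new ArrayList<>();
    }

    // roots: objects whose dominantThread is already set by the seeds above.
    static void propagate(List<HeapObject> roots) {
        Deque<HeapObject> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            HeapObject o = work.pop();
            for (HeapObject child : o.refs) {
                if (child.dominantThread == UNKNOWN) {
                    child.dominantThread = o.dominantThread; // inherit DoT
                    work.push(child);
                }
            }
        }
        // The copying GC would now copy each object to the NUMA node of its
        // dominant thread; that placement step is omitted in this model.
    }
}
```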


embedded and real-time computing systems and applications | 1995

An algorithm with constant execution time for dynamic storage allocation

Takeshi Ogasawara

The predictability of the computation time of program modules is very important for estimating an accurate worst-case execution time (WCET) of a task in real-time systems. Dynamic storage allocation (DSA) is a common programming technique. Although many DSA algorithms have been developed, they focus on the average execution time rather than the WCET, making it very difficult to calculate their WCET accurately. In this paper, we propose a new algorithm called Half-Fit whose WCET can be calculated accurately. The algorithm finds a free block without searching a free list or tree, at the cost of some extra unusable memory, called incomplete memory use. In a simulation following an M/G/∞ queueing model, Half-Fit uses storage more efficiently than the binary buddy system, a conventional algorithm whose WCET can also be calculated.
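
The sketch below shows the constant-time lookup idea at the heart of such an allocator, reconstructed from the abstract (a sketch under our own assumptions, not the paper's code): segregated free lists indexed by powers of two, plus a bitmap that a single bit-scan can search, so allocation never walks a list or tree.

```java
// A minimal sketch of a constant-time free-list lookup in the spirit of
// Half-Fit (our reconstruction; class and method names are illustrative).
public class HalfFitSketch {
    private static final int NUM_LISTS = 32;
    private final FreeBlock[] freeLists = new FreeBlock[NUM_LISTS];
    private int nonEmptyBitmap; // bit i set => freeLists[i] is non-empty

    static final class FreeBlock {
        final int size;
        FreeBlock next;
        FreeBlock(int size) { this.size = size; }
    }

    // Allocation never searches a list: it rounds the request up to the
    // next power of two and uses one bit-scan over the bitmap, so the
    // worst-case execution time is a small constant.
    public FreeBlock allocate(int size) {
        // Index of the smallest class guaranteed to fit the request.
        int idx = 32 - Integer.numberOfLeadingZeros(Math.max(size, 1) - 1);
        // Mask off classes that are too small, then take the lowest set bit.
        int candidates = nonEmptyBitmap & (-1 << idx);
        if (candidates == 0) return null; // out of memory
        int listIdx = Integer.numberOfTrailingZeros(candidates);
        FreeBlock block = freeLists[listIdx];
        freeLists[listIdx] = block.next;
        if (freeLists[listIdx] == null) nonEmptyBitmap &= ~(1 << listIdx);
        // Any unused tail of the block is the "incomplete memory use"
        // the abstract mentions.
        return block;
    }

    public void free(FreeBlock block) {
        // A block goes on the list for the largest power of two it contains.
        int idx = 31 - Integer.numberOfLeadingZeros(block.size);
        block.next = freeLists[idx];
        freeLists[idx] = block;
        nonEmptyBitmap |= 1 << idx;
    }
}
```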


conference on object-oriented programming systems, languages, and applications | 2012

On the benefits and pitfalls of extending a statically typed language JIT compiler for dynamic scripting languages

José G. Castaños; David Joel Edelsohn; Kazuaki Ishizaki; Priya Nagpurkar; Toshio Nakatani; Takeshi Ogasawara; Peng Wu

Whenever the need to compile a new dynamically typed language arises, an appealing option is to repurpose an existing statically typed language Just-In-Time (JIT) compiler (a repurposed JIT, or RJIT, compiler). Existing RJIT compilers, however, have not yet delivered the hoped-for performance boosts. The performance of JVM languages, for instance, often lags behind standard interpreter implementations. Even more customized solutions that extend the internals of a JIT compiler for the target language compete poorly with those designed specifically for dynamically typed languages. Our own Fiorano JIT compiler is an example of this problem. As a state-of-the-art RJIT compiler for Python, the Fiorano JIT compiler outperforms two other RJIT compilers (Unladen Swallow and Jython), but still shows a noticeable performance gap compared to PyPy, today's best-performing Python JIT compiler. In this paper, we discuss techniques that have proved effective in the Fiorano JIT compiler as well as limitations of our current implementation. More importantly, this work offers the first in-depth look at the benefits and limitations of the repurposed JIT compiler approach. We believe the most common pitfall of existing RJIT compilers is not focusing sufficiently on specialization, an abundant optimization opportunity unique to dynamically typed languages. Unfortunately, the lack of specialization cannot be overcome by applying traditional optimizations.
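
To illustrate the kind of specialization the paper argues is essential, here is our own sketch (not the Fiorano JIT's code): a generic dynamic-language add is specialized to the integer case behind a cheap type guard, with the generic path kept as a fallback.

```java
// Hand-written sketch of type specialization for a dynamic language's "+"
// (ours, not the Fiorano JIT's code). Dynamic-language values are modeled
// as Object; the specialized path guards on the profiled operand types.
public class SpecializationSketch {
    // Generic semantics: dispatch on the runtime types of both operands.
    static Object genericAdd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer)
            return (Integer) a + (Integer) b;
        if (a instanceof String || b instanceof String)
            return String.valueOf(a) + b;
        throw new UnsupportedOperationException("unsupported operand types");
    }

    // Specialized code the JIT can emit after profiling shows both operands
    // are almost always ints: one guard, then unboxed arithmetic.
    static Object specializedAdd(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) { // type guard
            return ((Integer) a).intValue() + ((Integer) b).intValue();
        }
        return genericAdd(a, b); // fall back to the generic path
    }

    public static void main(String[] args) {
        System.out.println(specializedAdd(40, 2)); // 42
    }
}
```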


virtual execution environments | 2012

Adding dynamically-typed language support to a statically-typed language compiler: performance evaluation, analysis, and tradeoffs

Kazuaki Ishizaki; Takeshi Ogasawara; José G. Castaños; Priya Nagpurkar; David Joel Edelsohn; Toshio Nakatani

Applications written in dynamically typed scripting languages are increasingly popular for Web software development. Even on the server side, programmers are using dynamically typed scripting languages such as Ruby and Python to build complex applications quickly. As the number and complexity of dynamically typed scripting language applications grow, optimizing their performance is becoming important. Some of the best-performing compilers and optimizers for dynamically typed scripting languages are developed entirely from scratch and target a specific language. This approach is not scalable, given the variety of dynamically typed scripting languages and the effort involved in developing and maintaining separate infrastructures for each. In this paper, we evaluate the feasibility of adapting and extending an existing production-quality method-based Just-In-Time (JIT) compiler for a language with dynamic types. Our goal is to identify the challenges and shortcomings of the current infrastructure, and to propose and evaluate runtime techniques and optimizations that can be incorporated into a common optimization infrastructure for static and dynamic languages. We discuss three extensions to the compiler to support dynamically typed languages: (1) simplification of control flow graphs, (2) mapping of memory locations to stack-allocated variables, and (3) reduction of runtime overhead using language semantics. We also propose four new optimizations for Python within (2) and (3). These extensions effectively reduce the compiler's working memory and improve runtime performance. We present a detailed performance evaluation of our approach for Python, finding an overall improvement of 1.69x on average (up to 2.74x) over our JIT compiler without any of the optimizations for dynamically typed languages and Python.
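
As a simplified picture of extension (2), mapping memory locations to stack-allocated variables, consider this sketch (our reconstruction, not the paper's implementation): variables held in a heap-allocated interpreter frame are promoted to ordinary locals when the frame does not escape.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified illustration (ours, not the paper's code) of mapping memory
// locations to stack-allocated variables. A dynamic language interpreter
// keeps variables in a heap-allocated frame; when the JIT proves the frame
// does not escape, each slot can become an ordinary local variable.
public class FrameMappingSketch {
    // Before: every read/write of "i" and "s" goes through the heap map.
    static Object interpretSum(int n) {
        Map<String, Object> frame = new HashMap<>();
        frame.put("s", 0);
        for (frame.put("i", 0);
             (Integer) frame.get("i") < n;
             frame.put("i", (Integer) frame.get("i") + 1)) {
            frame.put("s", (Integer) frame.get("s") + (Integer) frame.get("i"));
        }
        return frame.get("s");
    }

    // After: the frame slots are mapped to stack-allocated locals.
    static Object compiledSum(int n) {
        int s = 0;                    // slot "s"
        for (int i = 0; i < n; i++) { // slot "i"
            s += i;
        }
        return s;                     // boxed only at the boundary
    }

    public static void main(String[] args) {
        System.out.println(interpretSum(5) + " " + compiledSum(5)); // 10 10
    }
}
```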


IBM Journal of Research and Development | 2004

Evolution of a Java just-in-time compiler for IA-32 platforms

Toshio Suganuma; Takeshi Ogasawara; Kiyokuni Kawachiya; Mikio Takeuchi; Kazuaki Ishizaki; Akira Koseki; Tatsushi Inagaki; Toshiaki Yasue; Motohiro Kawahito; Tamiya Onodera; Hideaki Komatsu; Toshio Nakatani

Java™ has gained widespread popularity in the industry, and an efficient Java virtual machine (JVM™) and just-in-time (JIT) compiler are crucial in providing high performance for Java applications. This paper describes the design and implementation of our JIT compiler for IA-32 platforms, focusing on the advances achieved in the past several years. We first present the dynamic optimization framework, which focuses the expensive optimization efforts only on performance-critical methods, thus helping to manage the total compilation overhead. We then describe the platform-independent features, which include the conversion from the stack-semantic Java bytecode into our register-based intermediate representation (IR) and a variety of aggressive optimizations applied to the IR. We also present some techniques specific to IA-32 that improve code quality, especially for the efficient use of the small number of registers on that platform. Using several industry-standard benchmark programs, our experimental results show that this approach offers high performance with low compilation overhead. Most of the techniques presented here are included in the IBM JIT compiler product, integrated into the IBM Development Kit for Microsoft Windows®, Java Technology Edition Version 1.4.0.
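
A minimal sketch of the counter-driven idea behind such a dynamic optimization framework follows; the levels, names, and thresholds here are invented for illustration and are not the product's actual values.

```java
// Minimal sketch (ours; thresholds and names are invented) of counter-driven
// tiering in a dynamic optimization framework: cheap compilation first, with
// expensive optimizations reserved for methods whose invocation counters
// show they are performance-critical.
public class TieringSketch {
    enum Level { INTERPRETED, QUICK_JIT, FULL_OPT }

    static final int QUICK_THRESHOLD = 100;      // invented threshold
    static final int FULL_OPT_THRESHOLD = 10_000; // invented threshold

    static final class MethodProfile {
        int invocations;
        Level level = Level.INTERPRETED;
    }

    // Called on method entry; promotes the method when it gets hot.
    static void onInvoke(MethodProfile m) {
        m.invocations++;
        if (m.level == Level.INTERPRETED && m.invocations >= QUICK_THRESHOLD) {
            m.level = Level.QUICK_JIT;    // fast, lightly optimizing compile
        } else if (m.level == Level.QUICK_JIT
                && m.invocations >= FULL_OPT_THRESHOLD) {
            m.level = Level.FULL_OPT;     // aggressive recompilation
        }
    }
}
```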


ieee international symposium on workload characterization | 2014

Workload characterization of server-side JavaScript

Takeshi Ogasawara

Scripting languages are becoming increasingly popular for developing server-side applications. Modern JavaScript compilers significantly optimize JavaScript code, but their main targets are client-side Web applications. In this paper, we characterize the runtime behavior of server workloads on an emerging server-side JavaScript framework, Node.js, comparing them to client-side JavaScript code. The runtime profile shows that a large fraction (47.5% on average) of the total CPU time is spent in the V8 C++ library for server workloads, while only 3.2% is for client-side programs. Such high CPU usage in the C++ code limits the potential performance improvements for server workloads, since recent improvements in the performance of client-side workloads have come from optimizing JavaScript code. Our analysis of complex calling contexts reveals that function calls into the V8 runtime from the server-side framework to handle JavaScript objects contribute significantly to the high CPU usage in the V8 library (up to 22.5% of the total time and 15.4% on average). We also show that function calls into the V8 runtime from compiled JavaScript code are another source of the high CPU usage in the V8 library code (up to 24.5% of the total and 18.7% on average). Only a few JavaScript functions that implement the server-side framework API make these calls into the V8 runtime, yet they account for a large share of the CPU time (up to 6.8% of the total).
