Thorsten von Eicken | Researchain

Publication

Featured research published by Thorsten von Eicken.

international symposium on computer architecture | 1992

Active messages: a mechanism for integrated communication and computation

Thorsten von Eicken; David E. Culler; Seth Copen Goldstein; Klaus E. Schauser

The design challenge for large-scale multiprocessors is (1) to minimize communication overhead, (2) allow communication to overlap computation, and (3) coordinate the two without sacrificing processor cost/performance. We show that existing message passing multiprocessors have unnecessarily high communication costs. Research prototypes of message driven machines demonstrate low communication overhead, but poor processor cost/performance. We introduce a simple communication mechanism, Active Messages, show that it is intrinsic to both architectures, allows cost effective use of the hardware, and offers tremendous flexibility. Implementations on nCUBE/2 and CM-5 are described and evaluated using a split-phase shared-memory extension to C, Split-C. We further show that active messages are sufficient to implement the dynamically scheduled languages for which message driven machines were designed. With this mechanism, latency tolerance becomes a programming/compiling concern. Hardware support for active messages is desirable and we outline a range of enhancements to mainstream processors.

acm sigplan symposium on principles and practice of parallel programming | 1993

LogP: towards a realistic model of parallel computation

David E. Culler; Richard M. Karp; David A. Patterson; Abhijit Sahay; Klaus E. Schauser; Eunice E. Santos; Ramesh Subramonian; Thorsten von Eicken

A vast body of theoretical research has focused either on overly simplistic models of parallel computation, notably the PRAM, or overly specific models that have few representatives in the real world. Both kinds of models encourage exploitation of formal loopholes, rather than rewarding development of techniques that yield performance across a range of current and future parallel machines. This paper offers a new parallel machine model, called LogP, that reflects the critical technology trends underlying parallel computers. It is intended to serve as a basis for developing fast, portable parallel algorithms and to offer guidelines to machine designers. Such a model must strike a balance between detail and simplicity in order to reveal important bottlenecks without making analysis of interesting problems intractable. The model is based on four parameters that specify abstractly the computing bandwidth, the communication bandwidth, the communication delay, and the efficiency of coupling communication and computation. Portable parallel algorithms typically adapt to the machine configuration, in terms of these parameters. The utility of the model is demonstrated through examples that are implemented on the CM-5.
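As a rough illustration of how four such parameters can feed a cost estimate, here is a minimal sketch. The letters L, o, g, and P follow the usual reading of the model's name and are not spelled out in the abstract above; the class, the function names, and the numeric values in the usage note are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    """Hypothetical container for the four model parameters (units: cycles)."""
    L: int  # communication delay: upper bound on network latency of a small message
    o: int  # coupling cost: processor overhead to send or to receive one message
    g: int  # gap: minimum interval between consecutive sends (inverse of bandwidth)
    P: int  # number of processors (aggregate computing bandwidth)


def one_message_time(m: LogP) -> int:
    """Estimated time for one small message: send overhead + latency + receive overhead."""
    return m.o + m.L + m.o


def pipelined_time(m: LogP, n: int) -> int:
    """Estimated time for n small messages sent back to back from one sender to one receiver.

    Successive sends are spaced by max(g, o); the last message then needs
    one full one_message_time to be delivered and received.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) * max(m.g, m.o) + one_message_time(m)
```

With illustrative values such as `LogP(L=6, o=2, g=4, P=8)`, the estimate for a single message is `2 + 6 + 2` cycles, and each additional pipelined message adds `max(g, o)` cycles. An algorithm that adapts to the machine would pick its message counts and fan-out from these numbers rather than hard-coding them.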

conference on object-oriented programming systems, languages, and applications | 1998

JRes: a resource accounting interface for Java

Grzegorz Czajkowski; Thorsten von Eicken

With the spread of the Internet, the computing model on server systems is undergoing several important changes. Recent research ideas concerning dynamic operating system extensibility are finding their way into the commercial domain, resulting in designs of extensible databases and Web servers. In addition, both ordinary users and service providers must deal with untrusted downloadable executable code of unknown origin and intentions. Across the board, Java has emerged as the language of choice for Internet-oriented software. We argue that, in order to realize its full potential in applications dealing with untrusted code, Java needs a flexible resource accounting interface. The design and prototype implementation of such an interface, JRes, is presented in this paper. The interface accounts for heap memory, CPU time, and network resources consumed by individual threads or groups of threads. JRes allows limits to be set on resources available to threads and it can invoke callbacks when these limits are exceeded. The JRes prototype described in this paper is implemented on top of standard Java virtual machines and requires only a small amount of native code.
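The limit-plus-callback pattern described in this abstract can be sketched in a few lines. This is not the JRes API: the class name `ResourceAccount`, its methods, and the callback signature are hypothetical, and the sketch is written in Python rather than Java purely for brevity.

```python
from typing import Callable, List, Tuple

# Hypothetical callback type: (resource name, amount used, limit) -> None
OverLimitCallback = Callable[[str, int, int], None]


class ResourceAccount:
    """Tracks usage of one resource for one thread or thread group (illustrative only)."""

    def __init__(self, resource: str, limit: int, on_exceeded: OverLimitCallback):
        self.resource = resource
        self.limit = limit
        self.used = 0
        self._on_exceeded = on_exceeded

    def charge(self, amount: int) -> None:
        """Record consumption and invoke the callback if the limit is now exceeded."""
        if amount < 0:
            raise ValueError("amount must be non-negative")
        self.used += amount
        if self.used > self.limit:
            self._on_exceeded(self.resource, self.used, self.limit)


# Usage: collect over-limit events for a hypothetical heap budget of 100 bytes.
events: List[Tuple[str, int, int]] = []
heap = ResourceAccount("heap_bytes", 100, lambda r, u, l: events.append((r, u, l)))
heap.charge(60)  # within the limit, no callback
heap.charge(60)  # total is now 120, so the callback fires once
```

The callback decides the policy (log, throttle, or terminate the offending threads), which keeps the accounting mechanism separate from enforcement decisions.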

Journal of Parallel and Distributed Computing | 1993

TAM—a compiler controlled threaded abstract machine

David E. Culler; Seth Copen Goldstein; Klaus E. Schauser; Thorsten von Eicken

The Threaded Abstract Machine (TAM) refines dataflow execution models to address the critical constraints that modern parallel architectures place on the compilation of general-purpose parallel programming languages. TAM defines a self-scheduled machine language of parallel threads, which provides a path from data-flow-graph program representations to conventional control flow. The most important feature of TAM is the way it exposes the interaction between the handling of asynchronous message events, the scheduling of computation, and the utilization of the storage hierarchy. This paper provides a complete description of TAM and codifies the model in terms of a pseudo machine language TL0. Issues in compilation from a high level parallel language to TL0 are discussed in general and specifically in regard to the Id90 language. The implementation of TL0 on the CM-5 multiprocessor is explained in detail. Using this implementation, a cost model is developed for the various TAM primitives. The TAM approach is evaluated on sizable Id90 programs on a 64 processor system. The scheduling hierarchy of quanta and threads is shown to provide substantial locality while tolerating long latencies. This allows the average thread scheduling cost to be extremely low.

conference on high performance computing (supercomputing) | 1996

Low-Latency Communication on the IBM RISC System/6000 SP

Chi-Chao Chang; Grzegorz Czajkowski; Chris Hawblitzel; Thorsten von Eicken

The IBM SP is one of the most powerful commercial MPPs, yet, in spite of its fast processors and high network bandwidth, the SP's communication latency is inferior to older machines such as the TMC CM-5 or Meiko CS-2. This paper investigates the use of Active Messages (AM) communication primitives as an alternative to the standard message passing in order to reduce communication overheads and to offer a good building block for higher layers of software. The first part of this paper describes an implementation of Active Messages (SP AM) which is layered directly on top of the SP's network adapter (TB2). With comparable bandwidth, SP AM's low overhead yields a round-trip latency that is 40% lower than IBM MPL's. The second part of the paper demonstrates the power of AM as a communication substrate by layering Split-C as well as MPI over it. Split-C benchmarks are used to compare the SP to other MPPs and show that low message overhead and high throughput compensate for the SP's high network latency. The MPI implementation is based on the freely available MPICH version and achieves performance equivalent to IBM's MPI-F on the NAS benchmarks.

international symposium on computer architecture | 1993

Evaluation of mechanisms for fine-grained parallel programs in the J-machine and the CM-5

Ellen Spertus; Seth Copen Goldstein; Klaus E. Schauser; Thorsten von Eicken; David E. Culler; William J. Dally

This paper uses an abstract machine approach to compare the mechanisms of two parallel machines: the J-Machine and the CM-5. High-level parallel programs are translated by a single optimizing compiler to a fine-grained abstract parallel machine, TAM. A final compilation step is unique to each machine and optimizes for specifics of the architecture. By determining the cost of the primitives and weighting them by their dynamic frequency in parallel programs, we quantify the effectiveness of the following mechanisms individually and in combination. Efficient processor/network coupling proves valuable. Message dispatch is found to be less valuable without atomic operations that allow the scheduling levels to cooperate. Multiple hardware contexts are of small value when the contexts cooperate and the compiler can partition the register set. Tagged memory provides little gain. Finally, the performance of the overall system is strongly influenced by the performance of the memory system and the frequency of control operations.

Secure Internet programming | 2001

J-Kernel: a capability-based operating system for Java

Thorsten von Eicken; Chi-Chao Chang; Grzegorz Czajkowski; Chris Hawblitzel; Deyu Hu; Daniel Spoonhower

Safe language technology can be used for protection within a single address space. This protection is enforced by the language's type system, which ensures that references to objects cannot be forged. A safe language alone, however, lacks many features taken for granted in more traditional operating systems, such as rights revocation, thread protection, resource management, and support for domain termination. This paper describes the J-Kernel, a portable Java-based protection system that addresses these issues. J-Kernel protection domains can communicate through revocable capabilities, but are prevented from directly sharing unrevocable object references. A number of micro-benchmarks characterize the costs of language-based protection, and an extensible web and telephony server based on the J-Kernel demonstrates the use of language-based protection in a large application.
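The idea of a revocable capability can be illustrated with a small proxy object that stands between a caller and the real target. This is only a sketch of the general pattern, not the J-Kernel interface: the names `Capability`, `RevokedError`, and `Counter` are hypothetical, and Python (unlike a type-safe Java setting) does not actually prevent a caller from reaching the underlying object.

```python
class RevokedError(Exception):
    """Raised when a revoked capability is used."""


class Capability:
    """Indirect, revocable handle to a target object (illustrative only)."""

    def __init__(self, target):
        self._target = target

    def invoke(self, method_name: str, *args):
        """Forward a call to the target, unless the capability has been revoked."""
        if self._target is None:
            raise RevokedError("capability has been revoked")
        return getattr(self._target, method_name)(*args)

    def revoke(self) -> None:
        """Drop the reference so that every later invoke() fails."""
        self._target = None


class Counter:
    """Hypothetical service owned by one protection domain."""

    def __init__(self):
        self.value = 0

    def increment(self) -> int:
        self.value += 1
        return self.value


# Usage: the owning domain hands out the capability, never the Counter itself.
cap = Capability(Counter())
first = cap.invoke("increment")  # forwarded to the target
cap.revoke()                     # the owner withdraws access
```

Because other domains hold only the capability, the owner can cut off access at any time, and dropping the single internal reference also lets the target be reclaimed when its domain terminates.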

operating systems design and implementation | 2002

Luna: a flexible Java protection system

Chris Hawblitzel; Thorsten von Eicken

Extensible Java systems face a difficult trade-off between sharing and protection. On one hand, Java's ability to run different protection domains in a single virtual machine enables domains to share data easily and communicate without address space switches. On the other hand, unrestricted sharing blurs the boundaries between protection domains, making it difficult to terminate domains and enforce restrictions on resource usage. Existing solutions to these problems restrict sharing in an ad-hoc fashion, ruling out many desirable programming styles. This paper presents an extension to Java's type system that systematically addresses the issues of data sharing, revocation, thread control, and resource control. Multiple tasks running in a single virtual machine share data using special remote pointers, which have different types from local pointers. The distinction between local and remote pointers allows the Java runtime system to mediate the communication between tasks without slowing down operations on ordinary pointers. The extensions to Java are implemented by a system called Luna, based on the Guavac and Marmot compilers, extended with special optimizations to support both fast inter-task communication and dynamic access control. The paper describes two applications written in Luna: a simple extensible web server, and an extension of the Squid web cache to support dynamic content generation.

european conference on parallel processing | 1996

Low-Latency Communication over Fast Ethernet

Matt Welsh; Anindya Basu; Thorsten von Eicken

Fast Ethernet (100Base-TX) can provide a low-cost alternative to more esoteric network technologies for high-performance cluster computing. We use a network architecture based on the U-Net approach to implement low-latency and high-bandwidth communication over Fast Ethernet, with performance rivaling (and in some cases exceeding) that of 155 Mbps ATM. U-Net provides protected, user-level access to the network interface and enables application-level round-trip latencies of less than 60μs over Fast Ethernet.

international conference on management of data | 1998

Secure and portable database extensibility

Michael W. Godfrey; Tobias Mayr; Praveen Seshadri; Thorsten von Eicken

The functionality of extensible database servers can be augmented by user-defined functions (UDFs). However, the server's security and stability are concerns whenever new code is incorporated. Recently, there has been interest in the use of Java for database extensibility. This raises several questions: Does Java solve the security problems? How does it affect efficiency? We explore the tradeoffs involved in extending the PREDATOR object-relational database server using Java. We also describe some interesting details of our implementation. The issues examined in our study are security, efficiency, and portability. Our performance experiments compare Java-based extensibility with traditional alternatives in the native language of the server. We explore a variety of UDFs that differ in the amount of computation involved and in the quantity of data accessed. We also qualitatively compare the security and portability of the different alternatives. Our conclusion is that Java-based UDFs are a viable approach in terms of performance. However, there may be challenging design issues in integrating Java UDFs with existing database systems.
