Publication


Featured research published by Heiner Litz.


field-programmable logic and applications | 2011

High Frequency Trading Acceleration Using FPGAs

Christian Leber; Benjamin Geib; Heiner Litz

This paper presents the design of application-specific hardware for accelerating High Frequency Trading applications. It is optimized to achieve the lowest possible latency for interpreting market data feeds and hence enables minimal round-trip times for executing electronic stock trades. The implementation described in this work enables hardware decoding of Ethernet, IP and UDP as well as of the FAST protocol, which is a common protocol for transmitting market feeds. For this purpose, we developed a microcode engine with a corresponding instruction set as well as a compiler, which together provide the flexibility to support a wide range of applied trading protocols. The complete system has been implemented in RTL code and evaluated on an FPGA. Our approach shows a 4x latency reduction in comparison to the conventional software-based approach.


ieee international symposium on parallel distributed processing workshops and phd forum | 2010

Efficient hardware support for the Partitioned Global Address Space

Holger Fröning; Heiner Litz

We present a novel architecture of a communication engine for non-coherent distributed shared memory systems. The shared memory is composed of a set of nodes exporting their memory. Remote memory access is possible by forwarding local load or store transactions to remote nodes. No software layers are involved in a remote access, neither on the origin nor on the target side: a user-level process can directly access remote locations without any kind of software involvement. We have implemented the architecture as an FPGA-based prototype in order to demonstrate the functionality of the complete system. This prototype also allows real-world measurements in order to show the performance potential of this architecture, in particular for fine-grained memory accesses such as those typically used for synchronization tasks.


international conference on parallel processing | 2008

VELO: A Novel Communication Engine for Ultra-Low Latency Message Transfers

Heiner Litz; Holger Froening; Mondrian Nuessle; Ulrich Bruening

This paper presents a novel stateless, virtualized communication engine for sub-microsecond latency. Using a field-programmable gate array (FPGA) based prototype, we show a latency of 970 ns between two machines with our virtualized engine for low overhead (VELO). The FPGA device is directly connected to the CPUs by a HyperTransport link. The described hardware architecture is optimized for small messages and avoids the overhead typically found with direct memory access (DMA) controlled transfers. The stateless approach allows the hardware unit to be used directly by many threads and processes simultaneously. It provides secure user-level communication with an extremely optimized start-up phase. Microbenchmark results are reported for both a proprietary API and OpenMPI.


architectural support for programming languages and operating systems | 2014

SI-TM: reducing transactional memory abort rates through snapshot isolation

Heiner Litz; David R. Cheriton; Amin Firoozshahian; Omid Azizi; John P. Stevenson

Transactional memory represents an attractive conceptual model for programming concurrent applications. Unfortunately, high transaction abort rates can cause significant performance degradation. Conventional transactional memory realizations not only pessimistically abort transactions on every read-write conflict but also because of false sharing, cache evictions, TLB misses, page faults and interrupts. Consequently, the use of transactions needs to be restricted to a very small number of operations to achieve predictable performance, thereby limiting its benefit to programming simplification. In this paper, we investigate snapshot isolation transactional memory in which transactions operate on memory snapshots that always guarantee consistent reads. By exploiting snapshots, an established database model of transactions, transactions can ignore read-write conflicts and only need to abort on write-write conflicts. Our implementation utilizes a memory controller that supports multiversion memory to efficiently support snapshotting in hardware. We show that snapshot isolation can reduce the number of aborts in some cases by three orders of magnitude and improve performance by up to 20x.
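The commit rule summarized in this abstract (reads come from a snapshot, and only write-write conflicts cause an abort) can be illustrated with a minimal software sketch. It is not the paper's hardware design: the `SnapshotStore` and `Transaction` classes, the logical clock, and the first-committer-wins check are all assumptions made for illustration.

```python
class SnapshotStore:
    """Toy multiversion store (hypothetical): key -> list of (commit_ts, value)."""

    def __init__(self):
        self.versions = {}  # version lists are kept in ascending commit order
        self.clock = 0      # logical commit timestamp

    def begin(self):
        # A transaction's snapshot is identified by the clock value at start.
        return Transaction(self, self.clock)

    def read_at(self, key, ts):
        # Return the newest version committed at or before ts.
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def last_commit_ts(self, key):
        history = self.versions.get(key)
        return history[-1][0] if history else 0


class Transaction:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}  # buffered writes, published only at commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort only if a key we wrote was committed by someone else after
        # our snapshot was taken (write-write conflict). Reads are never checked.
        for key in self.writes:
            if self.store.last_commit_ts(key) > self.start_ts:
                return False
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
        return True


store = SnapshotStore()
setup = store.begin()
setup.write("x", 1)
setup.write("y", 1)
setup.commit()

# Read-write conflict: the writer updates x while the reader is still running.
reader = store.begin()
writer = store.begin()
writer.write("x", 2)
writer_ok = writer.commit()
reader_sees = reader.read("x")  # still served from the reader's snapshot
reader.write("y", 5)
reader_ok = reader.commit()     # commits despite the concurrent write to x

# Write-write conflict: both transactions write x, the second committer aborts.
a = store.begin()
b = store.begin()
a.write("x", 10)
b.write("x", 20)
a_ok = a.commit()
b_ok = b.commit()
```

In this sketch the reader commits even though `x` changed underneath it, while `b` aborts because `a` already committed a newer version of `x`.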


ieee/acm international symposium cluster, cloud and grid computing | 2013

On Achieving High Message Rates

Holger Fröning; Mondrian Nüssle; Heiner Litz; Christian Leber; Ulrich Brüning

Computer systems continue to increase in parallelism in all areas. Stagnating single-thread performance as well as power constraints prevent a reversal of this trend; on the contrary, current projections show that the trend towards parallelism will accelerate. In cluster computing, scalability, and therefore the degree of parallelism, is limited by the network interconnect and more specifically by the message rate it provides. We designed an interconnection network specifically for high message rates. Among other things, it reduces the burden on the software stack by relying on communication engines that perform a large fraction of the send and receive functionality in hardware. It also supports multi-core environments very efficiently through hardware-level virtualization of the communication engines. We provide details on the overall architecture, the thin software stack, performance results for a set of MPI-based benchmarks, and an in-depth analysis of how application performance depends on the message rate. We vary the message rate by software and hardware techniques, and measure the application-level impact of different message rates. We also use this analysis to extrapolate performance for technologies with wider data paths and higher line rates.


international conference on networks | 2010

A Case for FPGA Based Accelerated Communication

Holger Fröning; Mondrian Nüssle; Heiner Litz; Ulrich Brüning

The use of Field Programmable Gate Arrays (FPGAs) in the area of High Performance Computing (HPC) to accelerate computations is well known. We present here a case where FPGAs can be used to speed up communication instead of computation. Current interconnects for HPC in particular lack support for fine-grained communication, which is increasingly found in various applications. To overcome this situation, we developed a novel custom network. Because it is built solely from FPGAs, it can easily be reconfigured to custom needs. The main drawback of FPGAs is their limited performance, which is about one to two orders of magnitude lower than that of commercial (specialized) solutions. However, an architecture optimized for small packet sizes results in a performance superior even to commercial high-performance solutions. This excellent communication performance is verified by results from several popular benchmarks. In summary, we present a case where FPGAs can be used to accelerate communication and outperform commercial interconnection networks for HPC.


architectural support for programming languages and operating systems | 2017

ReFlex: Remote Flash ≈ Local Flash

Ana Klimovic; Heiner Litz; Christos Kozyrakis

Remote access to NVMe Flash enables flexible scaling and high utilization of Flash capacity and IOPS within a datacenter. However, existing systems for remote Flash access either introduce significant performance overheads or fail to isolate the multiple remote clients sharing each Flash device. We present ReFlex, a software-based system for remote Flash access that provides nearly identical performance to accessing local Flash. ReFlex uses a dataplane kernel to closely integrate networking and storage processing to achieve low latency and high throughput at low resource requirements. Specifically, ReFlex can serve up to 850K IOPS per core over TCP/IP networking, while adding 21 µs over direct access to local Flash. ReFlex uses a QoS scheduler that can enforce tail latency and throughput service-level objectives (SLOs) for thousands of remote clients. We show that ReFlex allows applications to use remote Flash while maintaining their original performance with local Flash.


ACM Transactions on Architecture and Code Optimization | 2015

Efficient Correction of Anomalies in Snapshot Isolation Transactions

Heiner Litz; Ricardo J. Dias; David R. Cheriton

Transactional memory systems providing snapshot isolation enable concurrent access to shared data without incurring aborts on read-write conflicts. Reducing aborts is extremely relevant as it leads to higher concurrency, greater performance, and better predictability. Unfortunately, snapshot isolation does not provide serializability as it allows certain anomalies that can lead to subtle consistency violations. While some mechanisms have been proposed to verify the correctness of a program utilizing snapshot isolation transactions, it remains difficult to repair incorrect applications. To reduce the programmer’s burden in this case, we present a technique based on dynamic code and graph dependency analysis that automatically corrects existing snapshot isolation anomalies in transactional memory programs. Our evaluation shows that corrected applications retain the performance benefits characteristic of snapshot isolation over conventional transactional memory systems.
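Write skew is the classic example of an anomaly that snapshot isolation permits. The sketch below is a toy illustration under stated assumptions: the on-call invariant, the `run_on_call_example` function, and the read-promotion remedy (writing back a value that was only read, so that the conflict becomes a write-write conflict) are chosen for illustration and are not claimed to be the correction technique the paper applies.

```python
def run_on_call_example(promote_reads):
    """Two transactions each take one person off call if both are on call.

    Intended invariant (hypothetical): alice + bob >= 1.
    Under snapshot isolation only overlapping write sets cause an abort.
    """
    committed = {"alice": 1, "bob": 1}

    # Both transactions start from the same snapshot.
    snap_a = dict(committed)
    snap_b = dict(committed)
    writes_a, writes_b = {}, {}

    # Transaction A: reads both entries, takes alice off call.
    if snap_a["alice"] + snap_a["bob"] >= 2:
        writes_a["alice"] = 0
        if promote_reads:
            # Write back the value that was only read, so it joins the write set.
            writes_a["bob"] = snap_a["bob"]

    # Transaction B: reads both entries, takes bob off call.
    if snap_b["alice"] + snap_b["bob"] >= 2:
        writes_b["bob"] = 0
        if promote_reads:
            writes_b["alice"] = snap_b["alice"]

    # A commits first. B aborts only if its write set overlaps A's.
    committed.update(writes_a)
    b_aborts = bool(writes_a.keys() & writes_b.keys())
    if not b_aborts:
        committed.update(writes_b)
    return committed, b_aborts


skewed_state, skewed_abort = run_on_call_example(promote_reads=False)
fixed_state, fixed_abort = run_on_call_example(promote_reads=True)
```

Without promotion the write sets are disjoint, both transactions commit, and nobody is left on call. With promotion the write sets overlap, so the second transaction aborts and the invariant holds.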


intelligent user interfaces | 2006

Creating multiplatform user interfaces by annotation and adaptation

Yun Ding; Heiner Litz

This paper presents our novel framework, which creates user interfaces (UIs) for a variety of devices by annotating and reusing an existing one originally designed for large devices. It distinguishes itself from previous work by its unique combination of reusing existing UIs, intuitive graphical support and an adaptation-based approach. It is extensible, allowing UI developers to build their own customized transformation strategies and integrate them into our framework.


multiagent system technologies | 2003

On Programming Information Agent Systems - An Integrated Hotel Reservation Service as Case Study

Yun Ding; Heiner Litz; Rainer Malaka; Dennis Pfisterer

This paper presents our integrated hotel reservation service. Using it as a case study, we discuss the design and implementation of agent-based information systems. Taking a system as a whole, we consider not only information agents but also their interface to human users and external information sources. In particular, our focus is on the interaction behavior, which can be observed both in interactions between agents and in interactions between agents and these interface components. We show that both kinds of interaction are coordinated by the same protocol. Using our implemented hotel reservation service system, we illustrate by example how this understanding can be used to systematically design and validate interaction mechanisms. We explore the possibility of facilitating the rapid prototyping of information agent systems using an interaction behavior editor. Moreover, by giving insight into some details of our hotel service system, we show where the difficulties in implementing information agent systems lie and thus where infrastructural support is desirable.
