Oliver Knodel
Dresden University of Technology
Publication
Featured research published by Oliver Knodel.
application specific systems architectures and processors | 2011
Oliver Knodel; Thomas B. Preusser; Rainer G. Spallek
The mapping of DNA sequences to huge genome databases is an essential analysis task in modern molecular biology. Having linearized reference genomes available, the alignment of short DNA reads obtained from the sequencing of an individual genome against such a database provides a powerful diagnostic and analysis tool. In essence, this task amounts to a simple string search tolerating a certain number of mismatches to account for the diversity of individuals. The complexity of this process arises from the sheer size of the reference genome. It is further amplified by current next-generation sequencing technologies, which produce a huge number of increasingly short reads. These short reads severely degrade established alignment heuristics like BLAST. This paper proposes an FPGA-based custom computation, which performs the alignment of short DNA reads in a timely manner by exploiting tremendous concurrency at reasonable cost. The special measures to achieve an extremely efficient and compact mapping of the computation to a Xilinx FPGA architecture are described. The presented approach also surpasses all software heuristics in the quality of its results. It guarantees to find all alignment locations of a read in the database while also allowing a freely adjustable character mismatch threshold. In contrast, advanced fast alignment heuristics like Bowtie and Maq can only tolerate small mismatch maximums, and their probability of detecting existing valid alignments deteriorates quickly. The performance comparison with these widely used software tools also demonstrates that the proposed FPGA computation achieves its guaranteed exact results in very competitive time.
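The "simple string search tolerating a certain number of mismatches" described in this abstract can be illustrated with a minimal software sketch. It is not the FPGA design from the paper: it is a plain exhaustive scan, and the function name `find_alignments` and its parameters are hypothetical, chosen only for illustration. Like the approach described, it reports every window of the reference whose mismatch count stays within the threshold.

```python
def find_alignments(reference: str, read: str, max_mismatches: int) -> list[tuple[int, int]]:
    """Return (position, mismatches) for every reference window that
    matches the read with at most max_mismatches differing characters.

    Hypothetical helper for illustration; an exhaustive scan, so no
    valid alignment location is missed.
    """
    hits = []
    n, m = len(reference), len(read)
    for pos in range(n - m + 1):
        mismatches = 0
        for ref_char, read_char in zip(reference[pos:pos + m], read):
            if ref_char != read_char:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # threshold exceeded, abandon this window
        else:
            # Inner loop finished without break: window is a valid alignment.
            hits.append((pos, mismatches))
    return hits


if __name__ == "__main__":
    print(find_alignments("ACGTACGTTACG", "ACGA", max_mismatches=1))
```

Each window is checked independently of all others, which is the property that makes this kind of search a natural candidate for massively concurrent hardware.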
field-programmable custom computing machines | 2012
Thomas B. Preußer; Oliver Knodel; Rainer G. Spallek
The mapping of reads, i.e. short DNA base pair strings, to large genome databases has become a critical operation for genetic analysis and diagnosis. Although this mapping operation is a simple string search tolerant of some character mismatches, it is nonetheless extremely challenging due to the tremendous size of the searched genome databases. It is the heavy use of search heuristics such as BLAST, Maq and Bowtie that makes the economic deployment of read mappers possible. While these heuristics achieve feasible computation times, they also sacrifice the accuracy of the mapping results, which is itself of high value for reliable diagnostics. The traditional software implementations are unable to exploit the tremendous parallelism that is available in the mapping of thousands and millions of reads. Merely a handful of concurrent control flows, and thus searches, can be performed efficiently on contemporary multicores. Even GPU assistance only enables a few dozen parallel searches. This paper proposes a systolic custom computation on FPGA, which implements the read mapping on a massively parallel architecture. It implements a true search and guarantees to find all read mappings under a configurable threshold of base pair mismatches. The highly regular design from compact string matchers enables the implementation of thousands of parallel search engines on a single FPGA device. The presented mapper platform combines highest computational performance with an excellent result accuracy. Its performance is more than twice as high as that of a recently published comparable FPGA mapper. Already when implemented on a contemporary mid-size FPGA, it meets the search speed of software heuristics, which only detect little more than half of the valid read mappings. The mapper easily scales to large FPGA devices, which can, thus, implement accurate high-performance volume mappers. Accurate mapping is made available in application domains that could only afford fuzzy heuristics until now.
digital systems design | 2015
Oliver Knodel; Rainer G. Spallek
Integrating FPGAs into clouds or data centers allows easy access to such reconfigurable resources and provides a promising opportunity to improve both performance and energy efficiency of such systems. Although currently the use of FPGAs as hardware accelerators and especially in clouds is mainly a topic of research, the integration of reconfigurable virtualized resources will become a task of growing importance in the future. We developed a cloud management and hypervisor system called RC3E providing FPGA resources as a service. This paper introduces a computing framework which extends our hypervisor and allows multiple (virtual) user designs on a single physical FPGA. The communication between host and FPGA is implemented by a communication API on the host and the integration of high-level synthesis (HLS) to accelerate applications. We demonstrate the usability of our framework by implementing a sample user design on an FPGA and measuring the performance with up to four simultaneous virtual user designs.
international conference on parallel processing | 2013
Oliver Knodel; Andy Georgi; Patrick Lehmann; Wolfgang E. Nagel; Rainer G. Spallek
Heterogeneous systems consisting of general-purpose processors and different types of hardware accelerators are becoming more and more common in HPC systems. Field Programmable Gate Arrays (FPGAs) in particular provide an energy-efficient way to achieve high performance. Numerous application areas, including bio- and neuroinformatics, require enormous processing capability and employ simple computation cores, elementary data structures and algorithms highly suitable for FPGAs. To allow efficient work with distributed FPGAs, it is necessary to provide a simple and scalable integration of these FPGAs in a common cluster architecture and to permit easy access to these resources. Our approach enables a system-wide dynamic partitioning, a batch-based administration and the monitoring of FPGA resources. The system can easily be reconfigured to user-specific requirements and provides a high degree of flexibility and performance.
field-programmable logic and applications | 2013
Patrick Lehmann; Thomas Frank; Oliver Knodel; Steffen Köhler; Thomas B. Preußer; Rainer G. Spallek
Field-Programmable Gate Arrays, which are widely used as prototyping platforms, are entering the domain of custom-specific high-performance hardware accelerators, which operate highly efficiently by exploiting bit- and word-level parallelism. One opportunity to feed these FPGA accelerators with Gbytes of data is the direct attachment of mass-storage devices through a Serial-ATA link. State-of-the-art SATA controllers are designed and optimized for microprocessor-based systems with a random memory access pattern. Our approach, named Weasel, introduces a modularized, platform-independent and streaming-optimized SATA controller, which supports link speeds up to 6 Gbit/s. We demonstrate how to customize the given ATA standard and how to design a generic interface for different vendor-specific multi-gigabit transceivers. Implementations of the platform-independent interface for the Xilinx Virtex-5 and the Altera Stratix II GX devices prove our concept. Finally, our measurements using hard-disk and solid-state drives show that a sustained throughput of 540 Mbytes/s is achievable over a SATA 6 Gbit/s link. This is close to the theoretical maximum, which is constrained by the attached devices as well as by the speed of their flash memory.
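To put the reported 540 Mbytes/s in context, the following sketch works out the payload ceiling of a 6 Gbit/s SATA link. It relies on one fact not stated in the abstract: SATA uses 8b/10b line coding, so only 8 of every 10 transmitted bits carry data. It also assumes decimal megabytes and ignores further protocol overhead such as frame headers and primitives. The function name `sata_payload_bytes_per_s` is hypothetical.

```python
def sata_payload_bytes_per_s(line_rate_bit_s: int) -> int:
    """Upper bound on payload bytes per second for a SATA link.

    Assumes 8b/10b line coding (8 payload bits per 10 line bits) and
    ignores framing overhead, so real links stay below this value.
    """
    payload_bit_s = line_rate_bit_s * 8 // 10  # remove 8b/10b coding overhead
    return payload_bit_s // 8                  # convert bits to bytes


if __name__ == "__main__":
    ceiling = sata_payload_bytes_per_s(6_000_000_000)  # 6 Gbit/s link
    measured = 540_000_000                             # 540 Mbytes/s from the abstract
    print(f"payload ceiling: {ceiling / 1e6:.0f} Mbytes/s")
    print(f"utilization:     {measured / ceiling:.0%}")
```

Under these assumptions the coding-limited ceiling is 600 Mbytes/s, so the measured throughput corresponds to roughly 90% of it.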
ACM Sigarch Computer Architecture News | 2017
Oliver Knodel; Paul R. Genssler; Rainer G. Spallek
Computing performance and scalability are the essential basics in modern data centres. Field Programmable Gate Arrays (FPGAs) provide a promising opportunity to improve performance, security and energy efficiency. Especially background acceleration of computationally complex and long-running tasks is an important field of application. A flexible use of reconfigurable devices within a cloud context requires an abstraction of the actual hardware through virtualization. In this paper we present an approach inspired by paravirtualized machines for the integration of reconfigurable hardware into cloud services. Using partial reconfiguration our hardware and software framework virtualizes a single physical FPGA to enable multiple independent user designs. Essential components are the management of those virtual user-defined accelerators (vFPGA) and their migration between physical FPGAs to achieve higher system-wide utilization. The migration requires saving and restoring the internal state or context of the vFPGA. We demonstrate the application possibilities and the resource trade-off of our approach by transferring a running design from one physical FPGA to another. Moreover, we present future perspectives for the use of FPGAs in cloud-based environments.
southern conference programmable logic | 2014
Oliver Knodel; Martin Zabel; Patrick Lehmann; Rainer G. Spallek
The future of hardware development lies in massively parallel hardware architectures as used in embedded as well as high-performance systems, for instance streaming-based, realtime and database applications. Field-programmable gate arrays in particular provide a platform for the rapid development of integrated circuits and the accompanying software. For reasons of energy efficiency, it is increasingly important to tailor hardware directly to the application. As such systems are very complex, the training of engineers has to start early. Furthermore, the usual curricula in computer science and electrical engineering teach only basic skills. In this paper we present lectures and especially practical FPGA design courses for bachelor and master students. We introduce a selection of individual projects, which were realized by students in practical courses. With examples from final bachelor projects and master theses we demonstrate the quality of education and its integration into current research. We describe possible improvements of labs, such as automated test benches and a remote FPGA laboratory for advanced courses.
reconfigurable computing and fpgas | 2014
Thomas B. Preußer; Oliver Knodel; Rainer G. Spallek
The mapping of reads, i.e. short DNA base pair strings, to large genome databases has become a critical operation for genetic analysis and diagnosis. The underlying alignment operation essentially is a string search tolerating some character mismatches and possibly character deletions or insertions with respect to a reference genome. Its output comprises the locations within the reference that are likely to correspond to the mapped DNA snippet. This paper describes PoC-Align, an alignment infrastructure using FPGA accelerators. It is an extension of our preceding FPGA aligner [1], which has been enhanced to tolerate alignment gaps (insertions and deletions) and to be more customizable through generic parameters. In addition to the descriptions of the implementation of these extensions, we also name the mainly software-carried enhancements, such as the support of mapping paired-end reads, that are implemented on top of the FPGA accelerator. Providing a thorough overview of the complete infrastructure, we aim at advertising the disclosure of the sources of our solution and hope to encourage other groups to use and extend this platform.
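A search that tolerates gaps (insertions and deletions) in addition to mismatches can be sketched in software with a standard dynamic-programming formulation of approximate string matching. This is not PoC-Align's hardware implementation, which the abstract does not detail; the function name `find_gapped_alignments` and its parameters are hypothetical. The sketch reports each end position in the reference where the read aligns within a given edit budget.

```python
def find_gapped_alignments(reference: str, read: str, max_edits: int) -> list[tuple[int, int]]:
    """Return (end_position, edits) for every 1-based end position in the
    reference where the read aligns to some substring with at most
    max_edits mismatches, insertions or deletions.

    Hypothetical helper for illustration only.
    """
    m = len(read)
    # column[i] = fewest edits aligning read[:i] to a reference substring
    # ending at the current position. Before any reference character is
    # consumed, read[:i] can only be matched by i unmatched read characters.
    column = list(range(m + 1))
    hits = []
    for end, ref_char in enumerate(reference, start=1):
        # column[0] stays 0: an alignment may start anywhere in the reference.
        previous_diagonal = column[0]
        for i in range(1, m + 1):
            cost = 0 if read[i - 1] == ref_char else 1
            current = min(
                previous_diagonal + cost,  # match or mismatch
                column[i] + 1,             # reference character without read partner
                column[i - 1] + 1,         # read character without reference partner
            )
            previous_diagonal = column[i]
            column[i] = current
        if column[m] <= max_edits:
            hits.append((end, column[m]))
    return hits


if __name__ == "__main__":
    print(find_gapped_alignments("TTACGTT", "ACG", max_edits=0))
    print(find_gapped_alignments("AG", "ACG", max_edits=1))
```

With an edit budget of zero the search degenerates to exact matching; raising the budget admits alignments in which a base is missing from or added to the read relative to the reference.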
field-programmable logic and applications | 2013
Oliver Knodel; Rainer G. Spallek
Heterogeneous systems consisting of general-purpose processors and different types of hardware accelerators are becoming more and more common in HPC systems. Field Programmable Gate Arrays (FPGAs) in particular provide an energy-efficient way to achieve high performance. Numerous application areas, including bio- and neuroinformatics, require enormous processing capabilities and employ simple computation cores. These elementary data structures are highly suitable for FPGAs. To allow an efficient use of distributed FPGAs, we introduce a simple and scalable integration of the FPGAs in a common cluster architecture to permit easy access to these resources.
international conference on cloud computing | 2016
Oliver Knodel; Patrick Lehmann; Rainer G. Spallek
Computing performance and scalability are essential ingredients in modern data centres offering cloud services. Field Programmable Gate Arrays (FPGAs) provide a promising opportunity to improve performance, security and energy efficiency because their hardware architecture can be adapted directly to the application. In this paper we present the development of our FPGA cloud architecture, beginning with realistic use cases and adapted service models for the use of reconfigurable hardware accelerators in a cloud context. Our architecture supports both applications where FPGAs are tightly coupled to host processors and applications using the FPGA for a secured cloud access in an overall homogeneous system. In contrast to other approaches, we model the system as a whole with a more flexible FPGA provision. The application service provider has the opportunity to offer a service with an individual FPGA design or customized secure interfaces to the cloud. For an abstraction from the real hardware and to achieve high device utilization, the FPGAs and the interfaces are fully virtualized. Because of the uncommon reconfigurable hardware integration and our adapted service models, we developed a special resource management system (RC3E) which serves as a hypervisor for the virtualized hardware. The demonstration of intelligent load balancing as well as the system's performance and flexibility concludes our approach to FPGA services in a cloud.