David Szczesny
Ruhr University Bochum
Publication
Featured research published by David Szczesny.
international symposium on system-on-chip | 2009
David Szczesny; Anas Showk; Sebastian Hessel; Attila Bilgic; Uwe Hildebrand; Valerio Frascolla
In this paper we present detailed profiling results and identify the time-critical algorithms of the Long Term Evolution (LTE) layer 2 (L2) protocol processing on an ARM-based mobile hardware platform. Furthermore, we investigate the applicability of a single ARM processor combined with a traditional hardware acceleration concept for the significantly increased computational demands in LTE and future mobile devices. A virtual prototyping approach is adopted in order to simulate a state-of-the-art mobile phone platform which is based on an ARM1176 core. Moreover, a physical layer and base station emulator is implemented that allows for protocol investigations at the transport block level under different transmission conditions. By simulating LTE data rates of 100 Mbit/s and beyond, we measure the execution times in a protocol stack model which is compliant with the 3GPP Rel. 8 specifications and comprises the most processing-intensive downlink (DL) part of the LTE L2 data plane. We show that the computing power of a single embedded processor at reasonable clock frequencies is not sufficient to cope with the L2 requirements of next-generation mobile devices. In particular, Robust Header Compression (ROHC) processing is identified as the major time-critical software algorithm, demanding half of the entire L2 DL execution time. Finally, we illustrate that a conventional hardware acceleration approach for the encryption algorithms fails to offer the performance required by LTE and future mobile phones.
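To make the 100 Mbit/s requirement concrete, the following sketch converts a data rate into a per-transport-block processing budget. It assumes one transport block per 1 ms LTE transmission time interval (a simplification; the abstract does not state the emulator's configuration), and the 400 MHz clock in the usage line is a hypothetical value chosen only for illustration, not a figure from the paper.

```python
def l2_budget(rate_mbit_s, clock_mhz, tti_ms=1.0):
    """Return the data volume and cycle budget available per TTI.

    Assumption: exactly one transport block arrives per TTI, so the
    protocol stack must finish each block within tti_ms.
    """
    # Mbit/s * ms = kbit, hence the factor 1000 to get bits.
    bits_per_tti = rate_mbit_s * 1000 * tti_ms
    bytes_per_tti = bits_per_tti / 8
    # MHz * ms = kilo-cycles, hence the factor 1000 to get cycles.
    cycles_per_tti = clock_mhz * 1000 * tti_ms
    return {
        "bytes_per_tti": bytes_per_tti,
        "cycles_per_tti": cycles_per_tti,
        "cycles_per_byte": cycles_per_tti / bytes_per_tti,
    }


if __name__ == "__main__":
    # Hypothetical 400 MHz core at the 100 Mbit/s rate named in the abstract.
    print(l2_budget(rate_mbit_s=100, clock_mhz=400))
```

Under these assumptions, each transport block carries 12,500 bytes and the processor has only a few tens of cycles per payload byte for all L2 work, which gives a feel for why per-byte algorithms such as deciphering dominate the discussion.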
euromicro conference on real-time systems | 2010
Felix Bruns; Shadi Traboulsi; David Szczesny; Elizabeth Gonzalez; Yang Xu; Attila Bilgic
Devices for the mobile market have to satisfy a set of challenging constraints. In addition to the classical power, reliability and cost constraints, modern devices often have to be open to third party applications and at the same time provide a closed and secure environment for system functionality. In current systems, this antagonism is solved by maintaining a physical separation of subsystems with contrary constraints. Virtualization technology is a promising solution to safely merge conflicting subsystems on a single processor, which leads to huge cost benefits and higher flexibility. Microkernel-based hypervisors are an attractive choice for virtualization, due to their reliability and robustness. However, the involvement of real-time constraints remains a challenging factor. In this paper, we investigate how the security and isolation features of the L4/Fiasco microkernel impact real-time applications by comparing thread switching times and interrupt latencies to those of a conventional Real-time Operating System (RTOS). In addition, we demonstrate that microkernel-based systems require significantly more cache resources than traditional systems. Finally, we investigate the performance loss caused by cache and TLB interference imposed by an application subsystem which runs in parallel to the real-time subsystem.
computational science and engineering | 2009
Sebastian Hessel; David Szczesny; Shadi Traboulsi; Attila Bilgic; Josef Hausner
In this paper we present a design methodology for the identification and development of a suitable hardware platform (including dedicated hardware accelerators) for the data plane processing of the LTE protocol stack layer 2 (L2) in downlink direction. For this purpose, a hybrid design approach is adopted allowing first investigations of future mobile phone platforms on the system level (using virtual prototyping) combined with more accurate power-area explorations of hardware accelerators on the architectural level. Additionally, we show the employment of an LTE data generator peripheral, realizing L2 uplink processing and thus enabling platform analyses in a closed virtual environment. Furthermore, a modeling technique for a fast and efficient design of virtual hardware accelerator peripherals is demonstrated. A reasonable hardware/software partitioning can thereby be achieved early in the design phase. Once the system architecture is settled and thus the solution space is reduced, VHDL models of the accelerators are developed in order to find a suitable hardware implementation for LTE terminals based on timing constraints by system level simulations. As a case study, the LTE ciphering scheme, including the Advanced Encryption Standard (AES), is applied. We show results of our methodology by developing a deciphering hardware accelerator that enables the LTE protocol stack to process data rates of 100 Mbit/s and beyond.
international conference on hardware/software codesign and system synthesis | 2009
David Szczesny; Sebastian Hessel; Felix Bruns; Attila Bilgic
In this paper we present a new on-the-fly hardware acceleration approach, based on a smart Direct Memory Access (sDMA) controller, for the layer 2 (L2) downlink protocol stack processing in Long Term Evolution (LTE) and beyond-LTE mobile devices. We use virtual prototyping in order to simulate an ARM1176 processor based hardware platform together with the executed software comprising an LTE protocol stack model. The sDMA controller with different hardware accelerator units for the time-critical algorithms in the protocol stack is implemented and integrated in the hardware platform. We prove our new hardware/software partitioning concept for the LTE L2 by measuring the average execution time per transport block in the protocol stack at different activated on-the-fly hardware acceleration stages in the sDMA controller. At LTE data rates of 100 Mbit/s, we achieve a speedup of 24% compared to a pure software implementation by enabling the sDMA hardware support for header processing in the protocol stack. Furthermore, an activation of the complete on-the-fly hardware acceleration in the sDMA controller, including on-the-fly deciphering, leads to a speedup of more than 50%. Finally, at transmission conditions with more computational demands and data rates up to 320 Mbit/s, we obtain acceleration ratios of almost 80%. Investigations show that our new sDMA on-the-fly hardware acceleration approach in combination with a single-core processor offers the required computational power for next-generation mobile devices.
SDL'09 Proceedings of the 14th international SDL conference on Design for motes and mobiles | 2009
Anas Showk; David Szczesny; Shadi Traboulsi; Irv Badr; Elizabeth Gonzalez; Attila Bilgic
Long Term Evolution (LTE) radio communication is the upgrade of the current 3G mobile technology, with a more complex protocol in order to enable very high data rates. Model Driven Development (MDD) has emerged as a promising way of dealing with the increasing complexity of next-generation mobile protocols. In this paper, a light version of the LTE protocol for the access stratum user plane is modeled using the SDL Suite™ tool. The tool makes the model easy to understand and its functionality easy to test, using simulation in combination with Message Sequence Charts (MSC). The simulation results show that the implemented Specification and Description Language (SDL) model is consistent with the target scenarios. The system implementation is mapped to multiple threads and integrated with an operating system to enable execution on multi-core hardware platforms.
International Journal of Embedded and Real-time Communication Systems | 2010
David Szczesny; Sebastian Hessel; Anas Showk; Attila Bilgic; Uwe Hildebrand; Valerio Frascolla
This article provides a detailed profiling of the layer 2 (L2) protocol processing for the 3G successor Long Term Evolution (LTE). For this purpose, the most processing-intensive part of the LTE L2 data plane is executed on top of a virtual ARM-based mobile phone platform. The authors measure the execution times as well as the maximum data rates at different system setups. The profiling is done for the uplink (UL) and downlink (DL) directions separately as well as in a joint UL and DL scenario. As a result, the authors identify time-critical algorithms in the protocol stack and check to what extent state-of-the-art hardware platforms with a single-core processor and traditional hardware acceleration concepts are still applicable for protocol processing in LTE and beyond-LTE mobile devices.
global communications conference | 2009
Sebastian Hessel; David Szczesny; Nils Lohmann; Attila Bilgic; Josef Hausner
In this paper we investigate hardware implementations of the ciphering algorithms SNOW 3G and the Advanced Encryption Standard (AES) for the acceleration of the protocol stack layer 2 in the 3G Long Term Evolution (LTE). This analysis is based on timing requirements from execution time measurements in a simulated mobile phone platform, where we apply data rates of 100 Mbit/s and above (200 and 300 Mbit/s) to account for LTE and beyond-LTE investigations. Different architectures for both algorithms are explored in order to meet the performance requirements, while keeping the power and area budget at a reasonable level. To this end, a hardware analysis is done using a standard cell library of Faraday's 90 nm CMOS technology. Finally, the cryptographic substitution box with one-hot encoding emerges as the best solution for both ciphering schemes. Additionally, the 128-bit data path in the AES is identified as the most suitable architecture for LTE terminals, whereas a dual-AES approach turns out to be a candidate for data rates far beyond LTE (like LTE-Advanced).
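The relation between data path width and achievable data rate can be illustrated with a rough calculation. The sketch below is not the paper's model: it assumes AES-128 (10 rounds per 128-bit block) in an iterative architecture where a full 128-bit data path completes one round per clock cycle, and where a narrower data path needs proportionally more cycles per round. Key scheduling, I/O, and pipelining are ignored, and the function name `min_clock_mhz` is hypothetical.

```python
AES_BLOCK_BITS = 128
AES128_ROUNDS = 10  # number of rounds for a 128-bit key


def min_clock_mhz(rate_mbit_s, datapath_bits):
    """Lowest clock (MHz) at which one iterative AES core sustains the rate.

    Assumption: each round takes 128 / datapath_bits cycles, so a 128-bit
    data path finishes a block in 10 cycles and a 32-bit one in 40 cycles.
    """
    if datapath_bits <= 0 or AES_BLOCK_BITS % datapath_bits != 0:
        raise ValueError("data path width must divide the 128-bit block")
    cycles_per_round = AES_BLOCK_BITS // datapath_bits
    cycles_per_block = AES128_ROUNDS * cycles_per_round
    # Blocks per second = rate / 128; multiply by cycles per block.
    return rate_mbit_s * cycles_per_block / AES_BLOCK_BITS


if __name__ == "__main__":
    for width in (8, 32, 128):
        for rate in (100, 200, 300):
            print(f"{width:3d}-bit data path, {rate} Mbit/s: "
                  f"{min_clock_mhz(rate, width):.2f} MHz")
```

The calculation only shows the scaling trend: wider data paths lower the clock frequency needed for a given rate, at the cost of more area. The actual power and area trade-offs reported in the paper come from standard-cell synthesis, which this sketch does not attempt to reproduce.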
ieee international conference on communication software and networks | 2011
Shadi Traboulsi; Mohamad Sbeiti; David Szczesny; Anas Showk; Attila Bilgic
In this paper we present an efficient software implementation of the Advanced Encryption Standard (AES) used in the confidentiality algorithm of the Long Term Evolution (LTE) protocol. Our implementation is based on slicing and merging the bytes of several data blocks to exploit the processor's architecture width for multi-block encryption. In addition, an appropriate lookup table and data organization in memory are applied, combined with media processing instructions, in order to enhance the performance of AES in embedded environments. Other optimized software implementations from the literature are also explored and evaluated in comparison to the proposed implementation with respect to processing throughput and energy consumption using a multi-core based mobile phone platform. Simulation results show that the proposed implementation is the fastest among the evaluated implementations and achieves improvements in performance of up to 69% while providing 59% energy savings. Moreover, the presented implementation is scalable for multi-core execution. When running on two cores, it fulfills the LTE data rate of 100 Mbit/s and extends energy savings to 68%, leading to a total of 13 times improvement in energy efficiency.
international symposium on industrial embedded systems | 2011
David Szczesny; Shadi Traboulsi; Felix Bruns; Sebastian Hessel; Attila Bilgic
In this paper, we present different acceleration concepts for the Robust Header Compression version 2 (ROHCv2) algorithms in Long Term Evolution (LTE) handsets. First, we explore the potential performance improvements and energy savings by adopting scratchpad memories of various sizes. Second, dedicated hardware accelerators with different data transfer modes are compared in terms of processing speed and energy efficiency on the system level. By applying a virtual prototyping methodology with a proprietary filter module, we are able to investigate these two approaches within a state-of-the-art ARM-based mobile phone platform at real software loads. Additionally, combined measurements of the execution time together with an estimation of the energy that is consumed in the memory and the bus architecture are performed. With reasonably dimensioned scratchpad memories (16 kB each for instructions and data), maximum speedups and energy savings of approximately 60% each are achieved, depending on the cache sizes in the embedded processor. Even better performance, especially in combination with big caches, is reached with a dedicated ROHCv2 hardware accelerator supporting the processing of several packets at once in a so-called list mode. Compared to the pure software case, the execution time and the energy consumption are both improved by up to 80% at small caches and still amount to more than 40% and almost 30% at big caches, respectively.
vehicular technology conference | 2010
Sebastian Hessel; David Szczesny; Felix Bruns; Attila Bilgic; Josef Hausner
In this paper we present an architectural analysis of a smart DMA (sDMA) controller for protocol stack acceleration in mobile devices supporting 3GPP's Long Term Evolution (LTE). This concept has already demonstrated a significant performance benefit over conventional approaches by on-the-fly header decoding and deciphering for the data plane of the LTE protocol stack layer 2 in downlink direction. With a low-level hardware implementation we prove that the sDMA controller is also suitable for LTE terminals from an architectural point of view. Compared to conventional hardware acceleration, chip area and energy consumption are reduced by 10% and 56%, respectively. Furthermore, we show that the header decoding has the highest architectural impact on the sDMA controller. By a change of the hardware/software partitioning within the header decoding unit, the chip area of the sDMA controller is decreased by 35%, while it consumes 39% less power. The improvement compared to the conventional approach (with the same modification) is then increased further to 17% (area) and 59% (energy).