Lukas Kekely
CESNET
Publication
Featured research published by Lukas Kekely.
international conference on computer communications | 2014
Lukas Kekely; Viktor Pus; Jan Korenek
Current high-speed network monitoring systems focus more and more on the data from the application layers. Flow data is usually enriched by the information from HTTP, DNS and other protocols. The increasing speed of the network links, together with the time consuming application protocol parsing, require a new way of hardware acceleration. Therefore we propose a new concept of hardware acceleration for flexible flow-based application level monitoring which we call Software Defined Monitoring (SDM). The concept relies on smart monitoring tasks implemented in the software in conjunction with a configurable hardware accelerator. The hardware accelerator is an application-specific processor tailored to stateful flow processing. The monitoring tasks reside in the software and can easily control the level of detail retained by the hardware for each flow. This way the measurement of bulk/uninteresting traffic is offloaded to the hardware while the advanced monitoring over the interesting traffic is performed in the software. The proposed concept allows one to create flexible monitoring systems capable of deep packet inspection at high throughput. Our pilot implementation in FPGA is able to perform a 100 Gb/s flow traffic measurement augmented by a selected application-level protocol parsing.
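The division of work described in this abstract, where software decides per flow how much detail the hardware retains, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the names `Accelerator`, `Monitor`, and `run`, the two actions `"forward"` and `"aggregate"`, and the policy of offloading a flow after its first few packets are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    packets: int = 0
    octets: int = 0


class Accelerator:
    """Stands in for the hardware: a per-flow rule table plus flow counters."""

    def __init__(self):
        self.rules = {}    # flow key -> "aggregate" or "forward" (hypothetical actions)
        self.records = {}  # flow key -> FlowRecord kept by the accelerator

    def process(self, key, length):
        """Return True if the packet must be passed up to the software."""
        action = self.rules.get(key, "forward")  # unknown flows go to software
        if action == "aggregate":
            rec = self.records.setdefault(key, FlowRecord())
            rec.packets += 1
            rec.octets += length
            return False
        return True


class Monitor:
    """Stands in for a software monitoring task that controls the accelerator."""

    def __init__(self, hw, inspect_first=2):
        self.hw = hw
        self.inspect_first = inspect_first  # assumed policy, not from the paper
        self.seen = {}

    def on_packet(self, key, length):
        # Detailed (e.g. application-level) analysis would happen here.
        self.seen[key] = self.seen.get(key, 0) + 1
        if self.seen[key] >= self.inspect_first:
            # The flow is no longer interesting: offload it to the hardware.
            self.hw.rules[key] = "aggregate"


def run(packets, inspect_first=2):
    hw = Accelerator()
    sw = Monitor(hw, inspect_first)
    to_software = 0
    for key, length in packets:
        if hw.process(key, length):
            to_software += 1
            sw.on_packet(key, length)
    return hw, to_software
```

In this toy model a long flow costs the software only its first `inspect_first` packets, while the remainder is counted by the accelerator alone.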
IEEE Transactions on Computers | 2016
Lukas Kekely; Jan Kucera; Viktor Pus; Jan Korenek; Athanasios V. Vasilakos
With the ongoing shift of network services to the application layer, monitoring systems also focus more on application-layer data. The increasing speed of network links, together with the increased complexity of application protocol processing, requires a new way of hardware acceleration. We propose a new concept of hardware acceleration for flexible flow-based application-level traffic monitoring, which we call Software Defined Monitoring. Application layer processing is performed by monitoring tasks implemented in the software in conjunction with a configurable hardware accelerator. The accelerator is a high-speed application-specific processor tailored to stateful flow processing. The software monitoring tasks control the level of detail retained by the hardware for each flow in such a way that the usable information is always retained, while the remaining data is processed by simpler methods. Flexibility of the concept is provided by a plugin-based design of both hardware and software, which ensures adaptability in the evolving world of network monitoring. Our high-speed implementation using an FPGA acceleration board in a commodity server is able to perform a 100 Gb/s flow traffic measurement augmented by a selected application-level protocol analysis.
architectures for networking and communications systems | 2012
Viktor Pus; Lukas Kekely; Jan Korenek
Packet parsing is a basic operation performed at all points of the network infrastructure. Modern networks impose challenging requirements on the performance and configurability of packet parsing modules; however, high-speed parsers often use a very large chip area. We propose a novel architecture of a pipelined packet parser, which offers low latency in addition to high throughput (over 100 Gb/s). Moreover, the latency to throughput ratio can be finely tuned to fit the particular application. The parser is hand-optimized thanks to the direct implementation in VHDL, yet the structure is very uniform and easily extensible for new protocols.
design and diagnostics of electronic circuits and systems | 2014
Viktor Pus; Lukas Kekely; Jan Korenek
Packet parsing is among basic operations that are performed at all points of a network infrastructure. Modern networks impose challenging requirements on the performance and configurability of packet parsing modules. However, high-speed parsers often use a significant amount of hardware resources. We propose a novel architecture of a pipelined packet parser for FPGA, which offers low latency in addition to high throughput (over 100 Gb/s). Moreover, the latency, throughput and chip area can be finely tuned to fit the needs of a particular application. The parser is hand-optimized thanks to a direct implementation in VHDL, yet the structure is uniform and easily extensible for new protocols.
field programmable logic and applications | 2014
Lukas Kekely; Viktor Pus; Pavel Benacek; Jan Korenek
Current hardware acceleration cores for network traffic processing are often well optimized for one particular task and therefore provide a high level of hardware acceleration. But for many applications, such as network traffic monitoring and security, it is also necessary to achieve a rapid development cycle to provide a fast response to security threats. We propose and evaluate a new concept of hardware acceleration for flexible flow-based network traffic monitoring with support for application protocol analysis. The concept is called Software Defined Monitoring (SDM) and it relies on a configurable hardware accelerator implemented in FPGA, coupled with smart monitoring tasks running as software on a general-purpose CPU. The monitoring tasks in the software control the level of detail and type of information retained during the hardware processing. This arrangement allows rapid application prototyping in the software, followed by further shifting of the timing-critical parts of the processing to the hardware accelerator. The concept is designed with scalability in mind; it is therefore suitable for different FPGA-based platforms ranging from embedded single-chip solutions (such as Zynq or CycloneV) to high-speed backbone network monitoring boxes. Our pilot high-speed implementation using an FPGA acceleration board in a commodity server performs a 100 Gb/s flow traffic measurement augmented by a selected application protocol analysis.
design and diagnostics of electronic circuits and systems | 2014
Lukas Kekely; Martin Zadnik; Jiri Matousek; Jan Korenek
Rapidly growing speed and complexity of computer networks impose new requirements on fast lookup structures which are utilized in many networking applications (SDN, firewalls, NATs, etc.). We propose a novel lookup concept based on the well-known cuckoo hashing, which can achieve good memory utilization, supplemented by a binary search tree for offloading the colliding keys and supporting LPM lookup. We also propose a hardware architecture implementing this lookup concept in the FPGA. Our solution is suitable for lookup of the variable-length keys in 100+ Gbps networks. Memory utilization of the proposed concept is thoroughly evaluated and it is shown that the concept is scalable to external memory components.
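The lookup idea in this abstract, cuckoo hashing with a secondary ordered structure that absorbs colliding keys, can be sketched in software as follows. This is a simplified illustration and not the paper's hardware architecture: the class name `CuckooWithOverflow` is hypothetical, a sorted list searched by bisection stands in for the binary search tree, the two hash functions (`zlib.crc32` and `zlib.adler32`) are arbitrary choices, keys are assumed to be distinct strings inserted once, and LPM lookup is not modelled.

```python
import bisect
import zlib


class CuckooWithOverflow:
    def __init__(self, slots=8, max_kicks=16):
        self.slots = slots
        self.max_kicks = max_kicks
        self.tables = [[None] * slots, [None] * slots]
        self.overflow = []  # sorted list of (key, value); stand-in for the tree

    def _index(self, which, key):
        data = key.encode()
        h = zlib.crc32(data) if which == 0 else zlib.adler32(data)
        return h % self.slots

    def insert(self, key, value):
        item = (key, value)
        which = 0
        for _ in range(self.max_kicks):
            i = self._index(which, item[0])
            # Place the item and pick up whatever occupied the slot before.
            item, self.tables[which][i] = self.tables[which][i], item
            if item is None:
                return
            which = 1 - which  # the evicted item tries its slot in the other table
        # Too many evictions: offload the item still in hand.
        bisect.insort(self.overflow, item)

    def lookup(self, key):
        for which in (0, 1):
            entry = self.tables[which][self._index(which, key)]
            if entry is not None and entry[0] == key:
                return entry[1]
        j = bisect.bisect_left(self.overflow, (key,))
        if j < len(self.overflow) and self.overflow[j][0] == key:
            return self.overflow[j][1]
        return None
```

A lookup touches at most two hash-table slots plus one ordered search, which is the property that makes the scheme attractive for fixed-latency hardware pipelines.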
integrated network management | 2015
Viktor Pus; Petr Velan; Lukas Kekely; Jan Korenek; Pavel Minarik
This demo presents the results of a joint research project of CESNET and INVEA-TECH focused on 100 GbE network flow monitoring using FPGA. It shows, to the best of our knowledge, the first flow monitoring setup capable of handling a fully saturated 100 G Ethernet line. We present the COMBO-CG card, which provides accurate timestamps for high-resolution traffic monitoring. The card is complemented by a fast DMA engine and optimized Linux drivers, which were designed and implemented to achieve 100 Gbps data transfers through the PCIe bus with low CPU utilization. Network traffic can be distributed among multiple CPU cores based on configurable hash functions. Our flow exporter is able to fully utilize available CPU cores to provide wire-speed performance for processing of the 100 Gbps traffic. The demo will show a complete 100 G flow monitoring setup, from packet generator to flow collector.
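Hash-based distribution of traffic among CPU cores, as mentioned in this abstract, can be sketched as below. The function name `core_for_flow`, the choice of the 5-tuple as the hash input, and the use of `zlib.crc32` are assumptions made for illustration; the abstract only states that the hash functions are configurable.

```python
import zlib


def core_for_flow(src_ip, dst_ip, src_port, dst_port, proto, n_cores):
    """Map a flow's 5-tuple to a CPU core index in range(n_cores)."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % n_cores
```

Because the mapping depends only on the flow key, every packet of a flow reaches the same core, so per-flow state never has to be shared between cores.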
design and diagnostics of electronic circuits and systems | 2014
Tomáš Závodník; Lukas Kekely; Viktor Pus
We propose a novel approach to the computation of the CRC functions, commonly used for bit error checking purposes when handling binary data. This approach is designed for general hashing purposes in FPGA, for which the CRCs are usable as well. The method is suitable for applications which use parallel inputs of fixed size and require high throughput, such as hash tables. We employ the DSP blocks present in modern FPGAs to perform all the necessary XOR operations, so that our solution does not consume any LUTs. We propose a Monte Carlo based heuristic to reduce the number of the DSP blocks required by the computation. Our experimental results show that one DSP block capable of 48 XOR operations can replace around eleven 6-input LUTs.
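The reason a CRC over a fixed-size parallel input reduces to XOR operations is that the CRC is linear over GF(2), up to a constant contributed by the initial value and final XOR. A minimal sketch of that property follows, using the standard CRC-32 from Python's `zlib` as the reference. The helper names `build_xor_masks` and `crc32_by_xor` are hypothetical, and the sketch shows only the XOR structure, not the mapping onto DSP blocks or the Monte Carlo heuristic from the paper.

```python
import zlib


def build_xor_masks(n_bytes):
    """Precompute, for a fixed input size, the CRC contribution of each input bit."""
    zero = bytes(n_bytes)
    const = zlib.crc32(zero)  # CRC of the all-zero input of this size
    masks = []
    for bit in range(8 * n_bytes):
        unit = bytearray(n_bytes)
        unit[bit // 8] = 1 << (bit % 8)  # input with exactly one bit set
        masks.append(zlib.crc32(bytes(unit)) ^ const)
    return const, masks


def crc32_by_xor(data, const, masks):
    """Compute CRC-32 using only XORs of the precomputed per-bit masks."""
    crc = const
    for bit, mask in enumerate(masks):
        if (data[bit // 8] >> (bit % 8)) & 1:
            crc ^= mask
    return crc
```

Read column-wise, each bit of the result is the XOR of a fixed subset of input bits, which is the kind of wide XOR that the abstract assigns to DSP blocks instead of LUTs.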
field programmable gate arrays | 2018
Jakub Cabal; Pavel Benacek; Lukas Kekely; Michal Kekely; Viktor Pus; Jan Kořenek
As the throughput of computer networks is constantly rising, there is a need for ever-faster packet parsing modules at all points of the networking infrastructure. Parsing is a crucial operation which has an influence on the final throughput of a network device. Moreover, this operation must precede any kind of further traffic processing like filtering/classification, deep packet inspection, and so on. This paper presents a parser architecture that can currently scale up to terabit throughput in a single FPGA, while the overall processing speed is sustained even on the shortest frame lengths and for an arbitrary number of supported protocols. The architecture of our parser can also be automatically generated from a high-level description of a protocol stack in the P4 language, which makes the rapid deployment of new protocols considerably easier. The results presented in the paper confirm that our automatically generated parsers are capable of reaching an effective throughput of over 1 Tbps (or more than 2000 Mpps) on the Xilinx UltraScale+ FPGAs and around 800 Gbps (or more than 1200 Mpps) on their previous generation Virtex-7 FPGAs.
architectures for networking and communications systems | 2018
Jan Kucera; Lukas Kekely; Viktor Pus; Adam Piecek; Jan Kořenek
Intrusion Detection Systems (IDS) are among the popular technologies for securing computer networks. However, their high computational complexity makes it hard to meet the performance goals of modern high-speed networks. This paper aims at accelerating IDS by informed packet discarding: focusing the limited computational resources available to IDS on only the most relevant parts of incoming traffic and offloading (bypassing) the rest. We show that this controlled (informed) discarding of well-defined traffic portions helps IDS achieve better results, and we compare software and FPGA-accelerated discarding implementations.