Marcin Lukowiak
Rochester Institute of Technology
Publications
Featured research published by Marcin Lukowiak.
International Conference of the IEEE Engineering in Medicine and Biology Society | 2010
Fei Hu; Qi Hao; Marcin Lukowiak; Qingquan Sun; Kyle Wilhelm; Stanislaw P. Radziszowski; Yao Wu
Implantable medical devices (IMDs) have played an important role in many medical fields. Any failure in IMD operation could have serious consequences, so it is important to protect IMDs from unauthenticated access. This study investigates secure IMD data collection within a telehealthcare [mobile health (m-health)] network. We use medical sensors carried by patients to securely access IMD data and perform secure sensor-to-sensor communications between patients to relay the IMD data to a remote doctor's server. To meet the requirement of low computational complexity, we choose N-th degree truncated polynomial ring (NTRU)-based encryption/decryption to secure IMD-sensor and sensor-sensor communications. An extended matryoshkas model is developed to estimate direct/indirect trust relationships among sensors. An NTRU hardware implementation in very-high-speed integrated circuit hardware description language (VHDL) is studied based on the industry standard IEEE 1363 to increase the speed of key generation. The performance analysis results demonstrate the security robustness of the proposed IMD data access trust model.
Security and Communication Networks | 2009
Fei Hu; Kyle Wilhelm; Michael Schab; Marcin Lukowiak; Stanislaw P. Radziszowski; Yang Xiao
Wireless sensor network security requires cryptographic software that is extremely low in complexity and energy efficient, owing to the limited memory and CPU capacity of a sensor. The NTRU (Nth degree truncated polynomial ring) encryption algorithm has been shown to provide certain advantages when designing low-power and resource-constrained systems, while still providing security levels comparable to higher-complexity algorithms. Unlike current works that build NTRU software in a chip, this research focuses on the hardware implementation of NTRU algorithms because hardware implementation has much higher execution speed than software implementation. In contrast to previous research, the focus is shifted away from specific optimizations toward a study of many of the recommended practices and suggested optimizations, with particular emphasis on polynomial arithmetic and parameter selection. Recommendations for algorithm and parameter selection are made regarding implementation in hardware with respect to the resources available. Copyright © 2008 John Wiley & Sons, Ltd.
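The polynomial arithmetic emphasized in the abstract above centers on multiplication in the truncated ring Z_q[x]/(x^N - 1). The following is a minimal software sketch of that operation, not the hardware design studied in the paper; the function name `ring_multiply` and the toy parameters are assumptions chosen for illustration.

```python
def ring_multiply(a, b, n, q):
    """Multiply two polynomials in Z_q[x] / (x^n - 1).

    Coefficients are listed lowest degree first, and both inputs must
    have exactly n coefficients. (Illustrative helper, not a library API.)
    """
    if len(a) != n or len(b) != n:
        raise ValueError("both polynomials must have exactly n coefficients")
    result = [0] * n
    for i, a_i in enumerate(a):
        if a_i == 0:
            continue  # zero coefficients contribute nothing, so skip them
        for j, b_j in enumerate(b):
            # In this ring x^n equals 1, so exponents wrap around modulo n.
            k = (i + j) % n
            result[k] = (result[k] + a_i * b_j) % q
    return result
```

For example, `ring_multiply([1, 1, 0], [0, 0, 1], n=3, q=7)` returns `[1, 0, 1]`: the product (1 + x)·x^2 = x^2 + x^3 reduces to x^2 + 1 because x^3 wraps around to 1. The zero-skip in the outer loop hints at why sparse, small-coefficient operands make this multiplication cheap.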
Reconfigurable Computing and FPGAs | 2013
Sam Skalicky; Christopher A. Wood; Marcin Lukowiak; Matthew Ryan
One of the pitfalls of FPGA design is the relatively long implementation time when compared to alternative architectures, such as CPU, GPU, or DSP. This time can be greatly reduced, however, by using tools that can generate hardware systems in the form of a hardware description language (HDL) from high-level languages such as C, C++, or Python. Such implementations can be optimized by applying special directives that focus the high-level synthesis (HLS) effort on particular objectives, such as performance, area, throughput, or power consumption. In this paper we examine the benefits of this approach by comparing the performance and design times of HLS-generated systems versus custom systems for matrix multiplication. We investigate matrix multiplication using a standard algorithm, Strassen's algorithm, and a sparse algorithm to provide a comprehensive analysis of the capabilities and usability of the Xilinx Vivado HLS tool. In our experience, a hardware-oriented electrical engineering student can achieve up to 61% of the performance of custom designs with 1/3 the effort, thus enabling faster hardware acceleration of many compute-bound algorithms.
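For readers unfamiliar with the Strassen variant mentioned above, the following is a minimal software sketch of the textbook recursion, which replaces eight half-size products with seven. It is not the HLS or custom hardware design from the paper; all function names are assumptions, and the sketch accepts only square matrices whose size is a power of two.

```python
def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def naive_multiply(a, b):
    """Standard triple-loop product of two n x n matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def split(m):
    """Return the four quadrants (top-left, top-right, bottom-left, bottom-right)."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def strassen(a, b):
    """Multiply two n x n matrices, n a power of two, using seven recursive products."""
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("matrix size must be a power of two")
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # The seven Strassen products.
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))
    # Recombine the products into the quadrants of the result.
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

The recursion trades multiplications for extra additions and temporary storage, which is one reason a comparison against the standard algorithm on a given platform is informative.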
Application-Specific Systems, Architectures, and Processors | 2006
Thomas Warsaw; Marcin Lukowiak
This paper discusses hardware development of a real-time H.264/AVC video decoder. Synthesis results are presented for example implementations of the inverse quantization, inverse transform, and deblocking filter stages. A hardware architecture is also proposed for FPGA implementations of a complete video decoder.
ACM Transactions on Computing Education | 2014
Marcin Lukowiak; Stanislaw P. Radziszowski; James R. Vallino; Christopher A. Wood
With the continuous growth of cyberinfrastructure throughout modern society, the need for secure computing and communication is more important than ever before. As a result, there is also an increasing need for entry-level developers who are capable of designing and building practical solutions for systems with stringent security requirements. This calls for careful attention to algorithm choice and implementation method, as well as trade-offs between hardware and software implementations. This article describes the motivation and efforts of three departments at Rochester Institute of Technology (Computer Engineering, Computer Science, and Software Engineering) to create a multidisciplinary course that integrates the algorithmic, engineering, and practical aspects of security as exemplified by applied cryptography. In particular, the article presents the structure of this new course, the topics covered, the lab tools, and results from the first two spring quarter offerings in 2011 and 2012.
Computers & Electrical Engineering | 2014
Sam Skalicky; Sonia Martín López; Marcin Lukowiak
The potential design space of FPGA accelerators is very large. The factors that define performance of a particular implementation include the architecture design, number of pipelines, and memory bandwidth. In this paper we present a mathematical model that, based on these factors, calculates the computation time of pipelined FPGA accelerators and allows for quick exploration of the design space without any implementation or simulation. We evaluate the model and its ability to identify design bottlenecks and improve performance. Being the core of many compute-intensive applications, linear algebra computations are the main contributors to their total execution time. Hence, five relevant linear algebra computations are selected, analyzed, and the accuracy of the model is validated against implemented designs.
Military Communications Conference | 2010
Kenneth Smith; Marcin Lukowiak
This paper presents a detailed methodology for performing simulated Power Analysis Attacks (PAAs) on gate-level models of cryptographic components with Synopsys design tools. First, the Advanced Encryption Standard (AES) hardware model is developed for the experiment using VHDL. The model is then synthesized with Synopsys DesignCompiler and the 130-nm CMOS standard cell library. Simulated instantaneous power consumption waveforms are generated with Synopsys PrimeTime PX. Results are analyzed, and successful single-bit and multiple-bit Differential Power Analysis (DPA) attacks are performed on the waveforms. The AES hardware model did not implement any DPA countermeasure techniques.
Reconfigurable Computing and FPGAs | 2010
Aric Schorr; Marcin Lukowiak
This paper focuses on the design and analysis of Field Programmable Gate Array (FPGA) hardware for Skein’s tree hashing mode. Several approaches to modifying sequential hashing cores and creating scalable control logic to provide high-speed parallel hashing hardware are presented and analyzed. The results are compared to the current sequential designs of Skein, providing a complete analysis of the performance of Skein in custom FPGA hardware. The post place-and-route results show that our tree-based Skein-256 design achieved a throughput of 3 Gbps using two cores and 5.6 Gbps using four cores, compared to 1.6 Gbps for sequential systems. The design is parametrizable using leaf size (YL), node fanout (YF), and internal block size (ISBIT). The control logic follows the same methodology, but is designed for a specific number of cores (NCORE) to correctly handle the core assignment strategy and control.
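The general shape of tree hashing, and why it parallelizes across cores, can be sketched in software. The example below is only an illustration of the structure: it uses BLAKE2b from Python's standard library as a stand-in compression function, not Skein or its UBI chaining, so its digests are not Skein digests. The function name, the default parameters, and the reading of YL and YF as power-of-two exponents (leaf size of block_bytes · 2^YL bytes, fanout of 2^YF) are assumptions.

```python
import hashlib


def tree_hash(data, y_l=1, y_f=1, block_bytes=32):
    """Hash `data` as a tree of independent leaf and node hashes.

    Stand-in sketch: BLAKE2b replaces Skein, and no level or position
    information is mixed into the nodes as a real tree mode would do.
    """
    leaf_size = block_bytes << y_l   # bytes per leaf (assumed encoding)
    fanout = 1 << y_f                # child digests per node (assumed encoding)

    def digest(chunk):
        return hashlib.blake2b(chunk, digest_size=block_bytes).digest()

    # Level 0: every leaf is hashed independently, so separate cores
    # could process different leaves at the same time.
    leaves = [data[i:i + leaf_size] for i in range(0, len(data), leaf_size)]
    level = [digest(leaf) for leaf in leaves] or [digest(b"")]

    # Higher levels: concatenate groups of `fanout` digests and hash
    # each group, repeating until a single root digest remains.
    while len(level) > 1:
        level = [digest(b"".join(level[i:i + fanout]))
                 for i in range(0, len(level), fanout)]
    return level[0]
```

With the defaults, a 256-byte message is split into four 64-byte leaves, which are reduced to two intermediate digests and then to one root. Larger leaves mean fewer nodes and less control overhead, while smaller leaves expose more parallel work.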
Military Communications Conference | 2015
Matthew Kelly; Alan Kaminsky; Michael Thomas Kurdziel; Marcin Lukowiak; Stanislaw P. Radziszowski
Authenticated encryption (AE) is a symmetric key cryptographic scheme that aims to provide both confidentiality and data integrity. There are many AE algorithms in existence today. However, they are often far from ideal in terms of efficiency and ease of use. For this reason, there is ongoing effort to develop new AE algorithms that are secure, efficient, and easy to use.
Reconfigurable Computing and FPGAs | 2013
Sam Skalicky; Sonia Martín López; Marcin Lukowiak
One of the main challenges of using cutting-edge medical imaging applications in the clinical setting is the large amount of data processing required. Many of these applications are based on linear algebra computations operating on large data sizes, and their execution may require days on a standard CPU. Distributed heterogeneous systems are capable of improving the performance of applications by using the right computation-to-hardware mapping. To achieve high performance, hardware platforms are chosen to satisfy the needs of each computation with corresponding architectural features such as clock speed, number of parallel computational units, and memory bandwidth. In this paper we evaluate the performance benefits of using different hardware platforms to accelerate the execution of a transmural electrophysiological imaging algorithm, targeting a standard CPU with GPU and FPGA accelerators. Using this cutting-edge medical imaging application as a case study, we demonstrate the importance of making intelligent computation assignments for improved performance. We show that, depending on the size of the data structures the application works with, the usage of an FPGA to run certain computations can make a big difference: a heterogeneous system with all three hardware platforms (CPU+GPU+FPGA) can cut the execution time by half, compared to the best result using one single accelerator (CPU+GPU). In addition, our experimental results show that combining CPU, GPU, and FPGA platforms in a single system achieves a speedup of up to 62×, 2×, and 1605× compared to systems with a single CPU, GPU, or FPGA platform, respectively.