Publication
Featured research published by Stanislaw J. Piestrak.
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 1995
Stanislaw J. Piestrak
This paper proposes a new high-speed ROM-less residue-to-binary converter for the three-moduli residue number systems (RNS) of the form.
Symposium on Computer Arithmetic | 1991
Stanislaw J. Piestrak
The design of residue generators and multioperand modular adders is studied. Novel highly parallel schemes using carry-save adders with end-around carry are proposed for both types of circuit. They are derived from the periodicity of the series of powers of two taken modulo A (where A is the modulus). The novel circuits are faster and use less hardware than existing similar circuits.
International Conference on Computer Design | 1994
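The periodicity argument can be illustrated in software. The sketch below is only a behavioural model, not the circuit from the paper: it assumes an odd modulus A, finds the period P of the powers of two modulo A, and then folds P-bit slices of the input together, which is what an adder with end-around carry does in hardware. The function names are hypothetical.

```python
def period_of_two(A: int) -> int:
    """Return the smallest P > 0 such that 2**P % A == 1 (A odd, A >= 3)."""
    if A < 3 or A % 2 == 0:
        raise ValueError("A must be an odd integer >= 3")
    p, power = 1, 2 % A
    while power != 1:
        power = (power * 2) % A
        p += 1
    return p


def residue_by_folding(x: int, A: int) -> int:
    """Compute x mod A for x >= 0 by folding P-bit slices of x together."""
    P = period_of_two(A)
    mask = (1 << P) - 1
    # Because 2**P is congruent to 1 modulo A, the bits above position P-1
    # can be added back at weight 2**0 without changing the residue.
    while x > mask:
        x = (x & mask) + (x >> P)
    # x now fits in P bits; one small final reduction remains.
    return x % A
```

For example, `period_of_two(7)` is 3, so a residue generator modulo 7 can add 3-bit slices of its input with equal weight. In hardware the slices would be summed in parallel by a carry-save adder tree rather than by the sequential loop used here.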
Stanislaw J. Piestrak
The carry-free nature of the residue number system (RNS) has made it attractive for implementing a variety of specialized high-performance digital signal processing systems. A new implementation of a high-speed residue-to-binary converter based on the Chinese Remainder Theorem (CRT) is proposed. It employs a new r-operand adder modulo M, realized using carry-save adders, whose delay path involves only one addition of two a-bit operands performed by a carry-propagate adder, where M is the dynamic range and a = [log_2 M]. This contrasts with every other CRT-based converter proposed to date, each of which requires two successive additions of a-bit operands. The circuit offers significantly lower latency than the fastest known designs.
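For readers unfamiliar with the CRT, the arithmetic that such a converter evaluates can be sketched as follows. This is a plain software model of the CRT sum, not the carry-save adder structure proposed in the paper; the function name and the example moduli are assumptions chosen for illustration. It requires Python 3.8 or later for `math.prod` and the modular inverse form of `pow`.

```python
from math import prod


def crt_to_binary(residues, moduli):
    """Reconstruct X in [0, M) from its residues, for pairwise coprime moduli."""
    M = prod(moduli)                      # dynamic range
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i                    # product of all the other moduli
        inv = pow(M_i, -1, m_i)           # multiplicative inverse of M_i mod m_i
        # Each term is one operand of the multioperand addition modulo M.
        total += M_i * ((x_i * inv) % m_i)
    return total % M
```

With the moduli (7, 8, 9), the residues of 300 are (6, 4, 3), and `crt_to_binary((6, 4, 3), (7, 8, 9))` recovers 300. The r terms accumulated in the loop correspond to the r operands of the adder modulo M mentioned in the abstract.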
IEEE Transactions on Computers | 2013
Hung-Manh Pham; Sébastien Pillement; Stanislaw J. Piestrak
In this paper, we propose a new approach to implementing a reliable softcore processor on SRAM-based FPGAs that mitigates radiation-induced temporary faults (single-event upsets, SEUs) at moderate cost. A new Enhanced Lockstep scheme built from a pair of MicroBlaze cores is proposed and implemented on a Xilinx Virtex-5 FPGA. Unlike the basic lockstep scheme, ours detects and eliminates its internal temporary configuration upsets without interrupting normal functioning. Faults are detected and eliminated by a Configuration Engine based on the PicoBlaze core which, to avoid a single point of failure, is made fault-tolerant using triple modular redundancy (TMR). A softcore processor can recover from configuration upsets through partial reconfiguration combined with roll-forward recovery. SEUs affecting logic, which are significantly less likely than those affecting configuration, are handled by checkpointing and rollback. Finally, the tiling technique is proposed to handle permanent faults. The new Enhanced Lockstep scheme requires a significantly shorter error recovery time than the conventional lockstep scheme and uses significantly fewer slices than a known TMR-based design (although at the cost of a longer error recovery time). The efficiency of the proposed approach was validated through fault injection experiments.
Compilers, Architecture, and Synthesis for Embedded Systems | 2009
Rooju Chokshi; Krzysztof S. Berezowski; Aviral Shrivastava; Stanislaw J. Piestrak
The 2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits because of the need for cross-datapath carry propagation. The Residue Number System (RNS) breaks free of these bonds by decomposing a number into parts and performing arithmetic operations on them in parallel, significantly reducing the breadth of carry propagation. Consequently, RNS arithmetic has been proposed as a solution to improve the power efficiency of arithmetic hardware. However, the limited expressiveness of RNS in terms of arithmetic operations, together with the overheads of interacting with 2's complement arithmetic, makes it challenging to design a programmable processor that takes advantage of these benefits. In this paper, we meet this challenge through multi-tier synergistic co-design of architecture, microarchitecture, hardware components, and compilation techniques. Our experiments demonstrate not only a simultaneous improvement of up to 30% in performance and a 57% reduction in functional unit power consumption, but also that most of these benefits can be exploited with automatically generated code.
IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing | 2002
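The decomposition described above can be made concrete with a small software model. The moduli set (7, 8, 9) and the function names are assumptions picked for illustration and are not taken from the paper. Each channel operates only on its own small residue, so no carry crosses from one channel to another.

```python
MODULI = (7, 8, 9)  # hypothetical pairwise coprime moduli, dynamic range 504


def to_rns(x, moduli=MODULI):
    """Decompose an integer into one residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_add(a, b, moduli=MODULI):
    """Add channel by channel; the channels are independent of each other."""
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))


def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel, again with no cross-channel carries."""
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, moduli))
```

As a usage example, `rns_add(rns_mul(to_rns(13), to_rns(17)), to_rns(5))` equals `to_rns(13 * 17 + 5)`. Operations such as comparison, division, and overflow detection have no equally simple channel-wise form, which is the expressiveness limitation the abstract refers to.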
Stanislaw J. Piestrak
A squarer mod A is a circuit that computes the residue of the square of an integer X taken modulo a positive integer A. It is an essential building block in a variety of high-speed hardware for a digital signal processor (DSP) using the residue number system (RNS) which implements, e.g., the quarter-square modulo multiplication, the squared Euclidean distance, correlation, and circular convolution. It is also used to build large modulo exponentiators needed for the implementation of cryptographic algorithms. In this paper, a comprehensive study of new squarers mod A is presented. For some special cases of A, like 2^a - 1, 2^a, 2^(a-1) + 1, and others, a general design approach is presented which takes advantage of the periodicity of the series of powers of 2 taken modulo A, with no limitations on the size of A. The resulting squarers are composed almost exclusively of full- and half-adders, which makes them suitable for low-level pipelining. For many A ≤ 64, the minimized logic functions of the squarers with small delay are also derived.
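The quarter-square technique mentioned above rests on the identity X·Y = ((X + Y)^2 - (X - Y)^2) / 4, which turns a multiplication into two squarings. A minimal sketch follows, under the assumption that A is odd so that the division by 4 can be carried out as a multiplication by the inverse of 4 modulo A; the function names are hypothetical and the paper's hardware may handle this step differently. It requires Python 3.8 or later.

```python
def square_mod(x: int, A: int) -> int:
    """Software stand-in for a squarer mod A."""
    return (x * x) % A


def quarter_square_mul(x: int, y: int, A: int) -> int:
    """Compute (x * y) mod A using only two squarings (A must be odd)."""
    if A % 2 == 0:
        raise ValueError("this sketch assumes an odd modulus A")
    inv4 = pow(4, -1, A)                  # inverse of 4 modulo A
    diff = (square_mod(x + y, A) - square_mod(x - y, A)) % A
    return (diff * inv4) % A
```

For instance, `quarter_square_mul(5, 6, 7)` returns 2, matching 30 mod 7. The appeal in hardware is that a squarer is typically smaller than a general multiplier of the same width.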
IEEE Transactions on Reliability | 2003
Stanislaw J. Piestrak; Abbas Dandache; Fabrice Monteiro
We consider the open problem of designing fault-secure (FS) parallel encoders for various systematic linear error-correcting codes (ECC). The main idea relies on generating not only the check bits for error correction but also, separately and in parallel, the check bits for error detection. The latter are then compared against error-detecting check bits regenerated from the error-correcting check bits. The detailed design is presented for encoders for CRC codes. The complexity evaluation of FPGA implementations of encoders with various degrees of parallelism shows that their fault-secure versions compare favorably against their unprotected counterparts with respect to both complexity and the maximal frequency of operation. Future research will include the design of FS decoders for CRC codes as well as the generalization of the presented ideas to the design of FS encoders and decoders for other systematic linear ECC like nonbinary BCH codes and Reed-Solomon codes.
IEEE Transactions on Computers | 1990
Stanislaw J. Piestrak
Methods for designing self-testing checkers (STCs) for arithmetic error-detecting codes are presented. First, general rules for the design of minimal-level STCs for any error-detecting code are given. The design is illustrated with STCs for 3N+B codes, 0 >
IEEE Transactions on Computers | 2002
Stanislaw J. Piestrak
This paper tackles the open problem of designing combinational self-testing checkers (STCs) for K-pair 2-rail codes which are self-testing, even by a subset of codewords, such that some input lines are 0 (or 1) for only one input codeword. The checker presented in the paper has both theoretical and practical importance. It is useful, for example, to build STCs for other systematic error-detecting codes, like Berger codes with I = 2^(K-1) data bits and arithmetic codes with the check base A = 2^(K-1)+I (K = 3, 4, 5, ...). It also allows designers to build functional totally self-checking circuits with 100% fault coverage in which such 2-rail codes could not otherwise have been used.
International Symposium on Low Power Electronics and Design | 2011
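As background, a 2-rail codeword is a set of pairs in which the two bits of every pair are complementary. The sketch below models the textbook two-pair checker cell and a chain of such cells for K pairs. It is not the checker proposed in the paper, and it only demonstrates the code-disjoint property (codeword in, complementary output; non-codeword in, equal outputs), not self-testing under a restricted set of codewords. All names are hypothetical.

```python
from functools import reduce


def two_rail_cell(a, b):
    """Combine two bit pairs into one output pair.

    The output pair is complementary only if both input pairs are.
    """
    a0, a1 = a
    b0, b1 = b
    return ((a0 & b0) | (a1 & b1), (a0 & b1) | (a1 & b0))


def two_rail_checker(pairs):
    """Reduce K input pairs to a single output pair with a chain of cells."""
    return reduce(two_rail_cell, pairs)


def is_two_rail(pair):
    """True if the two bits of the pair are complementary."""
    return pair[0] != pair[1]
```

For example, `two_rail_checker([(0, 1), (1, 0), (0, 1)])` yields a complementary pair, while replacing any input pair with `(1, 1)` or `(0, 0)` yields two equal output bits. A chain or tree of these cells needs a sufficiently rich set of input codewords to exercise every cell, which is the difficulty the abstract addresses.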
Piotr Patronik; Krzysztof S. Berezowski; Stanislaw J. Piestrak; Janusz Biernat; Aviral Shrivastava
In this paper, we present the design of constant-coefficient finite impulse response (FIR) filters using residue number system (RNS) arithmetic. The novelty of our approach lies in an attempt to maximize the accumulated benefit of applying RNS to the design of constant-coefficient filters. To achieve this, we consider the impact of RNS on many layers: from coefficient representation and techniques for sharing subexpressions in the multiplier block (MB), to its optimized usage in the MB and in the accumulation pipeline hardware design. As a result, we propose a common subexpression elimination (CSE) based synthesis technique for RNS-based MBs, along with a high-performance RNS-based FIR filter architecture that employs RNS arithmetic principles but implements them mainly using more efficient 2's complement hardware. Several filters with numbers of taps ranging from 25 to 326 and dynamic ranges from 24 to 50 bits have been synthesized using the TSMC 90 nm LP kit and Cadence RTL Compiler. A comparison of the power, delay, and area of the new filters implemented using 4- and 5-moduli RNSs against various equivalent 2's complement counterparts shows uniform improvement in performance and power efficiency, often accompanied by a significant reduction in area. We observed up to 22% improvement in performance (19% reduction in area) within a bounded power envelope, or up to 14% reduction in power consumption (12% reduction in area) at the same frequency.