Arash Reyhani-Masoleh
University of Western Ontario
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Arash Reyhani-Masoleh.
IEEE Transactions on Computers | 2004
Arash Reyhani-Masoleh; M.A. Hasan
Representing the field elements with respect to the polynomial (or standard) basis, we consider bit parallel architectures for multiplication over the finite field GF(2m). In this effect, first we derive a new formulation for polynomial basis multiplication in terms of the reduction matrix Q. The main advantage of this new formulation is that it can be used with any field defining irreducible polynomial. Using this formulation, we then develop a generalized architecture for the multiplier and analyze the time and gate complexities of the proposed multiplier as a function of degree m and the reduction matrix Q. To the best of our knowledge, this is the first time that these complexities are given in terms of Q. Unlike most other articles on bit parallel finite field multipliers, here we also consider the number of signals to be routed in hardware implementation and we show that, compared to the well-known Mastrovitos multiplier, the proposed architecture has fewer routed signals. The proposed generalized architecture is further optimized for three special types of polynomials, namely, equally spaced polynomials, trinomials, and pentanomials. We have obtained explicit formulas and complexities of the multipliers for these three special irreducible polynomials. This makes it very easy for a designer to implement the proposed multipliers using hardware description languages like VHDL and Verilog with minimum knowledge of finite field arithmetic.
IEEE Transactions on Computers | 2002
Arash Reyhani-Masoleh; M. A. Hasan
The Massey-Omura multiplier of GF(2/sup m/) uses a normal basis and its bit parallel version is usually implemented using m identical combinational logic blocks whose inputs are cyclically shifted from one another. In the past, it was shown that, for a class of finite fields defined by irreducible all-one polynomials, the parallel Massey-Omura multiplier had redundancy and a modified architecture of lower circuit complexity was proposed. In this article, it is shown that, not only does this type of multiplier contain redundancy in that special class of finite fields, but it also has redundancy in fields GF(2/sup m/) defined by any irreducible polynomial. By removing the redundancy, we propose a new architecture for the normal basis parallel multiplier, which is applicable to any arbitrary finite field and has significantly lower circuit complexity compared to the original Massey-Omura normal basis parallel multiplier. The proposed multiplier structure is also modular and, hence, suitable for VLSI realization. When applied to fields defined by the irreducible all-one polynomials, the multipliers circuit complexity matches the best result available in the open literature.
IEEE Transactions on Computers | 2010
Mehran Mozaffari-Kermani; Arash Reyhani-Masoleh
The Advanced Encryption Standard (AES) has been lately accepted as the symmetric cryptography standard for confidential data transmission. However, the natural and malicious injected faults reduce its reliability and may cause confidential information leakage. In this paper, we study concurrent fault detection schemes for reaching a reliable AES architecture. Specifically, we propose low-cost structure-independent fault detection schemes for the AES encryption and decryption. We have obtained new formulations for the fault detection of SubBytes and inverse SubBytes using the relation between the input and the output of the S-box and the inverse S-box. The proposed schemes are independent of the way the S-box and the inverse S-box are constructed. Therefore, they can be used for both the S-boxes and the inverse S-boxes using lookup tables and those utilizing logic gates based on composite fields. Our simulation results show the error coverage of greater than 99 percent for the proposed schemes. Moreover, the proposed and the previously reported fault detection schemes have been implemented on the most recent Xilinx Virtex FPGAs. Their area and delay overheads have been compared and it is shown that the proposed schemes outperform the previously reported ones.
IEEE Transactions on Computers | 2006
Arash Reyhani-Masoleh
Recently, implementations of normal basis multiplication over the extended binary field GF(2/sup m/) have received considerable attention. A class of low complexity normal bases called Gaussian normal bases has been included in a number of standards, such as IEEE and NIST for an elliptic curve digital signature algorithm. The multiplication algorithms presented there are slow in software since they rely on bit-wise inner product operations. In this paper, we present two vector-level software algorithms which essentially eliminate such bit-wise operations for Gaussian normal bases. Our analysis and timing results show that the software implementation of the proposed algorithm is faster than previously reported normal basis multiplication algorithms. The proposed algorithm is also more memory efficient compared with its look-up table-based counterpart. Moreover, two new digit-level multiplier architectures are proposed and it is shown that they outperform the existing normal basis multiplier structures. As compared with similar digit-level normal basis multipliers, the proposed multiplier with serial output requires the fewest number of XOR gates and the one with parallel output is the fastest multiplier.
IEEE Transactions on Computers | 2005
Arash Reyhani-Masoleh; M.A. Hasan
For efficient hardware implementation of finite field arithmetic units, the use of a normal basis is advantageous. In this paper, two classes of architectures for multipliers over the finite field GF(2/sup m/) are proposed. These multipliers are of sequential type, i.e., after receiving the coordinates of the two input field elements, they go through k, 1 /spl les/ k /spl les/ m, iterations (i.e., clock cycles) to finally yield all the coordinates of the product in parallel. The value of k depends on the word size w = /spl lceil/m/k/spl rceil/. For w > 1, these multipliers are highly area efficient and require fewer number of logic gates even when compared with the most area efficient multipliers available in the open literature. This makes the proposed multipliers suitable for applications where the value of m is large but space is of concern, e.g., resource constrained cryptographic systems. Additionally, if the field dimension m is composite, i.e., m = kn, then the extension of one class of the architectures yields a highly efficient multiplier over composite fields.
IEEE Transactions on Very Large Scale Integration Systems | 2011
Mehran Mozaffari-Kermani; Arash Reyhani-Masoleh
The faults that accidently or maliciously occur in the hardware implementations of the Advanced Encryption Standard (AES) may cause erroneous encrypted/decrypted output. The use of appropriate fault detection schemes for the AES makes it robust to internal defects and fault attacks. In this paper, we present a lightweight concurrent fault detection scheme for the AES. In the proposed approach, the composite field S-box and inverse S-box are divided into blocks and the predicted parities of these blocks are obtained. Through exhaustive searches among all available composite fields, we have found the optimum solutions for the least overhead parity-based fault detection structures. Moreover, through our error injection simulations for one S-box (respectively inverse S-box), we show that the total error coverage of almost 100% for 16 S-boxes (respectively inverse S-boxes) can be achieved. Finally, it is shown that both the application-specific integrated circuit and field-programmable gate-array implementations of the fault detection structures using the obtained optimum composite fields, have better hardware and time complexities compared to their counterparts.
IEEE Transactions on Computers | 2006
Arash Reyhani-Masoleh; M.A. Hasan
In many cryptographic schemes, the most time consuming basic arithmetic operation is the finite field multiplication and its hardware implementation for bit parallel operation may require millions of logic gates. Some of these gates may become faulty in the field due to natural causes or malicious attacks, which may lead to the generation of erroneous outputs by the multiplier. In this paper, we propose new architectures to detect erroneous outputs caused by certain types of faults in bit-parallel and bit-serial polynomial basis multipliers over finite fields of characteristic two. In particular, parity prediction schemes are developed for detecting errors due to single and certain multiple stuck-at faults. Although the issue of detecting soft errors in registers is not considered, the proposed schemes have the advantage that they can be used with any irreducible binary polynomial chosen to define the finite field
defect and fault tolerance in vlsi and nanotechnology systems | 2006
Mehran Mozaffari Kermani; Arash Reyhani-Masoleh
In this paper, the authors present parity-based fault detection architecture of the S-box for designing high performance fault detection structures of the advanced encryption standard. Instead of using look-up tables for the S-box and its parity prediction, logical gate implementations based on the composite field are utilized. After analyzing the error propagation for injected single faults, the authors modify the original S-box and suggest fault detection architecture for the S-box. Using the closed formulations for the predicted parity bits, the authors propose a parity-based fault detection scheme for reaching the maximum fault coverage. Moreover, the overhead costs, including space complexity and time delay of our modified S-box and the parity predictions are also compared to those of the previously reported ones
IEEE Transactions on Computers | 2009
Arash Hariri; Arash Reyhani-Masoleh
Multiplication and squaring are main finite field operations in cryptographic computations and designing efficient multipliers and squarers affect the performance of cryptosystems. In this paper, we consider the Montgomery multiplication in the binary extension fields and study different structures of bit-serial and bit-parallel multipliers. For each of these structures, we study the role of the Montgomery factor, and then by using appropriate factors, propose new architectures. Specifically, we propose two bit-serial multipliers for general irreducible polynomials, and then derive bit-parallel Montgomery multipliers for two important classes of irreducible polynomials. In this regard, first we consider trinomials and provide a way for finding efficient Montgomery factors which results in a low time complexity. Then, we consider type-II irreducible pentanomials and design two bit-parallel multipliers which are comparable to the best finite field multipliers reported in the literature. Moreover, we consider squaring using this family of irreducible polynomials and show that this operation can be performed very fast with the time complexity of two XOR gates.
IEEE Transactions on Computers | 2003
Arash Reyhani-Masoleh; M.A. Hasan
For cryptographic applications, normal bases have received considerable attention, especially for hardware implementation. We consider fast software algorithms for normal basis multiplication over the extended binary field GF(2/sup m/). We present a vector-level algorithm, which essentially eliminates the bit-wise inner products needed in the conventional approach to the normal basis multiplication. We then present another algorithm, which significantly reduces the dynamic instruction counts. Both algorithms utilize the full width of the data-path of the general purpose processor on which the software is to be executed. We also consider composite fields and present an algorithm, which can provide further speed-ups and an added flexibility toward hardware-software codesign of processors for very large finite fields.