Huapeng Wu | Researchain

Publication

Featured research published by Huapeng Wu.

IEEE Transactions on Computers | 2002

Bit-parallel finite field multiplier and squarer using polynomial basis

Huapeng Wu

Bit-parallel finite field multiplication using a polynomial basis can be realized in two steps: polynomial multiplication and reduction modulo the irreducible polynomial. In this article, we present an upper complexity bound for the modular polynomial reduction. When the field is generated with an irreducible trinomial, closed-form expressions for the coefficients of the product are derived in terms of the coefficients of the multiplicands. The complexity of the multiplier architectures and their critical path length are evaluated, and they are comparable to the previous proposals for the same class of fields. An analytical form for the bit-parallel squaring operation is also presented. The complexities of the bit-parallel squarer are also derived when an irreducible trinomial is used. Consequently, it is argued that computing the multiplicative inverse using a polynomial basis can be at least as good as using a normal basis.
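
The two steps named in this abstract (polynomial multiplication, then reduction modulo an irreducible trinomial) can be illustrated in software. The following is only a minimal sketch, not the paper's gate-level architecture: field elements are assumed to be Python integers whose bit i holds the coefficient of x^i, and the function names are hypothetical.

```python
def poly_mul(a, b):
    """Step 1: carry-less (GF(2)) polynomial multiplication."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce_trinomial(c, m, k):
    """Step 2: reduce c modulo x^m + x^k + 1 using x^m = x^k + 1."""
    mask = (1 << m) - 1
    while c.bit_length() > m:
        high = c >> m                      # coefficients of x^m and above
        c = (c & mask) ^ (high << k) ^ high
    return c


def gf_mul(a, b, m, k):
    """Multiply two elements of GF(2^m) in polynomial basis."""
    return reduce_trinomial(poly_mul(a, b), m, k)


def gf_square(a, m, k):
    """Square by spreading coefficient i to position 2*i, then reducing."""
    spread = 0
    i = 0
    while a >> i:
        if (a >> i) & 1:
            spread |= 1 << (2 * i)
        i += 1
    return reduce_trinomial(spread, m, k)


# Example in GF(2^3) with the irreducible trinomial x^3 + x + 1 (m=3, k=1):
# x * x^2 = x^3 = x + 1
print(gf_mul(0b010, 0b100, 3, 1))  # 3
```

Squaring needs no partial products because cross terms cancel in characteristic 2, which is why a dedicated squarer can be cheaper than a general multiplier.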

IEEE Transactions on Computers | 1998

New low-complexity bit-parallel finite field multipliers using weakly dual bases

Huapeng Wu; Masud Hasan; I.F. Blake

New structures of bit-parallel weakly dual basis (WDB) multipliers over the binary ground field are proposed. An upper bound on the size complexity of bit-parallel multiplier using an arbitrary generating polynomial is given. When the generating polynomial is an irreducible trinomial $x^m + x^k + 1$, $1 \le k \le \lfloor m/2 \rfloor$, the structure of the proposed bit-parallel multiplier requires only $m^2$ two-input AND gates and at most $m^2 - 1$ XOR gates. The time delay is no greater than $T_A + (\lceil \log_2 m \rceil + 2)T_X$, where $T_A$ and $T_X$ are the time delays of an AND gate and an XOR gate, respectively.

IEEE Transactions on Computers | 2002

A New Finite-Field Multiplier Using Redundant Representation

Huapeng Wu; M.A. Hasan; Ian F. Blake; Shuhong Gao

This article presents simple and highly regular architectures for finite field multipliers using a redundant representation. The basic idea is to embed a finite field into a cyclotomic ring which is based on the elegant multiplicative structure of a cyclic group. One important feature of our architectures is that they provide area-time trade-offs which enable us to implement the multipliers in a partial-parallel/hybrid fashion. This hybrid architecture has great significance in its VLSI implementation in very large fields. The squaring operation using the redundant representation is simply a permutation of the coordinates. It is shown that, when there is an optimal normal basis, the proposed bit-serial and hybrid multiplier architectures have very low space complexity. Constant multiplication is also considered and is shown to have an advantage in using the redundant representation.
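
The claim that squaring in the redundant representation is a permutation of the coordinates can be checked with a small sketch. It assumes the cyclotomic ring is GF(2)[x]/(x^n + 1) with n odd and that elements are n-bit Python integers; the function names are hypothetical and this is not the paper's hardware design. As one concrete case, GF(2^4) embeds in the ring with n = 5, since x^5 + 1 = (x + 1)(x^4 + x^3 + x^2 + x + 1).

```python
def ring_mul(a, b, n):
    """Multiply in GF(2)[x]/(x^n + 1): a cyclic convolution over GF(2)."""
    mask = (1 << n) - 1
    result = 0
    for i in range(n):
        if (b >> i) & 1:
            # Multiplying by x^i is a cyclic rotation by i positions.
            result ^= ((a << i) | (a >> (n - i))) & mask
    return result


def ring_square(a, n):
    """Square by moving coordinate i to coordinate 2*i mod n (n odd)."""
    result = 0
    for i in range(n):
        if (a >> i) & 1:
            result |= 1 << ((2 * i) % n)
    return result


# x + x^2 squared is x^2 + x^4 in the ring with n = 5.
print(bin(ring_square(0b00110, 5)))  # 0b10100
```

Because n is odd, the map i -> 2i mod n is a bijection on the coordinates, so the squarer is pure wiring with no logic gates in this model.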

IEEE Transactions on Computers | 2002

Montgomery multiplier and squarer for a class of finite fields

Huapeng Wu

Montgomery multiplication in $GF(2^m)$ is defined by $a(x)b(x)r^{-1}(x) \bmod f(x)$, where the field is generated by a root of the irreducible polynomial $f(x)$, $a(x)$ and $b(x)$ are two field elements in $GF(2^m)$, and $r(x)$ is a fixed field element in $GF(2^m)$. In this paper, first, a slightly generalized Montgomery multiplication algorithm in $GF(2^m)$ is presented. Then, by choosing $r(x)$ according to $f(x)$, we show that efficient architectures of bit-parallel Montgomery multiplier and squarer can be obtained for the fields generated with an irreducible trinomial. Complexities of the Montgomery multiplier and squarer in terms of gate counts and time delay of the circuits are investigated and found to be as good as or better than those of previous proposals for the same class of fields.
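
The definition above can be illustrated with a bit-serial software sketch. It assumes the classic choice r(x) = x^m, which is not necessarily the r(x) selected in the paper (the abstract says r(x) is chosen according to f(x)); elements are Python integers with bit i as the coefficient of x^i, and all function names are hypothetical.

```python
def mont_mul(a, b, f, m):
    """Return a(x) * b(x) * x^(-m) mod f(x), assuming r(x) = x^m."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b
        if c & 1:
            c ^= f      # f has constant term 1, so this clears bit 0
        c >>= 1         # exact division by x
    return c


def ref_mul(a, b, f, m):
    """Ordinary multiplication a(x) * b(x) mod f(x), used as a reference."""
    result = 0
    for i in reversed(range(m)):
        result <<= 1
        if (result >> m) & 1:
            result ^= f
        if (b >> i) & 1:
            result ^= a
    return result


# GF(2^3) with f(x) = x^3 + x + 1, so r(x) = x^3 = x + 1 in the field.
f, m = 0b1011, 3
print(mont_mul(1, 1, f, m))  # 6, i.e. x^(-3) = x^2 + x
```

Multiplying the Montgomery result by r(x) with the reference multiplier recovers the ordinary product, which is a convenient sanity check for any choice of inputs.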

IEEE Transactions on Computers | 1998

Low complexity bit-parallel multipliers for a class of finite fields

Huapeng Wu; M.A. Hasan

This short paper summarizes our recent results on the construction of low-complexity bit-parallel finite field multipliers using a polynomial basis. The complexity and time delay of the proposed multipliers are lower than those of similar proposals.

International Symposium on Circuits and Systems | 2009

Efficient hardware implementation of the hyperbolic tangent sigmoid function

Ashkan Hosseinzadeh Namin; Karl Leboeuf; Roberto Muscedere; Huapeng Wu; Majid Ahmadi

Efficient implementation of the activation function is important in the hardware design of artificial neural networks. Sigmoid and hyperbolic tangent sigmoid functions are the most widely used activation functions for this purpose. In this paper, we present a simple and efficient architecture for digital hardware implementation of the hyperbolic tangent sigmoid function. The proposed method employs a piecewise linear approximation as a foundation, and further improves the results using a lookup table. Our design proves to be more efficient than similar proposals when area × delay is used as the performance metric. VLSI implementation of the proposed design using a 0.18 µm CMOS process is also presented, which shows a 35% improvement over similar recently published architectures.
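
The general idea of a piecewise linear approximation refined by a lookup table can be sketched in floating-point software. The segment width, saturation point, and table size below are illustrative assumptions and are not taken from the paper, which describes a fixed-point hardware design; all names are hypothetical.

```python
import math

SEG_W = 0.5           # assumed segment width
X_SAT = 4.0           # assumed point beyond which the output saturates to 1
BINS_PER_UNIT = 64    # assumed lookup table resolution

# tanh sampled at the segment breakpoints 0, 0.5, ..., 4.0
BREAKS = [math.tanh(i * SEG_W) for i in range(int(X_SAT / SEG_W) + 1)]


def tanh_pwl(x):
    """Piecewise linear approximation using odd symmetry and saturation."""
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    if x >= X_SAT:
        return sign
    i = int(x / SEG_W)
    t = x / SEG_W - i
    return sign * (BREAKS[i] + t * (BREAKS[i + 1] - BREAKS[i]))


# Lookup table of residual errors, sampled at the center of each bin.
LUT = []
for j in range(int(X_SAT * BINS_PER_UNIT)):
    center = (j + 0.5) / BINS_PER_UNIT
    LUT.append(math.tanh(center) - tanh_pwl(center))


def tanh_pwl_lut(x):
    """Piecewise linear approximation plus a table-based correction."""
    y = tanh_pwl(x)
    ax = abs(x)
    if ax >= X_SAT:
        return y
    corr = LUT[int(ax * BINS_PER_UNIT)]
    return y + corr if x >= 0 else y - corr


xs = [i / 1000 for i in range(-5000, 5001)]
print(max(abs(math.tanh(x) - tanh_pwl(x)) for x in xs))
print(max(abs(math.tanh(x) - tanh_pwl_lut(x)) for x in xs))
```

The second printed maximum error is smaller than the first, showing how a modest table of corrections tightens a coarse linear fit.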

International Conference on Hybrid Information Technology | 2008

High Speed VLSI Implementation of the Hyperbolic Tangent Sigmoid Function

Karl Leboeuf; Ashkan Hosseinzadeh Namin; Roberto Muscedere; Huapeng Wu; Majid Ahmadi

The hyperbolic tangent function is commonly used as the activation function in artificial neural networks. In this work, two different hardware implementations of the hyperbolic tangent function are proposed. Both methods approximate the function rather than calculating it directly, since it is exponential in nature. The first method uses a lookup table to approximate the function, while the second method reduces the size of the table by using range addressable decoding as opposed to the classic decoding scheme. Hardware synthesis results show that the proposed methods are significantly faster and use less area than other similar methods with the same amount of error.
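
The difference between a classic lookup table and range addressable decoding can be sketched as follows. This is a software analogy under assumed input and output resolutions, not the paper's circuit: a classic table stores one output per input code, while a range addressable table stores one output per run of input codes that share the same quantized output. The constants and names are hypothetical, and only non-negative inputs are covered (negative inputs would follow from odd symmetry).

```python
import bisect
import math

IN_STEP = 1 / 64    # assumed input resolution
OUT_STEP = 1 / 32   # assumed output resolution
N_IN = 512          # input codes cover [0, 8)


def quantized_tanh(code):
    """tanh of an input code, rounded to the output resolution."""
    return round(math.tanh(code * IN_STEP) / OUT_STEP) * OUT_STEP


# Classic decoding: one stored output per input code.
LUT = [quantized_tanh(code) for code in range(N_IN)]

# Range addressable decoding: one stored output per range of input codes.
RANGE_STARTS, RANGE_VALUES = [], []
for code, y in enumerate(LUT):
    if not RANGE_VALUES or RANGE_VALUES[-1] != y:
        RANGE_STARTS.append(code)
        RANGE_VALUES.append(y)


def ralut_lookup(code):
    """Find the range containing the code and return its stored output."""
    return RANGE_VALUES[bisect.bisect_right(RANGE_STARTS, code) - 1]


print(len(LUT), len(RANGE_VALUES))
```

In hardware the range match would typically be done by parallel comparators rather than a binary search; `bisect` is used here only to keep the sketch short.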

IEEE Transactions on Computers | 1997

Efficient exponentiation of a primitive root in $GF(2^m)$

Huapeng Wu; Masud Hasan

In this paper, exponentiation of a primitive root in $GF(2^m)$ is considered. Signed digit (SD) number representation is used to efficiently represent the exponent, and the corresponding algorithms and structures for exponentiation are developed. For primitive multiplications required in exponentiations, extended bidirectional linear feedback shift registers are proposed and used for the cases where the exponent is represented as a binary or a radix-4 SD number. Comparisons are made with other methods on the basis of space, time, and possible power consumption. Since the proposed structures can effectively reduce power and area when implemented in VLSI, they are especially suitable for battery-powered portable devices.
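
A signed digit exponent lets a square-and-multiply loop use both multiplication and division by the primitive root, each of which is a single shift with feedback. The sketch below assumes the non-adjacent form (one particular binary SD recoding, not necessarily the one used in the paper), the field GF(2^4) generated by the primitive polynomial x^4 + x + 1, and the root x; all names are hypothetical.

```python
M = 4
F = 0b10011   # x^4 + x + 1, a primitive polynomial over GF(2)


def naf(e):
    """Non-adjacent form of e, least significant digit first."""
    digits = []
    while e > 0:
        if e & 1:
            d = 2 - (e % 4)   # +1 if e = 1 mod 4, -1 if e = 3 mod 4
            e -= d
        else:
            d = 0
        digits.append(d)
        e >>= 1
    return digits


def mul_x(a):
    """Multiply by the root x: shift left with feedback."""
    a <<= 1
    if (a >> M) & 1:
        a ^= F
    return a


def div_x(a):
    """Multiply by x^(-1): shift right with feedback."""
    if a & 1:
        a ^= F
    return a >> 1


def gf_mul(a, b):
    """General multiplication in GF(2^M), needed here only for squaring."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = mul_x(a)
        b >>= 1
    return result


def exp_root(e):
    """Compute x^e by scanning the signed digits from the most significant."""
    result = 1
    for d in reversed(naf(e)):
        result = gf_mul(result, result)
        if d == 1:
            result = mul_x(result)
        elif d == -1:
            result = div_x(result)
    return result


print(naf(7))        # [-1, 0, 0, 1], i.e. 8 - 1
print(exp_root(4))   # 3, since x^4 = x + 1
```

The recoding reduces the number of nonzero digits, so fewer shift-with-feedback steps are needed than with the plain binary exponent.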

IEEE Transactions on Computers | 2008

Bit-Parallel Polynomial Basis Multiplier for New Classes of Finite Fields

Huapeng Wu

In this paper, three small classes of finite fields $GF(2^m)$ are found for which low-complexity bit-parallel multipliers are proposed. The proposed multipliers have lower complexities than those based on irreducible pentanomials. It is also shown that there does not always exist an irreducible all-one polynomial, equally-spaced polynomial, or trinomial for the new classes of fields.

IEEE International NEWCAS Conference | 2005

VLSI implementation of bit-parallel word-serial multiplier in $GF(2^{233})$

Wenkai Tang; Huapeng Wu; Majid Ahmadi

A bit-parallel word-serial (BPWS) finite field multiplier in $GF(2^{233})$ is proposed in this paper. The complexities are lower than or comparable to those of previous similar proposals. A VLSI implementation of the BPWS multiplier combined with a bit-parallel squarer is also presented. The fabricated ASIC chip can be used as the finite field arithmetic module on a cryptographic accelerator board based on elliptic curve techniques. The proposed VLSI design could also be utilized as a design IP core for fast implementation of a cryptographic processor or smart card.
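
A word-serial multiplication schedule can be sketched in software: one operand is consumed one word per step, and each step combines a shift of the accumulator with a full-width partial product. The sketch assumes the trinomial x^233 + x^74 + 1 (the reduction polynomial used for the NIST 233-bit binary curves; the abstract does not state which polynomial the chip uses) and a 32-bit word size; all names are hypothetical.

```python
M, K = 233, 74   # assumed field polynomial: x^233 + x^74 + 1
MASK = (1 << M) - 1


def clmul(a, b):
    """Carry-less (GF(2)) polynomial multiplication."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def reduce(c):
    """Reduce modulo x^M + x^K + 1 using x^M = x^K + 1."""
    while c.bit_length() > M:
        high = c >> M
        c = (c & MASK) ^ (high << K) ^ high
    return c


def bpws_mul(a, b, w=32):
    """Multiply in GF(2^M), consuming b one w-bit word per step."""
    n_words = -(-M // w)   # ceiling division
    acc = 0
    for j in reversed(range(n_words)):
        word = (b >> (j * w)) & ((1 << w) - 1)
        # One step: shift the accumulator by a word, add a * word, reduce.
        acc = reduce((acc << w) ^ clmul(a, word))
    return acc


# x^232 * x = x^233 = x^74 + 1
print(bpws_mul(1 << 232, 2) == (1 << 74) | 1)  # True
```

The word size trades area for time: a larger w means fewer steps but a wider partial-product stage in each step.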
