Publication


Featured research published by Fahad Qureshi.


International Symposium on Circuits and Systems | 2009

Low-complexity reconfigurable complex constant multiplication for FFTs

Fahad Qureshi; Oscar Gustafsson

In this work, we consider structures for simultaneous multiplication by a small set of pairwise coefficients, where the coefficients are the real and imaginary parts of a limited number of points uniformly spread on the unit circle. Hence, each such multiplier forms half of a complex multiplier suitable for twiddle factor multiplication in FFT architectures. Based on trigonometric identities, we propose a multiplier for a unit circle resolution of 32 points. We also revisit an earlier proposed multiplier for 16 points and show that its complexity can be reduced by using minimum-adder constant multipliers instead of the earlier proposed CSD-based multipliers.
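The abstract does not spell out the multiplier structure, so the following is only a minimal sketch of the background fact it relies on: with 16 or 32 points on the unit circle, every twiddle factor can be rebuilt from a handful of first-octant (cos, sin) pairs by swapping and negating. The function names `octant_table` and `twiddle` are hypothetical, and the code assumes the number of points is a multiple of 8.

```python
import cmath
import math


def octant_table(points: int):
    """(cos, sin) pairs for the angles 0..pi/4 on a circle with `points` points."""
    step = 2 * math.pi / points
    return [(math.cos(k * step), math.sin(k * step))
            for k in range(points // 8 + 1)]


def twiddle(k: int, points: int, table):
    """Rebuild exp(-2j*pi*k/points) using only the first-octant table."""
    quadrant, m = divmod(k % points, points // 4)
    if m <= points // 8:
        c, s = table[m]
    else:
        # Mirror about pi/4: cosine and sine trade places.
        s, c = table[points // 4 - m]
    for _ in range(quadrant):
        # Rotate by 90 degrees: (c, s) -> (-s, c).
        c, s = -s, c
    return complex(c, -s)


if __name__ == "__main__":
    for n in (16, 32):
        table = octant_table(n)
        worst = max(abs(twiddle(k, n, table) - cmath.exp(-2j * math.pi * k / n))
                    for k in range(n))
        print(n, "points:", len(table), "coefficient pairs, max error", worst)
```

Under these assumptions, a 16-point resolution needs three coefficient pairs and a 32-point resolution needs five, which is why a small reconfigurable constant multiplier is enough.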


IEEE Signal Processing Letters | 2010

Addition Aware Quantization for Low Complexity and High Precision Constant Multiplication

Oscar Gustafsson; Fahad Qureshi

Multiplication by constants can be efficiently realized using shifts, additions, and subtractions. In this work, we consider how to select a fixed-point value for a real-valued, rational, or floating-point coefficient to obtain a low-complexity realization. It is shown that the process, denoted addition aware quantization, can often find coefficients that have as low a complexity as the rounded value but a smaller approximation error, by searching among coefficients with a longer wordlength.
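The idea can be sketched as a small search. This is not the authors' algorithm: it assumes the number of nonzero canonical signed-digit (CSD) digits as a stand-in for adder cost (the true minimum adder count can be lower), and the names `csd_nonzeros` and `addition_aware_quantize` are hypothetical.

```python
def csd_nonzeros(value: int) -> int:
    """Count the nonzero digits in the canonical signed-digit form of an integer."""
    value = abs(value)
    count = 0
    while value:
        if value & 1:
            # Choose digit +1 or -1 so that the remainder is divisible by 4.
            digit = 2 - (value & 3)  # value % 4 == 1 -> +1, value % 4 == 3 -> -1
            value -= digit
            count += 1
        value >>= 1
    return count


def addition_aware_quantize(target: float, frac_bits: int, extra_bits: int):
    """Return (coefficient, fractional bits, error) with cost <= the rounded value's."""
    rounded = round(target * 2 ** frac_bits)
    budget = csd_nonzeros(rounded)
    best = (rounded, frac_bits, abs(target - rounded / 2 ** frac_bits))
    for bits in range(frac_bits + 1, frac_bits + extra_bits + 1):
        scale = 2 ** bits
        centre = round(target * scale)
        # Only candidates closer than the current best error can improve on it.
        radius = int(best[2] * scale) + 1
        for cand in range(centre - radius, centre + radius + 1):
            err = abs(target - cand / scale)
            if err < best[2] and csd_nonzeros(cand) <= budget:
                best = (cand, bits, err)
    return best


if __name__ == "__main__":
    target = 2 ** -0.5  # cos(pi/4), a common twiddle factor coefficient
    print("rounded to 6 bits:", round(target * 64), "/ 64")
    print("addition aware   :", addition_aware_quantize(target, 6, 6))
```

The search never returns a coefficient with more nonzero digits than plain rounding at `frac_bits`, so under this cost proxy any reduction in error is obtained without extra adders.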


European Conference on Circuit Theory and Design | 2011

Generation of all radix-2 fast Fourier transform algorithms using binary trees

Fahad Qureshi; Oscar Gustafsson

In this work, a systematic method to generate all possible fast Fourier transform (FFT) algorithms is proposed based on the relation to binary trees. The binary tree is used to represent the decomposition of a discrete Fourier transform (DFT) into sub-DFTs. The radix is changed adaptively according to the sub-DFTs computed in the proposed decomposition. We determine the number of possible algorithms for 2^n-point FFTs with radix-2 butterfly operations and propose a simple method to determine the twiddle factor indices for each algorithm based on the binary tree representation.
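A rough sketch of the binary-tree view follows. It assumes that each algorithm corresponds to one full binary tree with n leaves, where an internal node splits a 2^n-point DFT into 2^a-point and 2^(n-a)-point sub-DFTs and each leaf is a radix-2 butterfly stage; the abstract does not state the count, so the Catalan-number check is a property of this assumption, not a quoted result. The twiddle factor index rule from the paper is not reproduced here, and the name `decompositions` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def decompositions(n: int) -> tuple:
    """All binary-tree decompositions of a 2**n-point DFT into 2-point DFTs.

    A leaf is written as 1 (one radix-2 stage); an internal node is a
    (left, right) pair of sub-decompositions.
    """
    if n == 1:
        return (1,)
    trees = []
    for left_size in range(1, n):
        for left in decompositions(left_size):
            for right in decompositions(n - left_size):
                trees.append((left, right))
    return tuple(trees)


def catalan(k: int) -> int:
    """k-th Catalan number, the count of full binary trees with k + 1 leaves."""
    return comb(2 * k, k) // (k + 1)


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"2^{n}-point FFT: {len(decompositions(n))} decompositions")
```

Under this assumption the number of decompositions grows quickly with n, which suggests why a systematic generation method is useful once the transform is more than a few stages long.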


IEEE Transactions on Circuits and Systems | 2014

Low-Complexity Multiplierless Constant Rotators Based on Combined Coefficient Selection and Shift-and-Add Implementation (CCSSI)

Mario Garrido; Fahad Qureshi; Oscar Gustafsson

This paper presents a new approach to the design of multiplierless constant rotators. The approach is based on a combined coefficient selection and shift-and-add implementation (CCSSI). First, complete freedom is given to the selection of the coefficients, i.e., no constraints on the coefficients are set in advance and all the alternatives are taken into account. Second, the shift-and-add implementation uses advanced single constant multiplication (SCM) and multiple constant multiplication (MCM) techniques that lead to low-complexity multiplierless implementations. Third, the rotators are designed by a joint optimization of the coefficient selection and the shift-and-add implementation. As a result, the CCSSI provides an extended design space that offers a larger number of alternatives than previous works. Furthermore, the design space is explored in a simple and efficient way. The proposed approach applies to numerous hardware scenarios, including rotations by single or multiple angles, rotators in single or multiple branches, and different scalings of the outputs. Experimental results for various scenarios are provided. In all of them, the proposed approach achieves significant improvements with respect to the state of the art.


International Symposium on Circuits and Systems | 2010

Twiddle factor memory switching activity analysis of radix-2^2 and equivalent FFT algorithms

Fahad Qureshi; Oscar Gustafsson

In this paper, we propose equivalent radix-2^2 algorithms and evaluate them based on twiddle factor switching activity for a single delay feedback pipelined FFT architecture. These equivalent pipeline FFT algorithms have the same number of complex multipliers with the same resolution as the radix-2^2 algorithm. It is shown that the twiddle factor switching activity is reduced by up to 40% for some of the equivalent algorithms derived for N = 256.


Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics | 2010

4k-point FFT algorithms based on optimized twiddle factor multiplication for FPGAs

Fahad Qureshi; Syed Asad Alam; Oscar Gustafsson

In this paper, we propose higher-point fast Fourier transform (FFT) algorithms for a single delay feedback pipelined FFT architecture, considering the 4096-point FFT. These algorithms differ from each other in terms of twiddle factor multiplication. A comparison of twiddle factor multiplication complexity is presented for all proposed algorithms when implemented on field-programmable gate arrays (FPGAs). We also discuss the design criteria for the twiddle factor multiplication. Finally, it is shown that there is a trade-off between twiddle factor memory complexity and switching activity in the introduced algorithms.


Asilomar Conference on Signals, Systems and Computers | 2009

Analysis of twiddle factor memory complexity of radix-2^i pipelined FFTs

Fahad Qureshi; Oscar Gustafsson

In this work, we analyze different approaches to storing the twiddle factor coefficients for different stages of pipelined fast Fourier transforms (FFTs). The analysis is based on complexity comparisons of different radix-2^i algorithms when implemented on field-programmable gate arrays (FPGAs) and ASICs. The objective of this work is to investigate the best possible combination for storing the twiddle factor coefficients for each stage of the pipelined FFT.


IEEE Transactions on Very Large Scale Integration Systems | 2017

Efficient FPGA Mapping of Pipeline SDF FFT Cores

Carl Ingemarsson; Petter Kallstrom; Fahad Qureshi; Oscar Gustafsson

In this paper, an efficient mapping of the pipeline single-path delay feedback (SDF) fast Fourier transform (FFT) architecture to field-programmable gate arrays (FPGAs) is proposed. By considering the architectural features of the target FPGA, significantly better implementation results are obtained. This is illustrated by mapping an R2^2SDF 1024-point FFT core to both Xilinx Virtex-4 and Virtex-6 devices. The optimized FPGA mapping is explored in detail. Algorithmic transformations that allow a better mapping are proposed, resulting in implementation results that by far outperform earlier published work. For Virtex-4, the results show a 350% increase in throughput per slice and a 25% reduction in block RAM (BRAM) use, with the same amount of DSP48 resources, compared with the best earlier published result. The resulting Virtex-6 design sees even larger increases in throughput per slice compared with the Xilinx FFT IP core, using half as many DSP48E1 blocks and fewer BRAM resources. The results clearly show that the FPGA mapping is crucial, not only the architecture and algorithm choices.


IEICE Electronics Express | 2011

On the efficient computation of single-bit input word length pipelined FFTs

Saima Athar; Oscar Gustafsson; Fahad Qureshi; Izzet Kale

This letter describes an efficient architecture for the computation of fast Fourier transform (FFT) algorithms with single-bit input. The proposed architecture is aimed at the first stages of pipelined FFT architectures, processing one sample per clock cycle, hence making it suitable for real-time FFT computation. Since natural input order pipeline FFTs use large memories in the early stages, it is important to keep the word length short at the beginning of the pipeline. By replacing the initial butterflies and rotators of an architecture with the proposed block, the memory requirements can be significantly reduced. Comparisons with the commonly used single delay feedback (SDF) architecture show that more than 50% of the required memory can be saved in some cases.


Asilomar Conference on Signals, Systems and Computers | 2008

Comparison of multiplierless implementation of nonlinear-phase versus linear-phase FIR filters

Muhammad Abbas; Fahad Qureshi; Zaka Ullah Sheikh; Oscar Gustafsson; Håkan Johansson; Kenny Johansson

FIR filters are often used because of their linear-phase response. However, there are certain applications, such as signal energy estimation, where the linear-phase property is not required but IIR filters cannot be used because the recursive algorithm limits the sample rate. In this work, we discuss the multiplierless implementation of minimum-order, and therefore nonlinear-phase, FIR filters and compare it to the linear-phase counterpart.
