Shahriar Shahabuddin | Researchain

Publication

Featured research published by Shahriar Shahabuddin.

international symposium on system-on-chip | 2013

Study of adaptive detection for MIMO-OFDM systems

Essi Suikkanen; Janne Janhunen; Shahriar Shahabuddin; Markku J. Juntti

Requirements for higher data rates and lower power consumption set new challenges for the implementation of multiple-input multiple-output orthogonal frequency division multiplexing (MIMO-OFDM) receivers. Simple detectors have the advantage of low complexity and power consumption, but they cannot offer as good performance as more complex detectors. Therefore, it would be beneficial to adapt the detector algorithm to the channel conditions in order to minimize the receiver processing power consumption while satisfying the quality of service requirements. At low signal-to-noise ratio (SNR) and/or in a low rank channel, more power and computation resources could be used for detection in order to guarantee reliable communication, while in good conditions, a simple and less power consuming detector could be used. In this paper, we compare the performance of different detection algorithms. The performance results are based on simulations in a long term evolution (LTE) system. The effect of precoding and hybrid automatic repeat request (HARQ) on the performance is shown. Implementation results based on the existing literature are included in the comparison. We discuss when it would be beneficial to use a complex detector and when a simple one would be sufficient. The switching criterion is also discussed.

international symposium on circuits and systems | 2015

A customized lattice reduction multiprocessor for MIMO detection

Shahriar Shahabuddin; Janne Janhunen; Zaheer Khan; Markku J. Juntti; Amanullah Ghazi

Lattice reduction (LR) is a preprocessing technique for multiple-input multiple-output (MIMO) symbol detection to achieve better bit error-rate (BER) performance. In this paper, we propose a customized homogeneous multiprocessor for LR. Each individual core is based on transport triggered architecture (TTA). We propose a few modifications of the popular LR algorithm, Lenstra-Lenstra-Lovász (LLL) for high throughput. High level programming is used to implement the control path of the TTA cores and several special function units are designed to accelerate the program. The multiprocessor takes 187 cycles to reduce a single matrix for LR. The architecture is synthesized on 90 nm technology and takes 405 kgates at 210 MHz.
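For readers unfamiliar with LLL, the following is a minimal sketch of the textbook real-valued algorithm in exact rational arithmetic. It does not reproduce the throughput-oriented modifications proposed in the paper, and the function names (`dot`, `gram_schmidt`, `lll_reduce`) and the choice of `delta = 3/4` are assumptions made for illustration. Basis vectors are given as rows and must be linearly independent.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the Gram-Schmidt vectors and the mu coefficients of a basis."""
    n = len(basis)
    ortho = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i, b in enumerate(basis):
        v = [Fraction(x) for x in b]
        for j in range(i):
            mu[i][j] = dot(b, ortho[j]) / dot(ortho[j], ortho[j])
            v = [x - mu[i][j] * y for x, y in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def lll_reduce(basis, delta=Fraction(3, 4)):
    """Textbook LLL reduction of an integer basis (rows are basis vectors)."""
    basis = [[Fraction(x) for x in row] for row in basis]
    n = len(basis)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(basis)
            q = round(mu[k][j])
            if q:
                basis[k] = [a - q * b for a, b in zip(basis[k], basis[j])]
        # Lovasz condition: advance if it holds, otherwise swap and step back.
        ortho, mu = gram_schmidt(basis)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            basis[k], basis[k - 1] = basis[k - 1], basis[k]
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in basis]
```

The sketch recomputes the Gram-Schmidt data after every update for clarity, which is far from how a hardware implementation would organize the computation. In a MIMO setting the basis would come from the (real-decomposed) channel matrix.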

Wireless Networks | 2016

Cooperative content delivery exploiting multiple wireless interfaces: methods, new technological developments, open research issues and a case study

Zaheer Khan; Athanasios V. Vasilakos; Bidushi Barua; Shahriar Shahabuddin; Hamed Ahmadi

In this tutorial paper, we discuss and compare cooperative content delivery (CCD) techniques that exploit multiple wireless interfaces available on mobile devices to efficiently satisfy the already massive and rapidly growing user demand for content. The discussed CCD techniques include simultaneous use of wireless interfaces, opportunistic use of wireless interfaces, and aggregate use of wireless interfaces. We provide a taxonomy of different ways in which multiple wireless interfaces are exploited for CCD, and also discuss the real measurement studies that evaluate the content delivery performance of different wireless interfaces in terms of energy consumption and throughput. We describe several challenges related to the design of CCD methods using multiple interfaces, and also explain how new technological developments can help in accelerating the performance of such CCD methods. The new technological developments discussed in this paper include wireless interface aggregation, network caching, and the use of crowdsourcing. We provide a case study for selection of devices in a group for CCD using multiple interfaces. We consider this case study based on the observation that in general different CCD users can have different link qualities in terms of transmit/receive performance, and selection of users with good link qualities for CCD can accelerate the content delivery performance of wireless networks. Finally, we discuss some open research issues relating to CCD using multiple interfaces.

international conference on embedded computer systems architectures modeling and simulation | 2013

Design of a unified transport triggered processor for LDPC/turbo decoder

Shahriar Shahabuddin; Janne Janhunen; Muhammet Fatih Bayramoglu; Markku J. Juntti; Amanullah Ghazi; Olli Silvén

This paper summarizes the design of a programmable processor with transport triggered architecture (TTA) for decoding LDPC and turbo codes. The processor architecture is designed in such a manner that it can be programmed for LDPC or turbo decoding for the purpose of internetworking and roaming between different networks. The standard trellis based maximum a posteriori (MAP) algorithm is used for turbo decoding. Unlike most other implementations, a supercode based sum-product algorithm is used for the check node message computation for LDPC decoding. This approach ensures the highest hardware utilization of the processor architecture for the two different algorithms. To the best of our knowledge, this is the first attempt to design a TTA processor for the LDPC decoder. The processor is programmed with a high level language to meet the time-to-market requirement. The optimization techniques and the usage of the function units for both algorithms are explained in detail. The processor achieves 22.64 Mbps throughput for turbo decoding with a single iteration and 10.12 Mbps throughput for LDPC decoding with five iterations at a clock frequency of 200 MHz.

european conference on networks and communications | 2017

ASIP design for multiuser MIMO broadcast precoding

Shahriar Shahabuddin; Olli Silvén; Markku J. Juntti

This paper presents an application-specific instruction-set processor (ASIP) for multiuser multiple-input multiple-output (MU-MIMO) broadcast precoding. The ASIP is designed for a base station (BS) with four antennas to perform user scheduling and precoding. Transport triggered architecture (TTA) is used as the processor template and a high level language is used to program the ASIP. Several special function units (SFU) are designed to accelerate norm-based greedy user scheduling and minimum mean-square error (MMSE) precoding. We also program zero-forcing dirty paper coding (ZF-DPC) to demonstrate the reusability of the ASIP. A single core provides a throughput of 52.17 Mbps for MMSE precoding and takes an area of 87.53 kgates at 200 MHz on 90 nm technology.

signal processing systems | 2013

Programmable implementation of zero-crossing demodulator on an application specific processor

Amanullah Ghazi; Jani Boutellier; Jari Hannuksela; Shahriar Shahabuddin; Olli Silvén

The zero-intermediate frequency zero-crossing demodulator (ZIFZCD) is extensively used for demodulating continuous phase frequency shift keying (CPFSK) signals in low power and low cost devices. ZIFZCD has previously been implemented as hardwired circuits. Many variations of the ZIFZCD algorithm have been suggested for different modulation methods and channel conditions. To support all these variants, a programmable processor based implementation of the ZIFZCD is needed. This paper describes a programmable software implementation of ZIFZCD on an application specific processor (ASP). The ASP is based on transport triggered architecture (TTA) and provides an ideal low power platform for ZIFZCD implementation due to its simplicity. The designed processor operates at a maximum clock frequency of 250 MHz and has a gate count of 134 kGE for a 32-bit TTA processor and 76 kGE for a 16-bit processor. The demodulator has been developed as a part of an open source radio implementation for wireless sensor nodes.

signal processing systems | 2018

Programmable ASIPs for Multimode MIMO Transceiver

Shahriar Shahabuddin; Olli Silvén; Markku J. Juntti

Application specific instruction-set processors (ASIP) are a programmable and flexible alternative to traditional finite state machine (FSM) controlled register-transfer level (RTL) designs for multimode baseband systems. In this paper, we present two ASIPs for small scale multiple-input multiple-output (MIMO) wireless communication systems that demonstrate the soundness and effectiveness of ASIPs for this type of application. The first ASIP is programmed with multiple MIMO symbol detection algorithms for 4 × 4 systems. The supported detection algorithms are minimum mean-square error (MMSE), two variants of the selective spanning with fast enumeration (SSFE) and K-best list sphere detection (LSD). The second ASIP supports MMSE and zero-forcing dirty paper coding (ZF-DPC) algorithms for a base station (BS) with 4 antennas and for 4 users. Both ASIPs are based on transport triggered architecture (TTA) and are programmed in a high level language with a retargetable compiler to meet the time-to-market requirements. The detection and precoding algorithms can be switched in the respective ASIPs based on the error-rate requirements. Depending on the algorithm, the MIMO detection ASIP delivers 6.16–66.66 Mbps throughput at a clock frequency of 200 MHz on 90 nm technology. The precoder ASIP provides a throughput of 52.17 and 51.95 Mbps for MMSE and ZF-DPC precoding respectively at a clock frequency of 210 MHz on 90 nm technology.

international symposium on circuits and systems | 2017

ADMM-based infinity norm detection for large MU-MIMO: Algorithm and VLSI architecture

Shahriar Shahabuddin; Markku J. Juntti; Christoph Studer

We propose a novel data detection algorithm and a corresponding VLSI design for large multi-user (MU) multiple-input multiple-output (MIMO) wireless receivers. Our algorithm, referred to as ADMIN, performs alternating direction method of multipliers (ADMM)-based infinity norm constrained equalization. ADMIN is an iterative algorithm that outperforms linear detectors if the number of users is small compared to the number of antennas at the base station (BS). ADMIN computes the linear minimum mean-square error (MMSE) solution in the first iteration, which is sufficient when the ratio between the numbers of BS antennas and users is rather large. We develop a time-shared and iterative VLSI architecture for LDL-decomposition based soft-output ADMIN. Our architecture achieves 685.71 Mb/s for linear MMSE and 212.38 Mb/s for ADMIN for a 16-user system that employs 64-QAM in a 28 nm CMOS technology.
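The idea of infinity norm constrained equalization via ADMM can be illustrated with a generic scaled-form ADMM for box-constrained least squares. This is a sketch only: it is not the paper's exact ADMIN algorithm or its soft-output, LDL-based architecture, it is restricted to real-valued signals, and the names `solve`, `admm_box_detect`, `alpha` (box radius) and `rho` (penalty parameter) are assumptions introduced here. With zero initialization, the first `x` update is a regularized least-squares solution, which matches the abstract's remark that the first iteration yields a linear MMSE-type estimate when `rho` is chosen accordingly.

```python
def solve(A, b):
    """Solve A x = b for a small real matrix by Gaussian elimination."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x


def admm_box_detect(H, y, alpha, rho, iters):
    """Minimize ||y - H x||^2 subject to max|x_i| <= alpha with scaled ADMM.

    Returns (x, z): the last unconstrained iterate and its box projection.
    """
    m, n = len(H), len(H[0])
    # Regularized Gramian H^T H + rho * I and matched-filter output H^T y.
    A = [[sum(H[r][i] * H[r][j] for r in range(m)) + (rho if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    Hty = [sum(H[r][i] * y[r] for r in range(m)) for i in range(n)]
    z = [0.0] * n  # constrained copy of x
    u = [0.0] * n  # scaled dual variable
    x = [0.0] * n
    for _ in range(iters):
        x = solve(A, [Hty[i] + rho * (z[i] - u[i]) for i in range(n)])
        # Projection onto the infinity-norm ball is element-wise clipping.
        z = [min(alpha, max(-alpha, x[i] + u[i])) for i in range(n)]
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return x, z
```

For example, `admm_box_detect([[1.0]], [3.0], alpha=1.0, rho=1.0, iters=1)` returns the regularized estimate `x = [1.5]` and its clipped version `z = [1.0]`. A practical detector would reuse one factorization of the regularized Gramian across iterations rather than calling `solve` each time.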

international conference on informatics electronics and vision | 2016

Complexity analysis of matrix decomposition algorithms for linear MIMO detection

Sadiqur Rahaman; Shahnewaz Shahabuddin; Belayat Hossain; Shahriar Shahabuddin

MIMO is a key technology to achieve the thousandfold data rate requirement for next generation communication systems. Linear MIMO detection is becoming more attractive as the antenna dimension is increasing and due to the advent of advanced MIMO techniques, such as massive MIMO. In this paper, we presented the complexity analysis of the matrix decomposition algorithms that are needed to invert the Gramian matrix of a linear MIMO detector. We analyzed the complexity of four different matrix decompositions in this work: two variants of QR, Cholesky and LDL decomposition. The analysis is done for three different antenna configurations. We also presented the detection method using the decomposition algorithms and provided the hard output simulation results.
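As an illustration of decomposition-based linear detection, the sketch below forms the Gramian of a channel matrix, factors it with a Cholesky decomposition, and solves for the symbol estimate by forward and backward substitution instead of computing an explicit inverse. It is a minimal example under stated assumptions, not the paper's implementation or complexity model: the function names (`gramian`, `cholesky`, `solve_cholesky`, `linear_detect`) and the `noise_var` regularization parameter are hypothetical, and the channel matrix is assumed to have full column rank.

```python
import math


def gramian(H, reg=0.0):
    """Return H^H H + reg * I for a matrix given as a list of rows."""
    n = len(H[0])
    return [[sum(row[i].conjugate() * row[j] for row in H)
             + (reg if i == j else 0.0)
             for j in range(n)] for i in range(n)]


def cholesky(A):
    """Factor a Hermitian positive definite A as L L^H (L lower triangular)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(abs(L[j][k]) ** 2 for k in range(j))
        L[j][j] = math.sqrt(d.real)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k].conjugate() for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_cholesky(L, b):
    """Solve (L L^H) x = b by forward and backward substitution."""
    n = len(L)
    z = [0.0] * n
    for i in range(n):  # forward substitution: L z = b
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^H x = z
        s = sum(L[k][i].conjugate() * x[k] for k in range(i + 1, n))
        x[i] = (z[i] - s) / L[i][i]
    return x


def linear_detect(H, y, noise_var=0.0):
    """Zero-forcing (noise_var = 0) or MMSE-style linear estimate of x."""
    n = len(H[0])
    matched = [sum(H[r][i].conjugate() * y[r] for r in range(len(H)))
               for i in range(n)]
    return solve_cholesky(cholesky(gramian(H, noise_var)), matched)
```

With `H = [[2.0, 1.0], [1.0, 3.0]]` and `y = [1.0, -2.0]`, the zero-forcing estimate recovers the transmitted vector `[1, -1]` up to rounding. A hard-output detector would then slice each entry to the nearest constellation point.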

arXiv: Hardware Architecture | 2015