P.G. Gulak | Researchain

Publication

Featured researches published by P.G. Gulak.

IEEE Transactions on Communications | 1993

Architectural tradeoffs for survivor sequence memory management in Viterbi decoders

Gennady Feygin; P.G. Gulak

In a Viterbi decoder, there are two known memory organization techniques for the storage of survivor sequences from which the decoded information sequence is retrieved, namely, the register exchange method and the traceback method. This work extends previously known traceback approaches, describes two new traceback algorithms, and compares various traceback methods with each other. Memory size, latency, and implementational complexity of the survivor sequence management are analyzed for both uniprocessor and multiprocessor realizations of Viterbi decoders. A new, one-pointer traceback method is shown to be better than previously known traceback methods.
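
As a rough illustration of the traceback idea mentioned in this abstract, the sketch below stores one decision bit per state per step and recovers the decoded bits by walking backwards through those decisions. It is a plain textbook traceback, not the one-pointer method of the paper; the rate-1/2, constraint-length-3 code with generators (7,5), the hard-decision Hamming metric, and the names `encode` and `viterbi_traceback` are all assumptions made for the example.

```python
def parity(x):
    """Return the XOR of the bits of x."""
    return bin(x).count("1") & 1


def encode(bits):
    """Rate-1/2 convolutional encoder, generators 7 and 5 (octal), assumed here."""
    state = 0
    out = []
    for b in bits:
        reg = (b << 2) | state          # newest bit enters at the MSB
        out.append((parity(reg & 0b111), parity(reg & 0b101)))
        state = reg >> 1
    return out


def viterbi_traceback(received):
    """Hard-decision Viterbi decoder that keeps decisions, then traces back."""
    inf = float("inf")
    metrics = [0, inf, inf, inf]        # encoder is assumed to start in state 0
    decisions = []                      # survivor memory: one bit per state per step
    for r in received:
        new_metrics = [inf] * 4
        dec = [0] * 4
        for ns in range(4):
            b = ns >> 1                 # input bit that leads into state ns
            for d in (0, 1):            # d selects one of the two predecessors
                s = ((ns & 1) << 1) | d
                reg = (b << 2) | s
                expected = (parity(reg & 0b111), parity(reg & 0b101))
                bm = (expected[0] != r[0]) + (expected[1] != r[1])
                cand = metrics[s] + bm
                if cand < new_metrics[ns]:
                    new_metrics[ns] = cand
                    dec[ns] = d
        metrics = new_metrics
        decisions.append(dec)
    # Traceback: start at the best final state and follow stored decisions.
    state = min(range(4), key=lambda s: metrics[s])
    bits = []
    for dec in reversed(decisions):
        bits.append(state >> 1)
        state = ((state & 1) << 1) | dec[state]
    bits.reverse()
    return bits
```

A register exchange decoder would instead copy whole survivor bit strings between states at every step; the traceback form stores only one bit per state per step, at the cost of the backward pass.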

IEEE Transactions on Communications | 2003

VLSI architectures for the MAP algorithm

Emmanuel Boutillon; Warren J. Gross; P.G. Gulak

This paper presents several techniques for the very large-scale integration (VLSI) implementation of the maximum a posteriori (MAP) algorithm. In general, knowledge about the implementation of the Viterbi (1967) algorithm can be applied to the MAP algorithm. Bounds are derived for the dynamic range of the state metrics which enable the designer to optimize the word length. The computational kernel of the algorithm is the add-MAX* operation, which is the add-compare-select operation of the Viterbi algorithm with an added offset. We show that the critical path of the algorithm can be reduced if the add-MAX* operation is reordered into an offset-add-compare-select operation by adjusting the location of registers. A general scheduling for the MAP algorithm is presented which gives the tradeoffs between computational complexity, latency, and memory size. Some of these architectures eliminate the need for RAM blocks with unusual form factors or can replace the RAM with registers. These architectures are suited to VLSI implementation of turbo decoders.
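
The relationship between add-compare-select and the add-MAX* operation described in this abstract can be sketched as follows. This assumes the usual Jacobian-logarithm definition of MAX*, where the added offset is log(1 + exp(-|a - b|)); the function names are hypothetical, and the register reordering discussed in the paper is not modeled.

```python
import math


def max_star(a, b):
    """Compare-select (max) plus a correction offset; equals log(e^a + e^b)."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def add_max_star(metric0, branch0, metric1, branch1):
    """Add branch metrics to two state metrics, then combine them with MAX*."""
    return max_star(metric0 + branch0, metric1 + branch1)
```

Dropping the offset term leaves the plain add-compare-select of the Viterbi algorithm, which corresponds to the max-log approximation.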

International Conference on Acoustics, Speech, and Signal Processing | 2000

A discrete multitone power line communications system

Tooraj Esmailian; P.G. Gulak; Frank R. Kschischang

In-building power lines have been considered as a medium for high-speed data transmission for applications like home networking and Internet access. The frequency selectivity and time variation of this medium, in addition to the high level of narrow-band and impulsive interference, make multi-carrier modulation, and especially its popular variant discrete multitone (DMT), an attractive modulation candidate for this application. This paper presents the results of our measurements of the high frequency characteristics of ordinary in-building power lines, as well as simulation results of a DMT transceiver system in an in-building power line environment.

IEEE Transactions on Very Large Scale Integration Systems | 2012

A 675 Mbps, 4 × 4 64-QAM K-Best MIMO Detector in 0.13 μm CMOS

Mahdi Shabany; P.G. Gulak

This paper introduces a novel scalable pipelined VLSI architecture for a 4 × 4 64-QAM hard-output multiple-input-multiple-output (MIMO) detector based on K-best lattice decoders. The key contribution is a means of expanding the intermediate nodes of the search tree on-demand, rather than exhaustively, along with three types of distributed sorters operating in a pipelined structure. The proposed architecture has a fixed critical path independent of the constellation size, on-demand expansion scheme, efficient distributed sorters, and is scalable to higher number of antennas. Fabricated in 0.13 μm CMOS, it occupies 0.95 mm² core area. Operating at 282 MHz clock frequency, it dissipates 135 mW at 1.3 V supply with no BER performance loss. It achieves an SNR-independent throughput of 675 Mbps satisfying the requirements of IEEE 802.16m and long term evolution (LTE) systems. The measurements confirm that this design consumes 3.0× less energy/bit and operates at a significantly higher throughput compared to the best previously published design.

IEEE Transactions on Communications | 2006

Applications of Algebraic Soft-Decision Decoding of Reed–Solomon Codes

Warren J. Gross; Frank R. Kschischang; Ralf Koetter; P.G. Gulak

Efficient soft-decision decoding of Reed–Solomon codes is made possible by the Koetter–Vardy (KV) algorithm which consists of a front-end to the interpolation-based Guruswami–Sudan list decoding algorithm. This paper approaches the soft-decision KV algorithm from the point of view of a communications systems designer who wants to know what benefits the algorithm can give, and how the extra complexity introduced by soft decoding can be managed at the systems level. We show how to reduce the computational complexity and memory requirements of the soft-decision front-end. Applications to wireless communications over Rayleigh fading channels and magnetic recording channels are proposed. For a high-rate RS(255,239) Reed–Solomon code, 2–3 dB of soft-decision gain is possible over a Rayleigh fading channel using 16-quadrature amplitude modulation. For shorter codes and at lower rates, the gain can be as large as 9 dB. To lower the complexity of decoding on the systems level, the redecoding architecture is explored which uses only the appropriate amount of complexity to decode each packet. An error-detection criterion based on the properties of the KV decoder is proposed for the redecoding architecture. Queuing analysis verifies the practicality of the redecoding architecture by showing that only a modestly sized RAM buffer is required.

International Conference on Acoustics, Speech, and Signal Processing | 2008

A pipelined scalable high-throughput implementation of a near-ML K-best complex lattice decoder

M. Shabany; K. Su; P.G. Gulak

In this paper, a practical pipelined K-best lattice decoder featuring efficient operation over infinite complex lattices is proposed. This feature is a key element that enables it to operate at a significantly lower complexity than currently reported schemes. The main innovation is a simple means of expanding/visiting the intermediate nodes of the search tree on-demand, rather than exhaustively or approximately, and also directly within the complex-domain framework. In addition, a new distributed sorting scheme is developed to keep track of the best candidates at each search phase; the combined expansion and sorting cores are able to find the K best candidates in just K clock cycles. Its support of unbounded infinite lattice decoding distinguishes our work from previous K-best strategies and also allows its complexity to scale sub-linearly with modulation order. Since the expansion and sorting cores cooperate on a data-driven basis, the architecture is well-suited for a pipelined parallel VLSI implementation of the proposed K-best lattice decoder. Comparative results demonstrating the promising performance, complexity and latency profiles of our proposal are provided in the context of the 4x4 MIMO detection problem.
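
For orientation, a conventional K-best tree search can be sketched as below. Unlike the scheme in this abstract, the sketch expands every child exhaustively and sorts them all, works on a real-valued model, and uses a bounded alphabet. It assumes the channel has already been QR-decomposed so that the detector sees y = R s + noise with R upper triangular; the name `k_best_detect` and its parameters are hypothetical.

```python
def k_best_detect(R, y, alphabet, K):
    """Breadth-first K-best search over an upper-triangular system y = R s + n.

    R is a list of rows, y a list of observations, alphabet the candidate
    symbol values, and K the number of survivors kept per tree level.
    """
    n = len(y)
    # Each survivor: (partial squared distance, symbols for layers i+1..n-1).
    survivors = [(0.0, [])]
    for i in range(n - 1, -1, -1):      # detect the last layer first
        children = []
        for dist, tail in survivors:
            # Interference from the already-decided layers i+1..n-1.
            interf = sum(R[i][j] * tail[j - i - 1] for j in range(i + 1, n))
            for s in alphabet:          # exhaustive expansion of all children
                e = y[i] - interf - R[i][i] * s
                children.append((dist + e * e, [s] + tail))
        children.sort(key=lambda c: c[0])
        survivors = children[:K]        # keep only the K best partial paths
    return survivors[0][1]
```

The full sort over K times the alphabet size children at each level is the cost that on-demand expansion and distributed sorting, as described in the abstract, aim to avoid.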

International Symposium on Circuits and Systems | 2008

Scalable VLSI architecture for K-best lattice decoders

Mahdi Shabany; P.G. Gulak

A scalable pipelined VLSI architecture for K-best lattice decoders featuring an efficient operation over infinite lattices is proposed. The proposed architecture operates at a significantly lower complexity than currently reported schemes. The key contribution is a means of expanding/visiting the intermediate nodes of the search tree on-demand, rather than exhaustively, along with three types of distributed sorters operating in a pipelined structure. The combined expansion and sorting cores are able to find the K best candidates in just K clock cycles. Its support of unbounded lattice decoding distinguishes our work from previous K-best strategies. Since the expansion and sorting cores cooperate on a data-driven basis, the architecture is well-suited for a pipelined parallel VLSI implementation. The proposed architecture has the lowest latency reported to date, a fixed critical path independent of the constellation order, an on-demand expansion scheme, efficient distributed sorters, and a pipelined high-throughput implementation, and is scalable to a higher number of antennas/constellation orders.

IEEE Journal on Selected Areas in Communications | 1998

Parallel structures for joint channel estimation and data detection over fading channels

Mohammad Javad Omidi; P.G. Gulak; Subbarayan Pasupathy

Joint data and channel estimation for mobile communication receivers can be realized by employing a Viterbi detector along with channel estimators which estimate the channel impulse response. The behavior of the channel estimator has a strong impact on the overall error rate performance of the receiver. Kalman filtering is an optimum channel estimation technique which can lead to significant improvement in the receiver bit error rate (BER) performance. However, a Kalman filter is a complex algorithm and is sensitive to roundoff errors. Square-root implementation methods are required for robustness against numerical errors. Real-time computation of the Kalman estimator in a mobile communication receiver calls for parallel and pipelined structures to take advantage of the inherent parallelism in the algorithm. In this paper different implementation methods are considered for measurement update and time update equations of the Kalman filter. The unit-lower-triangular-diagonal (LD) correction algorithm is used for the time update equations, and systolic array structures are proposed for its implementation. For the overall implementation of joint data and channel estimation, parallel structures are proposed to perform both the Viterbi algorithm and channel estimation. Simulation results show the numerical stability of different implementation techniques and the number of bits required in the digital computations with different estimators.
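
The time update and measurement update mentioned in this abstract can be illustrated with a scalar Kalman filter tracking a single channel tap. This is the plain covariance form, not the square-root or LD-factorized forms the paper considers; the first-order model h[k+1] = a*h[k] + w, the observation y = x*h + v, and the names `kalman_step`, `q` (process noise variance) and `r` (measurement noise variance) are assumptions for the example.

```python
def kalman_step(h_est, p, y, x, a, q, r):
    """One scalar Kalman iteration for a channel tap h observed as y = x*h + v.

    h_est, p : previous estimate and its error variance
    y, x     : received sample and the known (or hypothesized) symbol
    a, q     : assumed state-transition coefficient and process noise variance
    r        : assumed measurement noise variance
    """
    # Time update: propagate the estimate and its variance through the model.
    h_pred = a * h_est
    p_pred = a * a * p + q
    # Measurement update: correct the prediction with the new observation.
    gain = p_pred * x / (x * x * p_pred + r)
    h_new = h_pred + gain * (y - x * h_pred)
    p_new = (1.0 - gain * x) * p_pred
    return h_new, p_new
```

In a per-survivor arrangement, one such estimator would run for each survivor path, with `x` taken from that path's symbol hypothesis.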

International Symposium on Circuits and Systems | 1991

Survivor sequence memory management in Viterbi decoders

Gennady Feygin; P.G. Gulak

This work extends previous trace-back approaches. A new one-pointer trace-back algorithm for survivor sequence memory management that is particularly well-suited to a VLSI implementation is described. Memory size, latency and implementational complexity of the survivor sequence management are analyzed for both uniprocessor and multiprocessor realizations of Viterbi decoders.

Wireless Personal Communications | 1999

Joint data and Kalman estimation for Rayleigh fading channels

Mohammad Javad Omidi; Subbarayan Pasupathy; P.G. Gulak

Channel estimation is an essential part of many detection techniques proposed for data transmission over fading channels. For the frequency selective Rayleigh fading channel an autoregressive moving average representation is proposed based on the fading model parameters. The parameters of this representation are determined based on the fading channel characteristics, making it possible to employ the Kalman filter as the best estimator for the channel impulse response. For IS-136 formatted data transmission the Kalman filter is employed with the Viterbi algorithm in a Per-Survivor Processing (PSP) fashion and the overall bit error rate performance is shown to be superior to that of detection techniques using the RLS and LMS estimators. To allow more than one channel estimation per symbol interval, the Per-Branch Processing (PBP) method is introduced as a general case of PSP and its effect on performance is evaluated. The sensitivity of performance to parameters such as fading model order and vehicle speed is also studied.
