Chung-Jay Yang | Researchain

Publication

Featured research published by Chung-Jay Yang.

IEEE Transactions on Circuits and Systems | 2012

An Efficient Layered Decoding Architecture for Nonbinary QC-LDPC Codes

Yeong-Luh Ueng; Chen-Yap Leong; Chung-Jay Yang; Chung-Chao Cheng; Shu-Wei Chen

Compared to binary low-density parity-check (LDPC) codes, nonbinary LDPC codes have better error performance when the code length is moderate. This paper presents an efficient layered decoder architecture for nonbinary quasi-cyclic (QC) LDPC codes using the proposed barrel-shifter-based permutation network and minimum value filter which is used to determine the first few smallest values from a given set. Through the permutation network, the decoding operations related to the multiplications over finite fields can be efficiently handled in the check-node operations, which simplifies the permutations in the variable-node operations and, hence, enables the layered decoder to be realized efficiently. In order to increase the throughput, we utilize the proposed permutation network and the minimum value filter to devise a selective-input min-max decoder architecture. Using a 90-nm CMOS process, we implemented three nonbinary decoders to demonstrate the proposed techniques.
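The abstract above names two building blocks: a minimum value filter that finds the first few smallest values in a set, and a barrel-shifter-based permutation. The following is a minimal software sketch of both ideas, not the paper's hardware design. The function names `smallest_values` and `cyclic_shift`, and the choice of a left rotation, are assumptions made for illustration.

```python
def smallest_values(values, k):
    """Return the k smallest entries of `values` in ascending order.

    Single pass over the input, keeping a sorted list of at most k
    candidates (hypothetical software analogue of a minimum value filter).
    """
    kept = []  # always sorted ascending, never longer than k
    for v in values:
        # Walk back from the end to find where v belongs.
        pos = len(kept)
        while pos > 0 and kept[pos - 1] > v:
            pos -= 1
        if pos < k:
            kept.insert(pos, v)
            del kept[k:]  # drop the candidate pushed out of the top k
    return kept


def cyclic_shift(block, shift):
    """Rotate `block` left by `shift` positions, as a barrel shifter would.

    The left direction is an assumption; a QC-LDPC circulant only needs
    some fixed cyclic rotation of each message block.
    """
    shift %= len(block)
    return block[shift:] + block[:shift]
```

For example, `smallest_values([7, 3, 9, 1, 4], 2)` keeps only the two smallest inputs, and `cyclic_shift([0, 1, 2, 3], 1)` moves the first element to the end.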

IEEE Transactions on Circuits and Systems | 2013

An Efficient Multi-Standard LDPC Decoder Design Using Hardware-Friendly Shuffled Decoding

Yeong-Luh Ueng; Bo-Jhang Yang; Chung-Jay Yang; Huang-Chang Lee; Jeng-Da Yang

This paper presents an efficient multi-standard low-density parity-check (LDPC) decoder architecture using a shuffled decoding algorithm, where variable nodes are divided into several groups. In order to provide sufficient memory bandwidth without the need for using registers, a FIFO-based check-mode memory, which dominates the decoder area, is used. Since two compensation factors, rather than a single factor, are dynamically used in the offset Min-Sum algorithm, the number of quantization bits, and, hence, the memory size, can be reduced without degradation in error performance. In order to further reduce the memory size, artificial minimum values, which do not need to be stored in memory, are used. We also propose an algorithm that can be used to partition variable nodes such that the hardware cost can be minimized. Using the proposed techniques, a multi-standard decoder that supports the LDPC codes specified in the ITU G.hn, IEEE 802.11n, and IEEE 802.16e standards was designed and implemented using a 90-nm CMOS process. This decoder supports 133 codes, occupies an area of 5.529 mm2, and achieves an information throughput of 1.956 Gbps.
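For readers unfamiliar with the offset Min-Sum algorithm mentioned above, here is a minimal sketch of the textbook check-node update with a single offset. The abstract does not say how the paper chooses between its two compensation factors, so the sketch simply takes one `offset` argument. The function name and signature are assumptions for illustration.

```python
def offset_min_sum_check_node(msgs, offset):
    """Textbook offset Min-Sum check-node update (single offset).

    msgs   : incoming variable-to-check messages (LLRs), length >= 2
    offset : non-negative compensation factor subtracted from magnitudes
    Returns one outgoing check-to-variable message per input edge.
    """
    mags = [abs(m) for m in msgs]
    # Only the smallest and second-smallest magnitudes are needed.
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(m for i, m in enumerate(mags) if i != min1_idx)

    # Product of the signs of all incoming messages.
    total_sign = 1
    for m in msgs:
        if m < 0:
            total_sign = -total_sign

    out = []
    for i, m in enumerate(msgs):
        # Exclude the edge's own input: use min2 on the edge holding min1.
        mag = min2 if i == min1_idx else min1
        mag = max(mag - offset, 0.0)  # apply the offset, clip at zero
        # Dividing out the edge's own sign equals multiplying by it.
        sign = total_sign * (-1 if m < 0 else 1)
        out.append(sign * mag)
    return out
```

Because every output depends only on the two smallest magnitudes, the index of the smallest, and the sign product, a decoder can store these few values per check node instead of one message per edge.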

IEEE Transactions on Circuits and Systems | 2011

Processing-Task Arrangement for a Low-Complexity Full-Mode WiMAX LDPC Codec

Yu-Luen Wang; Yeong-Luh Ueng; Chien-Lien Peng; Chung-Jay Yang

In this paper, we propose dividing the decoding operations of a variety of irregular quasi-cyclic (QC) low-density parity-check (LDPC) codes into several smaller tasks. An algorithm is devised in order to arrange these tasks in a similar form such that a highly reusable multimode architecture can be designed to process these tasks. For this task-based decoder, the associated memory access can be accomplished with the help of the proposed address generators. Using this approach, the difficulty of designing a low-complexity multimode decoder, which is capable of supporting a variety of irregular QC-LDPC codes, can be overcome. Layered encoding that enables the routing networks and memory for decoding to be reused for the encoding is also proposed. Using these techniques, a multimode codec which can support all 114 WiMAX LDPC codes is designed and implemented in a 90-nm process. The full-mode WiMAX codec achieves a moderate encoding (decoding) throughput of 800 Mb/s (200 Mb/s) and occupies an area of only 0.679 mm2.

IEEE Transactions on Circuits and Systems | 2010

A Multimode Shuffled Iterative Decoder Architecture for High-Rate RS-LDPC Codes

Yeong-Luh Ueng; Chung-Jay Yang; Kuan-Chieh Wang; Chun-Jung Chen

For an efficient multimode low-density parity-check (LDPC) decoder, most hardware resources, such as permutators, should be shared among different modes. Although an LDPC code constructed based on a Reed-Solomon (RS) code with two information symbols is not quasi-cyclic, in this paper, we reveal that the structural properties inherent in its parity-check matrix can be adopted in the design of configurable permutators. A partially parallel architecture combined with the proposed permutators is used to mitigate the increase in implementation complexity for the multimode function. The high check-node degree of a high-rate RS-LDPC code leads to challenges in the efficient implementation of a high-throughput decoder. To overcome this difficulty, the variable nodes have been partitioned into several groups, and each group is processed sequentially in order to shorten the critical-path delay and hence increase the maximum operating frequency. In addition, shuffled message-passing decoding is adopted, since fewer iterations can be used to achieve the desired bit-error-rate performance. In order to demonstrate the usefulness of the proposed flexible-permutator-based architecture, one single-mode rate-0.84 decoder and two multimode decoders whose code rates range between 0.79 and 0.93 have been implemented. These decoders can achieve multigigabit-per-second throughput. Using the proposed architecture to support lower rate RS-LDPC codes, e.g., rate-0.568 code, is also investigated.

IEEE Transactions on Signal Processing | 2013

A High-Throughput Trellis-Based Layered Decoding Architecture for Non-Binary LDPC Codes Using Max-Log-QSPA

Yeong-Luh Ueng; Hsueh-Chih Chou; Chung-Jay Yang

This paper presents a high-throughput decoder architecture for non-binary low-density parity-check (LDPC) codes, where the q-ary sum-product algorithm (QSPA) in the log domain is considered. We reformulate the check-node processing such that an efficient trellis-based implementation can be used, where forward and backward recursions are involved. In order to increase the decoding throughput, bidirectional forward-backward recursion is used. In addition, layered decoding is adopted to reduce the number of iterations based on a given performance. Finally, a message compression technique is used to reduce the storage requirements and hence the area. Using a 90-nm CMOS process, a 32-ary (837,726) LDPC decoder was implemented to demonstrate the proposed techniques and architecture. This decoder can achieve a throughput of 233.53 Mb/s at a clock frequency of 250 MHz based on the post-layout results. Compared to the decoders presented in previous literature, the proposed decoder can achieve the highest throughput based on a similar/better error-rate performance.

IEEE Transactions on Signal Processing | 2012

Jointly Designed Architecture-Aware LDPC Convolutional Codes and Memory-Based Shuffled Decoder Architecture

Yeong-Luh Ueng; Yu-Luen Wang; Li-Sheng Kan; Chung-Jay Yang; Yung-Hsiang Su

In this paper, we jointly design architecture-aware (AA) low-density parity-check convolutional codes (LDPC-CCs) and the associated memory-based decoder architecture based on shuffled message-passing decoding (MPD). We propose a method for constructing AA-LDPC-CCs that can facilitate the design of a memory-based shuffled decoder using parallelization in both iteration and node dimensions. Through the use of shuffled MPD, the number of base processors and, hence, the decoder area is significantly reduced, since fewer iterations are required in order to achieve a desired error performance. In addition, the use of memory instead of registers minimizes the implementation cost of each base processor. In the memory-based decoder, collisions in memory access can be avoided and the difficulty in exchanging information between iterations (processors) is overcome by using simple permutation networks. To demonstrate the feasibility of the proposed techniques, we constructed a time-varying (479, 3, 6) AA-LDPC-CC and implemented its associated shuffled decoder using a 90-nm CMOS process. This decoder comprises 11 processors, occupies an area of 5.36 mm2, and achieves an information throughput of 1.025 Gbps at a clock frequency of 256.4 MHz based on post-layout results.

International Symposium on Circuits and Systems | 2009

A shuffled message-passing decoding method for memory-based LDPC decoders

Yeong-Luh Ueng; Chung-Jay Yang; Chun-Jung Chen

The convergence speed of shuffled message-passing decoding (MPD) is faster than that of standard two-phase message-passing (TPMP) decoding. Due to its complex memory access and large storage requirement, shuffled MPD is not suitable for hardware implementation, especially for high-rate LDPC codes. In this paper, we propose a modified shuffled MPD which can achieve a similar convergence speed but with reduced complexity in memory access and storage space as compared to the conventional shuffled MPD. We implement a rate-5/6 LDPC decoder based on the proposed algorithm.

International Symposium on Circuits and Systems | 2008

VLSI decoding architecture with improved convergence speed and reduced decoding latency for irregular LDPC codes in WiMAX

Yeong-Luh Ueng; Chung-Jay Yang; Zong-Cheng Wu; Chen-Eng Wu; Yu-Lun Wang

In this paper, we modify a previously proposed decoding algorithm and propose a VLSI architecture to decode the quasi-cyclic low-density parity-check (QC-LDPC) code C used in the IEEE 802.16e standard. The modified decoding algorithm sequentially decodes a plurality of block codes whose code lengths are much smaller than that of C. Compared to decoders of similar hardware complexity that use conventional turbo decoding message passing (TDMP), the proposed decoder achieves faster convergence, lower decoding latency, higher throughput, and fewer memory accesses.

International Symposium on VLSI Design, Automation and Test | 2011

A low-complexity LDPC decoder architecture for WiMAX applications

Yu-Luen Wang; Yeong-Luh Ueng; Chian-Lien Peng; Chung-Jay Yang

International SoC Design Conference | 2011

A selective-input non-binary LDPC decoder architecture

Yeong-Luh Ueng; Chung-Jay Yang; Shu-Wei Chen; Wei-Xuan Wu
