Chih-Lung Chen
National Chiao Tung University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Chih-Lung Chen.
IEEE Journal of Solid-state Circuits | 2008
Chih-Hao Liu; Shau-Wei Yen; Chih-Lung Chen; Hsie-Chia Chang; Chen-Yi Lee; Yarsun Hsu; Shyh-Jye Jou
An LDPC decoder chip fully compliant to IEEE 802.16e applications is presented. Since the parity check matrix can be decomposed into sub-matrices which are either a zero-matrix or a cyclic shifted matrix, a phase-overlapping message passing scheme is applied to update messages immediately, leading to enhance decoding throughput. With only one shifter-based permutation structure, a self-routing switch network is proposed to merge 19 different sub-matrix sizes as defined in IEEE 802.16e and enable parallel message to be routed without congestion. Fabricated in the 90 nm 1P9M CMOS process, this chip achieves 105 Mb/s at 20 iterations while decoding the rate-5/6 2304-bit code at 150 MHz operation frequency. To meet the maximum data rate in IEEE 802.16e, this chip operates at 109 MHz frequency and dissipates 186 mW at 1.0 V supply.
IEEE Journal of Solid-state Circuits | 2012
Shao-Wei Yen; Shiang-Yu Hung; Chih-Lung Chen; Hsie-Chia Chang; Shyh-Jye Jou; Chen-Yi Lee
An LDPC codec chip supporting four code rates of IEEE 802.15.3c applications is presented. After utilizing row-based layered scheduling, the normalized min-sum (NMS) algorithm can reduce half of the iteration number while maintaining similar performance. According to the unique code structure of the parity-check matrix, a reconfigurable 8/16/32-input sorter is designed to deal with LDPC codes in four different code rates. Both sorter input reallocation and pre-coded routing switch are proposed to alleviate routing complexity, leading to 64% input reduction of multiplexers. In addition, an adder-accumulator-shift register (AASR) circuit is proposed for the LDPC encoder to reduce hardware complexity. After implemented in 65-nm 1P10M CMOS process, the proposed LDPC decoder chip can achieve maximum 5.79-Gb/s throughput with the hardware efficiency of 3.7 Gb/s/mm2 and energy efficiency of 62.4 pJ/b, respectively.
IEEE Transactions on Circuits and Systems Ii-express Briefs | 2009
Chih-Hao Liu; Chien-Ching Lin; Shau-Wei Yen; Chih-Lung Chen; Hsie-Chia Chang; Chen-Yi Lee; Yarsun Hsu; Shyh-Jye Jou
A reconfigurable message-passing network is proposed to facilitate message transportation in decoding multimode quasi-cyclic low-density parity-check (QC-LDPC) codes. By exploiting the shift-routing network (SRN) features, the decoding messages are routed in parallel to fully support those specific 19 and 3 submatrix sizes defined in IEEE 802.16e and IEEE 802.11n applications with less hardware complexity. A 6.22- mm2 QC-LDPC decoder with SRN is implemented in a 90-nm 1-Poly 9-Metal (1P9M) CMOS process. Postlayout simulation results show that the operation frequency can achieve 300 MHz, which is sufficient to process the 212-Mb/s 2304-bit and 178-Mb/s 1944-bit codeword streams for IEEE 802.16e and IEEE 802.11n systems, respectively.
Diamond and Related Materials | 2003
Chi-Yi Tsai; Chih-Lung Chen
Abstract In this study, we focus on the immediately improving quality of growing carbon nanotubes without any pre- or post-treatment. The applied biases during the reaction can directly control the diameter and the quality of carbon nanotubes. This simple step skips additional treatments and is easily used in many deposition systems. The diameter of carbon nanotubes noticeably varies from 45 nm without any amorphous carbon (under +80 V) to 120 nm (under −120 V). Raman spectrums indicate that I D / I G ratio decreases with increasing positive bias. This implies applying positive bias could enhance the graphitization of carbon nanotubes. However, positive and negative bias effects slightly vary the field emission enhancement. In addition, carbon nanotubes grown under positive bias possess better field emission characterization. This results from the following reasons: (I) smaller diameter; (II) pure surface; (III) more graphitized structure; and (IV) higher field enhancement β .
IEEE Transactions on Circuits and Systems | 2015
Xin-Ru Lee; Chih-Lung Chen; Hsie-Chia Chang; Chen-Yi Lee
This paper presents the first silicon-proven stochastic LDPC decoder to support multiple code rates for IEEE 802.15.3c applications. The critical path is improved by a reconfigurable stochastic check node unit (CNU) and variable node unit (VNU); therefore, a high throughput scheme can be realized with 768 MHz clock frequency. To achieve higher hardware and energy efficiency, the reduced complexity architecture of tracking forecast memory is experimentally investigated to implement the variable node units for IEEE 802.15.3c applications. Based on the properties of parity check matrices and stochastic arithmetic, the optimized routing networks with re-permutation techniques are adopted to enhance chip utilization. Considering the measurement uncertainties, a delay-lock loop with isolated power domain and a test environment consisting of an encoder, an AWGN generator and bypass circuits are also designed for inner clock and information generation. With these features, our proposed fully parallel LDPC decoder chip fabricated in 90-nm CMOS process with 760.3 K gate count can achieve 7.92 Gb/s data rate and power consumption of 437.2 mW under 1.2 V supply voltage. Compared to the state-of-the-art IEEE 802.15.3c LDPC decoder chips, our proposed chip achieves over 90% reduction of routing wires, 73.8% and 11.5% enhancement of hardware and energy efficiency, respectively.
european solid-state circuits conference | 2009
Chih-Lung Chen; Kao-Shou Lin; Hsie-Chia Chang; Wai-Chi Fang; Chen-Yi Lee
In this paper, a LDPC decoder chip based on CPPEG code construction is presented. The (2048, 1920) irregular LDPC code generated by CP-PEG algorithm has better performance than other PEG-based codes; however, the large check node degrees introduced by high code-rate 15/16 become the implementation bottleneck. To design such a high coderate LDPC decoder, our approach features variable-node-centric sequential scheduling to reduce iteration number, single piplelined decoder architecture to lessen the message storage memory size, as well as optimized check node unit to further compress the register number. Overall 73% message storage memory is saved as compared with traditional architecture. Fabricated in 90nm 1P9M CMOS technology, a test deocder chip could achieve maximum 11.5 Gbps throughput under 1.4V supply voltage with core area of 2.7 × 1.4 mm2. The energy efficiency is only 0.033 nJ/bit with 5.77 Gbps at 0.8V to meet IEEE 802.15.3c requirements.
IEEE Journal of Solid-state Circuits | 2012
Chih-Lung Chen; Yu-Hsiang Lin; Hsie-Chia Chang; Chen-Yi Lee
This paper presents a (491,3,6) time-varying low-density parity check convolutional code (LDPC-CC) decoder chip. This work combines the algorithm level, node level, and bit level optimizations to achieve over 2 Gb/s throughput with acceptable hardware cost and power. The algorithm level optimization is the on-demand variable node activation scheduling with concealing channel values, which can not only achieve twice faster decoding convergence speed than log-belief propagation (log-BP) algorithm, but also reduce the 17% message storage capacity. The node level optimization duplicates the check node units and variable node units and unfolds the message storage first-in-first-outs (FIFOs) so that the throughput becomes twelve multiplying with clock frequency. In the meantime, the bit level optimization is employed to retime the critical path such that the higher clock frequency can be achieved and message storage size is slightly reduced. Furthermore, a novel hybrid-partitioned FIFO is proposed to provide sufficient memory bandwidth to processing units and alleviate power consumption. With these schemes, a test chip of proposed LDPC-CC decoder has been fabricated in 90 nm CMOS technology with core area of 2.37 × 1.14 mm2. Maximum throughput 2.37 Gb/s is measured under 1.2 V supply with energy efficiency of 0.024 nJ/bit/proc. Depending on the operation mode, power can be scaled down to 90.2 mW while maintaining 1.58 Gb/s at 0.8 V supply.
asian solid state circuits conference | 2010
Shiang-Yu Hung; Shao-Wei Yen; Chih-Lung Chen; Hsie-Chia Chang; Shyh-Jye Jou; Chen-Yi Lee
A LDPC decoder chip supporting four code rates of IEEE 802.15.3c applications is presented. The row-based layered scheduling with normalized min-sum algorithm is proposed to reduce the iteration number while maintaining similar performance. In addition, the reconfigurable 8/16/32-input sorter is designed to deal with four LDPC codes. In order to alleviate routing complexity, both of the reallocation of sorter inputs and pre-coding routing network are proposed to reduce the input numbers of multiplexers by 64%. Fabricated in 65-nm 1P10M CMOS process, this test chip can achieve maximum 5.79Gbps throughput for the highest code rate code while the hardware efficiency and energy efficiency are 3.7Gbps/mm2 and 62.4pJ/bit, respectively.
asian solid state circuits conference | 2006
Cheng-Hao Tang; Cheng-Chi Wong; Chih-Lung Chen; Chien-Ching Lin; Hsie-Chia Chang
In this paper, a high-speed Max-Log MAP decoder is presented for soft-in and soft-out trellis decoding. The high throughput is achieved with a two-dimensional ACS design on the high-radix trellis structure, resulting in a highly parallel and area-efficient decoder. We further apply the retiming technique to reduce the critical path delay of ACS operation. After 0.13 mum CMOS chip implementation, the decoder occupies 1.96 mm2 area containing 220 K gates. The estimated timing under the 1.08 V supply and the worst case corner shows that the test chip can achieve the maximum 952 MS/s throughput. To our knowledge, the present Max-Log MAP decoder has the highest throughput with the modest hardware cost.
IEEE Transactions on Very Large Scale Integration Systems | 2016
Kin-Chu Ho; Chih-Lung Chen; Hsie-Chia Chang
Although Latin square is a well-known algorithm to construct low-density parity-check (LDPC) codes for satisfying long code length, high code-rate, good correcting capability, and low error floor, it has a drawback of large submatrix that the hardware implementation will be suffered from large barrel shifter and worse routing congestion in fitting NAND flash applications. In this paper, a top-down design methodology, which not only goes through code construction and optimization, but also hardware implementation to meet all the critical requirements, is presented. A two-step array dispersion algorithm is proposed to construct long LDPC codes with a small submatrix size. Then, the constructed LDPC code is optimized by masking matrix to obtain better bit-error rate (BER) performance and lower error-floor. In addition, our LDPC codes have a diagonal-like structure in the parity-check matrix leading to a proposed hybrid storage architecture, which has the advantages of better area efficiency and large enough data bandwidth for high decoding throughput. To be adopted for NAND flash applications, an (18900, 17010) LDPC code with a code-rate of 0.9 and submatrix size of 63 is constructed and the field-programmable gate array simulations show that the error floor is successfully suppressed down to BER of 10-12. An LDPC decoder using normalized min-sum variable-node-centric sequential scheduling decoding algorithm is implemented in UMC 90-nm CMOS process. The postlayout result shows that the proposed LDPC decoder can achieve a throughput of 1.58 Gb/s at six iterations with a gate count of 520k under a clock frequency of 166.6 MHz. It meets the throughput requirement of both NAND flash memories with Toggle double data rate 1.0 and open NAND flash interface 2.3 NAND interfaces.