Chaohuan Hou
Chinese Academy of Sciences
Publication

Featured research published by Chaohuan Hou.
international conference on embedded computer systems: architectures, modeling, and simulation | 2008
Jun Pang; Lei Yang; Lei Shi; Tiejun Zhang; Donghui Wang; Chaohuan Hou
The performance of modern computer systems is greatly limited by the bandwidth of DRAM-based memory. Altering the sequence of main memory accesses can reduce the observed access latency and therefore improve bus utilization. While previous reordering mechanisms consider factors related to memory access separately, this paper groups several factors together to build a priority expression for bank arbitration based on burst scheduling. The expression considers three factors: the wait time of a burst, the burst length, and the priority of read or write accesses. To make the expression suitable for both read and write accesses, the write queue in a bank is designed to buffer bursts, defined as clusters of row hits, rather than single write accesses. Experimental results from a modified M5 simulator running selected SPEC CPU2000 and Stream benchmarks show that priority-expression-based burst scheduling improves bus utilization by about 74% and reduces execution time by 41% compared with conventional in-order memory scheduling. It also outperforms burst scheduling by 9% in bus utilization and by 5% in execution time reduction. The results show that priority-expression-based burst scheduling is feasible.
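As a rough illustration of how such a priority expression could drive bank arbitration, the sketch below scores each bank's pending burst from the three factors named in the abstract. The linear form, the weights, and the names `Burst`, `priority`, and `pick_burst` are assumptions made for illustration; the abstract does not give the paper's actual expression.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights; the paper's real coefficients are not given in the abstract.
W_WAIT, W_LEN, W_READ = 1.0, 2.0, 4.0

@dataclass
class Burst:
    bank: int
    wait_cycles: int   # how long the burst has been waiting
    length: int        # number of row-hit accesses clustered in the burst
    is_read: bool      # reads are assumed here to be favored over writes

def priority(b: Burst) -> float:
    """Assumed linear combination of wait time, burst length, and access type."""
    return W_WAIT * b.wait_cycles + W_LEN * b.length + (W_READ if b.is_read else 0.0)

def pick_burst(pending: List[Burst]) -> Burst:
    """Bank arbitration: choose the pending burst with the highest priority."""
    return max(pending, key=priority)

pending = [
    Burst(bank=0, wait_cycles=3, length=4, is_read=True),    # 3 + 8 + 4 = 15
    Burst(bank=1, wait_cycles=20, length=1, is_read=False),  # 20 + 2 + 0 = 22
    Burst(bank=2, wait_cycles=5, length=2, is_read=True),    # 5 + 4 + 4 = 13
]
chosen = pick_burst(pending)
print(chosen.bank)  # bank 1: the long-waiting write wins under these weights
```

Including the wait time in the score is what keeps a short write burst from being starved by a stream of longer read bursts; with different weights the same three inputs could yield a very different policy.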
international symposium on communications and information technologies | 2007
Jun Pang; Lei Shi; Siliang Hua; Tiejun Zhang; Chaohuan Hou
This paper introduces a reconfigurable on-chip debugger (OCD) with a real-time tracer. By setting several parameters, the OCD can be readily integrated with diverse microprocessor cores. Moreover, a trace unit is implemented with a Lempel-Ziv compression algorithm to trace the instruction addresses of the target processor in real time, meeting the demand of dynamic debugging. The OCD has been successfully integrated with one 8-bit microcontroller and one 32-bit processor. The experimental results verify that the design is feasible.
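To make the idea of compressing an instruction-address trace concrete, here is a minimal software sketch. It assumes the LZ78 variant of Lempel-Ziv and assumes that addresses are first turned into deltas; the abstract names neither choice, and a hardware trace unit would look quite different. All function names are hypothetical.

```python
def lz78_encode(symbols):
    """Encode a sequence of symbols as (dictionary index, next symbol) pairs."""
    dictionary = {}   # phrase (tuple) -> index; index 0 means the empty phrase
    phrase = ()
    out = []
    for s in symbols:
        candidate = phrase + (s,)
        if candidate in dictionary:
            phrase = candidate            # keep extending a known phrase
        else:
            out.append((dictionary.get(phrase, 0), s))
            dictionary[candidate] = len(dictionary) + 1
            phrase = ()
    if phrase:
        # Flush a trailing phrase that is already in the dictionary.
        out.append((dictionary[phrase], None))
    return out

def lz78_decode(pairs):
    """Rebuild the symbol sequence from (index, symbol) pairs."""
    phrases = [()]
    out = []
    for index, s in pairs:
        phrase = phrases[index] + (() if s is None else (s,))
        out.extend(phrase)
        phrases.append(phrase)
    return out

def to_deltas(addresses):
    """Replace each address by its difference from the previous one (the first from 0)."""
    return [a - p for a, p in zip(addresses, [0] + addresses[:-1])]

def from_deltas(deltas):
    """Invert to_deltas by accumulating the differences."""
    out, current = [], 0
    for d in deltas:
        current += d
        out.append(current)
    return out

# A toy trace: a four-instruction loop executed eight times.
trace = [0x100, 0x104, 0x108, 0x10C] * 8
pairs = lz78_encode(to_deltas(trace))
assert from_deltas(lz78_decode(pairs)) == trace
print(len(trace), "addresses ->", len(pairs), "LZ78 pairs")
```

Loops make address deltas highly repetitive, which is the property a dictionary coder exploits: repeated delta patterns collapse into single dictionary references.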
ieee international conference on solid-state and integrated circuit technology | 2010
Hao Yan; Dian-cheng Wu; Yan Liu; Donghui Wang; Chaohuan Hou
ASK modulation is widely adopted in wireless communication and in bio-implanted systems. In this paper, a low-power CMOS ASK clock and data recovery (CDR) circuit is designed for cochlear implants, based on a proposed monostable circuit in 0.18µm CMOS technology. The monostable circuit is highly robust: its transient time differs by only 5% under 36°C and varies by just 10% across different PVT conditions. It uses no resistor and is therefore easy to integrate on chip; it dissipates an average current of 16.6µA and 10.1µA when recovering the clock from “0” and “1”, respectively. The whole CDR circuit can recover the correct clock and data over a broad duty-cycle range and consumes only 29.52µW.
international conference on asic | 2009
Lei Yang; Tiejun Zhang; Donghui Wang; Chaohuan Hou
Memory is one of the most constrained resources in embedded systems. Code compression techniques address this issue by reducing the code size of programs. Huffman coding is the most commonly used coding method. However, when symbols are generated from instructions, an experience-based partitioning is usually used, which may cause information redundancy. This paper presents an Optimal-Partition Based Code Compression (OPCC) method. A Markov tree model is used to extract the correlation between bits in an instruction. A clustering algorithm is proposed to cluster bits with higher correlation into symbols. Experimental results show that this method can improve the average compression ratio by 4.1%. The decoder is validated on an Altera Cyclone II FPGA.
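The effect of the bit partition on Huffman-coded size can be seen in a small sketch. It assumes one Huffman code per field position and ignores the cost of storing decoding tables; it does not reproduce the Markov tree model or the clustering algorithm, and the 8-bit "instructions" and both partitions are made up for illustration.

```python
import heapq
from collections import Counter
from itertools import count

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over `freqs`."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    tie = count()  # tie-breaker so heap entries never compare the dicts
    heap = [(f, next(tie), {sym: 0}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def split_fields(word, partition):
    """Extract one symbol per group of bit positions (bit 0 = least significant)."""
    return [tuple((word >> bit) & 1 for bit in group) for group in partition]

def compressed_bits(words, partition):
    """Total payload bits when each field position gets its own Huffman code."""
    total = 0
    for i in range(len(partition)):
        freqs = Counter(split_fields(w, partition)[i] for w in words)
        lengths = huffman_code_lengths(freqs)
        total += sum(freqs[s] * lengths[s] for s in freqs)
    return total

# Toy 8-bit "instructions" whose upper and lower nibbles are fully correlated.
words = [0b00001111, 0b00001111, 0b11110000, 0b00001111]
nibbles = [[0, 1, 2, 3], [4, 5, 6, 7]]   # a fixed, experience-style split
whole = [[0, 1, 2, 3, 4, 5, 6, 7]]       # correlated bits grouped into one symbol
print(compressed_bits(words, nibbles), compressed_bits(words, whole))  # 8 4
```

Splitting the correlated bits into two fields forces each field to pay for the same information separately, which is the redundancy the abstract attributes to experience-based partitioning.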
international conference on asic | 2013
Yingke Gao; Diancheng Wu; Quanquan Li; Tiejun Zhang; Chaohuan Hou
As the complexity of integrated circuits increases, the transaction level model (TLM) bridges the architecture and the hardware implementation. Using the SuperV_EF01 DSP as the research prototype, this paper presents a new method to design and implement a transaction level model based on UVM, which provides extensive SystemVerilog libraries. The model can accelerate software development and serve as a golden reference model in the verification of the RTL model, without complex interface functions.
Archive | 2012
Zhu Hao; Peng Chu; Tiejun Zhang; Donghui Wang; Chaohuan Hou
In this paper, a high-performance software framework based on a multi-level hash table for instruction-set simulators (ISS) is presented. The framework not only enhances extensibility at development time, by filling out the instruction set definition file, but also improves efficiency at run time, by loading the instruction identification table and the parameter information table at compile time. The framework is evaluated in several experiments based on c6xsim [1]. It can be conveniently ported to any ISS, for processors of any architecture, and provides 1-to-2x speedups.
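A two-level lookup gives a feel for how a multi-level hash table can identify instructions. The sketch below is only an analogy: the 16-bit encoding, the instruction list, and the names `INSTRUCTION_DEFS`, `build_tables`, and `identify` are invented here, and building the tables once at module load stands in for the framework's compile-time table loading.

```python
# Hypothetical 16-bit encoding, invented for this sketch:
#   bits 15..12 = major opcode, bits 3..0 = minor function (only for major 0x0).
INSTRUCTION_DEFS = [
    # (name, major, minor or None)
    ("add", 0x0, 0x1),
    ("sub", 0x0, 0x2),
    ("load", 0x1, None),
    ("store", 0x2, None),
]

def build_tables(defs):
    """Build a two-level lookup: major opcode -> name, or -> {minor -> name}."""
    level1 = {}
    for name, major, minor in defs:
        if minor is None:
            level1[major] = name
        else:
            level1.setdefault(major, {})[minor] = name
    return level1

TABLES = build_tables(INSTRUCTION_DEFS)  # built once, before simulation starts

def identify(word):
    """Return the instruction name for a 16-bit word, or None if undefined."""
    entry = TABLES.get((word >> 12) & 0xF)
    if isinstance(entry, dict):
        # The major opcode alone is ambiguous; consult the second-level table.
        entry = entry.get(word & 0xF)
    return entry

print(identify(0x0001), identify(0x1234), identify(0xF000))  # add load None
```

In this arrangement, supporting a new instruction means adding one row to the definition list, while decoding stays a fixed number of dictionary lookups regardless of how many instructions are defined.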
international conference on asic | 2011
Hao Yan; Donghui Wang; Chaohuan Hou
With the development of CMOS technology, memory occupies a large part of the chip area and becomes the main power contributor in an SoC. SRAM, the most widely used on-chip memory owing to its low activity, now consumes considerable power in standby mode because of the increasing number of transistors and the scaling of feature length. Therefore, the traditional 6T cell is analyzed and some design principles are given. Finally, a 10T low-leakage SRAM cell with high SNM, based on SMIC 90nm CMOS technology, is introduced. The proposed SRAM cell saves about 88% of the leakage current; its SNM in read operation is enlarged 3.5 times and does not decrease in data retention. To reduce the sensing delay, a two-stage sense amplifier that converts the differential signal to single-ended is also proposed. With this sense amplifier, the sensing delay is reduced to 46% of that of a conventional voltage sense amplifier when the load capacitance is 100fF.
ieee international conference on solid-state and integrated circuit technology | 2010
Siliang Hua; Qi Wang; Hao Yan; Donghui Wang; Chaohuan Hou
In this paper, a current mode logic (CML) transceiver with ±250mV output swing is proposed. The CML transceiver is designed according to an analysis of an inter-die communication model. The model includes both the bonding wire and the transmission line, based on electromagnetic analysis. The CML transceiver is implemented in a 1.8V 0.18µm technology. Simulation results show that the transceiver can reach a data rate of 2.4Gbps while consuming only 27mW.
asia pacific conference on postgraduate research in microelectronics and electronics | 2010
Hao Yan; Yan Liu; Donghui Wang; Chaohuan Hou
This paper studies leakage in 90nm logic CMOS technology. By analyzing the power composition of multi-ported register files and the leakage of different nMOS transistors under different power supplies, a low-swing strategy for bit lines is adopted to save power. An 8-read/4-write-port write-through register file is designed; it dissipates 10.04mW at 500MHz, saving 33.36% of the power in read operation and 24.8% of the energy on average. The leakage power on the bit lines is also reduced by 1.7% by using high-threshold-voltage transistors in the low-swing scheme.
Archive | 2012
Hao Yan; Yan Liu; Donghui Wang; Chaohuan Hou
In this paper, a novel low-swing strategy is proposed for multi-port register file design. The strategy targets a single-ended bit line structure, and the low voltage swing is achieved by sensing and feedback, without an additional internal chip voltage or reference voltage. The method has two parts: a write strategy and a read strategy. In the WRITE low-swing scheme, a self-sense-amplifier memory cell or a modified memory cell can be used to support low-swing WRITE, which saves about 30.1% of the power. In the READ low-swing scheme, a sense amplifier is also introduced. With the low-swing READ strategy, the power dissipated in READ is reduced to 48.1%, and the proposed sense amplifier also improves the sensing delay by about 174ps.
