Chiu-Wing Sham

Hong Kong Polytechnic University

Publication

Featured research published by Chiu-Wing Sham.

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2003

Routability-driven floorplanner with buffer block planning

Chiu-Wing Sham; Evangeline F. Y. Young

In traditional floorplanners, area minimization is an important issue. However, due to the recent advances in very large scale integration technology, the number of transistors in a design is increasing rapidly and so are their switching speeds. This has increased the importance of interconnect delay and routability in the overall performance of a circuit. We should consider interconnect planning, buffer planning, and routability as early as possible. In this paper, we study and implement a routability-driven floorplanner with congestion estimation and buffer planning. Our method is based on a simulated annealing approach that is divided into two phases: the area optimization and congestion optimization phases. In the area optimization phase, modules are roughly placed according to the total area and wirelength. In the congestion optimization phase, a floorplan is evaluated by its area, wirelength, congestion, and routability. We assume that buffers should be inserted at flexible intervals from each other on sufficiently long wires, and probabilistic analysis is performed to compute the congestion information, taking into account the constraints on buffer locations. Our approach is able to reduce the average number of wires in the congested areas and allow more feasible buffer insertions to satisfy the delay constraints without much penalty in floorplan area.

International Symposium on Physical Design | 2002

Routability driven floorplanner with buffer block planning

Chiu-Wing Sham; Evangeline F. Y. Young

In traditional floorplanners, area minimization is an important issue. However, due to the recent advances in VLSI technology, the number of transistors in a design is increasing rapidly and so are their switching speeds. This has increased the importance of interconnect delay and routability in the overall performance of a circuit. We should consider interconnect planning, buffer planning and routability as early as possible. In this paper, we study and implement a routability-driven floorplanner with congestion estimation and buffer planning. Our method is based on a simulated annealing approach that is divided into two phases: the area optimization phase and the congestion optimization phase. In the area optimization phase, modules are roughly placed according to the total area and wirelength. In the congestion optimization phase, a floorplan is evaluated by its area, wirelength, congestion and routability. We assume that buffers should be inserted at flexible intervals from each other on sufficiently long wires, and probabilistic analysis is performed to compute the congestion information, taking into account the constraints on buffer locations. Our approach is able to reduce the average number of wires in the congested areas and allow more feasible buffer insertions to satisfy the delay constraints without much penalty in floorplan area.

IEEE Transactions on Very Large Scale Integration Systems | 2001

A bitstream reconfigurable FPGA implementation of the WSAT algorithm

Philip Heng Wai Leong; Chiu-Wing Sham; W. C. Wong; Hiu Yung Wong; Wing Seung Yuen; Monk-Ping Leong

A field programmable gate array (FPGA) implementation of a coprocessor which uses the WSAT algorithm to solve Boolean satisfiability problems is presented. The input is a SAT problem description file from which a software program directly generates a problem-specific circuit design which can be downloaded to a Xilinx Virtex FPGA device and executed to find a solution. On an XCV300, problems of 50 variables and 170 clauses can be solved. Compared with previous approaches, it avoids the need for resynthesis, placement, and routing for different constraints. Our coprocessor is eminently suitable for embedded applications where energy, weight and real-time response are of concern.
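
For readers unfamiliar with the algorithm the coprocessor accelerates, the following is a minimal software sketch of a WalkSAT-style local search. It is not the FPGA design from the paper; the function name `walksat`, the `noise` parameter, and the DIMACS-style clause encoding are assumptions made for illustration.

```python
import random


def walksat(clauses, num_vars, max_flips=10000, noise=0.5, seed=0):
    """WalkSAT-style local search (illustrative sketch, not the paper's circuit).

    clauses: list of clauses, each a list of nonzero ints (DIMACS-style literals).
    Returns a list of booleans for variables 1..num_vars, or None if no
    satisfying assignment was found within max_flips flips.
    """
    rng = random.Random(seed)
    # assign[v] holds the value of variable v; index 0 is unused.
    assign = [False] + [rng.random() < 0.5 for _ in range(num_vars)]

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def unsat_count_if_flipped(var):
        # Temporarily flip var, count unsatisfied clauses, then restore.
        assign[var] = not assign[var]
        count = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign[1:]
        clause = rng.choice(unsat)
        if rng.random() < noise:
            # Random-walk move: flip any variable of the chosen clause.
            var = abs(rng.choice(clause))
        else:
            # Greedy move: flip the variable leaving the fewest unsatisfied clauses.
            var = min((abs(lit) for lit in clause), key=unsat_count_if_flipped)
        assign[var] = not assign[var]

    return assign[1:] if all(satisfied(c) for c in clauses) else None
```

Calling `walksat([[1, 2], [-1, 3], [-2, -3], [1, -3]], 3)` searches for an assignment of three variables that satisfies all four clauses. The search is incomplete: a `None` result means only that no solution was found within the flip budget.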

Asia and South Pacific Design Automation Conference | 2010

A dual-MST approach for clock network synthesis

Jingwei Lu; Wing-Kai Chow; Chiu-Wing Sham; Evangeline F. Y. Young

In nanometer-scale VLSI physical design, the clock network has become a major factor in determining the overall performance of a digital circuit. Clock skew and PVT (process, voltage, and temperature) variations contribute significantly to its behavior. Previous works mainly focused on skew and wirelength minimization, which may have a negative effect with respect to these variation factors. In this paper, a novel clock network synthesizer is proposed and several algorithms are introduced for performance improvement. A dual-MST (DMST) geometric matching approach is proposed for topology construction. It helps balance the tree structure to reduce the effect of variation. A recursive buffer insertion technique and a blockage handling method are also presented; they are developed to distribute buffers properly and to save capacitance. Experimental results show that our matching approach is better than the traditional methods, and in particular our synthesizer performs better than the winner of the ISPD 2009 contest.

IEEE Transactions on Very Large Scale Integration Systems | 2012

Fast Power- and Slew-Aware Gated Clock Tree Synthesis

Jingwei Lu; Wing-Kai Chow; Chiu-Wing Sham

Clock tree synthesis plays an important role in the overall performance of a chip. A gated clock tree is an effective approach to reducing dynamic power usage. In this paper, two novel gated clock tree synthesizers, power-aware clock tree synthesizer (PACTS) and power- and slew-aware clock tree synthesizer (PSACTS), are proposed, with zero skew achieved based on the Elmore RC model. In PACTS, the topology of the clock tree is constructed with simultaneous buffer/gate insertion, which reduces the switched capacitance. In PSACTS, a more practical clock slew constraint is applied. In previous works, clock tree synthesis is done first, followed by the insertion of clock gates. In real cases, the clock slew changes considerably after the clock gates are inserted. In our work, the clock tree is constructed simultaneously with the insertion of clock gates. This ensures that the clock slew limit, which is always imposed in real designs, can be strictly satisfied. The experimental results show that the power cost of our work is smaller and the runtime is reduced. The slew rate constraint is satisfied with a small clock skew according to SPICE estimation. Overall, our work has better performance and improved efficiency, and is more practical for industrial use.
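
As background for the Elmore RC model mentioned in this abstract, the sketch below computes the Elmore delay of a simple RC ladder, where each series resistance is multiplied by the total capacitance downstream of it. It is only a textbook illustration, not the PACTS or PSACTS algorithm; the function name `elmore_delay_chain` and the ladder topology are assumptions.

```python
def elmore_delay_chain(resistances, capacitances):
    """Elmore delay from the driver to the last node of an RC ladder.

    resistances[i] is the series resistance feeding node i, and
    capacitances[i] is the capacitance to ground at node i.
    Delay = sum over i of R_i * (sum of C_j for all j >= i).
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    downstream = sum(capacitances)  # capacitance seen through the first resistor
    for r, c in zip(resistances, capacitances):
        delay += r * downstream
        downstream -= c  # node i's capacitance is not downstream of R_(i+1)
    return delay


# Two-segment example: 1*(3+4) + 2*4
print(elmore_delay_chain([1.0, 2.0], [3.0, 4.0]))  # 15.0
```

Under this model, two clock sinks have zero skew when their Elmore delays from the clock source are equal.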

IEEE Transactions on Circuits and Systems | 2013

A 2.0 Gb/s Throughput Decoder for QC-LDPC Convolutional Codes

Chiu-Wing Sham; Xu Chen; Francis Chung-Ming Lau; Yue Zhao; Wai Man Tam

This paper proposes a decoder architecture for low-density parity-check convolutional code (LDPCCC). Specifically, the LDPCCC is derived from a quasi-cyclic (QC) LDPC block code. By making use of the quasi-cyclic structure, the proposed LDPCCC decoder adopts a dynamic message storage in the memory and uses a simple address controller. The decoder efficiently combines the memories in the pipelining processors into a large memory block so as to take advantage of the data-width of the embedded memory in a modern field-programmable gate array (FPGA). A rate-5/6 QC-LDPCCC has been implemented on an Altera Stratix FPGA. It achieves up to 2.0 Gb/s throughput with a clock frequency of 100 MHz. Moreover, the decoder displays an excellent error performance of lower than 10^-13 at a bit-energy-to-noise power-spectral-density ratio (Eb/N0) of 3.55 dB.
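
The throughput and clock figures quoted in this abstract imply an average number of bits delivered per clock cycle, which the short calculation below makes explicit. The helper name `bits_per_cycle` is hypothetical, and the abstract does not say how the decoder reaches this figure internally.

```python
def bits_per_cycle(throughput_bps, clock_hz):
    """Average number of bits delivered per clock cycle."""
    return throughput_bps / clock_hz


# Figures from the abstract: 2.0 Gb/s at a 100 MHz clock.
print(bits_per_cycle(2.0e9, 100e6))  # 20.0
```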

System-Level Interconnect Prediction | 2005

Congestion prediction in early stages

Chiu-Wing Sham; Evangeline F. Y. Young

Routability optimization has become a major concern in the physical design cycle of VLSI circuits. Due to the recent advances in VLSI technology, interconnect has become a dominant factor of the overall performance of a circuit. In order to optimize interconnect cost, we need a good congestion estimation method to predict routability in the early stages of the design cycle. Many congestion models have been proposed but there is still a lot of room for improvement. Some existing models [6] are dependent on parameters that are related to the actual congestion of the circuits. Besides, routers will perform rip-up and re-route operations to prevent overflow but most models do not consider this case. The outcome is that the existing models will usually under-estimate the routability. In this paper, we propose a new congestion model to solve the above problems. The estimation process is divided into three steps: preliminary estimation, detailed estimation and congestion redistribution. We have compared our new model and some existing models with the actual congestion measures obtained by globally routing some placement results with a publicly available maze router [2]. Results show that our model has significant improvement in prediction accuracy over the existing models.

ACM Transactions on Design Automation of Electronic Systems | 2009

Congestion prediction in early stages of physical design

Chiu-Wing Sham; Evangeline F. Y. Young; Jingwei Lu

Routability optimization has become a major concern in physical design of VLSI circuits. Due to the recent advances in VLSI technology, interconnect has become a dominant factor of the overall performance of a circuit. In order to optimize interconnect cost, we need a good congestion estimation method to predict routability in the early design stages. Many congestion models have been proposed but there is still a lot of room for improvement. Besides, routers will perform rip-up and reroute operations to prevent overflow, but most models do not consider this case. The outcome is that the existing models will usually underestimate the routability. In this paper, we present a comprehensive study of our proposed congestion models. Results show that the estimation results of our approaches are always more accurate than the previous congestion models.

Integration | 2014

Obstacle-avoiding rectilinear Steiner tree construction in sequential and parallel approach

Wing-Kai Chow; Liang Li; Evangeline F. Y. Young; Chiu-Wing Sham

The Rectilinear Steiner Minimum Tree (RSMT) problem is a fundamental one in VLSI physical design. In this paper, we present a maze-routing-based heuristic to solve the obstacle-avoiding RSMT (OARSMT) problem. Our approach can handle multi-pin nets with good quality and reasonable running time. We also present a parallel implementation of the heuristic with the aid of graphics processing units (GPUs). The parallel algorithm is implemented using CUDA and has been tested on an NVIDIA graphics card. Our experimental results show that our parallel algorithm has promising speedups over our sequential approach. This work demonstrates that we can apply a parallel algorithm to solve the OARSMT problem with the aid of a GPU.

Design, Automation, and Test in Europe | 2002

Congestion Estimation with Buffer Planning in Floorplan Design

Chiu-Wing Sham; Wai-Chiu Wong; Evangeline F. Y. Young

In this paper, we study and implement a routability-driven floorplanner with buffer block planning. It evaluates the routability of a floorplan by computing the probability that a net will pass through each particular location of a floorplan, taking into account buffer locations and routing blockages. Experimental results show that our congestion model can better optimize the congestion and delay (by successful buffer insertions) of a circuit with only a slight penalty in area.
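
The idea of computing the probability that a net passes through each location can be illustrated with a much simpler model than the one in the paper. The sketch below assumes two-pin nets on a grid and treats every monotone shortest (Manhattan) route as equally likely; it ignores buffer locations and routing blockages, which the paper does account for. The names `pass_probability` and `congestion_map` are hypothetical.

```python
from math import comb


def pass_probability(src, dst, cell):
    """Probability that a shortest route from src to dst visits cell.

    Assumes all monotone shortest routes on the grid are equally likely.
    Points are (x, y) integer grid coordinates.
    """
    (x0, y0), (x1, y1), (cx, cy) = src, dst, cell
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    # Offset of the cell from the source, measured toward the destination.
    ax, ay = (cx - x0) * sx, (cy - y0) * sy
    if not (0 <= ax <= dx and 0 <= ay <= dy):
        return 0.0  # outside the bounding box of the net
    # Routes from src to cell, times routes from cell to dst, over all routes.
    through = comb(ax + ay, ax) * comb((dx - ax) + (dy - ay), dx - ax)
    return through / comb(dx + dy, dx)


def congestion_map(nets, width, height):
    """Expected number of nets crossing each cell of a width x height grid."""
    grid = [[0.0] * width for _ in range(height)]
    for src, dst in nets:
        for y in range(height):
            for x in range(width):
                grid[y][x] += pass_probability(src, dst, (x, y))
    return grid
```

For a net from `(0, 0)` to `(2, 2)`, the center cell `(1, 1)` lies on 4 of the 6 shortest routes, so `pass_probability((0, 0), (2, 2), (1, 1))` evaluates to 4/6. Summing these probabilities over many nets gives a per-cell estimate that a floorplanner's cost function could penalize.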
