Wing-Kai Chow
The Chinese University of Hong Kong
Publication
Featured research published by Wing-Kai Chow.
international symposium on physical design | 2014
Wing-Kai Chow; Jian Kuang; Xu He; Wenzan Cai; Evangeline F. Y. Young
A modern placement flow involves global placement, legalization, and detailed placement. Global placement produces a placement solution that minimizes a target objective, usually wire-length, routability, or timing. Legalization removes cell overlap and aligns the cells to the placement sites. Detailed placement further improves the solution by relocating cells. Since target objectives like wire-length and timing are optimized in global placement, legalization and detailed placement should not only minimize their own objectives but also preserve the global placement solution. In this paper, we propose a detailed placement algorithm that minimizes wire-length while preserving the global placement solution through a cell displacement constraint and a target cell density objective. Our detailed placer involves two steps: Global Move, which allocates each cell to a bin/region that minimizes wire-length without overflowing the target cell density, and Local Move, which finely adjusts cell locations within local regions to further minimize the wire-length objective. On the large-scale benchmarks from the ICCAD 2013 detailed placement contest, our detailed placer, RippleDP, improves the global placement results by 13.38% - 16.41% on average under the displacement constraint and the target placement density objective.
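The two quantities this abstract trades off, wire-length and cell displacement, can be illustrated with a minimal sketch. This is not RippleDP itself: it assumes wire-length is measured as half-perimeter wire-length (HPWL) and displacement as Manhattan distance, and the names `hpwl` and `within_displacement` and the toy netlist are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def hpwl(nets: List[List[str]], pos: Dict[str, Point]) -> float:
    """Sum of the half-perimeters of each net's bounding box (assumed metric)."""
    total = 0.0
    for net in nets:
        xs = [pos[cell][0] for cell in net]
        ys = [pos[cell][1] for cell in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def within_displacement(orig: Dict[str, Point],
                        new: Dict[str, Point],
                        limit: float) -> bool:
    """True if every cell moved at most `limit` in Manhattan distance."""
    return all(
        abs(new[c][0] - orig[c][0]) + abs(new[c][1] - orig[c][1]) <= limit
        for c in orig
    )


# Hypothetical toy netlist: two nets over three cells.
nets = [["a", "b"], ["b", "c"]]
global_pos = {"a": (0, 0), "b": (3, 4), "c": (5, 4)}
# A candidate detailed-placement move: shift cell "a" toward cell "b".
moved_pos = {"a": (1, 1), "b": (3, 4), "c": (5, 4)}

before = hpwl(nets, global_pos)   # 9.0
after = hpwl(nets, moved_pos)     # 7.0
accept = after < before and within_displacement(global_pos, moved_pos, limit=2)
```

A move is accepted here only if it shortens the wire-length and keeps every cell within the displacement limit of its global placement location; a real placer would also check the bin density target, which this sketch omits.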
design automation conference | 2013
Xu He; Tao Huang; Wing-Kai Chow; Jian Kuang; Ka-Chun Lam; Wenzan Cai; Evangeline F. Y. Young
Due to a significant mismatch between the objectives of wirelength and routing congestion, the routability issue is becoming more and more important in VLSI design. In this paper, we present a high-quality placer, Ripple 2.0, to solve the routability-driven placement problem. We study in detail how to make use of routing path information in cell spreading and how to relieve congestion caused by tangled logic. Several techniques are proposed, including (1) lookahead routing analysis with pin density consideration, (2) routing path-based cell inflation and spreading, and (3) robust optimization on congested clusters. Under the official evaluation protocol, Ripple 2.0 outperforms the top contestants on the ICCAD 2012 Contest benchmark suite.
asia and south pacific design automation conference | 2010
Jingwei Lu; Wing-Kai Chow; Chiu-Wing Sham; Evangeline F. Y. Young
In nanometer-scale VLSI physical design, the clock network becomes a major concern in determining the overall performance of a digital circuit. Clock skew and PVT (Process, Voltage and Temperature) variations contribute a lot to its behavior. Previous works mainly focused on skew and wirelength minimization, which may have a negative influence on these variation factors. In this paper, a novel clock network synthesizer is proposed and several algorithms are introduced for performance improvement. A dual-MST (DMST) geometric matching approach is proposed for topology construction. It helps balance the tree structure to reduce the variation effect. A recursive buffer insertion technique and a blockage handling method are also presented; they are developed to distribute buffers properly and to save capacitance. Experimental results show that our matching approach is better than the traditional methods, and in particular our synthesizer performs better than the winner of the ISPD 2009 contest.
IEEE Transactions on Very Large Scale Integration Systems | 2012
Jingwei Lu; Wing-Kai Chow; Chiu-Wing Sham
Clock tree synthesis plays an important role in the overall performance of a chip. A gated clock tree is an effective approach to reducing dynamic power usage. In this paper, two novel gated clock tree synthesizers, the power-aware clock tree synthesizer (PACTS) and the power- and slew-aware clock tree synthesizer (PSACTS), are proposed, with zero skew achieved based on the Elmore RC model. In PACTS, the topology of the clock tree is constructed with simultaneous buffer/gate insertion, which reduces the switched capacitance. In PSACTS, a more practical clock slew constraint is applied. In previous works, clock tree synthesis is done first, followed by the insertion of clock gates, and in real cases the clock slew changes a lot after the clock gates are inserted. In our work, the clock tree is constructed simultaneously with the insertion of clock gates. This ensures that the clock slew limit, which is always imposed in real designs, is strictly satisfied. The experimental results show that the power cost of our work is smaller and the runtime is reduced. The slew rate constraint is satisfied with a small clock skew according to SPICE estimation. Overall, our work offers better performance and improved efficiency, and is more practical for industrial application.
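The Elmore RC model mentioned in this abstract can be illustrated with a minimal sketch. This is not PACTS or PSACTS: it only computes Elmore delays on a small RC tree and the resulting skew between sinks. It assumes lumped capacitances at nodes, one resistance per parent-to-child wire, and nodes indexed so that a parent always precedes its children; the function name `elmore_delays` and the example values are hypothetical.

```python
from typing import List


def elmore_delays(parent: List[int], res: List[float], cap: List[float]) -> List[float]:
    """Elmore delay from the root (node 0) to every node of an RC tree.

    parent[i] is the parent of node i (parent[0] is -1 and parent[i] < i),
    res[i] is the resistance of the wire from parent[i] to i,
    cap[i] is the lumped capacitance at node i.
    """
    n = len(parent)
    # Downstream capacitance: capacitance of the whole subtree under each node.
    down = list(cap)
    for i in range(n - 1, 0, -1):
        down[parent[i]] += down[i]
    # Each wire contributes its resistance times the capacitance it drives.
    delay = [0.0] * n
    for i in range(1, n):
        delay[i] = delay[parent[i]] + res[i] * down[i]
    return delay


# Hypothetical symmetric tree: root 0 -> node 1 -> sinks 2 and 3.
parent = [-1, 0, 1, 1]
res = [0, 1, 2, 2]
cap = [0, 1, 1, 1]

delays = elmore_delays(parent, res, cap)
sinks = [2, 3]
skew = max(delays[s] for s in sinks) - min(delays[s] for s in sinks)
```

Because the two branches are identical, both sinks see the same Elmore delay and the skew is zero. A zero-skew synthesizer balances unequal branches to reach the same condition, for example by adjusting wire lengths or inserting buffers.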
international conference on computer aided design | 2014
Jian Kuang; Wing-Kai Chow; Evangeline F. Y. Young
Triple Patterning Lithography (TPL) is regarded as a promising technique for handling the manufacturing challenges at the 14nm technology node and beyond. It is necessary to consider TPL in early design stages to make the layout more TPL-friendly and reduce the manufacturing cost. In this paper, we propose a flow to co-optimize cell layout decomposition and detailed placement. Our cell decomposition approach can enumerate all coloring solutions with the minimum number of stitches. Experimental results show that our approach outperforms the existing work in stitch number, HPWL, and running time.
international conference on computer aided design | 2016
Chak-Wa Pui; Gengjie Chen; Wing-Kai Chow; Ka-Chun Lam; Jian Kuang; Peishan Tu; Hang Zhang; Evangeline F. Y. Young; Bei Yu
As the complexity and scale of FPGA circuits grow, resolving routing congestion becomes more important in FPGA placement. In this paper, we propose a routability-driven placement algorithm for large-scale heterogeneous FPGAs. Our proposed algorithm consists of (1) partitioning, (2) packing, (3) global placement with congestion estimation, (4) window-based legalization, and (5) routing resource-aware detailed placement. Experimental results show that our proposed approach gives routable placement results for all the benchmarks in the ISPD2016 contest and achieves good results compared to the other winning teams of the ISPD2016 contest.
design automation conference | 2016
Wing-Kai Chow; Chak-Wa Pui; Evangeline F. Y. Young
Typical standard cell placement algorithms assume that all cells are of the same height, so that cells can be aligned along the placement rows. However, modern standard cell designs are getting more complicated, and multiple-row-height cells are becoming more common. With multiple-row-height cells, the placement of cells is not independent among different rows. It turns out that most of the commonly used detailed placement and legalization techniques cannot be extended easily to handle the problem. We propose a novel algorithm for handling the legalization of placements involving multiple-row-height cells. The algorithm can efficiently legalize a local region of cells with various heights, which is especially useful for local cell movement, cell sizing, and buffer insertion. Experiments on the application of the technique in detailed placement show that our approach can effectively and efficiently legalize global placement results and obtain significant improvement in the objective function.
design, automation, and test in europe | 2015
Jian Kuang; Wing-Kai Chow; Evangeline F. Y. Young
As the minimum feature size continues to shrink while the wavelength of light used for lithography remains constant, Resolution Enhancement Techniques are widely used to optimize the mask so as to improve subwavelength printability. Besides correcting for the error between the printed image and the target shape, a mask optimization method also needs to consider process variation. In this paper, a robust mask optimization approach is proposed to optimize the process window as well as the Edge Placement Error (EPE) of the printed image. Experimental results on the public benchmarks are encouraging.
Integration | 2014
Wing-Kai Chow; Liang Li; Evangeline F. Y. Young; Chiu-Wing Sham
The Rectilinear Steiner Minimum Tree (RSMT) problem is a fundamental one in VLSI physical design. In this paper, we present a maze-routing-based heuristic to solve the obstacle-avoiding RSMT (OARSMT) problem. Our approach can handle multi-pin nets with good quality and reasonable running time. We also present a parallel implementation of the heuristic with the aid of graphics processing units (GPUs). The parallel algorithm is implemented using CUDA and has been tested on an NVIDIA graphics card. Our experimental results show that our parallel algorithm has promising speedups over our sequential approach. This work demonstrates that a parallel algorithm can be applied to solve the OARSMT problem with the aid of GPUs.
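The general idea of maze routing around obstacles can be illustrated with a minimal sequential sketch. This is not the paper's heuristic or its CUDA implementation: it assumes a unit grid, a breadth-first (Lee-style) wave expansion, and a simple greedy rule that repeatedly connects the unrouted pin nearest to the tree built so far. The names `maze_route` and `route_net` and the example grid are hypothetical.

```python
from collections import deque


def maze_route(grid, sources, targets):
    """Breadth-first wave from all `sources` to the nearest cell in `targets`.

    grid[r][c] == 1 marks an obstacle. Returns the path as a list of cells
    from a source to the reached target, or None if no target is reachable.
    """
    rows, cols = len(grid), len(grid[0])
    prev = {s: None for s in sources}
    queue = deque(sources)
    while queue:
        cell = queue.popleft()
        if cell in targets:
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and nxt not in prev):
                prev[nxt] = cell
                queue.append(nxt)
    return None


def route_net(grid, pins):
    """Greedy tree: repeatedly attach the nearest unrouted pin to the tree."""
    tree = {pins[0]}
    remaining = set(pins[1:])
    while remaining:
        path = maze_route(grid, sorted(tree), remaining)
        if path is None:
            return None  # some pin is walled off by obstacles
        tree.update(path)
        remaining.discard(path[-1])
    return tree


# Hypothetical 3x3 grid with one obstacle in the center.
grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
pins = [(0, 0), (2, 2), (0, 2)]
tree = route_net(grid, pins)
wirelength = len(tree) - 1  # a tree with k cells has k - 1 unit edges
```

Expanding the wave from every cell already in the tree, not only from the pins, lets later connections branch off existing wires, which is how Steiner points arise. The greedy rule is a heuristic and does not guarantee a minimum tree in general.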
international symposium on physical design | 2013
Xu He; Wing-Kai Chow; Evangeline F. Y. Young
In this paper, an effective simultaneous routing and placement refinement tool called SRP is proposed for routability improvement. SRP is independent of any placer and global router. Based on a given placement layout and global routing result, SRP relocates problematic cells by considering routing and placement simultaneously. SRP can resolve not only overflow from local nets, but also overflow from global and semi-global nets. A cell will be relocated and its associated nets will be rerouted if its connections go across any congested region, even if the cell is not in the congested region. Therefore, our method can reduce the overflow effectively. Given the layouts generated by the top four routability-driven placers in the DAC Contest 2012, our method can still reduce the total overflow by 32.6% on average, while the routed wirelength and HPWL do not increase noticeably.
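The selection rule described in this abstract, relocating a cell when any of its connections crosses a congested region, can be illustrated with a minimal sketch. This is not SRP itself: it assumes routing resources are modeled as named edges with a demand and a capacity, and that overflow is the demand in excess of capacity. The names `total_overflow` and `problematic_cells` and the example data are hypothetical.

```python
from typing import Dict, List, Set


def total_overflow(demand: Dict[str, int], capacity: Dict[str, int]) -> int:
    """Sum over routing edges of the demand exceeding capacity (assumed metric)."""
    return sum(max(0, demand[e] - capacity[e]) for e in demand)


def problematic_cells(net_routes: Dict[str, List[str]],
                      net_cells: Dict[str, List[str]],
                      demand: Dict[str, int],
                      capacity: Dict[str, int]) -> Set[str]:
    """Cells with at least one net routed through an overflowed edge."""
    congested = {e for e in demand if demand[e] > capacity[e]}
    cells: Set[str] = set()
    for net, edges in net_routes.items():
        if congested.intersection(edges):
            cells.update(net_cells[net])
    return cells


# Hypothetical global routing result: edge e1 is over capacity.
demand = {"e1": 5, "e2": 2}
capacity = {"e1": 3, "e2": 4}
net_routes = {"n1": ["e1"], "n2": ["e2"]}
net_cells = {"n1": ["a", "b"], "n2": ["b", "c"]}

overflow = total_overflow(demand, capacity)
candidates = problematic_cells(net_routes, net_cells, demand, capacity)
```

Cells `a` and `b` are selected because net `n1` crosses the overflowed edge, regardless of where the cells themselves are placed, while cell `c` touches only an uncongested net and is left alone.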