Kai-Pui Lam

The Chinese University of Hong Kong

Publication

Featured research published by Kai-Pui Lam.

IEEE Transactions on Neural Systems and Rehabilitation Engineering | 2006

A Component-Based FPGA Design Framework for Neuronal Ion Channel Dynamics Simulations

Terrence S. T. Mak; Guy Rachmuth; Kai-Pui Lam; Chi-Sang Poon

Neuron-machine interfaces such as dynamic clamp and brain-implantable neuroprosthetic devices require real-time simulations of neuronal ion channel dynamics. The field-programmable gate array (FPGA) has emerged as a high-speed digital platform ideal for such application-specific computations. We propose an efficient and flexible component-based FPGA design framework for neuronal ion channel dynamics simulations, which overcomes certain limitations of the recently proposed memory-based approach. A parallel processing strategy is used to minimize computational delay, and a hardware-efficient factoring approach for calculating exponential and division functions in neuronal ion channel models is used to reduce resource consumption. The performance of the various FPGA design approaches is compared theoretically and experimentally in corresponding implementations of the alpha-amino-3-hydroxy-5-methyl-4-isoxazole propionic acid (AMPA) and N-methyl-D-aspartate (NMDA) synaptic ion channel models. Our results suggest that the component-based design framework provides a more memory-economical solution, as well as more efficient logic utilization for large word lengths, whereas the memory-based approach may be suitable for time-critical applications where a higher throughput rate is desired.

system on chip conference | 2010

A CMOS Current-Mode Dynamic Programming Circuit

Terrence S. T. Mak; Kai-Pui Lam; H. S. Ng; Guy Rachmuth; Chi-Sang Poon

Dynamic programming (DP) is a fundamental algorithm for complex optimization and decision-making in many engineering and biomedical systems. However, conventional DP computation based on digital implementation of the Bellman-Ford recursive algorithm suffers from the “curse of dimensionality” and substantial iteration delays, which hinder its utility in real-time applications. Previously, an ordinary differential equation system was proposed that transforms the sequential DP iteration into a continuous-time parallel computational network. Here, the network is realized using a CMOS current-mode analog circuit, which provides a powerful computational platform for a power-efficient, compact, and high-speed solution of the Bellman formula. Test results for the fabricated DP optimization chip demonstrate a proof of concept for this solution approach. We also propose an error compensation scheme to minimize the errors attributed to nonideal current sources and device mismatch.

IEEE Circuits and Systems Magazine | 2011

Dynamic Programming Networks for Large-Scale 3D Chip Integration

Terrence S. T. Mak; Ra'ed Al-Dujaily; Kuan Zhou; Kai-Pui Lam; Yicong Meng; Alex Yakovlev; Chi-Sang Poon

Recent technological advances in three-dimensional (3-D) on-chip systems integration provide a promising platform to realize multicore, multiprocessor, and networks-on-chip (NoC) based systems with augmented performance. With the additional tightly coupled physical layers, on-chip system complexity grows significantly, and the provision of efficient run-time management in large-scale systems becomes critical. In this article, we review the design of an emerging on-chip dynamic-programming (DP) network, whose capabilities have been demonstrated in a range of applications including optimal path planning, dynamic routing, and deadlock detection. A DP-network design, implemented in a fully stacked 3-layer 3-D architecture using through-silicon-via (TSV) CMOS technology, is also presented. The vertical inter-layer communication is achieved by means of TSVs, and the mesh interconnection naturally keeps the area overhead associated with this communication minimal. Test results demonstrate the effectiveness of this approach for deadlock detection and the minuscule computational delay for detecting deadlock in a large-scale network.

The Computer Journal | 2013

Dynamic On-Chip Thermal Optimization for Three-Dimensional Networks-On-Chip

Ra'ed Al-Dujaily; Terrence S. T. Mak; Kai-Pui Lam; Fei Xia; Alex Yakovlev; Chi-Sang Poon

Complex thermal behaviour hinders the advancement of three-dimensional (3D) very-large-scale integration systems. In particular, high-density through-silicon-via-based integration could lead to ultra-high-temperature hotspots and permanent damage to silicon devices. In this paper, we introduce an adaptive strategy to effectively diffuse heat throughout the 3D geometry. This strategy employs a dynamic programming network to select and optimize the direction of data movement in a network-on-chip (NoC). We also developed a tool, based on the accurate HotSpot thermal model and a SystemC cycle-accurate model, to simulate the thermal system and evaluate our approach. We found that the proposed approach can significantly diffuse the hotspots in a 3D geometry and reduce the overall temperature. Given the same thermal constraints, the throughput performance of an adaptive NoC can also be improved. This work opens a new avenue for exploring on-chip adaptability for future large-scale 3D integration.

international symposium on circuits and systems | 1996

Current-mode optimization circuits for minimax path problems

H.S. Ng; Kai-Pui Lam

A current-mode approach to analog VLSI design is presented for implementing a connectionist network, a binary relation inference network, to solve minimax path problems (MPP). Previous work focused on using current-mode maximum and minimum circuits as separate entities for fuzzy system applications. This paper proposes a network architecture and solution to the MPP by using the current-mode maximum and minimum circuits as basic building blocks in a uniform feedback arrangement. Conceptually, the network is able to obtain the global optimal solution in a time independent of the problem size. Practically, the current-mode connectionist network has been shown to give comparable performance to the conventional voltage-mode circuits, with a significant reduction in circuit complexity, component count, and power consumption.
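To make the minimax path problem concrete, here is a minimal software sketch that combines minimum and maximum operations, the same two primitives the abstract names as circuit building blocks. It uses a Floyd-Warshall-style sweep rather than the authors' analog feedback network; the function name `minimax_path_costs` and the example weights are hypothetical.

```python
import math

def minimax_path_costs(weights):
    """For every node pair, find the smallest achievable 'largest edge weight' on a path.

    `weights[i][j]` is the edge weight from i to j, or math.inf if there is no edge.
    """
    n = len(weights)
    best = [list(row) for row in weights]
    for i in range(n):
        best[i][i] = 0
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # The bottleneck of a path through k is the max of its two halves.
                via_k = max(best[i][k], best[k][j])
                if via_k < best[i][j]:
                    best[i][j] = via_k
    return best

INF = math.inf
# Hypothetical undirected 4-node graph given as a symmetric weight matrix.
weights = [
    [0,   7, 2, INF],
    [7,   0, 3, 4],
    [2,   3, 0, 9],
    [INF, 4, 9, 0],
]
result = minimax_path_costs(weights)
print(result[0][3])  # 4, via the path 0 -> 2 -> 1 -> 3 with edge weights 2, 3, 4
```

The software version takes a number of steps that grows with the number of nodes, whereas the abstract describes a network that conceptually settles in a time independent of problem size.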

2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip | 2011

Comparative ODE benchmarking of unidirectional and bidirectional DP networks for 3D-IC

Kai-Pui Lam; Terrence S. T. Mak; Chi-Sang Poon

Great technological strides have been made in the design, analysis, and fabrication of 3D-ICs, with predictions that they will eventually lead to significant advances in multicore, multiprocessor, and network-on-chip (NoC) systems. A dynamic programming (DP) network is well suited for the grid stack architecture because of its capability to achieve global optimality using only local computational units with short inter-grid communication links. In this paper we extend the transitive closure and shortest path unidirectional networks to bidirectional networks, and develop an effective simulation tool for this type of DP network. In addition to helping to construct real DP networks on 3D-ICs, the ODE (ordinary differential equation) simulation methodology for solving an average shortest path length problem provides new insights for comparative benchmarking of very large-scale 2D/3D networks under different design considerations.

2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip | 2011

Cycle avoidance in 2D/3D bidirectional graphs using shortest-path dynamic programming network

Kai-Pui Lam; Terrence S. T. Mak; Chi-Sang Poon

An ordinary-differential-equation (ODE) simulation model has recently been proposed for an N-node dynamic programming (DP) network, which solves the transitive closure and shortest path problems on an architecturally equivalent N-node 2D/3D grid stack. For large-scale randomly generated bidirectional networks, where N is large and the inter-grid paths may take either direction, cycles commonly occur, leading to a high percentage of nodes with unbounded path lengths. Such cycle nodes can be readily detected using a shortest-path DP network. In this work we address several issues in the cycle avoidance problem, by first defining the 〈H〉-index and 〈V〉-index and hence their product 〈HV〉 as the two-dimensional turn ability. A regression model is then proposed and obtained empirically to relate the cycle-node ratio, which is the percentage of cycle nodes over N, to 〈HV〉 for several random networks of sizes N = 10×10×2, 10×10×5, and 10×10×8. By reducing 〈HV〉 from 0.8 to 0.2, the cycle-node ratio can be reduced from close to 60% to 20%, indicating a significant avoidance of cycle nodes.

IEEE Transactions on Nanobioscience | 2005

Equivalence-set genes partitioning using an evolutionary-DP approach

Terrence S. T. Mak; Kai-Pui Lam

Computation of transitive-closure equivalence sets has recently emerged as an important step for building static and dynamic models of gene networks from DNA sequences. We present an evolutionary-DP approach in which dynamic programming (DP) is embedded into a genetic algorithm (GA) for fitness function evaluation of small equivalence sets (with m genes) within a large-scale genetic network of n genes, where n ≫ m. This approach reduces a computation-intensive optimization problem of high dimension to a heuristic search problem over ₙCₘ candidates. The DP computation of transitive closure forms the basic fitness evaluation for selecting candidate chromosomes generated by GA operators. By introducing bounded mutation and conditioned crossover operators to constrain the feasible solution domain, small transitive-closure equivalence sets for large genetic networks can be found with much reduced computational effort. Empirical results have demonstrated the feasibility of our GA-DP approach for offering highly efficient solutions to the large-scale equivalence gene-set partitioning problem. We also describe dedicated GA-DP hardware using field-programmable gate arrays (FPGAs), with which significant speedup could be obtained over a software implementation.
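As background for the transitive-closure computation this abstract relies on, the sketch below shows Warshall's algorithm on a small directed graph, together with a count of the m-gene candidate subsets in an n-gene network. It does not reproduce the authors' GA-DP fitness function or operators; the function name `transitive_closure`, the example adjacency matrix, and the values n = 20, m = 3 are hypothetical.

```python
from math import comb

def transitive_closure(adjacency):
    """Warshall's algorithm: reach[i][j] is True if j is reachable from i via one or more edges."""
    n = len(adjacency)
    reach = [[bool(x) for x in row] for row in adjacency]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach

# Hypothetical 4-gene interaction graph: gene 0 -> gene 1 -> gene 2, gene 3 isolated.
adjacency = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
closure = transitive_closure(adjacency)
print(closure[0][2])  # True: gene 2 is reachable from gene 0 through gene 1

# Number of distinct m-gene subsets of an n-gene network (illustrative n and m).
candidate_count = comb(20, 3)
print(candidate_count)  # 1140
```

The subset count grows quickly with n, which is consistent with the abstract's motivation for a heuristic search instead of exhaustive evaluation.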

IEEE | 2007