Publication


Featured research published by Weng-Geng Ho.


International Conference on Electron Devices and Solid-State Circuits | 2015

Counteracting differential power analysis: Hiding encrypted data from circuit cells

Kwen-Siong Chong; Kyaw Zwa Lwin Ne; Weng-Geng Ho; Nan Liu; Ali H. Akbar; Bah-Hwee Gwee; Joseph Sylvester Chang

We propose a balanced Pre-Charge Static Logic (PCSL) circuit style for asynchronous systems, and compare it against other reported circuit styles designed to counteract differential power analysis (DPA). Our study shows that all these circuit styles (including our balanced PCSL) dissipate different amounts of energy owing to data dependency, and hence balancing the energy of circuits embodying these circuit styles remains challenging. However, in view of its low circuit overheads and asynchronous operation (with noise generation), our balanced PCSL is still competitive in terms of DPA-resistance, requiring 3.5x fewer power traces than its NULL convention logic counterpart.


International Symposium on Circuits and Systems | 2013

Low power sub-threshold asynchronous QDI Static Logic Transistor-level Implementation (SLTI) 32-bit ALU

Weng-Geng Ho; Kwen-Siong Chong; Bah-Hwee Gwee; Joseph Sylvester Chang

We propose an asynchronous-logic (async) Quasi-Delay-Insensitive (QDI) Static Logic Transistor-level Implementation (SLTI) approach for low power sub-threshold operation. The approach is used to design 32-bit pipelined Arithmetic and Logic Units (ALUs), the primary computation core of microprocessors, and is benchmarked against the reported Pre-Charged Half-Buffer (PCHB) approach. There are two key attributes in the proposed design. First, the proposed SLTI ALU can perform dynamic voltage scaling seamlessly, simply by changing the supply voltage from the nominal (1V) to the sub-threshold (~0.2V) region for high speed or low power operation. Second, the ALU achieves ultra-low power dissipation (3.5μW) at the lowest VDD point (~0.15V). For a fair comparison, both implemented ALUs have identical functionality and functional blocks, and are implemented using the same 65nm CMOS process. Based on the simulations, the minimum energy point occurs at VDD = 0.2V for the SLTI-based ALU and at VDD = 0.3V for the PCHB-based ALU. The SLTI-based ALU has ~93% and ~89% lower energy on the arithmetic and logic operations, respectively, when VDD is scaled from 1V to 0.2V. At VDD = 0.2V, with a 9MHz input switching rate, the async ALU based on our proposed SLTI approach dissipates ~51% and ~44% lower power than the reported PCHB counterpart on the arithmetic and logic operations, respectively.
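As a rough first-order illustration of why scaling VDD from 1V to 0.2V saves so much energy, the sketch below applies the textbook relation that dynamic switching energy is proportional to C·VDD². The function name is hypothetical, and the sketch assumes capacitance and switching activity stay constant; it ignores leakage and short-circuit energy, so it is not expected to reproduce the simulated ~93% and ~89% figures reported above.

```python
def dynamic_energy_ratio(vdd_low: float, vdd_nominal: float) -> float:
    """Ratio of switching energy at vdd_low to that at vdd_nominal.

    Assumes E is proportional to C * VDD**2 with C and activity unchanged
    (a first-order model only; leakage is ignored).
    """
    if vdd_low <= 0 or vdd_nominal <= 0:
        raise ValueError("supply voltages must be positive")
    return (vdd_low / vdd_nominal) ** 2


# Voltages taken from the abstract: nominal 1 V, sub-threshold 0.2 V.
ratio = dynamic_energy_ratio(0.2, 1.0)
saving_percent = (1.0 - ratio) * 100.0
print(f"first-order switching-energy saving: {saving_percent:.0f}%")
```

The quadratic term alone accounts for most of the saving; the gap between this estimate and the simulated results is plausibly due to effects such as leakage, which grows in relative importance at sub-threshold voltages.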


Asia Pacific Conference on Circuits and Systems | 2016

Interceptive side channel attack on AES-128 wireless communications for IoT applications

Ali Akbar Pammu; Kwen-Siong Chong; Weng-Geng Ho; Bah-Hwee Gwee

We propose a wireless interceptive Side-Channel Attack (SCA) technique to reveal the 16-byte secret key of the AES-128 encryption algorithm in wireless communications, through Correlation Electromagnetic Analysis (CEMA), for Internet of Things (IoT) applications. The encrypted wireless communication link is established using two ATmega-processor based Arduino boards. There are two key features in our proposed interceptive SCA technique. First, we identify the sensitive modules of the ATmega processor that emit significant EM signals (physical leakage information) during the encryption process. These significant EM signals are highly correlated with the processed data and can therefore reveal the secret key. Second, we investigate the resistance of the AES-128 implementation on the ATmega processor against CEMA-based SCA. The wireless signal is intercepted and correlated with the EM signals generated during the encryption process. Based on our experimental results, the correlated EM signals leaked during the encryption process by three modules (FLASH memory, data bus and SRAM) are 101.56 dBμV, 105.34 dBμV and 121.79 dBμV, respectively. In addition, we perform CEMA attacks on the AES-128 implementation on the ATmega processor, and the secret key is successfully revealed with 20,000 EM traces.
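The correlation step at the heart of a CEMA (or CPA) attack can be sketched on simulated data. The example below is only an illustration, not the authors' measurement setup: it assumes a Hamming-weight leakage model on the first-round S-box output, Gaussian noise, and one sample per trace, and every function name is hypothetical. The S-box is generated from its standard definition (GF(2^8) inverse followed by the affine transform).

```python
import random


def rotl8(x, n):
    """Rotate an 8-bit value left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def build_aes_sbox():
    """Generate the AES S-box from its algebraic definition."""
    sbox = [0] * 256
    p = q = 1
    while True:
        # Multiply p by 3 in GF(2^8); q tracks the inverse of p.
        p = (p ^ (p << 1) ^ (0x1B if p & 0x80 else 0)) & 0xFF
        q = (q ^ (q << 1)) & 0xFF
        q = (q ^ (q << 2)) & 0xFF
        q = (q ^ (q << 4)) & 0xFF
        if q & 0x80:
            q ^= 0x09
        # Affine transformation of the inverse.
        x = q ^ rotl8(q, 1) ^ rotl8(q, 2) ^ rotl8(q, 3) ^ rotl8(q, 4)
        sbox[p] = x ^ 0x63
        if p == 1:
            break
    sbox[0] = 0x63
    return sbox


SBOX = build_aes_sbox()
HW = [bin(v).count("1") for v in range(256)]  # Hamming-weight table


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def simulate_traces(key_byte, count, noise_sigma, seed):
    """Simulated leakage: HW of the S-box output plus Gaussian noise."""
    rng = random.Random(seed)
    plaintexts = [rng.randrange(256) for _ in range(count)]
    samples = [HW[SBOX[p ^ key_byte]] + rng.gauss(0.0, noise_sigma)
               for p in plaintexts]
    return plaintexts, samples


def recover_key_byte(plaintexts, samples):
    """Return the key guess whose leakage model correlates best."""
    best_guess, best_corr = 0, -2.0
    for guess in range(256):
        model = [HW[SBOX[p ^ guess]] for p in plaintexts]
        corr = pearson(model, samples)
        if corr > best_corr:
            best_guess, best_corr = guess, corr
    return best_guess


plaintexts, samples = simulate_traces(0x2B, count=500, noise_sigma=1.0, seed=1)
recovered = recover_key_byte(plaintexts, samples)
print(hex(recovered))
```

Real EM traces are far noisier and contain many samples per trace, which is one reason a practical attack such as the one described above can need thousands of traces rather than a few hundred.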


International Symposium on Circuits and Systems | 2015

High robustness energy- and area-efficient dynamic-voltage-scaling 4-phase 4-rail asynchronous-logic Network-on-Chip (ANoC)

Weng-Geng Ho; Kwen-Siong Chong; Ne Kyaw Zwa Lwin; Bah-Hwee Gwee; Joseph Sylvester Chang

We propose an 18-bit 5-interface asynchronous-logic Network-on-Chip (ANoC) router based on the quasi-delay-insensitive (QDI) realization approach for highly secure cryptography applications. There are four key features of the proposed ANoC router. First, it embodies the novel high-speed low-power Sense-Amplifier Half Buffer 4-rail cells. Second, it is designed based on the QDI protocol, and hence is highly robust against process-voltage-temperature (PVT) variations. Third, it is functional over full dynamic voltage scaling from the nominal (VDD=1.2V) to the sub-threshold (VDD=0.3V) region, and is potentially excellent for low power management applications. Fourth, it embodies a distributed XY routing algorithm that uses a 4-bit header in the flow control unit (flit) to route within a cluster of up to 4×4, hence minimizing the routing overhead. We realize the proposed ANoC router (@65nm CMOS), and benchmark it against the reported ANoC router embodying the conventional Weak-Conditioned Half-Buffer (WCHB) QDI realization approach. Both our proposed design and the reported design feature high operational robustness, but our design is 41% more energy-efficient and 21% more area-efficient than the reported counterpart. The prototype ANoC router occupies only 0.105 mm² and can operate down to 0.3V. At VDD=0.3V, it dissipates 44 fJ per bit and operates at 105 ns per flit.
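A minimal sketch of XY routing with a 4-bit header on a 4×4 mesh may help make the fourth feature concrete. The header layout (upper two bits for the destination X, lower two bits for the destination Y) and the five port names are assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical port names for a 5-interface router.
STEP = {"EAST": (1, 0), "WEST": (-1, 0), "NORTH": (0, 1), "SOUTH": (0, -1)}


def decode_header(header):
    """Split a 4-bit header into (dest_x, dest_y); bit layout is assumed."""
    if not 0 <= header <= 0xF:
        raise ValueError("header must fit in 4 bits")
    return (header >> 2) & 0x3, header & 0x3


def xy_route(current, header):
    """Pick the output port at one router: resolve X first, then Y."""
    cx, cy = current
    dx, dy = decode_header(header)
    if dx > cx:
        return "EAST"
    if dx < cx:
        return "WEST"
    if dy > cy:
        return "NORTH"
    if dy < cy:
        return "SOUTH"
    return "LOCAL"  # flit has arrived; deliver to the local interface


def trace_path(source, header):
    """List the port chosen at every router from source to destination."""
    x, y = source
    hops = []
    while True:
        port = xy_route((x, y), header)
        hops.append(port)
        if port == "LOCAL":
            return hops
        step_x, step_y = STEP[port]
        x, y = x + step_x, y + step_y


# Route from router (0, 0) to router (2, 1): header 0b1001.
print(trace_path((0, 0), 0b1001))
```

Because each router decides locally from the header and its own coordinates, no routing table is needed, which is consistent with the low routing overhead the abstract describes.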


IET Circuits, Devices & Systems | 2015

Low power sub-threshold asynchronous quasi-delay-insensitive 32-bit arithmetic and logic unit based on autonomous signal-validity half-buffer

Weng-Geng Ho; Kwen-Siong Chong; Bah-Hwee Gwee; Joseph Sylvester Chang

The authors propose an asynchronous-logic (async) quasi-delay-insensitive (QDI) autonomous signal-validity half-buffer (ASVHB) realisation approach for low power sub-threshold operation (VDD = 0.2 V). There are three key attributes in the proposed ASVHB realisation approach. First, the ASVHB realisation approach embodies integrated autonomous validity signals, which are unique and are used exclusively to simplify the circuit implementation for the QDI protocol. Second, the ASVHB realisation approach applies the fine-grained gate-level method, which propagates data through a single-cell datapath pipeline to maximise the throughput rate. Third, the ASVHB realisation approach adopts the static-logic implementation, which maintains stable output states (by connecting them directly to the power rails), to feature high robustness for sub-threshold operation. They compare their ASVHB realisation approach against the competitive reported weak-conditioned half-buffer (WCHB) and pre-charged half-buffer (PCHB) realisation approaches. The WCHB and PCHB library cells, on average, require ∼2.1× and ∼1.9× more transistors than the ASVHB library cells. With respect to a 3-stage pipeline realisation, the WCHB and PCHB pipelines, on average, require 1.8× and 1.5× more transitions per cycle than the ASVHB pipeline. They design an async 32-bit arithmetic and logic unit (ALU) based on the proposed ASVHB realisation approach (in a 65 nm CMOS process). Their ASVHB ALU occupies 0.092 mm² and, in many respects, outperforms the WCHB and PCHB counterparts. The WCHB and PCHB counterparts require ∼1.7× and ∼1.4× more transistors, respectively, than their design. At the sub-threshold voltage of VDD = 0.2 V, the WCHB and PCHB counterparts dissipate ∼1.7× and ∼2.6× more energy, respectively, and have ∼0.95× and ∼0.73× slower throughput, respectively.


Asia Pacific Conference on Circuits and Systems | 2014

Low delay-variation sub-/near-threshold asynchronous-to-synchronous interface controller for GALS Network-on-Chips

Weng-Geng Ho; Kwen-Siong Chong; Bah-Hwee Gwee; Joseph Sylvester Chang

We propose an Asynchronous-to-Synchronous Interface Controller (A2S-IC) with low delay-variation under Process, Voltage and Temperature (PVT) variations for sub-threshold/near-threshold operation in low power applications. This A2S-IC is targeted at a full-range Dynamic Voltage Scaling (DVS) Global-Asynchronous-Local-Synchronous (GALS) Network-on-Chip (NoC). There are three key attributes in the proposed A2S-IC. First, it is realized using static logic (rather than dynamic logic), and hence is more appropriate for DVS (and sub-threshold operation). Second, it is implemented using gate-level standard cells to simplify the implementation effort. Third, it is designed to share some internal nodes, hence reducing the redundant switching for data validity checking. The proposed A2S-IC is compared against its reported dynamic-logic counterpart; both are implemented in the same 65nm CMOS process. Based on simulations conducted at 27°C, our proposed A2S-IC is more throughput-efficient in near- and sub-threshold operation, featuring ~19% and ~66% faster throughput at VDD = 0.5V and VDD = 0.3V, respectively. When the temperature variation (0°C to 100°C) is considered in sub-threshold operation, the proposed A2S-IC demonstrates 140% faster throughput than the reported design; the former features only up to 1.6x delay-variation, whereas the latter exhibits up to 4x delay-variation. The proposed A2S-IC is able to operate at a voltage as low as 0.15V (as opposed to 0.3V for the reported design).


Asia Pacific Conference on Circuits and Systems | 2016

Success rate model for fully AES-128 in correlation power analysis

Ali Akbar Pammu; Kwen-Siong Chong; Ne Kyaw Zwa Lwin; Weng-Geng Ho; Nan Liu; Bah-Hwee Gwee

We propose a Success Rate (SR) estimation model for Correlation Power Analysis (CPA) attacks on AES-128 encrypted devices. The SR is the ratio of the number of successful attacks (those that obtain the secret key) to the total number of attacks. There are two key features in the proposed model. First, we derive the Second Order Standard Deviation (SOSD) of the processed data to analyze their switching activities during encryption processes, in order to identify the Least Difficult Sub-Key (LDSK, the sub-key that is easiest to reveal) and the Most Difficult Sub-Key (MDSK, the sub-key that is hardest to reveal). Second, we apply the Error Function Model (EFM) to the LDSK and MDSK to estimate the SR with respect to the number of power traces required to reveal the secret key. Our proposed SR estimation model is evaluated on a Sakura-X encryption board, and the evaluation shows that our proposed SOSD requires only 1,000 processed data to determine the LDSK and MDSK. Based on the EFM of the LDSK and MDSK, an SR of 10% to 94% requires 1,220 to 3,550 power traces, respectively, to reveal all 16 sub-keys. We demonstrate the accuracy of our proposed SR estimation model by benchmarking it against two reported techniques in evaluating 1 byte of the key, and show that the accuracy of our technique is 96%, whereas the other reported techniques achieve only 21% and 49%.
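The SR definition above is a simple ratio, and an error-function-shaped curve can be pinned to the two operating points quoted in the abstract. The sketch below is illustrative only: the helper names are hypothetical, and the curve is a plain normal CDF (which is an error function in disguise) passed through the two quoted points. It is not the authors' EFM, which is derived from the LDSK and MDSK.

```python
from statistics import NormalDist


def success_rate(outcomes):
    """SR = successful attacks / total attacks."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("at least one attack outcome is required")
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def erf_curve_through(n1, sr1, n2, sr2):
    """Normal-CDF curve SR(n) passing through (n1, sr1) and (n2, sr2).

    Hypothetical interpolation for illustration; not the paper's model.
    """
    standard = NormalDist()
    z1, z2 = standard.inv_cdf(sr1), standard.inv_cdf(sr2)
    sigma = (n2 - n1) / (z2 - z1)
    mu = n1 - z1 * sigma
    return NormalDist(mu, sigma)


# Three successful attacks out of four.
print(success_rate([True, False, True, True]))

# Curve pinned to the two points quoted in the abstract.
curve = erf_curve_through(1220, 0.10, 3550, 0.94)
print(round(curve.cdf(1220), 2), round(curve.cdf(3550), 2))
```

Values of this curve between the two quoted points are an interpolation under the stated assumption and should not be read as results from the paper.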


International Symposium on Circuits and Systems | 2012

A comparative study on asynchronous Quasi-Delay-Insensitive templates

Kok-Leong Chang; Tong Lin; Weng-Geng Ho; Kwen-Siong Chong; Bah-Hwee Gwee; Joseph Sylvester Chang

The robustness of asynchronous logic has proved useful in dealing with contemporary problems in CMOS design such as process variations and power management. However, the generally cryptic nature of asynchronous logic has stymied the widespread acceptance of this alternative design technique. Fortunately, the semi-custom approach to asynchronous design reduces the tedious handcrafting effort that is often non-trivial in large system-on-chips (SoCs). However, even the adoption of this design approach requires careful selection of asynchronous templates that suit overall system needs. Therefore, in this paper, the most eminent Quasi-Delay-Insensitive asynchronous template families reported to date are presented, followed by an in-depth comparison of various design figures of merit (FOMs): template area, static/dynamic capacity, cycle time, latency, throughput and Et². The most aggressive template (EESTFB) can reach a maximum throughput of 3.56 Giga items/s on 0.13µm @ 1.2V.


International Symposium on Circuits and Systems | 2011

Improved asynchronous-logic dual-rail Sense Amplifier-based Pass Transistor Logic with high speed and low power operation

Weng-Geng Ho; Kwen-Siong Chong; Bah-Hwee Gwee; Joseph Sylvester Chang; Yin Sun; Kok-Leong Chang

We propose a robust asynchronous-logic dual-rail Sense Amplifier-based Pass Transistor Logic (SAPTL) approach with improved speed and power attributes over the reported SAPTL approach. These attributes are achieved by simplifying various sub-blocks therein to reduce the stacking of pass transistors and the number of transistor switchings, and to avoid floating nodes. By means of an 8-bit pipeline adder and on the basis of computation simulations (@ 1V, 45nm SOI process), we show that our proposed SAPTL adder is 37% faster, yet features 14% lower power dissipation (@ 200MHz input-rate), 18% lower energy dissipation (per operation), and a 47% better energy-delay product. These substantially improved attributes are achieved with insignificant overhead: just 3% more transistors.


IEEE Transactions on Very Large Scale Integration Systems | 2018

Asynchronous-Logic QDI Quad-Rail Sense-Amplifier Half-Buffer Approach for NoC Router Design

Weng-Geng Ho; Kwen-Siong Chong; Kyaw Zwa Lwin Ne; Bah-Hwee Gwee; Joseph Sylvester Chang

We propose a low area overhead and power-efficient asynchronous-logic quasi-delay-insensitive (QDI) sense-amplifier half-buffer (SAHB) approach with quad-rail (i.e., 1-of-4) data encoding. The proposed quad-rail SAHB approach is targeted at area- and energy-efficient asynchronous network-on-chip (ANoC) router designs. There are three main features in the proposed quad-rail SAHB approach. First, the quad-rail SAHB is designed to use four wires for selecting four ANoC router directions, hence reducing the number of transistors and area overhead. Second, the quad-rail SAHB switches only one out of four wires for 2-bit data propagation, hence reducing the number of transistor switchings and dynamic power dissipation. Third, the quad-rail SAHB abides by QDI rules, hence the designed ANoC router features high operational robustness toward process-voltage-temperature (PVT) variations. Based on the 65-nm CMOS process, we use the proposed quad-rail SAHB to implement and prototype an 18-bit ANoC router design. When benchmarked against the dual-rail counterpart, the proposed quad-rail SAHB ANoC router features 32% smaller area and dissipates 50% lower energy under the same excellent operational robustness toward PVT variations. When compared to the other reported ANoC routers, our proposed quad-rail SAHB ANoC router is among the most operationally robust, smallest-area, and most energy-efficient designs.
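The difference between quad-rail (1-of-4) and dual-rail encoding of a 2-bit value can be sketched in a few lines. This is a behavioural illustration only, with hypothetical function names and an assumed wire ordering; it says nothing about the SAHB circuit itself.

```python
def encode_1_of_4(value):
    """Encode a 2-bit value on four wires, exactly one of which is asserted."""
    if not 0 <= value <= 3:
        raise ValueError("value must fit in 2 bits")
    return tuple(1 if wire == value else 0 for wire in range(4))


def decode_1_of_4(wires):
    """Recover the 2-bit value; reject anything that is not one-hot."""
    if len(wires) != 4 or sorted(wires) != [0, 0, 0, 1]:
        raise ValueError("expected exactly one asserted wire out of four")
    return wires.index(1)


def encode_dual_rail(value):
    """Encode a 2-bit value as two (false_rail, true_rail) wire pairs."""
    if not 0 <= value <= 3:
        raise ValueError("value must fit in 2 bits")
    wires = []
    for bit_position in range(2):
        bit = (value >> bit_position) & 1
        wires.extend((1 - bit, bit))
    return tuple(wires)


for value in range(4):
    quad = encode_1_of_4(value)
    dual = encode_dual_rail(value)
    # Both codes use four wires, but they assert a different number of them.
    print(value, quad, sum(quad), dual, sum(dual))
```

Both encodings use four wires for two bits, but the 1-of-4 code asserts one wire per 2-bit symbol where dual-rail asserts two, so in a return-to-zero handshake the data wires make half as many transitions.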

Collaboration


Dive into Weng-Geng Ho's collaborations.

Top Co-Authors

Bah-Hwee Gwee, Nanyang Technological University
Kwen-Siong Chong, Nanyang Technological University
Joseph Sylvester Chang, Nanyang Technological University
Kyaw Zwa Lwin Ne, Nanyang Technological University
Nan Liu, Nanyang Technological University
Tong Lin, Nanyang Technological University
Ali Akbar Pammu, Nanyang Technological University
Kok-Leong Chang, Nanyang Technological University
Ne Kyaw Zwa Lwin, Nanyang Technological University
Ali H. Akbar, Nanyang Technological University