Yuan Wen Hau
Universiti Teknologi Malaysia
Publication
Featured research published by Yuan Wen Hau.
international symposium on circuits and systems | 2013
Yin Zhen Tei; Muhammad Nadzir Marsono; Nasir Shaikh-Husin; Yuan Wen Hau
Network-on-chip (NoC) has been introduced as a promising on-chip communication architecture to support many IP (intellectual property) cores on a single chip. Application mapping of IP cores onto a NoC topology is an NP-hard problem. The increasing number of IP cores makes it more challenging to obtain an optimum core-to-topology mapping. This paper proposes a genetic algorithm (GA) approach that incorporates network partitioning and heuristic crossover techniques to improve NoC application mapping. Our experiment on VOPD (video object plane decoder) shows that the proposed method results in only a 0.2% to 0.8% communication cost difference compared to the global optimal mapping, and a 6% better communication cost than a technique using a conventional GA.
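To make the mapping objective concrete, the following is a minimal sketch of a communication cost function, assuming a 2D mesh with row-major tile numbering and a cost defined as bandwidth times hop count. These assumptions, the function names, and the small traffic table are illustrative and not taken from the paper. The exhaustive search only shows what a "global optimal mapping" means for a tiny instance; it is not the proposed GA.

```python
from itertools import permutations

def mesh_hops(tile_a, tile_b, width):
    """Manhattan distance between two tiles of a row-major 2D mesh."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)

def communication_cost(mapping, traffic, width):
    """Sum of bandwidth * hop count over all communicating core pairs.

    mapping: dict core -> tile index
    traffic: dict (src_core, dst_core) -> bandwidth (hypothetical units)
    """
    return sum(bw * mesh_hops(mapping[src], mapping[dst], width)
               for (src, dst), bw in traffic.items())

def exhaustive_best(cores, traffic, width, n_tiles):
    """Try every placement; feasible only for very small core counts."""
    best_cost, best_mapping = None, None
    for tiles in permutations(range(n_tiles), len(cores)):
        mapping = dict(zip(cores, tiles))
        cost = communication_cost(mapping, traffic, width)
        if best_cost is None or cost < best_cost:
            best_cost, best_mapping = cost, mapping
    return best_cost, best_mapping

# Hypothetical 4-core application on a 2x2 mesh.
traffic = {("A", "B"): 10, ("B", "C"): 5, ("C", "D"): 10, ("A", "D"): 1}
good = {"A": 0, "B": 1, "C": 3, "D": 2}
bad = {"A": 0, "B": 3, "C": 1, "D": 2}
```

The number of placements grows factorially with the number of cores, which is why heuristic searches such as a GA are used for realistic core counts.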
international symposium on circuits and systems | 2013
Sieng Wong; Chia Yee Ooi; Yuan Wen Hau; Muhammad Nadzir Marsono; Nasir Shaikh-Husin
This paper presents a feasible transition path (FTP) generation approach for testing extended finite state machines (EFSM). The major problem faced by EFSM-based testing is the existence of infeasible paths, caused by conflicts between the context variables and the enabling conditions in the transition path. To avoid generating infeasible paths, this paper proposes an approach that uses a modified breadth-first search with a conflict checker to generate a set of minimum FTPs for each transition. An EFSM executable model is developed for algorithm modeling and verification as well as performance evaluation. Experimental results on two EFSM models show that the proposed approach is able to generate feasible transition paths with at least 18% path length reduction.
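The idea of searching only along feasible paths can be sketched as a plain breadth-first search over (state, context variables) configurations, where a transition is expanded only if its guard holds for the current variable values. This is an illustration under assumed data structures, not the paper's modified algorithm; the transition table and the names `guard` and `update` are hypothetical.

```python
from collections import deque

def shortest_feasible_path(transitions, start_state, start_vars, target,
                           max_depth=20):
    """Return the shortest list of transition names ending in `target`.

    transitions: dict name -> (src, dst, guard, update), where
      guard(vars) -> bool and update(vars) -> new dict of context variables.
    Returns None if no feasible path is found within max_depth.
    """
    start = (start_state, tuple(sorted(start_vars.items())))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (state, frozen_vars), path = queue.popleft()
        if len(path) >= max_depth:
            continue
        ctx = dict(frozen_vars)
        for name, (src, dst, guard, update) in transitions.items():
            # Conflict check: skip transitions whose guard is violated.
            if src != state or not guard(ctx):
                continue
            new_path = path + [name]
            if name == target:
                return new_path
            nxt = (dst, tuple(sorted(update(ctx).items())))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, new_path))
    return None

# Hypothetical EFSM with one context variable x.
efsm = {
    "t1": ("S0", "S0", lambda v: v["x"] < 2, lambda v: {**v, "x": v["x"] + 1}),
    "t2": ("S0", "S1", lambda v: v["x"] == 2, lambda v: dict(v)),
    "t3": ("S1", "S0", lambda v: True, lambda v: {**v, "x": 0}),
    "t4": ("S0", "S1", lambda v: v["x"] > 5, lambda v: dict(v)),
}
```

In this toy model, `t2` is reachable only after `t1` has fired twice, and `t4` can never fire because `x` never exceeds 2, so the search reports it as having no feasible path.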
ifip ieee international conference on very large scale integration | 2015
Jia Wei Tang; Yuan Wen Hau; Muhammad Nadzir Marsono
HW/SW partitioning is an important development step during HW/SW co-design to ensure application performance in embedded System-on-Chip (SoC). This paper formulates the optimization of HW/SW partitioning with the aim of maximizing streaming throughput under a predefined area constraint, targeting multi-processor systems with hardware accelerator sharing capability. Two greedy heuristic algorithms for HW/SW partitioning, the first software-oriented and the second hardware-oriented, are proposed and tested on several random graphs and one multimedia application (MP3 decoder). Results show that the best result from the two proposed greedy algorithms is a 93.6% near-optimal solution compared to the brute-force ground truth, with faster HW/SW partitioning time.
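A greedy partitioner under an area constraint can be sketched as follows. This is a simplified illustration, assuming a software-oriented start (everything in software) and total execution time as a stand-in for the streaming throughput objective; the task table, the gain-per-area ranking, and all names are hypothetical rather than the paper's algorithms.

```python
def greedy_partition(tasks, area_budget):
    """Move tasks from software to hardware by best time saved per unit area.

    tasks: dict name -> (sw_time, hw_time, hw_area)
    Returns (set of tasks placed in hardware, area used, total time).
    """
    # Only tasks that actually run faster in hardware are candidates.
    candidates = [(name, sw - hw, area)
                  for name, (sw, hw, area) in tasks.items()
                  if sw > hw and area > 0]
    candidates.sort(key=lambda c: c[1] / c[2], reverse=True)

    in_hw, used = set(), 0
    for name, _gain, area in candidates:
        if used + area <= area_budget:
            in_hw.add(name)
            used += area

    total = sum(hw if name in in_hw else sw
                for name, (sw, hw, _area) in tasks.items())
    return in_hw, used, total

# Hypothetical task set: (software time, hardware time, hardware area).
tasks = {
    "a": (100, 20, 40),
    "b": (60, 30, 10),
    "c": (80, 40, 50),
    "d": (10, 12, 5),   # slower in hardware, so it stays in software
}
```

With an area budget of 60, this sketch selects `b` and then `a`, skips `c` because it no longer fits, and leaves `d` in software.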
soft computing | 2014
Yin Zhen Tei; Yuan Wen Hau; Nasir Shaikh-Husin; Muhammad Nadzir Marsono
This paper proposes a multiobjective application mapping technique targeted for large-scale network-on-chip (NoC). As the number of intellectual property (IP) cores in multiprocessor system-on-chip (MPSoC) increases, NoC application mapping to find the optimum core-to-topology mapping becomes more challenging. Besides, the conflicting cost and performance trade-off makes multiobjective application mapping techniques even more complex. This paper proposes an application mapping technique that incorporates domain knowledge into a genetic algorithm (GA). The initial population of the GA is initialized with network partitioning (NP), while the crossover operator is guided by knowledge of communication demands. NP reduces the large-scale application mapping complexity and provides the GA with a potential mapping search space. The proposed genetic operator is compared with state-of-the-art genetic operators in terms of solution quality. In this work, multiobjective optimization of energy and thermal balance is considered. Through simulation, knowledge-based initial mapping shows significant improvement in the Pareto front compared to the widely used random initial mapping. The proposed knowledge-based crossover also shows a better Pareto front compared to state-of-the-art knowledge-based crossover.
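For readers unfamiliar with the term, a Pareto front is the set of solutions that no other solution beats on every objective. A minimal sketch, assuming both objectives (for example energy and thermal imbalance) are to be minimized and using made-up candidate values:

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (energy, thermal imbalance) values of four candidate mappings.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1)]
front = pareto_front(candidates)
```

Here `(3, 4)` is dropped because `(2, 3)` is better in both objectives, while the remaining three points represent different trade-offs.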
international symposium on circuits and systems | 2014
Ling Kim Loo; Chia Yee Ooi; V. Y. Liew; Yuan Wen Hau; Muhammad Nadzir Marsono
The shrinking size of transistors and smaller interconnect elements contribute to a higher probability of on-chip faults. In order to sustain the functionality of a system in the presence of faults, fault tolerance becomes one of the key features in Network-on-Chip (NoC) design methodology. Existing end-to-end (E2E) error detection and correction (EDC) performs well at low error rates, whereas switch-to-switch (S2S) EDC performs better at high error rates. Nonetheless, a choice between the two techniques has to be made as the fault occurrence probability changes. This paper proposes an adaptive online fault detection scheme based on a packet logging mechanism. In the proposed mechanism, each router logs transmitted packets and NACK packets and monitors its fault level continuously. The router then determines whether to use E2E or S2S EDC based on the error probability. Based on experimental results, the proposed adaptive method, which switches between E2E and S2S according to the error probability, performs better than using only E2E or only S2S.
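The per-router decision described above can be sketched as a sliding-window monitor. This is a minimal sketch, assuming the error probability is estimated as the fraction of NACKed packets among recently logged packets; the window size, the threshold, and the class name are hypothetical and not taken from the paper.

```python
from collections import deque

class RouterFaultMonitor:
    """Logs recent packets and picks an EDC mode from the NACK rate."""

    def __init__(self, window=100, threshold=0.05):
        # Each entry is True if the logged packet was NACKed.
        self.log = deque(maxlen=window)
        self.threshold = threshold

    def record(self, nacked):
        self.log.append(bool(nacked))

    def error_rate(self):
        return sum(self.log) / len(self.log) if self.log else 0.0

    def mode(self):
        # S2S at high error rates, E2E otherwise.
        return "S2S" if self.error_rate() > self.threshold else "E2E"

monitor = RouterFaultMonitor(window=10, threshold=0.2)
for _ in range(10):
    monitor.record(False)
mode_clean = monitor.mode()
for _ in range(3):
    monitor.record(True)
mode_faulty = monitor.mode()
```

Because the window is bounded, old observations age out, so the router returns to E2E once the fault level drops again.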
international conference on microelectronics | 2008
Mohamed Khalil-Hani; Yuan Wen Hau
Today, designs of electronic systems are driven by application-specific embedded systems and system-on-chip (SoC). Designing these systems with a conventional RTL-centric approach involves extremely long simulation cycles and a painful verification process. The trend now is to describe these systems at a higher level of design abstraction. In this paper, we present a SystemC-based hardware/software (HW/SW) co-design and co-simulation environment that allows design space exploration to be conducted early at higher levels of abstraction. It aims to help a designer obtain an appropriate HW/SW partitioning that satisfies specified area-speed design tradeoffs. A case study of an SoC implementing an Elliptic Curve Cryptographic (ECC) scheme is presented to illustrate the design flow and the design refinements from the UML specification model to the final implementation model targeted for an Altera Nios FPGA-based hardware development system.
Journal of Systems Architecture | 2017
Jeevan Sirkunan; Chia Yee Ooi; Nasir Shaikh-Husin; Yuan Wen Hau; Muhammad Nadzir Marsono
Multiprocessor embedded systems integrate diverse dedicated processing units to handle high-performance applications such as multimedia and network processing. However, lock-based synchronization limits the efficiency of such heterogeneous concurrent systems. Hardware Transactional Memory (HTM) is a promising approach in creating an abstraction layer for multi-threaded programming. However, HTM performance is application-specific and determined by version and conflict management configurations. Most previous HTM implementations for embedded systems in the literature were built on fixed version management, which results in significant performance loss when transaction behaviour changes. In this paper, we propose an HTM targeted for embedded applications which is able to adapt its version management based on application behaviour at runtime. It is prototyped and analysed on an Altera Cyclone IV platform. Random requests at different contention levels and different transaction sizes are used to verify the performance of the proposed HTM. Based on our experiments, lazy version management is able to obtain up to 12.82% speed-up compared to eager version management at high contention. Meanwhile, eager version management obtains up to 37.84% speed-up compared to lazy version management at low contention. The adaptive mechanism is able to switch configuration at runtime based on application behaviour for maximum performance.
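The difference between the two version management policies can be illustrated with a small software model. This is a conceptual sketch of eager (undo log) versus lazy (write buffer) versioning in general, not the proposed hardware; the class names and the dictionary standing in for memory are assumptions, and every address is assumed to exist in memory before it is written.

```python
class EagerTx:
    """Eager versioning: write in place, keep old values in an undo log."""

    def __init__(self, memory):
        self.memory = memory
        self.undo = {}

    def write(self, addr, value):
        # Remember the original value only on the first write to addr.
        if addr not in self.undo:
            self.undo[addr] = self.memory[addr]
        self.memory[addr] = value

    def commit(self):
        self.undo.clear()            # cheap: data is already in place

    def abort(self):
        self.memory.update(self.undo)  # costly: roll every write back
        self.undo.clear()


class LazyTx:
    """Lazy versioning: buffer writes, apply them to memory on commit."""

    def __init__(self, memory):
        self.memory = memory
        self.buffer = {}

    def write(self, addr, value):
        self.buffer[addr] = value

    def read(self, addr):
        return self.buffer[addr] if addr in self.buffer else self.memory[addr]

    def commit(self):
        self.memory.update(self.buffer)  # costly: copy every buffered write
        self.buffer.clear()

    def abort(self):
        self.buffer.clear()          # cheap: memory was never touched
```

In this model, eager versioning makes commits cheap and aborts expensive, and lazy versioning does the opposite, which is consistent with eager doing better when aborts are rare (low contention) and lazy doing better when they are frequent (high contention).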
field-programmable custom computing machines | 2015
Jeevan Sirkunan; Chia Yee Ooi; Nasir Shaikh-Husin; Yuan Wen Hau; Muhammad Nadzir Marsono
Hardware Transactional Memory (HTM) performance is application-specific and depends on its version management and conflict management configurations. An adaptive mechanism is needed to adapt these configurations to the behaviour of multiple applications.
international symposium on circuits and systems | 2014
Jeevan Sirkunan; Chia Yee Ooi; Nasir Shaikh-Husin; Yuan Wen Hau; Muhammad Nadzir Marsono
Transactional memory (TM) is a promising approach in creating an abstraction layer for multi-threaded programming. However, the performance of TM is application-specific. Previous embedded system TM implementations exploit only conflict management to suit the application requirements. In this paper, we propose a hardware transactional memory (HTM) which exploits both version and conflict management. The proposed architecture is targeted for embedded applications and is area-efficient compared to current methods that apply cache coherence protocols. The proposed system was tested with random requests at different contention levels. We implemented the HTM with four model processors on a Cyclone IV Field Programmable Gate Array. Our results show that it offers up to 14% improvement in terms of clock cycles over the HTM scheme that only exploits conflict management.
international conference on intelligent systems, modelling and simulation | 2010
Yuan Wen Hau; Mohamed Khalil-Hani; Muhammad Nadzir Marsono
This paper presents CODESL, a SystemC-based hardware-software co-design and co-simulation framework for embedded systems based on System-on-Chip (SoC). This modelling platform, which works at Electronic System Level (ESL), enables early system functionality verification, as well as algorithm exploration before the final implementation prototype is available. It can validate the behaviour of both the hardware and the software modules of the embedded SoC, as well as the interaction between them, with timed/cycle accuracy. In addition, the platform also facilitates architecture exploration that assists the system designer in finding the best hardware-software partitioning. Results show that the proposed platform is capable of estimating the system execution cycle count within 5% deviation compared to the RTL deployment model for complex SoC embedded systems.