Publication


Featured research published by Meng-Huan Wu.


embedded software | 2009

An effective synchronization approach for fast and accurate multi-core instruction-set simulation

Meng-Huan Wu; Cheng-Yang Fu; Peng-Chih Wang; Ren-Song Tsay

This paper proposes a synchronization approach for fast and accurate Multi-Core Instruction-Set Simulation (MCISS). An ideal MCISS should run accurately in a real-time fashion. To achieve accurate simulation results, a lock-step approach, which synchronizes every cycle, is commonly used. However, this approach introduces immense overhead and lowers the simulation speed. Instead of synchronizing every cycle, our approach synchronizes the MCISS based on the data dependencies among the simulated programs. The synchronization overhead is therefore greatly reduced while accurate simulation results are still ensured. With the proposed approach applied, the simulation speed of the MCISS generally ranges from 40 to 1,000 million instructions per second (MIPS).
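
To make the contrast with lock-step synchronization concrete, the following is a minimal sketch, not the authors' implementation. It assumes that data dependencies arise only at shared-memory accesses; the `Op` type and both counting functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One simulated instruction (hypothetical trace format)."""
    cycles: int           # cycles the instruction takes
    shared: bool = False  # assumption: True if it touches shared memory


def lockstep_syncs(trace):
    """Lock-step: one synchronization per simulated cycle."""
    return sum(op.cycles for op in trace)


def dependency_syncs(trace):
    """Dependency-based: synchronize only before shared-memory accesses."""
    return sum(1 for op in trace if op.shared)


trace = [Op(1), Op(2), Op(1, shared=True), Op(3), Op(1, shared=True), Op(1)]
print(lockstep_syncs(trace), dependency_syncs(trace))
```

For this toy trace the lock-step count is 9 and the dependency-based count is 2, which illustrates where the reduction in synchronization overhead comes from.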


design, automation, and test in europe | 2010

Automatic generation of software TLM in multiple abstraction layers for efficient HW/SW co-simulation

Meng-Huan Wu; Wen-Chuan Lee; Chen-Yu Chuang; Ren-Song Tsay

This paper proposes a novel software Transaction-Level Modeling (TLM) approach for efficient HW/SW co-simulation. In HW/SW co-simulation, timing synchronization between the hardware and software simulations is required to preserve their concurrency. However, handling timing synchronization improperly either slows down the simulation or sacrifices simulation accuracy. Our approach performs timing synchronization only at the points of HW/SW interaction, so accurate simulation results can be achieved efficiently. Furthermore, we define three abstraction levels of software TLM models based on the type of interactions captured. Given the target software, the software TLM models can be automatically generated in multiple abstraction layers. The experimental results show that our software TLM models attain 3 million instructions per second (MIPS) for low-level abstraction and go as high as 248 MIPS for higher-level abstraction. Designers can therefore achieve efficient co-simulation by selecting a proper layer according to the abstraction of the corresponding hardware components.
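
The idea of synchronizing only at HW/SW interaction points can be sketched as follows. This is an illustrative model under stated assumptions, not the generated TLM code described in the paper: the class `SoftwareTimingModel` and its methods are hypothetical, and time is counted in abstract cycles.

```python
class SoftwareTimingModel:
    """Hypothetical software model that batches timing between HW/SW interactions."""

    def __init__(self):
        self.pending_cycles = 0  # accumulated locally, not yet synchronized
        self.global_cycles = 0   # time visible to the hardware simulation
        self.sync_count = 0

    def compute(self, cycles):
        # Pure software computation: only annotate time locally.
        self.pending_cycles += cycles

    def hw_access(self, cycles):
        # HW/SW interaction: publish the accumulated time before the access.
        self.synchronize()
        self.pending_cycles += cycles

    def synchronize(self):
        self.global_cycles += self.pending_cycles
        self.pending_cycles = 0
        self.sync_count += 1


model = SoftwareTimingModel()
model.compute(10)
model.compute(5)
model.hw_access(2)   # first synchronization
model.compute(7)
model.hw_access(1)   # second synchronization
model.synchronize()  # final flush at the end of the run
print(model.global_cycles, model.sync_count)
```

In this run, 25 cycles are simulated with only 3 synchronizations, whereas a scheme that synchronizes after every step would need one per call.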


design, automation, and test in europe | 2011

Cycle-count-accurate processor modeling for fast and accurate system-level simulation

Chen Kang Lo; Li-Chun Chen; Meng-Huan Wu; Ren-Song Tsay

Ideally, system-level simulation should provide a high simulation speed with sufficient timing details for both functional verification and performance evaluation. However, existing cycle-accurate (CA) and cycle-approximate (CX) processor models suffer from either low simulation speeds due to excessive timing details or low accuracy due to simplified timing models. To achieve high simulation speeds while maintaining the timing accuracy of the system simulation, we propose a first cycle-count-accurate (CCA) processor modeling approach, which pre-abstracts the internal pipeline and cache into models with accurate cycle count information and guarantees accurate timing and functional behaviors on the processor interface. The experimental results show that the CCA model performs 50 times faster than the corresponding CA model while providing the same execution cycle count information as the target RTL model.


design automation conference | 2011

A high-parallelism distributed scheduling mechanism for multi-core instruction-set simulation

Meng-Huan Wu; Peng-Chih Wang; Cheng-Yang Fu; Ren-Song Tsay

Ideally, multi-core instruction-set simulation should run in parallel to improve simulation performance. However, the conventional low-parallelism centralized scheduler greatly constrains simulation performance. To resolve this issue, we propose a high-parallelism distributed scheduling mechanism. The experimental results show that our proposed approach accelerates simulation by 6 to 20 times, depending on the number of cores.


design, automation, and test in europe | 2011

DOM: A Data-dependency-Oriented Modeling approach for efficient simulation of OS preemptive scheduling

Peng-Chih Wang; Meng-Huan Wu; Ren-Song Tsay

Operating system (OS) models are widely used to alleviate the overwhelming complexity of running system-level simulation of software applications on a specific OS implementation. Nevertheless, current OS modeling approaches are unable to maintain both simulation speed and accuracy when dealing with preemptive scheduling. This paper proposes a Data-dependency-Oriented Modeling (DOM) approach. By guaranteeing the order of shared variable accesses, accurate simulation results are obtained. Meanwhile, the simulation effort of our approach is considerably less than that of the conventional Cycle-Accurate (CA) modeling approach, leading to high simulation speeds of 42 to 223 million instructions per second (MIPS), or 114 times faster than CA modeling, as supported by our experimental results.


ACM Transactions on Embedded Computing Systems | 2013

A distributed timing synchronization technique for parallel multi-core instruction-set simulation

Meng-Huan Wu; Cheng-Yang Fu; Peng-Chih Wang; Ren-Song Tsay

As multi-core architectures have become mainstream, corresponding multi-core instruction-set simulation (MCISS) is needed to aid system development. Ideally, an MCISS runs in parallel to enhance simulation speed. However, the conventional centralized timing synchronization mechanism greatly constrains the parallelism of an MCISS, so the simulation speed is bounded. To resolve this issue, we propose a new distributed timing synchronization technique that allows higher parallelism for an MCISS. Compared with the centralized synchronization approach, it accelerates simulation by 9 to 20 times as the number of cores increases.
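
The abstract does not spell out the synchronization rule, so the sketch below substitutes a generic conservative scheme as an assumption: each core checks locally whether any other core lags behind it, instead of asking a central scheduler. The functions `may_pass_sync_point` and `simulate` and the input format are hypothetical and are not taken from the paper.

```python
import math


def may_pass_sync_point(core, local_times):
    """Local check: a core proceeds only if no other core lags behind it."""
    return all(local_times[core] <= t
               for other, t in enumerate(local_times) if other != core)


def simulate(sync_points):
    """sync_points[i]: ascending times at which core i reaches a sync point.

    Returns the batches of cores that pass a sync point in the same round.
    """
    pending = [list(points) for points in sync_points]
    batches = []
    while any(pending):
        # A core's local time is its next sync point (infinity once finished).
        times = [p[0] if p else math.inf for p in pending]
        ready = [c for c in range(len(pending))
                 if pending[c] and may_pass_sync_point(c, times)]
        for c in ready:
            pending[c].pop(0)
        batches.append(ready)
    return batches


print(simulate([[5, 20], [5, 12], [30]]))
```

Cores that tie on the earliest time pass together in one round, which is where parallelism can appear; a real simulator would also have to order same-time accesses to the same location, which this sketch ignores.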


design, automation, and test in europe | 2011

A shared-variable-based synchronization approach to efficient cache coherence simulation for multi-core systems

Cheng-Yang Fu; Meng-Huan Wu; Ren-Song Tsay

This paper proposes a shared-variable-based approach for fast and accurate multi-core cache coherence simulation. While the intuitive, conventional approach of synchronizing at either every cycle or every memory access gives accurate simulation results, it performs poorly due to huge simulation overheads. We observe that timing synchronization is needed only before shared variable accesses to maintain accuracy, and the proposed shared-variable-based approach uses this observation to improve efficiency. The experimental results show that our approach performs 6 to 8 times faster than the memory-access-based approach and 18 to 44 times faster than the cycle-based approach while maintaining accuracy.


ACM Transactions on Design Automation of Electronic Systems | 2012

An Extended SystemC Framework for Efficient HW/SW Co-Simulation

Meng-Huan Wu; Peng-Chih Wang; Cheng-Yang Fu; Ren-Song Tsay

In this article, we propose an extended SystemC framework that directly enables software simulation in SystemC. Although SystemC has been widely adopted for system-level simulation of hardware designs nowadays, to complete HW/SW co-simulation, it still requires an additional instruction set simulator (ISS) for software execution. However, the heavy intercommunication overheads between the two heterogeneous simulators would significantly slow down simulation performance. To deal with this issue, our proposed approach automatically generates high-speed and equivalent SystemC models for target software applications that can be directly integrated with hardware models for complete HW/SW co-simulation. In addition, to properly handle multitasking, an efficient OS model is devised to support accurate preemptive scheduling. Since both the generated application model and the OS model are constructed in SystemC modules, our approach avoids heavy intercommunication overheads and achieves over 1,000 times faster simulation than that of the conventional ISS-SystemC approach. Experimental results demonstrate that our extended SystemC approach can perform at 50 to 220 MIPS while offering accurate simulation results.


Archive | 2009

Method and device for multi-core instruction-set simulation

Meng-Huan Wu; Cheng-Yang Fu; Peng-Chih Wang; Ren-Song Tsay


Archive | 2010

Method, System and Computer Readable Medium for Generating Software Transaction-Level Modeling (TLM) Model

Meng-Huan Wu; Ren-Song Tsay

Collaboration


Dive into Meng-Huan Wu's collaborations.

Top Co-Authors

Ren-Song Tsay

National Tsing Hua University

Cheng-Yang Fu

National Tsing Hua University

Peng-Chih Wang

National Tsing Hua University

Chen-Kang Lo

National Tsing Hua University

Li-Chun Chen

National Tsing Hua University

Chen Kang Lo

National Tsing Hua University

Chen-Yu Chuang

National Tsing Hua University

Chien-Min Lee

National Tsing Hua University

Hsien-lun Pai

National Tsing Hua University

Wen-Chuan Lee

National Tsing Hua University
