Publication


Featured research published by Zhehui Wang.


IEEE Computer Society Annual Symposium on VLSI | 2011

A NoC Traffic Suite Based on Real Applications

Weichen Liu; Jiang Xu; Xiaowen Wu; Yaoyao Ye; Xuan Wang; Wei Zhang; Mahdi Nikdast; Zhehui Wang

As benchmark programs for microprocessor architectures, network-on-chip (NoC) traffic patterns are essential tools for NoC performance assessments and architecture explorations. The fidelity of NoC traffic patterns has profound influence on NoC studies. For the first time, this paper presents a realistic traffic benchmark suite, called MCSL, and the methodology used to generate it. The publicly released MCSL benchmark suite includes a set of realistic traffic patterns for 8 real applications and covers popular NoC architectures. It captures not only the communication behaviors in NoCs but also the temporal dependencies among them. MCSL benchmark suite can be easily incorporated into existing NoC simulators and significantly improve NoC simulation accuracy. We developed a systematic traffic generation methodology to create MCSL based on real applications. The methodology uses formal computational models to capture both communication and computation requirements of applications. It optimizes application mapping and scheduling to faithfully maximize overall system performance and utilization before extracting realistic traffic patterns through cycle-accurate simulations. Experiment results show that MCSL benchmark suite can be used to study NoC characteristics more accurately than traditional random traffic patterns.
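
To illustrate what capturing temporal dependencies among communications can mean in practice, here is a minimal sketch in which a packet may only be injected once the packets it depends on have been delivered. The function name `injection_times`, the dictionary-based dependency format, and the fixed per-packet `latency` are all hypothetical and chosen for illustration; this is not the MCSL file format or the authors' generation methodology.

```python
from collections import deque


def injection_times(deps, latency=1):
    """Return the earliest injection time of each packet.

    deps maps a packet id to the ids of packets that must be delivered
    before it can be injected. latency is an assumed, fixed delivery time.
    """
    # Make sure every packet mentioned anywhere appears as a key.
    preds = {p: list(d) for p, d in deps.items()}
    for d in list(preds.values()):
        for q in d:
            preds.setdefault(q, [])

    # Build successor lists and count unmet dependencies per packet.
    succs = {p: [] for p in preds}
    remaining = {p: len(d) for p, d in preds.items()}
    for p, d in preds.items():
        for q in d:
            succs[q].append(p)

    # Process packets in topological order (Kahn's algorithm).
    ready = deque(p for p, n in remaining.items() if n == 0)
    inject = {p: 0 for p in preds}
    done = 0
    while ready:
        p = ready.popleft()
        done += 1
        for s in succs[p]:
            # A successor waits for the latest delivery among its predecessors.
            inject[s] = max(inject[s], inject[p] + latency)
            remaining[s] -= 1
            if remaining[s] == 0:
                ready.append(s)
    if done != len(preds):
        raise ValueError("dependency cycle")
    return inject


times = injection_times({"a": [], "b": ["a"], "c": ["a", "b"]}, latency=3)
print(times)
```

A purely random traffic pattern would inject all three packets without regard to ordering, whereas here packet `c` cannot enter the network until both `a` and `b` have arrived.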


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2013

3-D Mesh-Based Optical Network-on-Chip for Multiprocessor System-on-Chip

Yaoyao Ye; Jiang Xu; Baihan Huang; Xiaowen Wu; Wei Zhang; Xuan Wang; Mahdi Nikdast; Zhehui Wang; Weichen Liu; Zhe Wang

Optical networks-on-chip (ONoCs) are emerging communication architectures that can potentially offer ultrahigh communication bandwidth and low latency to multiprocessor systems-on-chip (MPSoCs). In addition to ONoC architectures, 3-D integrated technologies offer an opportunity to continue performance improvements with higher integration densities. In this paper, we present a 3-D mesh-based ONoC for MPSoCs, and new low-cost nonblocking 4 × 4, 5 × 5, 6 × 6, and 7 × 7 optical routers for dimension-order routing in the 3-D mesh-based ONoC. Besides, we propose an optimized floorplan for the 3-D mesh-based ONoC. The floorplan follows the regular 3-D mesh topology but implements all optical routers in a single optical layer. The floorplan is optimized to minimize the number of extra waveguide crossings caused when merging the 3-D ONoC to one optical layer. Based on a set of real applications and uniform traffic pattern, we develop a SystemC-based cycle-accurate NoC simulator and compare the 3-D mesh-based ONoC with the matched 2-D mesh-based ONoC and 2-D electronic NoC for performance and energy efficiency. Additionally, we quantitatively analyze thermal effects on the 3-D 8 × 8 × 2 mesh-based ONoC.
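
As a rough illustration of dimension-order routing in a 3-D mesh, the sketch below corrects one coordinate at a time and also counts router ports per node. The X, then Y, then Z ordering, the function names, and the rule "one port per mesh neighbour plus one local port" are assumptions made for this example, not details taken from the paper.

```python
def dimension_order_route(src, dst):
    """Return the nodes visited from src to dst, correcting X, then Y, then Z."""
    path = [tuple(src)]
    cur = list(src)
    for axis in range(3):
        step = 1 if dst[axis] > cur[axis] else -1
        while cur[axis] != dst[axis]:
            cur[axis] += step
            path.append(tuple(cur))
    return path


def router_ports(node, dims):
    """Count ports: one per mesh neighbour plus one local port (assumed)."""
    neighbours = 0
    for coord, size in zip(node, dims):
        if coord > 0:
            neighbours += 1
        if coord < size - 1:
            neighbours += 1
    return neighbours + 1


print(dimension_order_route((0, 0, 0), (2, 1, 1)))
print(router_ports((0, 0, 0), (8, 8, 2)))  # corner node
print(router_ports((3, 3, 0), (8, 8, 2)))  # interior node of a two-layer mesh
```

Under this port-counting assumption, node degrees in a 3-D mesh range from 4 ports (corner) to 7 ports (interior node with neighbours above and below), which lines up with the 4 × 4 through 7 × 7 router sizes named in the abstract. The paper's actual port assignment may differ.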


IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2014

Systematic Analysis of Crosstalk Noise in Folded-Torus-Based Optical Networks-on-Chip

Mahdi Nikdast; Jiang Xu; Xiaowen Wu; Wei Zhang; Yaoyao Ye; Xuan Wang; Zhehui Wang; Zhe Wang

Photonic devices are widely used in optical networks-on-chip (ONoCs) and suffer from crosstalk noise. The accumulative crosstalk noise in large scale ONoCs diminishes the signal-to-noise ratio (SNR), causes severe performance degradation, and constrains the network scalability. For the first time, this paper systematically analyzes and models the worst-case crosstalk noise and SNR in folded-torus-based ONoCs. Formal analytical models for the worst-case crosstalk noise and SNR are presented. The crosstalk noise analysis is hierarchically performed at the basic photonic device level, then at the optical router level, and finally at the network level. We consider a general 5 × 5 optical router model to enable crosstalk noise and SNR analyses in folded-torus-based ONoCs using an arbitrary 5 × 5 optical router. Using the general optical router model, the worst-case SNR link candidates, which restrict the network scalability, are found. Also, we present a novel crosstalk noise and loss analysis platform, called CLAP, which can analyze the crosstalk noise and SNR of arbitrary ONoCs. Case studies of optimized crossbar and Crux optical routers using recent photonic device parameters are presented. Moreover, we compare the worst-case crosstalk noise and SNR in folded-torus-based and mesh-based ONoCs using optimized crossbar and Crux optical routers. The quantitative simulation results show the critical behavior of crosstalk noise in large scale ONoCs. For example, in folded-torus-based ONoCs using the Crux optical router, the noise power exceeds the signal power for network sizes larger than 12 × 12; when the network size is 20 × 20 and the injection signal power equals 0 dBm, the signal power and noise power are -9.4 dBm and -6.1 dBm, respectively.
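
The power figures quoted above can be turned into an SNR with a one-line calculation, since a ratio of two powers in dBm is their difference in dB. The helper names below are hypothetical, and the sketch only restates standard decibel arithmetic; it does not reproduce the paper's analytical models.

```python
import math


def dbm_to_mw(p_dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def snr_db(signal_dbm, noise_dbm):
    """SNR in dB from signal and noise powers given in dBm."""
    return signal_dbm - noise_dbm


# Values quoted in the abstract for the 20 x 20 folded-torus case.
signal_dbm, noise_dbm = -9.4, -6.1
print(round(snr_db(signal_dbm, noise_dbm), 1))

# The same result via linear powers, as a cross-check.
ratio = dbm_to_mw(signal_dbm) / dbm_to_mw(noise_dbm)
print(round(10 * math.log10(ratio), 1))
```

A negative SNR in dB means the noise power is larger than the signal power, which is the condition the abstract describes for network sizes above 12 × 12.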


IEEE Transactions on Very Large Scale Integration Systems | 2013

Formal Worst-Case Analysis of Crosstalk Noise in Mesh-Based Optical Networks-on-Chip

Yiyuan Xie; Mahdi Nikdast; Jiang Xu; Xiaowen Wu; Wei Zhang; Yaoyao Ye; Xuan Wang; Zhehui Wang; Weichen Liu

Crosstalk noise is an intrinsic characteristic as well as a potential issue of photonic devices. In large scale optical networks-on-chip (ONoCs), crosstalk noise could cause severe performance degradation and prevent ONoCs from communicating properly. The novel contribution of this paper is the systematic modeling and analysis of the crosstalk noise and the signal-to-noise ratio (SNR) of optical routers and mesh-based ONoCs using a formal method. Formal analytical models for the worst-case crosstalk noise and minimum SNR in mesh-based ONoCs are presented. The crosstalk analysis is performed at device, router, and network levels. A general 5 × 5 optical router model is proposed for router level analysis. The minimum SNR optical link candidates, which constrain the scalability of mesh-based ONoCs, are identified. It is also shown that symmetric mesh-based ONoCs have the best SNR performance. The presented formal analyses can be easily applied to other optical routers and mesh-based ONoCs. Finally, we present case studies of mesh-based ONoCs using the optimized crossbar and Crux optical routers to evaluate the proposed formal method. We find that crosstalk noise can significantly limit the scalability of mesh-based ONoCs. For example, when the mesh-based ONoC size, using optimized crossbar, is larger than 8 × 8, the optical signal power is smaller than the crosstalk noise power; when the network size is 16 × 16 and the input power is 0 dBm, in the worst case, the signal power is -24.9 dBm and the crosstalk noise power is -11 dBm.


IEEE Transactions on Very Large Scale Integration Systems | 2013

System-Level Modeling and Analysis of Thermal Effects in Optical Networks-on-Chip

Yaoyao Ye; Jiang Xu; Xiaowen Wu; Wei Zhang; Xuan Wang; Mahdi Nikdast; Zhehui Wang; Weichen Liu

The performance of multiprocessor systems, such as chip multiprocessors (CMPs), is determined not only by individual processor performance, but also by how efficiently the processors collaborate with one another. It is the communication architecture that determines the collaboration efficiency on the hardware side. Optical networks-on-chip (ONoCs) are emerging communication architectures that can potentially offer ultra-high communication bandwidth and low latency to multiprocessor systems. Thermal sensitivity is an intrinsic characteristic of photonic devices used by ONoCs as well as a potential issue. This paper systematically modeled and quantitatively analyzed the thermal effects in ONoCs. We used an 8 × 8 mesh-based ONoC as a case study and evaluated the impacts of thermal effects on the average power efficiency for real MPSoC applications. We revealed three important factors regarding ONoC power efficiency under temperature variations, and proposed several techniques to reduce the temperature sensitivity of ONoCs. These techniques include the optimal initial setting of microresonator resonant wavelength, increasing the 3-dB bandwidth of optical switching elements by parallel coupling multiple microresonators, and the use of passive-routing optical router Crux to minimize the number of switching stages in mesh-based ONoCs. We gave a mathematical analysis of periodically parallel coupling of multiple microresonators and showed that the 3-dB bandwidth of optical switching elements can be widened nearly linearly with the ring number. Evaluation results for different real MPSoC applications show that, on the basis of thermal tuning, the optimal device setting improves the average power efficiency by 54% to 1.2 pJ/bit when chip temperature reaches 85 °C. The findings in this paper can help support the further development of this emerging technology.


ACM Journal on Emerging Technologies in Computing Systems | 2014

SUOR: Sectioned Undirectional Optical Ring for Chip Multiprocessor

Xiaowen Wu; Jiang Xu; Yaoyao Ye; Zhehui Wang; Mahdi Nikdast; Xuan Wang

Chip multiprocessor (CMP) is becoming an attractive platform for applications seeking both high performance and high energy efficiency. In large-scale CMPs, the communication efficiency among cores is crucial for the overall system performance and energy consumption. In this article, we propose a ring-based optical network-on-chip, called SUOR, to fulfill the communication requirement of CMPs. SUOR effectively explores the distinctive properties of optical signals and photonic devices, and dynamically partitions each data channel into multiple sections. Each section can be utilized independently to boost performance as well as reduce energy consumption. We develop a set of distributed control protocols and algorithms for SUOR, but physically allocate the corresponding cluster agents close to each other to benefit from the strengths of optical interconnects at long distances as well as electrical interconnects at short distances. Simulation results show that SUOR outperforms the alternative optical networks under a wide range of traffic patterns. For example, compared with the MWSR design, SUOR achieves 2.58× the throughput and reduces energy consumption by 64% on average in a 256-core CMP. Compared with the MWMR design, SUOR achieves 1.52× the throughput and reduces energy consumption by 73% on average.


IEEE Transactions on Computers | 2014

Floorplan Optimization of Fat-Tree-Based Networks-on-Chip for Chip Multiprocessors

Zhehui Wang; Jiang Xu; Xiaowen Wu; Yaoyao Ye; Wei Zhang; Mahdi Nikdast; Xuan Wang; Zhe Wang

Chip multiprocessor (CMP) is becoming increasingly popular in the processor industry. Efficient network-on-chip (NoC) that has similar performance to the processor cores is important in CMP design. Fat-tree-based on-chip network has many advantages over traditional mesh or torus-based networks in terms of throughput, power efficiency, and latency. It has a bright future in the development of CMP. However, the floorplan design of the fat-tree-based NoC is very challenging because of the complexity of topology. There are a large number of crossings and long interconnects, which cause severe performance degradation in the network. In electronic NoCs, the parasitic capacitance and inductance will be significant. In optical ones, large crosstalk noise and power loss will be introduced. The novel contribution of this paper is to propose a method to optimize the fat-tree floorplan, which can effectively reduce the number of crossings and minimize the interconnect length. Two types of floorplans are proposed, which could be applied to fat-tree-based networks of arbitrary size. Compared with the traditional one, our floorplans could reduce more than 87% of the crossings. Since the traversal distance for signals is related to the aspect ratio of the processor cores, we also present a method to calculate the optimum aspect ratio of the processor cores to minimize the traversal distance.


IEEE Design & Test of Computers | 2014

A Case Study of Signal-to-Noise Ratio in Ring-Based Optical Networks-on-Chip

Luan H. K. Duong; Mahdi Nikdast; Sébastien Le Beux; Jiang Xu; Xiaowen Wu; Zhehui Wang; Peng Yang

Microresonators have been utilized to construct optical interconnection networks. One of the drawbacks of these microresonators is that they suffer from intrinsic crosstalk noise and power loss, resulting in Signal-to-Noise Ratio (SNR) reduction and system performance degradation at the network level. The novel contribution of this paper is to systematically study the worst-case crosstalk noise and SNR in a ring-based ONoC, the Corona. In the paper, Corona's data channel and broadcast bus are investigated, with formal general analytical models presented at the device and network levels. Leveraging our detailed analytical models, we present quantitative simulations of the worst-case power loss, crosstalk noise, and SNR in Corona. Moreover, we compare the worst-case results in Corona with those in mesh-based and folded-torus-based ONoCs, all of which consist of the same number of cores as Corona. The quantitative results demonstrate the damaging impact of crosstalk noise and power loss in Corona: the worst-case SNR is roughly 14.0 dB in the network, while the worst-case power loss is substantially high at -69.3 dB in the data channel.


IEEE Transactions on Very Large Scale Integration Systems | 2015

Fat-Tree-Based Optical Interconnection Networks Under Crosstalk Noise Constraint

Mahdi Nikdast; Jiang Xu; Luan H. K. Duong; Xiaowen Wu; Zhehui Wang; Xuan Wang; Zhe Wang

Optical networks-on-chip (ONoCs) have shown the potential to be substituted for electronic networks-on-chip (NoCs) to bring substantially higher bandwidth and more efficient power consumption in both on- and off-chip communication. However, basic optical devices, which are the key components in constructing ONoCs, experience inevitable crosstalk noise and power loss; the crosstalk noise from the basic devices accumulates in large-scale ONoCs and considerably hurts the signal-to-noise ratio (SNR) as well as restricts the network scalability. For the first time, this paper presents a formal system-level analytical approach to analyze the worst-case crosstalk noise and SNR in arbitrary fat-tree-based ONoCs. The analyses are performed hierarchically at the basic optical device level, then at the optical router level, and finally at the network level. A general 4 × 4 optical router model is considered to enable the proposed method to be adaptable to fat-tree-based ONoCs using an arbitrary 4 × 4 optical router. Utilizing the proposed general router model, the worst-case SNR link candidates in the network are determined. Moreover, we apply the proposed analyses to a case study of fat-tree-based ONoCs using an optical turnaround router (OTAR). Quantitative simulation results indicate low values of SNR and scalability constraints in large scale fat-tree-based ONoCs, which is due to the high power of crosstalk noise and power loss. For instance, in fat-tree-based ONoCs using the OTAR, when the injection laser power equals 0 dBm, the crosstalk noise power is higher than the signal power when the number of processor cores exceeds 128; when it is equal to 256, the signal power, crosstalk noise power, and SNR are -17.3 dBm, -11.9 dBm, and -5.5 dB, respectively.


IEEE Transactions on Very Large Scale Integration Systems | 2015

An Inter/Intra-Chip Optical Network for Manycore Processors

Xiaowen Wu; Jiang Xu; Yaoyao Ye; Xuan Wang; Mahdi Nikdast; Zhehui Wang; Zhe Wang

Manycore processor systems are becoming an attractive platform for applications seeking both high performance and high energy efficiency. However, huge communication demands among cores, large power density, and low process yield will be three significant limitations for the scalability of future manycore processors. Breaking a large chip into multiple smaller ones can alleviate the problems of power density and yield, but would worsen the problem of communication efficiency due to the limited off-chip bandwidth. In response, we propose an inter/intra-chip optical network, which will not only fulfill the intra-chip communication requirements but also address the inter-chip communication, by exploiting the advantages of optical links with high bandwidth and energy efficiency. The network is composed of an inter-chip subnetwork and multiple intra-chip subnetworks, and the subnetworks closely coordinate with each other to balance the traffic. The proposed network effectively explores the distinctive properties of optical signals and photonic devices, and dynamically partitions each data channel into multiple sections. Each section can be utilized independently to boost performance as well as reduce energy consumption. Simulation results show that our network can achieve higher throughput with lower power consumption than alternative designs under most synthetic traffic patterns and real applications.

Collaboration


Dive into Zhehui Wang's collaboration.

Top Co-Authors

Jiang Xu

Hong Kong University of Science and Technology

Xiaowen Wu

Hong Kong University of Science and Technology

Xuan Wang

Hong Kong University of Science and Technology

Zhe Wang

Hong Kong University of Science and Technology

Peng Yang

Hong Kong University of Science and Technology

Mahdi Nikdast

École Polytechnique de Montréal

Luan H. K. Duong

Hong Kong University of Science and Technology

Yaoyao Ye

Hong Kong University of Science and Technology

Zhifei Wang

Hong Kong University of Science and Technology

Wei Zhang

Hong Kong University of Science and Technology
