Arghavan Asad

Iran University of Science and Technology

Publication

Featured research published by Arghavan Asad.

digital systems design | 2009

A Fault Tolerant NoC Architecture for Reliability Improvement and Latency Reduction

Amir Ehsani Zonouz; Mehrdad Seyrafi; Arghavan Asad; Mohsen Soryani; Mahmoud Fathy; Reza Berangi

With the shrinking feature size of transistors and the growing number of cores on a single chip, fault tolerance and reliability have become two significant challenges for IC designers. Since chip design is extremely cost-sensitive, fault-tolerance redundancy must be provided at a reasonable cost. In this paper, a fault-tolerant NoC architecture in which each core is linked to two switches instead of one is proposed. This architecture can save cores that are attached to a faulty switch. In addition, to improve efficiency and compensate for this redundancy, a new routing algorithm is suggested that can be dynamically reconfigured to avoid faulty switches. Evaluation of the proposed architecture shows that both latency and reliability are improved.
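The dual-switch idea in this abstract can be illustrated with a minimal sketch. It is not the paper's design: the function name `pick_switch`, the switch IDs, and the "try the first link, then the second" policy are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple


def pick_switch(core_links: Tuple[int, int], faulty: Set[int]) -> Optional[int]:
    """Return the first healthy switch a core is wired to, or None.

    core_links holds the IDs of the two switches attached to one core
    (hypothetical numbering); faulty is the set of switches known to be down.
    """
    for switch_id in core_links:
        if switch_id not in faulty:
            return switch_id
    return None  # both switches failed, so the core is unreachable


# Hypothetical core wired to switches 4 and 5.
links = (4, 5)
print(pick_switch(links, faulty=set()))   # first switch is healthy
print(pick_switch(links, faulty={4}))     # falls back to the second switch
print(pick_switch(links, faulty={4, 5}))  # no healthy switch left
```

With a single link per core, the second call would already leave the core cut off; the extra link is what keeps it reachable.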

Intelligent Decision Technologies | 2009

A Predominant Routing for on-chip networks

Arghavan Asad; Mehrdad Seyrafi; Amir Ehsani Zonouz; Mohsen Soryani; Mahmood Fathy

To maintain communication quality between nodes in a Network-on-Chip (NoC), the average delay of a packet forwarded from a source node to a destination node must be kept low. Routing algorithms have a prominent impact on communication quality and performance in on-chip interconnection networks. In this paper, we present a novel routing method called Predominant Routing, which can select the best route for communication flows using a simple setup network. During network setup time, a number of low-latency virtual point-to-point connections are provided so that the best route can be constructed at run time when a new flow (a connection between a source and its destination to carry messages) is detected. Evaluation results show that the proposed routing algorithm consumes low power and approaches extremely low latency in packet-switched Networks-on-Chip.

international conference on electronics and information engineering | 2010

A new low cost fault tolerant solution for mesh based NoCs

Mehrdad Seyrafi; Arghavan Asad; Amir Ehsani Zonouz; Reza Berangi; Mahmood Fathy; Mohsen Soryani

In this paper, a new fault-tolerant routing algorithm with minimum hardware requirements and extremely high fault tolerance for 2D-mesh-based NoCs is proposed. The LCFT (Low Cost Fault Tolerant) algorithm removes the main limitation (forbidden turns) of the well-known XY routing algorithm. As a result, it not only adds many new routes to the list of selectable paths while preserving deadlock freedom, but also provides a high level of fault tolerance. All of this comes at the cost of adding just one more virtual channel (for a total of two). Results show that the LCFT algorithm works well even under adverse fault conditions in comparison with previously published methods.
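For context, the baseline that this abstract improves on can be sketched as follows. This is plain dimension-ordered XY routing on a 2D mesh, not the LCFT algorithm itself. The coordinate convention and the names `xy_route` and `is_blocked` are assumptions made for illustration.

```python
from typing import List, Set, Tuple

Node = Tuple[int, int]  # (x, y) position of a switch in the mesh


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Dimension-ordered routing: finish all X hops, then all Y hops."""
    x, y = src
    path = [src]
    step = 1 if dst[0] > x else -1
    while x != dst[0]:
        x += step
        path.append((x, y))
    step = 1 if dst[1] > y else -1
    while y != dst[1]:
        y += step
        path.append((x, y))
    return path


def is_blocked(path: List[Node], faulty: Set[Node]) -> bool:
    """XY offers exactly one path, so any faulty switch on it blocks the packet."""
    return any(node in faulty for node in path)


route = xy_route((0, 0), (2, 1))
print(route)                        # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(is_blocked(route, {(1, 0)}))  # True: XY has no alternative route
```

Because XY never turns from the Y dimension back to X, it is deadlock-free but has a single path per source-destination pair. That single path is the limitation the abstract says LCFT removes.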

digital systems design | 2015

Exploiting Heterogeneity in Cache Hierarchy in Dark-Silicon 3D Chip Multi-processors

Arghavan Asad; Ozcan Ozturk; Mahmood Fathy; Mohammad Reza Jahed-Motlagh

Technology scaling has enabled an increasing number of cores on a chip in Chip-Multiprocessors (CMPs). As the number of cores increases, the overall system will need to provide more cache resources to feed all the cores. However, increasing the size of each cache level in the cache hierarchy of CMPs mitigates the large off-chip memory access latencies and bandwidth constraints. Moreover, the cache hierarchy is known as one of the most power-hungry components in many-core CMPs, because leakage power within the cache system has become a significant contributor to the overall chip power budget in the deep sub-micron and dark silicon era. Given the many advantages of Non-Volatile Memory (NVM) technology, such as high density, near-zero leakage, and non-volatility, in this paper we focus on exploiting such memories in the cache hierarchy. Specifically, we focus on 3D CMPs to decrease leakage power consumption and mitigate the dark silicon phenomenon. Experimental results show that the proposed method improves throughput by 45.5% and energy-delay product by 56% on average when compared to the conventional single cache technology.

reconfigurable computing and fpgas | 2009

Modeling and Analyzing of Blocking Time Effects on Power Consumption in Network-on-Chips

Arghavan Asad; Amir Ehsani Zonouz; Mehrdad Seyrafi; Mohsen Soryani; Mahmood Fathy

Networks-on-Chip (NoCs) have been proposed as the only efficient and scalable solution for providing global on-chip communication in any large VLSI design. At the same time, power dissipation issues have grown to such importance that they now constrain attainable performance. The large value of power consumption, relative to the active power, can therefore have serious implications for the feasibility of deploying NoCs. If NoCs are to be accepted, their full power implications need to be known. Moreover, these power characteristics must be accurately understood across the large possible design space of NoCs. Blocking time is one of the factors that affect NoC power consumption. In this paper, we present a Markovian model for evaluating the power dissipated due to packet blocking and show the effects of blocking time on the total power consumption of on-chip networks.

Microprocessors and Microsystems | 2017

Optimization-based power and thermal management for dark silicon aware 3D chip multiprocessors using heterogeneous cache hierarchy

Arghavan Asad; Ozcan Ozturk; Mahmood Fathy; Mohammad Reza Jahed-Motlagh

Management of a problem recently known as “dark silicon” is a new challenge in multicore designs. Prior innovative studies have addressed the dark silicon problem in the field of power-efficient core design. However, addressing dark silicon challenges in the design of uncore components such as the cache hierarchy and on-chip interconnect, which account for a significant portion of on-chip power consumption, is largely unexplored. In this paper, for the first time, we propose an integrated approach that considers the power consumption of core and uncore components simultaneously to improve multi/many-core performance in the dark silicon era. The proposed approach dynamically (1) predicts the changing program behavior on each core; (2) re-determines frequency/voltage, cache capacity and technology in each level of the cache hierarchy based on the program's scalability in order to satisfy the power and temperature constraints. In the proposed architecture, for future chip-multiprocessors (CMPs), we exploit emerging technologies such as non-volatile memories (NVMs) and 3D techniques to combat dark silicon. Also, for the first time, we propose a detailed power model that is useful for power modeling of future dark silicon CMPs. Experimental results on SPEC 2000/2006 benchmarks show that the proposed method improves throughput by about 54.3% and energy-delay product by about 61% on average in comparison with the conventional CMP architecture with a homogeneous cache system. (A preliminary short version of this work was presented at the 18th Euromicro Conference on Digital System Design (DSD), 2015.)

IEEE Transactions on Emerging Topics in Computing | 2016

An Energy-Efficient Heterogeneous Memory Architecture for Future Dark Silicon Embedded Chip-Multiprocessors

Salman Onsori; Arghavan Asad; Kaamran Raahemifar; Mahmood Fathy

Main memories play an important role in the overall energy consumption of embedded systems. Using conventional memory technologies in future nanoscale designs causes a drastic increase in leakage power consumption and temperature-related problems. Emerging non-volatile memory (NVM) technologies offer many desirable characteristics such as near-zero leakage power, high density and non-volatility. They can significantly mitigate the issue of memory leakage power in future embedded chip-multiprocessor (eCMP) systems. However, they suffer from challenges such as limited write endurance and high write energy consumption, which restrict their adoption in modern memory systems. In this article, we present a convex optimization model to design a 3D stacked hybrid memory architecture in order to minimize the energy consumption of future embedded systems in the dark silicon era. The proposed approach satisfies an endurance constraint in order to design a reliable memory system. Our convex model optimizes the number and placement of eDRAM and STT-RAM memory banks on the memory layer to exploit the advantages of both technologies in future eCMPs. Energy consumption, the main challenge in the dark silicon era, is the major target of this work, and it is minimized by the detailed optimization model in order to design a dark-silicon-aware 3D Chip-Multiprocessor. Experimental results show that, in comparison with the baseline memory design, the proposed architecture improves the energy consumption and performance of the 3D CMP by about 61.33 percent and 9 percent on average, respectively.

international symposium on telecommunications | 2008

Some enhanced cache replacement policies for reducing power in mobile devices

Mahmoud Fathy; Mohsen Soryani; Amir Ehsani Zonouz; Arghavan Asad; Mehrdad Seyrafi

Developing widely useful mobile computing applications presents difficult challenges. On the one hand, mobile users demand fast response times and deep, relevant content. On the other hand, mobile devices have limited storage, power and communication resources. Caching frequently accessed data items on the mobile client is an effective technique for improving system performance in a mobile environment. Because cache size is limited, the choice of cache replacement technique, which finds a suitable subset of items to evict from the cache, becomes important. Power consumption and lookup latency are both crucial performance factors in embedded systems. One important decision in designing hierarchical memories is the selection of cache replacement policies. In this paper, we first briefly explain some of the replacement policies used in modern cache structures. Then we propose some improved replacement policies and evaluate their performance. We show that taking dirty blocks into account in the cache policy reduces the average power consumption by a few percent.
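The abstract does not spell out its policies. The sketch below shows one way a replacement policy might take dirty blocks into account: it evicts the least recently used clean block when one exists, so that a costly write-back happens only when every cached block is dirty. The class name `CleanFirstLRU` and this specific rule are assumptions, not the authors' method.

```python
from collections import OrderedDict


class CleanFirstLRU:
    """LRU variant that prefers evicting clean blocks (illustrative only)."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.blocks = OrderedDict()  # key -> dirty flag, least recent first
        self.writebacks = 0          # evictions that had to write data back

    def access(self, key, write: bool = False) -> bool:
        """Touch a block; return True on a hit and False on a miss."""
        if key in self.blocks:
            dirty = self.blocks.pop(key) or write
            self.blocks[key] = dirty  # re-insert as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            self._evict()
        self.blocks[key] = write
        return False

    def _evict(self) -> None:
        # Oldest clean block first; fall back to the oldest block overall.
        victim = next((k for k, d in self.blocks.items() if not d), None)
        if victim is None:
            victim = next(iter(self.blocks))
            self.writebacks += 1  # a dirty victim costs a write-back
        del self.blocks[victim]


cache = CleanFirstLRU(capacity=2)
cache.access("a", write=True)  # "a" is dirty
cache.access("b")              # "b" is clean
cache.access("c")              # evicts clean "b" although "a" is older
print(list(cache.blocks), cache.writebacks)  # ['a', 'c'] 0
```

Plain LRU would have evicted "a" here and paid for a write-back. The trade-off is that a dirty block may stay cached longer than its recency alone would justify.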

digital systems design | 2017

Optimal Placement of Heterogeneous Uncore Component in 3D Chip-Multiprocessors

Aniseh Dorostkar; Arghavan Asad; Mahmood Fathy; Farah Mohammadi

In this article, we present a convex optimization model to design a stacked hybrid memory system containing eDRAM and STT-RAM banks that minimizes the write energy consumption of the STT-RAM banks and the refresh energy of the eDRAM banks with an efficient number of TSVs in an embedded CMP. Our convex model optimizes the number and placement of memory banks of different technologies on the memory layer and finds the optimal number and placement of TSVs while satisfying the maximum load on the TSVs. Experimental results on the PARSEC benchmark show that the proposed architecture improves energy-delay product (EDP) by 51% on average.

network on chip architectures | 2016

UCA: An Energy-efficient Hybrid Uncore Architecture in 3D Chip-Multiprocessors to minimize crosstalk

Pooneh Safayenikoo; Arghavan Asad; Kaamran Raahemifar; Mahmood Fathy

With technology scaling, the number of uncore components on a chip increases in Chip-Multiprocessors (CMPs). As the number of cores increases, power consumption becomes the main concern in the Network on Chip (NoC) and Last Level Cache (LLC). Emerging technologies, such as three-dimensional integrated circuits (3D ICs) and non-volatile memories (NVMs), are among the newest solutions for the design of dark-silicon-aware multi/many-core systems. In on-chip interconnection networks, components must be activated for each access, and consequently the energy of the NoC increases. Although NVMs have many advantages such as low leakage and high density, they suffer from shortcomings such as a limited number of write operations, long write latency, and high write energy. In this paper, we propose a new architecture called Uncore-Coding Architecture (UCA) to simultaneously target the short lifetime of the NVM LLC and the crosstalk problem of Through-Silicon-Vias (TSVs). This architecture identifies frequent values at runtime and encodes them using limited-weight codes, thereby reducing the number of bit flips to minimize energy and crosstalk in the NoC. Furthermore, this encoding can also improve the lifetime of NVMs integrated into the LLC. Experimental results show that the proposed method improves energy by about 30% on average under PARSEC workloads. Moreover, this technique provides an Average Memory Access Time that is, on average, approximately equal to that of conventional methods with SRAM cache technology under PARSEC workloads.
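The frequent-value encoding described in this abstract can be sketched as follows. This is an illustrative sketch, not the UCA design. It assumes an 8-bit link, codewords with at most one set bit, and a codebook built offline from a hypothetical value trace, whereas the paper identifies frequent values at runtime. Decoding and the handling of values that have no codeword are left out.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List


def bit_flips(words: List[int]) -> int:
    """Total number of bit positions that toggle between consecutive words."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


def limited_weight_codes(width: int, max_weight: int) -> List[int]:
    """All width-bit codewords with at most max_weight ones, lightest first."""
    codes = []
    for weight in range(max_weight + 1):
        for ones in combinations(range(width), weight):
            codes.append(sum(1 << i for i in ones))
    return codes


def build_codebook(trace: List[int], width: int, max_weight: int) -> Dict[int, int]:
    """Give the most frequent values in the trace the lightest codewords."""
    codes = limited_weight_codes(width, max_weight)
    frequent = [value for value, _ in Counter(trace).most_common(len(codes))]
    return dict(zip(frequent, codes))


# Hypothetical trace of values crossing a link.
trace = [0xFF, 0x0F, 0xFF, 0x0F, 0xFF, 0xA5]
book = build_codebook(trace, width=8, max_weight=1)
encoded = [book[v] for v in trace]
print(bit_flips(trace), bit_flips(encoded))  # 20 5
```

Two codewords of weight at most w differ in at most 2w bit positions, which bounds the number of wires that toggle on each transfer.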
