Network


Latest external collaborations at the country level. Dive into the details by clicking on the dots.

Hotspot


Dive into the research topics where Avesta Sasan is active.

Publication


Featured research published by Avesta Sasan.


Design, Automation, and Test in Europe | 2009

Process variation aware SRAM/cache for aggressive voltage-frequency scaling

Avesta Sasan; Houman Homayoun; Ahmed M. Eltawil; Fadi J. Kurdahi

This paper proposes a novel process-variation-aware SRAM architecture designed to inherently support voltage scaling. The peripheral circuitry of the SRAM is modified to selectively allow overdriving a wordline that contains weak cell(s). This architecture allows reducing power across the entire array; however, it selectively trades power for correctness when rows containing weak cells are accessed. The cell sizing is designed to ensure successful read operations and to avoid flipping the content of the cells when the wordline is overdriven. Our simulations report a 23% to 30% improvement in cell access time and a 31% to 51% improvement in cell write time on overdriven wordlines. The total area overhead is negligible (4%). Low-voltage operation achieves more than a 40% reduction in dynamic power consumption and approximately a 50% reduction in leakage power consumption.
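At a first-order level, the dynamic-power benefit of the voltage scaling described above follows the standard CMOS relation P_dyn ∝ C·V²·f. The sketch below illustrates this relation only; the 0.75 V operating point is an illustrative assumption, not a value from the paper:

```python
def dynamic_power_ratio(v_scaled, v_nominal, f_scaled=1.0, f_nominal=1.0):
    """First-order CMOS dynamic power: P ~ C * V^2 * f.
    Returns scaled power as a fraction of nominal power."""
    return (v_scaled / v_nominal) ** 2 * (f_scaled / f_nominal)

# Illustrative: scaling the supply from 1.0 V to 0.75 V at the same
# frequency cuts dynamic power to ~56% of nominal, i.e. a ~44% reduction,
# in the same ballpark as the >40% reduction the paper reports.
print(f"{1 - dynamic_power_ratio(0.75, 1.0):.0%} dynamic power reduction")
```

The super-linear (quadratic) dependence on voltage is why aggressive voltage scaling is so attractive despite the reliability cost on weak cells.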


Intelligent Networking and Collaborative Systems | 2009

Fuzzy Based Trust Estimation for Congestion Control in Wireless Sensor Networks

Mani Zarei; Amir Masoud Rahmani; Avesta Sasan; Mohammad Teshnehlab

In this paper, we present a novel congestion control scheme based on fuzzy logic systems for wireless sensor networks (WSNs). In sensor networks, the existence of malicious nodes is undeniable, and these nodes aggravate the congestion problem by diffusing useless packets. In our approach, using in-network fuzzy-based processing, each node monitors the behavior of its neighbor nodes and estimates their trust. Malicious nodes can therefore be detected and isolated, reducing the congestion contributed by their packets. Results show that our proposed scheme increases the packet delivery of legitimate nodes by up to 27.5% and reduces the packet loss ratio by up to 40% when 16% of all nodes are malicious.
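As a rough illustration of the in-network fuzzy processing described above, each node could fuzzify observed neighbor behavior into suspicion degrees and derive a trust score. The membership functions, inputs, and threshold below are all illustrative assumptions, not the scheme from the paper:

```python
def triangular(x, a, b, c):
    """Triangular fuzzy membership function on [a, c] with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def estimate_trust(useless_ratio, drop_ratio):
    """Toy fuzzy trust estimate from two observed behaviors of a neighbor.
    Inputs are in [0, 1]; output trust is in [0, 1] (higher = more trusted).
    Membership shapes and the aggregation rule are illustrative assumptions."""
    # Fuzzify each observation into a degree of suspicion.
    suspicious_traffic = triangular(useless_ratio, 0.1, 0.6, 1.01)
    suspicious_drops = triangular(drop_ratio, 0.2, 0.7, 1.01)
    # Simple rule aggregation: trust falls with the strongest suspicion.
    return 1.0 - max(suspicious_traffic, suspicious_drops)

MALICIOUS_THRESHOLD = 0.5  # illustrative isolation cutoff
trust = estimate_trust(useless_ratio=0.55, drop_ratio=0.1)
print(f"trust={trust:.2f}, isolate={trust < MALICIOUS_THRESHOLD}")
```

A node whose trust falls below the threshold would be isolated by its neighbors, so its packets no longer contribute to congestion.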


International Conference on Big Data | 2015

System and architecture level characterization of big data applications on big and little core server architectures

Maria Malik; Setareh Rafatirah; Avesta Sasan; Houman Homayoun

Emerging big data applications require a significant amount of server computational power. Big data analytics applications rely heavily on specific deep machine learning and data mining algorithms, and exhibit high computational, memory, I/O, and control intensity. They require computing resources that can efficiently scale to manage massive amounts of diverse data. However, the rapid growth in data makes it challenging to process data efficiently on current server architectures such as big Xeon cores. Furthermore, physical design constraints, such as power and density, have become the dominant limiting factor for scaling out servers. Recent work therefore advocates the use of low-power embedded cores in servers, such as little Atom cores, to address these challenges. In this work, through methodical investigation of power and performance measurements, and comprehensive system-level and micro-architectural analysis, we characterize emerging big data applications on big Xeon- and little Atom-based server architectures. The characterization results across a wide range of real-world big data applications and various software stacks demonstrate how the choice of a big- vs. little-core-based server for energy efficiency is significantly influenced by the size of the data, performance constraints, and the presence of an accelerator. Furthermore, the microarchitecture-level analysis highlights where improvement is needed in big and little core microarchitectures.


International Conference on Big Data | 2015

Energy-efficient acceleration of big data analytics applications using FPGAs

Katayoun Neshatpour; Maria Malik; Mohammad Ali Ghodrat; Avesta Sasan; Houman Homayoun

A recent trend for big data analytics is to provide heterogeneous architectures that support hardware specialization. Considering the time dedicated to creating such hardware implementations, an analysis that estimates how much benefit is gained in speed and energy efficiency by offloading various functions to hardware is necessary. This work analyzes data mining and machine learning algorithms, which are utilized extensively in big data applications, on a heterogeneous CPU+FPGA platform. We select and offload the computationally intensive kernels to the hardware accelerator to achieve the highest speedup and best energy efficiency. We use the latest Xilinx Zynq boards for implementation and result analysis. We also perform a first-order comprehensive analysis of communication and computation overheads to understand how the speedup of each application contributes to its overall execution in an end-to-end Hadoop MapReduce environment. Moreover, we study how other system parameters, such as the choice of CPU (big vs. little) and the number of mapper slots, affect the performance and power-efficiency benefits of hardware acceleration. The results show that a kernel speedup of up to 321.5× can be achieved with hardware+software co-design. This yields a 2.72× speedup, a 2.13× power reduction, and a 15.21× energy-efficiency (EDP) improvement in an end-to-end Hadoop MapReduce environment.
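The gap between the 321.5× kernel speedup and the much smaller end-to-end speedup is a direct consequence of Amdahl's law: the unaccelerated parts of the Hadoop pipeline (communication, I/O, the remaining software stack) bound the overall gain. A minimal sketch, where the 63% accelerated-time fraction is an illustrative assumption rather than a number from the paper:

```python
def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """End-to-end speedup when only a fraction of runtime is accelerated."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / kernel_speedup)

# Even a 321.5x kernel speedup is bounded by the unaccelerated remainder.
# If ~63% of runtime were offloadable, the end-to-end ceiling would be:
print(f"{amdahl_speedup(0.63, 321.5):.2f}x end-to-end")
```

This is why the paper's overhead analysis of communication and computation matters: it determines the accelerated fraction, which dominates the end-to-end result far more than the raw kernel speedup does.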


IEEE Transactions on Very Large Scale Integration Systems | 2011

Inquisitive Defect Cache: A Means of Combating Manufacturing Induced Process Variation

Avesta Sasan; Houman Homayoun; Ahmed M. Eltawil; Fadi J. Kurdahi

This paper proposes a new fault-tolerant cache organization capable of dynamically mapping the in-use defective locations in a processor cache to an auxiliary parallel memory, creating a defect-free view of the cache for the processor. While voltage scaling has a super-linear effect on reducing power, it exponentially increases the defect rate in memory. The ability of the proposed cache organization to tolerate a large number of defects makes it an ideal candidate for voltage-scalable architectures, especially at smaller geometries where manufacturing-induced process variation (MIPV) is expected to increase rapidly. The introduced fault-tolerant architecture incurs little energy and area overhead while enabling the system to operate correctly and keeping performance close to that of a defect-free system. Power savings of over 40% are reported on standard benchmarks while performance degradation is kept below 1%.
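The core mechanism, redirecting accesses to known-defective locations into a small auxiliary memory so the processor sees a defect-free cache, can be sketched as a lookup-before-access filter. This is a software analogy under simplified assumptions, not the hardware design:

```python
class InquisitiveDefectCache:
    """Toy model: defective cache lines are remapped to a small auxiliary
    store so the processor never touches a faulty location. A software
    analogy of the hardware mechanism, not its actual implementation."""

    def __init__(self, size, defective_lines):
        self.main = [0] * size
        self.defective = set(defective_lines)  # e.g. found by low-voltage test
        self.aux = {}                          # auxiliary parallel memory

    def write(self, line, value):
        if line in self.defective:
            self.aux[line] = value             # redirect to the defect cache
        else:
            self.main[line] = value

    def read(self, line):
        if line in self.defective:
            return self.aux.get(line, 0)
        return self.main[line]

cache = InquisitiveDefectCache(size=8, defective_lines={3})
cache.write(3, 0xBEEF)   # lands in the auxiliary memory
cache.write(5, 0xCAFE)   # lands in the main array
print(hex(cache.read(3)), hex(cache.read(5)))  # defect-free view
```

In hardware, the defect lookup happens in parallel with the main-array access, which is what keeps the performance penalty near zero.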


High Performance Embedded Architectures and Compilers | 2010

RELOCATE: register file local access pattern redistribution mechanism for power and thermal management in out-of-order embedded processor

Houman Homayoun; Aseem Gupta; Alexander V. Veidenbaum; Avesta Sasan; Fadi J. Kurdahi; Nikil D. Dutt

To reduce the register file's peak temperature in an embedded processor, we propose RELOCATE: an architectural solution that redistributes the access pattern to physical registers through a novel register allocation mechanism. RELOCATE regionalizes the register file such that, even though accesses within a region are uniformly distributed, activity is spread over the entire register file in a deterministic pattern. It partitions the register file and uses a micro-architectural mechanism to concentrate accesses in a single partition or a subset of partitions, keeping the remaining partitions unused (idle) and cooling down. The temperature of idle partitions is further reduced by power-gating them into a destructive sleep mode to cut their leakage power. The redistribution mechanism changes the active region periodically to modulate activity within the register file and prevent the active region from heating up excessively. Our approach resulted in an average reduction of 8.3°C in the register file's peak temperature on standard benchmarks.
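The redistribution idea can be sketched as allocating registers only from a currently active partition and rotating that partition periodically. The partition count and rotation period below are illustrative assumptions, not parameters from the paper:

```python
from itertools import cycle

class RelocatedRegisterFile:
    """Toy model of RELOCATE's activity redistribution: registers are
    allocated only from the active partition, and the active partition
    rotates periodically so the others stay idle (and power-gated) long
    enough to cool down. Sizes and the period are illustrative."""

    def __init__(self, num_regs=64, num_partitions=4, period=1000):
        self.partition_size = num_regs // num_partitions
        self.partitions = cycle(range(num_partitions))
        self.active = next(self.partitions)
        self.period = period
        self.allocations = 0

    def allocate(self):
        """Return a physical register index from the active partition only."""
        slot = self.allocations % self.partition_size
        reg = self.active * self.partition_size + slot
        self.allocations += 1
        if self.allocations % self.period == 0:
            self.active = next(self.partitions)  # rotate the active region
        return reg

rf = RelocatedRegisterFile()
first = rf.allocate()  # drawn from partition 0 until the period elapses
```

Because the rotation is deterministic, heat is spread evenly across the physical register file rather than concentrating in the most frequently allocated entries.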


IEEE Transactions on Very Large Scale Integration Systems | 2011

MZZ-HVS: Multiple Sleep Modes Zig-Zag Horizontal and Vertical Sleep Transistor Sharing to Reduce Leakage Power in On-Chip SRAM Peripheral Circuits

Houman Homayoun; Avesta Sasan; Alexander V. Veidenbaum; Hsin-Cheng Yao; Shahin Golshan; Payam Heydari

Recent studies show that peripheral circuits (including decoders, wordline drivers, and input and output drivers) constitute a large portion of cache leakage. In addition, as technology migrates to smaller geometries, the leakage contribution to total power consumption grows faster than dynamic power, indicating that leakage will be a major contributor to overall power consumption. This paper presents zig-zag share, a circuit technique that reduces leakage in SRAM peripherals by putting them into a low-leakage sleep mode. The zig-zag share circuit is further extended to enable multiple sleep modes for cache peripherals, each representing a trade-off between leakage reduction and wake-up delay. Using architectural control of these multiple sleep modes, an integrated technique called MSleep-Share is proposed and applied to the L1 and L2 caches. MSleep-Share relies on cache miss information to guide the leakage control mechanism and switch the power mode of the peripheral circuits. The results show leakage reductions of up to 40× in deeply pipelined SRAM peripheral circuits, with small area overhead and small additional delay. This noticeable leakage reduction translates to up to an 85% reduction in overall on-chip memory leakage.
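The trade-off between the multiple sleep modes, where deeper modes save more leakage but take longer to wake, can be sketched as choosing the deepest mode whose wake-up delay fits within the expected idle window (e.g., the latency of the cache miss that triggered the sleep). The mode names and numbers below are illustrative assumptions, not the paper's circuit parameters:

```python
# (name, leakage_fraction_saved, wakeup_delay_cycles) per mode; illustrative.
SLEEP_MODES = [
    ("shallow", 0.30, 2),
    ("deep",    0.70, 10),
    ("deepest", 0.95, 40),
]

def pick_sleep_mode(expected_idle_cycles):
    """Pick the deepest mode whose wake-up penalty hides in the idle window.
    Returns None (stay awake) if no mode's wake-up fits."""
    best = None
    for name, _saving, wakeup in SLEEP_MODES:
        if wakeup < expected_idle_cycles:
            best = name
    return best

# A short idle window (e.g. an L1 miss serviced by L2) only justifies a
# shallow mode; a long one (an L2 miss to DRAM) justifies the deepest mode.
print(pick_sleep_mode(8), pick_sleep_mode(200))
```

This mirrors the paper's use of cache miss information as the architectural signal: the miss type is a cheap predictor of how long the peripherals will sit idle.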


Design, Automation, and Test in Europe | 2017

Big vs little core for energy-efficient Hadoop computing

Maria Malik; Katayoun Neshatpour; Tinoosh Mohsenin; Avesta Sasan; Houman Homayoun

The rapid growth in data makes it challenging to process data efficiently on current high-performance server architectures such as big Xeon cores. Furthermore, physical design constraints, such as power and density, have become the dominant limiting factor for scaling out servers. Heterogeneous architectures that combine big Xeon cores with little Atom cores have emerged as a promising solution for enhancing energy efficiency by allowing each application to run on an architecture that matches its resource needs more closely than a one-size-fits-all architecture. The question of whether to map an application to the big Xeon or the little Atom cores in a heterogeneous server architecture therefore becomes important. In this paper, we characterize Hadoop-based applications and their corresponding MapReduce tasks on big Xeon- and little Atom-based server architectures to understand how the choice of big vs. little cores is affected by various parameters at the application, system, and architecture levels, and by the interplay among these parameters. Furthermore, we evaluate the operational and capital costs to understand how performance, power, and area constraints for big data analytics affect the choice of a big- vs. little-core server as the more cost- and energy-efficient architecture.


International Symposium on Performance Analysis of Systems and Software | 2016

Characterizing Hadoop applications on microservers for performance and energy efficiency optimizations

Maria Malik; Avesta Sasan; Rajiv V. Joshi; Setareh Rafatirah; Houman Homayoun

Traditional low-power embedded processors such as Atom and ARM are entering the high-performance server market. At the same time, as the size of data grows, emerging big data applications require more and more server computational power, which makes it challenging to process data energy-efficiently on current high-performance server architectures. Furthermore, physical design constraints, such as power and density, have become the dominant limiting factor for scaling out servers. Numerous big data applications rely on the Hadoop MapReduce framework to perform their analysis on large-scale datasets. Since Hadoop configuration parameters as well as architecture parameters directly affect MapReduce job performance and energy efficiency, tuning system- and architecture-level parameters is vital to maximizing energy efficiency. In this work, through methodical investigation of performance and power measurements, we demonstrate how the interplay among various Hadoop configurations and system- and architecture-level parameters affects performance and energy efficiency across various Hadoop applications.


IEEE Transactions on Very Large Scale Integration Systems | 2011

Reducing Power in All Major CAM and SRAM-Based Processor Units via Centralized, Dynamic Resource Size Management

Houman Homayoun; Avesta Sasan; Jean-Luc Gaudiot; Alexander V. Veidenbaum

Power minimization has become a primary concern in microprocessor design. In recent years, many circuit and micro-architectural innovations have been proposed to reduce power in individual processor units. However, many of these prior efforts concentrate on approaches that require considerable redesign and verification effort, and whether these techniques can be combined has not been investigated. The challenge, therefore, is to find a centralized and simple algorithm that can address power in more than one unit, and ultimately the entire chip, with the least redesign and verification effort, the lowest possible design risk, and the least hardware overhead. This paper proposes such a centralized approach, which attempts to simultaneously reduce power in the processor units with the highest dissipation: the reorder buffer, instruction queue, load/store queue, and register files. It is based on the observation that the utilization of these units varies significantly during cache miss periods. We therefore propose to dynamically adjust the size, and thus the power dissipation, of these resources during such periods. The circuit-level modifications required for this resource adaptation are presented. Simulation results show substantial power reduction at the cost of negligible performance impact and small hardware overhead.
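The adaptation policy can be sketched as a simple rule: while a long-latency cache miss is in flight and occupancy is low, shrink the resource and power-gate the unused entries; otherwise run at full size. The sizes and thresholds below are illustrative assumptions, not the paper's design parameters:

```python
def active_size(base_size, min_size, in_l2_miss, occupancy):
    """Toy policy in the spirit of the paper: downsize a queue/buffer while
    a long-latency cache miss is being serviced (when its utilization
    drops), and restore the full size otherwise. Numbers are illustrative."""
    if in_l2_miss and occupancy <= min_size:
        return min_size   # power-gate the unused upper entries
    return base_size

# A 128-entry reorder buffer downsized to 32 entries during an L2 miss
# when only 20 entries are actually occupied; full size otherwise:
print(active_size(128, 32, in_l2_miss=True, occupancy=20))   # shrunk
print(active_size(128, 32, in_l2_miss=False, occupancy=20))  # full size
```

The occupancy guard is what keeps the performance impact negligible: the resource is only shrunk when the in-flight miss has already drained its utilization below the reduced size.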

Collaboration


Dive into Avesta Sasan's collaborations.

Top Co-Authors

Maria Malik

George Mason University


Nisarg Patel

George Mason University
