Shounak Chakraborty
Indian Institute of Technology Guwahati
Publications
Featured research published by Shounak Chakraborty.
ACM Symposium on Applied Computing | 2015
Hemangee K. Kapoor; Shirshendu Das; Shounak Chakraborty
With the rapid growth in semiconductor technology, modern processor chips have multiple processor cores with multi-level on-chip caches. A recent study of chip power consumption indicates that most of the chip power is consumed by the on-chip caches, and that this power can be divided into two major parts: dynamic power and static power. Dynamic power is consumed when the cache is accessed, and static power is generally referred to as the leakage power of the cache. The increased power consumption raises the chip temperature, which in turn increases on-chip leakage power. In this paper we attempt to reduce static power consumption by intelligently powering off cache banks and mapping their requests to other active cache banks. We use a performance-based criterion for the shutdown decision, and the bank to be powered off is chosen based on usage statistics. The remapping of requests for a powered-off cache bank is done at the L2 controller, so the approach is transparent to the L1 caches. Thus, depending on the application's workload and data distribution, a controlled number of banks can be dynamically shut down, saving leakage power. Experimental analysis shows a 43% reduction in static power and a 19% reduction in energy-delay product (EDP).
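The shutdown-and-remap idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the access counts, the least-used selection rule, the choice of the least-used remaining bank as the remap target, and all function names are assumptions made for illustration. The performance check that gates a shutdown decision is omitted.

```python
def least_used(banks, access_counts):
    """Return the bank with the fewest recorded accesses (ties: lowest id)."""
    return min(sorted(banks), key=lambda b: access_counts[b])


def power_off_bank(bank, active, access_counts, remap):
    """Power off `bank` and forward its requests to another active bank."""
    if bank not in active or len(active) < 2:
        raise ValueError("need an active bank and at least one other active bank")
    active.remove(bank)
    target = least_used(active, access_counts)  # assumed target choice
    remap[bank] = target
    # Banks that were already forwarding to `bank` now forward to its target.
    for src, dst in list(remap.items()):
        if dst == bank:
            remap[src] = target
    return target


def serving_bank(bank, remap):
    """Bank that services a request addressed to `bank` (lookup at the L2 controller)."""
    return remap.get(bank, bank)


# Hypothetical per-bank access counts from one monitoring interval.
access_counts = {0: 900, 1: 40, 2: 350, 3: 120}
active = {0, 1, 2, 3}
remap = {}

victim = least_used(active, access_counts)
target = power_off_bank(victim, active, access_counts, remap)
```

Because the lookup happens only in `serving_bank`, a requester keeps addressing the original bank, which mirrors the abstract's point that the L1 caches need not be aware of the remapping.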
International Parallel and Distributed Processing Symposium | 2015
Shounak Chakraborty; Shirshendu Das; Hemangee K. Kapoor
Most chip multiprocessors (CMPs) share a common, large last-level cache (LLC). In non-uniform cache access based architectures, the LLC is divided into multiple banks that can be accessed independently. It has been observed that most of the chip power in a CMP is consumed by the LLC banks, and that this power can be divided into two major parts: dynamic and static. Techniques have been proposed to reduce the static power consumption of the LLC by powering off the less utilized banks and forwarding their requests to other active banks (target banks). Once a bank is powered off, all future requests still arrive at its controller and are forwarded to the target bank. Such a bank shutdown saves static power but reduces LLC performance. When multiple banks are shut down, the target banks may also become overloaded. Additionally, request forwarding increases on-chip traffic. In this paper, we improve the performance of the target banks by dynamically managing their associativity. The cost of request forwarding is reduced by considering network distance as an additional metric for target selection. These two strategies help to limit performance degradation. Experimental analysis shows a 43% reduction in static energy and a 23% reduction in EDP for a 4MB LLC with a performance constraint of 3%.
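One way to picture "network distance as an additional metric for target selection" is the Python sketch below. The abstract does not give the actual cost function, so the linear combination of load and hop count, the `hop_weight` parameter, the 2D mesh coordinates, and the sample numbers are all hypothetical.

```python
def hops(a, b):
    """Manhattan distance between two tiles of a 2D mesh, as (row, col) pairs."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def select_target(source, active, position, load, hop_weight):
    """Pick the active bank with the lowest combined load-and-distance cost."""
    def cost(bank):
        return load[bank] + hop_weight * hops(position[source], position[bank])
    return min(sorted(active), key=cost)


# Hypothetical layout: bank 0 is being powered off; banks 1-3 stay active.
position = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (3, 3)}
load = {1: 500, 2: 520, 3: 100}
active = {1, 2, 3}

near = select_target(0, active, position, load, hop_weight=200)
far = select_target(0, active, position, load, hop_weight=0)
```

With distance ignored (`hop_weight=0`) the lightly loaded but distant bank 3 is chosen; once hops are weighted, the adjacent bank 1 is preferred, trading some load balance for less forwarding traffic.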
IFIP/IEEE International Conference on Very Large Scale Integration | 2016
Shounak Chakraborty; Hemangee K. Kapoor
The increased power density, together with short-channel effects in modern transistors, significantly increases the leakage energy consumption of on-chip Last Level Caches (LLCs) in recent Chip Multi-Processors (CMPs). Performance-linked dynamic shrinking of the LLC size is a promising option for reducing cache leakage. Prior works attempt to reduce cache leakage by predicting the Working Set Size (WSS) of the applications and putting some cache portions in low-power mode. This paper aims to reduce leakage energy by combining cache bank shutdown and way shutdown. Banks with minimal usage are candidates for shutdown. In banks with average usage, some ways are turned off to save leakage. To mitigate the impact of the smaller set size, we apply a dynamic associativity management technique. Experimental evaluation using full-system simulation on a 4MB 8-way set-associative L2 cache gives 70% average savings in static energy and 35% average savings in EDP. If an application's cache demand increases, some ways can be turned back on to maintain performance.
VLSI Design and Test | 2014
Narendra Kumar Meena; Hemangee K. Kapoor; Shounak Chakraborty
Network on chip (NoC) provides a fast and scalable interconnect for communication between many IP cores and Systems on Chip (SoCs). As the number of on-chip elements increases to meet the demand for high-performance computing, a scalable and efficient communication infrastructure is required for higher levels of integration, which has led to 3D NoCs. Multicast communication provides a better solution for many cache coherence protocols and parallel algorithms. This paper proposes a New Recursive Partitioning Multicast Routing Algorithm (3D-RPM), along with its optimized version, a New Optimized Recursive Partitioning Multicast Routing Algorithm (3D-ORPM), for 3D mesh networks. Simulation results show a reduction of around 8-13% in percentage link utilization and link power consumed for the proposed approach compared to the tree-based 3D-XYZ multicast routing algorithm. Results show that the approach scales to larger networks as well as to large numbers of multicast destinations.
Great Lakes Symposium on VLSI | 2018
Ashwini A. Kulkarni; Shounak Chakraborty; Shrinivas P. Mahajan; Hemangee K. Kapoor
Heavy leakage power consumption of on-chip last level caches (LLCs) has become the primary obstacle in architecting chip multi-processors (CMPs) in recent times. Because leakage power is directly related to the supply voltage, dynamic voltage scaling (DVS) in the LLC banks, driven by periodic access profiles, is a promising option for reducing this cache leakage. Many prior attempts have reduced leakage by anticipating the working set size (WSS) of the applications and putting some portions of the cache banks in low-power mode. The proposed work aims to reduce leakage by putting a whole LLC bank into a low-power (snoozy) mode, applying DVS to the cache banks with minimal usage. Additionally, the performance impact of the snoozy mode is alleviated by returning some snoozy banks to active mode on demand. Experimental evaluations using full-system simulation on a multi-banked 2MB 8-way set-associative L2 cache show 10% more leakage savings on average than a prior drowsy technique.
International Conference on VLSI Design | 2017
Shounak Chakraborty; Hemangee K. Kapoor
To meet ever-increasing processing demand, modern Chip Multi-Processors (CMPs) integrate more components. This leads to increased on-chip power density, resulting in prohibitive chip temperatures. The increased chip temperature not only raises the cooling cost but also increases leakage power dissipation, which in turn raises the chip temperature further. On-chip caches, which occupy the largest on-chip area, are usually assumed to be a cooler portion of the chip. However, this is a misconception: a recent study shows 30°C spatial temperature variance in modern large on-chip caches. This paper attempts to reduce the effective chip temperature by intelligently turning some on-chip Last Level Cache (LLC) banks off and on in a tiled CMP during process execution. The turned-off cache banks act as on-chip thermal buffers, helping to reduce the effective chip temperature. The resizing decision is based on generated cache hotspots or performance. With negligible performance overhead, simulation results show a reduction of 4°C in chip temperature with 52% maximum savings in cache leakage.
Microprocessors and Microsystems | 2017
Shounak Chakraborty; Hemangee K. Kapoor
Advancement in semiconductor technology increases power density in recent Chip Multi-Processors (CMPs), which significantly increases the leakage energy consumption of on-chip Last Level Caches (LLCs). Performance-linked dynamic tuning of the LLC size is a promising option for reducing cache leakage. This paper reduces static power consumption by dynamically shutting down or turning on cache banks based on system performance and cache bank usage statistics. Shutting down a cache bank remaps its future requests to another active bank, called the target bank. The proposed method is evaluated under three implementation policies: (1) The system can decide to shut down or turn on some cache banks periodically throughout process execution. (2) The system allows banks to be shut down initially, but once bank restarting begins, no further shutdown is permitted. (3) The cache is resized as in the first policy, but with some predefined time slices during which the cache cannot be resized. For a 4MB 4-way set-associative L2 cache, experimental analysis shows a 66% reduction in static energy with a 29% gain in Energy Delay Product (EDP) for the first policy; for the second policy, static power is reduced by 59% with 27% savings in EDP. Finally, the last policy saves 65% in static power and 30% in EDP with minimal performance penalty.
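For readers unfamiliar with the metric, Energy Delay Product is the product of energy consumed and execution time, so a technique that saves energy at the cost of a slowdown is penalised for the slowdown. The short Python sketch below shows the arithmetic. The input numbers are hypothetical and are not taken from the paper, and the function names are illustrative.

```python
def edp(energy, delay):
    """Energy-delay product: energy consumed multiplied by execution time."""
    return energy * delay


def edp_gain(base_energy, base_delay, new_energy, new_delay):
    """Fractional EDP reduction relative to the baseline configuration."""
    return 1.0 - edp(new_energy, new_delay) / edp(base_energy, base_delay)


# Hypothetical run: 30% less energy overall, but execution is 3% slower.
gain = edp_gain(base_energy=10.0, base_delay=1.0, new_energy=7.0, new_delay=1.03)
print(f"EDP gain: {gain:.1%}")
```

In this made-up case the 3% slowdown trims the 30% energy saving to an EDP gain of about 27.9%.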
ACM Symposium on Applied Computing | 2016
Shounak Chakraborty; Shirshendu Das; Hemangee K. Kapoor
Rapid growth in semiconductor technology permits the integration of multiple processor cores with multi-level on-chip caches. Integrating more on-chip components increases the on-chip power density. According to recent studies, on-chip caches are the principal contributors to the total power consumed by the chip. This cache power consumption can be divided into two major parts: dynamic power and static power. Dynamic power is consumed during cache accesses, and static power is referred to as the leakage power of the cache. The increased power consumption raises the effective chip temperature, which in turn increases the leakage power. In this paper we attempt to reduce static power consumption by powering off cache ways in the cache banks of a Tiled DNUCA cache. We use a bank-utilisation-based criterion for the way shutdown decision. The number of ways to be turned off in a bank is chosen based on the bank's usage statistics. The contents of the powered-off cache ways are written back to main memory. Thus, depending on the application's working set size and data distribution, a controlled number of ways from a set of banks can be dynamically shut down to save leakage power. For a 4MB 8-way L2 Tiled DNUCA cache, experimental analysis shows a 17% reduction in EDP and a 33% reduction in static power. The powered-off ways are also aligned, simplifying the gating circuitry.
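A utilisation-driven way shutdown decision of the kind described here might look like the following Python sketch. The thresholds, the number of ways gated at each level, and the choice to gate the highest-numbered ways as one contiguous group are illustrative assumptions. The abstract does not specify the actual criterion or how the ways are aligned, and the write-back of gated ways to main memory is not modelled.

```python
def ways_to_power_off(utilisation, total_ways=8,
                      thresholds=((0.25, 4), (0.50, 2))):
    """Number of ways to gate in a bank, given its utilisation in [0, 1].

    `thresholds` holds (upper_limit, ways_off) pairs in ascending order;
    both the limits and the counts are hypothetical.
    """
    for limit, ways_off in thresholds:
        if utilisation < limit:
            # Always keep at least one way powered on.
            return min(ways_off, total_ways - 1)
    return 0


def active_way_mask(total_ways, ways_off):
    """Bitmask of powered-on ways; the highest-numbered ways are gated together."""
    return (1 << (total_ways - ways_off)) - 1


# Hypothetical utilisation of three banks of an 8-way cache.
for bank, utilisation in {0: 0.10, 1: 0.30, 2: 0.90}.items():
    off = ways_to_power_off(utilisation)
    print(bank, off, format(active_way_mask(8, off), "08b"))
```

Gating a contiguous group of ways in every set of a bank means a single mask per bank describes which ways remain powered on.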
VLSI Design and Test | 2015
Shounak Chakraborty; Shirshendu Das; Hemangee K. Kapoor
Most chip multiprocessors share a large last-level cache (LLC), which is divided into multiple banks in NUCA-based architectures. A recent study of LLC power consumption indicates that the LLC consumes most of the chip power. The LLC power consumption can be divided into two major parts: dynamic power and static power. Techniques have been proposed to reduce static power by powering off some less utilized cache portions. However, powering off part of the cache can degrade system performance. In this paper, we reduce cache power consumption by shutting down some cache ways of less utilized cache sets and then applying a victim retention (VR) technique in the remaining portion to reduce cache misses. Experimental analysis shows a 35% reduction in static power and an 11.31% reduction in EDP, on average, for a 2MB LLC with negligible change in performance.
International Conference on Electronics and Communication Systems (ICECS) | 2014
Shounak Chakraborty; Dipika Deb; Dhantu Buragohain; Hemangee K. Kapoor
Minimizing the power consumption of Chip Multiprocessors has drawn researchers' attention in recent years. A single chip contains a number of processor cores and equally large caches. According to recent research, on-chip caches consume the largest share of the total power consumed by the chip. Reducing the on-chip cache size may reduce on-chip power consumption, but it will degrade performance. In this paper we study the effect of reducing cache capacity on power and performance. We reduce the number of available cache banks and observe the effect on dynamic and static energy. Experimental evaluation shows that, for most of the benchmarks, we get a significant reduction in static energy, which can help control chip temperature. We use CACTI and a full-system simulator for our experiments.