Mohammad Javad Dousti

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Mohammad Javad Dousti is active.

Explore More

Publication

Featured researches published by Mohammad Javad Dousti.

design, automation, and test in europe | 2012

Minimizing the latency of quantum circuits during mapping to the ion-trap circuit fabric

Mohammad Javad Dousti; Massoud Pedram

Quantum computers are exponentially faster than their classical counterparts in terms of solving some specific, but important problems. The biggest challenge in realizing a quantum computing system is the environmental noise. One way to decrease the effect of noise (and hence, reduce the overhead of building fault tolerant quantum circuits) is to reduce the latency of the quantum circuit that runs on a quantum circuit. In this paper, a novel algorithm is presented for scheduling, placement, and routing of a quantum algorithm, which is to be realized on a target quantum circuit fabric technology. This algorithm, and the accompanying software tool, advances state-of-the-art in quantum CAD methodologies and methods while considering key characteristics and constraints of the ion-trap quantum circuit fabric. Experimental results show that the presented tool improves results of the previous tool by about 41%.

international symposium on low power electronics and design | 2014

Therminator: a thermal simulator for smartphones producing accurate chip and skin temperature maps

Qing Xie; Mohammad Javad Dousti; Massoud Pedram

Maintaining safe chip and device skin temperatures in small form-factor mobile devices (such as smartphones and tablets) while continuing to add new functionalities and provide higher performance has emerged as a key challenge. This paper presents Therminator, an early stage, fast, full-device thermal analyzer, which generates accurate steady-state temperature maps of the entire smartphone starting from the Application Processor and other key device components, extending to the skin of the device itself. The thermal analysis is sensitive to detailed device specifications (including its material composition and 3-D layout) as well as different use cases (each case specifying the set of active device components and their activity levels). Therminator considers all major components within the device, builds a corresponding compact thermal model for each component and the whole device, and produces their steady-state temperature maps. Temperature results obtained by using Therminator have been validated against a commercial computational fluid dynamics-based tool, i.e., Autodesk Simulation CFD, and thermocouple measurements on a Qualcomm Mobile Developer Platform. A case study on a Samsung Galaxy S4 using Therminator is provided to relate the device performance to the skin temperature and investigate the thermal path design.

ieee computer society annual symposium on vlsi | 2014

5nm FinFET Standard Cell Library Optimization and Circuit Synthesis in Near-and Super-Threshold Voltage Regimes

Qing Xie; Xue Lin; Yanzhi Wang; Mohammad Javad Dousti; Alireza Shafaei; Majid Ghasemi-Gol; Massoud Pedram

FinFET device has been proposed as a promising substitute for the traditional bulk CMOS-based device at the nanoscale, due to its extraordinary properties such as improved channel controllability, high ON/OFF current ratio, reduced short-channel effects, and relative immunity to gate line-edge roughness. In addition, the near-ideal subthreshold behavior indicates the potential application of FinFET circuits in the near-threshold supply voltage regime, which consumes an order of magnitude less energy than the regular strong-inversion circuits operating in the super-threshold supply voltage regime. This paper presents a design flow of creating standard cells by using the FinFET 5nm technology node, including both near-threshold and super-threshold operations, and building a Liberty-format standard cell library. The circuit synthesis results of various combinational and sequential circuits based on the 5nm FinFET standard cell library show up to 40X circuit speed improvement and three orders of magnitude energy reduction compared to those of 45nm bulk CMOS technology.

IEEE Transactions on Circuits and Systems Ii-express Briefs | 2015

Performance Comparisons Between 7-nm FinFET and Conventional Bulk CMOS Standard Cell Libraries

Qing Xie; Xue Lin; Yanzhi Wang; Shuang Chen; Mohammad Javad Dousti; Massoud Pedram

FinFET devices have been proposed as a promising substitute for conventional bulk CMOS-based devices at the nanoscale due to their extraordinary properties such as improved channel controllability, a high on/off current ratio, reduced short-channel effects, and relative immunity to gate line-edge roughness. This brief builds standard cell libraries for the advanced 7-nm FinFET technology, supporting multiple threshold voltages and supply voltages. The circuit synthesis results of various combinational and sequential circuits based on the presented 7-nm FinFET standard cell libraries forecast 10× and 1000× energy reductions on average in a superthreshold regime and 16× and 3000× energy reductions on average in a near-threshold regime as compared with the results of the 14-nm and 45-nm bulk CMOS technology nodes, respectively.

design automation conference | 2014

Power-Aware Deployment and Control of Forced-Convection and Thermoelectric Coolers

Mohammad Javad Dousti; Massoud Pedram

Advances in the thermoelectric cooling technology have made it one of the promising solutions for spot cooling in VLSI circuits. Thermoelectric coolers (TECs) generate heat during their operation. This heat plus the heat generated in the circuit should be transferred to the ambient environment in order to avoid high die temperatures. This paper describes a hybrid cooling solution in which TECs are augmented with forced-convection coolers (fans). Precisely, an optimization framework called OFTEC is presented which finds the optimum TEC driving current and the fan speed to minimize the overall power consumption of the cooling system while maintaining safe die temperatures. Simulation results on a set of eight benchmarks show the benefits of the proposed approach. In particular, a baseline system without TECs but with a fan could meet the thermal constraint for only three of the benchmarks whereas the OFTEC solution satisfied thermal constraints for all benchmarks. In addition, OFTEC resulted in 5.4% less average power consumption for the aforesaid three benchmarks while lowering the maximum die temperature by an average of 3.7°C.

international symposium on low power electronics and design | 2013

Platform-dependent, leakage-aware control of the driving current of embedded thermoelectric coolers

Mohammad Javad Dousti; Massoud Pedram

One of the biggest stumbling blocks for the successful continuation of the Moores law is the substrate temperature of VLSI circuits. Thermoelectric cooling is one of the promising cooling methods to combat high die temperatures. This method provides key benefits such as compactness, high reliability, and exceptionally high heat-pumping capability. On the other hand, even with the recent advances in the fabrication techniques, thermoelectric coolers (TECs) are suffering from a poor coefficient of performance (COP), which denotes the ratio of heat removed per second to the power needed to drive the TEC, is rather low. In this paper, different techniques to improve the performance of a TEC, when it is embedded inside a processor package, are investigated. In particular, first the COP of TECs is reformulated to consider the leakage power, which is exponentially dependent on the die temperature. Next it is demonstrated that the TEC driving current that yields the maximum decrease in the die temperature is quite different from the one that runs the TEC in its highest COP state. Based on these observations, a platform-dependent, leakage-aware cooling policy in which the TEC driving current is set based on the target specs (high-performance vs. low-power) and actual conditions of the chip (emergency vs. preventive thermal management) is proposed. Experimental results show that, with this policy, one can reduce the temperature of chip hotspots while achieving a high COP.

design, automation, and test in europe | 2015

Accurate electrothermal modeling of thermoelectric generators

Mohammad Javad Dousti; Antonio Petraglia; Massoud Pedram

Thermoelectric generators (TEGs) provide a unique way for harvesting thermal energy. These devices are compact, durable, inexpensive, and scalable. Unfortunately, the conversion efficiency of TEGs is low. This requires careful design of energy harvesting systems including the interface circuitry between the TEG module and the load, with the purpose of minimizing power losses. In this paper, it is analytically shown that the traditional approach for estimating the internal resistance of TEGs may result in a significant loss of harvested power. This drawback comes from ignoring the dependence of the electrical behavior of TEGs on their thermal behavior. Accordingly, a systematic method for accurately determining the TEG input resistance is presented. Next, through a case study on automotive TEGs, it is shown that compared to prior art, more than 11% of power losses in the interface circuitry that lies between the TEG and the electrical load can be saved by the proposed modeling technique. In addition, it is demonstrated that the traditional approach would have resulted in a deviation from the target regulated voltage by as much as 59%.

great lakes symposium on vlsi | 2014

Squash: a scalable quantum mapper considering ancilla sharing

Mohammad Javad Dousti; Alireza Shafaei; Massoud Pedram

Quantum algorithms for solving problems of interesting size often result in circuits with a very large number of qubits and quantum gates. Fortunately, these algorithms also tend to contain a small number of repetitively-used quantum kernels. Identifying the quantum logic blocks that implement such quantum kernels is critical to the complexity management for realizing the corresponding quantum circuit. Moreover, quantum computation requires some type of quantum error correction coding to combat decoherence, which in turn results in a large number of ancilla qubits in the circuit. Sharing the ancilla qubits among quantum operations (even though this sharing can increase the overall circuit latency) is important in order to curb the resource demand of the quantum algorithm. This paper presents a multi-core reconfigurable quantum processor architecture, called Requp, which supports a layered approach to mapping a quantum algorithm and ancilla sharing. More precisely, a scalable quantum mapper, called Squash, is introduced, which divides a given quantum circuit into a number of quantum kernels--each kernel comprises k parts such that each part will run on exactly one of k available cores. Experimental results demonstrate that Squash can handle large-scale quantum algorithms while providing an effective mechanism for sharing ancilla qubits.

design automation conference | 2013

LEQA: latency estimation for a quantum algorithm mapped to a quantum circuit fabric

Mohammad Javad Dousti; Massoud Pedram

This paper presents LEQA, a fast latency estimation tool for evaluating the performance of a quantum algorithm mapped to a quantum fabric. The actual quantum algorithm latency can be computed by performing detailed scheduling, placement and routing of the quantum instructions and qubits in a quantum operation dependency graph on a quantum circuit fabric. This is, however, a very expensive proposition that requires large amounts of processing time. Instead, LEQA, which is based on computing the neighborhood population counts of qubits, can produce estimates of the circuit latency with good accuracy (i.e., an average of less than 3% error) with up to two orders of magnitude speedup for mid-size benchmarks. This speedup is expected to increase superlinearly as a function of circuit size (operation count).

design, automation, and test in europe | 2015

Power-efficient control of thermoelectric coolers considering distributed hot spots

Mohammad Javad Dousti; Massoud Pedram

Thermoelectric coolers are compact devices that can target hot spots on a VLSI die. These devices are connected electrically in series and controlled together, i.e., all are ON or OFF at the same time. However, spatial and temporal distributions of hot spots on a VLSI die are non-uniform, and therefore, activating all of TECs to address one or a few localized hot spots is not economical. This traditional technique indeed leads to a significant power waste. This paper suggests that adjacent hot spots with the same thermal behavior can be grouped and controlled by a cluster of TECs. A bypass switch for each TEC cluster is added in order to allow selectively turning OFF some TEC clusters which are needed. More precisely, a clustering problem is formulated which aims to minimize the power waste due to excessive use of TECs. Due to the large number of variables in problems of interesting sizes, a greedy heuristic method for solving the problem is introduced. It is shown that the proposed heuristic can reduce the wasted power on average by 81% and also decrease the total TEC power consumption on average by 42%.

Explore More