Nishi Ahuja
Intel
Publication
Featured research published by Nishi Ahuja.
Semiconductor Thermal Measurement and Management Symposium | 2012
Nishi Ahuja
In a typical datacenter, almost 40% of the total power consumption is spent on datacenter cooling. In addition, the capital expenditure for the cooling infrastructure is also significant. Large Internet portal datacenters are looking at every possible way to reduce cooling cost. One emerging trend in the industry is the move to higher-ambient datacenter operation; some datacenter operators even want to operate at ambient temperatures as high as 40°C. It is shown that both the server power increase and the facility-level cooling power savings must be considered to determine net power savings at the datacenter level. CFD modeling is used to demonstrate that following airflow-management best practices (blanking panels, proper floor layout, eliminating cable obstructions, hot/cold aisle containment, and reducing bypass and recirculation) is an essential first step in getting ready for high-ambient datacenter operation. It is shown that without these practices, inlet air temperature varies widely from one server to another, creating hotspots that force the thermostats on CRAC units to be set low. It is also shown that CFD modeling can quantify how cooling path management improves with hot/cold aisle containment. The study shows that significant datacenter-level power savings can be achieved by operating the datacenter at up to 35°C with the use of economizers.
Semiconductor Thermal Measurement and Management Symposium | 2011
Nishi Ahuja; Chuck Rego; Sandeep Ahuja; Matt Warner; Akhil Docca
Advances in server technology have resulted in the cost of acquiring server equipment trending down, while economies of scale in data centers have significantly reduced the cost of labor. This leaves the cost of energy as the next target for optimization. Energy costs are driven by operating the IT equipment, the switchgear that provides uninterrupted power to the equipment, and cooling the IT equipment. In a typical datacenter, almost 40% of the total power consumption is spent on cooling. In addition, cooling effectiveness is a first-order factor in determining the lifespan of the data center. One emerging trend in the industry is the move to higher-ambient datacenter operation, with some operators wanting to set supply air temperatures as high as 40°C while improving cooling system efficiency. This study shows that with optimized cooling control one can reduce the total cost of ownership at the datacenter level by optimizing the datacenter cooling budget while ensuring no performance loss at increased ambient temperatures. This paper describes a platform-assisted thermal management approach that uses new sensors providing server airflow and server outlet temperature to improve control of the data center's cooling solution. This data is also used as input to a computational fluid dynamics (CFD) model for accurate predictive analysis and optimization of future change scenarios, thus increasing data center efficiency and reducing power consumption. A key component of the study is the use of CFD analysis for optimizing the data center cooling system.
Semiconductor Thermal Measurement and Management Symposium | 2013
Nishi Ahuja; Charles W. Rego; Sandeep Ahuja; Shen Zhou; Saurabh Shrivastava
In a typical data center, a significant portion of the total power consumption is spent on data center cooling. This paper addresses the problem of cooling inefficiency resulting from the mismatch between the airflow required by the IT equipment and the airflow supplied by the facility fans. The paper proposes two new virtual sensors for a server: a volumetric airflow sensor and an outlet temperature sensor. A method for calculating server volumetric airflow in real time based on fan RPM values is outlined. The calculated volumetric airflow and outlet temperature are exposed to Intel Data Center Manager (DCM) using IPMI commands. DCM can then aggregate server airflow to compute airflow demand at the rack, row, or data center level. This information can be used to control facility fans to improve the cooling efficiency of the data center, a significant improvement over distributing expensive temperature and pressure sensors throughout the data center to control facility fans.
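The abstract does not give the RPM-to-airflow formula, but the idea can be sketched under two common assumptions: flow scales roughly linearly with fan speed (the first fan affinity law), and outlet temperature follows from an energy balance over the server. All names and constants below are illustrative assumptions, not the paper's actual model.

```python
# Hypothetical sketch of the paper's two virtual sensors. The constants
# (max RPM, max CFM per fan) would come from the fan vendor datasheet.

RHO_CP = 1.08  # imperial air factor: BTU/hr = 1.08 * CFM * dT(F)

def server_airflow_cfm(fan_rpms, max_rpm=12000.0, max_cfm_per_fan=25.0):
    """Estimate total volumetric airflow (CFM) from per-fan RPM readings,
    assuming flow scales linearly with speed (first fan affinity law)."""
    return sum(max_cfm_per_fan * (rpm / max_rpm) for rpm in fan_rpms)

def outlet_temp_f(inlet_temp_f, power_watts, airflow_cfm):
    """Virtual outlet-temperature sensor from an energy balance:
    dT = P / (rho * cp * Q); 3.412 converts watts to BTU/hr."""
    return inlet_temp_f + (power_watts * 3.412) / (RHO_CP * airflow_cfm)
```

A manager such as DCM could then sum `server_airflow_cfm` across servers in a rack or row to obtain the aggregate airflow demand the facility fans must meet.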
ASME 2015 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems collocated with the ASME 2015 13th International Conference on Nanochannels, Microchannels, and Minichannels | 2015
Sheng Kang; Guofeng Chen; Chun Wang; Ruiquan Ding; Jiajun Zhang; Pinyan Zhu; Chao Liu; Nishi Ahuja
With the advent of big data and cloud computing solutions, enterprise demand for servers is increasing, with especially high growth for Intel-based x86 server platforms. Today's datacenters are in constant pursuit of high-performance, high-availability computing coupled with low power consumption, low heat generation, and the ability to manage all of this through advanced telemetry data gathering. This paper showcases one such solution: an updated rack and server architecture that promises these improvements. The ability to manage server and data center power consumption and cooling more completely is critical to controlling datacenter costs and reducing PUE. Traditional Intel-based 1U and 2U form factor servers have existed in the data center for decades. These general-purpose x86 server designs from the major OEMs are, for all practical purposes, very similar in their power consumption and thermal output. Server power supplies and thermal designs have historically not been optimized for high efficiency. In addition, IT managers need more information about servers in order to optimize data center cooling and power use, so an improved server/rack design is needed that takes advantage of more efficient power supplies or PDUs and of more efficient means of cooling server compute resources than traditional internal server fans. This is the constant pursuit of corporations looking for new ways to improve efficiency and gain a competitive advantage. One way to optimize power consumption and improve cooling is a complete redesign of the traditional server rack: extracting the internal server power supplies and fans and centralizing them within the rack. This design achieves an entirely new low-power target by utilizing centralized, high-efficiency PDUs that power all servers within the rack.
Cooling is improved by using large, efficient rack-based fans to supply airflow to all servers, and by opening up the server design to allow greater airflow across server components. The centralized power supply breaks through traditional server power limits: rack-based PDUs can be operated at a more efficient load point, and combining online and offline modes within a single power supply, with cold backup, lets the data center achieve optimal power efficiency. In addition, unifying the mechanical structure and the thermal and PSU definitions within the rack solution allows IT to collect all server power and thermal information centrally for easier analysis and processing.
ASME 2015 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems collocated with the ASME 2015 13th International Conference on Nanochannels, Microchannels, and Minichannels | 2015
Yongzhan He; Guofeng Chen; Jiajun Zhang; Tianyu Zhou; Tao Liu; Pinyan Zhu; Chao Liu; Nishi Ahuja
The advent of the big data era, the rapid development of the mobile internet, and the rising demand for cloud computing services require increasingly more compute capability from data centers. This compute increase will most likely come from higher rack and room power densities or even construction of new Internet data centers. But an increase in a data center's business-critical IT equipment (servers, hubs, routers, wiring patch panels, and other network appliances), not to mention the infrastructure needed to keep these devices alive and protected, encroaches on another IT goal: reducing long-term energy usage. Large Internet data centers are looking at every possible way to reduce cooling cost and improve efficiency. One emerging trend in the industry is the move to higher-ambient data center operation and the use of air-side economizers. However, these two trends can have significant implications for corrosion risk in data centers. The prevailing practice surrounding data centers has often been "the colder, the better." However, some leading server manufacturers and data center efficiency experts share the opinion that data centers can run far hotter than they do today without sacrificing uptime, with large savings in both cooling-related costs and CO2 emissions. Why raise the temperature? Cooling a data center requires a large refrigeration system that is an energy hog, and the capital, maintenance, and operating costs of the cooling infrastructure are a heavy burden. Ahuja et al. [1] studied cooling path management in the data center at typical operating temperatures as well as higher ambient operating temperatures. Higher-ambient-temperature (HTA) operation combined with corrosion-resistance technology reduces the required refrigeration output, and this innovation opens up a new direction for data centers. Note that HTA does not mean the higher the better. Before embracing HTA, two key points need to be addressed and understood.
First, server stability along with the optimal temperature from the data center perspective; second, corrosion-resistant technology. With fresh-air cooling, the server has to bear seasonal and diurnal temperature variation, which can exceed 35°C, so to some extent HTA design is the premise of corrosion-resistant design. In this paper, we present methods to realize precise HTA operation along with corrosion-resistant technology. This is achieved through an orchestrated collaboration between the IT and cooling infrastructures.
Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems | 2014
Shu Zhang; Tianyu Zhou; Nishi Ahuja; Gamal Refai-Ahmed; Yongzhong Zhu; Guofeng Chen; Zhihua Wang; Weiwei Song
In recent years, the demand for compute capability has increased rapidly, and data centers all over the world are consuming much more power than before. Data centers across the globe consume around 100 GWh, and this is expected to increase by 30% by 2016. In 2012, the increase in data center energy consumption in the United States and China was 800 MWh and 500 MWh respectively, based on a Frost & Sullivan market survey. This increasing demand for data usage and the resulting growth in power consumption present a challenge to the thermal community when trying to introduce efficient real-time thermal management controls. These controls are vital to maintaining power efficiency from a cooling point of view.
2015 31st Thermal Measurement, Modeling & Management Symposium (SEMI-THERM) | 2015
Shu Zhang; Nishi Ahuja; Yu Han; Huahua Ren; Yanchang Chen; Guangliang Guo
Based on a Frost & Sullivan market survey, data centers across the globe now consume around 100 GWh of power, and this number is expected to increase by 30% by 2016. With growth trends increasing and development expanding, IDC owners realize that small improvements in efficiency, from architecture design to daily operations, will yield large cost reductions over time. Cooling energy is a significant part of the daily operational expense of an IDC. One trend in this industry is to raise the operational temperature of an IDC, which means running IT equipment in a higher-ambient-temperature (HTA) environment. This might also include cooling improvements such as water-side or air-side economizers, which can be used in place of traditional closed-loop CRAC systems. These more efficient systems can typically be run for much of the year, saving energy by avoiding running the chiller. The conventional method of calculating the benefit of higher operational temperature is simply to find the balance between IT power consumption and IDC cooling energy savings; the performance of specific business applications and the long-term reliability of key components do not get enough attention and have not been studied carefully. This paper tries to list all the important aspects to consider in an HTA environment and to find the key points and inflection points for running an IDC more efficiently. It lists the main challenges to IT infrastructure when implementing high operational temperatures in an IDC, and describes strategies for dealing with those challenges. An important trend in industry today is customized design of IT equipment and IDC infrastructure by the cloud service provider. This trend brings an opportunity to consider IT and IDC together when designing an IDC, from the early design phase to the daily operation phase, when faced with the challenge of improving efficiency.
This trend also provides a chance to extract more benefit from higher operational temperatures. The advantages of a customized rack server design include reduced power consumption, more thermal margin with less fan power, and accurate thermal monitoring. Accordingly, the IDC infrastructure can be redesigned for high-temperature operation: the chiller will no longer be the primary cooling source, and the BMS will be redesigned to set up communication between IT equipment and IDC cooling equipment. Raising the supply air temperature always means less thermal headroom for IT equipment, which equates to less allowable temperature variation in the cooling infrastructure; IDC operators will have less response time when large power variations or IDC failures occur. In this paper, we introduce new solutions such as ODC (on-demand cooling) and PTAS to show how these challenges can be met. These solutions use real-time thermal data from the IT equipment, rather than traditional ceiling-installed sensors, as the key input to the cooling controls. This method improves cooling control accuracy, decreases response time, and reduces temperature variation. With such a smart thermal operation established, one can then explore an aggressive thermal management control policy, confident that it is thermally safe.
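The on-demand-cooling idea of driving the CRAC setpoint from real-time server telemetry rather than ceiling sensors can be illustrated with a toy control rule. The function name, base setpoint, and clamping range below are all assumptions for illustration; the actual ODC/PTAS control logic is not described in the abstract.

```python
# Illustrative sketch: adjust the CRAC supply-air setpoint from the
# hottest observed server inlet temperature, leaving a safety margin.
# All constants here are hypothetical, not from the paper.

def crac_setpoint(server_inlet_temps_c, target_max_inlet_c=35.0, margin_c=2.0):
    """Return a supply-air setpoint (deg C): raise it when the hottest
    server inlet has headroom below the allowed maximum, lower it when
    the headroom shrinks; clamp to a plausible operating range."""
    hottest = max(server_inlet_temps_c)
    headroom = target_max_inlet_c - hottest
    base = 25.0
    setpoint = base + (headroom - margin_c)
    return max(18.0, min(setpoint, 27.0))
```

Because the control input is the per-server inlet reading itself, the loop reacts to the equipment that actually matters instead of to room air averaged at the ceiling.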
2017 33rd Thermal Measurement, Modeling & Management Symposium (SEMI-THERM) | 2017
Chuan Song; Yanbing Sun; Nishi Ahuja; Xiaogang Sun; Litrin Jiang; Abishai Daniel; Rahul Khanna; Tianyu Zhou; Xiang Zhou; Lifei Zhang
This paper introduces an optimized proactive cooling management approach based on power variation trend analysis. By analyzing the data center's historical power telemetry, the power predictor is able to predict power variation at 5-15 minute granularity. The cooling controller uses the observed heat information and the estimated thermal variation trend to drive the CRAC units to manage the temperature within the prediction window. To validate cooling results under different cooling parameters, a risk-level evaluation method is proposed; experiments for different prediction windows are conducted and the results are presented.
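The abstract does not specify the prediction model, but a minimal trend-based predictor over a short telemetry window can be sketched with a least-squares line fit and extrapolation; this is an illustrative stand-in, not the paper's actual predictor.

```python
# Minimal sketch of power-trend prediction: fit a line to recent power
# samples (assumed evenly spaced) and extrapolate `horizon_steps` ahead,
# e.g. across a 5-15 minute prediction window.

def predict_power(history, horizon_steps):
    """Least-squares linear extrapolation of a power telemetry series."""
    n = len(history)
    mean_x = (n - 1) / 2.0
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var if var else 0.0
    # Extrapolate from the window midpoint to the forecast index.
    return mean_y + slope * (n - 1 + horizon_steps - mean_x)
```

A cooling controller could compare the forecast against the current reading and pre-position the CRAC output before the load change arrives, rather than reacting after temperatures rise.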
2016 32nd Thermal Measurement, Modeling & Management Symposium (SEMI-THERM) | 2016
Cong Li; Abishai Daniel; Nishi Ahuja
When aging fans in a server chassis wear out, they slow down and circulate less air, eventually causing overheating and unexpected thermal shutdown. We propose a novel approach to predicting fan failures before significant wear-out or slow-down by modeling the correlation of the speed measurements of multiple fans. We present the empirical result that the speed measurements of multiple fans in a server are highly correlated, and exploit this observation to devise data-driven anomaly detection methods for predictive fan failure analysis. A comparative study of several variants of the approach is performed on simulated data sets, demonstrating how certain heuristics substantially improve performance.
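One simple way to exploit the correlation the paper reports is to treat each fan's peers as a reference: a healthy fan should track the mean speed of the others, so a fan that persistently lags its peers is a wear-out candidate. The function and threshold below are an assumed illustration of this peer-deviation idea, not the authors' actual detector.

```python
# Hedged sketch of correlation-based fan anomaly detection: flag fans
# whose speed persistently falls below the mean of their peers.

def failing_fans(rpm_samples, rel_threshold=0.15):
    """rpm_samples: list of [rpm_fan0, rpm_fan1, ...] readings over time.
    Returns sorted indices of fans whose average relative shortfall
    versus the peer mean exceeds rel_threshold."""
    n_fans = len(rpm_samples[0])
    flagged = set()
    for i in range(n_fans):
        devs = []
        for sample in rpm_samples:
            peers = [s for j, s in enumerate(sample) if j != i]
            peer_mean = sum(peers) / len(peers)
            # Positive deviation means this fan is slower than its peers.
            devs.append((peer_mean - sample[i]) / peer_mean)
        if sum(devs) / len(devs) > rel_threshold:
            flagged.add(i)
    return sorted(flagged)
```

Averaging the deviation over many samples is what distinguishes genuine wear-out from a transient speed dip caused by a momentary workload or control change.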
2016 32nd Thermal Measurement, Modeling & Management Symposium (SEMI-THERM) | 2016
Haifeng Gong; Nishi Ahuja; Chuan Song; Chun Wang; Xiang Zhou; Gabriel C Cox
In today's data centers, in order to achieve higher density and better energy efficiency, traditional server systems with dedicated fans and power supply units are evolving into Rack Scale Systems with shared cooling and shared power supply at the rack level; examples include the Open Compute Project and the China Scorpio Rack. This evolution brings the challenge of how to efficiently control rack cooling and effectively manage rack power at a coarse granularity, e.g. at the whole-rack or half-rack level, without increasing system cost and management complexity. In this paper, the management architecture for the Rack Scale System is introduced, and the cooling zone, power zone, and the parameters for thermal and power management are elaborated. Some new virtual sensors for thermal management are also introduced; the modeling method for the volumetric airflow virtual sensor is described, and the validation test results are presented. With rack thermal and power management, IT devices can be closely associated with the datacenter facilities, and thus intelligently orchestrated and adapted to the workload for TCO (total cost of ownership) savings and reliability improvement.