Effective Cooling of Server Boards in Data Centers by Liquid Immersion Based on Natural Convection Demonstrating PUE below 1.04.
The Internet has become a major force, and the data center, or the cloud that supports it, is now in its third wave. The first wave was the "web", the second was the "social network service", and the third is the "Internet of Things (IoT)". The data center is the hard infrastructure that has supported the development and transformation of the Internet from the first wave to the third. In the third wave, IoT becomes an important aspect of many businesses, and the connection of all "things" is a mandatory assumption. The amount of data continues to increase without limit, and many businesses, including IoT, Artificial Intelligence (AI)/Big Data, security, and Financial Technology (Fintech), are also being promoted on the cloud platform. Under these circumstances, in addition to the large-scale centralized cloud, many data centers for Fog computing (Numhauser 2013) and Edge computing (Lopez 2015) will be deployed for IoT businesses. These businesses will be supported by this kind of hierarchical data center group, as shown in Figure 1. Eventually, more compact and highly integrated data centers will be distributed more widely than ever, and the power efficiency of each data center will become very important.
The cost of electric power has increased by a factor of around 8 over the past 10 years. DOE reported that data centers account for around 2% of total power consumption worldwide (DOE Report). Moreover, AI engines with high heat output will be distributed and operated in such Fog and Edge computing systems, as well as in the centralized cloud. Many processors with high heat output will therefore be installed in data centers. Several deep learning processors, which include software for building neural networks on a deep learning Graphics Processing Unit (GPU) training system, have been released. They clearly anticipate the growth of machine learning in the data center. These processors normally generate a large amount of heat, which must be removed from the data center. This becomes a major problem from the aspect of cooling.
Power Usage Effectiveness (PUE), the ratio of the total power consumed by a data center to the power consumed by its IT equipment, is a measure of operating efficiency from the aspect of power consumption. PUE values worldwide have been decreasing since 2006 and have reached a minimum of around 1.1 (Kitada 2016). Nowadays, as described above, server power in the data center is increasing dramatically, including high power computer infrastructure (HPCI) and high-end General-Purpose computing on GPU (GPGPU) computing. For such high power data centers, air-cooling technology is not effective enough to handle the heat generated. In addition, a PUE of 1.1 remains a definite barrier. If a PUE of 1.0 were achieved, total power consumption would fall by around 9% relative to the PUE = 1.1 level. From the aspect of OPEX, this 9% is significant. In this study, we proposed and demonstrated a cooling technology for such high power consumption server boards in data centers. We also aim to demonstrate the next, more challenging stage, PUE = 1.0x (x < 4). In order to demonstrate such a highly efficient data center for high heat density servers, liquid immersion technology is applied in this study. Several trials of liquid immersion technology have already been performed, including immersion with pumped circulation and two-phase liquid immersion. In those systems, low PUE was demonstrated for high heat density servers. Furthermore, a conventional data center with an air cooling system requires a lot of space for effective cooling, in which the air conditioners and server racks are located. In contrast, the space required for immersion cooling can ideally be reduced to below around 1/3 of that of an air cooling system, as shown in Figure 2 (Fujitsu Report 2016, Strohmaier 2017, Prucnal 2015).
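The 9% figure quoted above follows directly from the definition of PUE as total facility power divided by IT equipment power. The short Python sketch below reproduces that arithmetic for a fixed IT load; the function names and the 100 kW load are illustrative assumptions and do not come from the study.

```python
def facility_power(it_power_kw: float, pue: float) -> float:
    """Total facility power implied by PUE = total facility power / IT power."""
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return it_power_kw * pue


def saving_fraction(pue_before: float, pue_after: float) -> float:
    """Fractional reduction in total facility power at a fixed IT load."""
    return 1.0 - pue_after / pue_before


if __name__ == "__main__":
    it_load_kw = 100.0  # hypothetical IT load, chosen only for illustration
    before = facility_power(it_load_kw, 1.1)
    after = facility_power(it_load_kw, 1.0)
    print(f"total power: {before:.1f} kW -> {after:.1f} kW")
    print(f"saving: {saving_fraction(1.1, 1.0):.1%}")
```

Because the IT load cancels out, the saving depends only on the two PUE values: 1 - 1.0/1.1 is about 0.091, in line with the "around 9%" stated in the text.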
Our proposed technology is liquid immersion with natural convection, without any pump or fan in the bathtub for refrigerant circulation. This system enables highly efficient cooling of high density servers with PUE = 1.0x, which is expected to be lower than that of the liquid immersion cooling technologies already demonstrated. The reason is that no electric power is required for circulating the refrigerant; electric power is required only for circulating the cold water, which has a low specific gravity, inside the cold plates and for the heat exchange of the water. In this paper, computational fluid dynamics (CFD) simulation was first used to produce a basic design of the liquid immersion cooling system with natural convection, and the system was then demonstrated in an actual experiment.
The boards were immersed in a bathtub filled with the refrigerant, without any pump or fan for refrigerant circulation in the bathtub. The experimental apparatus and a model of the liquid immersion cooling system with natural convection are shown in Figures 3 and 4. For the CFD simulation, 2 CPUs with a maximum power of 140 W each and 16 memory boards with a maximum power of 10 W each were set on a mother board. Either 24 or 48 mother boards (48 or 96 CPUs) were immersed in the bathtub, which measures 600 mm x 600 mm x 870 mm. The total power of the system reached a maximum of 14 kW. Four cooling plates measuring 560 mm x 560 mm were installed in the bathtub, three of them parallel to the walls and one parallel to the bottom, and cool water at a temperature between 15 and 35 degrees C (59 to 95 degrees F) flowed inside the plates. The heat exchanger for producing the cold water is located outdoors. As mentioned above, no electric power is required for circulating the refrigerant; electric power is required only for circulating the cold water, which has a low specific gravity, inside the cold plates and for the heat exchange of the water. The CFD-based design achieved natural convection without any pump or fan inside the bathtub. For the simulation, FlowDesigner Ver. 2017 of Advanced Knowledge Laboratory, Inc. was used. For the actual experiment, 6 server units were placed in the bathtub. Each server unit consists of 4 servers, so that 48 CPUs in total (Intel Xeon processors, whose power reaches around 120 W at maximum load) were immersed in the bathtub. A CPU power of 100 W was applied to each CPU at maximum load. The total size of the system is around 1/3 of that of the conventional rack structure for air cooling. As refrigerants, we applied several types of non-conductive, thermally and chemically stable fluids, including silicone oil, soybean oil, and a perfluorocarbon structured refrigerant (Fluorinert, 3M Company).
RESULTS AND DISCUSSION
Figure 5 shows a typical snapshot of the temperature distribution. As clearly seen in the image, the CPUs are heated and the refrigerant heated by the CPUs circulates from the bottom to the top of the bathtub. Figure 6 shows a snapshot of the flow rate distribution of the refrigerant viewed from the front and the side. The space above the middle line is the air layer, and the refrigerant is filled up to the line. As clearly seen in the image, an upward flow is generated by the heat of the CPUs. The flow rate is about 0.1 m/sec (19.7 fpm) at 140 W. The cooling plates consist of 3 side cooling plates and one bottom cooling plate. The cooling water is supplied from the heat exchange system located outside. The temperature and the flow rate of the water were varied from 15 to 35 degrees C (59 to 95 degrees F) and from 0.005 to 1 m/sec (0.98 to 197 fpm), respectively.
Next, we analyze the thermal characteristics of several refrigerants. Figure 7 shows the temperature distribution patterns for these refrigerants. From left to right, the patterns correspond to Fluorinert FC3283, Fluorinert FC43, silicone oil 20, silicone oil 50, and soybean oil. As clearly seen in the images, the temperature of the CPU reached around 50 degrees C (122 degrees F) in every case. When Fluorinert was used as the refrigerant, the temperature of the refrigerant was kept lowest. When silicone oil was used, the refrigerant temperature reached around 50 degrees C (122 degrees F). This suggests that the less viscous refrigerants, Fluorinert FC3283 and FC43, perform better than the more viscous silicone oil 20, silicone oil 50, and soybean oil. These results indicate that the cooling effect of natural convection strongly depends on the viscosity of the refrigerant.
More detailed simulation results are shown in Figure 8, which presents the relationship between the Rayleigh number of the refrigerant, used here as an index of its viscosity, and the CPU surface temperature, together with the relationship between the Rayleigh number and the flow rate of the refrigerant on the CPU surface. Here, the Rayleigh numbers are estimated at 25 degrees C (77 degrees F) for the model's geometry. As seen in this figure, the CPU temperature decreases monotonically and the flow rate increases with the Rayleigh number.
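To illustrate why the Rayleigh number can serve as an index of viscosity, the sketch below evaluates the standard definition Ra = g * beta * dT * L^3 / (nu * alpha) in Python. The paper does not state which formula, characteristic length, or fluid properties were used, so every numerical value here is a hypothetical placeholder and not a property of FC3283, FC43, silicone oil, or soybean oil.

```python
G = 9.81  # gravitational acceleration in m/s^2


def rayleigh_number(beta: float, delta_t: float, length: float,
                    nu: float, alpha: float) -> float:
    """Standard Rayleigh number Ra = g * beta * dT * L^3 / (nu * alpha).

    beta    : thermal expansion coefficient, 1/K
    delta_t : temperature difference driving the convection, K
    length  : characteristic length, m
    nu      : kinematic viscosity, m^2/s
    alpha   : thermal diffusivity, m^2/s
    """
    if nu <= 0 or alpha <= 0:
        raise ValueError("nu and alpha must be positive")
    return G * beta * delta_t * length ** 3 / (nu * alpha)


if __name__ == "__main__":
    # Hypothetical values shared by both fluids (illustration only).
    common = dict(beta=1.0e-3, delta_t=10.0, length=0.05, alpha=1.0e-7)
    # Two hypothetical fluids that differ only in kinematic viscosity.
    thin = rayleigh_number(nu=1.0e-6, **common)
    thick = rayleigh_number(nu=5.0e-5, **common)
    print(f"low-viscosity fluid:  Ra = {thin:.3e}")
    print(f"high-viscosity fluid: Ra = {thick:.3e}")
    print(f"ratio: {thin / thick:.1f}")
```

With all other properties held fixed, Ra is inversely proportional to the kinematic viscosity, so a less viscous refrigerant has a larger Rayleigh number. This matches the trend reported for Figure 8, where a higher Rayleigh number goes with a lower CPU temperature and a higher flow rate.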
These results indicate that a low-viscosity refrigerant, such as Fluorinert, is better for cooling high power density CPUs. In this model, we assumed that heat generation in the CPU is uniform. Therefore, the temperature decreases monotonically from the inside of the CPU to its surface. Since heat in a real CPU is generated only at the junction, this model is not accurate, but as the experimental results presented later show, the simulation results explain the experimental results well qualitatively.
Figure 9 shows the results in more detail. This figure shows the change in the average CPU surface temperature with respect to the flow rate of the water flowing inside the cooling plates. Here, the parameter is the temperature of the water inside the cooling plate. As clearly seen in the figure, the CPU temperature decreases monotonically up to a certain water flow rate. At a water flow rate of 0.1 m/sec, the heat generated by the CPU is removed well at any water temperature. Figure 10 shows a map of the cooling performance against the Rayleigh number of the refrigerant and the temperature of the cold water flowing in the cooling plate. As clearly seen in the graph, the larger the Rayleigh number and the lower the water temperature inside the cooling plate, the better the cooling performance. The dotted lines show the operating limits for CPU powers of 140 W and 90 W. In each case, the system can be operated with parameters below the line. For example, when the CPU power reaches 140 W with FC3283, the water temperature in the cooling plate should be kept below 30 degrees C (86 degrees F) in order to keep the CPU surface temperature below 50 degrees C (122 degrees F). When using Si50, it is necessary to keep the cooling plate water temperature below about 20 degrees C (68 degrees F). It was also found that when a heat sink is attached to the CPU surface, the cooling efficiency is not greatly affected even if the Rayleigh number of the refrigerant changes. This indicates that a heat sink with fins spaced 6 mm apart is insensitive to the natural convection of the refrigerant.
Next, we performed the actual experiment. The simulation results qualitatively predict the experimental results well, and natural convection enables effective cooling of the high power density boards. As described before, 24 server boards are immersed in the refrigerant bath, and no pump or fan is placed inside the bathtub. From the CFD results, we concluded that FC3283 offers the best cooling performance among the refrigerants we examined for the immersion technology. Therefore, we use FC3283 as the refrigerant in the actual experiment hereafter. The cooling performance of the immersion technology using natural convection is now shown in further detail. The left graph in Figure 12 shows the change in the CPU junction temperature with respect to the temperature of the water flowing inside the cooling plate. The right graph shows the change in the CPU junction temperature with respect to the flow rate of the cold water flowing inside the cooling plate. As clearly seen in the figure, the lower the temperature of the water flowing inside the cooling plate, and the higher its flow rate, the higher the cooling efficiency. In other words, these results show that natural convection technology can be used sufficiently within a practical range. Figure 13 shows the change in the CPU junction temperature with respect to the CPU power. The refrigerant is FC3283. Unlike in the simulation, the temperature shown is not the CPU surface temperature but the junction temperature, because the surface temperature cannot be measured in this real environment. The junction temperature data are reported by each CPU itself.
As clearly seen here, the CPU junction temperature rises linearly with the CPU power. In this system, the upper limit of the CPU junction temperature is 100 degrees C (212 degrees F), so a CPU power of around 120 W can be applied, and this indicates that a 14 kW power system can be cooled effectively per bathtub. These results show that this natural convection method has enough performance for practical use, considering that the maximum CPU power in HPCI is around 100 W. Although the absolute values of the CFD simulation results need further adjustment, the trends are in good agreement with the experimental results. The effect of a heat sink on the CPU is shown in Figure 14. The solid line represents the case with a CPU heat sink (corresponding to the left side of the picture) and the dashed line represents the case without a CPU heat sink (corresponding to the right side of the picture). As clearly seen in this figure, attaching a heat sink roughly doubles the cooling performance. In other words, without a CPU heat sink, the upper limit of CPU power is only about 60 W, that is, only about 50% of the load can be applied. In the simulation, the effect of the heat sink is larger for the more viscous refrigerants. However, in the actual experiment, the effect of the heat sink is clearly observed even with the low-viscosity FC3283.
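The relationship between the junction temperature limit and the allowable CPU power can be sketched with a simple linear model, T_j = T_0 + R * P, where T_0 is the junction temperature extrapolated to zero power and R is an effective thermal resistance. The Python sketch below is a minimal illustration of that reasoning, not the authors' analysis. The values T_0 = 40 degrees C, R = 0.5 degrees C/W (with heat sink), and R = 1.0 degrees C/W (without heat sink) are hypothetical, chosen only so that the results land on the roughly 120 W and 60 W limits quoted in the text; they are not measured values from Figures 13 or 14.

```python
def max_cpu_power(t_limit_c: float, t_zero_power_c: float,
                  r_th_c_per_w: float) -> float:
    """Largest CPU power (W) that keeps T_j = T_0 + R * P at or below t_limit_c."""
    if r_th_c_per_w <= 0:
        raise ValueError("thermal resistance must be positive")
    return (t_limit_c - t_zero_power_c) / r_th_c_per_w


if __name__ == "__main__":
    T_LIMIT_C = 100.0  # junction temperature upper limit stated in the text
    T_ZERO_C = 40.0    # hypothetical zero-power intercept (assumption)

    with_sink = max_cpu_power(T_LIMIT_C, T_ZERO_C, r_th_c_per_w=0.5)     # assumed R
    without_sink = max_cpu_power(T_LIMIT_C, T_ZERO_C, r_th_c_per_w=1.0)  # assumed R
    print(f"with heat sink:    {with_sink:.0f} W")
    print(f"without heat sink: {without_sink:.0f} W")
```

Under this model, halving the effective thermal resistance doubles the allowable power, which is one way to read the statement that the heat sink gives about twice the cooling performance.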
We also evaluated the PUE based on the actual experiment with natural convection. The PUE was calculated from the power consumption of the servers and of the pumping motors and heat exchange system installed outside the system. As a result, a PUE of at most around 1.04 was confirmed. This value can be lowered further by improving the heat exchange system. These results indicate that the proposed technology has promising potential as a practical cooling technology for actual data centers.
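The calculation described above can be written as PUE = (server power + pump power + heat exchange power) / server power. The Python sketch below shows this form under the assumption that these three terms are the only contributions counted. The individual power values are hypothetical placeholders, since the paper reports only the resulting PUE and not the measured component powers.

```python
def pue(server_kw: float, pump_kw: float, heat_exchange_kw: float) -> float:
    """PUE from server power plus the cooling overhead outside the bathtub."""
    if server_kw <= 0:
        raise ValueError("server power must be positive")
    return (server_kw + pump_kw + heat_exchange_kw) / server_kw


if __name__ == "__main__":
    # Hypothetical powers in kW, for illustration only (not measured data).
    value = pue(server_kw=10.0, pump_kw=0.1, heat_exchange_kw=0.3)
    print(f"PUE = {value:.3f}")
```

Since no power is spent on circulating the refrigerant inside the bathtub, the overhead term contains only the water pump and the heat exchange system, which is why improving the heat exchange system is the remaining lever for lowering the PUE.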
We proposed an immersion cooling technology with natural convection for high power servers used in data centers. The cooling performance was evaluated by CFD simulation and actual experiments. Although the absolute values of the CFD simulation results need further adjustment, the trends are in good agreement with the experimental results. A less viscous refrigerant is better for cooling high power CPUs. Among the refrigerants tried in this study, a perfluorocarbon structured refrigerant (Fluorinert) proved to be the most suitable for immersion cooling with natural convection. For operation of this system, no electric power is required for circulating the refrigerant; electric power is required only for circulating the cold water inside the cooling plates and for the heat exchange of the water. When the upper limit of the junction temperature of the CPU was 98 degrees C (208 degrees F), it was found that at least 120 W class CPUs can be cooled. This shows that the technology can be applied to cooling systems exceeding 10 kW per bathtub. Finally, a PUE below 1.04 was demonstrated with this system. Consequently, the proposed technology with natural convection has promising potential for low energy and space-saving systems in data centers.
This study was supported by the development and demonstration projects for CO2 emission reduction of the Ministry of the Environment in Japan. CFD analysis was supported by Advanced Knowledge Laboratory, Inc. The perfluorocarbon structured refrigerants (Fluorinert) were provided by 3M Company. The heat exchange system for cooling water was developed by Takasago Thermal Engineering Co., Ltd. We sincerely thank them for their helpful comments and discussions. We also thank Mr. Fujimaki and Mr. Yamamoto of Fujitsu Limited for the fruitful discussions.
Bar-Magen Numhauser, Jonathan. 2013. Fog Computing introduction to a New Cloud Evolution. University of Alcala.
Garcia Lopez, et al. 2015. Edge-centric Computing: Vision and Challenges. SIGCOMM Comput. Commun. Rev. 45 (5): 37-42.
DOE Report. https://energy.gov/eere/buildings/data-centers-and-servers
Kitada, Nakamura, Matsuda, and Matsuoka. 2016. Dynamic power simulation utilizing computational fluid dynamics and machine learning for proposing ideal task allocation in a data center. International Conf. on Cloud Computing.
Fujitsu. 2016. Totally Submerging a Server in Liquid Reduces Power Consumption up to 30% - Data Center Innovation by a Novel Cooling Technology. http://journal.jp.fujitsu.com/en/2016/08/15/01.
Erich Strohmaier, Jack Dongarra, Horst Simon, and Martin Meuer. 2017. The GREEN 500 LISTS June 2017. https://www.top500.org/green500/lists/2017/06.
David Prucnal. 2015. Doing more with less: Cooling computers with oil pays off. The Next Wave. Vol.20 No.2:21-29.
Morito Matsuoka, PhD
|Title Annotation:||power usage efficiency|
|Author:||Matsuoka, Morito; Matsuda, Kazuhiro; Kubo, Hideo|
|Publication:||ASHRAE Conference Papers|
|Date:||Jan 1, 2018|