Addressing power and thermal challenges in the datacenter.

Rising utility rates and escalating computing requirements are creating new power and thermal challenges for datacenter managers. Today's high-density rack and blade servers bring these issues into especially sharp focus. Because these architectures are inherently more scalable, adaptable, and manageable than traditional platforms, they deliver much-needed relief in complex, crowded datacenters. Yet they also introduce power and thermal loads that are substantially higher than those of most currently deployed systems. In some cases, they may even push the cooling infrastructures of older facilities beyond their limits.

A long-term solution to these challenges will require broad industry innovation and collaboration. The most important enhancement is a shift toward multi-core architectures that incorporate two or more processing units on a single chip. This would deliver major performance improvements while helping to hold power usage in check.

Moving Toward Integrated Datacenter Management: Performance Planning

The goal of datacenter integration is to enable IT facilities managers to automate and optimize performance, power and cooling management across the datacenter, and to monitor and control all relevant variables at the component, system, rack and datacenter level. In the short term, companies can move toward this environment through an integrated datacenter approach. The goal of this approach is to ensure that IT planning weighs the performance (or capacity) required to sustain appropriate business service levels against the implications of underlying IT resource decisions, looking beyond simple "price/performance" ratios.
Today, ISVs provide performance and prediction software that enables companies to plan how workloads, response times and system resource utilization will balance against changes in the business and in IT hardware resources. Those plans can then be factored into power and thermal planning activities, which follow from resource technology decisions and timelines. The components of such a performance/capacity plan include:

Build an Asset Inventory: It's impossible to plan datacenter requirements without understanding your starting point. You must be able to answer the "What" question about your resources. Know and document the physical aspects of your asset inventory, including software licenses, asset purchase dates, cost, owner, server/desktop identity, etc., in order to better understand the performance utilization of underlying assets.

Baseline Asset Performance: After the physical assets have been catalogued, it is important to answer the "How" question in terms of asset performance utilization. It is not uncommon to find server CPU utilization on some applications averaging well below 15% across the typical enterprise. Performance software tools enable businesses to collect, analyze and present performance data in a way that directly associates performance measurements with business-oriented metrics. This step also typically helps identify "low hanging fruit" opportunities, such as idle resources that can be rapidly re-provisioned in lieu of acquiring additional resources.

Report/View: Once performance utilization information is available, it is time to understand the "Why" aspect. Should IT care about any given resource? Reporting capabilities enable IT professionals to correlate performance and utilization with asset ownership and purpose. This helps companies understand not just that a given asset might be "underutilized" but also whether it has an important or unimportant role within the organization.
By correlating utilization, business purpose and stakeholder information, IT can make rapid, informed decisions about which assets warrant more comprehensive performance analysis because their performance (or cost) factors necessitate more detailed planning.

Asset Analysis: It is critical to understand the impact of the business cycle on the utilization of underlying IT resources. No business plan (whether for performance/capacity or for technology choice, power and/or cooling) is complete without factoring in what it takes to sustain business service levels across the business cycle. Optimizing asset utilization in conjunction with reductions in power and cooling costs is worthless if the resulting configurations under- (or over-) provision for the needs of the business. Factors such as business workload performance over time (e.g., "trade settlement application requires X CPU and Y I/O resources for a given transaction rate") and business cycle variance (e.g., "trade settlement transactions peak at 5 p.m. daily, with monthly variance of Y, and business peaks quarterly") help establish the common view between IT and the business that is needed to build an overall datacenter resource plan.

Asset Modeling: Datacenter planning requires the analysis of various technology choices in terms of their impact on business throughput, response time and utilization. The previous steps provide the foundation upon which business change scenarios (growth, consolidation, etc.) can be factored into the overall performance analysis, providing high confidence that the results will not only resolve today's performance problems, but will also scale cost-effectively with a company's business. This results in a series of technology choices, which can then be evaluated as part of the Power and Thermal Planning effort.
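The baselining and asset-analysis steps above can be sketched in a few lines of Python. This is a minimal illustration, not a substitute for the commercial performance tools the article refers to: the server names, the hourly utilization samples and the 20% headroom factor are all hypothetical, and only the 15% threshold echoes the figure quoted earlier.

```python
from statistics import mean

# Hypothetical hourly CPU utilization samples (percent) per server.
SAMPLES = {
    "trade-settle-01": [20, 25, 30, 85, 90, 40],
    "file-archive-02": [3, 4, 5, 6, 4, 2],
    "web-front-03": [35, 40, 55, 60, 50, 45],
}

LOW_UTILIZATION_PCT = 15.0  # echoes the "well below 15%" remark in the text
HEADROOM = 1.2              # assumed 20% safety margin over the observed peak


def baseline(samples):
    """Return {server: (average, peak)} CPU utilization in percent."""
    return {name: (mean(vals), max(vals)) for name, vals in samples.items()}


def reprovision_candidates(stats, threshold=LOW_UTILIZATION_PCT):
    """Servers whose average AND peak both stay under the threshold.

    Checking the peak as well as the average avoids flagging a server
    that idles most of the day but is busy at a business-cycle peak.
    """
    return sorted(
        name for name, (avg, peak) in stats.items()
        if avg < threshold and peak < threshold
    )


def required_capacity_pct(peak, headroom=HEADROOM):
    """Capacity, as a percentage of one current server, to cover the peak."""
    return peak * headroom


stats = baseline(SAMPLES)
candidates = reprovision_candidates(stats)
for name, (avg, peak) in stats.items():
    print(f"{name}: avg {avg:.1f}%, peak {peak}%, "
          f"size for {required_capacity_pct(peak):.0f}%")
print("Re-provisioning candidates:", candidates)
```

Sizing against the peak rather than the average reflects the point made under Asset Analysis: a configuration that looks efficient on average can still under-provision the business at its daily or quarterly peaks.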
Optimize Ongoing Operations: Maintaining "Actual versus Planned" measurements of ongoing performance and utilization is a necessary step to ensure ongoing cost and performance optimization. Business risks associated with change are minimized, mis-provisioning is reduced or eliminated, and costs are lowered not only by increasing average utilization, but also by maintaining optimal server, power and thermal-related expenditures over the business lifecycle of the server assets.

Moving Toward Integrated Datacenter Management: Power and Thermal Planning

Once you have the right mix of servers to meet your business service performance, response time and throughput requirements, you must consider their physical placement in the datacenter. Companies should consider both rack-level and datacenter-level optimization scenarios.

Rack-Level Optimizations

Most system-, rack- and room-level cooling issues are caused by insufficient airflow or inadvertent mixing of hot and cold air. The need for sufficient airflow is obvious, but is often overlooked by IT personnel who are focused on other concerns. The mixing of hot and cold air is a more subtle issue, but equally problematic, since it can dramatically reduce the efficiency of a cooling system and may also impact airflow.

Understand Airflow Requirements for Specific Equipment: There are four basic airflow scenarios: front-to-back, side-to-side, bottom-to-top, and top-to-bottom. Understanding the requirements for specific equipment will enable an efficient rack-level design and cooling strategy. Map the technologies under consideration as part of the Performance Planning activities against the thermal characteristics specified by the vendor(s).

Standardize on Racks Designed for High-Density Environments: Standardize on appropriate power and thermal policies to make effective rack decisions. Examples: Avoid shallow racks to ensure in-rack cabling will not obstruct airflow.
Consider racks that support retrofit fan or cooling units (but verify the benefits of these add-on units) and that minimize future risks associated with miscalculations. Planning for localized supplemental cooling for individual racks can accommodate future high-density systems without compromising room-wide efficiency.

Arrange Racks in Rows to Establish Hot and Cold Aisles: Racks should be aligned front-to-front along cold aisles, and back-to-back along hot aisles. Within each row, racks should be tightly abutted. For this strategy to be effective, cold air must be delivered to cold aisles and hot air extracted from hot aisles. Work to eliminate mixing of hot and cold air, which causes short cycling of the cooling system.

Use Blanking Panels: Blanking panels improve airflow through the rack, minimize air loss and help prevent exhaust air recirculation.

Ensure Adequate Airflow to Individual Racks and Systems: Clearly define power and cooling requirements at the room, row, and cabinet level. Ensure sufficient airflow to racks based on system-level inlet air temperature and airflow requirements, and use thermal and aerodynamic analysis tools to model and design your cooling solutions. Insufficient airflow will often result in hotter systems and turbulence that decrease cooling efficiency. For example, if a rack requires more cold air than the room provides, its fans will pull in a mix of hot and cold air. This will result in reheating of the room, hotter systems, unhealthy airflows, and a substantial reduction in the efficiency of the cooling system.

Explore the Benefits of Blade Servers: Blade architectures can reduce total power consumption (per unit of compute power) and deliver substantial TCO benefits through reduced cabling, easier provisioning, improved modularity and lower management costs. However, they will likely increase power and cooling density.
It is therefore important to look at total costs, risks, and benefits within your particular physical and operational environment.

Datacenter-Level Optimizations

Understand Datacenter Airflow: The locations of cooling systems and ductwork are obviously critical, but so are the locations of the racks, cable trays, firewalls and other infrastructure elements. Blank off any floor opening that allows excess air to escape the plenum. Software tools are now available that greatly simplify airflow and thermal analysis. Consider consulting with facilities cooling specialists for complex implementations.

Optimize Room Temperature Settings: Consider increasing the Delta T of your cooling system to more closely match IT equipment specifications. This may allow you to reduce total airflow while delivering the same cooling capacity and reducing operational costs. (For example, Intel IT has found it beneficial to lower supply air temperatures to between 55 and 65 degrees Fahrenheit while increasing Delta T values to 2 degrees Fahrenheit.)

Pay Attention to Infrastructure Efficiency: It is generally worthwhile to spend more on infrastructure components that run efficiently at anticipated loads. Power losses in uninterruptible power supplies, power distribution units, cooling systems, etc., simply add to the thermal load.

Perform Regular Power and Thermal Audits: New systems, upgrades, and room changes can have unintended consequences, so it is important to monitor airflow, temperature, and other environmental factors on a regular basis.

Avoid Over-Design: Right-sizing power and cooling infrastructure is one of the most effective ways to reduce capital and operational costs in the datacenter. Work to understand lifecycle requirements and size infrastructure accordingly. Track vendor innovations, and, whenever possible, move toward more modular, flexible, and standardized solutions that improve agility and scalability.
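The interplay between infrastructure losses, Delta T and airflow described above can be made concrete with a back-of-the-envelope calculation. The sketch below relies on the common sensible-heat approximation for air near sea-level conditions (BTU/hr is roughly 1.08 x CFM x Delta T in degrees Fahrenheit, with 1 W equal to about 3.412 BTU/hr). The 100 kW IT load, the 90% UPS efficiency and the two Delta T values are hypothetical inputs chosen for illustration, not figures from any particular facility, and losses other than the UPS are ignored.

```python
BTU_PER_HR_PER_WATT = 3.412   # 1 W of dissipated power is about 3.412 BTU/hr
SENSIBLE_HEAT_FACTOR = 1.08   # BTU/hr per (CFM * deg F), air near sea level


def thermal_load_watts(it_load_w, ups_efficiency):
    """Heat to remove: the IT load plus the UPS conversion losses.

    Assumes the UPS sits inside the cooled space, so every watt it draws
    ends up as heat. PDU, lighting and other losses are ignored here.
    """
    if not 0 < ups_efficiency <= 1:
        raise ValueError("ups_efficiency must be in (0, 1]")
    return it_load_w / ups_efficiency


def required_airflow_cfm(heat_w, delta_t_f):
    """Airflow needed to carry heat_w away at a given supply/return Delta T."""
    if delta_t_f <= 0:
        raise ValueError("delta_t_f must be positive")
    return heat_w * BTU_PER_HR_PER_WATT / (SENSIBLE_HEAT_FACTOR * delta_t_f)


heat = thermal_load_watts(100_000, 0.90)       # hypothetical 100 kW IT load
cfm_narrow = required_airflow_cfm(heat, 15.0)  # hypothetical Delta T values
cfm_wide = required_airflow_cfm(heat, 25.0)
print(f"Heat load: {heat / 1000:.1f} kW")
print(f"Airflow at 15 F Delta T: {cfm_narrow:,.0f} CFM")
print(f"Airflow at 25 F Delta T: {cfm_wide:,.0f} CFM")
```

Because required airflow is inversely proportional to Delta T, widening it from 15 to 25 degrees cuts the airflow needed for the same heat load by 40%, while a less efficient UPS raises the heat load that has to be removed in the first place.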
Right-sizing has obvious ties to the benefits of a long-term performance capacity plan and an ongoing performance optimization methodology.

Establish Policies and Educate Personnel: Best practices for power and thermal management must become an integral component of datacenter operations. Everything from temperature and humidity settings to new system deployments should follow well-understood guidelines that optimize cooling efficiency and minimize airflow obstructions and hot/cold air mixing.

For New Datacenters, Establish a Master Plan Based on Usage: Different usage models require different layouts and capacities to enable optimized cooling.

According to a recent reader survey conducted by Intelligent Enterprise, the majority of companies want to make the most of what they have by simply maintaining and improving their existing technologies. With these steps you can create a next-generation datacenter and begin to reap significant ROI from your investment.

Charles Rego is senior practice principal for Intel Solution Services at Intel Corp. (Santa Clara, CA). David Wagner is director of product management, Enterprise Performance Assurance (EPA) solutions for BMC Software, Inc. (Houston, TX).

www.intel.com
www.bmc.com