How AI and Machine Learning Are Ready To Change the Game for Data Center Operations – Data Center Knowledge

Today's data centers face a challenge that, at first, looks almost impossible to resolve. While operations have never been busier, teams are under pressure to reduce their facilities' energy consumption as part of corporate carbon reduction goals. And, as if that wasn't difficult enough, dramatically rising electricity prices are placing real stress on data center budgets.

With data centers focused on supporting the essential technology services that people increasingly rely on in their personal and professional lives, it's not surprising that data center operations have never been busier. Driven by trends that show no sign of slowing down, we're seeing massively increased data usage associated with video, storage, compute demands, smart IoT integrations and 5G connectivity rollouts. However, despite these escalating workloads, the unfortunate reality is that many of today's critical facilities simply aren't running efficiently enough.

Given that the average data center operates for over 20 years, this shouldn't really be a surprise. Efficiency is invariably tied to a facility's original design, which was based on expected IT loads that have long since been overtaken. At the same time, change is a constant factor, with platforms, equipment design, topologies, power density requirements and cooling demands all evolving with the continued drive for new applications. The result is a global data center infrastructure that regularly finds it hard to match current and planned IT loads to its critical infrastructure. This will only be exacerbated as data center demands increase, with analyst projections suggesting that workload volumes are set to continue growing at around 20% a year between now and 2025.

Traditional data center approaches are struggling to meet these escalating requirements. Prioritizing availability is largely achieved at efficiency's expense, with too much reliance still placed on operator experience and on trusting that assumptions are correct. Unfortunately, the evidence suggests that this model is no longer realistic. EkkoSense research reveals that an average of 15% of IT racks in data centers operate outside ASHRAE's temperature and humidity guidelines, and that customers strand up to 60% of their cooling capacity due to inefficiencies. And that's a problem, with Uptime Institute estimating that the global cost attributed to inefficient cooling and airflow management is around $18bn. That's equivalent to some 150bn wasted kilowatt hours.
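To make the rack-level compliance figure concrete, here is a minimal sketch of how a team might count racks outside a temperature guideline. It is an illustration only, not EkkoSense's method: the `RackReading` type and rack IDs are hypothetical, the limits assume the commonly cited ASHRAE recommended inlet range of 18 to 27 °C, and humidity checks are left out for brevity.

```python
from dataclasses import dataclass

# Assumed limits: the commonly cited ASHRAE recommended inlet air range.
# Check the guideline class that applies to your own equipment.
RECOMMENDED_MIN_C = 18.0
RECOMMENDED_MAX_C = 27.0


@dataclass
class RackReading:
    rack_id: str         # hypothetical rack identifier
    inlet_temp_c: float  # measured inlet air temperature in degrees Celsius


def out_of_range(readings):
    """Return the readings whose inlet temperature is outside the assumed range."""
    return [
        r for r in readings
        if not (RECOMMENDED_MIN_C <= r.inlet_temp_c <= RECOMMENDED_MAX_C)
    ]


def share_out_of_range(readings):
    """Fraction of racks outside the assumed range (0.0 for an empty list)."""
    if not readings:
        return 0.0
    return len(out_of_range(readings)) / len(readings)


if __name__ == "__main__":
    # Made-up sample data for illustration
    sample = [
        RackReading("A01", 22.5),
        RackReading("A02", 28.4),
        RackReading("A03", 17.1),
        RackReading("A04", 24.0),
    ]
    for r in out_of_range(sample):
        print(f"{r.rack_id}: {r.inlet_temp_c} C is outside the assumed range")
    print(f"Share out of range: {share_out_of_range(sample):.0%}")
```

A check like this only works if inlet temperatures are actually measured rack by rack, which is the gap the next section describes.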

With 35% of the energy used in a data center going to support the cooling infrastructure, it's clear that traditional performance optimization approaches are missing a huge opportunity to unlock efficiency improvements. EkkoSense data indicates that a third of unplanned data center outages are triggered by thermal issues. Finding a different way to manage this problem gives operations teams a way to secure both availability and efficiency improvements.

Limitations of traditional monitoring

Unfortunately, only around 5% of M&E teams currently monitor and report their data center equipment temperatures on a rack-by-rack basis. Additionally, DCIM and traditional monitoring solutions can provide trend data and be set up to provide alerts when breaches occur, but that is where they stop. They lack the analytics to provide deeper insight into the cause of the issues, how to resolve them and how to avoid them in the future.

Operations teams recognize that this kind of traditional monitoring has its limitations, but they also know that they simply don't have the resources and time to take the data they have and convert it from background noise into meaningful actions. The good news is that technology solutions are now available to help data centers tackle this problem.

It's time for data centers to go granular with machine learning and AI

The application of machine learning and AI creates a new paradigm for how to approach data center operations. Instead of being swamped by too much performance data, operations teams can now take advantage of machine learning to gather data at a much more granular level, meaning they can start to assess how their data center is performing in real time. The key is to make this accessible, and smart 3D visualizations are a great way of making it easy for data center teams to interpret performance data at a deeper level: for example, by showing changes and highlighting anomalies.

The next stage is to apply machine learning and AI analytics to provide actionable insights. By augmenting measured datasets with machine learning algorithms, data center teams can immediately benefit from easy-to-understand insights to help support their real-time optimization decisions. The combination of real-time granular data collection every five minutes and AI/machine learning analytics allows operations teams not just to see what is happening across their critical facilities but also to find out why and what exactly they should do about it.
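As a rough illustration of what analytics on five-minute samples can look like, the sketch below flags readings that deviate sharply from the preceding hour using a rolling z-score. This is a generic statistical technique chosen for brevity, not the algorithm any particular product uses; the function name, `window`, `threshold` and `min_sigma` values are all assumptions.

```python
from statistics import mean, stdev


def flag_anomalies(samples, window=12, threshold=3.0, min_sigma=0.1):
    """Return indices of samples that deviate strongly from the preceding window.

    samples:   temperatures taken at a fixed interval (e.g. every five minutes)
    window:    number of preceding samples used as the baseline (12 = one hour)
    threshold: z-score above which a sample is flagged
    min_sigma: floor on the baseline spread, so a very flat baseline
               does not turn sensor noise into an alert
    """
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = max(stdev(baseline), min_sigma)
        if abs(samples[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Made-up inlet temperatures: a steady hour, then a single spike
    readings = [22.0, 22.1] * 6 + [22.0, 26.5, 22.1]
    print(flag_anomalies(readings))
```

A detector like this only says that something changed. Explaining why, and recommending an action, needs the richer context (cooling unit state, airflow, rack load) that the article describes.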

AI- and machine learning-powered analytics can also uncover the insights required to recommend actionable changes across key areas such as optimum set points, floor grille layouts, cooling unit operation and fan speed adjustments. Thermal analysis will also indicate optimum rack locations. And because AI enables real-time visualizations, data center teams can quickly gain performance feedback on any actioned changes.

Helping data center operations to make an immediate difference

Given pressure to reduce carbon emissions and minimize the impact of electricity price increases, data center teams need new levels of optimization support if they are to deliver against their reliability and efficiency goals.

Taking advantage of the latest machine learning and AI-powered data center optimization approaches can certainly make a difference by cutting cooling energy usage, with results achievable within weeks. By bringing granular data to the forefront of their optimization plans, data center teams have already been able not only to remove thermal and power risk, but also to reduce cooling energy consumption costs and carbon emissions by an average of 30%. It's hard to ignore the impact this kind of saving can have, particularly during a period of rapid electricity price increases. The days of trading off risk and availability against optimization are a thing of the past, with the power of AI and machine learning at the forefront of operating your data center.
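To put the 30% figure in context, it helps to combine it with the 35% cooling share quoted earlier: a 30% cut in cooling energy works out to roughly 10.5% of total facility energy. The short sketch below does that arithmetic. The function names are hypothetical, and the annual consumption and electricity price in the usage example are made-up inputs, not figures from the article.

```python
COOLING_SHARE = 0.35      # share of facility energy used by cooling (from the article)
COOLING_REDUCTION = 0.30  # average cooling energy reduction (from the article)


def facility_saving_fraction(cooling_share=COOLING_SHARE,
                             cooling_reduction=COOLING_REDUCTION):
    """Fraction of total facility energy saved by a given cut in cooling energy."""
    return cooling_share * cooling_reduction


def annual_saving(annual_kwh, price_per_kwh,
                  cooling_share=COOLING_SHARE,
                  cooling_reduction=COOLING_REDUCTION):
    """Return (kWh saved, cost saved) per year for a facility."""
    kwh_saved = annual_kwh * facility_saving_fraction(cooling_share, cooling_reduction)
    return kwh_saved, kwh_saved * price_per_kwh


if __name__ == "__main__":
    # Hypothetical facility: 10 GWh a year at 0.12 per kWh
    kwh, cost = annual_saving(annual_kwh=10_000_000, price_per_kwh=0.12)
    print(f"Facility-level saving: {facility_saving_fraction():.1%}")
    print(f"Energy saved: {kwh:,.0f} kWh, cost saved: {cost:,.0f}")
```

The real saving depends on each site's actual cooling share and tariff, so both are left as parameters.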

Want to know more? Register for Wednesday's AFCOM webinar on the subject here.

About the author

Tracy Collins is Vice President of EkkoSense Americas, the company that enables true M&E capacity planning for power, cooling and space. He was previously CEO at Simple Helix, a leading AL-based Tier III data center operator.

Tracy has over 25 years of in-depth data center industry experience, having previously served as Vice President of IT Solutions for Vertiv and, before that, with Emerson Network Power. In his role, Tracy is committed to challenging traditional approaches to data center management, particularly in terms of solving the optimization challenge of balancing increased data center workloads while also delivering against corporate energy saving targets.
