Data center operators deploying tools that rely on machine learning today are benefiting from initial gains in efficiency and reliability, but they’ve only started to scratch the surface of the full impact machine learning will have on data center management.
Machine learning, a subset of artificial intelligence, is expected to optimize every facet of future data center operations, including planning and design, managing IT workloads, ensuring uptime, and controlling costs. IDC predicts that by 2022, 50 percent of IT assets in data centers will be able to run autonomously because of embedded AI functionality.
“This is the future of data center management, but we are still in the early stages,” Rhonda Ascierto, VP of research at Uptime Institute, said.
Creating smarter data centers becomes more important as more companies adopt hybrid environments that span the cloud, colocation facilities, and in-house data centers, and that will increasingly include edge sites, Jennifer Cooke, research director of IDC’s Cloud to Edge Datacenter Trends service, said.
“Moving forward, relying on human decisions and intuition is not going to approach the level of accuracy and efficiency that’s needed,” Cooke said. “The whole shift toward data-driven decisions and leveraging all that data to improve outcomes is the only sustainable way to meet the needs for IT services at scale.”
Hyperscale platforms are already applying machine learning to their data centers. They have the vast amounts of data, internal compute resources, and in-house data science expertise necessary to pursue their own machine learning initiatives, Ascierto said.
Some enterprises or colocation providers that don’t have the same scale or skills have become early machine learning adopters by turning to vendors, such as Schneider Electric, Maya Heat Transfer Technologies (HTT), and Nlyte Software, which offer data center management software or cloud-based services that take advantage of the technology.
Here are five of the biggest use cases for machine learning in data center management today:
1. Efficiency Analysis
Organizations today are using machine learning to improve energy efficiency, primarily by monitoring temperatures and adjusting cooling systems, Ascierto said.
Google, for example, told us earlier this year that it was using AI to autonomously manage and fine-tune cooling at its data centers by analyzing 21 variables, such as outside air temperature, a data center’s power load, and air pressure in the back of the servers where hot air comes out. Google’s machine learning algorithms automatically and continuously adjust cooling plant settings in real time, resulting in a 30 percent decrease in annual energy usage from cooling, the company said.
Machine learning can also optimize data center efficiency by using algorithms to analyze IT infrastructure to determine how best to utilize resources, such as the most efficient way or best time to perform tasks, Cooke said.
Furthermore, it can make recommendations on the most efficient way to design or configure a data center, including the best physical placement of IT equipment or workloads, Ascierto said.
For example, Montreal-based Maya HTT, which has added machine learning capabilities to its data center infrastructure management (DCIM) software, can analyze servers and detect anomalies, such as ghost servers running applications no longer in use. It can also discover older servers with high workloads and recommend that the IT staff move those workloads to newer, more energy-efficient servers that have lower utilization, Remi Duquette, the company’s VP of Applied AI and Data Center Clarity LC, explained.
“Humans often have an if-it’s-not-broken-why-fix-it mentality, so they might not think of moving loads to a new server to reduce power consumption,” he said.
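As a rough illustration of the kind of checks Duquette describes, the sketch below flags likely ghost servers and pairs old, busy servers with newer, lightly used ones. It is a simple threshold heuristic, not Maya HTT's machine learning method, and the `Server` fields, function names, and cutoff values are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    age_years: float   # time since deployment (hypothetical field)
    cpu_util: float    # average CPU utilization, 0.0 to 1.0
    net_mbps: float    # average network throughput in Mbps


def find_ghost_servers(servers, cpu_max=0.02, net_max=0.1):
    """Return servers that are powered on but doing almost no work."""
    return [s for s in servers
            if s.cpu_util <= cpu_max and s.net_mbps <= net_max]


def suggest_migrations(servers, old_years=5.0, busy=0.70, idle=0.30):
    """Pair old, busy servers with newer, lightly used ones."""
    # Busiest old servers first: they benefit most from moving.
    sources = sorted(
        (s for s in servers
         if s.age_years >= old_years and s.cpu_util >= busy),
        key=lambda s: s.cpu_util, reverse=True)
    # Least-loaded newer servers first: they have the most headroom.
    targets = sorted(
        (s for s in servers
         if s.age_years < old_years and s.cpu_util <= idle),
        key=lambda s: s.cpu_util)
    return [(src.name, dst.name) for src, dst in zip(sources, targets)]


# Illustrative inventory; the values are made up.
fleet = [
    Server("web-01", 6.5, 0.82, 40.0),
    Server("db-07", 7.0, 0.01, 0.05),
    Server("app-12", 1.0, 0.15, 12.0),
    Server("app-13", 2.0, 0.55, 30.0),
]
ghosts = [s.name for s in find_ghost_servers(fleet)]
moves = suggest_migrations(fleet)
```

A production system would learn these cutoffs from historical telemetry rather than hard-coding them, which is where machine learning comes in.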
2. Capacity Planning
Machine learning can assist IT organizations in forecasting demand, so they don’t run out of power, cooling, IT resources, or space. For example, if a company is consolidating data centers and migrating applications and data to a central data center, algorithms can help it determine how the move affects capacity at that facility, Ascierto said.
Capacity planning is an important service for organizations building new data centers, said Enzo Greco, chief strategy officer of Nlyte Software, a DCIM software vendor that recently launched a Data Center Management as a Service (DMaaS) offering and partnered with IBM Watson to integrate its machine learning capabilities into its products.
“You need to be as accurate as possible with data centers. How many servers do you need? How much cooling do you need? You only want as much cooling as the number of servers you have,” he said. “Also, how much power do you need? That depends on cooling and server capacity.”
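To make the forecasting idea concrete, here is a minimal sketch that fits a straight-line trend to monthly power draw and estimates how many months remain before a facility reaches its power capacity. It uses ordinary least squares, far simpler than what a commercial DCIM or DMaaS product would use. The function names and sample figures are assumptions chosen for illustration.

```python
import math


def fit_linear_trend(values):
    """Least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_capacity(monthly_kw, capacity_kw):
    """Whole months after the last observation until the trend line
    reaches capacity, or None if demand is flat or falling."""
    slope, intercept = fit_linear_trend(monthly_kw)
    if slope <= 0:
        return None
    last_month = len(monthly_kw) - 1
    crossing = (capacity_kw - intercept) / slope
    return max(0, math.ceil(crossing - last_month))


# Illustrative data: average IT load in kW over six months.
load_kw = [400, 410, 420, 430, 440, 450]
remaining = months_until_capacity(load_kw, capacity_kw=500)
```

With these made-up numbers the load grows by 10 kW a month, so the trend line reaches 500 kW five months after the last reading. Real demand is rarely this linear, which is one reason vendors apply machine learning models to the problem.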
3. Risk Analysis
Of all the use cases, using machine learning for risk analysis is the most critical, because it can identify anomalies and help prevent downtime. “Machines can detect anomalies that would otherwise go undetected,” Ascierto said.
For example, Schneider Electric’s DMaaS can analyze performance data from critical data center equipment, such as power management and cooling systems, and predict when they might fail. When algorithms detect anomalies that show signs of an impending failure, the system alerts customers so they can troubleshoot before the equipment goes down, said Joe Reele, VP of data center solution architects at Schneider Electric.
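One simple way to picture anomaly detection on equipment telemetry is a rolling z-score: each new reading is compared with the mean and standard deviation of the readings just before it. The sketch below is a generic statistical technique, not Schneider Electric's algorithm, and the function name, window size, threshold, and sample temperatures are assumptions for illustration.

```python
from statistics import mean, stdev


def find_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard
    deviations."""
    anomalies = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change as anomalous.
            if readings[i] != mu:
                anomalies.append(i)
            continue
        if abs(readings[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Illustrative supply-air temperatures in degrees Celsius.
temps = [21.0, 21.2, 20.9, 21.1, 21.0, 21.1, 20.8, 21.0, 27.5, 21.1]
flagged = find_anomalies(temps)
```

Here only the 27.5-degree spike at index 8 is flagged. Predicting failures ahead of time, as the DMaaS services aim to do, requires models trained on many such signals together with historical failure records.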
Risk analysis through machine learning can also improve data center uptime in other ways. It can bolster cybersecurity and, in the future, help with predictive maintenance, which replaces maintenance at regularly scheduled intervals with maintenance only when it’s needed. Another potential application is scenario planning, or the ability to model different data center configurations to improve resiliency.
If downtime does occur, a machine learning algorithm can also assist with incident analysis to determine the root cause faster and more accurately, Ascierto said.
4. Customer Churn Analysis
In the future, Ascierto sees colocation providers using machine learning to better understand their customers and predict their behavior – from purchasing or adding new services to the likelihood of renewing their contracts or even paying bills. This is an extension of customer relationship management and can include automated customer engagement through chatbots, she said.
Maya HTT already analyzes customer sentiment. The company doesn’t currently have data center customers using the capability, but through natural language processing, its software can analyze email and recorded support calls to predict future customer behavior, Duquette said.
5. Budget Impact Analysis and Modeling
This mixes data center operational and performance data with financial data – even including things like applicable taxes – to understand the cost of purchasing and maintaining IT equipment, Ascierto said.
“It’s modeling out the total cost of ownership and lifecycle of a piece of equipment, such as one type of cooling system compared with another,” she said.
Salesforce, for example, in 2016 acquired a startup called Coolan, which used machine learning to analyze total cost of ownership of IT equipment down to individual server components.
The question is how soon more companies will use machine learning to perform budget impact analysis. Some private companies could be doing this on their own, but it’s quite complex, because it requires financial data to be readily available in a format that computer models can ingest, Ascierto said.
DMaaS customers are less likely to want to share their financial data with a third party for security reasons. “For DMaaS services, getting customers to share their financial data is a trickier proposition in these early days,” she said.
Again, Maya HTT is one of the trailblazers in this area. The company currently offers a machine learning-powered service that combines capacity planning with budget impact analysis. According to Duquette, through deep learning algorithms, it can take data it already has – a client’s current data center capacity and the amount of capacity planned projects will take – and compare it to the sales funnel from CRM software and predict how future sales will impact capacity.
That allows the client to purchase new servers and storage on an as-needed basis. “Instead of buying a full rack of servers now, they can do financial engineering and buy servers just in time,” he said. “It allows for better forecasting.”
Early Adopters Tackle Efficiency, Risk Analysis First
Vendors and data center operators that are actively exploring machine learning today are focused on using it for the big pain points: improving efficiency and reducing risk, Ascierto said.
For example, colocation giant Digital Realty Trust, which owns more than 200 data centers worldwide, recently began piloting machine learning technology to improve efficiency.
The company, which is currently feeding DCIM data to a third-party vendor for analysis, is focused first on optimizing its cooling systems. But in the future, Digital Realty plans to explore using AI to forecast resource needs and perform predictive maintenance, Ted Hellewell, Digital Realty’s director of operations, innovation, and technology, said.
He expects the technology to be “a huge benefit for our operations team, even beyond what DCIM provides today.” Those benefits will be driven by the exponential growth of data centers, cloud computing, the Internet of Things, and edge computing, and by the inability of humans to manage the level of complexity all this infrastructure will have in the future.
“The quantity of underlying systems, devices, and data required to support the infrastructure is quickly exceeding what a human can consume and process,” Hellewell said. “This is going to allow Digital Realty to excel in real-time processing, response, communications, and decision making.”