UPDATED 22:26 EDT / NOVEMBER 27 2021

Automation puts data centers on the path to a hyperscaler-style experience

Picture a network operator at a hospital. In the past year, this operator has been responsible for managing mission-critical infrastructure at the most challenging time.

The hospital relies heavily on network infrastructure to support technologies that are essential to its operations and the health of its patients. Downtime and security failures are not an option, but the network operator can only do so much with a limited team and an impossible task.

Diagnosing a network issue, finding its root cause and fixing it are not easy. For instance, if a switch fails, causing either a loss of connectivity to part of the infrastructure or latency spikes in serving critical data, the network operator needs to take immediate action. Manual intervention and tracing fault alerts to identify root causes take too much time and effort.

This is where network fabric automation and data center automation come in. With data center automation, the network automation suite would proactively probe from every endpoint to every other endpoint and pinpoint the exact root cause: in this case, a failure at a specific switch.
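The idea of probing between endpoints and narrowing the failure down to one switch can be sketched in a few lines of Python. This is a minimal illustration rather than any vendor's method: the topology, the endpoint and switch names, and the `localize_fault` helper are all hypothetical, and the probe results are simulated instead of measured.

```python
def localize_fault(paths, probe_ok):
    """Return switches that sit on every failed path and on no healthy path.

    paths:    maps an (endpoint, endpoint) pair to the switches its traffic crosses
    probe_ok: maps the same pair to True if the probe succeeded
    """
    failed = [set(hops) for pair, hops in paths.items() if not probe_ok[pair]]
    healthy = [set(hops) for pair, hops in paths.items() if probe_ok[pair]]
    if not failed:
        return set()
    # A single faulty switch must be common to all failed paths...
    suspects = set.intersection(*failed)
    # ...and cannot be on any path that still works.
    for hops in healthy:
        suspects -= hops
    return suspects


# Hypothetical leaf-spine fabric with four endpoints (a, b, c, d).
paths = {
    ("a", "b"): ["leaf1", "spine1", "leaf2"],
    ("a", "c"): ["leaf1", "spine2", "leaf3"],
    ("b", "c"): ["leaf2", "spine1", "leaf3"],
    ("b", "d"): ["leaf2"],
}
# Simulated probe results: every path through spine1 fails.
probe_ok = {pair: "spine1" not in hops for pair, hops in paths.items()}

print(localize_fault(paths, probe_ok))  # {'spine1'}
```

The more endpoint pairs are probed, the more healthy paths there are to rule switches out, which is why probing from every endpoint to every other endpoint helps narrow the result to a single device.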

The automation would alert the network operator via a dashboard highlighting the root cause of the issue and providing a recommended solution. In many cases, the operational automation may not only identify a fix, but may also automatically remediate the issue. Without automation, it is a different story altogether, with system alerts at massive scale often overwhelming operators who simply don’t have the time to comb through them manually.

This is a very real scenario that many are finding themselves in. In the past year, data centers have been the backbone supporting essential services, the shift to remote work, hospitals, home learning and more. That reality has been accelerated by the pandemic, which has added fuel to major societal trends that are driving data center growth, including the shift to cloud, the “internet of things” and bandwidth-intensive applications. In fact, Gartner predicts that worldwide data center infrastructure spending will grow 6% in 2021, with year-over-year growth through 2024.

This growth is creating challenges for data center operators that extend beyond what they have dealt with before. The explosion of new services, cloud technologies, DevOps and more has pushed management past the abilities of a traditional team and created a need for automation.

Data centers are also dealing with a mix of new and old applications running on everything from virtual machines to containers to bare-metal servers, adding to the complexity of the environment. Data center operators need a seamless experience in managing their infrastructure, one that puts them on par with hyperscaler or large-scale cloud operations in simplicity and efficiency, with self-healing, self-diagnosis and self-planning.

Hyperscalers such as Amazon Web Services Inc., Microsoft Corp. and Google LLC are capable of operating efficiently at scale because of this type of automation, and because they are not held back by specific challenges that may be unique to an individual enterprise. Instead, they are able to build out more efficient operations that provide the best experience for whoever is using the resources. The next evolution of the data center must be focused on using automation to achieve a hyperscaler-style experience.

Setting the stage with automation

Operating a data center at the efficiency of a hyperscaler may seem aspirational, but advancements in automation and network software have brought it closer to reality. Hyperscalers manage massive data centers with incredible sophistication and speed: Problems are identified and resolved in a matter of hours, not days. That type of efficiency is within reach if the right investments are made in automation that applies advanced, artificial intelligence-driven analytics to root cause analysis and remediation.

One factor enabling the hyperscalers to reach this level of automation is the uniformity of their network deployments. Automation tools are tailored to a limited set of designs and hence can understand the semantics of the networks they manage. Likewise, data centers will have to settle on a set of reference designs that can be managed by off-the-shelf automation software.

As such, data center operators can take advantage of new technologies that help gather advanced telemetry and analytics that feed into automation capabilities. That creates a robust data center network fabric connecting everything with software designed to manage the fabric to extract the most value.

Take a nuisance problem, such as parts maintenance. Defective cables are the bane of a data center operator’s existence. Not only do they cause major problems, but finding them is a time-consuming and frustrating job. At scale, it is a true needle in a haystack challenge.

This is where automation is critical and can help operators achieve the same level of fabric defect detection or prediction and automated maintenance and replacement as the hyperscalers. Triggered by predictive analytics, automated systems can notify users of old or defective parts, schedule routine maintenance, reroute traffic around affected systems and even order parts through integration with a procurement system. Solving the problem of parts maintenance at massive scale is an invaluable use of automation technology, and it’s only the beginning.
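One simple signal such a system might use to find defective cables is the rate of corrupted frames on each link. The sketch below is an assumption made for illustration, not a description of any particular product: the `LinkSample` record, the link names, the counter values and the 10 errors-per-million threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinkSample:
    link: str
    frames: int      # frames received during the sampling window
    crc_errors: int  # frames that failed the CRC check in that window


def suspect_links(samples, max_error_ppm=10.0):
    """Return (link, errors per million frames) for links above the threshold,
    worst first. Links that carried no traffic are skipped."""
    flagged = []
    for s in samples:
        if s.frames == 0:
            continue
        ppm = s.crc_errors * 1_000_000 / s.frames
        if ppm > max_error_ppm:
            flagged.append((s.link, ppm))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up counter readings for three links.
samples = [
    LinkSample("leaf1-spine1", frames=2_000_000, crc_errors=4),
    LinkSample("leaf2-spine1", frames=1_000_000, crc_errors=250),
    LinkSample("leaf3-spine2", frames=0, crc_errors=0),
]

for link, ppm in suspect_links(samples):
    print(f"{link}: {ppm:.0f} CRC errors per million frames")
```

In a real deployment the flagged list would feed the follow-up actions described above, such as rerouting traffic or opening a parts order, rather than being printed.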

Automation technology’s benefits truly shine when it comes to service assurance, root-cause analysis, self-healing and self-planning. Taking self-planning as an example, capacity and cost management are difficult challenges for data centers in an environment where there is unprecedented demand. A network architect adding a rack of compute to a data center may be guessing at whether the network can handle the new traffic load without access to historical data or the tools to provide visibility into the network.

However, if an architect can leverage intent-based analytics, deep insights can be gained into average bandwidth utilization across all links in the data center, or whether the spine links that will support added hardware have the capacity to support the added load. Automation is at its best when analytics are delivered across the stack, surfacing insights into the operation of physical and virtual networks.
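As a rough illustration of the capacity question, the following sketch projects spine-link utilization after a new rack is added. Everything here is assumed for the example: the link names, the capacities and historical averages, the 80% utilization ceiling, and the simplification that the added load spreads evenly across the spine links.

```python
def spine_headroom(links, added_gbps, max_utilization=0.8):
    """Project utilization per spine link after adding load.

    links: maps link name to (capacity_gbps, average_used_gbps)
    Returns link name -> (projected utilization, fits under the ceiling?)
    Assumes the added load is spread evenly across all links.
    """
    share = added_gbps / len(links)
    report = {}
    for name, (capacity, used) in links.items():
        projected = (used + share) / capacity
        report[name] = (round(projected, 3), projected <= max_utilization)
    return report


# Hypothetical historical averages for two 100 Gbps spine uplinks.
links = {
    "spine1-uplink": (100, 55),
    "spine2-uplink": (100, 70),
}

# A new rack is expected to add about 30 Gbps of traffic.
for name, (utilization, fits) in spine_headroom(links, added_gbps=30).items():
    status = "ok" if fits else "over the ceiling"
    print(f"{name}: projected {utilization:.0%} ({status})")
```

With these made-up numbers, one uplink stays under the ceiling and the other does not, which is the kind of answer an architect would otherwise be guessing at.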

Finally, many data centers are operated in silos. There are experts in specific data center functions such as networking, storage and servers, and when each silo works seamlessly, the entire system can operate efficiently. But that system functions only in a perfect world, and as data center operations have become more complex, the siloed model has exposed weaknesses. Problems that manifest in storage access times may be caused by problems in networking, but the groups working to solve these problems often aren’t connected to one another.

Automation breaks down these silos by automatically tracking events and errors across these domains, correlating them in real time and then automatically identifying both a root cause and a potential remediation. Additionally, by standardizing on network architectures and technologies, automation tools can operate more efficiently. Hyperscalers follow best practices when designing network fabrics, with the goal of providing an environment that is as simple as possible for automation tools to understand.
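A very simplified version of cross-domain correlation is to group events that occur close together in time and treat the earliest one as the probable cause. That heuristic, the 30-second window, the `correlate` helper and the sample events are all assumptions made for this sketch; production systems would also use topology and dependency information.

```python
def correlate(events, window_s=30):
    """Group events whose timestamps are within window_s of the previous event,
    and keep only groups that span more than one domain.

    events: list of (timestamp_seconds, domain, message) tuples
    """
    groups = []
    for event in sorted(events):
        if groups and event[0] - groups[-1][-1][0] <= window_s:
            groups[-1].append(event)
        else:
            groups.append([event])
    incidents = []
    for group in groups:
        domains = {domain for _, domain, _ in group}
        if len(domains) > 1:
            # Heuristic: the earliest event in the group is the probable cause.
            incidents.append({"probable_cause": group[0], "symptoms": group[1:]})
    return incidents


# Made-up events from three separately managed domains.
events = [
    (112, "storage", "read latency high"),
    (100, "network", "link flap on leaf2-spine1"),
    (118, "servers", "application timeouts"),
    (900, "servers", "fan speed warning"),
]

for incident in correlate(events):
    print("probable cause:", incident["probable_cause"])
    for symptom in incident["symptoms"]:
        print("  symptom:", symptom)
```

Here the storage and server alerts are tied back to the network event that preceded them, while the unrelated fan warning is left out.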

No matter the use case, if data center operators are interested in evolving their solutions, automation must be a key part of their strategy. Operating at optimal efficiency can be a challenge for data center operators, but it will only become more complex if they continue on their current path.

The future of data centers will rely on scalability to keep up with the demand for data and new services. By delivering hyperscaler-like features in the enterprise, automation can provide the experience end users will expect.

Raj Yavatkar is chief technology officer of Juniper Networks. He wrote this article for SiliconANGLE.
