UPDATED 18:35 EST / MAY 31 2019


Managing the multicloud will require lots of AI – but people too

Business will increasingly run in multiple clouds, which means that managing this critical resource will become the core function of many information technology professionals.

Automating the majority of multicloud management workloads will become a key initiative for IT departments that wish to cut costs, improve service levels and ensure strong governance. As noted in a recent Mesosphere research study, the business stakes behind multicloud management continue to grow:

  • Adoption of cloud-native computing in large enterprise companies is expanding rapidly.
  • Multicloud adoption is doubling year over year, by way of transitional hybrid cloud deployments.
  • More enterprises are moving their workloads to large-scale production multiclouds.
  • Containerized microservices are the most popular workloads running in multiclouds, followed by legacy apps, data services and analytics.
  • Kubernetes is the most popular software for multicloud microservices, followed by Kafka streaming.

Getting their arms around multicloud complexity will be the chief challenge for IT professionals in coming years. The management plane for the mesh multiclouds of the future will need to include the following key capabilities, many of which will require the real-time automated insights provided by embedded artificial intelligence:

  • Resource management: There will be more even distribution of compute, storage and memory resources across all tiers, clusters and nodes, with more workloads parallelized to execute across increasingly powerful edge devices. This will require sophisticated resource management controls such as load balancing and fine-grained routing, rate limiting, flow control, protocol translation, authentication and authorization, and monitoring and logging.
  • Workload management: There will be more flexible movement, routing and control of workloads, with streaming, publish-and-subscribe and stateful continuous processing becoming the dominant approaches for handling real-time, low-latency, distributed workloads across the multicloud. This will require fine-grained control of microservices traffic behavior with rich routing rules, fault tolerance and fault injection, as well as automatic zone-aware load balancing and failover for diverse traffic types.
  • Interface management: In the mesh multicloud, development abstractions will deliver programmatic access to all routing, policy, security and other control-plane functions. This will require distributed catalogs for managing APIs, service definitions, machine-learning models and metadata to facilitate discovery, delivery and management of application interfaces.
  • State management: In the edge-oriented multicloud mesh, there will be management of shared application state as a shared context. This will require a distributed persistence plane — distinct from hypervisor, container, serverless and streaming application backplanes — that manages state, context and other metadata as a shared resource.
  • Performance management: As command-and-control gives way to dynamic cross-mesh operations, there will be more software-defined, artificial intelligence-driven monitoring, orchestration, optimization and assurance of end-to-end application performance across the multicloud. This will require continuous monitoring of traffic and workloads, using this data to enforce policy decisions such as fine-grained access control and rate limits.
  • Identity management: As the edges begin to dominate the multicloud, the need for distributed strong authentication — built on multifactor identity assertions — will grow. This will require end-to-end trust relationships, role-based access controls and confidentiality across all nodes, applications and microservices, perhaps leveraging blockchains for secure credentials management.
  • Network management: As multicloud meshes grow more complex, AI-enhanced software-defined networking capabilities will drive intent-based networking, application-aware firewalling, intrusion prevention, health monitoring, anti-malware and URL filtering across the meshes.
  • Orchestration management: In the decentralized multicloud, there will be more peer-to-peer orchestration of nodes within and across all tiers, all the way out to mobile, embedded, “internet of things” and other edge devices. This will require proxy servers that intermediate the network path between service nodes.
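To make one of these capabilities concrete, here is a minimal Python sketch of the zone-aware load balancing and failover mentioned under workload management. The Endpoint and ZoneAwareBalancer names, the zone labels and the simple round-robin policy are assumptions made for illustration; they are not the API of any particular service mesh or cloud platform.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    """A hypothetical service instance running in a named zone."""
    name: str
    zone: str
    healthy: bool = True


class ZoneAwareBalancer:
    """Round-robin over healthy endpoints, preferring the caller's zone.

    If the caller's zone has no healthy endpoint, traffic fails over to
    healthy endpoints in any other zone.
    """

    def __init__(self, endpoints):
        self.endpoints = list(endpoints)
        self._counter = 0  # shared round-robin position

    def pick(self, caller_zone):
        healthy = [e for e in self.endpoints if e.healthy]
        if not healthy:
            raise RuntimeError("no healthy endpoints in any zone")
        local = [e for e in healthy if e.zone == caller_zone]
        pool = local or healthy  # fail over when the local zone is empty
        choice = pool[self._counter % len(pool)]
        self._counter += 1
        return choice


endpoints = [
    Endpoint("a1", "zone-a"),
    Endpoint("a2", "zone-a"),
    Endpoint("b1", "zone-b"),
]
balancer = ZoneAwareBalancer(endpoints)
print([balancer.pick("zone-a").name for _ in range(3)])

# Simulate a zone outage: callers in zone-a are now served from zone-b.
endpoints[0].healthy = False
endpoints[1].healthy = False
print(balancer.pick("zone-a").name)
```

A production mesh would typically add weighting, health probing and per-traffic-type rules; the sketch only shows the prefer-local, fail-over-remote decision.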

There is no way that any distributed organization can effectively tackle every cloud-computing IT issue, incident or situation without extensive AI-driven automation. As enterprises grow their hybrid-cloud environments into full-on multicloud meshes, exceptional circumstances such as technical glitches, security alerts and performance bottlenecks will grow in frequency and severity unless automation is made 24-by-7, proactive and consistent.

However, the more complex your multicloud becomes, the less likely it is that you’ll be able to entirely automate responses to the vast range of underlying platform, application, service and other issues. Human-in-the-loop exception handling will become the order of the day for the long tail of rare cloud-computing use cases up and down this multilayered management plane. The more complex cloud management functions, including cost management, security and compliance, application development, deployment and operational management, will continue to rely on collaborative responses that skilled human IT personnel may need to improvise on the fly.

The orchestration layer in the more complex cloud deployment use cases will need to drive human-response flows alongside entirely system-automated responses. The less common a specific incident or situation is, the less likely it is that there will be sufficient historical “ground truth” data for training the highly predictive statistical models upon which AI-driven automations depend. In many multicloud operational circumstances, AI-driven workflows will span several tiers of IT support resources working in lockstep over indefinite periods.
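The routing decision described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any vendor's logic: the Incident type, the MIN_EXAMPLES and MIN_CONFIDENCE thresholds and the tier names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical thresholds: how much labeled history and how much model
# confidence we require before trusting a fully automated response.
MIN_EXAMPLES = 50
MIN_CONFIDENCE = 0.9


@dataclass
class Incident:
    kind: str                # incident category, e.g. "disk-full"
    model_confidence: float  # assumed model score in [0, 1]


def route(incident, history_counts):
    """Choose a response path for an incident.

    history_counts maps an incident kind to the number of labeled
    historical examples available for training.
    """
    seen = history_counts.get(incident.kind, 0)
    if seen >= MIN_EXAMPLES and incident.model_confidence >= MIN_CONFIDENCE:
        return "automated"
    if seen > 0:
        # Some precedent exists: a first-line operator reviews the case.
        return "human-tier-1"
    # Never seen before: escalate to senior staff who can improvise.
    return "human-tier-2"


history = {"disk-full": 400, "cert-expiry": 12}
print(route(Incident("disk-full", 0.97), history))
print(route(Incident("cert-expiry", 0.95), history))
print(route(Incident("cross-cloud-dns-loop", 0.99), history))
```

The point of the sketch is that a high model score alone does not qualify an incident for automation; a rare incident type is escalated to people regardless, because the model behind the score had little ground truth to learn from.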

Where automation is concerned, the key is a collaborative control plane that drives two concurrent layers of coordinated action in managing end-to-end issues across the multicloud:

  • Automated orchestration of containerized microservices: This requires a distributed multicloud operating environment, such as Mesosphere DC/OS, that has baked-in support for continuous integration and continuous deployment workflows. A DevOps platform such as Mesosphere DC/OS enables enterprise IT to automate the orchestrated deployment and management of Kubernetes containers over practically any cloud deployment, including public, on-premises and edge platforms, on both bare-metal and virtualized infrastructure. Collaboration focuses on a unified dashboard that enables IT teams to monitor and manage end-to-end cloud platforms, workloads and infrastructure.
  • Automated orchestration of human IT management resources: This requires a distributed control plane such as Moogsoft AIops, which leverages AI to automate the orchestration of human IT workflows for service assurance across the multicloud. It simplifies creation of the logic that drives both automated and human-mediated incident management workflows, which may include integration with automated third-party remediation tools. Situation rooms and similarity clusters provide a window into how situations are being handled by both automated and manual workflows. Multicloud operations teams use a visual tool to create custom workflows that trigger notifications, ticket creation and other automated tasks. The solution enables real-time correlation of related alerts from across the multicloud, giving IT teams contextual awareness for discovering incidents faster. Radar charts and other situational awareness tools enable teams to rapidly understand the issues uncovered by alert clustering algorithms. The visual dashboard shows IT personnel the criteria that explain why specific cloud events have been correlated into a single situation, along with the probable root causes. IT managers can drill down to the workloads and key performance indicators associated with each team member involved in various multicloud management tasks.
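The idea of correlating related alerts into a single situation can be illustrated with a small Python sketch. This is not Moogsoft's algorithm; it is a simple greedy grouping that assumes each alert carries a timestamp and a set of tags, and that alerts close in time with overlapping tags belong together. The Alert type, the correlate function and the default window and similarity threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    timestamp: float   # seconds since some reference point
    source: str        # emitting system, e.g. "db" or "api"
    tags: frozenset    # descriptive labels attached to the alert


def jaccard(a, b):
    """Jaccard similarity of two tag sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def correlate(alerts, window=300.0, min_similarity=0.5):
    """Greedily group alerts into situations.

    An alert joins the first situation whose most recent alert is within
    `window` seconds and shares enough tags; otherwise it opens a new one.
    """
    situations = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        for situation in situations:
            last = situation[-1]
            close_in_time = alert.timestamp - last.timestamp <= window
            if close_in_time and jaccard(alert.tags, last.tags) >= min_similarity:
                situation.append(alert)
                break
        else:
            situations.append([alert])
    return situations


alerts = [
    Alert(0, "db", frozenset({"db", "latency", "zone-a"})),
    Alert(60, "api", frozenset({"api", "latency", "zone-a"})),
    Alert(120, "auth", frozenset({"login", "cert"})),
    Alert(1000, "db", frozenset({"db", "latency", "zone-a"})),
]
for situation in correlate(alerts):
    print([a.source for a in situation])
```

In this toy data the database and API alerts share the "latency" and "zone-a" tags and arrive a minute apart, so they form one situation; the shared tags are also the kind of criteria a dashboard could surface to explain why the events were grouped.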

Underlining the business stakes in providing robust multicloud management, Dayna Rothman, Mesosphere’s vice president of marketing, spoke with John Furrier, co-host of SiliconANGLE Media’s video studio theCUBE, in December.

Image: MaxPixel.net
