Meet GitOps, the key to launching effective software releases in the cloud-native era
The automation story behind DevOps centers on CI/CD, the continuous integration and continuous deployment that results in working code ready for production.
Deployment isn’t the end of the process, however. Releasing code is the missing step — putting new software in front of customers and end-users while ensuring it meets the ongoing objectives of the business.
Achieving the customer centricity and rapid deployments of CI/CD is difficult enough in traditional on-premises and cloud environments. But when deploying to Kubernetes-powered cloud-native environments, the massive scale and ephemerality of the operational environment require an end-to-end rethink of how to release software into production and operate it once it’s there.
The unprecedented demand for cloud-native computing
While most enterprises are currently in the midst of ramping up their Kubernetes deployments, certain industries – telecommunications in particular – are already looking ahead to the need for unprecedented scale.
As part of the 5G buildout, telcos are standing up small data centers at cell towers and points of presence. But “small” is a misleading adjective, since these data centers are essentially clouds in their own right, running potentially hundreds of thousands or even millions of Kubernetes clusters each.
From the perspective of the telco business, product managers want the ability to roll out new services to customers in sophisticated, dynamic ways. They may want to roll out new capabilities to a small group of customers, and then expand the deployment over time. They may have geographically specific offerings. Or perhaps they will delineate different service categories by compliance restrictions.
Furthermore, the telcos represent the tip of the sword. Many industries, from banking to automotive to media, are also looking to leverage similar capabilities to drive market share and customer value.
The list of possible variations in service offerings that such enterprises might want to roll out to different segments of their respective customer bases is extensive. Similarly, the scale of their technical infrastructures, as well as of the personnel supporting them, goes well beyond their requirements from a mere handful of years ago.
On the one hand, this explosive growth in business demand for ephemerality and scale is driving the exceptionally rapid maturation of the Kubernetes ecosystem.
On the other hand, all this cutting-edge technology actually has to work. And that’s where cloud-native operations fits in.
The basics of cloud-native operations
Cloud-native computing takes the established “infrastructure as code” principle and extends it to model-driven, configuration-based infrastructure. Cloud-native also leverages the “shift-left,” immutable infrastructure principle as well as favoring extensibility over customizability, itself a model-driven practice.
Although a model-driven, configuration-based approach to software deployment is necessary for achieving the goals of cloud-native computing, it is not sufficient to address the challenges of ensuring the scale and ephemerality characteristics of deployed software in the cloud-native context.
Software teams must extend such configurability to production environments in a way that expects and deals with ongoing change in production. To this end, canary deployments, blue/green rollouts, automated rollbacks and other techniques are necessary to both deal with and take advantage of ongoing, often unpredictable change in production environments.
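To make the pattern concrete, here is a minimal sketch of the decision loop behind a canary rollout with automated rollback. It is illustrative only: the caller-supplied `get_error_rate` function stands in for a real monitoring query, and the traffic weights are arbitrary.

```python
def canary_release(get_error_rate, steps=(10, 25, 50, 100), max_error_rate=0.05):
    """Shift traffic to a canary in stages. At each step, consult the
    observed error rate; roll back automatically if it exceeds the
    threshold, otherwise keep promoting until 100% of traffic is shifted.
    `get_error_rate` is a stand-in for a real metrics query."""
    for weight in steps:
        rate = get_error_rate(weight)
        if rate > max_error_rate:
            return ("rolled-back", weight)  # automated rollback at this step
    return ("promoted", 100)
```

A real controller, such as a progressive-delivery operator, would also shift the actual traffic weights and pause between steps; the essence is the loop itself: observe, compare against a threshold, then promote or roll back.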
Abstracting across different production environments is also an important challenge. Whether it be different public clouds, different Kubernetes distributions, or hybrid IT challenges that mix cloud and on-premises environments (perhaps for compliance reasons), cloud-native release orchestration must abstract such differences in order to provide a coherent, configuration-based approach to automating deployments across such variations.
Dependency management is also essential. Whether it be dependencies among individual microservices, or perhaps dependencies upon APIs that provide access to other types of software components, it’s important that unexpected dependencies don’t break the deployment, even when individual components are ephemeral.
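One way to keep unexpected dependencies from breaking a deployment is to compute the rollout order from declared dependencies and fail fast on cycles. A minimal sketch using Python’s standard library (the service names in the example are hypothetical):

```python
from graphlib import TopologicalSorter  # Python 3.9+; raises CycleError on circular deps

def deploy_order(dependencies):
    """Given a mapping of service -> set of services it depends on,
    return an order in which each service is deployed only after
    everything it depends on is already running."""
    return list(TopologicalSorter(dependencies).static_order())
```

For example, `deploy_order({"frontend": {"api"}, "api": {"db"}, "db": set()})` deploys the database first and the frontend last; a circular dependency raises an error before anything ships, rather than breaking mid-rollout.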
Finally, software teams must be able to deal with unprecedented scale. Kubernetes itself is built to scale, with an architecture that deploys microservices into containers, containers into pods, and pods into clusters – but clusters aren’t enough.
Enterprises are already working through the intricacies of multicluster Kubernetes deployments. Software teams must also consider groups of clusters and then “fleets” of groups of clusters. Such fleets would typically cover multiple regions or data centers, bringing additional challenges of massive scale to the cloud-native party.
GitOps: a cloud-native model for operations
In a useful oversimplification, the cloud-native community has boiled down everything organizations need to do to get Kubernetes running in full production into three “days.”
Day 0 is the planning day. Day 1 is when you roll out Kubernetes and the rest of your cloud-native ecosystem. Day 2 represents full operations at scale.
Dividing such a complex, interconnected set of tasks into three discrete days highlights one important fact: Day 2 has so far gotten short shrift. To provide adequate attention to Day 2 issues, the community has coined a term: GitOps.
GitOps is a cloud-native model for operations that takes into account all the concepts this article has covered so far, including model-driven, configuration-based deployments onto immutable infrastructure that supports dynamic production environments at scale.
GitOps gets its name from Git, the hugely popular open-source source code management (SCM) tool. Yet although SCM is primarily focused on the pre-release parts of the software lifecycle, GitOps focuses more on the Ops than the Git.
GitOps extends the Git-oriented best practices of the software development world to ops, aligning with the configuration-based approach necessary for cloud-native operations – only now, the team uses Git to manage and deploy the configurations as well as source code.
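At the heart of this approach is a reconciliation loop: an agent continuously compares the desired state checked into Git with the actual state of the cluster and converges the two. A simplified sketch of the diffing step, with state represented as plain dictionaries rather than real Kubernetes resources:

```python
def reconcile(desired, actual):
    """Diff the desired state (from Git) against the live state and
    return the plan an agent would apply to converge them."""
    return {
        "create": sorted(set(desired) - set(actual)),
        "delete": sorted(set(actual) - set(desired)),
        "update": sorted(name for name in desired.keys() & actual.keys()
                         if desired[name] != actual[name]),
    }
```

Tools such as Argo CD implement this loop against real cluster resources, which is why a Git merge request that changes the desired state is also the deployment mechanism.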
Such an approach promises to work at scale – even at the fleet level, since GitOps is well-qualified to abstract all the various differences among environments, deployments, and configurations necessary to deal with ephemeral software assets at scale.
GitOps also promises a new approach to software governance that resolves bottlenecks. In traditional software development (including Agile), a quality gate or change control board review can stop a software deployment dead in its tracks. Instead, GitOps abstracts the policies that lead to such slowdowns, empowering organizations to better leverage automation to deliver adequate software governance at speed.
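In practice, that means expressing gate criteria as policy-as-code that the pipeline evaluates automatically instead of waiting on a board meeting. A hedged sketch of such a gate (the policy names and release fields here are illustrative, not from any particular tool):

```python
def governance_gate(release):
    """Evaluate codified release policies and report any failures, so the
    pipeline can block or proceed without a manual review meeting."""
    policies = {
        "two approvals": release.get("approvals", 0) >= 2,
        "tests passed": release.get("tests_passed", False),
        "no critical CVEs": release.get("critical_cves", 1) == 0,
    }
    failed = [name for name, ok in policies.items() if not ok]
    return (len(failed) == 0, failed)
```

Because the checks run on every merge request, most releases pass in seconds, and only genuine policy violations escalate to a human.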
Vendors and open-source projects step up to the cloud-native plate
The beating heart of cloud-native computing is open-source software, so it’s only logical that open-source projects are spearheading efforts in cloud-native operations.
For instance, Argo CD is a declarative, GitOps-centric CD tool for Kubernetes. Similarly, Tekton is a flexible open source framework for creating CI/CD systems, allowing developers to build, test and deploy across cloud providers and on-premises systems.
In many ways, however, such projects are only pieces of the cloud-native operations puzzle, and it falls to the vendors to put the pieces together. To begin with, a number of vendors tout the model-driven configuration-based approach. Here are a few examples.
Digital.ai Software Inc., for example, takes a model-driven, scalable approach that makes changes simple to make and to propagate to all environments. With Digital.ai, developers don’t need to maintain complicated scripts or workflows for each deployment instance.
Octopus Deploy Pty Ltd. follows a similar approach, with model-driven ops configuration that provides simple configuration abstractions across heterogeneous environments, for example, on-premises as well as in the cloud.
With Octopus, instead of writing separate scripts for each environment, developers can put those scripts into Octopus and parametrize them, creating an abstracted configuration representation. Instead of separate CI/CD tooling, ops tooling and runbook automation, Octopus provides one deployment tool across all tools, environments and platforms.
Similar to Octopus, ShuttleOps Inc. encapsulates a host of connectors and its own coded application and infrastructure configurations under the covers, parametrizing them as steps in the pipeline workflow. It then reports results to the orchestration and management tools of choice.
CircleCI (Circle Internet Services Inc.) and Cloudbees Inc. are two other vendors that represent a full deployment via declarative configuration files.
Many vendors also resolve the interdependencies among microservices (as well as other components) in production. Cloud66 Inc. enables developers and architects to define service dependencies in an abstracted but deterministic fashion. Those dependencies define the workflows that operations must manage.
Cloud66 can then tell developers when they need a new version of a particular piece of software in order to resolve such dependencies, and it also tells operators what they need to do to support it.
Harness Inc. offers what it calls a “continuous delivery abstraction model” that uses templates to eliminate dependencies. The CDAM resolves the impact of upstream and downstream microservices dependencies with automatic rollbacks.
GitOps in action
Several vendors pull together the cloud-native operations story with a GitOps offering.
At WeaveWorks Inc., GitOps is context-aware, leading to a model of the entire system that represents its desired state. WeaveWorks supports multiple variations (for example, a custom platform as a service on-premises) as part of the same comprehensive model. WeaveWorks leverages a distributed database for configurations that supports potentially millions of clusters and works in high-latency and occasionally disconnected environments.
GitLab Inc. is another vendor with explicit GitOps support. GitLab offers a single platform that takes an infrastructure as code approach, defining configurations and policies as “code” while leveraging automation to apply changes with Git merge requests.
This automation support in GitLab resolves many governance issues, as it leads to approvals with fewer bottlenecks. GitLab’s GitOps strategy is all about automation, for example, automated rollbacks. GitLab also supports release evidence, which provides an audit trail of everything included in each release along with associated metadata.
D2IQ Inc. touts its own flavor of GitOps it calls GitNative, which combines GitOps and Kubernetes-native CI/CD. The goal is to maximize speed, scale, and quality via full-lifecycle Git automation from DevOps to GitOps to GitNative.
D2IQ takes an immutable infrastructure approach that leverages Kubernetes APIs and primitives. Its platform is both serverless and stateless, and it also works on-premises. D2IQ leverages both the Argo CD and Tekton open-source projects.
A final GitOps-centric vendor is Codefresh Inc., which uses Git as the single source of truth, automating and securing pull requests and deployments. It handles source code provenance and supports multiple regions.
Sailing with the fleet
Where the rubber hits the road with Day 2 Kubernetes deployments is whether they will handle massive scale – scale on the order of millions of clusters.
Several vendors tout such capabilities. WeaveWorks offers cluster management that runs on the customer’s choice of managed Kubernetes platform plus application management, including release automation and progressive CD that scales to fleets.
Vamp.io BV leverages Kubernetes-based environments to provide release orchestration for applications that consist of large numbers of ephemeral microservices. This vendor offers release orchestration for DevOps that fully automates releases, including A/B testing, fine-grained segmentation and multitenant releases.
Rancher Labs Inc., soon to be part of SUSE, offers GitOps at scale. It deals well with large numbers of heterogeneous nodes, including clusters, cluster groups and fleets. D2IQ also touts a single pane of glass for managing fleets of Kubernetes clusters.
Intent-based operations
A few vendors are also tackling the difficult challenge of ensuring that code in production continues to meet the business need – even when that code is inherently dynamic and ephemeral. I call this capability “intent-based operations.”
On this list: the Keptn open-source project from Dynatrace LLC. Keptn produces a remediation file that automates the remediation of code in production as it drifts from its intended purpose. This remediation also allows for graceful failure in an automated fashion.
Keptn validates whether a particular remediation action works and, if not, it tries another one. Dynatrace calls this automated iterative approach to remediation “micro-operations.”
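The pattern behind such micro-operations can be sketched in a few lines: try each remediation action in order, validate after each attempt, and give up gracefully if none restores health. This is an illustrative sketch of the iterative pattern, not Keptn’s actual API.

```python
def remediate(actions, is_healthy):
    """Apply remediation actions one at a time, validating after each.
    Return the name of the action that restored health, or None if all
    actions were exhausted (graceful, automated failure)."""
    for action in actions:
        action()
        if is_healthy():
            return action.__name__
    return None
```

For instance, the action list might be “restart the pod, then scale out, then fail over”; the loop stops at the first action whose validation passes, and a `None` result signals that human operators must step in.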
Harness’s GitOps approach also includes continuous verification across performance, quality and revenue, with automatic rollbacks – another example of intent-based operations.
Finally, Vamp leverages metrics from production traffic to provide continuous validation, ensuring released code meets requirements on a continual basis.
The Intellyx take
It is tempting for anyone in a “traditional” enterprise to look at the massive scale and ephemerality of cloud-native deployments and wonder whether their organization would ever need software that follows such patterns, which differ so dramatically from most of the software in today’s enterprise environments.
While it’s true that industry needs will vary, and individual companies will face different challenges from their competitors, no one should be too confident that the Day 2 vision this article lays out won’t apply to them.
Remember, if a technical capability becomes available that improves the ability for certain organizations to roll out differentiated products and services that meet customer needs, then their competition must also leverage similar capabilities or risk becoming uncompetitive and, in the end, failing to survive.
In other words, cloud-native computing is here. It’s already delivering massive scale and ephemerality to enterprises that are leveraging such capabilities to deliver differentiated products and services to their respective markets. If your organization doesn’t jump on this bandwagon as well – and quickly – your future is in question. Don’t be left behind.
Jason Bloomberg is founder and president of Intellyx, which publishes the Cloud-Native Computing Poster and advises business leaders and technology vendors on their digital transformation strategies. He wrote this article for SiliconANGLE. (* Disclosure: At the time of writing, Digital.ai and Dynatrace are former Intellyx customers. None of the other organizations mentioned in this article is an Intellyx customer.)