Kubernetes-native technologies are those which are specifically designed to run on top of Kubernetes. This often means that their architecture follows the controller pattern, leveraging the Kubernetes API machinery for declarative configuration.
Resilient CI/CD is a critical concern for many organizations, especially at scale. Being able to leverage Kubernetes for CI/CD provides the necessary support for building a resilient, scalable solution and brings all the benefits of declarative configuration.
We regularly receive questions from customers related to managing infrastructure at scale and the CI/CD automation required to power such capabilities. As Kubernetes has matured, so has its ability to handle more of this logic natively, taking advantage of the machinery that drives typical application deployments. Here we describe an opinionated repository layout and Kubernetes-native CI/CD setup for managing a fleet of Kubernetes clusters. We first look at the set of tools used and then describe how they fit together, with a demo repository to show some of the lower-level details.
Tools Used
Here we describe key tooling which we will use to achieve Kubernetes-native CI/CD and declarative cluster management.
Tekton Pipelines
Tekton Pipelines (originally Knative Build) extends the Kubernetes API to support declarative pipelines and comes with five new resources: Tasks, Pipelines, TaskRuns, PipelineRuns and PipelineResources.
A Task describes a sequence of steps to be run in order and is implemented as a Kubernetes Pod, where each container in the Pod is a step in the Task. Pipelines define how Tasks are put together and can be run in any order you choose, including concurrently, passing inputs and outputs between them. TaskRuns and PipelineRuns can then reference Tasks and Pipelines respectively to invoke them; this allows Task and Pipeline definitions to be reused across runs. Alternatively, TaskRuns and PipelineRuns can specify Tasks and Pipelines inline. Finally, PipelineResources provide runtime information to those runs, such as the Git repository a run should be executed against, although this information can also be provided using parameters.
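To make this concrete, here is a minimal sketch of a Task and a TaskRun that references it (the resource names and message parameter are illustrative, and the apiVersion may differ depending on your Tekton release):

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: echo-message
spec:
  params:
    - name: message
      type: string
      default: hello
  steps:
    # Each step runs as a container in the Pod backing the TaskRun
    - name: echo
      image: alpine:3
      script: |
        #!/bin/sh
        echo "$(params.message)"
---
apiVersion: tekton.dev/v1beta1
kind: TaskRun
metadata:
  name: echo-message-run
spec:
  taskRef:
    name: echo-message   # reuse the Task definition above
  params:
    - name: message
      value: hello from a TaskRun

Creating the TaskRun causes Tekton to start a Pod that executes the step; PipelineRuns referencing Pipelines behave analogously, creating one Pod per Task.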
The Tekton Pipeline repository has extensive documentation if you would like to know more. In particular, I find Tekton’s capabilities for providing runtime credentials to be especially powerful.
We will use Tekton Pipelines to automate the generation of manifests for cluster addons in response to GitHub events.
We use the term cluster addon loosely to refer to any application deployed to a cluster to support developer workloads, for example Nginx Ingress Controller or cert-manager.
Lighthouse
Jenkins X is a full Kubernetes-native CI/CD solution that supports an opinionated end-to-end life cycle for building and deploying Kubernetes applications. Jenkins X is a very powerful tool; however, I have found it to be too heavyweight for some use cases. Fortunately, the Jenkins X folks have designed its components in a modular way so that they can be used in isolation.
Lighthouse is a component of Jenkins X and a webhook handler for Git provider events. Lighthouse is a fork of Prow, which is used to implement CI/CD for developing Kubernetes itself. Events received by Lighthouse from Git providers (for example PR creation from GitHub) can trigger Tekton Pipelines to automate actions (such as CI tests). Note that Lighthouse also supports triggering Jenkins pipelines.
The main advantage of Lighthouse over Prow is that it uses jenkins-x/go-scm and so supports a large number of Git providers (rather than just GitHub).
We will use Lighthouse to receive GitHub events and invoke the Tekton Pipelines that perform cluster addon manifest generation.
Flux
Flux v2 is a GitOps tool which hydrates and syncs manifests from a Git repository to a Kubernetes cluster. Flux v2 brought a whole host of new capabilities compared to v1, including the ability to declaratively apply Helm charts and Kustomize configuration. Config Sync, a component of Google’s Anthos platform, has similar capabilities to Flux v1.
We will use Flux v2 (from now on just Flux) to sync cluster addons and workload cluster declarations.
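To give a sense of what declaratively applying a Helm chart looks like, below is a hedged sketch of a HelmRepository and HelmRelease that Flux could use to install Nginx Ingress Controller (the chart version, namespaces and values are illustrative and not taken from the demo repository):

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: ingress-nginx
  namespace: flux-system
spec:
  interval: 1h
  url: https://kubernetes.github.io/ingress-nginx
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: ingress-nginx
  namespace: nginx
spec:
  interval: 10m
  chart:
    spec:
      chart: ingress-nginx
      version: "3.x"   # illustrative version constraint
      sourceRef:
        kind: HelmRepository
        name: ingress-nginx
        namespace: flux-system
  values:
    controller:
      replicaCount: 1

Flux's Kustomization resource, which we will see shortly in each cluster's sync configuration, works in a similarly declarative fashion.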
Cluster API
Cluster API brings declarative, Kubernetes-style APIs to cluster creation, configuration and management. We have covered the power of Cluster API in a previous post and in this post we will be using it to provision our Kubernetes workload clusters. There is already a large number of infrastructure providers to choose from depending on your environment. There are also some alternatives to Cluster API such as Google’s Config Connector, which can provision GKE clusters declared as Kubernetes resources, or Crossplane, which has similar capabilities to Config Connector but supports multiple clouds.
We will use Cluster API to declare workload clusters using an experimental Cluster API infrastructure provider that I wrote to create cluster Nodes as Pods to save on infrastructure costs.
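A workload cluster declaration then boils down to a handful of resources. The sketch below shows just the top-level Cluster object wired to the experimental provider; the infrastructure apiVersion and network CIDR are assumptions, and a full definition also includes the KubeadmControlPlane, machine templates and MachineDeployment resources seen later in the kubectl output:

apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: development
  namespace: infrastructure
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16   # assumed CIDR; must match the CNI configuration
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1alpha3
    kind: KubeadmControlPlane
    name: development
  infrastructureRef:
    apiVersion: infrastructure.dippynark.co.uk/v1alpha1   # API version assumed
    kind: KubernetesCluster
    name: development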
kfmt
kfmt is a small Go binary I wrote to format collections of manifests into a canonical structure, one manifest per file. The structure corresponds with the logical resource structure presented by the Kubernetes API, separating namespaced resources from non-namespaced resources and giving all resources in the same Namespace their own directory. kfmt makes it easier to manage and review changes to large collections of manifests, which is particularly useful when upgrading manifests for third-party tools.
We will use kfmt to format cluster addon manifests.
Bringing the Tools Together
Here we will describe how the tools listed above can be used together to manage configuration across a fleet of clusters. We will of course be using other tools, but they play a more minor role compared to the ones listed above. A demo repository has been created here which we will refer to for lower-level details.
The aim of this repository is to facilitate managing a large number of clusters in the simplest and DRYest way possible. In particular, we only use a single repository to manage all clusters. In addition, we want a way to pin clusters to particular Git references and to promote those references to downstream clusters, but have the flexibility to define cluster specific parameters.
Repository Walkthrough
As is typical when using Cluster API, we provision a management cluster (using gcloud in this case) to host our Cluster API controllers and components. Cluster API resources are then synced to this management cluster using Flux to provision workload clusters. Cluster addons are defined as Kustomize bases which are pulled into various cluster flavours to standardize the capabilities of the clusters an organization may want to support. In this case we only have two cluster flavours, management and workload; however, it is straightforward to define more for specific use cases.
Each cluster syncs configuration from its own directory using Flux. Each directory is created by running flux bootstrap against the corresponding cluster. The main difference between each cluster's directory is the cluster-sync.yaml configuration, which targets a particular repository reference (through the spec.ref.branch field of Flux's GitRepository resource) and Kustomize flavour (through the spec.path field of Flux's Kustomization resource) to install addons, providing cluster-specific parameters. Note that the name cluster-sync.yaml is not special outside of the context of this post; I just picked it to hold the configuration for syncing cluster addons.
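A cluster-sync.yaml might look something like the following sketch (resource names, intervals and the path are illustrative rather than copied from the demo repository):

apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: GitRepository
metadata:
  name: cluster-addons
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/dippynark/cicd-demo
  ref:
    branch: main   # replaced with a pinned Git reference when promoting downstream clusters
---
apiVersion: kustomize.toolkit.fluxcd.io/v1beta1
kind: Kustomization
metadata:
  name: cluster-addons
  namespace: flux-system
spec:
  interval: 10m
  path: ./flavours/development   # selects the Kustomize flavour for this cluster
  prune: true
  sourceRef:
    kind: GitRepository
    name: cluster-addons

The Kustomization can also carry patches for cluster-specific parameters, as we will see for the production cluster later on.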
To help with managing the Kustomize bases, the generate target in the Makefile runs the procedure to generate much of the addon configuration. When running generation locally, this target can be wrapped in a Docker container, which contains all the required dependencies, by running make docker_generate. Much of the resulting configuration is formatted using kfmt to give consistency and to make it easier to review any changes upon regeneration.
To automate this generation procedure we use Lighthouse and Tekton Pipelines; whenever a PR is created or changed (or a commit added to the main branch), a GitHub webhook event is sent to Lighthouse, which spins up a Tekton Pipeline. This Pipeline checks out the PR branch (or main branch), runs make generate (within the same container image as is used locally) and pushes any changes as a new commit. These changes will then be picked up by Flux when they reach the main branch.
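The actual PipelineRun used by the demo repository is referenced from its .lighthouse configuration (described below); purely to illustrate the shape of the automation, a condensed, hypothetical Task could look like this (the image names and committer identity are made up, and pushing assumes Git credentials are provided through Tekton's credential mechanism mentioned earlier):

apiVersion: tekton.dev/v1beta1
kind: Task
metadata:
  name: generate
spec:
  params:
    - name: branch
      type: string
  steps:
    # Steps in a Task share the /workspace directory
    - name: checkout
      image: alpine/git
      script: |
        #!/bin/sh
        set -e
        git clone --branch "$(params.branch)" https://github.com/dippynark/cicd-demo /workspace/repo
    - name: generate
      image: cicd-demo-builder:latest   # hypothetical image containing the generation dependencies
      workingDir: /workspace/repo
      script: |
        #!/bin/sh
        set -e
        make generate
    - name: push
      image: alpine/git
      workingDir: /workspace/repo
      script: |
        #!/bin/sh
        set -e
        git add --all
        if git diff --cached --quiet; then
          exit 0   # nothing was regenerated
        fi
        git config user.name cicd-bot            # hypothetical committer identity
        git config user.email cicd-bot@example.com
        git commit --message Generated
        git push origin HEAD:"$(params.branch)"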
An example PR demonstrating this process by upgrading Nginx Ingress Controller can be found here. We can see that the only change from a human is the initial commit bumping the version in the Makefile. This triggers a Tekton Pipeline which appends a commit to the PR with the commit message Generated containing the updated manifests of the new version. We can see the output of this process by visiting the Tekton Dashboard.
A really powerful aspect of Lighthouse is the ability to include the Pipeline configuration within the repository. .lighthouse/triggers.yaml defines the list of presubmit (PR) and postsubmit (merge) jobs to run, with the PipelineRun definitions themselves referenced through the source field. Documentation for configuring Pipelines within repositories to be invoked by Lighthouse can be found here.
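For reference, a trigger file along these lines might look roughly as follows (the apiVersion and field names should be checked against your Lighthouse version, and the job name and source file are hypothetical):

apiVersion: config.lighthouse.jenkins-x.io/v1alpha1
kind: TriggerConfig
spec:
  presubmits:
    - name: generate
      context: generate
      alwaysRun: true
      source: generate.yaml   # PipelineRun definition stored alongside this file
  postsubmits:
    - name: generate
      branches:
        - main
      source: generate.yaml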
To learn more about the capabilities of Lighthouse, see the Lighthouse documentation together with the management cluster's cluster-sync.yaml configuration, which specifies patches for the Lighthouse HelmRelease. In addition, the plugins directory of the Lighthouse repository contains a directory for each in-built plugin, which includes descriptions of the capabilities of each plugin.
Cluster Management
Here we look in more detail at the different aspects of the repository that contribute to cluster management.
The clusters target in the Makefile generates Cluster API configuration to be synced to the management cluster. This is applied to the infrastructure Namespace.
We can use the cluster-api category to see all the resulting resources after they are synced to the management cluster using Flux:
$ kubectl get cluster-api -n infrastructure
NAME AGE
kubeadmconfig.bootstrap.cluster.x-k8s.io/development-84qd5 3m2s
kubeadmconfig.bootstrap.cluster.x-k8s.io/development-xfxwh 3m42s
NAME AGE
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/development 3m44s
NAME PHASE REPLICAS READY UPDATED UNAVAILABLE
machinedeployment.cluster.x-k8s.io/development Running 1 1 1
NAME PROVIDERID PHASE VERSION
machine.cluster.x-k8s.io/development-6c5fb44c5c-z64cx kubernetes://infrastructure/development-worker-xfcft Running v1.17.17
machine.cluster.x-k8s.io/development-zcpw8 kubernetes://infrastructure/development-controller-vcptf Running v1.17.17
NAME PHASE
cluster.cluster.x-k8s.io/development Provisioned
NAME MAXUNHEALTHY EXPECTEDMACHINES CURRENTHEALTHY
machinehealthcheck.cluster.x-k8s.io/development 100% 2 2
NAME REPLICAS AVAILABLE READY
machineset.cluster.x-k8s.io/development-6c5fb44c5c 1 1 1
NAME AGE
kubernetesmachinetemplate.infrastructure.dippynark.co.uk/development-controller 3m43s
kubernetesmachinetemplate.infrastructure.dippynark.co.uk/development-worker 3m43s
NAME PHASE HOST PORT AGE
kubernetescluster.infrastructure.dippynark.co.uk/development Provisioned 34.77.174.110 443 3m43s
NAME PROVIDERID PHASE VERSION AGE
kubernetesmachine.infrastructure.dippynark.co.uk/development-controller-vcptf kubernetes://infrastructure/development-controller-vcptf Running v1.17.17 3m2s
kubernetesmachine.infrastructure.dippynark.co.uk/development-worker-xfcft kubernetes://infrastructure/development-worker-xfcft Running v1.17.17 3m42s
NAME AGE
clusterresourceset.addons.cluster.x-k8s.io/calico-addon 6m14s
NAME AGE
clusterresourcesetbinding.addons.cluster.x-k8s.io/development 2m1s
NAME INITIALIZED API SERVER AVAILABLE VERSION REPLICAS READY UPDATED UNAVAILABLE
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/development true true v1.17.17 1 1 1
Once a cluster definition has been applied to the management cluster, flux bootstrap is run through a simple CronJob. In particular, this bootstrap procedure creates the directory within the repository corresponding to the cluster. A ClusterResourceSet handles installing Calico CNI into workload clusters to allow the Flux components to schedule.
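A sketch of that ClusterResourceSet might look as follows; the resource name and namespace are taken from the kubectl output above, while the label selector and ConfigMap name are assumptions:

apiVersion: addons.cluster.x-k8s.io/v1alpha3
kind: ClusterResourceSet
metadata:
  name: calico-addon
  namespace: infrastructure
spec:
  clusterSelector:
    matchLabels:
      cni: calico   # assumed label; workload Cluster resources would need to carry it
  resources:
    - kind: ConfigMap
      name: calico-manifests   # assumed ConfigMap containing the rendered Calico manifests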
Note that CronJobs still experience a bug where, if they miss too many runs, they will never recover. This has been fixed in the v2 implementation of the CronJob controller.
Once a cluster has been bootstrapped it is associated with a flavour so that the corresponding addons can be synced. For development clusters, which sync addons directly from the main branch, a Kustomization resource is used to pick out the flavour (for example the development cluster's cluster-sync.yaml). For clusters downstream of development, we can promote to them, for example:
make docker_promote SOURCE=development DESTINATION=staging
If the destination cluster does not have its own cluster-sync.yaml, this will be copied from the source. The Git hash of the main branch (or whichever branch is targeted by the source cluster) will then be taken and set in the destination cluster's cluster-sync.yaml, effectively pinning the destination cluster's configuration to a particular point in time for a particular branch. This hash can then be propagated further downstream if desired (for example to a production cluster), allowing particular states to be tested and promoted as necessary. Further automation can be used to manage this promotion across a large number of clusters.
To illustrate this procedure more concretely, we will run through the steps and corresponding PRs for promoting the development cluster to staging, adding a production cluster and then promoting the staging cluster to production:
make docker_promote SOURCE=development DESTINATION=staging
- At the time this command was run the staging cluster had been freshly bootstrapped, so the development cluster's cluster-sync.yaml was copied in its entirety, but with the target branch set to the hash of the main branch at the time - PR: https://github.com/dippynark/cicd-demo/pull/20
- Once the PR is approved and merged, the Flux instance running on the staging cluster syncs the manifests corresponding to that hash to match the development cluster
- Note that all Flux instances on all clusters are syncing from the main branch (although using different paths); it is the configuration itself (i.e. the GitRepository resources) that provides the redirection to point at a particular Git reference
make docker_promote SOURCE=development DESTINATION=staging
- For illustration purposes, we promote to staging again. This time, only the branch hash is modified, which again corresponds to the main branch at the time the command was run - PR: https://github.com/dippynark/cicd-demo/pull/21
make docker_clusters
- After adding the production cluster to our list of clusters in the Makefile (within the clusters target), we generate and commit the corresponding manifests - PR: https://github.com/dippynark/cicd-demo/pull/22
- Once the PR is approved and merged, the Cluster API resources are synced to the management cluster and the production cluster is provisioned:
$ kubectl get kubernetesclusters -n infrastructure
NAME PHASE HOST PORT AGE
development Provisioned 34.77.174.110 443 2h
production Provisioned 35.189.198.11 443 61s
staging Provisioned 35.205.108.153 443 2h
make docker_promote SOURCE=staging DESTINATION=production
- Once flux bootstrap has been run against the production cluster, we can pull the main branch (which now contains the production cluster's corresponding directory) and promote the staging cluster to production - PR: https://github.com/dippynark/cicd-demo/pull/23
- Note that this time we modify the Kustomization patches to increase the number of replicas of the Nginx Deployment in production to 2. These patches will be retained throughout subsequent promotions, which allows cluster-specific parameters to be set
Note that subsequent runs of make docker_promote SOURCE=staging DESTINATION=production will do nothing until we promote to staging again, since the hash pinned to the production cluster is now the same as that used by staging (i.e. the promotion procedure is idempotent). We can also access the production cluster to see that we have 2 replicas of Nginx deployed, instead of the 1 replica in the development and staging clusters:
CLUSTER_NAME="production"
make get_kubeconfig_$CLUSTER_NAME
export KUBECONFIG=kubeconfig
kubectl get pods -n nginx
This gives the following output:
NAME READY STATUS RESTARTS AGE
nginx-574b87c764-b6qgp 1/1 Running 0 115s
nginx-574b87c764-fwnc6 1/1 Running 0 115s
For large numbers of clusters, promotion could be managed in groups using repeated runs of make docker_promote, but no matter how this is done, the logic can all be managed through file manipulation within the repository rather than by interacting with the clusters directly.
Benefits
One of the main benefits of this setup is that the entire configuration for every cluster is maintained in a single repository. A common requirement we see from customers is ensuring clusters conform to organisational policy (described as Gatekeeper constraints, for example) and are managed consistently to reduce cluster sprawl, so being able to bring this common policy and consistency into cluster flavours is a useful capability. Any cluster-specific configuration (for example cluster-specific parameters and RBAC resources for user access) can be placed in the cluster-specific directory without affecting the DRY approach implemented by the flavours.
In addition, even though the setup supports pinning clusters to specific Git references, the current configuration for each cluster is visible in the main branch, giving a single view of the desired state of your entire fleet.
Drawbacks
The main drawbacks of this setup, in my opinion, are around visibility. Firstly, when using Kubernetes resources which in turn hydrate or generate further resources (for example Flux's HelmRelease or Kustomization resources), it can be tricky to see exactly what you are going to apply or change by looking at a PR. In addition, after changes have merged to the main branch, we do not immediately know whether Flux was able to apply them successfully or whether the resulting application rollouts completed successfully. For this reason it is critical that substantial monitoring is in place for Flux and other applications to help surface any problems that might occur when syncing changes.
This setup also doesn't have good support for promoting individual addons explicitly. Instead, entire references to cluster flavours are promoted, treating the entire collection of addons as a single unit. This can make changes easier to reason about but sacrifices flexibility.
The GitOps methodology can also be challenging to adhere to for certain applications. To give a specific example, when using Istio, applications with injected sidecars need to be rolled as part of a control plane upgrade. Managing such complex upgrade procedures without tedious ordered commits requires more logic to live within the cluster (for example the Istio Operator) which can come with a certain amount of engineering overhead, particularly for in-house applications. Flagger is a powerful tool that can help implement more complex upgrade procedures in a generic way.
Another drawback is that each Flux instance targets a separate directory. When you get to 100s or 1000s of clusters this may not be the nicest structure to deal with, even with good tooling. The setup could be modified to support directories that are synced to multiple clusters, but at the cost of cluster specific configuration being more complicated to manage.
A final drawback to mention is around safety; if the cluster resources defined in the management cluster are accidentally deleted then all of the workload clusters would be deleted which could be catastrophic. It may be prudent then to deploy multiple management clusters that each define a failure domain.
Closing Remarks
In this post we have described one way of managing a fleet of Kubernetes clusters using Kubernetes-native tooling and the GitOps methodology.
There are many more potential use cases for the tools mentioned in this post. For example, the kfmt tool used to format manifests, as described above, uses a Tekton release Pipeline (triggered by Lighthouse) to run CI tests before building and pushing an image in-cluster using kaniko and creating a GitHub release. In addition, an interesting extension to the demo repository would be to trigger a Pipeline to promote the development cluster to staging whenever the repository is tagged with a semantic version tag (see the example Lighthouse TriggerConfig used by kfmt here).
More inspiration for how to use Tekton Pipelines can be found by looking through the Tekton Catalog, which contains community-maintained Tasks that can be used as part of your Pipelines.