Increased demand for certificates when using an Istio service mesh solution often leads to the proliferation of unapproved self-signed certificate authorities, which opens up security risks. Venafi's Firefly product is designed to address this issue by providing a compliant Istio Certificate Authority (CA) solution. Firefly enables PKI administrators to maintain policy control and visibility over mesh workload identities, ensuring that all mesh traffic is rooted within an enterprise's trust domain. As a rapid issuer, Firefly can be used as an in-cluster Kubernetes controller or externally on an isolated virtual machine (VM).
In this blog we'll show you the setup process for Firefly using Red Hat's Istio environment, based on configuring Venafi TLS Protect Cloud and Kubernetes cluster components such as Firefly, istio-csr and the Red Hat Istio service mesh. We will discuss the importance of using root trust within a service mesh, emphasizing the need to work with trust domains. You can use the guidance provided here to learn about the scalability and security benefits of using Firefly as the Istio CA and its importance for building a Kubernetes zero-trust environment.
Why we need an alternative to self-signed certificates
Unofficial, unregulated self-signed or "shadow" certificate authority (CA) solutions are increasingly common. They are used to meet the growing demand for mTLS certificates, whether for a custom mTLS connection or for broader cluster applications in a service mesh.
Notably, the Istio service mesh employs its own self-signed CA, which is generated by the control plane and stored within a Kubernetes secret known as "istio-ca," referred to as the Citadel CA. Because this root CA is unmanaged and self-signed, it creates several challenges: the risk of certificate spoofing by anyone who can access the CA, complications during system upgrades, and difficulties in achieving multi-cluster connectivity due to the absence of a link to any enterprise root of trust.
This is even officially recognized as an anti-pattern for the security of the mesh by the NIST-SP800-204A directive, specifically section SM-DR12.
Get the advantages of Istio’s zero trust infrastructure without the risks
We have designed Venafi Firefly from the ground up to elegantly address this use-case and many other security use cases. Firefly, along with the open source istio-csr from the cert-manager project, gives the platform engineering team a straightforward way to get a compliant Istio CA solution while giving the PKI admin policy control and visibility over the mesh workload identities.
This has the advantage that all your meshes are rooted in your enterprise root of trust, which enables flexible multi-cluster patterns. Venafi Firefly is an extremely versatile machine identity issuer: it can run as a native in-cluster Kubernetes controller, or externally, for example in an isolated VM.
Venafi Firefly Deployment Patterns
The most convenient method for issuing certificates in a Kubernetes environment is to operate within the cluster. This approach ensures a high level of security by employing an ephemeral sub-CA within the cluster, while the main CA certificate remains securely stored in restricted memory. As a result, these CAs are treated as dispensable entities rather than cherished possessions. The TLS trust mechanism relies on placing trust in the next higher level, offering flexibility in defining the appropriate trust domain, which will be discussed further in a later section.
The diagram below highlights the range of valid configurations for using Firefly to sign different kinds of Kubernetes workloads.
Figure 1: In-cluster Firefly Kubernetes use-cases
Using Firefly to sign Red Hat OpenShift workload certificates using a built-in CA
In this blog post, we will look in detail at the following PKI solution for the Red Hat service mesh (formerly known as Maistra), which is itself a flavor of Istio. Istio is of course a popular mesh solution that we regularly come across when working with customers.
Figure 2: Istio built-in CA use-case
This setup demonstrates how to replace the Istio Citadel CA by leveraging the istio-csr project from cert-manager and plugging it into Firefly to provide highly scalable issuance that can easily handle clusters with tens of thousands of pods.
To keep things simple, we will leverage the built-in CA as the sub-CA provider in TLS Protect Cloud. The Kubernetes setup is the same regardless of the sub-CA provider.
Changing sub-CA providers in TLS Protect Cloud to use more robust options is easy and requires a simple configuration change (see this example using Zero Touch PKI). You can also connect back to your TLS Protect Datacenter environment (formerly known as TPP) by leveraging the connector under the TLS Protect Datacenter tab to sign your sub-CAs.
Venafi has a 30 day trial of Firefly and TLS Protect Cloud, and you can sign up to create your TLSPC service and use it to follow this tutorial.
Solution pre-requisites
- An OpenShift Kubernetes cluster which has outbound access to the internet, specifically using a tenant from the TLS Protect Cloud domain.
- A valid Venafi Cloud Firefly account. You can sign up for a 30 day trial here
- Network access to our public ECR repository domain (registry.venafi.cloud/public/venafi-images) to pull our public image and OCI chart
- cert-manager is assumed to be already installed in this cluster
- An RSA public private pem key pair, which can be generated using OpenSSL for example:
openssl genrsa -out key.pem
openssl rsa -in key.pem -outform PEM -pubout -out public.pem
Let’s get started.
TLS Protect Cloud setup
1. Create service account for authentication
The first step is to submit the public key (generated above) to create a service account in the Firefly UI. Please refer to the create a new service account documentation for more details.
Once you have done so, please note the ClientID of that service account, as we will use it later.
2. Create workload certificate policy
For Istio certificates, create a new policy in the TLSPC Policies tab and set the policy fields as shown below:
Figure 3: In-cluster Firefly Kubernetes policy (1st part)
continued..
Figure 4: In-cluster Firefly Kubernetes policy (2nd part)
You should set the leaf certificate validity duration to a reasonably short period; this will be the validity for all workloads signed using this policy. By default, Istio uses 1 day for workload certificate validity, but we can go further and use 1 hour according to the NIST-SP800-204A directive (specifically section SM-DR12).
When we choose the 1-hour duration at the policy level it will force Istio to renew all the workload certificates every half-hour, and the Firefly certificates will also be renewed in that time frame.
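To get a feel for what this means for issuance volume, here is a back-of-the-envelope sketch in Python. It assumes, as described above, that renewal happens at half of the certificate lifetime. The function names and the 10,000-pod figure are ours and purely illustrative, and the rate is an average that ignores bursts such as rollouts.

```python
from datetime import timedelta


def renewal_interval(validity: timedelta, renew_fraction: float = 0.5) -> timedelta:
    """Time between renewals, assuming renewal at a fixed fraction of the lifetime."""
    return validity * renew_fraction


def issuance_rate_per_second(pods: int, validity: timedelta,
                             renew_fraction: float = 0.5) -> float:
    """Average signing requests per second if every pod renews once per interval."""
    return pods / renewal_interval(validity, renew_fraction).total_seconds()


if __name__ == "__main__":
    one_hour = timedelta(hours=1)
    print(renewal_interval(one_hour))                          # 0:30:00
    print(round(issuance_rate_per_second(10_000, one_hour), 2))  # 5.56
```

In other words, a hypothetical mesh of 10,000 pods with 1-hour certificates would average a handful of signing requests per second, which is why the issuer needs to be fast and close to the workloads.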
The policy engine is quite flexible: it allows you to set most key CSR parameters as either allowed but optional, or required. Fields can use a plain domain syntax or a regex syntax using the caret sign (^).
SPIFFE SAN
- We set the URI SPIFFE SAN since Istio follows the SPIFFE standard for workload identity. SPIFFE decouples identity from the DNS or IP layer, and a SPIFFE URI SAN usually indicates that some kind of attestation took place to give a workload that identity. In our case the format looks like
spiffe://cluster.local/ns/k8s-namespace/sa/svc-acct-name
which means that each workload has been authenticated to prove that it is part of your cluster trust domain cluster.local and that it uses the service account called svc-acct-name in namespace k8s-namespace. In our case this attestation will be performed by the istio-csr component.
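To make the structure of that URI concrete, here is a minimal Python sketch that splits an Istio-style SPIFFE ID into its three parts. The IstioIdentity type and the parse_istio_spiffe_id function are names we made up for illustration; this is not how istio-csr or Firefly parse identities internally.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class IstioIdentity(NamedTuple):
    trust_domain: str
    namespace: str
    service_account: str


def parse_istio_spiffe_id(uri: str) -> IstioIdentity:
    """Split spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {uri!r}")
    segments = parts.path.strip("/").split("/")
    # Expect exactly: ns/<namespace>/sa/<service-account>
    if (len(segments) != 4 or segments[0] != "ns" or segments[2] != "sa"
            or not segments[1] or not segments[3]):
        raise ValueError(f"not an Istio workload identity: {uri!r}")
    return IstioIdentity(parts.netloc, segments[1], segments[3])


if __name__ == "__main__":
    ident = parse_istio_spiffe_id(
        "spiffe://cluster.local/ns/k8s-namespace/sa/svc-acct-name")
    print(ident.trust_domain)     # cluster.local
    print(ident.namespace)        # k8s-namespace
    print(ident.service_account)  # svc-acct-name
```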
Istio key usages
- Since we will be using mutual auth, TLS expects the Client authentication and Server authentication extended usages to be set for all the workload certificates. For more information about the mechanism of policy please refer to the Firefly policy docs
3. Create sub-ca provider
Create a custom sub-CA provider leveraging the Venafi Built-In CA with a duration longer than the duration of the leaf certificate. In our case for example we’ve set it to 7 days. Note that since the sub-CAs are ephemeral and fully automated, the lifetime has no effect on operations and can be very short.
You can read more on using Built-In CAs here
4. Create configuration
We can tie the sub-CA provider, service account and policy created in the previous three steps together in a “Configuration” in the TLS Protect Cloud UI as explained in reference docs.
Figure 5: In-cluster Firefly control plane configuration
We use the None auth method because we will be running Firefly as a Kubernetes controller and not as a server. The instance metadata authentication can also be ignored for this particular configuration as it is irrelevant.
We have completed the TLS Protect Cloud side of the setup. We’re now ready to configure Firefly in-cluster, istio-csr and finally Istio!
Kubernetes-side setup
1. Firefly Helm chart installation
Before installing a Firefly instance with the Helm chart, create a secret containing the service account private key from the pre-requisites section above:
oc new-project venafi
oc create secret generic -n venafi venafi-credentials --from-file=svc-acct.key=key.pem
For OpenShift specifically, we need to create the following SCC because Firefly requires the IPC_LOCK capability. Firefly uses this capability to keep the ephemeral intermediate crypto material in locked memory pages.
kind: SecurityContextConstraints
apiVersion: security.openshift.io/v1
metadata:
  name: firefly
  namespace: venafi
allowPrivilegedContainer: false
runAsUser:
  type: MustRunAsNonRoot
seLinuxContext:
  type: RunAsAny
allowedCapabilities:
- IPC_LOCK
And then create the binding to the Firefly service account:
oc adm policy add-scc-to-user firefly -z firefly -n venafi
Finally, assuming cert-manager is installed without approver-policy, which is the default, we can deploy the Firefly chart as an in-cluster deployment with the following values:
# -- Setting acceptTerms to true is required to consent to the terms and conditions
acceptTerms: true
deployment:
  image: registry.venafi.cloud/public/venafi-images/firefly
  # -- Toggle for running the firefly controller inside the kubernetes
  # cluster as an in-cluster Certificate Authority (CA).
  enabled: true
  # -- (string) REQUIRED: The ClientID of your TLS Protect Cloud service account associated with the desired configuration.
  venafiClientID: "<REPLACE_WITH_CLIENT_ID_OF_THE_SERVICEACCOUNT>"
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      add: ["IPC_LOCK"]
    readOnlyRootFilesystem: true
    runAsNonRoot: true
    runAsUser: 1001
crd:
  # -- Installs the CRD in the cluster. Required to enable firefly with
  # the given group.
  enabled: true
  # -- Group name of the issuer.
  groupName: firefly.venafi.com
  approver:
    # -- Enable or disable the creation of a ClusterRole and ClusterRoleBinding
    # to allow an approver to approve CertificateRequest resources which use
    # the Firefly issuer group name.
    enabled: true
    # -- The subject which will be granted permission to approve
    # CertificateRequest resources which use the Firefly issuer group.
    subject:
      kind: ServiceAccount
      namespace: cert-manager
      name: cert-manager # change to approver policy if using that instead
Save the above values to a file, for example firefly-values.yaml, replace the client ID, then run:
helm upgrade -i -n venafi --create-namespace firefly \
  oci://registry.venafi.cloud/public/venafi-images/helm/firefly --version v1.2.0 \
  -f firefly-values.yaml
This will spin up a Firefly instance. You can confirm that it is ready by looking for the following message in its logs:
I0713 13:55:15.444872 1 vaas.go:123] agent/bootstrap/vaas "msg"="issued intermediate certificate from VaaS" "CN=firefly.."
This confirms that it bootstrapped itself an issuer certificate successfully!
2. Istio-csr setup
Once the Firefly instances are ready, it’s time to deploy istio-csr. istio-csr will translate Istio certificate requests into cert-manager requests. The key parameter needed for Firefly to sign those certificates is the correct Issuer group name, which, as we can see from above, is set to firefly.venafi.com.
You also need to specify an annotation on generated certificates to select which policy will be applied for this certificate. This is handled in the chart by supplying the policy name as one of the additionalAnnotations. The other thing you will need to establish is the root of trust for Istio: this should be the built-in CA intermediate certificate, which you can download from the UI. We explain the root of trust more deeply in this later section.
With that in mind, here are the istio-csr values you should use for this scenario:
replicaCount: 3
image:
  repository: quay.io/jetstack/cert-manager-istio-csr
  tag: v0.7.0
  pullPolicy: IfNotPresent
app:
  certmanager:
    namespace: istio-system
    preserveCertificateRequests: false
    additionalAnnotations:
    - name: firefly.venafi.com/policy-name
      value: <REPLACE_WITH_POLICY_NAME_FROM_TLSPROTECT_CLOUD>
    issuer:
      group: firefly.venafi.com
      kind: Issuer
      name: firefly-istio
  controller:
    configmapNamespaceSelector: "maistra.io/member-of=istio-system"
    leaderElectionNamespace: istio-system
  istio:
    namespace: istio-system
    revisions: ["basic"]
  tls:
    trustDomain: cluster.local
    certificateDNSNames:
    - istio-csr.istio-system.svc
    - cert-manager-istio-csr.istio-system.svc
    rootCAFile: /etc/tls/root-cert.pem
  server:
    maxCertificateDuration: 1h
    serving:
      address: 0.0.0.0
      port: 6443
# -- Optional extra volumes. Useful for mounting custom root CAs
volumes:
- name: root-ca
  secret:
    secretName: root-cert
# -- Optional extra volume mounts. Useful for mounting custom root CAs
volumeMounts:
- name: root-ca
  mountPath: /etc/tls
Before you apply the above, you need to create the root of trust. You can download this from CA Account -> Built-in CA -> Download chain -> Root certificate first.
Take the first certificate that you downloaded (the root) and save it to a file named root.pem.
This will be the mesh’s root of trust and will be managed by istio-csr through the usual istio-ca-root-cert configmap.
You can now run the installation:
$ oc new-project istio-system
$ oc create secret generic -n istio-system root-cert --from-file=root-cert.pem=root.pem
$ helm upgrade -i -n istio-system cert-manager-istio-csr jetstack/cert-manager-istio-csr -f istio-csr-values.yaml
You should see the pod become ready in a moment; this already means that Firefly was able to issue certificates for istio-csr's deployment.
Finally, we can install Red Hat Service Mesh.
3. Red Hat Service Mesh installation with istio-csr
Install the Red Hat Service Mesh from OperatorHub if you haven’t already. Then you can use the following servicemeshcontrolplane object which is obtained here.
You might want to download this and adjust the servicemeshcontrolplane as you see fit.
You will notice that the only parameter that matters for mesh identity is the spec.security.certManager key which tells the Red Hat Service Mesh where to contact the istio-csr server to issue the mesh identities.
$ oc new-project bookinfo # we pre-create the bookinfo namespace because of the servicemeshmemberroll
$ oc apply -n istio-system -f mesh.yaml
Alternatively, you can install the service mesh control plane from the UI with the following parameters set for the identity of the mesh:
Figure 6: In-cluster Firefly Service Mesh operator configuration
You can now deploy a workload like the bookinfo application and check it has the right issuer:
$ oc apply -f https://raw.githubusercontent.com/maistra/istio/maistra-2.4/samples/bookinfo/platform/kube/bookinfo.yaml -n bookinfo
$ oc get po -n bookinfo
NAME                              READY   STATUS    RESTARTS   AGE
details-v1-76d4d44995-xhh6v       2/2     Running   0          5s
productpage-v1-65b44c79c9-l46m2   2/2     Running   0          5s
ratings-v1-6ff8fd9bcf-nrfmt       2/2     Running   0          5s
reviews-v1-575cf6648f-9bstq       2/2     Running   0          5s
reviews-v2-b85b4cf85-k7vqf        2/2     Running   0          5s
reviews-v3-55f9d7445c-2dl7s       2/2     Running   0          5s
Our new pod now has an identity signed by our approved TLS Protect Cloud built-in CA and this can scale to a mesh of tens of thousands of pods without sacrificing performance.
$ istioctl pc secret details-v1-76d4d44995-xhh6v -n bookinfo -o json | jq -r '.dynamicActiveSecrets[0].secret.tlsCertificate.certificateChain.inlineBytes' | base64 --decode | openssl x509 -noout -text -in /dev/stdin
You should also see the ultra short certificate count under Venafi management increase in the TLS Protect Cloud UI:
Figure 7: In-cluster Firefly certificate count
You might notice multiple entries with the same name which can correspond to multiple replicas or restarts of Firefly. This is normal and we’re working to improve the UX for the stats there.
A note on root of trust
The root of trust has to be at least one level up from the Firefly intermediate and we often recommend the root. We will attempt to explain our reasoning in this paragraph.
Imagine a trust chain like this:
[root] -> [intermediate1] -> [intermediate2] -> [Firefly's ephemeral intermediate] -> [Firefly signed workload certificate]
(our notation means root signed intermediate1, which in turn signed intermediate2, etc.)
The most important thing is that you should not include Firefly’s own intermediate as your root of trust for the proper function of Firefly. You should at least use intermediate2 or a higher trusted intermediate in your root of trust (intermediate1 or root).
The Firefly sub-CA is an ephemeral intermediate, usually lasting weeks at the most. It is replaced whenever the certificate expires, or when the Firefly pod dies or loses leader election. Because you trust intermediate2, which sits above all the ephemeral intermediates, each replacement is trusted automatically and Firefly continues to run reliably.
We often get asked about how this works from the TLS perspective. The way TLS checks certificate chains is that it walks up from the leaf certificate until it finds a certificate it has within its trusted set of certificates. This is called the Trust Store. In Istio the istio-ca-root-cert configmap is the Trust Store of the mesh.
When we add intermediate2 to our Trust Store we are being relatively broad in that any intermediate signed by intermediate2 will be trusted, which includes for example Firefly instances in other clusters that have been configured to use intermediate2.
This is crucial for cross-cluster connectivity use-cases.
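The chain walk can be sketched as a toy model in Python. Certificates are reduced to names and a "signed by" lookup; there is no cryptography, expiry or revocation here, and the names firefly-ephemeral-a and firefly-ephemeral-b are hypothetical stand-ins for two Firefly sub-CAs (for example after a restart, or in a second cluster).

```python
# Toy model: each certificate name maps to the name of the CA that signed it.
ISSUER_OF = {
    "root": "root",  # self-signed
    "intermediate1": "root",
    "intermediate2": "intermediate1",
    "firefly-ephemeral-a": "intermediate2",
    "firefly-ephemeral-b": "intermediate2",
    "workload-a": "firefly-ephemeral-a",
    "workload-b": "firefly-ephemeral-b",
}


def is_trusted(leaf: str, issuer_of: dict, trust_store: set) -> bool:
    """Walk up from the leaf until a certificate in the trust store is found."""
    seen = set()
    current = leaf
    while current is not None and current not in seen:
        if current in trust_store:
            return True
        seen.add(current)
        current = issuer_of.get(current)
    return False


if __name__ == "__main__":
    # Trusting intermediate2 covers every ephemeral sub-CA beneath it.
    print(is_trusted("workload-a", ISSUER_OF, {"intermediate2"}))        # True
    print(is_trusted("workload-b", ISSUER_OF, {"intermediate2"}))        # True
    # Trusting one ephemeral sub-CA breaks as soon as another one issues.
    print(is_trusted("workload-b", ISSUER_OF, {"firefly-ephemeral-a"}))  # False
```

The last line illustrates why the ephemeral intermediate itself should not be the root of trust: a workload signed by its replacement no longer verifies.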
Trusting intermediate2 can pose its own set of challenges, particularly influenced by the Certificate Authority's lifespan. For instance, if the CA's duration is relatively short, such as two years, we may encounter complications related to CA rotation in the near future. Security decisions often entail these sorts of trade-offs. In this context, the trade-off emerges between convenience and enhanced isolation. The most convenient configuration would involve placing trust in the root CA, which typically boasts a longer lifespan of 10 to 30 years. This practical choice implies that we only need to be concerned about root rotation in the event of misconfigurations or, even worse, security breaches.
Alternatively, we may not want to be so broad in what we allow, and instead set strict trust domain boundaries between our different organizational units, environments, etc. For example, another Firefly instance for an isolated organizational unit could be using another intermediate, intermediate3, which is signed by intermediate1. This forms a new trust chain:
[root] -> [intermediate1] -> [intermediate3] -> [Firefly's ephemeral intermediate] -> [Firefly signed workload cert]
In this case the certs for this mesh are still isolated from the ones signed by intermediate2, as long as both use their respective intermediates in their Trust Stores, since any identity from the other mesh will not verify.
This would still allow both meshes to adjust their trust levels and begin trusting intermediate1.
The meshes can later be connected, but this is not the default setting. In this more secure configuration, it is advisable not to include the root or 'intermediate1' in our trust store if we want to enhance trust domain isolation. Sometimes, operators unintentionally add them!
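As a rough illustration of this isolation, the Python sketch below models the two chains as simple lists and asks whether each mesh can verify the other's workloads for a given pair of Trust Stores. The mesh names, sub-CA names and helper functions are hypothetical, and real verification involves signatures and validity checks that are left out here.

```python
# Each chain is listed from the root down to the workload (leaf) certificate.
CHAINS = {
    "mesh-a": ["root", "intermediate1", "intermediate2",
               "firefly-ephemeral-a", "workload-a"],
    "mesh-b": ["root", "intermediate1", "intermediate3",
               "firefly-ephemeral-b", "workload-b"],
}


def verifies(trust_store: set, chain: list) -> bool:
    """True if any CA above the leaf is present in the trust store."""
    return any(ca in trust_store for ca in chain[:-1])


def mutual_trust(store_a: set, store_b: set) -> bool:
    """True if mesh-a accepts mesh-b's workloads and vice versa."""
    return (verifies(store_a, CHAINS["mesh-b"])
            and verifies(store_b, CHAINS["mesh-a"]))


if __name__ == "__main__":
    # Each mesh trusts only its own intermediate: the meshes stay isolated.
    print(mutual_trust({"intermediate2"}, {"intermediate3"}))  # False
    # Both meshes move their trust up to intermediate1: they can now connect.
    print(mutual_trust({"intermediate1"}, {"intermediate1"}))  # True
    # An operator adds the root to one store by mistake: isolation is lost
    # in that direction even though the other mesh changed nothing.
    print(verifies({"intermediate2", "root"}, CHAINS["mesh-b"]))  # True
```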
Our recommendation is to use a single trust domain and employ authorization policies, unless there is a compelling regulatory need for heightened segregation, as is the case with PCI-DSS compliance.
If you decide to have stricter isolation between environments or organizational units based on the trust domain, we suggest selecting an intermediate with a relatively long lifespan, such as a 5-year certificate, to strike a balance between operational convenience and security.
Conclusion
Hopefully you’ve been able to follow along with this relatively straightforward setup, which describes a one-off process that will make both platform teams and Infosec extremely happy! For our part, we stand by this solution as the most scalable and secure implementation of an Istio CA to help you achieve a zero trust environment for Kubernetes workloads. We recommend this setup as the best way to configure Istio identities, and we are continuing to evolve this solution further, so we welcome your feedback. You can email the Venafi CRE team directly (team-cre@venafi.com) with any technical feedback on this process, or simply to request help with your own Istio environment.
Once you’re happy with this setup, upgrading the backend CA for production is very straightforward, as your PKI admin can swap the built-in CA and issue the sub-CAs from a range of sub-CA providers. They could use your existing TLS Protect Datacenter, Zero Touch PKI or even a direct MSCA connection. As you can see, Firefly has been designed for maximum flexibility.
For a production-ready and scalable solution to streamline and secure your Istio service mesh identity management, Venafi's Firefly not only simplifies the setup process but also ensures the highest level of trust and control over your certificates. To experience the merits of this stand-out Istio service mesh identity solution, you can explore Venafi's Firefly and TLS Protect Cloud using our dedicated resource center. Or reach out to us for a Firefly product demo and for help on your journey to enhanced security and efficiency in your cloud-native environments.