[Editor’s note: This post was written by Haoxiang Zhou who was a work placement student at our company for the past four months. We are grateful to Haoxiang for adding this very useful feature, and all his other contributions, and wish him all the best with his final year of study.]
This post will explore the newest addition to the kubectl plugin of cert-manager, kubectl cert-manager status certificate, a command designed to make troubleshooting cert-manager problems easier. The command was hugely improved in the recent v1 release. Jump to the bottom for more information on how to get involved and start contributing!
Why do we need this kubectl command?
Troubleshooting cert-manager has always been a recurring topic in the community. That is why the cert-manager team created the Troubleshooting guide, along with extra help for ACME Certificates, to help users identify the most common causes of errors. Even if users cannot find the solution themselves, these guides let them narrow down the problem and make seeking help more effective.
We have received a lot of positive feedback from users saying that the guides were very helpful. As a small bonus point, it made our team members’ lives easier as well. But we wanted to take it a step further.
The usual advice that we give users, and which is documented in the Troubleshooting guides, is to use kubectl describe on various resources in order to pinpoint where the error is happening.
For example, to find out why an ACME Certificate is not ready, in the worst case the user has to run kubectl describe on the Certificate resource, copy the name of the CertificateRequest, then run kubectl describe on the CertificateRequest. Then copy the name of the Issuer and run kubectl describe on the Issuer. Then do the same again for the Order, and again for the Challenges. In the example below I omitted parts of the actual output for the sake of readability.
$ kubectl get certificate
NAME READY SECRET AGE
acme-certificate False acme-tls 66s
$ kubectl describe certificate acme-certificate
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Issuing 90s cert-manager Issuing certificate as Secret does not exist
Normal Generated 90s cert-manager Stored new private key in temporary Secret resource "acme-certificate-tr8b2"
Normal Requested 89s cert-manager Created new CertificateRequest resource "acme-certificate-qp5dm"
$ kubectl describe certificaterequest acme-certificate-qp5dm
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal OrderCreated 7m17s cert-manager Created Order resource default/acme-certificate-qp5dm-1319513028
$ kubectl describe order acme-certificate-qp5dm-1319513028
[...]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 7m51s cert-manager Created Challenge resource "acme-certificate-qp5dm-1319513028-1825664779" for domain "example-domain.net"
$ kubectl describe challenge acme-certificate-qp5dm-1319513028-1825664779
[...]
Status:
Presented: false
Processing: true
Reason: error getting clouddns service account: secret "clouddns-accoun" not found
State: pending
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Started 8m56s cert-manager Challenge scheduled for processing
Warning PresentError 3m52s (x7 over 8m56s) cert-manager Error presenting challenge: error getting clouddns service account: secret "clouddns-accoun" not found
While this gets us what we want, speaking from personal experience, it can quickly become tedious. That is why, when the team discussed building a command-line tool that outputs all the information needed for troubleshooting as an addition to our kubectl CLI plugin, I gladly took on the task.
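To give a sense of what the manual chain involves, here is a minimal sketch of scripting the copy-the-name step between two kubectl describe calls. The helper name next_resource is hypothetical and not part of cert-manager or its plugin, and the event wording it matches is assumed from the sample output above, so it may differ between cert-manager versions.

```shell
#!/bin/sh
# Hypothetical helper (not part of cert-manager): read `kubectl describe`
# output on stdin and print the name of every resource of the given kind
# that a "Created ... <Kind> resource ..." event mentions.
# Assumption: event messages look like the samples in this post, e.g.
#   Created new CertificateRequest resource "acme-certificate-qp5dm"
#   Created Order resource default/acme-certificate-qp5dm-1319513028
next_resource() {
  kind="$1"
  # Capture the token after "<Kind> resource", dropping optional quotes,
  # then strip a leading "<namespace>/" prefix if one is present.
  sed -n "s/.*Created .*${kind} resource \"*\([^\" ]*\).*/\1/p" | sed 's#.*/##'
}

# Usage against a live cluster (requires kubectl and cluster access):
#   cr=$(kubectl describe certificate acme-certificate | next_resource CertificateRequest | tail -n 1)
#   order=$(kubectl describe certificaterequest "$cr" | next_resource Order | tail -n 1)
#   kubectl describe order "$order" | next_resource Challenge
```

Even with such a helper, each hop is still a separate kubectl call, and the script depends on the exact wording of event messages, which is one more reason to have a single command that collects everything.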
What does this kubectl command do?
To run the command, you need to have the cert-manager kubectl CLI plugin installed. You can either follow these steps in the cert-manager docs, or use krew, the package manager for kubectl plugins, to install the plugin.
Once the plugin is ready, you can run kubectl cert-manager status certificate <name-of-cert>. It will look for the Certificate named <name-of-cert> in the specified/default namespace, along with any related resources such as the CertificateRequest, Secret and Issuer, as well as the Order and Challenges if it is an ACME Certificate. The command outputs information about the resources, including Conditions, Events and resource-specific fields like Key Usages and Extended Key Usages of the Secret or Authorizations of the Order.
Taking the example from above:
$ kubectl cert-manager status certificate acme-certificate
Name: acme-certificate
Namespace: default
Created at: 2020-08-21T16:44:13+02:00
Conditions:
Ready: False, Reason: DoesNotExist, Message: Issuing certificate as Secret does not exist
Issuing: True, Reason: DoesNotExist, Message: Issuing certificate as Secret does not exist
DNS Names:
- example-domain.net
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Issuing 18m cert-manager Issuing certificate as Secret does not exist
Normal Generated 18m cert-manager Stored new private key in temporary Secret resource "acme-certificate-tr8b2"
Normal Requested 18m cert-manager Created new CertificateRequest resource "acme-certificate-qp5dm"
Issuer:
Name: acme-issuer
Kind: Issuer
Conditions:
Ready: True, Reason: ACMEAccountRegistered, Message: The ACME account was registered with the ACME server
Events: <none>
error when finding Secret "acme-tls": secrets "acme-tls" not found
Not Before: <none>
Not After: <none>
Renewal Time: <none>
CertificateRequest:
Name: acme-certificate-qp5dm
Namespace: default
Conditions:
Ready: False, Reason: Pending, Message: Waiting on certificate issuance from order default/acme-certificate-qp5dm-1319513028: "pending"
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal OrderCreated 18m cert-manager Created Order resource default/acme-certificate-qp5dm-1319513028
Order:
Name: acme-certificate-qp5dm-1319513028
State: pending, Reason:
Authorizations:
URL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/12345678, Identifier: example-domain.net, Initial State: pending, Wildcard: false
Challenges:
- Name: acme-certificate-qp5dm-1319513028-1825664779, Type: DNS-01, Token: exampleToken, Key: exampleKey, State: pending, Reason: error getting clouddns service account: secret "clouddns-accoun" not found, Processing: true, Presented: false
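When the output is long, it can help to skim it for the lines most likely to point at the problem. The following is a rough sketch only: status_problems is a hypothetical helper, and the patterns it looks for are assumptions based on the sample output above rather than a stable output format.

```shell
#!/bin/sh
# Hypothetical filter (not part of the plugin): keep only the lines of the
# status output that mention an error or a condition that is False.
# Assumption: conditions are printed as "Ready: False, ..." or
# "Issuing: False, ..." as in the sample output in this post.
status_problems() {
  grep -E '[Ee]rror|(Ready|Issuing): False'
}

# Usage against a live cluster (requires kubectl and the plugin):
#   kubectl cert-manager status certificate acme-certificate | status_problems
```

For the example above, this would surface the missing acme-tls Secret, the False Ready conditions, and the Challenge line that names the missing clouddns secret. When asking others for help, the full output is still the better thing to share.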
Conclusion
With this command, the user can dump all the information about a Certificate resource in one go when something is not going well, making it much simpler to find out what is causing trouble. It also makes it easier for others to help spot issues. For example, when seeking advice on Slack, the output of the status certificate command is a much better starting point than just “My certificate is stuck in pending state” or a pasted page of logs, which are usually far too verbose to be useful in this context, especially in production environments.
Finally, I encourage everybody to use this command as well as the other commands of the plugin. Feedback, bug reports, feature requests or even PRs are very much welcome, not only for the cert-manager kubectl plugin, but for anything cert-manager related!