Think of the last time you experienced a certificate-based outage. I bet it was painful, whichever point of view you experienced it from.
If you're on your organization's security team and responsible for PKI, you probably got blamed for the expired certificate and the resulting outage. While that's painful, the deeper pain is that you couldn't fix the problem. The expired certificate was on a web server, an application, a load balancer, or something else that someone else is responsible for. The most you could do was notify somebody about the expiring certificate. If you even knew who to notify, that is.
Now if you're the owner of that web server, application, or load balancer on which the certificate expired, I bet your experience was a painful one too. It's likely you didn't know a certificate was about to expire. Even if you did, you might not have had all the information you needed to replace it before it triggered a painful outage. Maybe that certificate was really installed on six web servers rather than the four that “Bob” told you about when he passed you management of them—just before he switched departments earlier this year. And then there’s all the time you took to troubleshoot why you had a system outage, while your system was unavailable. So. Much. Pain.
How did we get here?
There are lots of reasons why expired certificates cause outages on web servers, applications, load balancers and other systems. For a start, most organizations are simply using more certificates than they can keep track of manually. Digital transformation has resulted in more machines doing more things, meaning there are far more machine identities (that is, keys and certificates) in use than ever before. Businesses are also spinning up and relying on machine identities faster than ever as they take advantage of DevOps techniques and work in highly scalable cloud environments. To complicate matters further, shorter certificate lifespans mean certificates need to be renewed more frequently.
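To get a feel for how shorter lifespans multiply the renewal workload, here is a rough back-of-the-envelope sketch. The inventory size and the lifetimes are made-up illustration values, not figures from any customer, and the function name `renewals_per_year` is just a placeholder.

```python
def renewals_per_year(cert_count: int, lifetime_days: int) -> float:
    """Approximate renewals per year, assuming each certificate
    is renewed exactly once per lifetime."""
    if lifetime_days <= 0:
        raise ValueError("lifetime_days must be positive")
    return cert_count * 365 / lifetime_days


# Hypothetical inventory size, for illustration only.
CERT_COUNT = 10_000

# Hypothetical certificate lifetimes in days, from long-lived to short-lived.
for lifetime in (730, 365, 90):
    per_year = renewals_per_year(CERT_COUNT, lifetime)
    print(
        f"{lifetime:>3}-day certificates: "
        f"~{per_year:,.0f} renewals/year, ~{per_year / 52:,.0f} per week"
    )
```

The point of the sketch is only the trend: halving the lifetime doubles the number of renewals someone (or something) has to carry out.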
Another big reason why certificate-based outages are so common has to do with the people mentioned earlier who experience the pain of outages. I’m talking about those who are responsible for PKI and those who are responsible for the servers, applications, devices and other machines where certificates run. In most organizations, the team responsible for PKI is very small. Even in many of the largest organizations that we work with here at Venafi, those teams are often just three to five people. On the other hand, the teams and individuals responsible for applications and systems number in the hundreds or even thousands. If the small PKI team is trying to manually keep up with hundreds or thousands of application and system owners, it’s easy for people on either side to miss a renewal, let the certificate expire, and cause an outage.
Can certificate outage prevention be painless?
At Venafi, we have helped hundreds of organizations eliminate certificate-based outages and the pain they cause. We’ve learned that if security teams and certificate owners have the right tools and processes in place and know what they are responsible for, they can effectively manage certificates throughout their lifecycles.
At a high level, here’s what these teams need:
- Security teams need visibility into all certificates in use throughout their organization, intelligence into those certificates (e.g., who’s the certificate owner, when does the certificate expire, does the certificate meet corporate policy), and the ability to automate certificate lifecycle management tasks. Automation is critical. With so few PKI experts on security teams, and so many certificate owners and certificates in use, manual processes simply can’t keep up.
- Owners of web servers, applications, load balancers and other systems need to be able to request and work with certificates as part of their normal workflow. They also need to have access to certificates when they need them, not a few days or a week after making a request. If requesting certificates takes too long or if adhering to security policy is a burden, people will find ways to circumvent policy.
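As a minimal sketch of the kind of intelligence the first point describes (who owns a certificate, when it expires, whether it meets policy), the snippet below reviews a small in-memory inventory. Everything in it is an assumption for illustration: the `CertRecord` fields, the 30-day renewal window, and the 2048-bit key-size rule are hypothetical, and it does not represent how any Venafi product works.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class CertRecord:
    """One inventory entry (hypothetical fields, for illustration)."""
    common_name: str
    not_after: datetime      # expiry time, timezone-aware
    owner: Optional[str]     # None if nobody is on record
    key_bits: int


# Hypothetical policy values.
RENEWAL_WINDOW = timedelta(days=30)
MIN_KEY_BITS = 2048


def review(cert: CertRecord, now: datetime) -> List[str]:
    """Return a list of findings for one certificate (empty if all is well)."""
    findings = []
    if cert.not_after <= now:
        findings.append("expired")
    elif cert.not_after - now <= RENEWAL_WINDOW:
        findings.append("expiring soon")
    if not cert.owner:
        findings.append("no owner on record")
    if cert.key_bits < MIN_KEY_BITS:
        findings.append("violates key-size policy")
    return findings


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Made-up inventory entries.
    inventory = [
        CertRecord("shop.example.com", now + timedelta(days=12), "web-team", 2048),
        CertRecord("lb01.example.com", now + timedelta(days=200), None, 2048),
        CertRecord("legacy.example.com", now - timedelta(days=3), "app-team", 1024),
    ]
    for cert in inventory:
        for finding in review(cert, now):
            print(f"{cert.common_name}: {finding}")
```

A script like this only reports problems. The harder part, as the list above notes, is discovering every certificate in the first place and automating the renewal itself.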
Venafi has a proven methodology that organizations can follow to ensure these security teams and certificate owners get what they need to eliminate the pain of certificate-based outages. The eight steps of this specific and comprehensive path, which encompasses people, processes and technology, are known as VIA Venafi, the Venafi Way. In fact, we’re so certain that a Venafi customer who follows the Venafi Way will experience no certificate-related outages that we guarantee it.
A case study in eliminating certificate-based outages
A great example of what painless certificate-based outage elimination looks like is this case study from a major retailer in the USA. The retailer had been enduring persistent certificate-related outages for several years. On average, the company suffered at least one certificate-related outage every two weeks, which impacted customers and multiple stores. The company’s PKI team lacked the capability to effectively manage the hundreds of thousands of certificates in its IT ecosystem. They could track only a few thousand of those certificates, most of which resided on load balancers and were managed manually using spreadsheets. Within just two months of installing Venafi, the retailer’s outages in business units using certificate automation were eliminated and the retailer saw significant improvements in operational workflows, such as being able to automate certificate requests and shorten provisioning time from several days to less than one minute, on average.
If you’re interested in eliminating the pain of certificate-based outages in your organization, talk to us. Venafi experts can help you understand where your organization stands on the VIA Venafi path and what steps you need to take to complete the journey.