I have a family member who works at a well-known Fortune 500 company. When this person needs to consume an IT service, they have a standard process to follow. For most services the process is pretty similar. There’s usually a portal that includes some education on how to consume the service for those who need it, and the process itself is mostly self-service, with the exception of any required approvals. Once the service request is completed, the service is delivered. It’s simple and straightforward.
Why is TLS certificate management different from all those other IT services?
Most organizations organize TLS services along these lines: the network services team sends an email to the PKI guy when they need a public certificate, but they do their own thing using self-signed certs for internal needs. For example, the web services team generates their own keys and CSRs, then submits them to the PKI team for public certificates. For internal certs they go directly to their Microsoft CA. The DevOps team is also doing their own thing. They use Vault or Let’s Encrypt or who knows what else because they need to go much faster than other teams, so there’s no way for corporate certificate services to keep up with them. If you multiply this across all the different teams in the organization that use TLS certificates, you can see how it can get crazy with each team doing their own thing. Who’s responsible for tracking all these certificates? Who’s responsible for securing them? Who’s ultimately responsible for the overall security risk posture connected to TLS certificates? You can’t blame each team for doing their own thing because there’s no clear process that can serve everyone’s needs. TLS certificate management is like a game of hot potato: everybody has to hold it at some point, but nobody holds onto it for long.
If you have worked with me in the past, you’ve probably heard me say that TLS is like electricity or water. We all use it every single day, but most of us don’t really know or care about the details of how it gets to our house. However, when it stops working, it is a major crisis. The reality is that many organizations, if not most, pretty regularly experience certificate-related outages when TLS certificates stop working.
So, what should organizations do about this problem? What if we decided to treat TLS certificates like a standard IT service? The most common goal our clients have when they first start working with Venafi is to stop certificate outages. At this point it’s VERY important to note that outages are a symptom of the problem they need to solve, not its root cause. The real problem is that TLS certificates are not being secured, and securing them is impossible without a standard IT service and process. Once you secure those bad boys, the outages go away. Here at Venafi we’ve worked with many organizations to deploy ‘certificates-as-a-service’ with the goal of preventing outages. The outages stop once the service is deployed, but more importantly, we’ve installed a new IT service that secures and protects TLS machine identities (certificates).
I’ll bet you’re wondering exactly how to set up certificates-as-a-service. At Venafi, we have this down to a science. We know you need to address people, process and technology in order to deliver an effective service that can prevent outages and reduce security risks.
8 steps necessary to deliver certificates-as-a-service:
- Establish an Outage Safety Net
- Establish a Solid Foundation
- Align the Organization
- Define and Design the Service(s)
- Train the Service Support Team
- Train the Service Owners – Early Adopters
- Expand Adoption
- Assess Service Effectiveness, Adoption Process and Tune
I won’t go into great detail for each of these steps, but you can visit Venafi’s website for a prescriptive guide. The purpose of this article is to provide you a high-level description of each step and an overview of the process.
1. Establish an Outage Safety Net
When an organization is experiencing regular outages, the first step must be to stop the bleeding with an outage safety net. You will define a process of finding any certificates that have fallen through the cracks and are likely to create an outage. The outage safety net is different from organization to organization. For some organizations this is escalation of certificate expiration alerts. For others it is visibility of all certificates in use, or it could be Sev 1 ticket creation. The key point here is that the safety net is not the standard notifications or reports used for regular certificate renewal. The safety net is a process specifically to catch certs that are not addressed in the normal process, and replace them before they expire.
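The safety net described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Venafi's implementation: the `CertRecord` fields, the `renewal_in_progress` flag, and the 14-day window are all hypothetical, and a real inventory would come from whatever discovery or certificate management tooling you already have.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class CertRecord:
    """Hypothetical inventory entry; the field names are illustrative only."""
    common_name: str
    owner: str
    not_after: datetime         # expiration time (UTC)
    renewal_in_progress: bool   # True if the normal renewal process has picked it up


def safety_net(inventory, now, window_days=14):
    """Return certs the normal process has missed that expire within the
    window (or have already expired), soonest expiry first."""
    cutoff = now + timedelta(days=window_days)
    at_risk = [
        cert for cert in inventory
        if not cert.renewal_in_progress and cert.not_after <= cutoff
    ]
    return sorted(at_risk, key=lambda cert: cert.not_after)


if __name__ == "__main__":
    # Made-up sample data for demonstration.
    now = datetime(2024, 1, 1, tzinfo=timezone.utc)
    inventory = [
        CertRecord("shop.example.com", "web", now + timedelta(days=5), False),
        CertRecord("api.example.com", "devops", now + timedelta(days=3), True),
        CertRecord("intranet.example.com", "network", now + timedelta(days=90), False),
    ]
    for cert in safety_net(inventory, now):
        days_left = (cert.not_after - now).days
        print(f"ESCALATE {cert.common_name} (owner: {cert.owner}), {days_left} days left")
```

With the sample data, only `shop.example.com` is escalated: `api.example.com` expires sooner but is already being handled by the normal renewal process, and `intranet.example.com` is outside the window. What "escalate" means (an alert, a report, a Sev 1 ticket) is the part that differs from organization to organization.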
2. Establish a Solid Foundation
Before we deploy certificates-as-a-service to all users, we need the infrastructure on which it will be built to be rock solid, and all of the underlying processes to be reliable, scalable and redundant. If we ask our users to consume a service, and the infrastructure that operates the service isn’t reliable or consistent, users can’t and won’t use the service.
3. Align the Organization
You are going to need support getting your organization to adopt certificates-as-a-service. Some users are going to love the idea, but some users don’t like change, even if the change is going to make their lives easier. Organizational change is never easy. Since most teams don’t think about TLS certificates very often, you’re going to need executive sponsorship to help drive and influence the initiative. It’s very tempting to skip this step, but don’t do it. Getting everyone on board is the secret to success.
4. Define and Design the Service(s)
What will the service look like? How will users consume it? Which team will support it, and how will that support be provided? These are questions you have more than likely already answered for other services, so don’t reinvent the wheel. Use your service desk or ITSM platform and integrate your certificate management solution into existing systems that support IT services.
5. Train the Service Support Team
Train the team that is going to support the service. Make sure they know the processes and procedures. Doubt leads to confusion, which leads to unhappy users who quickly revert to “the way we used to do it.” In the previous step we defined the service. In this step we want to ensure that the service and our customers are properly supported.
6. Train the Service Owners – Early Adopters
Identify a team that is interested and motivated to use the service. Perhaps there’s a team that has recently experienced an outage, or maybe they are looking for more automation because of resource constraints. Recruit this team to be your early adopters and train them on how to use the service. Set up your support team to ensure they’re successful and learn what you can from the process of onboarding these users. What worked? What didn’t? Modify the service as needed to make sure it will scale in your organization. You might want to repeat this step two or three times, depending on how each onboarding process goes.
7. Expand Adoption
Once you have a couple of successful teams onboard, you’re ready to roll out the service more broadly and hang out your “Open for Business” sign. However, you’re going to need to let all of your teams know about your new service. It makes sense to have a communications plan for this. If you have a very large organization, you’ll want to repeat the communications more than once.
8. Assess Service Effectiveness, Adoption Process and Tune
Once the service is up and running and you have multiple teams consuming it, you might be tempted to walk away. This is not a great idea, because the way your organization uses TLS machine identities will continually change as teams adopt new tools, change processes and increase automation. You should set up regular checkpoints to review service requests and the most frequently asked questions you receive. Is there confusion about how to consume the service? Is there a common request to modify an existing service or to create a new service? Is more education or more training needed? Is there sufficient self-help information for users who are not TLS experts so they can successfully consume the service? By setting up regular checkpoints, you’ll be able to make changes to accommodate the way your organization is changing and make your customers successful.
What Comes Next
Once the process is complete your new certificate service should look very similar to all your other IT services. The key to success is to not reinvent the wheel; you want the service you design and deliver to be easy to consume. Ultimately, this service will be secure and promote best security practices, reducing TLS security risks and improving reliability and availability of your IT infrastructure. Better yet, it will make day-to-day life much easier for you and your customers.