We've encountered a lot of problems of our own making in the TLS/PKI ecosystem in recent years, and whilst we've got better at dealing with them and even avoiding them, there's still a way to go.
Certificate Lifetime
The focus of these blog posts will be on the maximum allowed validity period of certificates, but not just the certificates used by websites, we'll be taking a look at CA certificates too. To get started, I'll be looking at the certificates that almost all of us use, and that's the certificates we all get so we can have HTTPS on our websites! I'll refer to them throughout as server certificates, but you may have come across them being called HTTPS certificates, SSL certificates, end-entity certificates, or one of a few other terms too!
Let's have a brief look at the history of certificates and if we roll back the clock far enough, there was a time when there wasn't a defined cap on the maximum validity period of a certificate. Before July 2012 you could go wild and it's easy to come across certificates that were issued with a validity period of 9 or even 10 years! [1][2]
Thinking about that now is absolutely mind blowing and we've chopped that down to just a fraction of what it used to be, and here's how we got here.
60 Months
The first time a limit was set on the validity period of a publicly-trusted certificate was in v1.0 of the Baseline Requirements. In §9.4 of that very first document, you can find the following:
Certificates issued after the Effective Date MUST have a Validity Period no greater than 60 months.
With an effective date of 1st July 2012, we saw the introduction of a limit that reduced certificates to being valid for "only" 5 years... That's pretty wild, and still completely inappropriate by today's standards, but that was a reduction from the 9 or 10 years you could see back then.
39 Months
The next reduction in certificate lifetime came on 1st April 2015 in v1.3.0 of the Baseline Requirements, and that saw the following stipulation in §6.3.2:
Certificates issued after 1 April 2015 MUST have a Validity Period no greater than 39 months
A step in the right direction for sure, but this was still far too long, and there are many good reasons why, a few of which I detail here.
825 Days
As time progressed, we saw the next reduction in certificate lifetime come on 1st March 2018 in v1.4.4 of the Baseline Requirements, which set out in §6.3.2:
Certificates issued after 1 March 2018 MUST have a Validity Period no greater than 825 days
At this stage, certificates are getting close to what has always been my recommendation of 12 months being the absolute limit, but we started to encounter some real friction from here.
398 Days
The industry tried to push through the idea of 398 day certificates before we had 825 day certificates, but the vote failed. Ballot 185 called for 398 day certificates from 24th August 2017 and was met with widespread resistance from major CA players in the industry. The 825 day certificates vote was seen as the compromise and passed, but it was inevitable that 398 day certificates would be proposed again.
I gave widespread coverage to the next ballot, SC22, which again proposed that certificates be reduced to 398 days in validity and, again, failed. The reasons presented by those in the industry who wanted this change were all valid and they sought to resolve genuine concerns. I even covered many of them myself in 'Why we need to do more to reduce certificate lifetimes'. With the industry having failed twice to shorten certificates, a key player in the ecosystem was to step up and, in the interests of improving security for their users, single-handedly push through this change.
It was a surprise to see Apple announce this, but their 'About upcoming limits on trusted certificates' is short and sweet.
TLS server certificates issued on or after 1 September 2020 00:00 GMT/UTC must not have a validity period greater than 398 days.
Published on 11th March 2020, the announcement gave the industry 6 months of advance warning and showed that Apple was serious about improving the health of the ecosystem. Shorter certificates were coming, and while Apple was making it happen, Google and Mozilla stood with them. With this change now being effectively a mandatory change, ballot SC31 'Browser Alignment' solidified the 398 day validity period in the Baseline Requirements V1.7.7 §6.3.2:
Subscriber Certificates issued on or after 1 September 2020 SHOULD NOT have a Validity Period greater than 397 days and MUST NOT have a Validity Period greater than 398 days.
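To make the day-based limits a little more concrete, here's a minimal sketch of how you might check a certificate against them in Python. The function names are hypothetical, I'm using the notBefore date as a stand-in for the issuance date, and I'm measuring the validity period as a simple notAfter minus notBefore, so check the Baseline Requirements text for the exact boundary treatment before relying on it. The older month-based limits (60 and 39 months) are deliberately left out.

```python
from datetime import datetime, timedelta, timezone

# Day-based limits quoted above, newest first: (effective date, max days).
# Assumption: notBefore is used as a proxy for the issuance date.
DAY_LIMITS = [
    (datetime(2020, 9, 1, tzinfo=timezone.utc), 398),
    (datetime(2018, 3, 1, tzinfo=timezone.utc), 825),
]


def max_validity_days(not_before: datetime):
    """Return the day-based limit in force at not_before, or None if earlier."""
    for effective, days in DAY_LIMITS:
        if not_before >= effective:
            return days
    return None  # month-based limits (39/60 months) are not modelled here


def within_limit(not_before: datetime, not_after: datetime) -> bool:
    """Check notAfter - notBefore against the limit for that issuance date."""
    limit = max_validity_days(not_before)
    if limit is None:
        raise ValueError("certificate predates the day-based limits")
    return (not_after - not_before) <= timedelta(days=limit)


issued = datetime(2021, 1, 1, tzinfo=timezone.utc)
print(within_limit(issued, issued + timedelta(days=398)))
print(within_limit(issued, issued + timedelta(days=399)))
```

In practice you'd pull notBefore and notAfter from the certificate itself rather than hard-coding them.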
Where to next?
For a long while now, I've been eagerly awaiting the announcement of the next reduction in certificate lifetime and it hasn't yet arrived. I think the pandemic certainly didn't help, but still, I feel we're now overdue for the next step.
Keeping a close eye on things happening in the industry, I have seen a few changes that could be interpreted as an indication that another announcement is definitely coming, and they also hint at what that next step might be.
Certificate Transparency Policy
If you aren't familiar with Certificate Transparency, I have an introductory blog post that should help you get started. To give a TL;DR here, CT logs are public logs that contain all certificates so that the existence of any certificate can't be hidden and certificates can't be issued in secret. The number of logs a certificate must be written to depends on how long the certificate is valid for, with longer certificates needing to be written to more logs.
Both Apple and Google have had their own CT policy requirements for some time now, and they looked like this.
Certificate Lifetime | Number of SCTs from distinct CT Logs |
---|---|
< 15 months | 2 |
>= 15 and <= 27 months | 3 |
> 27 and <= 39 months | 4 |
> 39 months | 5 |
As you can see, the longer the certificate is valid for, the more logs it has to be written to. Of course, as time went by, the longer certificates simply didn't exist and the policy became mostly redundant, to be replaced with this.
Certificate Lifetime | Number of SCTs from distinct CT Logs |
---|---|
<= 180 days | 2 |
> 180 days | 3 |
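The newer table is simple enough to express in a couple of lines. This is just a sketch of the table above with a hypothetical function name, not code from either Apple's or Google's policy.

```python
def required_scts(lifetime_days: int) -> int:
    """SCTs from distinct CT logs needed, per the newer policy table above."""
    if lifetime_days <= 0:
        raise ValueError("lifetime must be a positive number of days")
    # <= 180 days needs 2 SCTs, anything longer needs 3.
    return 2 if lifetime_days <= 180 else 3


for days in (90, 180, 181, 398):
    print(days, required_scts(days))
```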
Logging to CT logs creates a burden for the issuing CA and it's a burden that they're likely to want to minimise. By introducing these new requirements, both Apple and Google are giving CAs a benefit to achieve by issuing shorter certificates and this is one of the things that first got me wondering if this was their way of hinting at what was coming next. Drawing the line at 180 days would certainly seem to fit well with our progress on shortening certificates, and this could be a good way to get CAs and subscribers (the people acquiring certificates) used to a 180-day cadence. Then, Google gave another hint.
Chrome Root Program
Much like Mozilla run their own Root Authority Program for products like Firefox, so too do Google for Chrome, and other major players in the industry like Microsoft and Apple run their own Root Authority Programs for their own clients. The Chrome Root Program, sadly not named the Chrome Root Authority Program (CRAP) as it would otherwise be, published Moving Forward, Together, a post worth reading for many reasons, but I'll focus on just one of them here.
a reduction of TLS server authentication subscriber certificate maximum validity from 398 days to 90 days
90 days?! Before we get too excited, it's worth noting that the section I quoted that text from begins with:
In a future policy update or CA/Browser Forum Ballot Proposal, we intend to introduce
This is Chrome laying the groundwork for the next change in certificate validity periods, but it doesn't exclude another step between where we are now and when 90 days becomes the norm.
What's the next reduction?
Personally, I'm torn between what's the right option for the next change, and that might come as a surprise to some given my views and the blog posts I've published over the years!
On one hand, the answer is both easy and obvious, it should be 90-day certificates! On the other hand, trying to be a little more pragmatic, one has to wonder if the industry is quite ready for 90-day certificates...
My biggest hesitation is the low number of CAs that support ACME, the protocol that allows easy and standardised automatic renewal of certificates. I've detailed the few that do offer free certificates via ACME, but at this point, I don't understand why all CAs don't support ACME, including the commercial ones.
If we're going to 90-day certificates, the process of renewal simply must be automated. There's no way that it would be reasonable for anyone to consider renewing those certificates manually because they should be renewed every 30 days, or maybe 60 days at the absolute most. Once I think that, though, I then wonder how 180 days would make any difference... Would people renew those certificates manually? Is that still a reasonable expectation or should 180-day certificates also be renewed in an automated fashion? If that's the deciding factor then we should just go to 90-day certificates as everyone will need to automate anyway.
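To show what that renewal cadence looks like, here's a rough sketch of the check an automated renewal task might run. The function names and the 30-day default are my own illustration of the schedule described above, not part of ACME or any particular client.

```python
from datetime import datetime, timedelta, timezone


def renewal_due(not_before: datetime, now: datetime,
                renew_after_days: int = 30) -> bool:
    """True once the certificate is at least renew_after_days old."""
    return now - not_before >= timedelta(days=renew_after_days)


def slack_days(validity_days: int, renew_after_days: int) -> int:
    """Days left on the old certificate when renewal is first attempted."""
    return validity_days - renew_after_days


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(renewal_due(issued, issued + timedelta(days=29)))
print(renewal_due(issued, issued + timedelta(days=30)))
# A 90-day certificate renewed at day 30 vs day 60.
print(slack_days(90, 30), slack_days(90, 60))
```

The slack figure is the time you'd have to notice and fix a failed renewal before the old certificate expires, which is why renewing earlier is more forgiving.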
Maybe another way to look at this is to look at the reductions in certificate validity over time, and then plot either 90-day or 180-day certificates as the next change. If we do that, you can see that one of these follows the trend much more nicely and seems like the more logical choice. The following graphs show the certificate validity limit in months on the Y axis and the date the change came into effect on the X axis. The graphs assume the next change will be introduced in Jan 2024.
Here is the graph with 180-day certificates.
Here is the graph with 90-day certificates.
You can see that the trend line is following much better when the next change is to 90 days and the 180 day change is really flattening the bottom of that slope. On top of that, you can see that 90 day or 180 day are both, quite clearly, a reduction in our rate of progress over time and that's assuming the change comes in Jan 2024! If we push this out further than that, which is almost a certainty at this point, things only start to look worse.
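If you want to play with the numbers yourself, here's a minimal sketch that fits a straight line through the four historical limits and compares each candidate against it. This isn't how the graphs above were produced. It assumes an ordinary least-squares line, a conversion of roughly 30.44 days per month, and the same Jan 2024 date for the next change.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # assumption: average month length for conversion

# (effective date, limit in months) from the history covered earlier.
HISTORY = [
    (date(2012, 7, 1), 60.0),
    (date(2015, 4, 1), 39.0),
    (date(2018, 3, 1), 825 / DAYS_PER_MONTH),
    (date(2020, 9, 1), 398 / DAYS_PER_MONTH),
]


def years_since_start(d: date, start: date = date(2012, 7, 1)) -> float:
    return (d - start).days / 365.25


def fit_line(points):
    """Ordinary least-squares fit, returns (slope, intercept)."""
    xs = [years_since_start(d) for d, _ in points]
    ys = [months for _, months in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(HISTORY)
predicted = slope * years_since_start(date(2024, 1, 1)) + intercept

for label, days in (("180-day", 180), ("90-day", 90)):
    months = days / DAYS_PER_MONTH
    gap = abs(months - predicted)
    print(f"{label}: {months:.1f} months, {gap:.1f} months from the fitted line")
```

A straight line is a crude model here, as it will eventually cross zero, so treat the output as a rough comparison between the two options rather than a forecast.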
Other Considerations
The most obvious consideration for this change is the impact on subscribers, those using the certificates for HTTPS. It means more frequent renewals of certificates, more frequent deployments of certificates, and possibly implementing a whole new set of technologies and processes if you're doing manual renewal at present. There are some other considerations too though and I thought I'd list them here briefly.
Can CAs handle the load?
Every certificate that is issued requires a CA to go through the issuance process and that requires the appropriate amount of infrastructure. If we reduce certificate validity periods, then CAs will have to complete that process more frequently, even without increasing their number of customers. For a CA issuing only 10,000,000 certificates per year, they'd be doing ~27,400 issuances per day at present assuming 1-year certificates. If we go to 90-day certificates and those certificates are renewed every 30 days, that same CA would now need to handle ~333,300 issuances per day, quite an increase!
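The arithmetic behind those figures is easy to reproduce. This sketch assumes a steady population of 10,000,000 active certificates, each replaced once per renewal interval, and the function name is just for illustration.

```python
def issuances_per_day(active_certs: int, renewal_interval_days: int) -> float:
    """Average daily issuances if every certificate is replaced once per interval."""
    return active_certs / renewal_interval_days


ACTIVE_CERTS = 10_000_000  # the example CA from the paragraph above

scenarios = [
    ("1-year certificates, renewed yearly", 365),
    ("90-day certificates, renewed at expiry", 90),
    ("90-day certificates, renewed every 60 days", 60),
    ("90-day certificates, renewed every 30 days", 30),
]

baseline = issuances_per_day(ACTIVE_CERTS, 365)
for label, interval in scenarios:
    per_day = issuances_per_day(ACTIVE_CERTS, interval)
    print(f"{label}: ~{per_day:,.0f}/day ({per_day / baseline:.1f}x)")
```

The multiplier depends heavily on how early certificates are renewed, not just on the validity period itself.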
Considerations on this load increase would need to be made for their HSM capabilities to do the signing operation, database activity, storage for logs, bandwidth both internally and externally, along with much more. There aren't many orgs out there that have the ability to do >10x on their production load on short notice! You can read this article from Let's Encrypt on their concerns with having to reissue all of the non-expired certificates that they have, something which is a slightly different concern, but mirrors all of the same performance and infrastructure worries. If you look at the Let's Encrypt stats, however, they're comfortably issuing >3,000,000 certificates per day without any problems so it can be done, the CAs might just need to make some improvements.
Can CT Logs handle the load?
I briefly mentioned Certificate Transparency earlier in this blog post and if you're unfamiliar with how it works, it requires that CAs log all certificates they issue to a minimum of two independently operated CT Logs. A lower lifespan on certificates of course means that more entries will need to be made into the CT Logs because more issuance events are taking place, and thus an increase in the associated costs of operating the log. This will require more bandwidth, more computational power and more storage, at a minimum, from all log operators! We've already dealt with some quite significant increases in the load placed on CT logs and Temporal Sharding, as explained here by Venafi, should be good enough to keep the sheer storage requirements at bay, but it doesn't solve the other concerns like bandwidth and compute power. Here are some details on the Let's Encrypt CT Log which are almost 4 years old and based on 1,000,000 certificates being issued per day, so imagine where we are now as Let's Encrypt are comfortably doing 3,000,000 certificates per day!
We use 2x db.r5.4xlarge instances for RDS for each CT log. Each of these instances contains 8 CPU cores and 128GB of RAM.
We use 4x c5.2xlarge EC2 instances for the worker node pool for each CT log. Each of these instances contains 8 CPU cores and 16GB of RAM.
A back of the napkin storage estimation is 1TB per 100 million entries. We expect to need to store 1 billion certificates and precertificates per annual temporal shard, for which we would need 10TB ... We decided to create a 12TB storage block per log (10TB plus some breathing room)
With a global issuance rate of ~250,000 new certificates per hour, a rate that is only growing, CT Log Operators will certainly have some interesting times ahead!
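The napkin estimate quoted above is easy to turn into a small calculation. The 1TB per 100 million entries figure comes from the Let's Encrypt quote, while the function name and the second scenario are my own assumptions: it pretends every certificate issued globally lands in a single log as a single entry, which ignores precertificates and the fact that certificates are spread across several logs.

```python
TB_PER_100M_ENTRIES = 1.0  # napkin figure from the Let's Encrypt quote above


def shard_storage_tb(entries: int) -> float:
    """Rough storage needed for one annual temporal shard, in TB."""
    return entries / 100_000_000 * TB_PER_100M_ENTRIES


# The quoted scenario: 1 billion certificates and precertificates per shard.
print(shard_storage_tb(1_000_000_000))

# Hypothetical: ~250,000 new certificates per hour, all logged to one shard.
certs_per_year = 250_000 * 24 * 365
print(certs_per_year, shard_storage_tb(certs_per_year))
```

Storage is only part of the picture, of course, as this says nothing about the bandwidth and compute needed to serve the log.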
90-day certificates
I've long pointed out the need for shorter certificates and once the process of issuance and deployment is automated, the validity period of a certificate no longer matters. All of the certificates that I use, both internally and externally, are automatically renewed and deployed, so I could renew them every 7 days if I really wanted to and apart from changing the frequency of the renewal task, I wouldn't have to lift a single finger. The big push here isn't really about the validity period of a certificate, it's a push towards automation, and once we get that widespread, we'll be in a much better place!
If you enjoyed this blog post and would like to dive deeper into the World of TLS and PKI, why not join me on our Practical TLS and PKI workshop!