It's time for another 6 month update on the state of security online that's a little late! This is the second report using the new data source that was announced in the last report so we have some good comparisons to make when we take a look at the data.
As always, the data for this scan is taken from Crawler.Ninja and it's all available, in raw form too, over on that site. This is now the 10th report I've done on the Top 1 Million sites over 5 years!!!
I fund the infrastructure for the crawler and the time to do the analysis and report all out of my own pocket so if you do have a couple of dollars / pounds / rupees / yen / whatever to kick in then that'd be awesome! Please head to the donate section on the site or look in my Support section here for ways to help keep the project going.
This is the second crawler report that I've done using the new Tranco Top 1 Million list that I announced in the previous crawler report in Sep 2019. What that means is that our comparisons with the last report, covering everything that's changed over the last 6 months, should be a lot more reliable.
Looking at the changes since the last report we can see that everything is positive and we continue to make great progress in improving security online. Notably, the use of Content Security Policy (CSP) has seen some good growth with both the CSP and CSPRO header increasing by a good margin. It is odd to see growth in HTTP Public Key Pinning (HPKP) given that it's no longer supported and can be quite dangerous (1, 2, 3) but the change here could be related to sites shifting in and out of the Top 1 Million ranking. Other than that, as I said, everything continues to see growth where needed so let's take a closer look at just what's going on.
We saw a bit of a dip in HTTPS in the last report because of the change in data source but I'm happy to say that now, with the second report using the Tranco list, we have seen growth in the use of HTTPS!
Looking at the absolute numbers we're up to 528,498 sites out of the Top 1 Million using HTTPS! The number is probably slightly higher than that too as there's always a small number of failed scans so it's great to see such awesome numbers. Looking at the percentages, things are still looking really healthy.
You can see that same dip back in Sep 2019 caused by the switch in data source but there is nice growth over the last 6 months. Even though there was a slight dip it's good to see that switching data source didn't have a particularly huge impact, giving me more confidence in the accuracy of these numbers. We're now up to 60.9338% of the Top 1 Million sites actively redirecting users to HTTPS.
HTTP Strict Transport Security
If you're not familiar with HSTS then you should check out my blog post HSTS - The missing link in Transport Layer Security and my HSTS Cheat Sheet. If you really want to up your game then take a look at HSTS Preloading too! HSTS is essential for sites that expect their visitors to use HTTPS all the time so it's good to see continued increase in the use of HSTS.
The use of HSTS has been tracked across the entire history of these reports so I have data going back 5 years and you can really see the increase in adoption over that time. In the first report in Aug 2015 there were only 11,308 sites using HSTS and in this latest report we have 132,466 sites using it! That's phenomenal growth and we've seen a 12.49% increase in the last 6 months alone.
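If you want to check what a given HSTS header actually says, here's a minimal sketch of a parser in Python. The function names are my own for illustration and not part of the crawler, and the one-year `max-age` threshold used for the preload check is my understanding of the preload list's requirements, so do confirm against the current submission rules before relying on it.

```python
from typing import Dict, Optional, Union

PolicyValue = Union[Optional[int], bool]

# Assumption: one year in seconds is the minimum max-age for preloading.
PRELOAD_MIN_MAX_AGE = 31536000


def parse_hsts(header_value: str) -> Dict[str, PolicyValue]:
    """Parse a Strict-Transport-Security header value into its directives."""
    policy: Dict[str, PolicyValue] = {
        "max_age": None,
        "include_subdomains": False,
        "preload": False,
    }
    for part in header_value.split(";"):
        directive = part.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        value = value.strip().strip('"')
        if name == "max-age" and value.isdigit():
            policy["max_age"] = int(value)
        elif name == "includesubdomains":
            policy["include_subdomains"] = True
        elif name == "preload":
            policy["preload"] = True
    return policy


def meets_preload_requirements(policy: Dict[str, PolicyValue]) -> bool:
    """Rough check of the header-level conditions for HSTS preloading."""
    max_age = policy["max_age"]
    return (
        isinstance(max_age, int)
        and max_age >= PRELOAD_MIN_MAX_AGE
        and bool(policy["include_subdomains"])
        and bool(policy["preload"])
    )
```

This only looks at the header value itself; it doesn't check that the site redirects to HTTPS or serves a valid certificate.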
I started tracking more metrics about certificates over the years as they became more important in our increasingly HTTPS world and there are still some interesting trends emerging over time.
I started tracking the presence of Let's Encrypt in the Top 1 Million back in 2016 and they've seen some truly amazing growth in that time. Like other metrics they took a hit when I changed data source but also like other metrics they've seen nice growth in the last 6 months again.
Let's Encrypt are now covering 181,896 sites in the Top 1 Million, a share of 20.97%!
Another continuing trend over the course of these scans is the decline in the presence of EV certificates in the Top 1 Million. Despite there being more sites than I've ever recorded using HTTPS there is also the lowest number of sites using EV certificates that I've ever recorded.
That graph doesn't really do it justice so I've represented the data slightly differently here.
That's a really sharp and noticeable decline in the last 6 months alone and there are currently 15,604 sites using EV certificates, the lowest absolute number I've ever recorded, and that represents 1.80% of sites using EV certificates, the lowest market share I've ever recorded. Given the tremendous growth in the use of certificates over the last few years, it's interesting, but unsurprising, that EV is not only failing to capture any of the new sites using HTTPS but also losing existing ground as sites switch to DV certificates. If you've missed the back story on what's happening with EV then check out my posts Gone forEVer!, Sites that used to have EV and Are EV certificates worth the paper they're written on?. Just to wrap up on the certificate section I also track who is issuing certificates to the Top 1 Million sites so here's the data on that.
Certificate Issuers:
C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3 181,895
C = US, ST = CA, L = San Francisco, O = "CloudFlare, Inc.", CN = CloudFlare Inc ECC CA-2 88,085
C = GB, ST = Greater Manchester, L = Salford, O = Sectigo Limited, CN = Sectigo RSA Domain Validation Secure Server CA 35,568
C = US, ST = Arizona, L = Scottsdale, O = "GoDaddy.com, Inc.", OU = http://certs.godaddy.com/repository/, CN = Go Daddy Secure Certificate Authority - G2 35,500
C = US, O = Amazon, OU = Server CA 1B, CN = Amazon 23,000
C = US, ST = TX, L = Houston, O = "cPanel, Inc.", CN = "cPanel, Inc. Certification Authority" 16,828
C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA 16,191
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO RSA Domain Validation Secure Server CA 12,338
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = RapidSSL RSA CA 2018 11,036
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = GeoTrust RSA CA 2018 9,189
Alongside that meteoric growth of Let's Encrypt there is another interesting thing to note and it's the continuing decline in the presence of traditional CAs and the continuing rise in platforms providing certificates. The traditional CAs like DigiCert, Sectigo/Comodo and GoDaddy have all lost ground while Amazon and CloudFlare have gained ground. It's probably safe to roll at least a little of Let's Encrypt into that same category too as whilst they are a traditional CA in that they hand out certificates to site operators, they are also used at scale for platforms like GitHub Pages and WordPress Blogs. There are certainly interesting times ahead and I'd be curious to see how this trend continues over time. If you do want to look at the live data each day you can find that here.
Certificate Authority Authorisation
Whilst we're still on the topic of certificates it's of course important to talk about Certificate Authority Authorisation (CAA). The ability to control which CAs can issue certificates for your site and when they can issue them is a great feature to leverage and more sites need to use it. On that note, it's great to say that more sites are using it!
You can see that huge focus on using CAA in the most highly ranked sites to the left side of the graph and the 10 highest ranked sites using CAA show us what kind of organisations are using it.
Sites using CAA:
1 google.com
2 facebook.com
3 youtube.com
9 netflix.com
13 wikipedia.org
15 yahoo.com
16 doubleclick.net
20 wikipedia.com
24 googletagmanager.com
25 youtu.be
You can see the daily list of sites using CAA published here but as with most of these security mechanisms, it's the larger sites that are focusing on using them and usage quickly tails off as we move down the ranking. Another thing you can look at that's updated daily is the list of configurations that sites are using right here. Here's a sample of the 10 most common configurations.
Values for CAA:
CAA 0 issue "letsencrypt.org" 1,855
CAA 0 issue "pki.goog" 520
CAA 0 issue "comodoca.com" 452
CAA 0 issue "digicert.com" 395
CAA 128 issue "letsencrypt.org" 338
CAA 0 issue "\;" 178
CAA 0 issue "globalsign.com" 148
CAA 0 issue "godaddy.com" 122
CAA 0 issue "sectigo.com" 113
CAA 0 issuewild "godaddy.com" 108
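To make those values a little easier to read, here's a rough sketch in Python that splits a record in the format shown above into its flags, tag and value. The `CaaRecord` class and `parse_caa` function are hypothetical names of my own, and this only handles the simple quoted form used in the list, not every presentation format a DNS tool might emit. Two details worth knowing: a flags value of 128 sets the "critical" bit, and a value of `;` (shown escaped as `\;`) means no CA is permitted to issue.

```python
import re
from dataclasses import dataclass

# Matches lines such as: CAA 0 issue "letsencrypt.org"
CAA_PATTERN = re.compile(r'^CAA\s+(\d+)\s+([A-Za-z0-9]+)\s+"(.*)"$')


@dataclass(frozen=True)
class CaaRecord:
    flags: int
    tag: str
    value: str

    @property
    def critical(self) -> bool:
        # The most significant bit of the flags byte (128) marks the record critical.
        return bool(self.flags & 0x80)

    @property
    def forbids_issuance(self) -> bool:
        # An issue/issuewild value of ";" authorises no CA at all.
        return self.tag in ("issue", "issuewild") and self.value.strip() == ";"


def parse_caa(line: str) -> CaaRecord:
    """Parse one CAA line in the simple quoted format used in the list above."""
    match = CAA_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised CAA line: {line!r}")
    flags, tag, value = match.groups()
    # Zone-file style output escapes the semicolon as "\;", so undo that here.
    return CaaRecord(int(flags), tag.lower(), value.replace("\\;", ";"))
```

So `CAA 128 issue "letsencrypt.org"` is the same authorisation as the flags-0 version, just with the critical bit set.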
The general stats section is a nice overview of each crawl and it's updated daily so if you want to browse through the latest one then click right here.
Total Rows: 869874

Security Headers Grades:
A 23,597
A+ 3,538
B 21,226
C 31,577
D 118,783
E 12,029
F 659,013
R 111

Sites using strict-transport-security: 133,054
Sites using content-security-policy: 52,174
Sites using content-security-policy-report-only: 2,399
Sites using x-webkit-csp: 632
Sites using x-content-security-policy: 1,898
Sites using public-key-pins: 703
Sites using public-key-pins-report-only: 38
Sites using x-content-type-options: 151,403
Sites using x-frame-options: 158,265
Sites using x-xss-protection: 120,717
Sites using x-download-options: 18,780
Sites using x-permitted-cross-domain-policies: 17,207
Sites using access-control-allow-origin: 37,005
Sites using referrer-policy: 36,325
Sites using feature-policy: 4,416
Sites using report-to: 12,339
Sites using nel: 12,131
Sites using security.txt: 1,766
Sites redirecting to HTTPS: 528,895
Sites using Let's Encrypt certificate: 182,033
Sites using EV Certificates: 15,631

Top 10 Server headers:
Apache 181,642
cloudflare 147,798
nginx 143,441
Microsoft-IIS/7.5 37,619
Microsoft-IIS/8.5 27,026
Microsoft-IIS/10.0 17,240
LiteSpeed 16,115
openresty 11,781
nginx/1.16.1 9,220
Apache/2 8,323

Top 10 TLDs:
.com 490,999
.org 65,000
.net 39,796
.ru 26,661
.cn 16,643
.de 16,347
.uk 14,310
.jp 8,578
.br 8,372
.in 7,147

Top 10 Certificate Issuers:
C = US, O = Let's Encrypt, CN = Let's Encrypt Authority X3 182,032
C = US, ST = CA, L = San Francisco, O = "CloudFlare, Inc.", CN = CloudFlare Inc ECC CA-2 89,596
C = GB, ST = Greater Manchester, L = Salford, O = Sectigo Limited, CN = Sectigo RSA Domain Validation Secure Server CA 35,620
C = US, ST = Arizona, L = Scottsdale, O = "GoDaddy.com, Inc.", OU = http://certs.godaddy.com/repository/, CN = Go Daddy Secure Certificate Authority - G2 35,425
C = US, O = Amazon, OU = Server CA 1B, CN = Amazon 23,113
C = US, ST = TX, L = Houston, O = "cPanel, Inc.", CN = "cPanel, Inc. Certification Authority" 16,901
C = US, O = DigiCert Inc, CN = DigiCert SHA2 Secure Server CA 16,078
C = GB, ST = Greater Manchester, L = Salford, O = COMODO CA Limited, CN = COMODO RSA Domain Validation Secure Server CA 12,115
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = RapidSSL RSA CA 2018 11,035
C = US, O = DigiCert Inc, OU = www.digicert.com, CN = GeoTrust RSA CA 2018 9,302

Top 10 Protocols:
TLSv1.2 327,483
TLSv1 1,929
TLSv1.1 17

Top 10 Cipher Suites:
ECDHE-RSA-AES256-GCM-SHA384 132,896
ECDHE-RSA-AES128-GCM-SHA256 112,794
ECDHE-ECDSA-AES128-GCM-SHA256 57,992
ECDHE-RSA-AES256-SHA384 11,172
DHE-RSA-AES256-GCM-SHA384 2,385
0 2,251
ECDHE-ECDSA-AES256-GCM-SHA384 1,766
ECDHE-RSA-AES256-SHA 1,306
AES256-SHA 1,236
ECDHE-RSA-AES128-SHA256 878

Top 10 PFS Key Exchange Params:
ECDH, P-256, 256 bits 303,032
ECDH, P-384, 384 bits 11,063
ECDH, P-521, 521 bits 5,058
DH, 1024 bits 2,368
DH, 2048 bits 892
DH, 4096 bits 108
ECDH, B-571, 570 bits 36
ECDH, brainpoolP512r1, 512 bits 13
DH, 3072 bits 9
ECDH, secp256k1, 256 bits 3

Top Key Sizes:
2048 bit 245,090
256 bit 59,070
4096 bit 21,317
3072 bit 826
384 bit 695
1024 bit 147
8192 bit 19
4056 bit 3
4048 bit 3
2255 bit 2

Sites using CAA: 17,245
Just looking through the general statistics for each crawl and there are already a few things that jump out to me and are worth talking about.
I recently wrote about Legacy TLS is on the way out: Start deprecating TLSv1.0 and TLSv1.1 now and it's great to see such low numbers of reliance on Legacy TLS in the crawl.
You can see the absolute numbers in the daily crawl data here but I was surprised that there are still a couple of thousand sites that don't support anything higher than TLSv1.0 or TLSv1.1, which are both quite old.
Top 10 Protocols:
TLSv1.2 327,483
TLSv1 1,929
TLSv1.1 17
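If you run your own service and want to make sure it isn't one of those legacy stragglers, most TLS stacks let you set a minimum protocol version. As a small illustration, here's how that might look with Python's standard `ssl` module; the function name is my own and your server software will have its own equivalent setting.

```python
import ssl


def modern_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses TLSv1.0 and TLSv1.1."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Anything below TLSv1.2 is rejected during the handshake.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

A real server would also need to call `load_cert_chain` on the context with its certificate and key before accepting connections.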
Another surprising thing came in the cipher suites: the single most common suite uses AES256 rather than AES128. You can see the live daily data here and here are the top 5 most common cipher suites from today.

Top 10 Cipher Suites:
ECDHE-RSA-AES256-GCM-SHA384 132,896
ECDHE-RSA-AES128-GCM-SHA256 112,794
ECDHE-ECDSA-AES128-GCM-SHA256 57,992
ECDHE-RSA-AES256-SHA384 11,172
DHE-RSA-AES256-GCM-SHA384 2,385

Given that AES128 is sufficient and offers better performance than AES256, I'd have expected an AES128 suite to be the most prevalent, but indeed not!
RSA vs ECDSA
Focusing on the performance side of things again, using an ECDSA key for your certificate is far better for performance and offers slightly better security too. The data for key types might not indicate that though.
Unfortunately the use of ECDSA is still quite low despite the fact that it is better for performance and security, but there's probably a good reason why: Windows XP. Well, it's not just Windows XP, but there are legacy clients that can't support ECDSA and only support RSA, so you have to use RSA if you have legacy client concerns. There's also an element of RSA just being 'the default' so there are probably people who could upgrade to ECDSA and just haven't. I have written about ECDSA certificates on my blog before and if you want to get really fancy it is possible to support both RSA and ECDSA together, but for now, we do need to drive those ECDSA numbers up quite a bit.
Talking about performance again and another surprising thing jumped out at me. Looking at the most common key sizes used for authentication we have the following data, available daily here.
Top Key Sizes:
2048 bit 245,090
256 bit 59,070
4096 bit 21,317
3072 bit 826
384 bit 695
So the top 2 key sizes are where I'd expect. The 2,048bit key is an RSA key and the 256bit is an ECDSA key. As I said in the previous section we can see that RSA is more common than ECDSA but there are different key sizes available for each. The most surprising thing here is the absolutely crazy number of 4,096bit RSA keys! This is insane! The performance hit of such an unnecessarily massive RSA key won't be small and there is a heap of sites using them.
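To put some rough numbers on that, here's a quick sketch in Python that groups the five key sizes listed above by key type. The crawl data only gives sizes, so the rule used here (anything of 521 bits or fewer is treated as ECDSA, anything larger as RSA) is my own assumption for illustration, as are the function names.

```python
from typing import Dict

# The top 5 key sizes and their counts from the list above.
KEY_SIZE_COUNTS: Dict[int, int] = {
    2048: 245090,
    256: 59070,
    4096: 21317,
    3072: 826,
    384: 695,
}


def key_type(bits: int) -> str:
    # Assumption: small keys are elliptic curve, large keys are RSA.
    return "ECDSA" if bits <= 521 else "RSA"


def totals_by_type(counts: Dict[int, int]) -> Dict[str, int]:
    """Sum the site counts for each inferred key type."""
    totals = {"RSA": 0, "ECDSA": 0}
    for bits, count in counts.items():
        totals[key_type(bits)] += count
    return totals


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    totals = totals_by_type(KEY_SIZE_COUNTS)
    everything = sum(totals.values())
    print(f"ECDSA share of listed keys: {share(totals['ECDSA'], everything):.1f}%")
    print(f"4096 bit share of RSA keys: {share(KEY_SIZE_COUNTS[4096], totals['RSA']):.1f}%")
```

On these five sizes that works out to roughly 18% of keys being ECDSA, and roughly 8% of the RSA keys being 4,096bit.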
In this section we'll look at the utilisation rate of different headers and the grade that sites score on my Security Headers analyser service which is free to use so head over there if you've not checked it out before.
Looking at utilisation first, we can see that these headers are more popular amongst the higher ranked sites, which has been a consistent theme throughout the history of these scans.
Running all of these sites against the Security Headers API to fetch their scores yields the same disappointing results that the homepage of Security Headers tells us: lower grades are very common.
The most common grades are F and D as you can see and interestingly the F grades are lowest at the high end of the ranking and the D grades are the highest.
Get the data
If you want to see the data that these scans are based on then there are several things to check out. All of the tables/graphs/data that this report was based on are available on the Google Sheet here. The crawler fleet itself and the daily data are available over on Crawler.Ninja so head over there for those. There's also a full mysqldump of the crawler database with the raw crawl data for every single scan I've ever done, over 2.5TB of data, available on Scans.io which means if you want to do some additional analysis the data is there for you to use. On top of all of that, if you've enjoyed this post, the data and the analysis then please consider supporting the project in some way on the donations section!
Other quick observations
I don't want to make a long post much longer but there were a couple of other quick things that grabbed my attention so I will post them up here for a brief look.
Because I'm analysing certificates for all sites that get scanned I can also see when they expire so I have a list for sites serving expired certificates, and for sites whose certificate expires in one, three and seven days.
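The bucketing behind lists like those is simple date arithmetic. Here's a minimal sketch in Python of how a certificate's expiry date could be sorted into the expired, one, three and seven day lists; the function name and the labels are hypothetical and this isn't the crawler's actual code.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def expiry_bucket(not_after: datetime, now: datetime) -> Optional[str]:
    """Return the tightest expiry list a certificate belongs in, if any."""
    remaining = not_after - now
    if remaining <= timedelta(0):
        return "expired"
    # Check the shortest window first so each certificate gets one label.
    for days in (1, 3, 7):
        if remaining <= timedelta(days=days):
            return f"{days} day" if days == 1 else f"{days} days"
    return None


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(expiry_bucket(now + timedelta(days=2), now))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps it easy to test against a fixed date.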
There are a surprising number of sites still using HPKP as they are issuing PKP/PKPRO headers. The even more surprising thing when looking at the actual header values is that the vast majority of them are completely wrong/invalid! Here's the 2 most common headers:
pin-sha256="X3pGTSOuJeEVw989IJ/cEtXUEmy52zs1TZQrU06KUKg=" max-age=15552000; includeSubDomains x 200
pin-sha256="base64+primary=="; pin-sha256="base64+backup=="; max-age=5184000; includeSubDomains x 55
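To show why both of those are broken, here's a rough sanity check in Python. It's a simplified sketch and not a full HPKP parser: it assumes directives are separated by semicolons, that each pin must be the base64 encoding of a 32-byte SHA-256 digest, that `max-age` is required and that at least two pins (a primary and a backup) are needed. The function names are my own.

```python
import base64
import binascii
import hashlib
from typing import List


def make_pin(spki_der: bytes) -> str:
    """Build a pin value: base64 of the SHA-256 hash of the public key info."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def hpkp_problems(header_value: str) -> List[str]:
    """Return a list of problems found in a Public-Key-Pins header value."""
    problems: List[str] = []
    valid_pins = 0
    has_max_age = False
    for part in header_value.split(";"):
        directive = part.strip()
        if not directive:
            continue
        name, _, value = directive.partition("=")
        name, value = name.strip().lower(), value.strip()
        if name == "pin-sha256":
            if len(value) < 2 or not (value.startswith('"') and value.endswith('"')):
                problems.append(f"malformed pin directive: {directive!r}")
                continue
            try:
                digest = base64.b64decode(value[1:-1], validate=True)
            except (binascii.Error, ValueError):
                digest = b""
            if len(digest) == 32:
                valid_pins += 1
            else:
                problems.append(f"pin is not a base64 SHA-256 digest: {value}")
        elif name == "max-age":
            if value.isdigit():
                has_max_age = True
            else:
                problems.append("max-age is not a whole number of seconds")
    if not has_max_age:
        problems.append("missing max-age directive")
    if valid_pins < 2:
        problems.append("fewer than two valid pins, so there is no backup pin")
    return problems
```

Run against the first header, the missing semicolon after the pin means the pin and `max-age` get read as one malformed directive, and the second header fails because its pins are just placeholder text rather than real hashes.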
Records of Server headers have changed loads over time and there are still some super old values in the full list. Here's the top 5:

Server headers:
Apache 181,642
cloudflare 147,798
nginx 143,441
Microsoft-IIS/7.5 37,619
Microsoft-IIS/8.5 27,026
How about X-ASPNet-Version anyone? Full list.
Values for X-Aspnet-Version:
4.0.30319 31,725
2.0.50727 3,734
(empty value) 411
0 265
1.1.4322 53
The X-Page-Speed header, Full List. It's interesting how many people have set the value of the header to
Values for X-Page-Speed:
126.96.36.199-0 1,135
Powered By ngx_pagespeed 126
on 109
188.8.131.52-0 93
184.108.40.206-0 75
The X-Powered-By header is actually quite grim reading, stuff is so old in there! Here's the Full List and the top 5.
Values for X-Powered-By:
ASP.NET 85,373
PleskLin 19,600
WP Engine 19,459
PHP/5.6.40 18,977
PHP/5.4.45 11,193

Okay, I could keep going over this data for hours! Take a look at all of the daily files and maybe there are other interesting observations to make, and if you want to automate something they're all available as JSON files too. Have fun!