There have been quite a few mentions of FLoC recently and several people have been providing various links, bits of information and questions about the new feature. Whilst it's still quite a new and, as yet, not a widely supported feature, I thought I'd give it a brief introduction!
Federated Learning of Cohorts
It sounds like quite a fancy name for a new feature and you can find a lot of detailed information in the spec, but I want to give an overview here as it's touted as a privacy feature and I think those are quite important.
Many people dislike privacy-invasive tracking online, quite understandably, and adverts are the usual culprit for this. I'm not opposed to the idea of adverts, I've found out about a lot of cool stuff from adverts and they support a lot of things I like, but I'm sure we all hate the heavy, bloated, distracting adverts that, worst of all, are tracking you everywhere you go.
FLoC aims to remove the need for invasive tracking to serve relevant ads to users by categorising users based on their browsing history on the client device itself. The client would then provide a FLoC ID upon request and the site would know what your interests are. I might have a FLoC ID of 721954 generated, which means I'm interested in cars and infosec, along with thousands of other people who may end up with the same FLoC ID because they have the same interests. The site can't tell us apart, or who we are, based on our FLoC ID, but it can now serve us more relevant ads. That's a massively oversimplified example, but it's enough to get across how it works and its stated goal of allowing the removal of 3rd party cookies.
Shocker, right... You can find out more details about FLoC quite easily and there's certainly no shortage of news about the technology already. The topic of advertising online is difficult to discuss, and I'm not here to talk about whether advertising is good or bad, or even whether FLoC is good or bad, but rather to tell you that FLoC is coming, what it is, what it does and what you can do about it if you want to.
I've talked about Permissions Policy before. It's an HTTP Response Header, previously called Feature Policy, that allows you to configure whether you intend to use various technologies on the client. For example, with Permissions Policy you can disable the use of the camera, microphone or geolocation on your site if you don't expect to use those things and want to make sure that anything that finds its way into your pages can't use them either. With an extension to Permissions Policy, you can now also control whether you want your site to be included in the calculation of the FLoC ID for the client currently visiting your site.
This opt-out is detailed in the spec and is simple and easy to deploy, but I think initially there has been some confusion about what this actually does, certainly from the questions and comments I've received anyway. First of all, this opt-out prevents the browser from including your site in the 'cohort calculation' for the current client. Second, it means that nothing can call document.interestCohort() to get the FLoC ID of the current client. Of course, this doesn't work or do anything outside of the context of the current site being visited and doesn't 'disable' FLoC on the client beyond that scope.
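To make the opt-out a bit more concrete, here's a minimal sketch of sending the header, assuming a Python WSGI app. The app function, the constant name and the response body are all hypothetical and only there for illustration; the one part that matters is the header itself, Permissions-Policy: interest-cohort=(), which is the same value most of the sites further down are sending.

```python
# Hypothetical minimal WSGI app; only the header name and value matter here.
FLOC_OPT_OUT = ("Permissions-Policy", "interest-cohort=()")


def app(environ, start_response):
    """Serve a tiny plain-text page with the FLoC opt-out header attached."""
    body = b"Hello, no cohorts here.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        FLOC_OPT_OUT,  # empty allowlist: no origin may use interest-cohort
    ]
    start_response("200 OK", headers)
    return [body]


# To try it locally (port 8000 is an arbitrary choice):
#   from wsgiref.simple_server import make_server
#   make_server("127.0.0.1", 8000, app).serve_forever()
```

In practice you'd more likely add the header in your web server or CDN config rather than in application code, but the value being sent is the same either way.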
Even though FLoC is still quite new and not even widely supported yet, I thought I'd take a look at how many sites were already choosing to opt out!
Updating my crawler
I'm sure that regular readers will know about my crawler over at Crawler.Ninja that scans the top 1 million sites in the world every day and produces raw data for analysis. If you don't know about it, head over and check out the information available.
What's relevant for this blog post is that the crawler produces a daily list of all sites that were issuing a Permissions Policy header right here. You could check a site on that list by visiting it to see the header they're issuing, but the crawler also produces a list of all the distinct Permissions Policies that it saw here. This list is quite interesting as you can see what sites are enabling/disabling, but I filtered it down to just the policies that contain the new interest-cohort directive. Here is each distinct policy followed by the number of sites using it:
interest-cohort=() 71
camera=(), microphone=(), midi=(), geolocation=(), interest-cohort=() 9
microphone=();payment=();interest-cohort=() 1
document-domain=(), interest-cohort=() 1
web-share=(self "https://ej.uz"), interest-cohort=() 1
interest-cohort=(); accelerometer=(); camera=(self 'https://www.masters-of-cloud.de'); geolocation=(); gyroscope=(); magnetometer=(); microphone=(self 'https://www.masters-of-cloud.de'); payment=(); usb=() 1
interest-cohort=(); accelerometer=(); camera=(); geolocation=(); gyroscope=(); magnetometer=(); microphone=(); payment=(); usb=() https://www.jabber-germany.de 1
interest-cohort=(); accelerometer=(); camera=(self 'https://www.morbitzer.de'); geolocation=(); gyroscope=(); magnetometer=(); microphone=(self 'https://www.morbitzer.de'); payment=(); usb=() 1
There's only a total of 86 sites out of the top 1 million issuing a header that contains interest-cohort as of today, but that's still more than I was actually expecting! Another interesting point is that the vast majority of them seem to have added a Permissions Policy header purely for the purpose of opting out of FLoC: 71 of the 86 send nothing but interest-cohort=(). That's a lot of people now using Permissions Policy for one very specific purpose!
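If you want to reproduce those two numbers yourself, here's a rough sketch of the tally in Python. The policy strings and counts are copied from the list above; the function names are my own invention for this example, and the splitting on both commas and semicolons is a deliberately loose text match to cope with what sites actually send, not a spec-compliant header parser.

```python
import re

# Distinct policies and site counts, copied verbatim from the list above.
POLICY_COUNTS = [
    ("interest-cohort=()", 71),
    ("camera=(), microphone=(), midi=(), geolocation=(), interest-cohort=()", 9),
    ("microphone=();payment=();interest-cohort=()", 1),
    ("document-domain=(), interest-cohort=()", 1),
    ('web-share=(self "https://ej.uz"), interest-cohort=()', 1),
    ("interest-cohort=(); accelerometer=(); camera=(self 'https://www.masters-of-cloud.de'); "
     "geolocation=(); gyroscope=(); magnetometer=(); "
     "microphone=(self 'https://www.masters-of-cloud.de'); payment=(); usb=()", 1),
    ("interest-cohort=(); accelerometer=(); camera=(); geolocation=(); gyroscope=(); "
     "magnetometer=(); microphone=(); payment=(); usb=() https://www.jabber-germany.de", 1),
    ("interest-cohort=(); accelerometer=(); camera=(self 'https://www.morbitzer.de'); "
     "geolocation=(); gyroscope=(); magnetometer=(); "
     "microphone=(self 'https://www.morbitzer.de'); payment=(); usb=()", 1),
]


def directives(policy):
    """Loosely split a header value into {feature: allowlist}.

    Accepts ',' or ';' between directives, which is an assumption made
    to match the mixed syntax seen in the data above.
    """
    result = {}
    for part in re.split(r"[,;]", policy):
        name, sep, allowlist = part.strip().partition("=")
        if sep:
            result[name.strip()] = allowlist.strip()
    return result


def opts_out_of_floc(policy):
    """True if the policy gives interest-cohort an empty allowlist."""
    return directives(policy).get("interest-cohort") == "()"


# Sites sending any policy that opts out of FLoC.
total = sum(count for policy, count in POLICY_COUNTS if opts_out_of_floc(policy))

# Sites whose policy contains the FLoC opt-out and nothing else.
floc_only = sum(
    count for policy, count in POLICY_COUNTS
    if list(directives(policy)) == ["interest-cohort"]
)

print(total, floc_only)  # prints: 86 71
```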
If you're interested in who is opting out of FLoC on their website already and don't want to have to download the entire dataset yourself to run the query, I also updated the crawler to produce a new daily file here. This is a list of sites using interest-cohort in their Permissions Policy header and here is the current list.
Sites opting out of FLoC:
78 theguardian.com
316 duckduckgo.com
548 guardian.co.uk
2,869 fraunhofer.de
3,085 brave.com
8,455 metafilter.com
11,957 vivaldi.com
16,889 bravesoftware.com
17,466 privacyinternational.org
32,879 basicattentiontoken.org
41,055 fhg.de
41,839 themarkup.org
48,973 observer.co.uk
50,154 egu.eu
54,280 nic.ad.jp
56,989 vivaldi.net
61,587 gu.com
70,411 fosstodon.org
81,311 httpstatus.io
85,068 ikea-usa.com
91,805 chaos.social
98,716 scalemates.com
141,192 guardiannews.com
157,647 ikea.se
205,908 irccloud.com
207,647 ej.uz
212,853 guardianunlimited.co.uk
243,642 wolf-howl.com
246,380 ikea.nl
266,011 ikea.de
266,119 bsd.network
309,651 irbnet.de
311,137 ikea.ru
323,221 animeuknews.net
348,469 sitespeed.io
348,790 weirder.earth
366,060 locvacances.com
387,292 zgp.org
391,348 ikea.ca
394,383 mp3licensing.com
419,971 kevq.uk
444,142 dontbubble.us
460,533 railfreight.com
469,937 bitblokes.de
475,243 naturfotografen-forum.de
485,247 rijschoolpro.nl
490,702 eve-files.com
491,349 bluf.com
523,435 burnallgifs.org
537,368 cloudtraff.com
558,207 taxipro.nl
581,150 infrasite.nl
592,581 childinthecity.org
598,392 aco.net
598,643 railtech.com
602,336 ikea.jp
637,668 ikea.us
660,205 spurint.org
666,956 ddg.gg
668,267 fraunhofer.org
693,625 theguardian.co.uk
711,587 ikea.co.uk
719,606 spoorpro.nl
720,138 digitalcourage.social
732,348 polipundit.com
748,444 ikea.com.sg
765,487 ikea.it
769,304 ikea.com.my
800,026 ikea.fr
804,530 duck.com
811,856 naver.cm
820,466 masto.pt
836,306 masters-of-cloud.de
846,711 gmgplc.co.uk
847,211 tankpro.nl
848,143 jabber-germany.de
866,246 tommorris.org
868,373 ikea.pl
879,402 guardianweekly.co.uk
885,293 fluxenergie.nl
904,775 ikea.mx
934,543 mastodontech.de
939,246 frag-ikea.de
942,378 morbitzer.de
949,350 keltia.net
990,883 ikea.com.au
There's a very distinct theme of sites in that list, with IKEA and The Guardian taking up quite a few entries across various domains and ccTLDs, along with some privacy-centric browsers and organisations. I've recently opened issues to add the directive to the Permissions Policies on all of my sites, so expect to see them pop up on there soon too, but interestingly, I'm not sure it will actually do anything. The current article on taking part in FLoC says a site will only be included in the FLoC cohort calculation if "they load ads-related resources or if they use document.interestCohort()", which none of my sites do, so as I understand it they wouldn't be included anyway. It's not clear whether this will remain the case going forwards, but I've opted out anyway, and it's something worth keeping an eye on.
Does your browser support FLoC?
The EFF have a site where you can check if your browser is currently part of the FLoC trial and going forwards you will be able to see when it is enabled if it currently isn't.
If you'd like to stop your FLoC ID being provided, DuckDuckGo have a browser extension that will do precisely that for you too.
It feels like it's still a little too early to say how FLoC will progress in terms of being on or off by default, or whether it will require a permissions prompt on the site, but for now, it doesn't require one.
Along with updating my crawler to look for and produce the information about FLoC, I've also updated Security Headers so it is aware of the new Permissions Policy directive and will highlight it accordingly.
You can check my scan results to see an example of a site using Permissions Policy that opts out of FLoC.
This will be really awesome because if you opt out of FLoC on your site using your Permissions Policy header, but something on the page then tries to access the FLoC ID of one of your visitors, not only will that be blocked but you will get to find out exactly what's happening!