Google's upcoming plans to replace third-party cookies with a less invasive ad-targeting mechanism have a number of issues that could defeat its privacy objectives and allow for significant linkability of user behavior, possibly even identifying individual users.
"FLoC is premised on a compelling idea: enable ad targeting without exposing users to risk," said Eric Rescorla, author of the TLS standard and chief technology officer of Mozilla. "But the current design has a number of privacy properties that could create significant risks if it were to be widely deployed in its current form."
Short for Federated Learning of Cohorts, FLoC is part of Google's fledgling Privacy Sandbox initiative, which aims to develop alternative solutions that satisfy cross-site use cases without resorting to third-party cookies or other opaque tracking mechanisms.
Essentially, FLoC allows marketers to guess users' interests without having to uniquely identify them, thereby eliminating the privacy implications associated with tailored advertising, which currently relies on techniques such as tracking cookies and device fingerprinting that expose users' browsing history across sites to advertisers or ad platforms.
FLoC sidesteps the cookie with a new "cohort" identifier wherein users are bucketed into clusters based on similar browsing behaviors. Advertisers can aggregate this information to build a list of websites that all the users in a cohort visit as opposed to using the history of visits made by a specific user, and then target ads based on the cohort interest.
The idea, in a nutshell, is to leverage on-device machine learning and "hide" individuals in the crowd by keeping users' web history private on the Chrome browser.
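To make the bucketing idea concrete, here is a minimal sketch of how a browser could map a browsing history to a cohort number entirely on the device. The article does not describe the clustering algorithm, so the use of SimHash (a locality-sensitive hash), the 8-bit cohort space, and the function name `simhash_cohort` are all assumptions made for illustration, not a description of Chrome's implementation.

```python
import hashlib


def simhash_cohort(domains, bits=8):
    """Map a collection of visited domains to a small cohort ID (toy SimHash).

    Hypothetical helper: the algorithm and the 8-bit ID space are
    illustrative assumptions, not FLoC's actual parameters.
    """
    counts = [0] * bits
    for domain in set(domains):
        # A stable hash of each domain votes +1 or -1 on every output bit.
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        for i in range(bits):
            counts[i] += 1 if (value >> i) & 1 else -1
    # Each bit of the cohort ID is set when the votes for it are positive.
    cohort = 0
    for i, votes in enumerate(counts):
        if votes > 0:
            cohort |= 1 << i
    return cohort


history = ["news.example", "recipes.example", "guitars.example"]
cohort_id = simhash_cohort(history)
```

Only `cohort_id` would leave the device in this sketch; the list of domains never does. Histories that overlap heavily tend to land in the same or nearby buckets, which is the property that lets advertisers target the group rather than the individual.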
"With FLoC, individual profiles are a potential source of additional information about the properties of the FLoC as a whole," Mozilla said. "For instance, information from individual profiles can be generalized to inform decisions about the FLoC cohort as a whole."
Additionally, the cohort ID assigned to users is recalculated weekly on the device, which is meant to reflect their evolving interests over time as well as prevent its use as a persistent identifier to track users. Google is currently running an origin trial for FLoC in its Chrome browser, with plans to roll it out in place of third-party cookies at some point next year.
Despite its promise to offer a greater degree of anonymity, Google's proposals have been met with stiff resistance from regulators, privacy advocates, publishers, and every major browser that uses the open-source Chromium project, including Brave, Vivaldi, Opera, and Microsoft Edge. "The worst aspect of FLoC is that it materially harms user privacy, under the guise of being privacy-friendly," Brave said in April.
The "privacy-safe ad targeting" method has also come under scrutiny from the Electronic Frontier Foundation, which called FLoC a "terrible idea" that can lower the barrier for companies to gather information about individuals based solely on the cohort IDs assigned to them. "If a tracker starts with your FLoC cohort, it only has to distinguish your browser from a few thousand others (rather than a few hundred million)," the EFF said.
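The EFF's point can be expressed as identifying information measured in bits. The short calculation below is a rough sketch: the figures of 300 million browsers and 3,000 browsers per cohort are assumed stand-ins for "a few hundred million" and "a few thousand", not numbers from the EFF or Google.

```python
import math


def bits_to_identify(population):
    """Bits of information needed to single out one member of a population."""
    return math.log2(population)


ALL_BROWSERS = 300_000_000  # assumed stand-in for "a few hundred million"
COHORT_SIZE = 3_000         # assumed stand-in for "a few thousand"

total_bits = bits_to_identify(ALL_BROWSERS)
remaining_bits = bits_to_identify(COHORT_SIZE)
# Information the cohort ID hands to a tracker before any other signal is used.
bits_from_cohort = total_bits - remaining_bits

print(f"without cohort: {total_bits:.1f} bits needed")
print(f"with cohort:    {remaining_bits:.1f} bits needed")
print(f"cohort reveals: {bits_from_cohort:.1f} bits")
```

Under these assumed figures, the cohort ID alone supplies more than half of the bits a tracker needs, leaving the remainder to be filled in by fingerprinting or other signals.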
Indeed, according to a recent report from Digiday, "companies are starting to combine FLoC IDs with existing identifiable profile information, linking unique insights about people's digital travels to what they already know about them, even before third-party cookie tracking could have revealed it," effectively neutralizing the privacy benefits of the system.
Mozilla's analysis of FLoC backs up this argument. Given that only a few thousand users share a specific cohort ID, trackers in possession of additional information can narrow down the set of users very quickly by coupling the identifiers with fingerprinting data. They can even use the periodically recomputed cohort IDs as a leakage point to distinguish individual users from one week to the next.
"Before the pandemic and some time back, I attended a Mew concert, a Ghost concert, Disney on Ice, and a Def Leppard concert. At each of those events I was part of a large crowd. But I bet you I was the only one to attend all four," said John Wilander, WebKit privacy and security engineer, earlier this April, pointing out how cohort IDs can be collected over time to create cross-site tracking IDs.
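Wilander's concert analogy amounts to a set intersection. The sketch below uses entirely made-up data and a hypothetical `narrow_candidates` helper, and it assumes the tracker has some way of knowing which browsers reported which cohort each week (for example, through logins or fingerprints on sites it controls).

```python
def narrow_candidates(observations, cohort_members):
    """Intersect the member sets of every (week, cohort) pair seen for one visitor."""
    candidates = None
    for week, cohort_id in observations:
        members = cohort_members[week][cohort_id]
        candidates = set(members) if candidates is None else candidates & members
    return candidates if candidates is not None else set()


# Hypothetical data: which browsers reported which cohort in each week.
cohort_members = {
    1: {"A": {"u1", "u2", "u3", "u4"}, "B": {"u5", "u6"}},
    2: {"C": {"u1", "u2", "u5"}, "D": {"u3", "u4", "u6"}},
    3: {"E": {"u1", "u3", "u6"}, "F": {"u2", "u4", "u5"}},
}

# The same unknown visitor reported cohort A, then C, then E.
observations = [(1, "A"), (2, "C"), (3, "E")]
print(narrow_candidates(observations, cohort_members))  # {'u1'}
```

Each week's cohort is a large crowd on its own, but only one browser in this toy data set belongs to all three, which is how a sequence of cohort IDs can start to behave like a tracking ID.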
What's more, because FLoC IDs are the same across all websites for all users in a cohort, the identifiers undermine restrictive cookie policies and leak more information than necessary by turning into a shared key to which trackers can map data from other external sources, the researchers detailed.
Google has put in place mechanisms to address these privacy shortcomings, including making FLoC opt-in for websites and suppressing cohorts that it believes are closely correlated with "sensitive" topics. But Mozilla said "these countermeasures rely on the ability of the browser manufacturer to determine which FLoC inputs and outputs are sensitive, which itself depends on their ability to analyze user browsing history as revealed by FLoC," in turn circumventing the privacy protections.
As potential avenues for improvement, the researchers suggest creating FLoC IDs per domain, partitioning the FLoC ID by the first-party site, and falsely suppressing the cohort ID belonging to users without sensitive browsing histories so as to protect users who cannot report a cohort ID. It's worth noting that the FLoC API returns an empty string when a cohort is marked as sensitive.
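The false-suppression idea can be sketched in a few lines. If only sensitive cohorts return an empty string, then an empty string is itself a signal; returning it at random for other users dilutes that signal. The function names, the suppression probability, and the 5% sensitive rate below are hypothetical choices for illustration and do not come from Mozilla's analysis or the FLoC API.

```python
import random


def reported_cohort(cohort_id, is_sensitive, false_suppress_prob, rng=random):
    """Return the cohort ID, or "" if it is sensitive or randomly suppressed."""
    if is_sensitive or rng.random() < false_suppress_prob:
        return ""
    return cohort_id


def posterior_sensitive_given_empty(sensitive_rate, false_suppress_prob):
    """Probability that a user is in a sensitive cohort, given an empty report."""
    empty_rate = sensitive_rate + (1 - sensitive_rate) * false_suppress_prob
    if empty_rate == 0:
        return 0.0  # nobody ever reports an empty string
    return sensitive_rate / empty_rate


# Assumed figure: 5% of users fall into cohorts marked sensitive.
print(posterior_sensitive_given_empty(0.05, 0.0))  # 1.0
print(round(posterior_sensitive_given_empty(0.05, 0.2), 3))  # 0.208
```

With no false suppression, an empty report identifies a sensitive user with certainty. Suppressing one in five of the remaining users, under these assumed rates, drops that to roughly one in five, at the cost of less cohort data for advertisers.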
"When considered as coexisting with existing state-based tracking mechanisms, FLoC has the potential to significantly increase the power of cross-site tracking," the researchers concluded. "In particular, in cases where cross-site tracking is prevented by partitioned storage, the longitudinal pattern of FLoC IDs might allow an observer to re-synchronize visits by the same user across multiple sites, thus partially obviating the value of these defenses."
Ultimately, the greatest threat to FLoC may be Google itself, which is not only the biggest search engine, but also the developer behind the world's most used web browser and the owner of the world's largest advertising platform. That leaves it between a rock and a hard place, where any attempt to rewrite the rules of the web could be perceived as an attempt to bolster its own dominance in the sector.
Given its scope and outsized impact, Privacy Sandbox is attracting plenty of regulatory scrutiny. The U.K.'s Competition and Markets Authority (CMA) earlier today announced that it's taking up a "role in the design and development of Google's Privacy Sandbox proposals to ensure they do not distort competition."