Difference between revisions of "Related Website Sets"

From MgmtWiki

Revision as of 09:57, 9 October 2023

Full Title or Meme

The Trusted First Party is typically the human user of a User Agent such as a Browser. While it has been the practice up to 2021 to allow Cookies stored by one party in a session to be read by a Third Party, that has been exploited by advertising and is scheduled to be shut down.

Context

The First-Party Set (FPS) is a proposal in W3C to retain some of the advantages of this practice while limiting user tracking. That effort was renamed on 2023-08-31 to Related Website Sets (RWS) as the APIs became more complex.[1] At that time the domain limit was increased to 5 domains, which is small compared to the number of sites owned by large corporations, which can be in the hundreds.
RWS is designed to minimize disruptions to specific user-facing features once Chrome starts limiting access to third-party cookies by default. Our goal is to allow users to browse the web with minimal disruption while still upholding the privacy goals of the Privacy Sandbox. To strike this balance, RWS targets specific use cases related to website functionality.
  • Some privacy-focused browsers, such as Safari, Firefox, and Brave, started blocking access by Third Parties in 2021. Google is more dependent on advertising revenue and so continues to look for an alternative that will not result in massive revenue losses like the $10 billion loss experienced by Facebook.
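
To make the size limit concrete, here is a minimal sketch of checking a declared set against a cap of five domains. The field names `primary` and `associatedSites` are assumptions modeled on the JSON submission format as I understand it, and the `validateSet` helper and `MAX_ASSOCIATED` constant are hypothetical. It also assumes the cap applies to the associated sites rather than to the set as a whole.

```typescript
// Hypothetical shape of a set declaration. The field names are assumptions,
// not a normative schema.
interface RelatedWebsiteSet {
  primary: string;
  associatedSites: string[];
}

// Assumed cap, taken from the five-domain limit mentioned in the text.
const MAX_ASSOCIATED = 5;

// Returns a list of problems found; an empty list means the set passed
// these (deliberately simple) checks.
function validateSet(set: RelatedWebsiteSet): string[] {
  const errors: string[] = [];
  if (set.associatedSites.length > MAX_ASSOCIATED) {
    errors.push(
      `too many associated sites: ${set.associatedSites.length} > ${MAX_ASSOCIATED}`
    );
  }
  const seen: string[] = [set.primary];
  for (const site of set.associatedSites) {
    if (site.indexOf("https://") !== 0) {
      errors.push(`not an https origin: ${site}`);
    }
    if (seen.indexOf(site) >= 0) {
      errors.push(`duplicate site: ${site}`);
    }
    seen.push(site);
  }
  return errors;
}
```

A corporation with hundreds of domains would fail the first check immediately, which is the tension the paragraph above describes.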

Current OIDC Front Channel Flow

Action               | From    | To       | Cookie
user navigates to RP | Browser | GET RP   | RP creates session cookie
user selects IdP     | RP      | Browser  | RP sends redirect to user browser
Browser Redirect     | Browser | POST IdP | The IdP may or may not have a cookie
IdP logon screen     | IdP     | Browser  | Only happens when the user is not already logged on to the IdP
User Logon           | Browser | IdP      | Only happens when the user is not already logged on to the IdP
User Consent screen  | IdP     | Browser  |
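
The redirect step in the flow above can be sketched as the RP building an authorization request for the IdP. The parameter names (`response_type`, `response_mode`, `scope`, `client_id`, `redirect_uri`, `state`, `nonce`) come from OpenID Connect and the form post response mode. The `AuthRequest` type, the `buildAuthorizationUrl` helper, and the example endpoint are hypothetical, and the sketch assumes the parameters travel in the query string of the redirect target.

```typescript
// Hypothetical container for the values an RP needs for one login attempt.
interface AuthRequest {
  authorizationEndpoint: string; // IdP endpoint (assumed to have no query string)
  clientId: string;
  redirectUri: string;
  state: string; // opaque value the RP uses to correlate the response
  nonce: string; // value the RP also stores in its own cookie
}

// Builds the URL the RP redirects the browser to.
function buildAuthorizationUrl(req: AuthRequest): string {
  const params: Array<[string, string]> = [
    ["response_type", "id_token"],
    ["response_mode", "form_post"],
    ["scope", "openid"],
    ["client_id", req.clientId],
    ["redirect_uri", req.redirectUri],
    ["state", req.state],
    ["nonce", req.nonce],
  ];
  const query = params
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `${req.authorizationEndpoint}?${query}`;
}
```

The RP would set its session and nonce cookies in the same response that carries this redirect, which is why the later question of whether those cookies come back on the return trip matters.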

Current Status

In identity protocols, a cross-site navigation resulting in a POST request typically happens by the first site returning an HTML page containing a form that is auto-submitted via JavaScript to the second site. That's how the SAML POST binding works, and so does the OIDC/OAuth form post response mode.
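
As an illustration of that pattern, the following sketch generates the kind of page the first site might return. The helper names `escapeHtml` and `autoSubmitForm` are hypothetical, and real deployments differ in details such as fallback buttons for browsers with scripting disabled.

```typescript
// Escapes the characters that matter inside HTML attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Returns an HTML page whose only job is to POST the given fields to
// `action` as a top-level navigation as soon as the page loads.
function autoSubmitForm(action: string, fields: Array<[string, string]>): string {
  const inputs = fields
    .map(
      ([name, value]) =>
        `<input type="hidden" name="${escapeHtml(name)}" value="${escapeHtml(value)}">`
    )
    .join("\n");
  return [
    "<!DOCTYPE html>",
    '<html><body onload="document.forms[0].submit()">',
    `<form method="post" action="${escapeHtml(action)}">`,
    inputs,
    "</form>",
    "</body></html>",
  ].join("\n");
}
```

For the form post response mode, `action` would be the RP's redirect URI and the fields would include values such as `id_token` and `state`.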

(As best I understand it anyway) a previously set cookie with SameSite=None will be sent by the browser on such a top-level cross-site POST request. Some folks have suggested that that will change with 3rd party cookies going away and that even a SameSite=None cookie will no longer be sent in that situation. But in my mental model of this stuff, the situation will be unchanged by 3rd party cookies going away - it's a cross-site request but because it is a top-level navigation the cookies are 1st party. SameSite enforcement is in place so SameSite=None cookies will be sent. But it's not 3rd party so is not impacted by disappearance or partitioning of 3rd party cookies.
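
The mental model in the paragraph above can be written down as two independent checks. This is a sketch of that model only, not a statement of what any browser does. The types and function names are hypothetical, and the SameSite rules are simplified to GET and POST.

```typescript
type SameSite = "Strict" | "Lax" | "None";

interface RequestContext {
  crossSite: boolean;          // initiating site differs from the target site
  topLevelNavigation: boolean; // the request changes the address bar
  method: "GET" | "POST";
}

// Check 1: does SameSite enforcement allow the cookie on this request?
function allowedBySameSite(sameSite: SameSite, ctx: RequestContext): boolean {
  if (!ctx.crossSite) return true;
  if (sameSite === "None") return true;
  if (sameSite === "Lax") return ctx.topLevelNavigation && ctx.method === "GET";
  return false; // Strict
}

// Check 2: is the cookie in a third-party context at all? Under the model
// described above, a top-level navigation is first-party even when cross-site.
function isThirdPartyContext(ctx: RequestContext): boolean {
  return ctx.crossSite && !ctx.topLevelNavigation;
}

// The cookie is sent if SameSite allows it and, when third-party cookies
// are blocked, the request is not in a third-party context.
function cookieSent(
  sameSite: SameSite,
  ctx: RequestContext,
  thirdPartyBlocked: boolean
): boolean {
  if (!allowedBySameSite(sameSite, ctx)) return false;
  return !(thirdPartyBlocked && isThirdPartyContext(ctx));
}
```

Under this model, a `SameSite=None` cookie on a top-level cross-site POST is sent whether or not third-party cookies are blocked, while the same cookie in an embedded cross-site iframe is not.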

Anyway, that's what I'm hoping Sam can provide clarification on. Mostly for the benefit of my own understanding but also for the benefit of the group here as recent discussions have suggested that folks have divergent understanding and expectations of things.

That behaviour changing would be problematic, for example and as others have pointed out, because OIDC RPs receiving an ID token via the form post response mode need the 'nonce cookie' value (which ties the ID token to the browser the SSO flow was initiated on) at that point in validating the token. Maybe further confusing things is that at least in Chrome there was a temporary(?) exception made for the nonce cookie case with the rollout of the SameSite default change to Lax - the "Lax + POST mitigation" section at https://www.chromium.org/updates/same-site/faq and it looks like there's an attempt to capture that in the coming update to RFC 6265 https://github.com/httpwg/http-extensions/pull/1435/files
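
To show why the missing cookie would break the flow, here is a minimal sketch of the nonce comparison an RP performs when the form post arrives. The cookie name `oidc_nonce` and the helper names are hypothetical, and signature and claim validation of the ID token are deliberately left out.

```typescript
// Only the claim this sketch needs; a real ID token carries many more.
interface IdTokenClaims {
  nonce?: string;
}

// Extracts one cookie value from a raw Cookie request header.
function readCookie(header: string, name: string): string | undefined {
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue;
    if (part.slice(0, eq).trim() === name) {
      return part.slice(eq + 1).trim();
    }
  }
  return undefined;
}

// True only when the browser sent the nonce cookie and it matches the
// nonce claim of the (already signature-checked) ID token.
function nonceMatches(
  claims: IdTokenClaims,
  cookieHeader: string,
  cookieName: string = "oidc_nonce"
): boolean {
  const expected = readCookie(cookieHeader, cookieName);
  return expected !== undefined && claims.nonce !== undefined && claims.nonce === expected;
}
```

If the browser withholds the cookie on the cross-site POST, `readCookie` returns `undefined` and the login fails even though the ID token itself is valid.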

User Experience

Part of this section was taken from Nick Doty <ndoty@cdt.org> on Mon, Feb 28 with Don, Aram, Ralph, Robin, Scott

Still trying to understand the potential benefits to the user from First-Party Sets and potential relaxation of browser privacy protections among members of those sets.

To summarize, some proposed benefits could be:

  1. combining data across origins to provide personalization (preferences for shopping sites, or remembering past purchases)
  2. providing transparency to the user or to researchers/regulators about which domains are operated by the same company
  3. letting a user sign-in on one origin and be logged-in on other origins operated by that login provider
  4. accountability to a privacy policy because of the threat of losing first-party set benefits
    The issue is the user's mental resources -- how much cognitive energy and reputation processing they are being asked to invest in the set, and what they get for it. Making the "user-visible branding" the same across the entire set is how we keep that investment low, but the user still needs to get "paid for their time" somehow. The IEE's review of the FPS policy, and credible assertion that the member domains comply, is the service that is being offered to pay for the user's investment in the FPS.

This last one seems novel for us in the Web standards space, but if I understand Don's proposal correctly, the idea is that there would be a financial benefit to a company from having cookie scope combined within a first-party set, and that there could be a funding mechanism to provide external audits to receive that benefit, and that a user can get stronger commitment to a company's privacy policy if the company has the threat of losing access to the first-party set benefit. I'm not clear that browsers will want to be the enforcers of this kind of trade-off, or that we can easily set up the infrastructure for this auditing, or that users should give up some valuable privacy just in the hope that losing it later will be enough of a threat to get companies to keep their promises.

I have been encouraged by the transparency possibility when it's been raised in the past (this was a feature during both P3P and DNT efforts). But on using data for personalization and single sign-on, I'm not clear on why a set with combined cookies is necessary, compared to explicit user opt-in, consolidating domains or proposals specific to authentication (OAuth redirect flows, federated identity, etc.).

There also doesn't seem to be agreement on whether a first-party set should include only domains operated by the same company. Either because one use case is combining data for external service providers, or because a company will want to split its operations (perhaps to commit tax fraud?) but still combine data.

And there doesn't seem to be agreement on whether all the domains in a first-party set should have a common privacy policy, or if users should volunteer to combine data between domains with different privacy policies.

And there doesn't seem to be agreement on whether the number of domains in a set should be strictly limited, or include hundreds or thousands of domains (with a much wider scope of potential abuse). Or whether the sets should be mutually exclusive.

I'd consider myself one of the skeptics at this point. But if you are interested in working on First-Party Sets, I think clarity on these points would make the discussion more productive:

  • what is the direct user benefit (if any, and compared to alternatives)?
  • what use cases are definitely in and out of scope?
  • can enterprise use cases be satisfied while abuse of this feature is minimized?

I've tried to read the existing explainer and issues closely, and maybe it's just interest in expanding the scope beyond the current proposal that's leading to some of our general confusion. I see that https://github.com/privacycg/first-party-sets/issues/62 is one attempt to try to gain consensus on the purposes/use cases, and https://github.com/privacycg/first-party-sets/issues/53 on the user benefits, although many of the open issues are touching on it.

Thanks all for the conversation and I hope it someday leads to more clarity for us. Cheers, Nick

On Fri, Jan 14, 2022 at 3:40 PM Don Marti <dmarti@cafemedia.com> wrote:
On Fri, Jan 14, 2022 at 11:28 AM Nick Doty <ndoty@cdt.org> wrote:
On Thu, Jan 13, 2022 at 2:31 PM Don Marti <dmarti@cafemedia.com> wrote:

On Thu, Jan 13, 2022 at 9:53 AM Zucker-Scharff, Aram <Aram.Zucker-Scharff@washpost.com> wrote:

But I don’t really see how any of this lands us on FPS anyway. There is no better way to have a clear shared indicator of shared context than operating on the same domain as far as I can see, and I’m not really clear on how FPS would give us the ability to enforce any clearer way than ‘operates on the same domain’ or would otherwise meet the minimum clarity required to make the affiliation visible to all users. Arguably, even that isn’t enough to make clear to users what is going on with their data, as it still leaves them with the mysteries of how these companies operate internally, but it still is significantly clearer than any other options I have heard or could conceive. It at least makes it unmistakable who the operator they have to object to is.

I’m open to hearing some clear articulation of why every business needs to run on multiple TLDs and that therefore requires FPS… but I haven’t even heard that yet.

I appreciate the work that has gone into trust.txt but I’m just not sure why we would want to shave a square peg to fit a round hole when we could have a round peg made for purpose. I know that in theory this means More Standards which can be undesirable, but in this case--especially with the idea that we’re going to have to build some theoretical user-manned regulatory body that will be reviewing FPSs, a presumably extensive and never-ending queue--it seems like a new standard for how to proclaim FPSs that is a best-possible fit is worth the time and effort.

It is possible for FPS to be a net win for users.

I'm interested to understand how this would be a benefit for users, so thanks for giving this example to work through.

For example, let's say that dobbsford.example and dobbstoyota.example are two car dealership sites, and users of both are aware of the common brand identity of the two sites. The Bob Dobbs who tells them "Bob Dobbs won't make you pay a lot for a Ford!" and the Bob Dobbs who tells them "Bob Dobbs won't make you pay a lot for a Toyota!" are the same recognizable advertising personality.

The two sites have the same design elements, shared copy, and privacy policy text. The two identical privacy policies state that the site will not allow your email address to be used for spam email if you provide it.

What was the user benefit here? As the user, did I want both dealerships to know what cars I was looking at on the other site without logging in?

As the user, I'm shopping for a car, and I want to get a notification when a car matching my preferences is available for a test drive. (I already filtered the inventory list on the dobbsford.example site down to the Ford Focus, and want to see what's similar on the Toyota lot without slogging through a bunch of unrelated vehicles.)

Could be any kind of activity that stays within the same service and context ("getting a great deal on a car from Bob Dobbs") but spans multiple domains.

When the sites claim an FPS, the IEE gives them an incentive to adhere to their own published privacy policy. If the IEE makes an account with a spamtrap address on one of the two sites, and then receives spam, the FPS is invalid. The decision to claim an FPS and stick to it is a way for a single service with multiple domains to make a credible commitment to its own privacy policy. FPSs are asking the user for an exception to the normal rule, and offering to pay for the exception with the validation services provided by the IEE.

I'm not clear how in this proposal the FPS is a way for a company to commit to its own privacy policy. I'm not precisely sure what redress I would have if a company promised not to do something in their privacy policy and then did it anyway, but I would expect to reach out to a local consumer protection authority -- maybe this is a deceptive trade practice. That doesn't seem to rely on there being two different domains that claim in a machine-readable way to be owned by the same party. Is the commitment more credible because a browser might restrict the scope of cookies if a violation of the commitment comes to light and that penalty would be more meaningful than what local consumer protection would bring? Or would it be similar to a BBB or other trust seal?

Yes. Realistically, today a company is not taking much risk by violating its own privacy policy. (The budget for the entire California Privacy Protection Agency is $10 million and most US states don't have such an agency.) The risk of losing an FPS for a violation is a more credible deterrent -- especially since a violator would lose their FPS worldwide for a violation in one jurisdiction.

(I don't know if the two sites in this example actually have the same "ownership". The two dealerships are LLCs with overlapping member lists, and have issued convertible debt instruments to different parties. Bob Dobbs is one step ahead of the IRS, and at least one step ahead of any IEE that tried to figure out the same info.)

I believe you that companies may use complicated arrangements to defraud local tax authorities. As a user, I would be very confused if I granted special access to combine my data across domains because I thought it was the same entity and then it turned out that the data was actually being shared by two different companies. That the privacy policy (that I surely didn't read) was identical text for the two companies doesn't necessarily seem like a big advantage to the end user. Which company should I report to the local authority when my email address was shared by one of them for spam?

If there's no FPS in the picture, you have just as much leverage as you have with your existing spam. If you do believe that your address was misused by an FPS member, you forward the spam to the IEE, and they run a test.

Maybe an IEE would not be necessary if all users had access to regulators with the time and resources to investigate all violations.

Users are already dealing with a single service or context that spans multiple "companies" on paper. (Look at web ToS documents that say, if you're in these countries you have a contract with example.com USA, if you're in these other countries you have a contract with example.com Ltd. in Ireland, in these other countries, etc... Real-world company structures are interesting and probably not feasible for an IEE to analyze.)

Apple WebKit's feedback on the First Party Sets proposal

John Wilander <wilander@apple.com> (2022-024)

to public-privacycg

Our meeting on Thursday will cover First Party Sets. We wanted to share our collected feedback on the proposal. Please see below. We shared this with the editors about a month ago.

As far as we know, at least Google explicitly intends to use First Party Sets, or FPS, to allow cross-site cookies and storage within sets. We will include feedback on that even though FPS by itself doesn’t have to mandate what it’s used for.

  1. Feedback on partitioning and cross-site cookies in the context of FPS

We have already implemented partitioning of state and HTML storage. We think that is where the web platform should be. We have already implemented blocking of cross-site cookies and the Storage Access API as a general purpose way for cross-site resources to get cookie access. We think either that or partitioned cookies are where the web platform should be. We are against cross-site cookie access by default in all of its forms, FPS or other (this feedback was provided in for instance issue #53). We think cross-site cookie access was an unfortunate mistake early in the development of the web and we have worked since the very first version of our browser to rein in that mistake. We reached that goal in early 2020 and do not intend to regress that privacy protection. We think major browsers allowing cross-site cookies by default, for instance through FPS, would hold the web back, cause fragmentation, and cement legacy functionality (this feedback was provided in issue #53). Policy based on FPS that would lead to this risk is explicitly mentioned in the proposal: ‘ensures continued operation of existing functionality which would otherwise be broken by blocking cross-domain cookies (“third-party cookies”)‘.
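
For readers unfamiliar with the Storage Access API mentioned above, here is a minimal sketch of how an embedded cross-site frame might use it. `hasStorageAccess()` and `requestStorageAccess()` are the real method names on `document`. The `StorageAccessDocument` interface and the `ensureStorageAccess` helper are hypothetical and exist only so the sketch can run outside a browser.

```typescript
// The two Storage Access API methods this sketch relies on. In a browser
// that supports the API, `document` satisfies this interface.
interface StorageAccessDocument {
  hasStorageAccess(): Promise<boolean>;
  requestStorageAccess(): Promise<void>;
}

// Resolves to true when the frame has (or was just granted) access to its
// unpartitioned cookies, false when the request was rejected.
async function ensureStorageAccess(doc: StorageAccessDocument): Promise<boolean> {
  if (await doc.hasStorageAccess()) {
    return true;
  }
  try {
    await doc.requestStorageAccess();
    return true;
  } catch {
    // The promise rejects when the browser or the user denies access.
    return false;
  }
}
```

In a browser this would typically be called as `ensureStorageAccess(document)` from a click handler inside the embedded frame, since browsers generally require a user gesture before granting the request.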

  2. General feedback on FPS

We don’t think users in general understand or know which companies or consortia own which domains. This means that FPS has the risk of hiding relationships between websites which would otherwise have to be more explicit and potentially understood by users. Setting browser policy based on joint domain ownership will very likely go against the user’s interest in many cases, violating the responsibility of the user agent. Relaxing data partitioning because of joint domain ownership is one such example. We think large domain sets are especially troublesome from a user perspective since there will be no reasonable way to show or tell the user about the (vast) set. We think set sizes over five domains become troublesome in this way. This feedback has been provided multiple times throughout the lifetime of the FPS proposal, for instance in issue #29, but the proposal doesn’t seem to address this.

We are worried that curation of an FPS list will:

  • Create winners and losers where some have the means to get on the FPS list and others don’t.
  • Create barriers to entry when new players have to wait to get onto shipping versions of the FPS list.
  • Cause inconsistencies across browsers because of different versions of the FPS list.
  • Put non-western countries and businesses at a disadvantage since all major web browser implementors are western.
  • Get into serious challenges of what “party,” “business entity,” “domain owner,” etc mean. For example, some brands have different owners in different countries.
  • Create a barrier to entry for new browsers where they must either create a compatible curated list or get permission from an established player to use theirs. Community governance and maintenance of the list and free access to it would be required to not create such a barrier and community governance is not easy. See for instance the Public Suffix List which is referred to directly in the proposal.

The proposal mentions that an “independent entity must verify” and we don’t know what that independent entity will be.

We are worried that per-site declaration of FPS will:

  • Incentivize the creation of domain consortia to get some kind of preferred treatment. This could potentially undo the privacy work we and others have done for a decade or more. See for instance our comment on issue #17.
  • Lead to false claims of domain ownership and a huge burden to try to police it.
  • Lead to either page load performance hits as the user agent needs to check a multitude of domains if they belong to the current set, or lead to page load failures due to stale cached FPS data.

Issue #6 contains our original concerns on “Incentives to Form Sets” and “Personalized Sets.” (Thanks, Kaustubha, for finding this link for us since it was filed against a different repo.)
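
The performance-versus-staleness concern above can be illustrated with a small sketch of a cached membership lookup. Everything here is hypothetical: the `CachedSet` shape, the `lookupMembership` helper, and the idea of a single maximum age are assumptions chosen for illustration, not a description of any proposed implementation.

```typescript
// A set declaration as a user agent might cache it after fetching it.
interface CachedSet {
  members: string[];
  fetchedAtMs: number; // when the declaration was last fetched
}

type Lookup = "member" | "not-member" | "stale";

// Answers from the cache while the entry is fresh. Once it is older than
// `maxAgeMs`, the caller must either re-fetch (slower page load) or act on
// possibly wrong data (broken page), which is the trade-off described above.
function lookupMembership(
  cache: CachedSet,
  site: string,
  nowMs: number,
  maxAgeMs: number
): Lookup {
  if (nowMs - cache.fetchedAtMs > maxAgeMs) {
    return "stale";
  }
  return cache.members.indexOf(site) >= 0 ? "member" : "not-member";
}
```

A short maximum age pushes cost toward the network on every navigation, while a long one increases the window in which a site that has left a set is still treated as a member.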

  3. Feedback on use cases other than relaxed partitioning

We don’t currently believe that a trustworthy and equitable version of FPS can be created. That said, were that to happen, we think such a technology could potentially be useful in the following ways:

  • Allow single sign-on within a set while being stricter on login-like data transfer such as link decoration across domains which are not in the same set. This requires that sets contain metadata instructing the user agent which domain is the single sign-on domain (this feedback was provided in issue #28). Note that we mean single sign-on as authentication between domains with a joint owner. We are not referring to federated logins here.
  • Allow for different wording in cross-site data prompts such as the one for the Storage Access API or for WebAuthn. The different wording would be for domains within a set (this feedback was provided in issue #53).
  • Enhanced reporting to users on which parties have data on them and how that landscape changes over time.

On May 25, 2022, at 5:06 PM, Kaustubha Govind <kaustubhag@google.com> wrote:

We are not proposing our quirks as standards and doing so would be bad. It’s important to distinguish web standards and per-browser compatibility measures. Browsers all face slightly different compatibility challenges.
Since you mentioned that you won't be able to attend tomorrow's meeting, where one of the topics is to discuss the right venue to incubate First-Party Sets; I wanted to focus on this part of your email for now.
It sounds like you're saying that we should be content with every browser shipping their own custom quirks to solve the exact same problem (site compat issues caused by privacy protections). I understand what you're saying about not wanting to keep these measures around forever; but I don't think the point of web standards is to create something permanent. In my opinion, the point of standards is to enable platform predictability and interoperability.

Indeed. We’ve said that we’re not interested in relaxing cross-site cookie blocking based on FPS and I believe Mozilla has said the same thing so I currently don’t see a path to standardization there. For that aspect of FPS, we are instead raising concerns of only Chrome (or possibly Chrome+Edge) allowing cross-site cookie access based on FPS, bifurcating the web.

We did however list other potential use cases for FPS. We’d love to get interoperability there, as long as we can get to a trustworthy and equitable version of FPS.

As for our quirks, it’s a set of per-site alternative behaviors to fix specific issues like:

shouldDisableContentChangeObserverTouchEventAdjustment for youtube.com
shouldDispatchSyntheticMouseEventsWhenModifyingSelection for medium.com and weebly.com
needsDeferKeyDownAndKeyPressTimersUntilNextEditingCommand for docs.google.com
shouldSilenceWindowResizeEvents for nytimes.com and twitter.com
isStorageAccessQuirkDomainAndElement for microsoft.com, outlook.live.com, and playstation.com

These quirks are based on user feedback and bug investigations. The only way I see such a list feed into standards work is to make sure we don’t need those quirks. You may be referring to the latter here. It’s us calling the Storage Access API on behalf of the site.

To expand on this, we have no per-site quirks in Safari or WebKit that would allow cross-site cookies without user consent. The most we will do is call Storage Access API on behalf of the site when the user takes particular actions on that site, so the user is presented with the choice. It’s for a very limited set of sites, and we have been working with each of them to update their login flows to no longer depend on the quirk.

We would welcome other browsers taking this same approach to cross-site cookie compatibility, instead of silently allowing cross-site cookies between select pairs of sites. But we don’t think such a list needs a site-controlled extension mechanism, and we don’t think it belongs in standards, since our goal is to drive the list to empty, as with any other site-specific compatibility quirk.

Reference

  1. Helen Cho, Related Website Sets - the new name for First-Party Sets in Chrome 11