Infrequently Noted

Alex Russell on browsers, standards, and the process of progress.

Misfire

We're in a bad place when even the W3C TAG falls for Apple's privacy schtick.

The W3C Technical Architecture Group[1] is out with a blog post and an updated Finding regarding Google's recent announcement that it will not be imminently removing third-party cookies.

The current TAG members are competent technologists who have a long history of nuanced advice that looks past the shouting to get at the technical bedrock of complex situations. The TAG also plays a uniquely helpful role in boiling down the guidance it issues into actionable principles that developers can easily follow.

All of which makes these pronouncements seem like weak tea. To understand why, we need to walk through the threat model, look at the technology options, and try to understand the limits of technical interventions.

But before that, I should stipulate my personal position on third-party cookies: they aren't great!

They should be removed from browsers when replacements are good and ready, and Google's climbdown isn't helpful. That said, we have seen nothing of the hinted-at alternatives, so the jury's out on what the impact will be in practice.[2]

So why am I disappointed in the TAG, given that my position is essentially what they wrote? Because it failed to acknowledge the limited and contingent upside of removing third-party cookies, or the thorny issues we're left with after they're gone.

Unmasking The Problem

So, what do third-party cookies do? And how do they relate to the privacy threat model?

Like a lot of web technology, third-party cookies have both positive and negative uses. Owing to a historical lack of platform-level identity APIs, they form the backbone of nearly every large Single Sign-On (SSO) system. Thankfully, replacements have been developed and are being iterated on.

Unfortunately, some browsers have unilaterally removed them without developing such replacements, disrupting sign-in flows across the web, harming users and pushing businesses toward native mobile apps. That's bad, as native apps face no limits on communicating with third parties and are generally worse for tracking. They're not even subject to pro-user interventions like browser extensions. The TAG should have called out this aspect of the current debate in its Finding, encouraging vendors to adopt APIs that will make the transition smoother.

The creepy uses of third-party cookies relate to advertising. Third-party cookies provide ad networks and data brokers the ability to silently reidentify users as they browse the web. Some build "shadow profiles", and most target ads based on sites users visit. This targeting is at the core of the debate around third-party cookies.

Adtech companies like to claim targeting based on these dossiers allows them to put ads in front of users most likely to buy, reducing wasted ad spending. The industry even has a shorthand: "right people, right time, right place."

Despite the bold claims and a consensus that "targeting works," there's reason to believe pervasive surveillance doesn't deliver, and even when it does, isn't more effective.

Assuming the social utility of targeted ads is low — likely much lower than adtech firms claim — shouldn't we support the TAG's finding? Sadly, no. The TAG missed a critical opportunity to call for legislative fixes to the technically unfixable problems it failed to enumerate.

Privacy isn't just about collection, it's about correlation across time. Adtech can and will migrate to the server-side, meaning publishers will become active participants in tracking, funneling data back to ad networks directly from their own logs. Targeting pipelines will still work, with the largest adtech vendors consolidating market share in the process.

This is why "give us your email address for a 30% discount" popups and account signup forms are suddenly everywhere. Email addresses are stable, long-lived reidentifiers. Overt mechanisms like this are already replacing third-party cookies. Make no mistake: post-removal, tracking will continue for as long as reidentification has perceived positive economic value. The only way to change that equation is legislation; anything else is a band-aid.

Pulling tracking out of the shadows is good, but a limited and contingent good. Users have a terrible time recognising and mitigating risk on the multi-month time-scales where privacy invasions play out. There's virtually no way to control or predict where collected data will end up in most jurisdictions, and long-term collection gets cheaper by the day.

Once correlates are established, or "consent" is given to process data in ways that facilitate unmasking, re-identification becomes trivial. It only takes giving a phone number to one delivery company, or an email address to one e-commerce site to suddenly light up a shadow profile, linking a vast amount of previously un-attributed browsing to a user. Clearing caches can reset things for a little while, but any tracking vendor that can observe a large proportion of browsing will eventually be able to join things back up.

Removal of third-party cookies can temporarily disrupt this reidentification while collection funnels are rebuilt to use "first party" data, but that's not going to improve the situation over the long haul. The problem isn't just what's being collected now, it's the ocean of dormant data that was previously slurped up.[3] The only way to avoid pervasive collection and reidentification over the long term is to change the economics of correlation.

The TAG surely understands the only way to make that happen is for more jurisdictions to pass privacy laws worth a damn. It should say so.

Fire And Movement

The goal of tracking is to pick users out of crowds, or at least bucket them into small unique clusters. As I explained on Mastodon, this boils down to bits of entropy, and those bits are everywhere. From screen resolution and pixel density, to the intrinsic properties of the networks, to extensions, to language and accessibility settings that folks rely on to make browsing liveable. Every attribute that is even subtly different can be a building block for silent reidentification; A.K.A., "fingerprinting."[4]
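As a rough illustration of the bits-of-entropy framing, here is a minimal Python sketch. The attribute names and population shares are invented for illustration, not measurements, and the sum assumes the attributes are statistically independent, which real attributes usually aren't (so the total is an upper bound).

```python
import math


def bits_to_single_out(population: int) -> float:
    """Bits of entropy needed to pick one individual out of `population`."""
    return math.log2(population)


def surprisal_bits(share: float) -> float:
    """Bits revealed by observing a value that only `share` of users have."""
    return -math.log2(share)


def combined_bits(shares) -> float:
    """Total bits revealed, ASSUMING the attributes are independent."""
    return sum(surprisal_bits(s) for s in shares)


# Hypothetical shares: the fraction of all users who look the same as this
# user on each attribute. These numbers are made up for the example.
observed = {
    "screen_resolution": 1 / 16,
    "language_setting": 1 / 8,
    "installed_extensions": 1 / 1024,
    "timezone": 1 / 32,
}

revealed = combined_bits(observed.values())
needed = bits_to_single_out(2**32)
print(f"revealed {revealed:.0f} of {needed:.0f} bits")
```

With these made-up shares, four unremarkable-looking attributes supply 22 of the 32 bits needed to single out one member of a crowd of 2**32 (about 4.3 billion) users.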

In jurisdictions where laws allow collected data to remain the property of the collector, the risks posed by data-at-rest are only slightly attenuated by narrowing the straw through which collection takes place.

It's possible to imagine computing that isn't fingerprintable, but that isn't what anyone is selling. For complex reasons, even the most cautious use of commodity computers is likely to be uniquely identifiable with enough time. This means that the question to answer isn't "do we think tracking is bad?", it's "given that we can't technically eliminate it, how can we rebuild privacy?". The TAG's new Finding doesn't wrestle with that question, doing the community a disservice in the process.

The most third-party cookie removal can deliver is temporary disruption. That disruption will affect distasteful collectors, costing them money in the short run. Many think of this as a win, I suspect because they fail to think through the longer-term consequences. The predictable effect will be a recalibration and entrenchment of surveillance methods. It will not put the panopticon out of business; only laws can do that.

For a preview of what this will look like, think back on Apple's "App Tracking Transparency" kayfabe, which did not visibly dent Facebook's long-term profits.

So this is not a solution to privacy, it's fire-and-movement tactics against corporate enemies. Because of the deep technical challenges in defeating fingerprinting, even the most outspoken vendors have given up, instituting "nutrition labels" to shift responsibility for privacy onto consumers.

If the best that vertically-integrated native ecosystems can do is to shift blame, the TAG should call out those same vendors when they posture about net-ineffective changes rather than lobbying for stronger laws that can truly change the game. The TAG should also advocate for the web, not play into fearmongering by folks trying to lock users into proprietary native apps.

Finding A Way Forward

The most generous take I can muster is that the TAG's work is half-done. Calling on vendors to drop third-party cookies has the virtue of being technical and actionable, properties I believe all TAG writing should embody. But having looked deeply at the situation, the TAG should have also called on browser vendors to support further reform along several axes — particularly vendors that also make native OSes.

First, if the TAG is serious about preventing tracking and improving the web ecosystem, it should call on all OS vendors to prohibit the use of "in-app browsers" when displaying third-party content within native apps.

It is not sufficient to prevent JavaScript injection because the largest native apps can simply convince the sites to include their scripts directly. For browser-based tracking attenuation to be effective, these side-doors must be closed. Firms grandstanding about browser privacy features without ensuring users can reliably enjoy the protections of their browser need to do better. The TAG is uniquely positioned to call for this erosion of privacy and the web ecosystem to end.

Next, the TAG should have outlined the limits of technical approaches to attenuating data collection. It should also call on browser vendors to adopt scale-based interventions (rather than absolutism) in mitigating high-entropy API use.[5] The TAG should go first in moving past debates that don't acknowledge impossibilities in removing all reidentification, and encourage vendors to do the same. There's no solution to the privacy puzzle that can be solved by the purchase of a new phone, and the TAG should be clarion about what will end our privacy nightmare: privacy laws worth a damn.

Lastly, the TAG should highlight discrepancies between privacy marketing and the failure of vendors to push for strong privacy laws and enforcement. Because the threat model of privacy intrusion renders solely technical interventions ineffective on long timeframes, this is the rare case in which the TAG should push past providing technical advice.

The TAG's role is to explain complex things with rigor and signpost credible ways forward. It has not done that yet regarding third-party cookies, but it's not too late.


  1. Praise, as well as concern, in this post is specific to today's TAG, not the output of the group while I served. I surely got a lot of things wrong, and the current TAG is providing a lot of value. My hope here is that it can extend this good work by expanding its new Finding. ↩︎

  2. Also, James Roswell can go suck eggs. ↩︎

  3. It's neither here nor there, but the TAG also failed in these posts to encourage users and developers to move their use of digital technology into real browsers and out of native apps which invasively track and fingerprint users to a degree web adtech vendors only fantasize about.

    A balanced finding would call on Apple to stop stonewalling the technologies needed to bring users to safer waters, including PWA installation prompts. ↩︎

  4. As part of the drafting of the 2015 finding on Unsanctioned Web Tracking, the then-TAG (myself included) spent a great deal of time working through the details of potential fingerprinting vectors. What we came to realise was that only the Tor Browser had done the work to credibly analyse fingerprinting vectors and produce a coherent threat model. To the best of my knowledge, that remains true today.

    Other vendors continue to publish gussied-up marketing documents and stroppy blog posts that purport to cover the same ground, but consistently fail to do so.

    To understand the difference, we can do a small thought experiment, enumerating what would be necessary to sand off currently-identifiable attributes of individual users. Because only 31 or 32 bits are needed to uniquely identify anybody (often less), we want a high safety factor. This means bundling users into very large crowds by removing distinct observable properties. To sand off variations between users, a truly private browser might:

    • Run the entire browser in a VM in order to:
      • Cap the number of CPU cores, their frequency, and possibly to centralise on a single instruction set (e.g., emulating ARM when running on x86). This will likely result in a 2-5x slowdown.
      • Ensure (high) fixed latency for all disk access operations via VM.
      • Set a consistent and low cap on total available memory and, via VM, emulate slower memory access.
    • Disable hardware acceleration for all graphics and media.
    • Disable JIT, slowing JavaScript-based apps by 10x or more.
    • Only allow a small, fixed set of fonts, screen sizes, pixel densities, and refresh rates; no more resizing your browser with a mouse, and expect everything to look pixelated. Expect all animations to feel choppy.
    • Remove most accessibility settings.
    • Remove the ability to install extensions.
    • Eliminate direct typing and touch-based interactions, as those can leak timing information that's unique over long-enough sessions.
    • Run all traffic through Tor or similarly high-latency VPN egress nodes.
    • Disable all reidentifying APIs (no more web-based video conferencing!)

    Only the Tor project is shipping a browser anything like this today, and it's how you can tell that most of what passes for "privacy" features in other browsers are anti-annoyance and anti-creep-factor interventions; they matter, but won't end the digital panopticon. ↩︎

  5. It's not a problem that sign-in flows need third-party cookies today, but it is a problem that they're used for pervasive tracking.

    Likewise, the privacy problems inherent in email collection or camera access or filesystem folders aren't absolute, they're related to scale of use. There are important use-cases that demand these features, and computers aren't going to stop supporting them. This means the debate is only whether or not users can use the web to meet those needs. Folks who push an absolutist line are, in effect, working against the web's success. This is anti-user, as the alternatives are generally much more invasive native apps.

    Privacy problems arise at scale and across time. Browsers should be doing more to discourage high-quality reidentification across cache clearing and in ways that escalate with risk. The first site you grant camera access isn't the issue; it's the 10th. Similarly, speed bumps should be put in place for use of reidentifying APIs on sites across cache clearing where possible.

    The TAG can be instrumental in calling for this sort of change in approach. ↩︎

Why Browsers Get Built

There are only two-and-a-half reasons to build a browser, and they couldn't be more different in intent and outcome, even when they look superficially similar. Learning to tell the difference is helpful for browser project managers and engineers, but also for working web developers who struggle to develop theories of change for affecting browser teams.

Like Platform Adjacency Theory and The Core Web Platform Loop, this post started[1] as a set of framing devices that I've been sketching on whiteboards for the best part of a decade. These lenses aren't perfect, but they provide starting points for thinking about the complex dynamics of browsers, OSes, "native" platforms, and the standards-based Web platform.

The reasons to build browsers are most easily distinguished by the OSes they support and the size and composition of their teams ("platform" vs. "product"). Even so, there are subtleties that throw casual observers for a loop. In industrial-scale engineering projects like browsers, headcount is destiny, but it isn't the whole story.

Web As Platform

This is simultaneously the simplest and most vexing reason to build a browser.

Under this logic, browsers are strategically important to a broader business, and investments in platforms are investments in their future competitiveness compared with other platforms, not just other browsers. But none of those investments come good until the project has massive scale.

This strategy is exemplified by 1990s-era Andreessen's goal to render Windows "a poorly debugged set of device drivers".

The idea is that the web is where the action is, and that the browser winning more user Jobs To Be Done follows from increasing the web platform's capability. This developer-enabling flywheel aims to liberate computing from any single OS, supporting a services model.

A Web As Platform play depends on credibly keeping up with expansions in underlying OS features. The goal is to deliver safe, portable, interoperable, and effective versions of important capabilities at a fast enough clip to maintain faith in the web as a viable ongoing investment.

In some sense it's a confidence-management exercise. A Web As Platform endgame requires that the platform increases its expressive capacity year over year. It must do as many new things each year as new devices can, even if the introduction of those features is delayed for the web by several years; the price of standards.

Platform-play browsers aim to grow and empower the web ecosystem, rather than contain it or treat it as a dying legacy. Examples of this strategic orientation include Netscape, Mozilla (before it lost the plot), Chrome, and Chromium-based Edge (on a good day).

Distinguishing Traits

The OS Agenda

There are two primary tactical modes of this strategic posture, both serving the same goal: to make an operating system look good by enabling a corpus of web content to run well on it while maintaining a competitive distance between the preferred (i.e., native, OS-specific) platform and the hopefully weaker web platform.

The two sub-variants differ in ambition owing to the market positions of their OS sponsors.

OSes treat browsers they sponsor as bridges or moats, but never the main object.
Photo by Paul Arky

Browsers as Bridges

OSes deploy browsers as a bridge for users into their environment when they're underdogs or fear disruption.

Of course, it would be better from the OS vendor's perspective if everyone simply wrote all of the software for their proprietary platform, maximising OS feature differentiation. But smart vendors also know that's not possible when an OS isn't dominant.

OS challengers, therefore, strike a bargain. For the price of developing a browser, they gain the web's corpus of essential apps and services, serving to "de-risk" the purchase of a niche device by offering broad compatibility with existing software through the web. If they do a good job, a conflicted short-term investment can yield enough browser share to enable a future turn towards moat tactics (see below). Examples include Internet Explorer 3-6 as well as Safari on Mac OS X and the first iPhone.

Conversely, incumbents fearing disruption may lower their API drawbridges and allow the web's power to expand far enough that the incumbent can gain share, even if it's not for their favoured platform; the classic example here being Internet Explorer in the late 90s. Once Microsoft knew it had Netscape well and truly beat, it simply disbanded the IE team, leaving the slowly rusting husk of IE to decay. And it would have worked, too, if it weren't for those pesky Googlers pushing IE6 beyond what was "possible"!

Browsers as Moats

Without meaningful regulation, powerful incumbents can use anti-competitive tactics to suppress the web's potential to disrupt the OS and tilt the field towards the incumbent's proprietary software ecosystem.

This strategy works by maintaining high browser share while never allowing the browser team to deliver features that are sufficient to disrupt OS-specific alternatives.

In practice, moats are arbitrage on the unwillingness of web developers to understand or play the game, e.g. by loudly demanding timely features or recommending better browsers to users. Incumbents know that web developers are easily led and are happy to invent excuses for them. It's cheap to add a few features here and there to show you're "really trying", despite underfunding browser teams so much they can never do more than glorified PR for the OS. This was the strategy behind IE 7-11 and EdgeHTML. Even relatively low share browsers can serve as effective moats if they can't be supplanted by competitive forces.

Apple has perfected the moat, preventing competitors from even potentially offering disruptive features. This adds powerfully to the usual moat-digger's weaponisation of consensus processes. Engineering stop-energy in standards and quasi-standards bodies is nice and all, but it is so much more work than simply denying anyone the ability to ship the features that you won't.

Tipping Points

Bridge and moat tactics appear very different, but the common thread is control with an intent to suppress web platform expansion. In both cases, the OS will task the browser team to heavily prioritise integrations with the latest OS and hardware features at the expense of more broadly useful capabilities — e.g. shipping "notch" CSS and "force touch" events while neglecting Push.

Browser teams tasked to build bridges can grow quickly and have remit that looks similar to a browser with a platform agenda. Still, the overwhelming focus starts (and stays) on existing content, seldom providing time or space to deliver powerful new features to the Web. A few brave folks bucked this trend, using the fog of war to smuggle out powerful web platform improvements under a more limited bridge remit; particularly the IE 4-6 crew.

Teams tasked with defending (rather than digging) a moat will simply be starved by their OS overlords. Examples include IE 7+ and Safari from 2010 onward. It's the simplest way to keep web developers from getting uppity without leaving fingerprints. The "soft bigotry of low expectations", to quote a catastrophic American president.

Distinguishing Traits

Searchbox Pirates

This is the "half-reason"; it's not so much a strategic posture as it is environment-surfing.

Over the years, many browsers that provide little more than searchboxes atop someone else's engine have come and gone. They lack staying power because their teams lack the skills, attitudes, and management priorities necessary to avoid being quickly supplanted by a fast-following competitor pursuing one of the other agendas.

These browsers also tend to be short-lived because they do not build platform engineering capacity. Without agency in most of their codebase, they either get washed away in unmanaged security debt or are swamped by rebasing challenges (i.e., a failure to "work upstream"). They also lack the ability to staunch bleeding when their underlying engine fails to implement table-stakes features, which leads to lost market share.

Historical examples have included UC Browser, and more recently, the current crop of "secure enterprise browsers" (Chromium + keyloggers). Perhaps more controversially, I'd include Brave and Arc in this list, but their engineering chops make me think they could cross the chasm and choose to someday become platform-led browsers. They certainly have leaders who understand the difference.

Distinguishing Traits

Implications

This model isn't perfect, but it has helped me tremendously in reliably predicting the next moves of various browser players, particularly regarding standards posture and platform feature pace.[2]

The implications are only sometimes actionable, but they can help us navigate. Should we hold out hope that a vendor in a late-stage browser-as-moat crouch will suddenly turn things around? Well, that depends on the priorities of the OS, not the browser team.

Similarly, a Web As Platform strategy will maximise a browser's reach and its developers' potential, albeit at the occasional expense of end-user features.

The most important takeaway for developers may be what this model implies about browser choice. Products with an OS-first agenda are always playing second fiddle to a larger goal that does not put web developers first, second, or even third. Coming to grips with this reality lets us more accurately recommend browsers to users that align with our collective interests in a vibrant, growing Web.


  1. I hadn't planned to write this now, but an unruly footnote in an upcoming post, along with Frances' constant advice to break things up, made me realise that I already had 90% of it ready. ↩︎

  2. Modern-day Mozilla presents a puzzle within this model.

    In theory, Mozilla's aims and interests align with growing the web as a platform; expanding its power to enable a larger market for browsers, and through it, a larger market for Firefox.

    In practice, that's not what's happening. Despite investing almost everything it makes back into browser development, Mozilla has also begun to slow-walk platform improvements. It walked away from PWAs and has continued to spread FUD about device APIs and other features that would indicate an appetite for an expansive vision of the platform.

    In a sense, it's playing the OS Agenda, but without an OS to profit from or a proprietary platform to benefit with delay and deflection. This is vexing, but perhaps expected within an organisation that has entered a revenue-crunch crouch. Another way to square the circle is to note that the Mozilla Manifesto doesn't actually speak about the web at all. If the web is just another fungible application running atop the internet (which the manifesto does center), then it's fine for the web to be frozen in time, or even shrink.

    Still, Mozilla leadership should be thinking hard about the point of maintaining an engine. Is it to hold the coats of proprietary-favouring OS vendors? Or to make the web a true competitor? ↩︎
