Do you remember the good old days?
Time was when all you had to do to avoid being tracked online was to remember to delete your cookies. And dig out your DOM storage. And flush your Flash LSOs. And clean the ETags out of your cache, steer clear of those mobile phone vendors who strap unique IDs on to outgoing traffic, and decide if the security benefits of HSTS outweigh the potential privacy trade-offs.
Ah, simpler times.
Keeping on top of those things could be a headache but no matter how many tracking barnacles anchored themselves to the hull of your browser you were still in charge, provided that you and your plugins could find them and scrape them off.
Then came fingerprinting and tracking was turned on its head.
Unlike the stateful, active tracking of deletable beacons like cookies and LSOs, fingerprinting is the passive recording of your browser’s individual attributes.
It turns out that browsers share (and overshare) so much information about themselves, and that the information varies so much from one user to another, that your browser doesn’t need a cookie to be tracked at all because your browser itself is a cookie.
And good luck deleting that.
From screen sizes and user-agent strings to long lists of fonts and plugins, web browsers are brimming with sources of entropy. Individually the values of each source of entropy aren’t distinctive enough to act as a unique ID, but add them together and they are. That ID is your browser’s fingerprint.
The more sources of entropy there are, the easier it is to create stable fingerprints.
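To make the "add them together" step concrete, here is a minimal sketch of how a handful of attribute values could be folded into one ID. The attribute names, the `fingerprint` helper and the choice of FNV-1a as the hash are all assumptions made for illustration; the article doesn't say how any real tracker combines its values.

```typescript
// Hypothetical bag of attribute values collected from a browser.
type Attributes = { [name: string]: string };

// 32-bit FNV-1a over UTF-16 code units: a small, non-cryptographic hash
// standing in for whatever a real fingerprinting script would use.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    // Multiply by the FNV prime 0x01000193 using shifts, modulo 2^32.
    hash =
      (hash + (hash << 1) + (hash << 4) + (hash << 7) + (hash << 8) + (hash << 24)) >>> 0;
  }
  return ("0000000" + (hash >>> 0).toString(16)).slice(-8);
}

// Sort the keys so the same attributes always produce the same ID,
// regardless of the order in which they were collected.
function fingerprint(attrs: Attributes): string {
  const parts: string[] = [];
  const names = Object.keys(attrs).sort();
  for (let i = 0; i < names.length; i++) {
    parts.push(names[i] + "=" + attrs[names[i]]);
  }
  return fnv1a(parts.join("|"));
}
```

Calling `fingerprint({ screen: "1920x1080", fonts: "Arial,Verdana" })` returns an eight-character hex string. Changing any single value changes the ID, which is why a fingerprint is only as stable as its least stable attribute.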
The EFF’s pioneering Panopticlick test turned our heads with just eight browser attributes and kicked off a search for more. The more we looked, the worse it got.
As the list of fingerprintable characteristics grew in size and sophistication it became clear that some of them, such as how much charge your battery has left in it or how your browser renders 3D graphics, are looking past the browser and into aspects of the hardware and OS (Operating System) below.
Those characteristics are the same no matter which browser you use.
Researchers in the USA have now demonstrated that browsers leak enough of this kind of information that it’s possible to create stable, usable fingerprints for your computer.
In other words it isn’t your browser that’s a cookie, your computer is a cookie.
Building on earlier work by Károly Boda et al from the Budapest University of Technology and Economics, Yinzhi Cao and Song Li of Lehigh University and Erik Wijmans of Washington University, St Louis have developed state-of-the-art browser fingerprinting and cross-browser fingerprinting techniques:
…that can identify not only users behind one browser but also these that use different browsers on the same machine. Our approach adopts OS and hardware levels features including graphic cards exposed by WebGL, audio stack by AudioContext, and CPU by hardwareConcurrency. Our evaluation shows that our approach can uniquely identify more users than AmIUnique for single-browser fingerprinting, and than Boda et al. for cross-browser fingerprinting. Our approach is highly reliable, i.e., the removal of any single feature only decreases the accuracy by at most 0.3%.
…our approach can successfully identify 99.24% of users as opposed to 90.84% for state of the art on single-browser fingerprinting against the same dataset. Further, our approach can achieve higher uniqueness rate than the only cross-browser approach in the literature…
To achieve their results the researchers mixed improved versions of existing fingerprinting techniques, such as a more robust cross-browser method for capturing screen resolution, with a raft of exotic new techniques.
The new techniques include the first use of information from audio devices in fingerprinting and what looks like a full body workout of the GPU (Graphics Processing Unit).
It’s been known for a while that different browsers exhibit measurable differences in the way they render pictures and text on the HTML canvas element (a virtual drawing surface you can include in web pages).
So-called canvas fingerprinting is good enough to have been used for tracking in the wild, and research carried out in 2014 found 20 separate implementations of canvas fingerprinting across the top 100,000 websites (including one that was rolled out silently on to 13m sites by social media button peddler AddThis).
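A bare-bones version of the idea looks something like the sketch below: draw some text on a canvas, then read the pixels back as a data URL. The `CanvasLike` and `TextContext` types and the `canvasSignature` name are assumptions introduced here so the sketch compiles without browser typings. This is not the code used by AddThis or by the researchers.

```typescript
// Minimal structural types covering only the canvas members this sketch uses.
interface TextContext {
  font: string;
  textBaseline: string;
  fillText(text: string, x: number, y: number): void;
}

interface CanvasLike {
  width: number;
  height: number;
  getContext(kind: "2d"): TextContext | null;
  toDataURL(): string;
}

// Draw a fixed string and return the rendered image as a data URL.
// Differences in fonts and anti-aliasing change the pixels, and
// therefore the string that comes back.
function canvasSignature(canvas: CanvasLike): string | null {
  canvas.width = 240;
  canvas.height = 60;
  const ctx = canvas.getContext("2d");
  if (!ctx) {
    return null; // canvas unavailable or blocked
  }
  ctx.textBaseline = "top";
  ctx.font = "16px Arial";
  ctx.fillText("Cwm fjordbank glyphs vext quiz", 4, 20);
  return canvas.toDataURL();
}

// In a browser you would pass a real element, for example:
//   canvasSignature(document.createElement("canvas"))
```

A tracker would normally hash the returned string rather than store it whole. Note that the canvas never has to be attached to the page, so there is nothing for the user to see.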
Cao et al take that idea of canvas fingerprinting and run a marathon with it.
GPUs are put through their paces with lighting and shadow mapping, clipping, vertex and fragment shading, font rendering and anti-aliasing, among other things.
Your computer’s execution of these tests will be different enough from mine that we can be reliably told apart.
Some screenshots of the canvas fingerprinting tests taken from the researchers’ testing site uniquemachine.org are included below. Of course they’re only visible because this is research – somebody interested in silently tracking you would tuck these away where you couldn’t see them:
Some sources of entropy are captured indirectly via “side channels”.
Your browser will happily give up lists of the fonts it supports but won’t list the languages that you’ve got installed. That’s easily overcome though: all you have to do to test if a language is installed is try to use it to write its own name. If your browser tries to write “Javanese” in Javanese but isn’t equipped with the right writing system then it’ll render a string of white boxes like this: □□□□□.
A string of white boxes means the language isn’t installed, anything else means it is. Rinse and repeat for each language and you’ve got your list.
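The "rinse and repeat" loop might look something like the sketch below. It is one possible heuristic, not the researchers' method: it assumes that text falling back to boxes measures the same as an equal-length run of a code point that no font has a glyph for (U+0378, which is unassigned). The `Measure` callback, the sample strings and the function names are all hypothetical.

```typescript
// Returns some measurement of how the text was rendered, e.g. its width.
// In a browser this could be: (t) => ctx.measureText(t).width
type Measure = (text: string) => number;

// U+0378 is an unassigned code point, so it should render as a box (assumption).
const MISSING_GLYPH = "\u0378";

// Assumed sample text for each writing system, keyed by an illustrative name.
const SAMPLES: { [name: string]: string } = {
  Greek: "Ελληνικά",
  Japanese: "日本語",
  Javanese: "ꦧꦱꦗꦮ"
};

// If the sample measures exactly like a run of boxes, assume it rendered as boxes.
function looksInstalled(sample: string, measure: Measure): boolean {
  let boxes = "";
  for (let i = 0; i < sample.length; i++) {
    boxes += MISSING_GLYPH;
  }
  return measure(sample) !== measure(boxes);
}

// Repeat the test for every sample and collect the names that pass.
function installedScripts(samples: { [name: string]: string }, measure: Measure): string[] {
  const found: string[] = [];
  const names = Object.keys(samples).sort();
  for (let i = 0; i < names.length; i++) {
    if (looksInstalled(samples[names[i]], measure)) {
      found.push(names[i]);
    }
  }
  return found;
}
```

A width comparison can be fooled when real glyphs happen to be exactly as wide as the boxes, so a more careful version would compare the rendered pixels instead. Either way, the resulting list is one more value to feed into the fingerprint.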
Other sources of entropy are harvested from your browser simply by asking for them.
Did you know that any website you visit can ask how many virtual cores your computer’s CPU has, for example? It’s right there in the
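The paper quoted above names hardwareConcurrency as its CPU signal, and browsers expose it as the `navigator.hardwareConcurrency` property. Here is a minimal sketch of reading it. The `NavigatorLike` type and the `logicalCores` helper are names made up for this example so that it also runs outside a browser.

```typescript
// Only the one navigator property this sketch cares about.
interface NavigatorLike {
  hardwareConcurrency?: number;
}

// Return the number of logical cores the browser reports,
// or null if the property is missing or has been blocked.
function logicalCores(nav: NavigatorLike | undefined): number | null {
  if (!nav || typeof nav.hardwareConcurrency !== "number") {
    return null;
  }
  return nav.hardwareConcurrency;
}

// In a browser: logicalCores(navigator)
```

No permission prompt is involved in reading it, which is what makes it an "ask and we’ll tell you" attribute.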
One notable absentee from the list of “ask and we’ll tell you” browser attributes is battery life.
In 2015 researchers in France and Belgium showed that they could use the amount of charge left in your battery (accessible via browsers’ Battery Status API) as a unique ID.
About a year later a large scale study of tracking techniques used by the top 1m websites discovered battery life being used as a fingerprinting tactic in the wild.
Faced with clear evidence of its use as a tracking tool and serious questions about whether it had ever been used for its intended purpose by anyone, ever, Firefox promptly pulled the plug on it.
What you can do
The paper deals with fingerprinting in a fairly even-handed way, suggesting that while it can be used to deliver unwanted, targeted ads, it might also be a useful second authentication factor.
I’m not ready to drink the Kool-Aid on this: cookies work just fine, thanks, and anything else says you’re not comfortable giving users a say in whether or not they’re tracked.
Unfortunately fingerprinting is very difficult to defend against.
The only browser that really makes life hard for fingerprinters is, not surprisingly, the Tor Browser. Privacy and security are the organising principles for the Tor Browser, a modified version of Firefox, and it’s been called out in all the research on the topic I’ve seen, including this one:
Tor Browser can successfully defend many browser fingerprinting techniques, including features proposed in our paper.
Provided that you disallow use of the canvas element (Tor tells you when a website tries to use it) the researchers reckon that Tor only gives them access to a few of the information sources they use.
It’s possible that browser plugins like Privacy Badger, Ghostery or NoScript help too. They don’t counter fingerprinting directly but their disruption of trackers and ads might stop you from loading a third-party fingerprinting script. Weighed against that, an esoteric collection of plugins will make you a bit less like everyone else and therefore easier to fingerprint.
You can generate your own browser and computer fingerprints at the researchers’ site uniquemachine.org. It’ll give you a good feel for some of the graphical tricks used in the research but it won’t tell you how easily tracked you are.
The code used in the research is available from Song Li’s GitHub pages.