User:Ergozat/iir2

Browser fingerprinting is a way to uniquely identify a device on the Internet.

Definition
A unique browser fingerprint is derived from the pattern of information visible whenever a computer visits a website. The combinations of values thus collected are sufficiently distinct that they can be used as a tool for tracking. Unlike cookies, fingerprints are generated on the server side and are difficult for a user to influence.

Fingerprints usually carry 18-20 bits of identifying information, which is often enough to uniquely identify a browser.
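To make the "18-20 bits" figure concrete: an attribute combination observed with probability p carries -log2(p) bits of identifying information, so 18 bits corresponds to a configuration shared by roughly one browser in 262,144. A minimal sketch (the probabilities below are illustrative, not Panopticlick data):

```python
import math

# Surprisal: a value seen with probability p carries -log2(p) bits
# of identifying information.
def identifying_bits(p: float) -> float:
    return -math.log2(p)

# 18 bits <=> a configuration shared by 1 in 2**18 (~262k) browsers.
print(identifying_bits(1 / 2**18))   # 18.0
# A coarse attribute shared by half of all browsers is worth 1 bit.
print(identifying_bits(0.5))         # 1.0
```

Summing the bits contributed by independent attributes (fonts, plugins, screen size, ...) is what pushes a combined fingerprint toward uniqueness.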

We investigate the degree to which modern web browsers are subject to “device fingerprinting” via the version and configuration information that they will transmit to websites upon request. We implemented one possible fingerprinting algorithm, and collected these fingerprints from a large sample of browsers that visited our test site, panopticlick.eff.org.

We know that in the particular sample of browsers observed by Panopticlick, 83.6% had unique fingerprints.

An important issue to be considered is whether fingerprinting is active or passive.

In computer security, ﬁngerprinting consists of identifying a system from the outside, i.e. guessing its kind and version [1] by observing speciﬁc behaviors (passive ﬁngerprinting), or collecting system responses to various stimuli (active ﬁngerprinting)

Also similarly to OS ﬁngerprinting, there are two kinds of browser ﬁngerprinting. On the one hand, one may uniquely identify a browser (see e.g. [3]), on the other hand, one may uniquely identify a browser type, that is, identifying the browser implementation (e.g. Firefox vs Internet Explorer) and its version number (e.g. IE8 vs IE9).

Stateless web tracking (fingerprinting). Stateless tracking methods rely on device-specific information and user-specific configurations in order to uniquely re-identify users.

Stateless web tracking does not rely on unique identiﬁers stored on user devices, but on the properties of user devices including: browser version, installed fonts, browser plugins, and screen resolution
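The properties listed above are typically canonicalized and hashed into a single identifier. A minimal sketch of that idea (the attribute names and values are invented; real fingerprinters collect many more signals):

```python
import hashlib
import json

# Derive a stateless fingerprint by hashing device properties.
def fingerprint(attrs: dict) -> str:
    canonical = json.dumps(attrs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

visit_1 = {
    "browser_version": "ExampleBrowser/42.0",
    "fonts": ["Arial", "Courier New", "Times New Roman"],
    "plugins": ["PDF Viewer"],
    "screen": "1920x1080",
}
visit_2 = dict(visit_1)  # same device seen again: identical properties

# The same properties hash to the same identifier, which is what makes
# repeat visits linkable without storing anything on the device.
assert fingerprint(visit_1) == fingerprint(visit_2)
```

No identifier is ever written to the device: the "cookie" is recomputed from the device's own characteristics at every visit, which is exactly the linkability problem discussed below.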

Browser ﬁngerprinting is a technique that can be used by a web server to uniquely identify a platform; it involves examining information provided by the browser, e.g. to website-originated JavaScript

Of course, web cookies and/or the client IP address can be used for the same purposes, but browser ﬁngerprinting is designed to enable browser identiﬁcation even if cookies are not available and the IP address is obfuscated, e.g. through the use of anonymising proxies

A key insight is that when it comes to web tracking, the real problem with ﬁngerprinting is not uniqueness of a ﬁngerprint, it is linkability, i.e., the ability to connect the same ﬁngerprint across multiple visits

Usages
In some cases, we detected fingerprinting scripts that were embedded in ad banners. It is unclear whether first parties serving these ad banners are aware of the existence of these embedded fingerprinting scripts.

Companies express that they deploy device fingerprinting in the context of a variety of web services. The spectrum of use cases includes fraud detection, protection against account hijacking, anti-bot and anti-scraping services, enterprise security management, protection against DDoS attacks, real-time targeted marketing, campaign measurement, reaching customers across devices, and limiting the number of accesses to services.

In all the cases we encountered, there were no visible effects of fingerprinting, and the users were not informed that they were being fingerprinted.

In the wild
Follow-up studies by Nikiforakis et al. and Acar et al. showed that stateless web tracking is already used in the wild. Englehardt and Narayanan [9] recently showed that fingerprinters are expanding their arsenal of techniques, e.g., with audio-based fingerprinting.

Our methodology can be divided into two main steps. In the first, we identified the ways we can detect canvas fingerprinting, developed a crawler based on an instrumented browser and ran exploratory crawls. This stage allowed us to develop a formal and automated method based on the early findings. In the second step, we applied the analysis method we distilled from the early findings and nearly fully automated the detection of canvas fingerprinting.

We logged the URL of the caller script and the line number of the calling (initiator) code using Firefox’s nsContentUtils::GetCurrentJSContext and nsJSUtils::GetCallingLocation methods. This allowed us to precisely attribute the fingerprinting attempt to the responsible script and the code segment.

We crawled the home pages of the top 100,000 Alexa sites with the instrumented Firefox browser between 1-5 May 2014.

Table 1 shows the prevalence of the canvas fingerprinting scripts found during the home page crawl of the Top Alexa 100,000 sites. We found that more than 5.5% of crawled sites actively ran canvas fingerprinting scripts on their home pages. Although the overwhelming majority (95%) of the scripts belong to a single provider (addthis.com), we discovered a total of 20 canvas fingerprinting provider domains, active on 5,542 of the top 100,000 sites. Of these, 11 provider domains, encompassing 5,532 sites, are third parties. Based on these providers’ websites, they appear to be companies that deploy fingerprinting as part of some other service rather than offering fingerprinting directly as a service to first parties. We found that the other nine provider domains (active on 10 sites) are in-house fingerprinting scripts deployed by first parties. Note that our crawl in this paper was limited to home pages.

To quantify the use of web-based ﬁngerprinting on popular websites, we crawled up to 20 pages for each of the Alexa top 10,000 sites, searching for script inclusions and iframes originating from the domains that the three studied companies utilize to serve their ﬁngerprinting code

Through this process, we discovered 40 sites (0.4% of the Alexa top 10,000) utilizing ﬁngerprinting code from the three commercial providers

The most popular site making use of ﬁngerprinting is skype.com, while the two most popular categories of sites are: “Pornography” (15%) and “Personals/Dating” (12.5%).

The aforementioned adoption numbers are lower bounds since our results do not include pages of the 10,000 sites that were not crawled [...] Moreover, some popular sites may be using their own fingerprinting algorithms for performing device identification and not rely on the three studied fingerprinting companies

To discover less popular sites making use of ﬁngerprinting, we used a list of 3,804 domains of sites that, when analyzed by Wepawet [27], requested the previously identiﬁed ﬁngerprinting scripts.

Each domain was submitted to TrendMicro’s and McAfee’s categorization services, which provided as output the domain’s category and “safety” score

The top 10 categories of websites utilizing fingerprinting: spam, malicious sites, adult/mature content, computers/internet, dating/personals...

(About less popular sites that use fingerprinting) The top two categories are also the ones that were the least expected. 163 websites were identified as malicious, such as using exploits for vulnerable browsers, conducting phishing attacks or extracting private data from users, whereas 1,063 sites were categorized as “Spam” by the two categorizing engines (on 3,804 sites)

While our data-set is inherently skewed towards “maliciousness” due to its source, it is important to point out that all of these sites were found to include, at some point in time, fingerprinting code provided by the three studied providers. This observation, coupled with the fact that for all three companies, an interested client must set an appointment with a sales representative in order to acquire fingerprinting services, points to the possibility of fingerprinting companies working together with sites of dubious nature, possibly for the expansion of their fingerprint databases and the acquisition of more user data.

we analyzed the ﬁngerprinting libraries of three large, commercial companies: BlueCava, Iovation and ThreatMetrix

We used Ghostery [9], a browser-extension which lists known third-party tracking libraries on websites, to obtain the list of domains which the three code providers use to serve their ﬁngerprinting scripts

we crawled popular Internet websites, in search for code inclusions, originating from these ﬁngerprinting-owned domains.

we isolated the ﬁngerprinting code, extracted all individual features,and grouped similar features of each company together

since each company provides ﬁngerprinting services to many websites (through the inclusion of third-party scripts) and needs to obtain user ﬁngerprints from each of these sites.

Through our code analysis, we found two different scenarios of ﬁngerprinting. In the ﬁrst scenario, the ﬁrst party site was not involved in the ﬁngerprinting process.

fingerprinting code was delivered by an advertising syndicator, and the resulting fingerprint was sent back to the fingerprinting company. This was most likely done to combat click-fraud, and it is unclear whether the first-party site is even aware of the fact that its users are being fingerprinted.

In the second scenario, where the first-party website is the one requesting the fingerprint, we saw that two out of the three companies were adding the final fingerprint of the user into the DOM of the hosting page. For instance, www.imvu.com is using BlueCava for device fingerprinting by including remote scripts hosted on BlueCava’s servers. When BlueCava’s scripts combine all features into a single fingerprint, the fingerprint is DES-encrypted (DES keys generated on the fly and then encrypted with a public key), concatenated with the encrypted keys and finally converted to Base64 encoding. The resulting string is added into the DOM of www.imvu.com; more precisely, as a new hidden input element in IMVU’s login form. In this way, when the user submits her username and password, the fingerprint is also sent to IMVU’s web servers. Note, however, that IMVU cannot decrypt the fingerprint and must thus submit it back to BlueCava, which will then reply with a “trustworthiness” score and other device information. This architecture allows BlueCava to hide the implementation details from its clients and to correlate user profiles across its entire client-base. Iovation’s fingerprinting scripts operate in a similar manner.

The including site, i.e., a customer of ThreatMetrix, creates a session identifier that it places into an element with a predefined identifier. ThreatMetrix’s scripts, upon loading, read this session identifier and append it to all requests towards the ThreatMetrix servers. This means that the including site never gets access to a user’s fingerprint, but only to information about the user, obtained by querying ThreatMetrix for specific session identifiers.
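The indirection described above can be sketched as a small protocol: the site only ever handles a session ID it generated itself, while the raw fingerprint travels straight to the provider. All names, identifiers and the verdict format here are invented for illustration:

```python
# The provider keeps the raw fingerprints; the including site never
# sees them.
provider_db = {}  # session_id -> fingerprint (provider side only)

def provider_collect(session_id, fingerprint):
    # Called by the provider's in-page script, which read the session ID
    # from the predefined element and fingerprinted the browser.
    provider_db[session_id] = fingerprint

def site_query(session_id):
    # The site asks the provider about "its" session and receives a
    # verdict, never the fingerprint itself.
    return {"known_device": session_id in provider_db}

provider_collect("sess-001", "fp-9f8e7d")
assert site_query("sess-001") == {"known_device": True}
assert site_query("sess-999") == {"known_device": False}
```

This design keeps the fingerprinting implementation opaque to customers while letting the provider correlate devices across its whole client base.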

Today, companies such as BlueCava [7], ThreatMetrix [23] and iovation [15] routinely ﬁngerprint millions of web users.

Tracking
A fingerprint with high enough entropy makes a user unique among others. Companies use fingerprints to track users and learn their interests; the main purpose is to provide targeted advertising.

Fingerprints aren't just used to track users across websites; they are also used to regenerate deleted cookies, or to relink old ones.
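Cookie "respawning" can be sketched in a few lines: when the cookie is absent but the fingerprint is already known, the server simply re-issues the old identifier. Everything below (names, IDs) is hypothetical:

```python
known_devices = {}  # fingerprint -> previously assigned cookie ID

def assign_cookie(fp, cookie=None):
    if cookie is not None:
        known_devices[fp] = cookie   # remember which ID this device had
        return cookie
    # Cookie deleted or missing: "respawn" it from the fingerprint.
    return known_devices.get(fp, "new-id-" + fp[:6])

first = assign_cookie("abc123def", "uid-42")  # normal visit, cookie present
respawned = assign_cookie("abc123def")        # user deleted the cookie
assert respawned == "uid-42"                  # ...but gets the same ID back
```

Deleting cookies is therefore not enough: as long as the fingerprint stays stable, the old identifier can be restored at the next visit.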

Malicious intentions
(About less popular sites that use fingerprinting) The top two categories are also the ones that were the least expected. 163 websites were identified as malicious, such as using exploits for vulnerable browsers, conducting phishing attacks or extracting private data from users, whereas 1,063 sites were categorized as “Spam” by the two categorizing engines (on 3,804 sites)

We were however able to locate many “quiz/survey” sites that are, at the time of this writing, including ﬁngerprinting code from one of the three studied companies. Visitors of these sites are greeted with a “Congratulations” message, which informs them that they have won and asks them to proceed to receive their prize. At some later step, these sites extract a user’s personal details and try to subscribe the user to expensive mobile services.

Identifying device vulnerabilities
Malware propagation via browsers is done through browser exploit kits. An exploit kit is a piece of server-side software that fingerprints client browsers in order to deliver malware. Browser exploit kits will implement more and more advanced browser fingerprinting mechanisms.

Drive-by downloads and web attacks in general use ﬁngerprinting to understand if the browser that they are executing on is vulnerable to one of the multiple available exploits

the attackers can decide, at the server-side, which exploit to reveal to the client, exposing as little as they can of their attack capabilities

Bot and fraud prevention
Defense Using Client Side Honeypots

By knowing browser ﬁngerprints summarizing high interaction ﬁngerprinting challenges, low interaction client side honeypots are much easier to build and maintain compared to high interaction honey-clients.

Detection of XSS proxification with all kinds of techniques based on TCP network shape, HTTP headers (incl. user-agent) and IP addresses is futile, since the infected browser itself makes the request. However, browser fingerprinting can be used to detect XSS proxification, since the browser engine of the attacker is likely to be different from the infected engine.

Detecting disguised crawlers is especially important to ban clients that eat up resources, up to all kinds of denial-of-service. We think that techniques based on browser fingerprinting may be used to detect whether a client is a bot or not.

fingerprinting code was delivered by an advertising syndicator, and the resulting fingerprint was sent back to the fingerprinting company. This was most likely done to combat click-fraud

while for dating sites to ensure that attackers do not create multiple proﬁles for social-engineering purposes.

(About less popular websites that use fingerprinting) Eight out of the ten categories include sites which operate with user subscriptions, many of which contain personal and possibly financial information. These sites are usually interested in identifying fraudulent activities and the hijacking of user accounts

It is sometimes argued that ﬁngerprints can be used for fraud prevention. We refer the interested reader to some of the literature from the ﬁngerprinting companies themselves [15,22,23] for further details. We should note that it is not obvious that collected ﬁngerprints cannot be also sold to third parties or abused for tracking purposes by the companies that collect them.

Augmented authentication
For pornographic sites, a reasonable explanation is that fingerprinting is used to detect shared or stolen credentials of paying members.

Eight out of the ten categories include sites which operate with user subscriptions, many of which contain personal and possibly financial information. These sites are usually interested in identifying fraudulent activities and the hijacking of user accounts

We identify and classify 29 available device ﬁngerprinting mechanisms, primarily browser-based and known, but including several network-based methods and others not in the literature;

We again emphasize that the ﬁngerprinting mechanisms discussed herein require no new user interaction and thus impose no additional usability burdens on users; given increasing attention to usability, this strongly motivates the use of device ﬁngerprinting to augment user authentication.

Extensions
Countermeasure tools against fingerprinting already exist. For example, FireGloves [2] and NoScript [3], which are add-ons for Mozilla Firefox, and the Tor Browser Bundle [18], which allows anonymous communication through a web browser, are widely used countermeasure tools. In addition, Ghostery [19], developed by Ghostery Enterprise, and Chameleon [20] are employed as countermeasure extensions for Google Chrome.

An interesting side note is that these unique features can be used to expose the real version of the Mozilla Firefox browser, even when the user is using the Torbutton extension.

we were interested in studying the completeness and robustness of extensions that attempt to hide the true nature of a browser from an inspecting website

we focused on extensions that advertised themselves as capable of spooﬁng a browser’s user agent

The extensions were discovered by visiting each market, searching for “user-agent” and then downloading all the relevant extensions with a sufﬁciently large user base and an above-average rating.

Our testing consisted of listing the navigator and screen objects through JavaScript and inspecting the HTTP headers sent with browser requests, while the extensions were actively spooﬁng the identity of the browser.

in all cases, the extensions were inadequately hiding the real identity of the browser, which could still be straightforwardly exposed through JavaScript

ﬁngerprinting libraries [...] can discover the discrepancies between the values reported by the extensions and the values reported by the browser, and then use these differences as extra features of their ﬁngerprints

discrepancies of each speciﬁc extension can be modeled and thus, as with Adblock Plus, used to uncover the presence of speciﬁc extensions, through their side-effects.
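The discrepancy idea can be sketched by cross-checking a (possibly spoofed) User-Agent against JavaScript-visible properties that spoofing extensions typically forget to adjust. The specific checks and values below are illustrative, not taken from any of the studied libraries:

```python
# Each mismatch found here is itself usable as an extra fingerprint
# feature, as the quoted studies observe.
def spoof_discrepancies(ua_header, navigator):
    issues = []
    if "Windows" in ua_header and "Win" not in navigator.get("platform", ""):
        issues.append("platform does not match User-Agent OS")
    if "Firefox" in ua_header and "oscpu" not in navigator:
        issues.append("navigator.oscpu missing (Firefox-only property)")
    return issues

# A browser on Linux pretending to be Firefox on Windows:
navigator = {"platform": "Linux x86_64", "vendor": "Google Inc."}
found = spoof_discrepancies("Mozilla/5.0 (Windows NT 10.0) Firefox/45.0",
                            navigator)
assert len(found) == 2
```

Because the extension only rewrites some surfaces (here, the header) and not others (the navigator object), the lie is both detectable and distinctive.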

We characterize the extension-problem as an iatrogenic one

users who install these extensions in an effort to hide themselves in a crowd of popular browsers, install software that actually makes them more visible and more distinguishable from the rest of the users, who are using their browsers without modiﬁcations

Our ﬁndings come in direct antithesis with the advice given by Yen et al. [18], who suggest that user-agent-spooﬁng extensions can be used, as a way of making tracking harder. (Host Fingerprinting and Tracking on the Web: Privacy and Security Implications)

To this end, we also analyzed eleven popular user-agent spoofing extensions and showed that, even without our newly proposed fingerprinting techniques, all of them fall short of properly hiding a browser’s identity.

The effectiveness of all tracker-blocking methods discussed so far depends on their underlying blocking ruleset. Rulesets can be divided into three categories: community-driven, centralized, and algorithmic

The most popular community-driven rulesets for blocking ads and trackers originate from the development of the AdBlock Plus browser extension. At the time of writing, the main AdBlock Plus ruleset (EasyList) consists of over 17,000 URI patterns and more than 25,000 CSS tags to be blocked.

Ghostery, Disconnect and Blur rely on a centralized approach to create blocking rules. This means that the companies behind these three tracker-blocking tools maintain and curate blocking rules

The third category are algorithmic [...] These blocking tools do not rely on regularly updated blacklists, but instead use heuristics to automatically detect third-party trackers. The most popular example for the use of algorithmic rulesets is EFF’s Privacy Badger.

We evaluate the effectiveness of the most popular rule-based advertisement and tracker blocking browser extensions. Specifically, we use the following browser extensions:
• AdBlock Plus 2.7.3 (default settings)
• Disconnect 3.15.3 (default settings)
• Ghostery 6.2.0 (blocking activated)
• EFF Privacy Badger 0.2.6 (trained with Alexa Top 1,000)
• uBlock Origin 1.7.0 (default settings)

In order to analyze the browser extensions, we developed a distributed modular web crawler framework called CRAWLIUM

The sample of our evaluation is seeded from the global Alexa Top Sites.

In addition to traditional, stateful third-party tracking, our large-scale evaluation accounts for tracking based on fingerprinting. Our analysis is based on the findings provided by Acar et al. in FPDetective [8] and Englehardt et al. based on OpenWPM [9]. Acar et al. provide several regular expressions [39] to detect fingerprinters based on their URIs, while Englehardt et al. provide specific URIs to identify fingerprinters [...] to detect if a page includes a fingerprinting script based on the collected results.

Our ﬁltering process ﬁnally resulted in a total set of 123,876 websites which were successfully analyzed with all browser extensions. These websites are uniformly spread in the Alexa top 200K ranks

To quantify each extension’s ability to block fingerprinting, we leveraged the previously detected fingerprinters found by Acar et al. [8] as well as the newly identified fingerprinters by Englehardt et al. [9]. Specifically, we utilized the regular expressions provided by the authors of FPDetective on GitHub [39] and the URIs provided in Englehardt’s paper.
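A simplified sketch of this regex-based detection; the patterns and URLs below are invented placeholders, not the actual published FPDetective rules:

```python
import re

# Match known fingerprinter script URLs against a small pattern list.
FINGERPRINTER_PATTERNS = [
    re.compile(r"fingerprint", re.IGNORECASE),
    re.compile(r"//fp\.example-tracker\.com/"),
]

def is_fingerprinter(script_url):
    return any(p.search(script_url) for p in FINGERPRINTER_PATTERNS)

assert is_fingerprinter("https://cdn.example.com/fingerprint2.min.js")
assert is_fingerprinter("https://fp.example-tracker.com/collect.js")
assert not is_fingerprinter("https://cdn.example.com/jquery.min.js")
```

A blacklist of this kind only flags providers someone has already catalogued, which is why unlisted fingerprinting services can go unblocked for years.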

A number of the fingerprinting services we detected were not blocked by any of our evaluated web browser extensions, such as MERCADOLIBRE, SiteBlackBox, or CDN.net.

Even though some of these services were identiﬁed by both studies as providers of ﬁngerprinting scripts, it is unfortunate to see that they are not completely blocked by all browser extensions. For example, CDN.net was identiﬁed by FPDetective (i.e., three years ago) and again by Englehardt et al., and yet none of the extensions includes it in its rule set

We also noticed that for some instances we observed more invocations of ﬁngerprinting scripts with activated browser extensions compared to our vanilla (plain) browser instance.

our analysis of popular Android applications showed that ThreatMetrix was included in 149 applications, i.e., 1.64% of our sample. We found that only the extensive DNS-based block list “Mother of all ADBLOCKING” [44] effectively blocked this ﬁngerprinting service.

all evaluated browser extensions failed to completely block well-known stateless ﬁngerprinting services

Our results showed that stateless tracking constitutes a serious blindspot of today’s tracker-blocking tools.

All modern browsers have extensions that disallow Flash and Silverlight to be loaded until explicitly requested by the user (e.g., through a click on the object itself).

By wrapping their fingerprinting code into an object of the first-party site and making that object desirable or necessary for the page’s functionality, the fingerprinting companies can still execute their code

In the long run, the best solution against ﬁngerprinting through Flash should come directly from Flash.

Browser-based protections
We performed the tests using a specially established website, https://fingerprintable.org. This website does not retain any data recovered from visiting browsers, but simply displays the information that it is able to collect from the currently employed browser.

we use a selection of known ﬁngerprinting approaches to compare the ﬁngerprintability of widely used web browsers on both desktop and mobile platforms [...] we made parallel studies for these two platform types

we chose to examine browsers running on Windows 10 and Mac OS X 10.12 (Sierra) for desktop platforms, and Android 7.0 (Nougat), iOS 10.2.1 and Windows 10 Mobile for mobile devices

The desktop browsers we examined were Chrome, Internet Explorer, Firefox, Edge and Safari. The mobile browsers used in our tests were Chrome, Safari, Opera Mini, Firefox and Edge

in our tests we used clean installations of browsers so that they did not include any add-ons or plugins other than those installed and enabled by default [...] we chose to leave them on the basis that many users will not change the browser default settings

The mobile browsers require various permissions to be set as part of their installation [...] For testing purposes, we did not grant any permissions other than those needed for browser installation

To test the ﬁngerprintability of the selected browsers, a web page containing JavaScript was constructed, intended to be served by our experimental website (https://ﬁngerprintable.org). Whenever the website is visited by a client browser, e.g. one of those being tested, the scripts in the web page interrogate the browser to learn the values of a set of identifying attributes [...] As mentioned elsewhere, this site is publicly available, and is open for general use

Attribute Processing. Each browser was tested for the retrievability of discriminating information for each of the six ﬁngerprinting attributes

Desktop browser fingerprintability:

Attribute                 Chrome  Internet Explorer  Firefox  Edge    Safari
Fonts                     -       high               -        high    -
Device ID                 high    -                  medium   low     -
Canvas                    low     low                medium   low     low
WebGL Renderer            low     -                  low      low     -
Local IP                  low     -                  low      medium  -
Total attributes          4       2                  4        5       1
Fingerprintability Index  6       4                  6        8       1

Mobile browser fingerprintability:

Attribute                 Chrome  Safari  Opera Mini  Firefox  Edge
User Agent                medium  -       medium      -        medium
Device ID                 high    -       medium      medium   low
Canvas                    low     low     medium      medium   low
WebGL Renderer            low     low     low         -        low
Local IP                  medium  -       medium      medium   -
Total attributes          5       2       5           3        4
Fingerprintability Index  9       2       9           6        5

Some mobile browsers seem to unnecessarily give out the speciﬁc phone model

At the time we performed our experiments, Safari would appear to be the best choice in this respect on both mobile and desktop platforms. Despite Chrome being the most widely used browser, it proved to be one of the most ﬁngerprintable

In PriVaricator we use the power of randomization to “break” linkability by exploring a space of parameterized randomization policies

PriVaricator modiﬁes the browser to make every visit appear diﬀerent to a ﬁngerprinting site, resulting in a diﬀerent ﬁngerprint that cannot be easily linked to a ﬁngerprint from another visit, thus frustrating tracking attempts

The basis of our approach is to change the way the browser represents certain important properties, such as offsetHeight (used to measure the presence of fonts) and plugins, to the JavaScript environment

In summary, a randomization policy should 1) produce unlinkable ﬁngerprints and 2) not break existing sites.

In this paper we concentrate our attention on randomizing plugins and fonts, as these dominate in the current generation of ﬁngerprinters (Table 1). We, however, consider the approach presented here to be fully extendable to other ﬁngerprinting vectors if that becomes necessary.

We have implemented PriVaricator on top of the Chromium web browser

Existing private modes help prevent stateful tracking via cookies; PriVaricator focuses on preventing stateless tracking. We believe that it is better to integrate PriVaricator into the browser itself as opposed to providing it via an extension

For the values of offsetHeight, offsetWidth, and getBoundingClientRect in PriVaricator, we propose the following randomization policies: a) Zero; b) Random(0..100); and c) ± 5% Noise.

For the randomization of plugins, we define a probability P(plug hide) as the probability of hiding each individual entry in the plugin list of a browser, whenever the navigator.plugins list is populated. As an example, a configuration of Rand Policy = Zero, θ = 50, P(lie) = 20%, P(plug hide) = 30% instructs PriVaricator to start lying after 50 offset accesses, to only lie in 20% of the cases, to respond with the value 0 when lying, and to hide approximately 30% of the browser’s plugins.
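These policies can be sketched in ordinary code. The policy names, θ and the two probabilities follow the paper's description; the implementation below is our own illustration, not PriVaricator's actual C++:

```python
import random

class OffsetRandomizer:
    """Lie about offset reads (used for font probing) per a policy."""
    def __init__(self, policy, theta=50, p_lie=0.2):
        self.policy, self.theta, self.p_lie = policy, theta, p_lie
        self.accesses = 0

    def read(self, true_value):
        self.accesses += 1
        # Answer truthfully for the first theta accesses, then lie
        # with probability P(lie).
        if self.accesses <= self.theta or random.random() >= self.p_lie:
            return true_value
        if self.policy == "zero":
            return 0
        if self.policy == "random":
            return random.randint(0, 100)
        return round(true_value * random.uniform(0.95, 1.05))  # +/-5% noise

def hide_plugins(plugins, p_hide=0.3):
    # Hide each plugin entry independently with probability P(plug hide).
    return [p for p in plugins if random.random() >= p_hide]

r = OffsetRandomizer("zero", theta=2, p_lie=1.0)
assert [r.read(120) for _ in range(4)] == [120, 120, 0, 0]
```

Waiting θ accesses before lying is what keeps ordinary pages (which measure a few offsets for layout) unaffected while font-probing fingerprinters, which measure hundreds, get randomized answers.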

For the reasons of better compatibility and transparency, we ultimately chose to implement our randomization policies within the browser, by changing the appropriate C++ code in the classes responsible for creating the navigator object, and the ones measuring the dimensions of elements.

For this evaluation, we measured how PriVaricator stands against BlueCava, Coinbase, PetPortal, and ﬁngerprintjs, as explained below.

In all four cases, the individual ﬁngerprinting providers gave us a way of assessing the eﬃcacy of PriVaricator, simply by visiting each provider multiple times using diﬀerent randomization settings, and recording the ﬁngerprint provided by each oracle. To explore the space of possible policies in detail, we performed an automated experiment where we visited each ﬁngerprinting provider 1,331 times, to account for 113 parameter combinations

Even though we propose multiple lying policies about offsets, in this section, we only show the effect of PriVaricator’s ± 5% Noise policy.

96.32% of all ﬁngerprints being unique. This shows how fragile BlueCava’s identiﬁcation is against our randomization policies

(about fingerprintingjs) In nearly all intermediate points (78.36% of the total set of collected ﬁngerprints), randomness works in our favor by returning diﬀerent sets of plugins, which, in turn, result in diﬀerent ﬁngerprints.

In contrast with the other three services, we were able to get unique fingerprints in “only” 37.83% of the 1,331 parameter combinations

Overall, our experiments showed that, while the speciﬁc choices of each ﬁngerprinter aﬀect the uniqueness of our ﬁngerprints, PriVaricator was able to deceive all of them for a large fraction of the tested combination settings

Overall, the results of our breakage experiments show that the negative eﬀect that PriVaricator has on a user’s browsing experience is negligible

Note that we do not claim to solve the entire problem of web-based device ﬁngerprinting with PriVaricator. The focus of our work is on explicit attempts to ﬁngerprint users via capturing the details of the browser environment. We do not attempt to provide protection against sophisticated side-channels such as browser performance [17] which may be used as part of ﬁngerprinting.

We use careful randomization as a way to make subsequent visits to the same ﬁngerprinter diﬃcult to link together

While our implementation has focused on randomizing font- and plugin-related properties, we demonstrate how our approach can be made general with pluggable randomization policies.

Proposed ideas
Finally, we sketched two possible countermeasures based on the ideas of encapsulation and namespace pollution that aim to either hide the presence of extensions or confuse trackers about which extensions are really installed in a user’s browser

The idea of enhancing only the appearance of web pages is close to the concept of Shadow DOM, which gives web developers the ability to encapsulate presentational widgets from other JavaScript and CSS on the page

In the long run, the best solution against ﬁngerprinting through Flash should come directly from Flash.

To unify the behavior of JavaScript under different browsers, all vendors would need to agree not only on a single set of API calls to expose to the web applications, but also to internal implementation speciﬁcs

Also, based on the fact that the vendors battle for best performance of their JavaScript engines, they might be reluctant to follow speciﬁc design choices that might affect performance.

There are three different architectures to detect drive-by downloads: low-interaction honeypots, high-interaction honeypots and honeyclients.

Given the complexity of fully hiding the true nature of a browser, we believe that this can be efﬁciently done only by the browser vendors

At the same time, it is currently unclear whether browser vendors would desire to hide the nature of their browsers, thus the discussion of web-based device ﬁngerprinting, its implications and possible countermeasures against it, must start at a policy-making level in the same way that stateful user-tracking is currently discussed.

Ideally, novel research into detecting stateless ﬁngerprinters would automatically create blocking rules (since for some of the identiﬁed ﬁngerprinters even after three years no ﬁlter rules exist).

The analysis of Web standards, APIs and their implementations can reveal unexpected Web privacy problems by studying the information exposed to Web pages. The complex and sizable nature of the new Web APIs and their deeper integration with devices make it hard to defend against such threats. Privacy researchers and engineers can help address the risks imposed by these APIs by analysing the standards and their implementations for their effect on Web privacy and tracking. This may not only provide actionable feedback to API designers and browser manufacturers, but can also improve the transparency around these new technologies.

The current diversity in the contents of the user-agent ﬁeld results from a very long history of the ‘browser wars’, but could be standardized today.

This means that if plugins disappear and if user-agents become generic, only one ﬁngerprint out of two would be uniquely identiﬁable using our collected attributes, which is a very signiﬁcant improvement to privacy over the current state of browser ﬁngerprinting

reduced APIs that still provide rich features

For example, we could envision a whitelist of fonts that are authorized to be disclosed by the browser, as suggested by Fifield and Egelman [20]. Such a list would contain the default fonts provided by an operating system. This whitelist of fonts would also include a default encoding for emojis that is common to all versions of the operating system, or even common to all platforms.

Having generic HTTP headers and removing browser plugins could reduce ﬁngerprint uniqueness in desktops by a strong 36%.

Techniques
Browser fingerprinting uses browser-dependent features such as Flash or Java to retrieve information on installed fonts, plugins, the browser platform and screen resolution.

Plugins and fonts are the most identifying metrics, followed by User Agent, HTTP Accept, and screen resolution.

It is unable to distinguish between instances of identically configured devices.

The fingerprint is unstable, meaning that the fingerprint can change quite easily. The instability can be caused by upgrades to the browser or a plug-in, installing a new font, or simply the addition of an external monitor which would alter the screen resolution.

Many events can cause a browser fingerprint to change. In the case of the algorithm deployed, those events include upgrades to the browser, upgrading a plugin,disabling cookies, installing a new font or an external application which includes fonts, or connecting an external monitor which alters the screen resolution.

Also, fingerprinting methods are known to tailor their approach to the specific parameters of the targeted browser, once they recognize its type by means of a range of techniques that may also include analysis of browser-specific features.

Overall, one can see how various implementation choices, either major ones, such as the traversal algorithms for JavaScript objects and the development of new features, or minor ones, such as the presence or absence of a newline character, can reveal the true nature of a browser and its JavaScript engine.

Graphics rendering
We implemented one possible fingerprinting algorithm, and collected these fingerprints from a large sample of browsers that visited our test site, panopticlick.eff.org. We observe that the distribution of our fingerprint contains at least 18.1 bits of entropy, meaning that if we pick a browser at random, at best we expect that only one in 286,777 other browsers will share its fingerprint. Among browsers that support Flash or Java, the situation is worse, with the average browser carrying at least 18.8 bits of identifying information. 94.2% of browsers with Flash or Java were unique in our sample.
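The "one in 286,777" figure is simply the surprisal of the fingerprint expressed in bits; a quick sketch of the conversion (not from the paper):

```javascript
// Convert an anonymity-set figure into bits of identifying information:
// a fingerprint shared by 1 in N browsers carries log2(N) bits of surprisal.
function surprisalBits(oneInN) {
  return Math.log2(oneInN);
}

console.log(surprisalBits(286777).toFixed(1)); // ≈ 18.1 bits, matching the paper
```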

we analyzed the ﬁngerprinting libraries of three large, commercial companies: BlueCava, Iovation and ThreatMetrix

all companies use Flash, in addition to JavaScript, to ﬁngerprint a user’s environment

CSS
We identify three CSS-based methods of browser fingerprinting: CSS properties, CSS selectors and CSS filters. Differences in the layout engine allow us to identify a given browser by the CSS properties it supports. When properties are not yet at "Recommendation" or "Candidate Recommendation" status, browsers prepend a vendor-specific prefix indicating that the property is supported for this browser type only. Once a property moves to Recommendation status, prefixes are dropped by browser vendors and only the property name remains.

Selectors are a way of selecting specific elements in an HTML tree. For example, CSS3 introduced new selectors for old properties, and they too are not yet uniformly implemented and can be used for browser fingerprinting.

Instead of conducting image comparison (as used recently by Mowery et al. to fingerprint browsers based on WebGL rendering), our implementation uses JavaScript to test for CSS properties in style objects: in the DOM, each element can have a style child object that contains properties for each possible CSS property and its value (if defined).
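The style-object test can be sketched as follows. In a real page the probe target would be document.createElement('div').style; here a plain mock object stands in so the logic runs outside a browser, and the property names are illustrative:

```javascript
// Probe which CSS properties a layout engine exposes on element.style.
// A supported property exists as a key on the style object even when unset.
function supportedProperties(style, candidates) {
  return candidates.filter((prop) => prop in style);
}

// Mock style object standing in for document.createElement('div').style;
// a WebKit-era engine might expose the prefixed mask-image property.
const mockStyle = { color: '', webkitMaskImage: '' };
console.log(supportedProperties(mockStyle, ['color', 'webkitMaskImage', 'mozAppearance']));
// → ['color', 'webkitMaskImage']
```

The resulting list of supported properties differs per layout engine and version, which is what makes it a fingerprintable signal.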

A further method of distinguishing browsers by their behavior is based on CSS filters. CSS filters modify the rendering of, e.g., a basic DOM element, image, or video by exploiting bugs or quirks in CSS handling for specific browsers, which again is very suitable for browser fingerprinting.

In the proposed method, we use CSS properties that request content from a Web server.

p { background-image: url("database.php?property=background-image"); }

“database.php” saves the query string of the request sent from the Web browser to the database and responds with the file corresponding to the request.

If the rendering engine interprets the code shown in Fig. 2, a request for “database.php” is sent to the Web server from the Web browser

div#mask-image { /* test1 */ -webkit-mask-image: url("database.php?property=maskimage"); }
div#border-image { /* test2 */ border-image: url("database.php?property=borderimage") fill 10 / 10% / 10px space; }

Fig. 3 shows an example of the code for Web browser estimation using only CSS properties. Seven test cases, from “test1” to “test7,” are included.

If a rendering engine interprets the code shown in Fig. 3, only properties supported by that rendering engine are interpreted and requests for “database.php” are sent to the Web server. “database.php” stores each property name applied by the Web browser to the database. From the stored information, the determination of the Web browser’s implementation status and the estimation of the Web browser family and its version are possible.

Detecting Screen Information of a Device

Media characteristics and the information they expose:
- device-height: screen size
- width: width of the window
- height: height of the window
- orientation: orientation of the screen (landscape or portrait)
- device-pixel-ratio: device pixel ratio

Detecting Information by Mozilla’s Media Queries Media queries listed in this section are proprietary implementations of Mozilla

Media characteristics and the information they expose:
- -moz-touch-enabled: whether the device responds to a touch screen
- -moz-windows-compositor: whether DWM is in use
- -moz-windows-default-theme: whether a default Windows theme such as Luna or Aero is in use
- -moz-windows-classic: whether classic mode is in use
- -moz-mac-graphite-theme: whether the Graphite theme is in use on Mac OS

In this section, we show the method of font determination in a user’s device using @font-face

When the Web browser interprets the code shown in Fig. 7, if the font specified in “local” is present in the device, that font is applied, and the request to the Web server is not transmitted. If the specified font in “local” does not exist in the device, “url” is interpreted, and the request for “database.php” is sent to the Web server.

The Web server can determine the fonts that exist in the user's device.

@font-face { font-family: 'font1'; src: local('Arial'), url("database.php?fontname=Arial"); }
div#font1 { font-family: 'font1'; }

The Web browser family and version listed in Table III were identified in the seven test cases.

In the proposed method, if fonts installed in a user’s device are not specified in @font-face, it is not possible to confirm their existence

Therefore, although existing countermeasure tools for fingerprinting are not effective against fingerprinting by CSS, they can limit the collection of some information.

(NB: meant to be in the CSS section? They use CSS, but it is not the heart of the paper.)

Data analyzed in this paper was gathered in the What The Internet Knows About You project

The experimental system utilized the CSS :visited history detection vector [4] to obtain bitwise answers about the existence of a particular URL in a Web browser’s history store for a set of known URLs.

We expect that our data sample comes from largely self-selected audience skewed towards more technical users, who likely browse the Web more often than casual Internet users.

In this paper we refer to 382,269 users who executed the default “popular sites” test of over 6,000 most common Internet destinations.

we analyze data about visited “primary links” only, without analyzing any detected subresources within a website.

In our dataset, the average number of visited sites per profile is 15, and the median is 10. However, analyzing just the history sizes larger than 4 (223,197 such profiles) yields an average of 18 links (median 13), with 98% of profiles being unique.

on average, across all profiles, 94% of users had unique browsing histories

For the Web history of mobile users, different usage patterns are observed; specifically, the detected history sizes are smaller, which might suggest that Web use on mobile devices is not as frequent or extensive as non-mobile Web browsing.

Thus, testing for as few as 50 well-chosen websites in a user’s browsing history can be enough to establish a ﬁngerprint which is almost as accurate as when 6,000 sites are used

We conclude that the most important sites for ﬁngerprinting are the most popular ones because a considerable number of history proﬁles are still distinct, even in a small slice of 50 bits

we converted each history proﬁle into a category proﬁle. This was performed by replacing each website of a proﬁle by the general category it belongs to by using the Trend Micro Site Safety Center categorization service [17].

We computed a unique set of interests for every Web history proﬁle by discarding repeated occurrences of the same category in proﬁles. This resulted in 164,043 distinct category proﬁles, out of which 88% are unique (i.e. only attributed to a unique user).
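The conversion from a history profile to a category profile amounts to mapping each site through a categorizer and deduplicating. A minimal sketch (the category table here is made up for illustration; the paper used the Trend Micro Site Safety Center service):

```javascript
// Collapse a browsing-history profile into a set of interest categories,
// discarding repeated occurrences of the same category.
function categoryProfile(history, categorize) {
  return [...new Set(history.map(categorize))];
}

// Hypothetical categorizer standing in for the Trend Micro lookup.
const categories = {
  'news.example.com': 'news',
  'blog.example.org': 'news',
  'shop.example.net': 'shopping',
};
console.log(categoryProfile(
  ['news.example.com', 'blog.example.org', 'shop.example.net'],
  (site) => categories[site] || 'unknown'
));
// → ['news', 'shopping']
```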

The conversion from Web history proﬁles to only use each website’s category decreased the overall number of unique proﬁles. However, we observe that even with the coarser-grained metric there is still a large number of distinct proﬁles.

we analyze history contents of repeat visitors to our test site

it suggests that in a considerable number of cases the history remains similar over time, which is especially the case for the first few days after the initial visit

The data analyzed in this paper was gathered by performing CSS :visited history detection, which is now generally fixed in modern browsers, although it will continue to work for older browser installations, which constitute a considerable fraction of Web users.

The results indicate that Web browsing histories, which can be obtained by a variety of known techniques, may be used to divulge personal preferences and interests to Web authors; as such, browsing history data can be considered similar to a biometric ﬁngerprint

An analysis of tracking potential (on the two examples of Google and Facebook, shown in Section 5) brings us to the conclusion that Web service providers are also in a position to re-create users' browsing interests.

Flash/Java
Java is another plugin that can be used by web browsers to display interactive web content such as online games and online chat programs. While it can be used for collecting the system information, it is not a desirable method for fingerprinters.

Fingerprinting methods that can operate without the target user's explicit consent or awareness are preferable to techniques requiring user interaction. In particular, the Flash Player transmits information without asking.

Flash APIs can operate without targeted user's explicit consent or awareness.

These APIs favor fingerprinters who want to collect as much information as possible about the targeted user's system.

The scripting language of Flash, ActionScript, does include methods for discovering the list of installed fonts.

More subtly, browsers with a Flash blocking add-on installed show Flash in the plugins list, but fail to obtain a list of system fonts via Flash, thereby creating a distinctive fingerprint, even though neither measurement (plugins, fonts) explicitly detects the Flash blocker.

Despite the fact that Flash has been criticized for poor performance, lack of stability, and that newer technologies, like HTML5, can potentially deliver what used to be possible only through Flash, it is still available on the vast majority of desktops.

when a user utilizes a dual-monitor setup, Flash reports as the width of the screen the sum of the two individual screens. This value, when combined with the browser's response (which lists the resolution of the monitor where the browser window is located), allows a fingerprinting service to detect the presence of multiple-monitor setups.

none of the three studied ﬁngerprinting companies utilized Java

We consider it likely that the companies abandoned Java due to its low market penetration in browsers

ActionScript, the scripting language of Flash, provides APIs that include methods for discovering the list of fonts installed on a running system [...] it can also be used to ﬁngerprint the system

Two out of the three studied companies were utilizing Flash as a way of discovering which fonts were installed on a user’s computer.

we found evidence that the code was circumventing the user-set proxies at the level of the browser, i.e., the loaded Flash application was contacting a remote host directly, disregarding any browser-set HTTP proxies

if a JavaScript originating request contains the same token as a Flash originating request from a different source IP address, the server can be certain that the user is utilizing an HTTP proxy.

Flash's ability to circumvent HTTP proxies is a somewhat known issue among privacy-conscious users that has led to the disabling of Flash in anonymity-providing applications.

All modern browsers have extensions that disallow Flash and Silverlight to be loaded until explicitly requested by the user (e.g., through a click on the object itself).

By wrapping their fingerprinting code into an object of the first-party site and making that object desirable or necessary for the page's functionality, the fingerprinting companies can still execute their code.

Further, in 2011, Boda et al. identified that the major drawback of the Panopticlick project was its reliance on browser instances: either Java or Adobe Flash (the attributes with the highest entropy) had to be enabled to obtain the list of fonts. To avoid this weakness, Boda et al. proposed a new solution that omitted browser-specific details and instead used JavaScript with some basic system fonts to identify fonts that are browser-independent and installed without the need for Java or Flash, together with system features (operating system, screen resolution) and the first two octets of the IP address.

Javascript
For our ﬁngerprinting method, we compared test results from openly available Javascript conformance tests and collected results from different browsers and browser versions for ﬁngerprint generation. These tests cover the ECMAScript standard in version 5.1 and assess to what extent the browser complies with the standard, what features are supported and speciﬁcally which parts of the standard are implemented incorrectly or not at all. In essence, our initial observation was that the test cases that fail in, e.g., Firefox, are completely different from the test cases that fail in Safari.

Javascript ﬁngerprinting had the correct result for all browsers in the test set

Our novel ﬁngerprinting techniques focus on the special, browser-populated JavaScript objects; more precisely, the navigator and screen objects

we constructed a ﬁngerprinting script that performed a series of “everyday” operations on these two special objects (such as adding a new property to an object, or modifying an existing one) and reported the results to a server.

The enumeration of each object was conducted through code that made use of the prop in obj construct, to avoid forcing a speciﬁc order of enumeration of the objects, allowing the engine to list object properties in the way of its choosing.
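The prop-in-obj enumeration the authors describe can be sketched like this. In a browser, obj would be the navigator or screen object; here a plain object stands in to show the mechanism:

```javascript
// Record the engine-chosen enumeration order of an object's properties.
// Run against navigator or screen in a browser, this order differs between
// browser families, versions, and sometimes operating systems.
function enumerationOrder(obj) {
  const order = [];
  for (const prop in obj) order.push(prop);
  return order.join(',');
}

// Plain object standing in for the browser-populated navigator object.
const fakeNavigator = { userAgent: 'x', platform: 'y', language: 'z' };
console.log(enumerationOrder(fakeNavigator)); // "userAgent,platform,language"
```

The server-side fingerprint is then just this comma-joined string, compared against the orderings known for each browser family.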

By sharing the link to our fingerprinting site with friends and colleagues, we were able, within a week, to gather data from 68 different browser installations of popular browsers on all modern operating systems.

our data is small in comparison to previous studies [11], [12]; we are not using it to draw conclusions that have statistical relevance but rather, as explained in the following sections, to find deviations between browsers and to establish the consistency of these deviations

the order of property-enumeration of special browser objects, like the navigator and screen objects, is consistently different between browser families, versions of each browser, and, in some cases, among deployments of the same version on different operating systems.

This feature, by itself, is sufficient to categorize a browser into its correct family, regardless of any property-spoofing that the browser may be employing.

the different orderings can be leveraged to detect a speciﬁc version of Google Chrome, and, in addition, the operating system on which the browser is running.

Overall, we discovered that the property ordering of special objects, such as the navigator object, is consistent among runs of the same browser and runs of the same version of browsers on different operating systems.

Using the data gathered by our ﬁngerprinting script, we isolated features that were available in only one family of browsers, but not in any other.

All browser families had at least two such features that were not shared by any other browser. In many cases, the names of the new features were starting with a vendor-speciﬁc preﬁx, such as screen.mozBrightness for Mozilla Firefox and navigator.msDoNotTrack for Microsoft Internet Explorer

we investigate whether each browser treats the navigator and screen objects like regular JavaScript objects. More precisely, we investigate whether these objects are mutable, i.e., whether a script can delete a speciﬁc property from them, replace a property with a new one, or delete the whole object.

only Google Chrome allows a script to delete a property from the navigator object.

When our script attempted to modify the value of a property of navigator, Google Chrome and Opera allowed it, while Mozilla Firefox and Internet Explorer ignored the request. In the same way, these two families were the only ones allowing a script to reassign navigator and screen to new objects.

Mozilla Firefox behaved in a unique way when requested to make a certain property of the navigator object non-enumerable.
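The mutability tests above can be mimicked on any object; against the real navigator the outcomes differ per browser family, which is what makes them fingerprintable. A sketch, run here on a plain object since there is no host object outside a browser:

```javascript
// Probe whether an object's property can be deleted and overwritten,
// mirroring the navigator/screen mutability tests.
function mutabilityProbe(obj, prop) {
  const results = {};
  try { delete obj[prop]; results.deletable = !(prop in obj); }
  catch (e) { results.deletable = false; }
  try { obj[prop] = 'spoofed'; results.writable = obj[prop] === 'spoofed'; }
  catch (e) { results.writable = false; }
  return results;
}

// On a plain object both probes succeed; a locked-down host object
// (e.g. navigator in some browsers) would silently ignore the requests.
console.log(mutabilityProbe({ userAgent: 'x' }, 'userAgent'));
// → { deletable: true, writable: true }
```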

we examine if we can determine a browser’s version based on the new functionality that it introduces. We chose Google Chrome as our testing browser and created a library in JavaScript that tests if specific functionality is implemented by the browser.

we chose 187 features to test in 202 different versions of Google Chrome, spanning from version 1.0.154.59 up to 22.0.1229.8, which we downloaded from oldapps.com and which covered all 22 major versions of Chrome.

we found 71 sets of features that can be used to identify a speciﬁc version of Google Chrome

The results show that we can not only identify the major version, but in most cases, we have several different feature sets on the same major version. This makes the identiﬁcation of the exact browser version even more ﬁne-grained.
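Feature-based version detection boils down to running a battery of feature predicates and matching the resulting set against known per-version sets. A sketch with made-up feature names (real probes would test e.g. typeof window.WebSocket):

```javascript
// Match the set of supported features against known per-version sets.
// featureTests maps a feature name to a predicate; knownSets maps a
// version label to the feature set it exposes (names are illustrative).
function detectVersion(featureTests, knownSets) {
  const supported = Object.keys(featureTests)
    .filter((f) => featureTests[f]())
    .sort()
    .join('|');
  for (const [version, features] of Object.entries(knownSets)) {
    if ([...features].sort().join('|') === supported) return version;
  }
  return 'unknown';
}

// Hypothetical probes and version sets for illustration only.
const probes = { featureA: () => true, featureB: () => true, featureC: () => false };
const sets = { 'v21': ['featureA'], 'v22': ['featureA', 'featureB'] };
console.log(detectVersion(probes, sets)); // "v22"
```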

Our enumeration of object-properties indirectly uses the method toString for the examined objects. By comparing the formatted output of some speciﬁc properties and methods, we noticed that different browsers treated them in slightly different ways. For instance, when calling toString on the natively implemented navigator.javaEnabled method, browsers simply state that it is a “native function.” Although all the examined browser families print “function javaEnabled { [native code] },” Firefox uses newline characters after the opening curly-bracket and before the closing one

Canvas
It works by rendering text and WebGL scenes onto an area of the screen using the HTML5 <canvas> element programmatically, and then reading the pixel data back to generate a fingerprint.

The toDataURL(type) method is called on the canvas object, and a Base64 encoding of a PNG image containing the contents of the canvas is obtained.

A hash of the Base64-encoded pixel data is created so that the entire image does not need to be uploaded to a website. The hash is also used as the fingerprint.

The results also showed that at least the operating system, browser version, graphics card, installed fonts, sub-pixel hinting, and anti-aliasing all play a part in the final fingerprint.

A 2014 study conducted by Acar et al. showed that canvas fingerprinting is the most common form of fingerprinting.

AddThis scripts perform the following tests:
• Drawing the text twice with different colors and the default fallback font by using a fake font name, starting with “no-real-font-”.
• Using the perfect pangram “Cwm fjordbank glyphs vext quiz” as the text string.
• Checking support for drawing Unicode by printing the character U+1F603, a smiling face with an open mouth.
• Checking for canvas globalCompositeOperation support.
• Drawing two rectangles and checking if a specific point is in the path by the isPointInPath method.
By requesting a non-existent font, the first test tries to employ the browser’s default fallback font. This may be used to distinguish between different browsers and operating systems.

Another interesting canvas fingerprinting sample was the script served from the admicro.vcmedia.vn domain. By inspecting the source code, we found that the script checks the existence of 1126 fonts using JavaScript font probing.

Fortunately for us, web fonts can be used when writing to a <canvas> as well.

We collected samples from 300 distinct members of the Mechanical Turk marketplace, paying each a small sum to report their graphics card and graphics driver version. Meanwhile, our five fingerprinting tests ran in the background.

(Arial Font Rendering Tests) Given these results, we conclude that rendering a simple pangram in Arial on a <canvas> is enough to leak the user’s operating system family and (almost always) browser family.

(Arial Font Rendering Tests) During our experiments, we observed that at least the operating system, browser version, graphics card, installed fonts, subpixel hinting, and antialiasing all play a part in generating the final user-visible bitmap.

WebGL
WebGL provides a JavaScript API for rendering 3D graphics in a <canvas> element.

The WebGL test creates a single surface, comprised of 200 polygons. It applies a single black-and-white texture to this surface, and uses simple ambient and directional lights. We also enable antialiasing.

The 270 remaining images appear identical. When examined at the level of individual pixels, however, we discovered 50 distinct renders of the scene.

This suggests that these graphics cards are performing antialiasing slightly differently, or perhaps simply linearly interpolating textures in almost imperceptibly different ways.

Some browsers provide access to the identity of the vendor and the specific model of the user platform’s Graphics Processing Unit (GPU). These two pieces of information are obtained by requesting the following WebGL attributes: UNMASKED_VENDOR_WEBGL and UNMASKED_RENDERER_WEBGL. These attributes could reveal the Central Processing Unit (CPU) type if there is no GPU or if the GPU is not used by the browser. We found that UNMASKED_VENDOR_WEBGL either states the browser vendor or the CPU/GPU vendor. In both cases it does not provide any useful information that cannot be readily found from UNMASKED_RENDERER_WEBGL (i.e. identifying a vendor is trivial once the full CPU/GPU model details are known).

Fonts
The font list is likely to be the most accurate test, i.e., the one which provides the highest amount of information.

The presence of a specific font on the system where the browser is running can be checked with JavaScript by surreptitiously measuring and then comparing the dimensions of text rendered with different fonts.
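JavaScript font probing can be sketched as follows. The measureWidth callback stands in for measuring a hidden DOM span (e.g. its offsetWidth), since there is no layout engine outside a browser; the width table is hypothetical:

```javascript
// Detect an installed font by rendering the same string with
// 'candidate, fallback' and with the fallback alone: if the widths
// differ, the candidate font was used, hence it is installed.
function isFontInstalled(candidate, fallback, measureWidth) {
  const withCandidate = measureWidth(`${candidate},${fallback}`);
  const fallbackOnly = measureWidth(fallback);
  return withCandidate !== fallbackOnly;
}

// Hypothetical width table standing in for offsetWidth measurements.
const widths = { 'Arial,monospace': 412, 'monospace': 460, 'NoSuchFont,monospace': 460 };
const measure = (stack) => widths[stack];
console.log(isFontInstalled('Arial', 'monospace', measure));      // true
console.log(isFontInstalled('NoSuchFont', 'monospace', measure)); // false
```

Repeating this probe over a large candidate list (the admicro script checked 1126 fonts) yields the installed-font fingerprint.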

In this work, we examine another facet of font-based device fingerprinting, the measurement of individual glyphs. Figure 1 shows how the same character in the same style may be rendered with different bounding boxes in different browsers. The same effect can serve to distinguish between instances of even the same browser on the same OS, when there are differences in configuration that affect font rendering—and we find that such differences are surprisingly common. By rendering glyphs at a large size, we magnify even small differences so they become detectable.

At the most basic level, font metrics can tell when there is no installed font with a glyph for a particular code point, by comparing its dimensions to those of a placeholder “glyph not found” glyph. But even further, font metrics can distinguish different fonts, different versions of the same font, different default font sizes, and different rendering settings such as those that govern hinting and antialiasing. Even the “glyph not found” glyph differs across configurations.

Font metric–based fingerprinting is weaker than some other known fingerprinting techniques.

However, it is relevant because it is as yet effective against Tor Browser, a browser whose threat model includes tracking by fingerprinting.

We performed an experiment with more than 1,000 web users that tested the effectiveness of font fingerprinting across more than 125,000 code points of Unicode. 34 % of users were uniquely identified; the others were in various anonymity sets of size up to 61. We found that the same fingerprinting power, with this user population, can be achieved by testing only 43 code points.

Fonts were rendered very large, with CSS style font-size: 10000%, in order to better distinguish small differences in dimensions.

Benchmarking
Intel Turbo Boost Technology is a function that improves the performance of the CPU by increasing the operating frequency

For estimating the existence of AES-NI in a target CPU, we measure the difference in operation speed between a device with AES-NI and a device without it, applying the Web Cryptography API.

Because the processing performance will be different in each CPU regardless of the existence of AES-NI, we cannot simply compare the results.

If the target CPU does not have AES-NI, or it is disabled, the calculation speed of the reference arithmetic operation is comparable with that of the AES cryptographic operation. On the contrary, if the target CPU has AES-NI enabled, the AES processing time should be faster than the reference operation built from non-cryptographic operations.

AESrate = (time of AES operation) / (time of Monte Carlo operation, the reference arithmetic operation)

Therefore, we examined the differences in the processing performance of Turbo Boost using the JavaScript software benchmark Octane 2.0

Thus, we evaluated 341 samples in our experiment.

Therefore, using the value of useragent, we divided the samples into four browser categories: Chrome, Firefox, Internet Explorer, and Safari.

In the case of Chrome shown in Figure 2, our proposed method can identify the existence of AES-NI with an accuracy of 99.28%.

In the case of Firefox shown in Figure 3, our proposed method can identify the existence of AES-NI with an accuracy of 71.17%.

In the case of Internet Explorer shown in Figure 4, our proposed method can identify the existence of AES-NI with an accuracy of 77%.

The accuracy of estimations in the Chrome, Firefox and Internet Explorer browsers were 84.78%, 82.88% and 55%, respectively. (for Turbo Boost)

The estimation accuracy of the proposed method held up even cross-browser.

Both AES-NI and Turbo Boost statuses, i.e., enable or disable, were estimated with high accuracy in short processing time, in the Chrome browser. The estimates were relatively stable in Firefox, but were degraded in Internet Explorer.

One countermeasure against the proposed method is to degrade the accuracy of the time-measurement functions built into JavaScript. Time measurements in JavaScript are performed via the built-in Date object and the High Resolution Time API.

Battery Status
Some scripts check for the existence of an AudioContext and OscillatorNode to add a single bit of information to a broader fingerprint. More sophisticated scripts process an audio signal generated with an OscillatorNode to fingerprint the device. This is conceptually similar to canvas fingerprinting: audio signals processed on different machines or browsers may have slight differences due to hardware or software differences between the machines, while the same combination of machine and browser will produce the same output.

HTML5 Battery Status API enables websites to access the battery state of a mobile device or a laptop.

World Wide Web Consortium’s (W3C) Battery Status API allows the reading of battery status data. Among the oﬀered information are the current battery level and predicted time to charge or discharge

The API does not require user permission

In our exploratory survey of the Battery Status API implementations, we observed that the battery level reported by the Firefox browser on GNU/Linux was presented to Web scripts with double precision. An example battery level value observed in our study was 0.9301929625425652. We found that on Windows, Mac OS X and Android, the battery level reported by Firefox has just two signiﬁcant digits (e.g. 0.32).
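The fingerprintable signal here is simply how many fractional digits the reported battery level carries. A sketch (string input avoids floating-point round-trip concerns):

```javascript
// Count the fractional digits of a reported battery level, as a string.
// Firefox on GNU/Linux exposed ~16 digits while other platforms exposed
// only 2, which by itself distinguishes the platform.
function fractionDigits(levelString) {
  const frac = levelString.split('.')[1] || '';
  return frac.length;
}

console.log(fractionDigits('0.9301929625425652')); // 16 (high-precision Linux reading)
console.log(fractionDigits('0.32'));               // 2  (Windows/Mac/Android reading)
```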

the battery level is read from UPower, a Linux tool allowing access to the UPower daemon

We filed a bug report against the Firefox implementation, pointing out the inconsistency of level reporting across different platforms [20]. The fix was implemented and deployed as of June 2015.

Our analysis shows that the high precision battery level readings provided by Firefox can lead to an unexpected ﬁngerprinting surface: the detection of battery capacity

Device ID
The use of a device ID as a fingerprintable attribute was proposed by an anonymous developer on BrowserLeaks.com. According to this website, a device ID is a hash value generated by a browser by applying a cryptographic hash function to the unique ID of a hardware component in the user platform (combined with other data values); it is retrieved by requesting the WebRTC hardware ID attribute.

for a single website, the device ID appears likely to remain constant (at least for some browsers) across multiple visits, giving it high value for ﬁngerprinting purposes

To the authors’ knowledge, there is no description in the literature of any practical evaluations of this attribute as a technique for ﬁngerprinting, and so its robustness and usefulness for this purpose has yet to be determined. However, experiments conducted as part of this research show that it has great promise for use in ﬁngerprinting

Device IDs also have the potential of being highly discriminating; however, as discussed earlier, browsers that provide device IDs differ in terms of the persistence of the values. This attribute is therefore assigned high if the browser shows no signs of changing this value under typical browser usage, and is assigned medium if a browser provides a new value with every browsing session. It is assigned low if a browser provides a new value with every visit or page refresh.
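The high/medium/low assignment above amounts to a small classifier over device IDs observed across visits and sessions. A hypothetical sketch (the input format, IDs grouped per session and per visit, is an assumption made for illustration):

```python
def persistence_rating(sessions):
    """sessions: list of sessions, each a list of device IDs observed on
    successive visits within that session. Returns 'high' if the ID never
    changes, 'medium' if it is stable within a session but changes across
    sessions, and 'low' if it changes between visits."""
    # low: the ID changes within a single session (e.g. on page refresh)
    if any(len(set(s)) > 1 for s in sessions):
        return "low"
    # high: one ID across all sessions; medium: a new ID per session
    all_ids = {s[0] for s in sessions if s}
    return "high" if len(all_ids) == 1 else "medium"

print(persistence_rating([["a", "a"], ["a"]]))       # high   (Chrome-like)
print(persistence_rating([["a", "a"], ["b", "b"]]))  # medium (Firefox-like)
print(persistence_rating([["a", "b"]]))              # low    (Edge-like)
```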

Chrome device IDs are consistent and do not change unless the user selects the privacy mode feature or clears the browser cache. The Firefox device ID remained the same during multiple visits in a single browsing session, but changed once the browser was reopened. Of the browsers revealing a device ID, Edge gave the value that changed most readily; merely refreshing a web page caused Edge to generate a new value.

Protocols
Browsers choose how they order header fields and how many they send, so this can be used to infer the browser family. For example, Internet Explorer places the User-Agent field before the Host field, while Chrome uses the opposite order.
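As a toy illustration, a server-side check could compare the relative positions of just these two fields (the labels and the two-field rule are illustrative assumptions; real classifiers use the full header ordering and count):

```python
def family_from_header_order(header_names):
    """Guess the browser family from the relative order of the User-Agent
    and Host header fields, per the observation that Internet Explorer
    sends User-Agent before Host while Chrome does the opposite."""
    names = [h.lower() for h in header_names]
    try:
        ua, host = names.index("user-agent"), names.index("host")
    except ValueError:
        return "unknown"
    return "IE-like" if ua < host else "Chrome-like"

print(family_from_header_order(["User-Agent", "Host", "Accept"]))       # IE-like
print(family_from_header_order(["Host", "Connection", "User-Agent"]))   # Chrome-like
```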

The HTTP headers also include the user agent string, which provides basic information about the connecting user, for example details of the hardware platform, which can reveal a phone model.

Browser extensions and plugins
Browser plugins are software components that enable the browser to render content it does not otherwise support, whereas browser extensions are programs, typically written in JavaScript, that add new functionality to the browser.

There is a trade-off between privacy-enhancing tools and fingerprinting: the more extensions a user installs to protect their privacy, the more unique they become to fingerprinters. For example, NoScript, a popular browser extension that blocks JavaScript to enhance user privacy and security, can itself be exploited for fingerprinting, since only about 1 out of 93 users disables or blocks JavaScript.

Unlike plugins, extensions are not enumerable through JavaScript and thus can only be detected by their possible side-effects. For instance, Mowery et al. showed that it is possible to deduce custom white lists from the popular NoScript plugin, simply by requesting scripts from domains and later inspecting whether the scripts successfully executed, by searching for predetermined JavaScript objects in the global address space. The deduced white lists can be used as an extra fingerprint feature. Nikiforakis et al. showed that user-agent-spoofing extensions can also be discovered due to inconsistencies in the reported browsing environment when each extension is active.

we present XHOUND (Extension Hound), the first fully automated system for fingerprinting browser extensions, based on a combination of static and dynamic analysis. XHOUND fingerprints the organic activity of extensions in a page's DOM, such as the addition of new DOM elements and the removal of existing ones
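The core of this detection can be sketched as a diff between snapshots of a page's DOM taken without and with an extension active (the serialized-element representation below is an assumption; XHOUND's actual instrumentation is considerably more involved):

```python
def dom_footprint(before, after):
    """before/after: sets of serialized DOM element descriptors captured
    before and after an extension acts on the page; the difference is the
    extension's detectable on-page footprint."""
    return {"added": sorted(after - before), "removed": sorted(before - after)}

before = {"div#content", "span.ad", "p#intro"}
after = {"div#content", "p#intro", "div#ext-toolbar"}  # ad removed, toolbar added
print(dom_footprint(before, after))
# {'added': ['div#ext-toolbar'], 'removed': ['span.ad']}
```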

Moreover, our findings are likely to be applicable to mobile platforms, where most browsers have poor or no support for plugins, yet popular browsers, such as Firefox Mobile and Dolphin Browser for Android, and Chrome for iOS [32], support extensions

XHOUND currently supports Google Chrome and Mozilla Firefox extensions

XHOUND is currently limited in that it searches for modifications in a page's DOM but not in the browser's BOM (Browser Object Model). As such, our tool will not be able to detect certain niche extensions

we applied XHOUND to the 10,000 most popular extensions in the Chrome Store.

XHOUND's results show that at least 9.2% of extensions introduce detectable DOM changes on any arbitrary domain

more than 16.6% are fingerprintable on at least one popular URL of the Alexa top 50 websites. If, instead of looking at all 10K extensions, we limit ourselves to the top 1K, the fraction of detectable extensions increases to 13.2% for arbitrary domains and 23% for popular URLs

The overall trend is that the fraction of detectable extensions decreases when we consider less popular Chrome extensions

the vast majority of fingerprintable extensions perform at least one DOM change (or combination of changes) that is unique to each one of them.

Specifically, whenever an extension modifies the DOM it can i) add a new DOM element, ii) delete an existing DOM element, iii) set/change a tag's attribute, and iv) change the text on the page. As the data shows, the most popular action among fingerprintable extensions is to introduce new elements on a page.

we took advantage of the elapsed time of our previous experiment (four months) to assess whether the "new" top 1,000 extensions were as fingerprintable as the "old" top 1,000 extensions. We found that the intersection of these two sets of top 1,000 extensions was 79.8%, out of which 54.6% had updated their versions. By applying XHOUND to the new top-1,000 extension set, we discovered that 12.2% of the extensions were fingerprintable on any arbitrary URL, while 21.6% were fingerprintable on at least one popular URL, compared to our previous 13.2% and 23%.

Among the most popular 1,000 Firefox extensions implemented with either WebExtensions or the Add-on SDK, we found that 16% are fingerprintable on at least one URL, and 7.3% on any domain.

Similar to the analyzed Chrome extensions, the most popular types of changes are the addition of new DOM elements (67%), the changing of particular attributes (37%) and the deletion of parts of the content (27%).

854 users participated in our surveys who had a total of 941 unique browser extensions installed and enabled. [...] On average, surveyed users had 4.81 active extensions in their browsers.

One can see that, for all groups of users, with the exception of non-US MTurk workers, approximately 70% of users had at least one fingerprintable extension. In addition, 14.1% of all users in all groups are uniquely identifiable (belong to an anonymity set of size equal to one).
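The notion of an anonymity set of size one can be made concrete in a few lines: given each user's set of detected extensions, users whose set is shared by nobody else are uniquely identifiable (the data below is a toy example, not the survey's):

```python
from collections import Counter

def uniquely_identifiable_fraction(user_extension_sets):
    """Fraction of users whose extension set forms an anonymity set of
    size one, i.e. no other user has exactly the same set."""
    counts = Counter(frozenset(s) for s in user_extension_sets)
    unique = sum(1 for s in user_extension_sets if counts[frozenset(s)] == 1)
    return unique / len(user_extension_sets)

users = [{"adblock"}, {"adblock"}, {"adblock", "noscript"}, {"ghostery"}]
print(uniquely_identifiable_fraction(users))  # 0.5 (two of four users are unique)
```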

A more subtle implication of fingerprinting browser extensions is that extensions, unlike plugins and other existing fingerprintable features, capture, to a certain extent, the interests of users.

We then surveyed 854 real users and discovered that most users utilize fingerprintable extensions, and a significant fraction of them use different sets of fingerprintable extensions, allowing trackers to uniquely or near-uniquely identify them.

When this two-step validation is not properly implemented, it is prone to a timing side-channel attack that an adversary can use to identify the actual reason behind a request denial: the extension is not present, or its resources are kept private. To this end, we used the User Timing API, implemented in every major browser, in order to measure the performance of web applications.

By comparing the two timestamps, the attacker can easily determine whether an extension is installed or not in the browser.

it is possible to completely enumerate all the installed extensions
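The enumeration logic behind this attack can be sketched as follows: probe each candidate extension ID and classify it as installed when the denial takes measurably longer than a baseline request for a nonexistent extension. The threshold and the timing-collection interface are assumptions; in the browser the timestamps would come from the User Timing API:

```python
def enumerate_installed(denial_times_ms, baseline_ms, factor=2.0):
    """denial_times_ms: mapping of extension ID -> observed time until the
    resource request was denied. A denial much slower than the baseline
    (nonexistent extension) suggests the extension is installed but keeps
    its resources private."""
    return {ext for ext, t in denial_times_ms.items() if t > factor * baseline_ms}

timings = {"ublock": 4.1, "ghostery": 0.4, "lastpass": 3.8}
print(sorted(enumerate_installed(timings, baseline_ms=0.5)))
# ['lastpass', 'ublock']
```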

Built-in extensions. These extensions are pre-installed and present in nearly every major web browser, and there is no possibility for the user to uninstall them. Therefore, if we configure our techniques to check one of these built-in extensions that does not exist in other browsers, a website can precisely identify the browser family with 100% accuracy.

Installed extensions provide information about a particular user’s interests, concerns, and browsing habits

we implemented a page that checks the users' installed extensions among the top 1,000 most popular from the Chrome Web Store and the Firefox Add-ons website, using the timing side-channel extension enumeration attack described in §3.1.

Overall, from the 204 users that participated in our study, 116 users presented a unique set of installed extensions, which means that 56.86% of the participants are uniquely identifiable just by using their set of extensions.

In particular, Table 4 compares the different entropy values of the top six fingerprinting methods or attributes measured in the work by Laperdrix et al. [24] with our extensions-based fingerprinting method. (Cf nextcloud)
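Entropy comparisons like the one in Table 4 are computed from the distribution of attribute values across the observed population. A minimal Shannon-entropy helper, shown here with toy data rather than the study's:

```python
import math
from collections import Counter

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of an
    attribute's observed values; higher entropy means the attribute
    splits the population more finely."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Four users, three distinct extension sets: 0.5*1 + 0.25*2 + 0.25*2 = 1.5 bits
print(shannon_entropy(["setA", "setA", "setB", "setC"]))  # 1.5
```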

we focused on extensions that advertised themselves as capable of spoofing a browser's user agent

The extensions were discovered by visiting each market, searching for "user-agent" and then downloading all the relevant extensions with a sufficiently large user base and an above-average rating.

Our testing consisted of listing the navigator and screen objects through JavaScript and inspecting the HTTP headers sent with browser requests, while the extensions were actively spoofing the identity of the browser.

in all cases, the extensions were inadequately hiding the real identity of the browser, which could still be straightforwardly exposed through JavaScript

fingerprinting libraries [...] can discover the discrepancies between the values reported by the extensions and the values reported by the browser, and then use these differences as extra features of their fingerprints

discrepancies of each specific extension can be modeled and thus, as with Adblock Plus, used to uncover the presence of specific extensions, through their side-effects.

We characterize the extension-problem as an iatrogenic one

users who install these extensions in an effort to hide themselves in a crowd of popular browsers, install software that actually makes them more visible and more distinguishable from the rest of the users, who are using their browsers without modifications

Our findings come in direct antithesis with the advice given by Yen et al. [18], who suggest that user-agent-spoofing extensions can be used as a way of making tracking harder. (Host Fingerprinting and Tracking on the Web: Privacy and Security Implications)

To this end, we also analyzed eleven popular user-agent-spoofing extensions and showed that, even without our newly proposed fingerprinting techniques, all of them fall short of properly hiding a browser's identity.

while analyzing the plugin-detection code of the studied fingerprinting providers, we noticed that two out of the three were searching a user's browser for the presence of a special plugin, which, if detected, would be loaded and then invoked

the plugins were essentially native fingerprinting libraries, which are distributed as CAB files for Internet Explorer and eventually load as DLLs inside the browser. These plugins can reach a user's system either by a user accepting their installation through an ActiveX dialogue, or bundled with applications that users download on their machines

The submitted fingerprinting DLLs were reading a plethora of system-specific values, such as the hard disk's identifier, TCP/IP parameters, the computer's name, Internet Explorer's product identifier, the installation date of Windows, the Windows Digital Product ID and the installed system drivers

All of these values combined provide a much stronger fingerprint than what JavaScript or Flash could ever construct

HTML5
Since browsers implement the new HTML5 features to differing degrees, support for the various improvements can be tested and used for fingerprinting purposes as well. To identify the new features and the extent to which modern browsers support them, we used the methodology described in. The W3C furthermore has a working draft on the differences between HTML5 and HTML4 that was used as input. In total we identified a set of 242 new tags, attributes and features in HTML5 that were suitable for browser identification.
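A feature-support fingerprint of this kind boils down to an ordered vector of pass/fail results, which can be hashed into a compact identifier. A minimal sketch; the three probes shown are illustrative stand-ins for the 242 tests, and the truncated SHA-256 digest is an arbitrary choice:

```python
import hashlib

def html5_fingerprint(feature_results):
    """feature_results: ordered list of (feature_name, supported) pairs,
    one per tested HTML5 tag/attribute/feature. The bit pattern of
    supported features is hashed into a compact fingerprint."""
    bits = "".join("1" if ok else "0" for _, ok in feature_results)
    return hashlib.sha256(bits.encode()).hexdigest()[:16]

# Illustrative probe results, as a real script would collect them in the
# browser via JavaScript feature detection.
probes = [("canvas", True), ("webgl", True), ("battery-api", False)]
print(html5_fingerprint(probes))
```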

One of our findings from the fingerprint collection was that the operating system apparently has no influence on HTML5 support. We were unable to find any differences between operating systems while using the same browser version, even with different architectures.

In this paper, we propose to use the behavior of the HTML parser under specific inputs to fingerprint the type and version of browsers. We call those particular responses HTML parser quirks. The Merriam-Webster dictionary defines a quirk as "a peculiar trait".

HTML parser quirks are peculiar behaviors under specific inputs. They may have different consequences, in particular incorrect rendering or undesired execution of JavaScript code.

Based on this set of testable XSS vectors, a framework called XSS Test Driver performs the full test suite on different browsers, collecting as many XSS signatures as possible.

A technique based on the observation of HTML parser quirks is feasible at the application level, and it is hard to counter, since HTML parser behavior is hardly spoofable.

Each signature contains attributes describing the results of all the tests. We consider an initial set of 77 browsers, and the corresponding signatures are referred to as the raw dataset of browser signatures. This dataset can be directly used for fingerprinting an unknown web browser, in order to determine (1) its exact version, based on a Hamming distance between browser signatures. This set can also be used (2) as input for machine learning techniques in order to build an optimized decision tree.

Our experiments show that the exact version of a web browser can be determined with 71% accuracy (within our dataset), and that only 6 tests are sufficient to quickly determine the exact family a web browser belongs to (using the decision tree).
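The version-matching step (1) can be sketched as a nearest-neighbour lookup under Hamming distance over quirk signatures (the signatures below are invented toy vectors, not actual test results):

```python
def hamming(a, b):
    """Number of differing test outcomes between two quirk signatures."""
    return sum(x != y for x, y in zip(a, b))

def closest_browser(signature, known):
    """Return the known browser whose quirk signature is nearest to the
    observed one under Hamming distance."""
    return min(known, key=lambda name: hamming(signature, known[name]))

known = {
    "Firefox 10": (1, 0, 1, 1, 0),
    "IE 8":       (0, 1, 1, 0, 0),
    "IE 9":       (0, 1, 0, 0, 1),
}
print(closest_browser((1, 0, 1, 0, 0), known))  # Firefox 10 (distance 1)
```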

The JavaScript code of one of the three fingerprinting companies included a fall-back method for font-detection, for the cases where the Flash plugin was unavailable.

the code first creates an element. Inside this element, the code then creates a second element with a predetermined text string and size, using a provided font family. Using the offsetWidth and offsetHeight properties of HTML elements, the script discovers the layout width and height of the element.

In order to capitalize as much as possible on small differences between fonts, the font-size is always large

a fingerprinting script can rapidly discover, even for a long list of fonts, those that are present on the operating system. The downside of this approach is that less popular fonts may not be detected, and that the font order is no longer a fingerprintable feature
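The decision rule of this fallback technique is simply a comparison against the dimensions obtained with the fallback font. A sketch of that logic (the measured dimensions would come from offsetWidth/offsetHeight in the browser; the numbers and font names here are made up):

```python
def detect_fonts(measured, fallback_dims):
    """measured: {font_name: (width, height)} of a fixed test string
    rendered with each candidate font family; fallback_dims: the
    dimensions obtained with the generic fallback font. A candidate is
    considered installed if its rendering differs from the fallback."""
    return sorted(f for f, dims in measured.items() if dims != fallback_dims)

measured = {"Arial": (212, 96), "Calibri": (208, 94), "MadeUpFont": (210, 95)}
print(detect_fonts(measured, fallback_dims=(210, 95)))  # ['Arial', 'Calibri']
```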

Studies history
Eckersley conducted the first large-scale study to analyze the uniqueness of web browser configurations, converting them to so-called "device fingerprints". Stateless web tracking does not rely on unique identifiers stored on user devices, but on the properties of user devices including: browser version, installed fonts, browser plugins, and screen resolution.

Eckersley conducted the first large-scale study showing that various properties of a user's browser and plugins can be combined to form a unique fingerprint (P. Eckersley, "How Unique Is Your Browser?" in Proceedings of the 10th Privacy Enhancing Technologies Symposium (PETS), 2010.)

Yen et al. [18] performed a fingerprinting study, similar to Eckersley's, by analyzing month-long logs of Bing and Hotmail (T.-F. Yen, Y. Xie, F. Yu, R. P. Yu, and M. Abadi, "Host Fingerprinting and Tracking on the Web: Privacy and Security Implications," in Proceedings of the 19th Annual Network and Distributed System Security Symposium (NDSS), 2012.)

Mowery et al. [13] proposed the use of benchmark execution time as a way of fingerprinting JavaScript implementations (K. Mowery, D. Bogenreif, S. Yilek, and H. Shacham, "Fingerprinting information in JavaScript implementations," in Proceedings of W2SP 2011, H. Wang, Ed. IEEE Computer Society, May 2011.)

Mowery and Shacham later proposed the use of rendering text and WebGL scenes to a <canvas> element as another way of fingerprinting browsers (K. Mowery and H. Shacham, "Pixel perfect: Fingerprinting canvas in HTML5," in Proceedings of W2SP 2012, M. Fredrikson, Ed. IEEE Computer Society, May 2012.)

Olejnik et al. [40] show that web history can also be used as a way of fingerprinting without the need for additional client-side state (Ł. Olejnik, C. Castelluccia, and A. Janc, "Why Johnny Can't Browse in Peace: On the Uniqueness of Web Browsing History Patterns," in the 5th Workshop on Hot Topics in Privacy Enhancing Technologies (HotPETs 2012).)

Today, however, all modern browsers have corrected this issue and thus, extraction of a user’s history is not as straightforward, especially without user interaction

Motivated by the initial findings of Eckersley [19], a number of researchers further investigated stateless tracking and its implications. Yen et al. [54] performed a fingerprinting study similar to Eckersley's by analyzing logs of Bing and Hotmail (P. Eckersley, "How unique is your web browser?" in Privacy Enhancing Technologies. Springer, 2010, pp. 1-18.)

Nikiforakis et al. [6] described how fingerprinting works by analyzing the code of three browser-fingerprinting providers. (N. Nikiforakis, A. Kapravelos, W. Joosen, C. Kruegel, F. Piessens, and G. Vigna, "Cookieless monster: Exploring the ecosystem of web-based device fingerprinting," in Security and Privacy (SP), 2013 IEEE Symposium on. IEEE, 2013, pp. 541-555.)

Acar et al. [8] developed the FPDetective framework to detect web-based fingerprinters in the wild (G. Acar, M. Juarez, N. Nikiforakis, C. Diaz, S. Gürses, F. Piessens, and B. Preneel, "FPDetective: Dusting the web for fingerprinters," in ACM CCS'13. ACM, 2013, pp. 1129-1140.)

In a later study, the authors also investigated the usage of canvas fingerprinting [55] in the wild as one more vector for uniquely identifying users across multiple websites [7]. ([7] G. Acar, C. Eubank, S. Englehardt, M. Juarez, A. Narayanan, and C. Diaz, "The web never forgets: Persistent tracking mechanisms in the wild," ACM CCS'14, 2014. [55] K. Mowery and H. Shacham, "Pixel perfect: Fingerprinting canvas in HTML5," in Web 2.0 Workshop on Security and Privacy (W2SP), 2012.)

The most extensive measurement of stateless tracking has been performed by Englehardt and Narayanan [9] (S. Englehardt and A. Narayanan, "Online tracking: A 1-million-site measurement and analysis. Draft: July 11th, 2016," Jul. 2016, [Technical Report]. [Online]. Available: http://randomwalker.info/publications/OpenWPM_1_million_site_tracking_measurement.pdf)

Our work leverages the findings of Englehardt and Narayanan as well as Acar et al. to shed light on the effectiveness of state-of-the-art tracker-blocking tools against stateless tracking on popular websites and mobile apps. Our results showed that stateless tracking constitutes a serious blind spot of today's tracker-blocking tools.

Eckersley [21] published the first research paper discussing in detail the concept of browser-based device fingerprinting