Spy pixel

Spy pixels or tracker pixels are hyperlinks to remote image files in HTML email messages that have the effect of spying on the person reading the email if the image is downloaded. They are commonly embedded in the HTML of an email as small, imperceptible, transparent graphic files. Spy pixels are commonly used in marketing, and there are several countermeasures in place that aim to block email tracking pixels. However, there are few regulations in place that effectively guard against email tracking approaches.

History
Email was invented in 1971 by Ray Tomlinson and has made it much more convenient to send and receive messages than traditional postal mail. In 2020, there were 4 billion email users worldwide and approximately 306 billion emails sent and received daily. The email sender, however, still has to wait for a reply email from the recipient in order to confirm that their message was delivered. There are some situations where the recipient doesn't respond to the sender even when they have read the email, which is why the email tracking method emerged. Most email services do not provide indicators as to whether an email was read, so third-party applications and plug-ins have provided the convenience of email tracking. The most common method is the email tracking beacon or spy pixel.

Spy pixels were described as "endemic" in February 2021. The "Hey" email service, contacted by BBC News, estimated that it blocked spy pixels in about 600,000 out of 1,000,000 messages per day.

Mechanism
HTML email messages typically contain hyperlinks to online resources. Common software used by a recipient of email may, by default, automatically download remote image files from hyperlinks, without asking the user for confirmation. After downloading an image file, the software displays the image to the recipient. A spy pixel is an image file that is deliberately made small, often of a single pixel and of a colour that makes it "impossible to spot with the naked eye even if you know where to look." Any email user can be reached via email tracking due to the open nature of email.

The tracking process begins when a sender inserts an image tag, represented as <img>, into an HTML-based email. The image tag is linked to a tracking object stored on the server of the sender through a reference Uniform Resource Locator (URL). Once the mail client is opened, the recipient receives the email through a process whereby the mail user agent (MUA) synchronizes updates from the recipient's message transfer agent (MTA) with the local mail repository. When the recipient opens the email, the mail client requests the file that is referenced by the image tag. As a result, the web server where the file is stored logs the request and returns the image to the recipient. In order to track individual behavior, the tracking object or reference URL has to contain a tag that is unique to each email recipient. Oftentimes, the hash of the recipient's email address is used. In contrast, IP address and device information collected from non-tracking images does not reveal specific users' email addresses.
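The tagging step described above can be illustrated with a minimal Python sketch. It assumes SHA-256 as the hash and a hypothetical endpoint, tracker.example.com, with hypothetical query parameter names r and c; real tracking services may use different hashes, URL layouts, and identifiers.

```python
import hashlib
import html
from urllib.parse import urlencode

# Hypothetical tracking endpoint; not taken from any real service.
TRACKING_ENDPOINT = "https://tracker.example.com/pixel.gif"


def recipient_token(email: str) -> str:
    """Return a hex SHA-256 digest of the normalized address (assumed hash choice)."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def tracking_img_tag(email: str, campaign: str) -> str:
    """Build a 1x1 image tag whose URL carries a token unique to the recipient."""
    query = urlencode({"r": recipient_token(email), "c": campaign})
    url = html.escape(f"{TRACKING_ENDPOINT}?{query}", quote=True)
    return f'<img src="{url}" width="1" height="1" alt="">'


tag = tracking_img_tag("Alice@example.org", "newsletter-01")
print(tag)
```

Because the token is derived from the address, every request for this URL can be attributed to one recipient, whereas a shared image URL would only reveal that someone opened the message.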

When a single email is sent to multiple recipients, the tracking report will normally show the number of emails that have been opened but not the specific recipients who have done so.

Email tracking vs. web tracking
Web tracking and email tracking employ similar mechanisms, such as the usage of tracking images or cookies. Email tracking makes it much easier to trace back to any individual without consent, as email addresses can often reveal an individual's affiliation to a particular organization, browsing history, online social media profile, and other PII. This can lead to cross-tracking across devices, where third-party services link devices that share common attributes such as IP addresses, local networks, or login information. Although this may be more challenging with web tracking, more advanced web trackers have data collection features, like the Meta Pixel's advanced matching feature, that allow people to be identified when they submit an email address or other PII on a form page.

Personal use
Individuals and business owners may want to use email tracking for a variety of reasons, such as lead generation, event invitations, promotions, newsletters, one-click polls, and teacher-parent communications. They can use services like Yet Another Mail Merge (YAMM), a Google Sheets add-on, to create and send personalized mail merge campaigns from Gmail. The sender has the option to enable the tracker and see email open rates, clicks, replies, and bounces. According to YAMM's website: "YAMM embeds a tiny, invisible tracking image (a single-pixel gif, sometimes called a web beacon) within the content of each message. When the recipient opens the message, the tracking image is scanned, referenced and recorded in our system."

Marketing
Tracking the behavior of users through mediums like email newsletters and other forms of marketing communication is a competitive advantage in online marketing. In fact, it is so valuable that there are companies that sell online user data or offer email tracking as a service, such as Bananatag, Mailtrack.io, and Yet Another Mail Merge. This is because by learning more about the user based on their clicking histories and demographics, websites and companies can tailor messages to each user. The more information on the individual-level preferences of a user, the better. Customized communications in marketing can then result in heightened customer loyalty, lock-in, and satisfaction, which translates to increased cash flows and profitability. Using data to map out the competitive landscape can also help companies derive a competitive strategy and gain a competitive advantage. However, adverse effects from behavioral marketing can include discrimination, including price discrimination.

Malicious emails
Some emails contain malicious content or attachments, and email tracking is used to detect how fast these viruses or malicious programs can spread. At the same time, the deliverability of tracked emails is generally reduced by up to 85%, as the firewalls of company servers embed algorithms to filter out emails with suspicious contents.

Research
Web tracking and tracking software are used by researchers who need to gather data for their research, especially in information seeking studies. In fact, tracking technologies can be used for good, offering valuable information for the development of websites, portals, and digital libraries. They can also be used to improve user interfaces, search engines, menu items, navigational features, online help, intelligent software agents, information architecture, content description, metadata, and more. These findings can be useful in marketing and e-commerce and may be important to people like library and information professionals, educators, and database designers.

Spying effect
The spying effect is that the automatic download reports to the sender of the email, without the recipient choosing to do so: whether and when an email is read, how many times it is read, the IP address and other identifying details of the computer or smartphone used to read it, and, from the latter, the geographical location of the recipient. This information provides insights into users' email reading behaviors, office and travel times, as well as details about their environment. By doing a reverse lookup of an IP address, the log entry can provide information on which organizations a user is affiliated with. For example, a board member of a major technology company was caught forwarding confidential information when an email log entry, IP address, and location information were examined simultaneously. Additionally, if spammers send emails to random email addresses, they can identify active accounts in this manner.
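As a rough illustration of what a sender can derive from the requests for a tracking image, the following Python sketch aggregates hypothetical log records by recipient token. The record layout, field names, and sample values are assumptions made for this example and do not reflect any particular server's log format.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical log records, one per request for the tracking image.
requests_log = [
    {"token": "a1b2", "time": "2021-02-17T08:05:00", "ip": "203.0.113.7",
     "user_agent": "ExampleMail/1.0 (Phone)"},
    {"token": "a1b2", "time": "2021-02-17T19:40:00", "ip": "198.51.100.23",
     "user_agent": "ExampleMail/1.0 (Desktop)"},
    {"token": "c3d4", "time": "2021-02-18T09:15:00", "ip": "203.0.113.99",
     "user_agent": "ExampleMail/1.0 (Desktop)"},
]


def summarize_opens(log):
    """Group requests by recipient token: open count, first open, IPs, devices."""
    summary = defaultdict(
        lambda: {"opens": 0, "first_open": None, "ips": set(), "user_agents": set()}
    )
    for entry in log:
        record = summary[entry["token"]]
        opened_at = datetime.fromisoformat(entry["time"])
        record["opens"] += 1
        if record["first_open"] is None or opened_at < record["first_open"]:
            record["first_open"] = opened_at
        record["ips"].add(entry["ip"])
        record["user_agents"].add(entry["user_agent"])
    return dict(summary)


report = summarize_opens(requests_log)
for token, record in report.items():
    print(token, record["opens"], record["first_open"], sorted(record["ips"]))
```

In this sketch the two requests for token a1b2 come from different IP addresses and devices at different times of day, which is the kind of pattern that can hint at office and travel times.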

There exist many companies that offer email tracking services to senders. According to a study done by three researchers at Princeton University, about 30% of the emails they analyzed leaked recipients' email addresses to third parties via methods like embedded pixels, the majority of them intentionally. 85% of emails in their corpus of 12,618 gathered using a web crawler contained embedded third-party content, with 70% categorized as trackers. Top third-party domains include "doubleclick.net," "mathtag.com," "dotomi.com," and "adnxs.com," and the top organizations that collect leaked email addresses include The Acxiom, Conversant Media, LiveIntent, Neustar, and Litmus Software. Reloading an email increases the chance of the recipient's information being leaked to third parties. The study also found that tracking protection was helpful: it reduces the number of email addresses leaked by 87%.

A separate study found that 24.7% of 44,449 emails analyzed were embedded with at least one tracking beacon. Emails categorized as travel, news/media, and health had the highest prevalence of tracking, with 57.8%, 51.9%, and 43.4% containing at least one tracking beacon respectively. On the other hand, emails categorized as email client, social networking, and education have the least tracking, with 0.6%, 1.6%, and 3.8% containing at least one tracking beacon respectively. Through a survey, the authors also found that 52.1% of participants who checked email quite often were unaware that they could be tracked from simply opening an email. 86% of participants consider email tracking as a serious privacy threat.

According to poll results from Zogby International, 80% of consumers are either "somewhat" or "very" concerned about online tracking. Consumers who perceive a lack of business or governmental regulation will try to regain power through a variety of responses, such as fabricating personal information, using privacy-enhancing technologies, and refusing to purchase. At the same time, some argue that people's perceptions about privacy have changed with the times. For example, Mark Zuckerberg, founder of Facebook, said, "People have really gotten comfortable not only sharing more information and different kinds, but more openly and with more people. That social norm is just something that has evolved over time." Ironically, Facebook was also at the center of the Facebook-Cambridge Analytica data scandal in 2018.

Cambridge Analytica used a third-party app called “thisisyourdigitallife” to collect information from over 50 million Facebook users. Access to users' emails can expose them to data leaks. Four researchers from the University of Iowa and the Lahore University of Management Sciences designed and deployed CanaryTrap, which identifies data misuse by third-party apps on online social networks. It does this by linking a honeytoken to a user’s social media page and then watching for unrecognized usage. Specifically, the authors shared email addresses as honeytokens and watched for any unrecognized use of those email addresses. After performing an experiment on 1,024 Facebook pages, the authors discovered multiple counts of data misuse. 422 unrecognized emails were received on honeytokens shared with 20 Facebook apps. Within those 422 emails, 76 were categorized as malicious or spam. Furthermore, third-party trackers can be considered as “adversaries” to Internet users because the use of HTTP cookies, Flash cookies, and DOM storage breaks data confidentiality between the users and the websites they interact with.
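The honeytoken idea can be sketched in a few lines of Python. This is not the CanaryTrap implementation: the addresses, app domains, and the rule that an email is "recognized" only if the sender's domain matches the app the honeytoken was shared with are all simplifying assumptions made for illustration.

```python
# Hypothetical mapping: each honeytoken address was shared with exactly one app.
HONEYTOKENS = {
    "canary-001@honeypot.example": "quizapp.example",
    "canary-002@honeypot.example": "photoapp.example",
}


def sender_domain(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def is_recognized(sender: str, app_domain: str) -> bool:
    """Assumed rule: the sender belongs to the app's domain or a subdomain of it."""
    domain = sender_domain(sender)
    return domain == app_domain or domain.endswith("." + app_domain)


def find_unrecognized(received_emails):
    """Return (app_domain, sender) pairs for mail that suggests the address leaked."""
    flagged = []
    for mail in received_emails:
        app_domain = HONEYTOKENS.get(mail["to"].lower())
        if app_domain is None:
            continue  # not addressed to a honeytoken
        if not is_recognized(mail["from"], app_domain):
            flagged.append((app_domain, mail["from"]))
    return flagged


inbox = [
    {"to": "canary-001@honeypot.example", "from": "news@mail.quizapp.example"},
    {"to": "canary-001@honeypot.example", "from": "offers@unrelated.example"},
    {"to": "canary-002@honeypot.example", "from": "hello@photoapp.example"},
]
print(find_unrecognized(inbox))
```

Because each address is given to only one app, any mail from an unrelated sender points back to the app that received that address.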

Overall, researchers at Carnegie Mellon University and Qualcomm found that many users don't see tracking as black and white. Many want control over tracking and think that it has its benefits, but don't know how to control tracking or distrust current tools. Out of 35 participants in the study, fourteen saw tracking as conditionally positive, eight saw it as generally neutral, nine saw it as generally negative, and the remaining four had mixed feelings. Twelve participants felt resigned to tracking.

Countermeasures
Countermeasures include using a plain text email client, disabling automatic download of images, or, if reading email using a browser, installing an add-on or browser extension.

The process of email-tracking does not require cookies, which makes it difficult to block without affecting user experience. For example, disabling automatic download of images is easy to implement; however, the trade-off is that it often results in a loss of information, incorrect formatting, a decline in user experience, and incomprehension or confusion.
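A minimal Python sketch of the image-blocking countermeasure is shown below. It uses the standard library's html.parser module to re-emit a message body while dropping image tags that point at remote URLs; the class name and sample message are made up for this example, and real mail clients apply far more thorough sanitization.

```python
from html.parser import HTMLParser


class RemoteImageStripper(HTMLParser):
    """Re-emit HTML while dropping <img> tags that reference remote URLs."""

    def __init__(self):
        # Keep character references as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []    # pieces of the cleaned document
        self.blocked = []  # remote image URLs that were removed

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src") or ""
        if tag == "img" and src.lower().startswith(("http://", "https://", "//")):
            self.blocked.append(src)
        else:
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: the original text already includes the "/>".
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_remote_images(body: str):
    """Return the cleaned HTML and the list of blocked image URLs."""
    parser = RemoteImageStripper()
    parser.feed(body)
    parser.close()
    return "".join(parser.parts), parser.blocked


message = (
    '<p>Hello &amp; welcome</p>'
    '<img src="https://tracker.example.com/p.gif?r=abc" width="1" height="1">'
    '<img src="cid:logo">'
)
cleaned, blocked = strip_remote_images(message)
print(cleaned)
print(blocked)
```

The sketch removes every remote image, not only tracking pixels, which illustrates the trade-off: legitimate pictures hosted on a server disappear along with the trackers, while images attached to the message itself (here referenced by a cid: URL) are kept.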

Three Princeton University researchers who analyzed 16 email clients found that none of the existing setups completely protects users from the threats of email tracking. Blocking extensions such as uBlock Origin, Privacy Badger, and Ghostery can filter tracking requests.

Four other researchers aimed to detect trackers by focusing on analyzing the behavior of invisible pixels. After crawling 84,658 web pages from 8,744 domains, they found that invisible pixels are present on more than 94.51% of domains and make up 35.66% of all third-party images. Filter lists such as EasyList, EasyPrivacy, and Disconnect are popular ways to detect tracking; they detect known tracking and advertising requests by keeping a "blacklist." However, they miss around 30% of the trackers that the researchers detected. Moreover, when all three filter lists were combined, 379,245 requests from 8,744 domains still tracked users on 68.70% of websites.
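The blacklist approach can be illustrated with a simplified Python sketch. It matches only on host names and uses a small hand-written set of domains (taken from the third-party domains named earlier in this article); it is an assumption-laden stand-in and does not parse the actual rule syntax of EasyList, EasyPrivacy, or Disconnect.

```python
from urllib.parse import urlsplit

# Illustrative blocklist; real filter lists contain many thousands of rules.
BLOCKLIST = {"doubleclick.net", "mathtag.com", "dotomi.com", "adnxs.com"}


def is_blocked(url: str, blocklist=BLOCKLIST) -> bool:
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check "a.b.c", then "b.c", then "c" against the blocklist.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


requests = [
    "https://ad.doubleclick.net/pixel.gif?r=abc",
    "https://images.example.org/logo.png",
    "https://pixel.unlisted-tracker.example/p.gif?r=abc",
]
for url in requests:
    print(url, "blocked" if is_blocked(url) else "allowed")
```

The last request shows the limitation the researchers measured: a tracker whose domain is not yet on the list is allowed through, no matter how the image behaves.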

Recent research has focused on using machine learning to develop anti-tracking software for end-users.

Analyzing mail flows and aggregate statistical data can help protect user accounts by detecting abnormal email behavior such as viral propagation of malicious email attachments, spam emails, and email policy violations.

Privacy tools can have usability flaws that make it difficult for users to make informed and meaningful decisions. For example, participants in a study thought that they had installed and configured a tool successfully when they had not. Additionally, the rise of ad-blockers and similar privacy tools has led to the emergence of anti-ad-blockers, which seek out ad-blockers and try to disable them with various methods, in an escalating ad-blocker arms race.

Privacy regulations and policies
Few regulatory initiatives exist to protect users from email tracking. The help pages of many email clients, such as Gmail, Yahoo! Mail, and Thunderbird, may mislead users into thinking that the privacy risks associated with email tracking are limited, because they state that the threat is restricted to the email sender receiving recipients' information and do not mention that third parties may also be able to access that information.

United States
The U.S. currently does not have comprehensive privacy rights in place. The Fourth Amendment, which states that "the right of the people to be secure in their persons, houses, papers, and effects, against unreasonable searches and seizures, shall not be violated," does not explicitly apply to private companies and individuals. California's state constitution, however, grants individuals explicit privacy rights from both government and private action. There are regulations that target specific sectors, such as the Gramm-Leach-Bliley Financial Modernization Act of 1999 directed towards the financial services sector, the Health Insurance Portability and Accountability Act of 1996 for the healthcare sector, and the U.S. Department of Commerce's Safe Harbor framework which assists US companies' compliance with the EU's Directive on Data Protection.

European Union
The European Union passed the Directive on Data Protection (Directive 95/46/EC) in 1995 which requires member states to comply with certain privacy protection laws, focused on protecting the consumer. The directive forbids the exchange of data between EU member countries and countries that are not in accordance with the directive. Personal data can only be collected in certain circumstances and must be disclosed to individuals whose information is being collected. Additionally, PII can only be kept for as long as it is used for its original purpose.

The EU first introduced a set of regulations on tracking technologies in 2002. In 2009, the EU Directive mandated that websites ask for consent before using any type of profiling technology, such as cookies. As a result, most European websites implemented a "cookie bar." However, four researchers at the Polytechnic University of Turin performed an experiment on 35,000 websites using a tool called CookieCheck and found that 49% of those websites did not follow the EU cookie directive and installed profiling cookies before the user gave consent. In conclusion, the authors argue that the EU regulatory framework has been ineffective in enforcing rules and has not done much to help reduce users’ exposure to tracking technologies.