Collection No. 1

Collection #1 is the name of a set of email addresses and passwords that appeared on the dark web around January 2019. The database contains over 773 million unique email addresses and 21 million unique passwords, resulting in more than 2.7 billion email/password pairs. The list, reviewed by computer security experts, contains exposed addresses and passwords from over 2,000 previous data breaches, as well as an estimated 140 million new email addresses and 10 million new passwords from previously unknown sources. Collectively, this makes it the largest data breach on the Internet.

Collection #1 was discovered by security researcher Troy Hunt, founder of "Have I Been Pwned?", a website that allows users to search for their email addresses and passwords to check whether either has appeared in a known data breach. The database had been briefly posted to Mega in January 2019, and links to it were posted on a popular hacker forum. Hunt found that the offering contained 87 gigabytes of data across 12,000 files. Of further concern, he found that the passwords were available in plaintext rather than in hashed form. This implied that the creators of the database had been able to crack the password hashes, which is possible when hashing algorithms are weakly implemented. Security researchers noted that, unlike other username/password lists, which are usually sold on the dark web, Collection #1 was temporarily available at no cost and could therefore be used by a larger number of malicious actors, primarily for credential stuffing.
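
To illustrate why weakly hashed passwords can end up in plaintext, the following is a minimal sketch of a dictionary attack against unsalted hashes, contrasted with a salted, deliberately slow derivation. The source does not say which algorithms the breached sites used; unsalted SHA-1 is assumed here purely for illustration, and the function names (`sha1_hex`, `crack_unsalted`, `hash_salted`) are hypothetical.

```python
import hashlib
import os
from typing import Dict, Iterable, Optional, Tuple


def sha1_hex(password: str) -> str:
    """Unsalted SHA-1 digest of a password (assumed weak scheme, for illustration)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()


def crack_unsalted(leaked_hashes: Iterable[str], wordlist: Iterable[str]) -> Dict[str, str]:
    """Map each leaked hash to its plaintext if the plaintext is in the wordlist.

    Because the hash is unsalted, every candidate is hashed once and the
    result can be matched against all leaked hashes at the same time.
    """
    lookup = {sha1_hex(word): word for word in wordlist}
    return {h: lookup[h] for h in leaked_hashes if h in lookup}


def hash_salted(password: str, salt: Optional[bytes] = None,
                iterations: int = 100_000) -> Tuple[bytes, bytes]:
    """Derive a salted PBKDF2-HMAC-SHA256 digest; returns (salt, digest).

    A random per-password salt defeats the shared lookup table above, and
    the iteration count makes each guess more expensive.
    """
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


if __name__ == "__main__":
    leaked = [sha1_hex("123456"), sha1_hex("correct horse battery staple")]
    recovered = crack_unsalted(leaked, ["password", "123456", "qwerty"])
    print(f"recovered {len(recovered)} of {len(leaked)} passwords")
```

In this sketch only the common password is recovered; the long passphrase is not in the wordlist. With the salted variant, the same password produces a different digest for each account, so an attacker has to attack every hash separately.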

By January 30, 2019, security researchers had observed similar sets of data, named Collections #2 through #5, for sale on the dark web. Collections #2-5 included over 845 gigabytes of data, with a total of 25 billion email/password records. Security researchers at the Hasso Plattner Institute estimated that Collections #2-5, after removing duplicates, contain about three times as much data as Collection #1. Many of the email/password pairs in the collections were found to come from previous breaches, including the Yahoo! data breaches and breaches of LinkedIn and Dropbox.
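
The difference between raw record counts and deduplicated totals can be sketched as follows. This is an illustration only: the `email:password` line format, the lowercasing of email addresses, and the function name `summarize` are assumptions, not details taken from the researchers' analyses.

```python
from typing import Dict, Iterable


def summarize(lines: Iterable[str]) -> Dict[str, int]:
    """Count raw rows and unique emails, passwords, and pairs in a combo list.

    Assumes one "email:password" record per line (hypothetical format).
    Lines without a colon or without an email part are skipped.
    """
    emails, passwords, pairs = set(), set(), set()
    rows = 0
    for line in lines:
        email, sep, password = line.strip().partition(":")
        if not sep or not email:
            continue  # malformed or empty line
        rows += 1
        email = email.strip().lower()  # treat addresses as case-insensitive
        emails.add(email)
        passwords.add(password)
        pairs.add((email, password))
    return {
        "rows": rows,
        "unique_emails": len(emails),
        "unique_passwords": len(passwords),
        "unique_pairs": len(pairs),
    }


if __name__ == "__main__":
    sample = [
        "alice@example.com:hunter2",
        "Alice@Example.com:hunter2",  # duplicate after normalization
        "bob@example.com:hunter2",
        "bob@example.com:letmein",
    ]
    print(summarize(sample))
```

In-memory sets are adequate for a small sample like this one. At the scale of billions of records, an analysis would more plausibly rely on external sorting or a database, which this sketch does not attempt.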

Arrests
According to threat intelligence firm IntSights, Collections #1 through #5 had been compiled by a hacker known as Sanix; however, the data was leaked online by a rival data broker known as Azatej. Both hackers were arrested in May 2020: Azatej in Poland and Sanix in Ukraine.