Directory harvest attack

A directory harvest attack (DHA) is a technique used by spammers to find valid e-mail addresses at a domain by brute force. The attack is usually carried out as a standard dictionary attack, in which the spammer guesses addresses at a domain using different permutations of common usernames. These attacks are more effective against companies, since companies are likely to use a standard format for official e-mail aliases (e.g. jdoe@example.domain, johnd@example.domain, or johndoe@example.domain).

There are two main techniques for generating the addresses that a DHA targets. In the first, the spammer creates a list of all possible combinations of letters and numbers up to a maximum length and then appends the domain name. This is a standard brute force attack, and it is impractical for usernames longer than 5-7 characters. For example, with 26 letters and 10 digits, one would have to try 36^8 (nearly 3 trillion) e-mail addresses to exhaust all 8-character sequences.
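The arithmetic behind that figure can be checked with a short sketch. It assumes an alphabet of lowercase letters and digits (36 symbols); the function names are illustrative only and do not come from any particular tool.

```python
import string

# Assumption: local parts use only lowercase letters and digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def exact_length_count(length, alphabet_size=len(ALPHABET)):
    """Number of distinct sequences of exactly `length` characters."""
    return alphabet_size ** length


def up_to_length_count(max_length, alphabet_size=len(ALPHABET)):
    """Number of distinct sequences of 1 to `max_length` characters."""
    return sum(alphabet_size ** n for n in range(1, max_length + 1))


if __name__ == "__main__":
    for n in (5, 7, 8):
        print(f"{n} characters: {exact_length_count(n):,} sequences")
```

For 8 characters this gives 2,821,109,907,456 sequences, which is the "nearly 3 trillion" quoted above. Each extra character multiplies the count by 36, which is why exhaustive guessing stops being practical after a handful of characters.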

The other, more targeted technique is to create a list that combines common first names, surnames, and initials (as in the example above). This is considered a standard dictionary attack on the usernames of e-mail addresses.

The success of a directory harvest attack relies on the recipient e-mail server rejecting e-mail sent to invalid recipient addresses during the Simple Mail Transfer Protocol (SMTP) session. Any address for which e-mail is accepted is considered valid and is added to the spammer's list (such lists are commonly sold between spammers). Although the attack could instead rely on Delivery Status Notifications (DSNs) sent to the sender address to report delivery failures, directory harvest attacks are unlikely to use a valid sender e-mail address.
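To illustrate how much smaller the dictionary approach makes the search space, the following sketch builds candidate addresses from the three alias formats mentioned earlier (jdoe, johnd, johndoe). The function names and the sample name lists are hypothetical, and real alias conventions vary between organizations; the sketch only constructs strings and assumes non-empty names.

```python
from itertools import product


def candidate_local_parts(first, last):
    """Return the three alias formats for one name pair (names must be non-empty)."""
    first, last = first.lower(), last.lower()
    return [
        first[0] + last,  # initial + surname, e.g. jdoe
        first + last[0],  # first name + surname initial, e.g. johnd
        first + last,     # first name + surname, e.g. johndoe
    ]


def candidate_addresses(first_names, surnames, domain):
    """Combine every first name with every surname, dropping duplicate addresses."""
    seen = set()
    addresses = []
    for first, last in product(first_names, surnames):
        for local in candidate_local_parts(first, last):
            address = f"{local}@{domain}"
            if address not in seen:
                seen.add(address)
                addresses.append(address)
    return addresses


if __name__ == "__main__":
    # Hypothetical sample lists; "example.domain" follows the article's example.
    for address in candidate_addresses(["john", "jane"], ["doe", "smith"], "example.domain"):
        print(address)
```

Two first names and two surnames yield only ten distinct candidates here, because "john" and "jane" share an initial. Even a list of a few thousand common names produces a candidate set many orders of magnitude smaller than the exhaustive 8-character space, which is what makes the dictionary variant the more targeted of the two.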

The message sent to the candidate addresses is usually a short random phrase such as "hello", so as not to trigger a spam filter. The content that is actually being advertised is sent in a later campaign to only the valid e-mail addresses.

One theory is that spammers also use DHAs to disseminate spam, and not just to collect e-mail addresses for a later spam campaign. Used in this way, much like a paper-based leaflet drop, the attack achieves its goal through sheer volume rather than accuracy of delivery. In this case the message would likely contain the content that the spammer is advertising, and not a short random phrase.