Talk:Callback verification/sandbox

Limitations
The documentation for both Postfix and Exim cautions against the use of this technique and mentions many limitations of SMTP callbacks. In particular, there are many situations where it is either ineffective or causes problems for the systems that receive the callbacks.


 * Some regular mail exchangers do not give useful results to callbacks:
 * Servers that reject all bounce mails (contrary to the RFCs). To work around this problem, Postfix, for example, uses either the local postmaster address or a "double-bounce" address in the MAIL FROM part of the callout. This workaround, however, will fail if Bounce Address Tag Validation is used to reduce backscatter. Callback verification can still work if the server rejects all bounces at the DATA stage instead of the earlier MAIL FROM stage, while still rejecting invalid e-mail addresses at the RCPT TO stage rather than moving that check to the DATA stage as well.
 * Servers that accept all e-mail addresses at the RCPT TO stage but reject invalid ones at the DATA stage. This is commonly done in order to prevent directory harvest attacks and will, by design, give no information about whether an e-mail address is valid and thus prevent callback verification from working.
 * Servers that accept all mails during the SMTP dialogue (and generate their own bounces later). This problem can be alleviated by testing a random non-existent address as well as the desired address (if the test succeeds, further verification is useless).
 * Servers that implement catch-all e-mail will, by definition, consider all e-mail addresses to be valid and accept them. As with systems that accept-then-bounce, testing a random non-existent address can detect this.
 * The callback process can cause delays in delivery because the mail server where an address is verified may use slow anti-spam techniques, including "greet delays" (causing a connection delay) and greylisting (causing a verification deferral).
 * If the system being called back to uses greylisting, the callback may return no useful information until the greylisting time has expired. Greylisting works by returning a "temporary failure" (a 4xx response code) when it sees an unfamiliar MAIL FROM/RCPT TO pair of e-mail addresses. A greylisting system may not give a "permanent failure" (a 5xx response code) when given an invalid e-mail address for the RCPT TO, and may instead continue to return a 4xx response code.
 * Some e-mail may be legitimate but not have a valid "envelope from" address due to user error or just misconfiguration. The positive aspect is that the verification process will usually cause an outright rejection, so if the sender was not a spammer but a real user, they will be notified of the problem.
 * A server that receives a lot of spam may do a lot of callbacks. If those addresses are invalid or are spamtraps, the server will look very similar to a spammer who is doing a dictionary attack to harvest addresses. This in turn might get the server blacklisted elsewhere.
 * Every callback places an unasked-for burden on the system being called back to, with very few effective ways for that system to avoid the burden. In extreme cases, if a spammer abuses the same sender address and uses it at a sufficiently diverse set of receiving MXs, all of which use this method, they might all try the callback, overloading the MX for the forged address with requests (effectively a distributed denial-of-service attack).
 * Callback verification has no effect if spammers spoof real e-mail addresses or use the null bounce address.
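
As a rough illustration of the random-address test and the 4xx/5xx distinction described in the list above, the following Python sketch interprets the RCPT TO reply codes of one callback. It is only a sketch under stated assumptions: the function names, the `cbv-probe-` prefix, and the four result labels are hypothetical and are not taken from Postfix or Exim, and the reply codes are assumed to have been collected already by whatever code runs the SMTP dialogue.

```python
import secrets


def random_probe_address(domain: str) -> str:
    """Build an address that almost certainly does not exist at `domain`.

    The "cbv-probe-" prefix is a hypothetical naming choice for this sketch.
    """
    return f"cbv-probe-{secrets.token_hex(8)}@{domain}"


def classify_callback(target_code: int, probe_code: int) -> str:
    """Interpret the RCPT TO reply codes from one callback.

    target_code: reply for the address being verified.
    probe_code:  reply for a random, non-existent address at the same domain.
    Returns one of "valid", "invalid", "deferred", "uninformative"
    (labels invented for this sketch).
    """
    # A server that accepts the random probe accepts everything at RCPT TO
    # (catch-all, accept-then-bounce, or rejection deferred to DATA).
    if 200 <= probe_code < 300:
        return "uninformative"
    # A permanent failure for the target is a usable answer.
    if 500 <= target_code < 600:
        return "invalid"
    # Any temporary failure (for example greylisting) means "try again later".
    if 400 <= target_code < 500 or 400 <= probe_code < 500:
        return "deferred"
    # Target accepted while the probe was permanently rejected.
    if 200 <= target_code < 300 and 500 <= probe_code < 600:
        return "valid"
    return "uninformative"
```

For example, `classify_callback(250, 550)` yields `"valid"`, while `classify_callback(250, 250)` yields `"uninformative"` because the server also accepted the random probe. Note that a greylisting server that keeps answering 4xx for an invalid address will only ever produce `"deferred"` here.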

Several of the above problems are reduced by caching verification results. In particular, systems that give no useful information (because they do not reject at RCPT TO time, have catch-all e-mail, etc.) can be remembered so that no future callbacks to those systems need to be made. Also, results (positive or negative) for specific e-mail addresses can be remembered. MTAs like Exim have caching built in.
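
The caching idea can be sketched in Python as follows. This is a minimal in-memory illustration, not Exim's actual cache: the class name `CallbackCache`, the result labels, and the single one-hour lifetime are assumptions made for the example. Domains that give no useful information are remembered as a whole, per-address results are remembered individually, and temporary failures are not cached at all.

```python
import time


class CallbackCache:
    """Hypothetical in-memory cache of callback verification results."""

    def __init__(self, ttl_seconds: float = 3600.0):
        self.ttl = ttl_seconds          # assumed lifetime for every entry
        self._addresses = {}            # address -> (result, expiry time)
        self._useless_domains = {}      # domain -> expiry time

    @staticmethod
    def _split(address: str):
        # Only the domain is case-insensitive; the local part is kept as-is.
        local, _, domain = address.rpartition("@")
        domain = domain.lower()
        return f"{local}@{domain}", domain

    def store(self, address: str, result: str, now: float = None) -> None:
        now = time.time() if now is None else now
        key, domain = self._split(address)
        if result == "uninformative":
            # Remember the whole domain: further callbacks are pointless.
            self._useless_domains[domain] = now + self.ttl
        elif result in ("valid", "invalid"):
            self._addresses[key] = (result, now + self.ttl)
        # Anything else (e.g. "deferred") is not cached, so it is retried.

    def lookup(self, address: str, now: float = None):
        """Return a cached result, or None if a callback is still needed."""
        now = time.time() if now is None else now
        key, domain = self._split(address)
        expiry = self._useless_domains.get(domain)
        if expiry is not None and expiry > now:
            return "uninformative"
        entry = self._addresses.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]
        return None
```

A caller would consult `lookup()` before starting a callback and call `store()` afterwards. A real deployment would probably want separate lifetimes for positive, negative, and per-domain entries, which this sketch does not model.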