Amavis

Amavis is an open-source content filter for electronic mail, implementing mail message transfer, decoding, some processing and checking, and interfacing with external content filters to provide protection against spam and viruses and other malware. It can be considered an interface between a mailer (MTA, Mail Transfer Agent) and one or more content filters.

Amavis can be used to:
 * detect viruses, spam, banned content types or syntax errors in mail messages
 * block, tag, redirect (using sub-addressing), or forward mail depending on its content, origin or size
 * quarantine (and release), or archive mail messages to files, to mailboxes, or to a relational database
 * sanitize passed messages using an external sanitizer
 * generate DKIM signatures
 * verify DKIM signatures and provide DKIM-based whitelisting

Notable features:
 * provides SNMP statistics and status monitoring using an extensive MIB with more than 300 variables
 * provides structured event log in JSON format
 * IPv6 protocol is supported in interfacing, and IPv6 address forms in mail header section
 * properly honors per-recipient settings even in multi-recipient messages, while scanning a message only once.
 * supports international email (RFC 6530, SMTPUTF8, EAI, IDN)

A common mail filtering installation with Amavis consists of a Postfix as an MTA, SpamAssassin as a spam classifier, and ClamAV as an anti-virus protection, all running under a Unix-like operating system. Many other virus scanners (about 30) and some other spam scanners (CRM114, DSPAM, Bogofilter) are supported, too, as well as some other MTAs.

Interfacing topology
Three topologies for interfacing with an MTA are supported. The amavisd process can be sandwiched between two instances of an MTA, yielding a classical after-queue mail filtering setup, or amavisd can be used as an SMTP proxy filter in a before-queue filtering setup, or the amavisd process can be consulted to provide mail classification but not to forward a mail message by itself, in which case the consulting client remains in charge of mail forwarding. This last approach is used in a Milter setup (with some limitations), or with a historical client program amavisd-submit.

Since version 2.7.0 a before-queue setup is preferred, as it allows for a mail message transfer to be rejected during an SMTP session with a sending client. In an after-queue setup filtering takes place after a mail message has already been received and enqueued by an MTA, in which case a mail filter can no longer reject a message, but can only deliver it (possibly tagged), or discard it, or generate a non-delivery notification, which can cause unwanted backscatter in case of bouncing a message with a fake sender address.

A disadvantage of a before-queue setup is that it requires resources (CPU, memory) proportional to a current (peak) mail transfer rate, unlike an after-queue setup, where some delay is acceptable and resource usage corresponds to average mail transfer rate. With introduction of an option smtpd_proxy_options=speed_adjust in Postfix 2.7.0 the resource requirements for a before-queue content filter have been much reduced.

In some countries the legislation does not permit mail filtering to discard a mail message once it has been accepted by an MTA, so this rules out an after-queue filtering setup with discarding or quarantining of messages, but leaves a possibility of delivering (possibly tagged) messages, or rejecting them in a before-queue setup (SMTP proxy or milter).

Interfacing protocols
Amavis can receive mail messages from an MTA over one or more sockets of protocol families PF_INET (IPv4), PF_INET6 (IPv6) or PF_LOCAL (Unix domain socket), via protocols SMTP, LMTP, or a simple private protocol AM.PDP can be used with a helper program like amavisd-milter to interface with milters. On the output side protocols SMTP or LMTP can be used to pass a message to a back-end MTA instance or to an LDA, or a message can be passed to a spawned process over a Unix pipe. When SMTP or LMTP are used, a session can optionally be encrypted using a TLS STARTTLS (RFC 3207) extension to the protocol. SMTP Command Pipelining (RFC 2920) is supported in client and server code.

Interfacing with SpamAssassin
When spam scanning is enabled, a daemon process amavisd is conceptually very similar to a spamd process of a SpamAssassin project. In both cases forked child processes call SpamAssassin Perl modules directly, hence their performance is similar.

The main difference is in protocols used: Amavis typically speaks a standard SMTP protocol to an MTA, while in the spamc/spamd case an MTA typically spawns a spamc program passing a message to it over a Unix pipe, then the spamc process transfers the message to a spamd daemon using a private protocol, and spamd then calls SpamAssassin Perl modules.

Design priorities
Design priorities of the amavisd-new (from here on just called Amavis) are: reliability, security, adherence to standards, performance, and functionality.

Reliability
With the intention that no mail message could be lost due to unexpected events like I/O failures, resources depletion and unexpected program terminations, the amavisd program meticulously checks a completion status of every system call and I/O operation. Unexpected events are logged if at all possible, and handled with several layers of event handling. Amavis never takes a responsibility for a mail message delivery away from an MTA: the final success status is reported to an MTA only after the message has been passed on to the back-end MTA instance and reception was confirmed. In case of any fatal failures during processing or transferring of a message, the message being processed just stays in a queue of the front-end MTA instance, to be re-tried later. This approach also covers potential unexpected host failures, crashes of the amavisd process or one of its components.

The use of program resources like memory size, file descriptors, disk usage and creation of subprocesses is controlled. Large mail messages are not kept in memory, so the available memory size does not impose a limit on the size of mail messages that can be processed, and memory resources are not wasted unnecessarily.

Security
A great deal of attention is given to security aspects, required by handling potentially malicious, nonstandard or just garbled data in mail messages coming from untrusted sources.

The process which is handling mail messages runs with reduced privileges under a dedicated user ID. Optionally it can run chroot-ed. Risks of buffer overflows and memory allocation bugs is largely avoided by implementing all protocol handling and mail processing in Perl, which handles dynamic memory management transparently. Care is taken that content of processed messages does not inadvertently propagate to the system. Perl provides an additional security safety net with its marking of tainted data originating from the wild, and Amavis is careful to put this Perl feature to good use by avoiding automatic untainting of data (use re "taint") and only untainting it explicitly at strategic points, late in a data flow.

Amavis can use several external programs to enhance its functionality. These are de-archivers, de-compressors, virus scanners and spam scanners. As these programs are often implemented in languages like C or C++, there is a potential risk that a mail message passed to one of these programs can cause its failure or even open a security hole. The risk is limited by running these programs as an unprivileged user ID, and possibly chroot-ed. Nevertheless, external programs like unmaintained de-archivers should be avoided. The use of these external programs is configurable, and they can be disabled selectively or as a group (like all decoders or all virus scanners).

Performance
Despite being implemented in an interpreted programming language Perl, Amavis itself is not slow. The good performance of the functionality implemented by Amavis itself (not speaking of external components) is achieved by dealing with data in large chunks (e.g. not line-by-line), by avoiding unnecessary data copying, by optimizing frequently traversed code paths, by using suitable data structures and algorithms, as well as by some low-level optimizations. Bottlenecks are detected during development by profiling code and by benchmarking. Detailed timing report in the log can help recognize bottlenecks in a particular installation.

Certain external modules or programs like SpamAssassin or some command-line virus scanners can be very slow, and using these would constitute a vast majority of elapsed time and processing resources, making resources used by Amavis itself proportionally quite small.

Components like external mail decoders, virus scanners and spam scanners can each be selectively disabled if they are not needed. What remains is functionality implemented by Amavis itself, like transferring mail message from and to an MTA using an SMTP or LMTP protocol, checking mail header section validity, checking for banned mail content types, verifying and generating DKIM signatures.

As a consequence, mail processing tasks like DKIM signing and verification (with other mail checking disabled) can be exceptionally fast and can rival implementations in compiled languages. Even full checks using a fast virus scanner but with spam scanning disabled can be surprisingly fast.

Adherence to standards
Implementation of protocols and message structures closely follows a set of applicable standards such as RFC 5322, RFC 5321, RFC 2033, RFC 3207, RFC 2045, RFC 2046, RFC 2047, RFC 3461, RFC 3462, RFC 3463, RFC 3464, RFC 4155, RFC 5965, RFC 6376, RFC 5451, RFC 6008, and RFC 4291. In several cases some functionality was re-implemented in the Amavis code even though a public (CPAN) Perl module exists, but lacks attention to detail in following a standard or lacks sufficient checking and handling of errors.

License
Amavis is licensed under a GPLv2 license. This applies to the current code, as well as to historical branches. An exception to this are some of the supporting programs (like monitoring and statistics reporting), which are covered by a New BSD License.

The project
The project started in 1997 as a Unix shell script to detect and block e-mail messages containing a virus. It was intended to block viruses at the MTA (mail transfer agent) or LDA (local delivery) stage, running on a Unix-like platform, complementing other virus protection mechanisms running on end-user personal computers.

Next the tool was re-implemented as a Perl program, which later evolved into a daemonized process. A dozen of developers took turns during the first five years of the project, developing several variants while keeping a common goal, the project name and some of the development infrastructure.

Since December 2008 (until 2018-10-09) the only active branch was officially amavisd-new, which was being developed and maintained by Mark Martinec since March 2002. This was agreed between the developers at the time in a private correspondence: Christian Bricart, Lars Hecking, Hilko Bengen, Rainer Link and Mark Martinec. The project name Amavis is largely interchangeable with the name of the amavisd-new branch.

Much functionality has been added through the years, like adding protection against spam and other unwanted content, besides the original virus protection. The focus is kept on reliability, security, adherence to standards and performance.

A domain amavis.org in use by the project was registered in 1998 by Christian Bricart, one of the early developers, who is still maintaining the domain name registration. The domain is now entirely dedicated to the only active branch. The project mailing list was moved from SourceForge to amavis.org in March 2011, and is hosted by Ralf Hildebrandt and Patrick Ben Koetter. The project web page and the main distribution site was located at the Jožef Stefan Institute, Ljubljana, Slovenia (until the handover in 2018), where most of the development was taking place between years 2002 and 2018.

Change of Project Leaders Announcement
On October 9 of 2018 Mark Martinec announced at the general support and discussion mailing list his retirement from the project and also that Patrick Ben Koetter will continue as new project leader.

"I know Ben personally, he is one of the two authors of The Book of Postfix, and uses Amavis in his professional life too, so I think the project will be in good hands."

- Mark Martinec

After that Patrick notified the migration of the source code to a public GitLab repository and his plan for the next steps regarding the project development.

Branches and the project name
Through the history of the project the name of the project or its branches varied somewhat. Initially the spelling of the project name was AMaViS (A Mail Virus Scanner), introduced by Christian Bricart. With a rewrite to Perl the name of the program was Amavis-perl. Daemonized versions were initially distributed under a name amavisd-snapshot and then as amavisd. A modular rewrite by Hilko Bengen was called Amavis-ng.

In March 2002 the amavisd-new branch was introduced by Mark Martinec, initially as a patch against amavisd-snapshot-20020300. This later evolved into a self-contained project, which is now the only surviving and actively maintained branch. Nowadays a project name is preferably spelled Amavis (while the name of the program itself is amavisd). The name Amavis is now mostly interchangeable with amavisd-new.