Email Filtering With Sieve

Sieve, defined in RFC 5228, is a programming language built expressly for filtering email messages. On a Sieve-enabled server, it can do a lot of work.

Sieve is a relatively simple language: it’s not Turing-complete (meaning you cannot perform arbitrary calculations), there are (almost) no looping or function-calling constructs, and variables aren’t in the base spec; they’re an extension. Sieve is meant to be completely safe from a server-side point of view. Since the entire purpose is running user-submitted code server-side, the language was built so that a script cannot reasonably harm the server it’s running on.

Here is an example Sieve script:

require "fileinto";

if address :is "From" "[email protected]" {
    fileinto "Junk";
}

This can be read as “If the message in question is from [email protected], then put it in my Junk folder.” (If no action is explicitly specified, an implicit keep is executed, which saves the message to your INBOX, untouched.)

You can see that there’s a way to request extensions (require), one major flow control construct (if / elsif / else), some ways of matching pieces of data, and some ways of controlling where the message ends up. There are also actions for deleting the message completely (discard), responding with another message, and a few others. My server, for example, handles email forwarding and out-of-office automated replies using some Sieve scripts of its own. But if you look, there’s not much you can actually do with a script like this, which makes it perfect.
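To make the flow control and actions concrete, here is a minimal sketch that combines if / elsif / else with discard, fileinto, and an out-of-office reply. The addresses and folder name are hypothetical placeholders, and it assumes the server supports the standard vacation extension:

```sieve
require ["fileinto", "vacation"];

# Hypothetical addresses and folder names, for illustration only.
if address :is "from" "spammer@example.com" {
    # Drop the message silently; this cancels the implicit keep.
    discard;
} elsif address :domain :is "from" "lists.example.org" {
    # File into a folder instead of the INBOX.
    fileinto "Lists";
} else {
    # Send an automated reply at most once every 7 days per sender.
    # The original message still gets the implicit keep.
    vacation :days 7 :subject "Out of office"
        "I am away and will reply when I return.";
}
```

Only the first matching branch runs, so the order of the tests decides which rule wins.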

Sieve scripts are, when implemented correctly, executed when an email is received for a user, meaning they run after the message has been received and accepted, but before it’s delivered to your INBOX. Scripts may check various addresses, arbitrary header contents, the email body (roughly), and a few meta-variables like total message size. Note that a user may only have one active script (meaning, one executed on each new message) at a time, but may have multiple scripts stored on the server at once. A later extension (include, RFC 6609) allows one script to call another, meaning you can separate actions into separate files. Apart from the in-progress email message, the only data the scripts can share is global variables, and only when the variables extension is also in use.
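As a rough illustration of those kinds of checks, the sketch below tests message size, a header, and the body text. The folder names and match strings are made up for the example, and the body test assumes the server offers the body extension (RFC 5173):

```sieve
require ["fileinto", "body"];

# Meta-information: total message size (K, M and G suffixes are allowed).
if size :over 10M {
    fileinto "Large";
}
# Arbitrary header contents.
elsif header :contains "Subject" "[build-failure]" {
    fileinto "CI";
}
# Rough check of the decoded body text (body extension).
elsif body :text :contains "unsubscribe" {
    fileinto "Newsletters";
}
# Anything else falls through to the implicit keep (INBOX).
```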

The Managesieve protocol

Managesieve (RFC 5804), which runs on TCP port 4190 or the legacy port 2000, is a network protocol for managing Sieve scripts on a remote host. It supports SASL authentication and STARTTLS. Managesieve allows you to perform some basic actions:

  • Upload an entire script to the server
  • Download an entire script from the server
  • Delete a script file
  • Check a script for errors
  • Mark a script as active

And that is about it; it’s pretty rudimentary. The GETSCRIPT, PUTSCRIPT, and CHECKSCRIPT commands all transfer a script as just… a single string containing its entire contents, meaning that any change requires uploading the whole thing over the old one. There are no fancy techniques like sending diffs. If a mail server doesn’t offer some way of managing Sieve scripts in its web interface, or the interface is limited to less than the full power of Sieve, then Managesieve can be used with a capable client (like the Thunderbird extension-turned-standalone) to manage your scripts that way.

Extensions

As I mentioned, Sieve is extensible, and it has been extended over many RFCs. A script uses the require keyword to declare each extension it depends on. By default, the Sieve interpreter should assume no extensions.
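For instance, variables only become available once a script requires the variables extension (RFC 5229). The sketch below assumes plus-addressed recipients such as user+shopping@example.com; the Tags/ folder prefix is a hypothetical choice, and the hierarchy separator depends on the server:

```sieve
require ["fileinto", "variables"];

# ":matches" wildcards are captured into ${1}, ${2}, ...
# For the local part "user+shopping", ${2} holds the text after the "+".
if address :localpart :matches "to" "*+*" {
    # Lower-case the captured tag and store it in a variable.
    set :lower "tag" "${2}";
    # Variables are expanded inside strings.
    fileinto "Tags/${tag}";
}
```

Without the require line, a compliant interpreter should reject the script rather than guess at what set means.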

Some extensions are implementation-specific. Dovecot, for example, has one that lets you pipe email messages to an external executable program, which sounds like a security issue… until you realize that it can only run pre-defined executables in a folder that users shouldn’t have access to, again meaning that anything “unsafe” is mitigated server-side. For example, here’s the script I use for a test automated email, named special-funcs:

require ["include", "vnd.dovecot.pipe"];

# Filter list for automated "bot" addresses.
# It's expected that a service will be pipe'd, and is responsible
# for sending its own response back, if any.
# (Otherwise sieve would add a bunch of other unnecessary headers)

# rule:[Quota check service]
if address :localpart :is "to" "quota-check"
{
  pipe :try "check_quota";
}

# Discard all original messages (I don't need them)
discard;
stop;

And in my active script, all I need to say is this:

if allof (address :contains "to" "[email protected]", address :contains "from" "@mail.tdstoragebay.com")
{
	include :personal "special-funcs";
	stop;
}

So when, say, [email protected] sends an email to [email protected], the active script passes the message off to special-funcs. That script checks whether the local part of the To address is quota-check and, if so, executes the program “check_quota” by piping the entire email message in, headers and all, on STDIN. The :try means that it’s not an error if the program returns a failure code; the script just ignores it and carries on. The reason for special-funcs is that I can have one script with the pipe extension that knows which addresses relate to which executables, while the main script only needs to know “if it matches any of these, let this other script handle it.” Without that split, the main script would likely get rather large.
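To show how that one script could grow as more services are added, here is a sketch of special-funcs with a second mapping. The status-check address and the check_status program are hypothetical names invented for this example; only quota-check and check_quota come from the script above:

```sieve
require ["vnd.dovecot.pipe"];

# Each branch maps one bot address to one pre-defined executable.
if address :localpart :is "to" "quota-check" {
    pipe :try "check_quota";
}
# Hypothetical second service, for illustration only.
elsif address :localpart :is "to" "status-check" {
    pipe :try "check_status";
}

# The services send their own replies, so drop the original message.
discard;
stop;
```

The active script stays unchanged as long as new bot addresses still match its existing test.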

IMAPSieve

One extension, to not only Sieve but also IMAP, is… interesting. The IMAPSieve extension (RFC 6785) defines a new IMAP capability, IMAPSIEVE, that actually takes an argument: a sieve:// URL identifying the Managesieve endpoint that manages Sieve scripts for that server. From there, it depends on the IMAP METADATA extension: the metadata item /shared/imapsieve/script on a particular mailbox contains the name of the Sieve script to execute when a message is added to that mailbox with IMAP APPEND or COPY, or when flags change. The Sieve script needs to require the imapsieve extension, and some potentially relevant data is exposed through the environment extension… so yeah, there are a lot of dependencies here, but the benefit is that you can run scripts on IMAP actions and not just mail delivery. One small issue is that… no client that I’ve seen actually supports this, so you’d have to set it up manually by typing raw IMAP commands, but it’s possible nonetheless.
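A script attached to a mailbox this way might look like the following sketch. It assumes the server supports imapsieve together with the environment and imap4flags extensions, and it uses the imap.cause environment item that RFC 6785 defines; the choice to mark copied messages as read is purely illustrative:

```sieve
require ["imapsieve", "environment", "imap4flags"];

# imap.cause says which IMAP action triggered the script
# (APPEND, COPY, or a flag change).
if environment :is "imap.cause" "COPY" {
    # Mark messages copied into this mailbox as read.
    addflag "\\Seen";
}
# No other action is taken, so the message stays where it is.
```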

In Summary

If your email provider supports it, Sieve is an easy way of creating anything from very simple to very complex filtering for your emails. Scripts can modify a message’s flags, headers, or destination mailbox, or just discard it completely, which gives you a nice way of organizing your emails or just… completely removing spam from known bad addresses if you feel like it. It’s a cool feature that’s not offered in many places, and once you get used to the language, it’s very useful if you have any level of serious email usage.