Comparing IMAP and POP

If you’ve ever set up email on your phone for anything beyond Gmail, you’ve likely been asked how you want to access the account, POP or IMAP, and were likely told that the difference is that “IMAP keeps messages on the server.” Let’s go over the specific differences to get a more complete picture of what each protocol is like.

IMAP

We’ll start with the more common of the two, the Internet Message Access Protocol, or IMAP. The current version, as advertised by servers, is IMAP4rev1. IMAP is a very extensible protocol made up of commands and responses. A client can send commands like LOGIN, SELECT, LIST, and FETCH to an IMAP server, which responds in turn. Neither side has to wait for the other to finish: each client command begins with a “tag”, an alphanumeric string with no meaning of its own (often called an ‘opaque identifier’). When the server replies to a specific command, it echoes that tag back in its response, so the client knows which command the response refers to. Because of this, clients may send further commands before the server has finished processing and replying to earlier ones. Additionally, servers may send “untagged responses”, which begin with * and usually carry a number and a keyword indicating what they are, as updates to the client. Some commands cause untagged responses, but the server may also send them on its own, without the client asking.

Here’s an example IMAP exchange, from the RFC:

S: lines come from the server; C: lines come from the client.

S:   * OK IMAP4rev1 Service Ready
C:   a001 LOGIN mrc secret
S:   a001 OK LOGIN completed
C:   a002 SELECT INBOX
S:   * 18 EXISTS
S:   * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
S:   * 2 RECENT
S:   * OK [UNSEEN 17] Message 17 is the first unseen message
S:   * OK [UIDVALIDITY 3857529045] UIDs valid
S:   a002 OK [READ-WRITE] SELECT completed
C:   a003 FETCH 12 FULL
S:   * 12 FETCH (FLAGS (\Seen) INTERNALDATE "17-Jul-1996 02:44:25 -0700"
      RFC822.SIZE 4286 ENVELOPE ("Wed, 17 Jul 1996 02:23:25 -0700 (PDT)"
      "IMAP4rev1 WG mtg summary and minutes"
      (("Terry Gray" NIL "gray" "cac.washington.edu"))
      (("Terry Gray" NIL "gray" "cac.washington.edu"))
      (("Terry Gray" NIL "gray" "cac.washington.edu"))
      ((NIL NIL "imap" "cac.washington.edu"))
      ((NIL NIL "minutes" "CNRI.Reston.VA.US")
      ("John Klensin" NIL "KLENSIN" "MIT.EDU")) NIL NIL
      "<B27397-0100000@cac.washington.edu>")
       BODY ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 3028
       92))
S:    a003 OK FETCH completed
C:    a004 FETCH 12 BODY[HEADER]
S:    * 12 FETCH (BODY[HEADER] {342}
S:    Date: Wed, 17 Jul 1996 02:23:25 -0700 (PDT)
S:    From: Terry Gray <gray@cac.washington.edu>
S:    Subject: IMAP4rev1 WG mtg summary and minutes
S:    To: imap@cac.washington.edu
S:    cc: minutes@CNRI.Reston.VA.US, John Klensin <KLENSIN@MIT.EDU>
S:    Message-Id: <B27397-0100000@cac.washington.edu>
S:    MIME-Version: 1.0
S:    Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
S:
S:    )
S:    a004 OK FETCH completed
C:    a005 STORE 12 +FLAGS \deleted
S:    * 12 FETCH (FLAGS (\Seen \Deleted))
S:    a005 OK +FLAGS completed
C:    a006 LOGOUT
S:    * BYE IMAP4rev1 server terminating connection
S:    a006 OK LOGOUT completed

Note the * lines from the server, which are the ‘untagged responses.’ Additionally, data that spans multiple lines (a ‘literal’) is prefixed with {n}, its exact byte count. (The large response to FETCH 12 FULL has line breaks for readability; in the protocol it’s one long line, hence no byte count.)
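To make the tags, the untagged * lines, and the {n} byte counts a bit more concrete, here’s a minimal sketch of how a client might sort incoming server lines. The function name classify and the pending_tags set are my own inventions for illustration, and a real client has to handle far more than this (it would also need to read the n bytes that follow a {n} marker).

```python
import re

# A trailing {n} announces that exactly n bytes of data follow the line.
LITERAL_RE = re.compile(r"\{(\d+)\}$")


def classify(line, pending_tags):
    """Classify one server line as 'untagged' or 'tagged'.

    pending_tags is the set of tags for commands the client has sent
    but not yet seen completed (a hypothetical bookkeeping structure).
    Returns (kind, tag_or_None, announced_byte_count_or_None).
    """
    line = line.rstrip("\r\n")
    first, _, _rest = line.partition(" ")
    match = LITERAL_RE.search(line)
    size = int(match.group(1)) if match else None
    if first == "*":
        return ("untagged", None, size)
    if first in pending_tags:
        return ("tagged", first, size)
    raise ValueError(f"response for a tag we never sent: {first!r}")


# Lines taken from the exchange above.
pending = {"a002", "a004"}
for raw in [
    "* 18 EXISTS",
    "a002 OK [READ-WRITE] SELECT completed",
    "* 12 FETCH (BODY[HEADER] {342}",
]:
    print(classify(raw, pending))
```

Because every completion carries its tag, the client can have a002 and a004 in flight at the same time and still tell the replies apart.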

IMAP is a complex protocol. It supports multiple ‘folders’, messages may carry arbitrary “flags” such as seen, deleted, or junk (some of which are standardized), and messages come with a lot of metadata, as the FETCH FULL response shows. IMAP is also, as stated, extensible, meaning extra commands and behaviors can be added to the protocol later. It supports standard SASL authentication methods (whichever ones the server offers) and is a fairly stateful protocol that expects the client to keep track of many pieces of data at once.

POP

POP, or Post Office Protocol, current version POP3, is an older, simpler protocol for the same purpose of accessing your mail. POP operates in lock-step: each client command gets one server response, and the client must wait for each response before sending a new command. POP wasn’t originally extensible and had no concept of advanced login mechanisms like SASL, though it did have its own, called APOP, that I’ll get to in a second. Here’s an example exchange:

S: <wait for connection on TCP port 110>
C: <open connection>
S:    +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>
C:    APOP mrose c4c9334bac560ecc979e58001b3e22fb
S:    +OK mrose's maildrop has 2 messages (320 octets)
C:    STAT
S:    +OK 2 320
C:    LIST
S:    +OK 2 messages (320 octets)
S:    1 120
S:    2 200
S:    .
C:    RETR 1
S:    +OK 120 octets
S:    <the POP3 server sends message 1>
S:    .
C:    DELE 1
S:    +OK message 1 deleted
C:    RETR 2
S:    +OK 200 octets
S:    <the POP3 server sends message 2>
S:    .
C:    DELE 2
S:    +OK message 2 deleted
C:    QUIT
S:    +OK dewey POP3 server signing off (maildrop empty)
C:  <close connection>
S:  <wait for next connection>

Unlike IMAP, POP has no concept of folders, messages carry little metadata, if any, and the overall protocol is much simpler to read and handle. In IMAP, your main mailbox, the one that mail goes to by default, is called “INBOX”. POP has no such concept; there is just the mail you have. There’s very little extensibility, there are few advanced commands because “get message” tells you everything the server knows, and where IMAP gives you a large number of search options, POP gives you none. Additionally, a fun fact that not much documentation gets right: POP does not delete messages off the server once you read them.

Now if you’re keen-eyed (and know the protocol), you’ll have noticed that the exchange used APOP, or Authenticated POP, which was originally the only ‘secure’ authentication mechanism available; otherwise you had the plain-text USER and PASS commands (I hope you’re going over TLS!). APOP works by the server sending a timestamp (like <1896.697170952@dbc.mtview.ca.us>) in its greeting, which the client hashes together with the user’s password. The digest is therefore different on every connection, and the password itself is never transmitted over the wire. However, this didn’t do much good. The hash algorithm chosen was plain MD5, which we know today is completely broken, and most of the hashed input, the timestamp, is sent in plain text and is required to come at the beginning of the hashed contents. With that example, you’d hash <1896.697170952@dbc.mtview.ca.us>tanstaaf to get an MD5 digest of c4c9334bac560ecc979e58001b3e22fb to send in the APOP command. What this really means is that the computation required to crack a user’s password is the same as a usual brute-force attack, at the cost of a single MD5 per guess. It’s better than reading the password right off the wire, but anyone who captures a digest can recover the password with an offline brute-force check.
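Here’s a small sketch of both halves of that argument: computing an APOP digest, and guessing a password from a captured one. The timestamp and secret are the ones from the RFC 1939 example; the function names, the word list, and the choice of UTF-8 for encoding the password are my own assumptions, not anything the protocol dictates.

```python
import hashlib


def apop_digest(timestamp, password):
    """Hex MD5 of the greeting timestamp immediately followed by the password."""
    data = (timestamp + password).encode("utf-8")  # encoding is an assumption
    return hashlib.md5(data).hexdigest()


def dictionary_attack(timestamp, captured_digest, candidates):
    """Return the first candidate whose digest matches, or None."""
    for candidate in candidates:
        if apop_digest(timestamp, candidate) == captured_digest:
            return candidate
    return None


# Timestamp and shared secret from the RFC 1939 example.
timestamp = "<1896.697170952@dbc.mtview.ca.us>"
digest = apop_digest(timestamp, "tanstaaf")
print(digest)  # compare against the digest sent in the APOP command above

# An eavesdropper sees both the timestamp and the digest on the wire,
# so each guess costs exactly one MD5.
wordlist = ["password", "letmein", "tanstaaf"]  # hypothetical guesses
print(dictionary_attack(timestamp, digest, wordlist))
```

Nothing here relies on MD5’s known weaknesses. The per-connection timestamp only stops replaying a captured digest; it does nothing to slow down guessing.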

Anyways, the interesting part about POP is that, to delete a message, you use the DELE command to mark it as deleted. Messages aren’t actually removed until the QUIT command is given. You have to explicitly ask the server to delete a message, which contradicts what nearly every email client will tell you: that you cannot keep messages on the server when you use POP. This is not strictly true, but the idea with POP is that a client connects, downloads all its mail, and deletes it from the server. It’s sort of like, well, a post office: you enter, take all your mail, and now the mail is with you and not in the post office anymore. POP is really good for clients with intermittent connections, since all messages are downloaded locally, you can view them at any time, and you only need a connection to grab new ones. Still, I find it interesting that what most sources tell you is factually inaccurate. It might be true for their implementation, but not for the POP protocol as a whole.
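As a rough illustration that deleting is the client’s choice, here’s a sketch using Python’s standard poplib module. The helper names (connect, download_all), the delete_after parameter, and the host and credentials in the comment are all made up for the example.

```python
import poplib


def connect(host, user, password):
    """Open a TLS-wrapped POP3 session and log in with USER/PASS."""
    conn = poplib.POP3_SSL(host)
    conn.user(user)
    conn.pass_(password)
    return conn


def download_all(conn, delete_after=False):
    """Retrieve every message from a logged-in POP3 connection.

    DELE is only sent when delete_after is True; marked messages are
    removed by the server once QUIT is processed.
    """
    messages = []
    _resp, listing, _octets = conn.list()
    for entry in listing:
        number = int(entry.split()[0])  # each entry looks like b"1 120"
        _resp, lines, _octets = conn.retr(number)
        messages.append(b"\r\n".join(lines))
        if delete_after:
            conn.dele(number)
    conn.quit()
    return messages


# Hypothetical usage; the host and credentials are placeholders:
# mail = download_all(connect("pop.example.com", "mrose", "tanstaaf"))
```

With delete_after left at False, the maildrop looks exactly the same the next time any client connects.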

In comparison, with IMAP, you can “delete” a message by adding the \Deleted flag to it (or, more commonly, by moving it to a ‘Trash’ mailbox or the like), but the message usually isn’t truly removed until an EXPUNGE command is given. However, IMAP expects that you won’t delete every message after you read it, because with IMAP your emails stay on the server and don’t take up local space. You may need an active connection to read or modify anything, but everything is on the server, which gives IMAP another benefit: multiple clients can work just fine. With POP’s approach, when one client downloads and deletes, the rest can’t view those messages. With IMAP, since messages aren’t deleted until the user asks, all clients can view the user’s messages. And that requirement of a connection to view messages goes away if the client downloads messages locally, keeps track of changes, and applies those changes the next time it connects. This is, to the best of my knowledge, what the Gmail app on your phone does.
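The two-step delete might look roughly like this with Python’s standard imaplib module. The helper names (open_inbox, delete_message) and the expunge parameter are hypothetical; store and expunge are the library’s wrappers for the STORE and EXPUNGE commands.

```python
import imaplib


def open_inbox(host, user, password):
    """Log in over TLS and select INBOX read-write."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select("INBOX")
    return conn


def delete_message(conn, message_number, expunge=True):
    """Add the \\Deleted flag to one message, then optionally EXPUNGE.

    Until EXPUNGE runs, the message is still in the mailbox and every
    other client can still see it, just with the flag set.
    """
    conn.store(str(message_number), "+FLAGS", "\\Deleted")
    if expunge:
        conn.expunge()


# Hypothetical usage; the host and credentials are placeholders:
# conn = open_inbox("imap.example.com", "mrc", "secret")
# delete_message(conn, 12, expunge=False)  # flagged, but not yet removed
```

Passing expunge=False mirrors the STORE 12 +FLAGS \deleted step in the exchange earlier, where the message is flagged and nothing more.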

Conclusion

IMAP, today, is really the better protocol of the two, both feature-wise and security-wise, and most clients and servers are built on the assumption of IMAP-level access, with POP available as more of a compatibility feature. Unless you’re only using one device, or need your messages to stay on the server for as short a time as possible, there’s no reason to use POP at this point, in my opinion.