VPN Protocols Explained

2021-06-28 9 min read Networking Protocols explained Tech explained Teknikal_Domain

So I’ve talked a little bit about VPNs before, but I didn’t go into too many details. Here’s a look into the top 5 common VPN protocols, and how they work: SSTP, L2TP(/IPsec), IPsec, WireGuard, and PPTP.

SSTP (Secure Socket Tunneling Protocol)

AKA Microsoft’s version of what most people picture when they hear “VPN”: a client tunnel wrapped in TLS, the same general idea behind OpenVPN (which uses its own protocol, not SSTP) and many others. SSTP is primarily a client VPN, not site-to-site. SSTP is also, if you can’t guess by the name, fundamentally based on TLS (the successor to SSL, the Secure Sockets Layer) to provide the security, and it runs over TCP port 443 so it looks like ordinary HTTPS traffic. SSTP also allows for client authentication through the Extensible Authentication Protocol (EAP), CHAP, client certificate validation, and many others. Because the underlying security mechanism is TLS, the strength of the tunnel is entirely up to the TLS version and selected cipher.

SSTP transports raw Point-to-Point Protocol (PPP) data across a private tunnel. PPP, which will come up more than once, is a layer 2 protocol (like Ethernet) that provides a direct communication link between two hosts, with no networking in between. In this case, it’s connecting the client device to the VPN endpoint ‘directly’ from one end of the tunnel to the other, and by being a layer 2 protocol, it has a lot of flexibility in what it can carry as VPN traffic.

SSTP packets come in two flavors: control and data. Besides that, there’s very little actual overhead in the SSTP protocol itself. Most of what you experience is up to the performance of the underlying network, and the TLS libraries in use.
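To make the control/data split concrete, here is a minimal sketch of reading the 4-byte SSTP header. It assumes the layout from Microsoft's MS-SSTP specification as I understand it: a version byte, a flags byte whose lowest bit marks control packets, and a 16-bit field whose low 12 bits hold the packet length. The `SstpHeader` and `parse_sstp_header` names are made up for illustration.

```python
import struct
from dataclasses import dataclass


@dataclass
class SstpHeader:
    major: int
    minor: int
    is_control: bool
    length: int  # length of the whole SSTP packet, header included


def parse_sstp_header(data: bytes) -> SstpHeader:
    """Parse the fixed 4-byte SSTP header (assumed layout, see lead-in)."""
    if len(data) < 4:
        raise ValueError("an SSTP header is 4 bytes long")
    version, flags, length_field = struct.unpack("!BBH", data[:4])
    return SstpHeader(
        major=version >> 4,             # upper nibble of the version byte
        minor=version & 0x0F,           # lower nibble of the version byte
        is_control=bool(flags & 0x01),  # C bit: 1 = control, 0 = data
        length=length_field & 0x0FFF,   # top 4 bits are reserved
    )


# Hypothetical sample: version 1.0, control packet, 14 bytes long.
print(parse_sstp_header(bytes([0x10, 0x01, 0x00, 0x0E])))
```

Four bytes of framing per packet is why the header itself barely registers next to the TLS and TCP overhead underneath it.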

L2TP (Layer 2 Tunneling Protocol)

Commonly paired with IPsec, and thus called L2TP/IPsec, L2TP is a protocol that also encapsulates and transports PPP data. L2TP by itself provides no encryption or security, hence the pairing with IPsec. An active L2TP connection is called a tunnel, and either endpoint may create multiple separate sessions within one tunnel.

Like SSTP, L2TP packets are either “control” or “data”. And unlike SSTP, which runs over TCP (port 443), L2TP normally runs over UDP (port 1701). Also, the L2TP header is a little more substantial than the SSTP header, having a few more fields like tunnel and session IDs and control message sequence numbers, but it’s not that large otherwise.
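For comparison, here is a rough sketch of parsing an L2TPv2 header, assuming the field layout from RFC 2661: a 16-bit flags/version word, then optional length, tunnel ID, session ID, optional sequence numbers, and optional offset padding. The function name and the returned dictionary keys are my own choices, not anything standard.

```python
import struct


def parse_l2tp_header(data: bytes) -> dict:
    """Parse an L2TPv2 header (layout assumed from RFC 2661)."""
    (flags,) = struct.unpack_from("!H", data, 0)
    offset = 2
    hdr = {
        "is_control": bool(flags & 0x8000),  # T bit: 1 = control message
        "version": flags & 0x000F,           # 2 for L2TPv2
    }
    if flags & 0x4000:  # L bit: a length field is present
        (hdr["length"],) = struct.unpack_from("!H", data, offset)
        offset += 2
    hdr["tunnel_id"], hdr["session_id"] = struct.unpack_from("!HH", data, offset)
    offset += 4
    if flags & 0x0800:  # S bit: Ns/Nr sequence numbers are present
        hdr["ns"], hdr["nr"] = struct.unpack_from("!HH", data, offset)
        offset += 4
    if flags & 0x0200:  # O bit: offset size field plus padding
        (pad,) = struct.unpack_from("!H", data, offset)
        offset += 2 + pad
    hdr["header_len"] = offset
    return hdr


# Hypothetical control message: T, L and S bits set, version 2.
control = struct.pack("!HHHHHH", 0xC802, 12, 1, 0, 0, 0)
print(parse_l2tp_header(control))
```

A control message carries the full 12 bytes, while a bare data message can get away with 6, which is the "a little more substantial, but not that large" part.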

The main advantage to using L2TP/IPsec over plain IPsec is that L2TP can carry more than just IP traffic. On top of that, if you’re using IKEv1, plain IPsec can only carry the same version of the Internet Protocol that’s used for the IPsec tunnel to begin with. And, when using L2TP with IPsec, IPsec handles the authentication and encryption, meaning you get the rather large array of authentication options that IPsec has to choose from.

IPsec (Internet Protocol Security)

The complicated one. IPsec offers two protocols, AH and ESP. AH, or Authentication Header, provides no encryption, but authenticates packets going in both directions, and checksums the header values so they can’t be modified. ESP, or Encapsulating Security Payload, provides encryption. IPsec is fully capable of being a site-to-site style VPN, and the protocol itself isn’t really specific to VPNs like the previous two were.

An IPsec tunnel is described by a “security association” (SA).¹ IPsec associations can run in one of two modes: transport mode and tunnel mode. In transport mode, the original IP packet is modified with the extra AH or ESP data, checksummed, and sent. However, this means that transport mode plays badly with NAT (AH in particular), since NAT changes the header data and invalidates that checksum. In tunnel mode, the entire original IP packet is encapsulated within another IP packet, containing the IPsec information. Therefore, ESP in tunnel mode is the usual choice for VPNs.

Because of the specifics of IPsec, we might also need another layer of encapsulation for NAT traversal, or NAT-T. IPsec doesn’t use a transport protocol like TCP or UDP; ESP and AH are their own IP protocol numbers (50 and 51) carried directly in raw IP packets. Besides NAT messing with things, that oddity can cause issues with equipment that only expects TCP and UDP. NAT-T wraps the entire payload (again!) in a UDP packet destined for port 4500. This outer UDP header isn’t covered by the IPsec integrity check, meaning that it can be modified by NAT, and the packet can be sent across network equipment that’s not expecting the different IP protocol identifiers.
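The layering is easier to see written out. This is only an illustrative model of the packet layout for ESP in each mode, with and without NAT-T; the `esp_layers` function and the layer labels are invented for this sketch and nothing here builds real packets.

```python
def esp_layers(mode: str, nat_t: bool = False) -> list[str]:
    """Return the on-the-wire layer order for an ESP packet, outermost first."""
    inner = ["TCP/UDP header", "application data"]
    if mode == "transport":
        # The original IP header is kept and ESP is slotted in after it.
        layers = ["IP header (original)", "ESP header", *inner, "ESP trailer + ICV"]
    elif mode == "tunnel":
        # The whole original packet, IP header included, becomes the payload.
        layers = [
            "IP header (new, outer)",
            "ESP header",
            "IP header (original)",
            *inner,
            "ESP trailer + ICV",
        ]
    else:
        raise ValueError("mode must be 'transport' or 'tunnel'")
    if nat_t:
        # NAT-T adds a UDP header between the outer IP header and ESP.
        layers.insert(1, "UDP header (port 4500)")
    return layers


for layer in esp_layers("tunnel", nat_t=True):
    print(layer)
```

With `nat_t=True` the outermost two layers are plain IP and UDP, which is all a NAT device needs to understand in order to rewrite addresses and ports.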

IPsec tunnels are set up using two protocols. The main one is Internet Key Exchange (IKE), which is used within the larger framework of the Internet Security Association and Key Management Protocol (ISAKMP) and communicates on UDP port 500. IKE negotiations happen in two phases, named, well, phase 1 and phase 2.

Phase 1

Phase 1 authenticates the endpoints, and then performs a Diffie-Hellman key exchange, to generate a shared secret for phase 2. At this point, the two peers are capable of negotiating securely. This takes place, usually, with three exchanges:

The first selects a mutual encryption and hash algorithm
The second performs the key exchange
The third (now able to be encrypted) verifies the endpoint’s identity, before phase 2 begins.
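The key exchange in the second step can be sketched in a few lines. This is a toy Diffie-Hellman with a deliberately tiny 32-bit prime so the numbers stay readable; real IKE uses standardized groups with far larger parameters, and the constants and function names here are just for illustration.

```python
import secrets

# Toy parameters only: 2**32 - 5 is prime, but far too small to be secure.
P = 0xFFFFFFFB
G = 5


def dh_keypair() -> tuple[int, int]:
    """Pick a random private value and derive the public value to send."""
    private = secrets.randbelow(P - 3) + 2  # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def dh_shared(my_private: int, their_public: int) -> int:
    """Combine our private value with the peer's public value."""
    return pow(their_public, my_private, P)


# Each peer generates a key pair and sends only the public half.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()

# Both sides arrive at the same secret without ever transmitting it.
print(dh_shared(a_priv, b_pub) == dh_shared(b_priv, a_pub))
```

The shared value is what lets the third exchange be encrypted: an eavesdropper sees both public values but not the secret derived from them.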

Phase 2

Phase 2 is used to negotiate the security associations for higher level operation, such as negotiating keys and routing rules for IPsec. Technically, as long as your IPsec association exists, you have an active IKE phase 2 association, since IPsec usually has periodic re-negotiation, and the same phase 2 is used to facilitate that.

WireGuard

WireGuard (WG) is… special. Not necessarily in a bad way, more that its theory of operation is a bit different. WG is meant to be simple: simple to write an implementation for, and simple to set up. WG is a UDP-only protocol that uses public-key authentication to form peer-to-peer connections. What this means is that WireGuard is actually pretty flexible in its allowed configurations, everything from simple point-to-point links to a full mesh network.

WG is built on the Noise protocol framework, and has a lot of impressive names to brag about when it comes to the cryptography in use (Curve25519, ChaCha20-Poly1305, BLAKE2s). Both peers (endpoints) designate the other’s public key as allowed, and specify the addresses that are allowed to be routed to and from that peer. This also means that WG can be used as a split-tunnel type VPN, where only some traffic goes through the VPN, or a complete VPN where all traffic passes through it. WG connections start with a handshake and key exchange, and this handshake is periodically refreshed with new keys, providing forward secrecy.

There’s no officially assigned port for WireGuard (that I’m aware of), though 51820 is the one most examples use. It can be freely changed to any UDP port, which is helpful since correctly firewalling it takes a little bit of know-how.

Because peers are identified by public key (with an optional pre-shared key as an extra shared secret, if desired), there’s no built-in way to configure per-user authentication. It’s really more of a per-host authentication scheme, unless you generate a key for each user and establish a separate peer association for each one.
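The "public key plus allowed addresses" idea can be modeled as a small lookup table. This is a simplified sketch of the concept, not WireGuard's actual implementation: the peer keys are placeholders, the addresses are made up, and `peer_for_destination` and `source_allowed` are names invented here.

```python
import ipaddress

# Hypothetical peer table: public key -> networks allowed for that peer.
PEERS = {
    "PHONE_PUBKEY_PLACEHOLDER": ["10.0.0.2/32"],
    "OFFICE_PUBKEY_PLACEHOLDER": ["10.0.0.3/32", "192.168.50.0/24"],
}


def peer_for_destination(dest: str):
    """Outbound: pick the peer whose allowed network best matches dest."""
    addr = ipaddress.ip_address(dest)
    best = None  # (prefix length, peer key) of the most specific match
    for key, nets in PEERS.items():
        for net in map(ipaddress.ip_network, nets):
            if addr in net and (best is None or net.prefixlen > best[0]):
                best = (net.prefixlen, key)
    return best[1] if best else None


def source_allowed(peer_key: str, src: str) -> bool:
    """Inbound: accept a decrypted packet only if its source fits the peer."""
    addr = ipaddress.ip_address(src)
    return any(addr in ipaddress.ip_network(n) for n in PEERS.get(peer_key, []))


print(peer_for_destination("192.168.50.7"))  # matches the office peer
print(peer_for_destination("8.8.8.8"))       # no peer: stays off the tunnel
print(source_allowed("PHONE_PUBKEY_PLACEHOLDER", "192.168.50.7"))
```

In this model, a split tunnel is just a peer with a few specific networks, and a full tunnel is a peer whose allowed list is `0.0.0.0/0`, so every destination matches it.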

Now, as much as I like WireGuard, and use it as my primary remote access VPN for my network, it does have some downsides:

The theory of operation is different enough from other protocols that you might need to work a little harder to understand it in relation to those.
The nature of the peer key authentication means that you more or less require a configuration change for every new client, unless clients are re-using configuration files themselves. (Meaning, every new device I want to connect means a new line in the WireGuard Peers tab of my firewall; I can’t just configure a user/password and use that anywhere I like.) While this can be automated (I assume) using an out-of-band authentication scheme, or just pushing approved config to the client, again, OOB, it’s just not as simple as the others.
Networking issues are difficult to troubleshoot. Because of the specification, any invalid, malformed, or unauthorized packets are silently dropped. On the receiving end, all you see is that your “received bytes” count stays at 0, when you should see 2-3 KiB on both sides for a successful connection, near instantly. You need to look at the other peer to see what its logs say (if any) and figure out the issue. As an example, I accidentally added a space to the PSK field when adding the config block to my phone. My phone would reject the handshake response packets from the firewall since the PSK didn’t match, meaning even though I saw traffic on both ends, there was no connection. I had to look at the debug logs for it to mention it was explicitly dropping packets because of a bad PSK. But, because that’s its failure mode, there’s almost no distinction between, say, a bad firewall or NAT rule, and a mismatched config, without carefully looking through logs on both ends.
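Since the counters are about the only signal you get, one way to check them in bulk is to parse the output of `wg show <interface> dump`. The sketch below assumes the tab-separated field order described in the wg(8) man page (peer lines: public key, preshared key, endpoint, allowed IPs, latest handshake, received bytes, sent bytes, keepalive); I have not verified it against every version, and the sample text is invented rather than captured from a real interface.

```python
def stalled_peers(dump_text: str) -> list[str]:
    """Return public keys of peers with no handshake or no received bytes."""
    lines = dump_text.strip().splitlines()
    stalled = []
    for line in lines[1:]:  # the first line describes the interface itself
        fields = line.split("\t")
        public_key = fields[0]
        latest_handshake = int(fields[4])  # Unix time, 0 if never completed
        received = int(fields[5])          # bytes received from this peer
        if latest_handshake == 0 or received == 0:
            stalled.append(public_key)
    return stalled


# Invented sample in the assumed format; keys are placeholders.
SAMPLE = (
    "IFACE_PRIVKEY\tIFACE_PUBKEY\t51820\toff\n"
    "PEER_A_PUBKEY\t(none)\t203.0.113.5:51820\t10.0.0.2/32\t1624900000\t2748\t3012\toff\n"
    "PEER_B_PUBKEY\t(none)\t203.0.113.9:51820\t10.0.0.3/32\t0\t0\t444\toff\n"
)

print(stalled_peers(SAMPLE))
```

This only tells you which peer is stuck, not why. A peer with sent bytes but nothing received looks the same whether the cause is a firewall rule or a mismatched key, so the logs on both ends are still the next stop.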

PPTP (Point to Point Tunneling Protocol)

I’m putting this last because you should not use it. PPTP is a complete security mess and is about as broken as… I don’t know, SSLv2? WEP? If me saying “PPTP is as secure as WEP”² doesn’t make you realize how broken it is, let me say this: using PPTP is, seriously, basically just plain L2TP with extra steps. You’ve got no solid encryption, and any password you used to authenticate can likely be cracked easily. I hope you don’t reuse passwords… then again, now someone can just log in to your VPN without any problems.

PPTP uses TCP port 1723 for a control channel, and sends PPP frames within a Generic Routing Encapsulation (GRE) tunnel. PPTP, like L2TP, doesn’t natively define encryption or authentication. It’s up to the PPP stream and any tunneled data to implement that, and the default implementation does, using authentication protocols like PAP (doesn’t encrypt your password), CHAP, MS-CHAP v1 (broken), and MS-CHAP v2 (not as broken, but… broken).

Microsoft was the one to start this, and as such, Windows has had native support for PPTP. But, please, do not use this. PPTP just does not make sense at this point in time. Pick something else.

¹ Actually, it’s two: one for A -> B communication, and one for B -> A communication.
² Yeah, it’s closer than I thought. Both PPTP and WEP are based on the RC4 encryption cipher, which is fast but weak. Actually, since PPTP uses the same cipher key for both data channels, you can pretty quickly reverse out the key and break the encryption, just like you can reverse out a WEP key (aka, the network password) with enough passively collected traffic.
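The danger of reusing one RC4 key in both directions is easy to demonstrate. This is a generic illustration of keystream reuse with a textbook RC4 implementation, not a reproduction of PPTP's actual key handling; the key and messages are made up.

```python
def rc4_keystream(key: bytes, n: int) -> bytes:
    """Generate n bytes of RC4 keystream (standard KSA followed by PRGA)."""
    s = list(range(256))
    j = 0
    for i in range(256):  # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):  # keystream generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def rc4_encrypt(key: bytes, plaintext: bytes) -> bytes:
    return xor_bytes(plaintext, rc4_keystream(key, len(plaintext)))


key = b"same key both ways"       # hypothetical shared key
client_msg = b"user=alice pw=hunter2"
server_msg = b"welcome back, alice!!"

c1 = rc4_encrypt(key, client_msg)
c2 = rc4_encrypt(key, server_msg)

# The keystream cancels out: XORing the two ciphertexts gives the XOR of
# the two plaintexts, with no knowledge of the key required.
print(xor_bytes(c1, c2) == xor_bytes(client_msg, server_msg))
```

Once an attacker has the XOR of two plaintexts, guessing any part of one message (a protocol header, say) immediately reveals the matching part of the other.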

Tek's Domain