WebDAV Explained: Filesystems Over HTTP

2021-06-25 6 min read Protocols explained Tech Tech explained Teknikal_Domain

So I take it some people reading this are familiar with what I’ll call a ‘remote filesystem protocol’ like NFS, SMB, or AFP. Well, did you know there’s another one, one that’s found use in a few places, that you’ve maybe heard of once or twice, and that really… well, doesn’t sound like it should make any sense? Welcome to WebDAV: the remote filesystem that runs over HTTP.

Web Distributed Authoring and Versioning (RFC 4918) is an extension to HTTP, of all things, that in theory allows people to create, manage, and author resources on a web server. And while this seems to have started as a more advanced way of interacting with and creating webpages, apparently someone realized that you can just expose an entire filesystem hierarchy over it, and now you can do almost all filesystem operations over a WebDAV connection. Note that ‘versioning’ isn’t actually part of the core protocol, since it was found that it’d be too much work. Thus, WebDAV is more WebDA, with the Delta-V extension (RFC 3253) adding versioning later.

WebDAV uses XML for all its main communication, while leveraging custom HTTP methods (like COPY, MOVE, PROPFIND, and so on) with custom HTTP headers (If, Depth, Destination…) to allow what seem to be arbitrary file operations (and locking) on a remote host, when properly authenticated. Naturally, this means more HTTP status codes as well.

As an example, Nextcloud provides the ability to mount your user share as WebDAV, and most OSes today do have client-side support for it. Windows will allow you to “map a network drive”, gladly take an http(s):// URL, and interpret it as a WebDAV endpoint. This also means that WebDAV can be secured by using HTTPS.

But basically, WebDAV extends the definition of resources to include not only their content, like the bytes in the file, but also ‘properties’: key/value data about a resource. Things like its access time, author, size, what-have-you, are ‘properties’ that can be read with a PROPFIND or set with a PROPPATCH request, each containing a chunk of XML detailing the properties the client wants, or the properties and values that it wants to write.
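To make the “chunk of XML” a bit more concrete, here’s a minimal Python sketch of what a client might put together for a PROPFIND. It only builds the request pieces and sends nothing. The function names and the /dav/files/alice/ path are made up for illustration; the DAV: namespace, the Depth header, and the getlastmodified / getcontentlength property names are the standard ones from RFC 4918.

```python
import xml.etree.ElementTree as ET

DAV_NS = "DAV:"  # the XML namespace WebDAV uses for its own elements


def build_propfind_body(prop_names):
    """Return the XML body of a PROPFIND asking for the named DAV: properties."""
    ET.register_namespace("D", DAV_NS)
    root = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(root, f"{{{DAV_NS}}}prop")
    for name in prop_names:
        ET.SubElement(prop, f"{{{DAV_NS}}}{name}")
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_propfind_request(path, prop_names, depth="1"):
    """Return (method, path, headers, body) for a PROPFIND; nothing is sent."""
    body = build_propfind_body(prop_names)
    headers = {
        "Depth": depth,  # "0" = just this resource, "1" = it plus its direct children
        "Content-Type": 'application/xml; charset="utf-8"',
        "Content-Length": str(len(body)),
    }
    return "PROPFIND", path, headers, body


# Hypothetical collection path, purely for illustration.
method, path, headers, body = build_propfind_request(
    "/dav/files/alice/", ["getlastmodified", "getcontentlength"]
)
print(method, path, "Depth:", headers["Depth"])
print(body.decode("utf-8"))
```

Since http.client accepts arbitrary method names, the same four values could be handed to HTTPSConnection.request(method, path, body=body, headers=headers) against a real server, with whatever authentication that server expects added to the headers.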

Also fun: WebDAV just loves 207 Multi-Status. Any time you attempt to, say, get more than one property, I guarantee the first line of the server reply is HTTP/1.1 207 Multi-Status. This status was, of course, introduced with WebDAV, and means that it’s possible (but not guaranteed) that different individual pieces of your request have different statuses. Parsing through the returned XML will tell you. For example, I might not be allowed to access property C, while I can access properties A and B on a resource. Attempting a PROPFIND for all three would give a 207 reply with two propstat groups in the XML: a 200 for A and B, and a 401 for C. That is… well, that is something.
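Here’s roughly what picking that reply apart could look like in Python. The response body is hand-written for the A/B/C scenario, not captured from a real server, and the urn:example:props namespace, the /files/report.txt path, and the parse_multistatus name are all invented for the example. The multistatus / response / propstat / status layout is the one RFC 4918 defines.

```python
import xml.etree.ElementTree as ET

NS = {"D": "DAV:"}

# Hand-written 207 body: A and B are readable, C is not.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<D:multistatus xmlns:D="DAV:" xmlns:X="urn:example:props">
  <D:response>
    <D:href>/files/report.txt</D:href>
    <D:propstat>
      <D:prop><X:A>one</X:A><X:B>two</X:B></D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
    <D:propstat>
      <D:prop><X:C/></D:prop>
      <D:status>HTTP/1.1 401 Unauthorized</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""


def parse_multistatus(xml_bytes):
    """Map each href to {property tag: (status code, text value or None)}."""
    result = {}
    root = ET.fromstring(xml_bytes)
    for response in root.findall("D:response", NS):
        href = response.findtext("D:href", namespaces=NS)
        props = result.setdefault(href, {})
        for propstat in response.findall("D:propstat", NS):
            # The status line looks like "HTTP/1.1 200 OK"; keep the number.
            status_line = propstat.findtext("D:status", namespaces=NS)
            code = int(status_line.split()[1])
            for prop in propstat.find("D:prop", NS):
                props[prop.tag] = (code, prop.text)
    return result


for href, props in parse_multistatus(SAMPLE).items():
    for tag, (code, value) in props.items():
        print(href, tag, code, value)
```

The point is that the outer 207 tells you nothing by itself; the per-property status only shows up once you walk the propstat groups.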

Most HTTP servers have support for WebDAV, including Apache, NGINX, Caddy, Microsoft IIS, even lighttpd. Because of this, if you really want to give other people access to a file share, and don’t want to set up NFS or SMB, or just want it accessible across the internet with decent security and user-based authentication basically built in? Slap WebDAV on it. Problem solved.

Though, there is one issue: WebDAV is slow. Not necessarily in file transfer speed, more in ops/sec speed. Because everything is at least one HTTP transaction, and usually you’re going to need a good handful to even just get a directory listing, you’re looking at some latency. Every operation requires a new HTTP request, meaning potentially a new connection handshake (and TLS handshake, if used), encoding the data into the DAV XML format, sending over the header block, then the XML, waiting for a response, reading the response headers, then the XML, then parsing the XML, then getting the data out… And because they’re HTTP requests, a lot of data gets duplicated with each request. Other protocols, probably ones that use long-lived connections, might see some data savings there, but this one? No.
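For a rough feel of how the milliseconds pile up, here’s a back-of-the-envelope sketch in Python. The 40 ms round-trip time is an arbitrary number picked for illustration, the function name is made up, and the model ignores server processing and XML parsing unless you pass server_ms. The only protocol facts it leans on are that a TCP handshake costs one round trip and a full TLS 1.3 handshake costs one more.

```python
def sequential_ops_per_second(rtt_ms, round_trips_per_op=1.0, server_ms=0.0):
    """Upper bound on back-to-back operations per second for one client."""
    per_op_ms = rtt_ms * round_trips_per_op + server_ms
    return 1000.0 / per_op_ms


RTT_MS = 40.0  # assumed round-trip time to the server

# Connection reused (keep-alive): each operation costs one round trip.
reused = sequential_ops_per_second(RTT_MS)

# Fresh connection every time: TCP handshake + TLS 1.3 handshake + the request.
fresh = sequential_ops_per_second(RTT_MS, round_trips_per_op=3)

print(f"reused connection: {reused:.1f} ops/sec")  # 25.0
print(f"new connection:    {fresh:.1f} ops/sec")   # 8.3
```

Even in the friendly case, a client that waits for each reply before sending the next request tops out at a couple dozen operations per second at that latency, which is where the “slow in ops/sec” feeling comes from.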

So really… it works, for a bit more than its original intended purpose. Sure, compared to something like NFS or SMB it’s slower: HTTP isn’t that fast, constantly negotiating TLS sessions isn’t that fast, and the milliseconds pile up. But given that we managed to take an established concept, HTTP, and find a way to turn it into a channel for arbitrary, authenticated filesystem operations over a secured tunnel, it’s not half bad. And if you ever wondered why Windows, Mac, or Linux will actually try to map a web URL as a remote share when asked, this is why.

CalDAV and CardDAV

Before I go, though, let’s talk about two extensions to WebDAV: the vCard Extensions to WebDAV (CardDAV) and the Calendaring Extensions to WebDAV (CalDAV). Both are pretty similar: each is just an extension that allows a client to use some fake resource endpoints to sync up either iCalendar data or vCards for an address book. (And fun fact, Nextcloud supports these too!)

CalDAV

CalDAV (RFC 4791) uses the iCalendar format for data exchange, since the server does have to parse some calendar data. If you’re curious, iCal files are text files, not unlike a vCard, and by that I mean they’re identical in structure. Straight from the RFC, here’s an example iCalendar file:

BEGIN:VCALENDAR
PRODID:-//Example Corp.//CalDAV Client//EN
VERSION:2.0
BEGIN:VEVENT
UID:1@example.com
SUMMARY:One-off Meeting
DTSTAMP:20041210T183904Z
DTSTART:20041207T120000Z
DTEND:20041207T130000Z
END:VEVENT
BEGIN:VEVENT
UID:2@example.com
SUMMARY:Weekly Meeting
DTSTAMP:20041210T183838Z
DTSTART:20041206T120000Z
DTEND:20041206T130000Z
RRULE:FREQ=WEEKLY
END:VEVENT
BEGIN:VEVENT
UID:2@example.com
SUMMARY:Weekly Meeting
RECURRENCE-ID:20041213T120000Z
DTSTAMP:20041210T183838Z
DTSTART:20041213T130000Z
DTEND:20041213T140000Z
END:VEVENT
END:VCALENDAR

If you look at the vCard format that I know is going to come a bit further down, you’ll see just how similar they are.

Anyways, CalDAV uses a number of different resource locations to manipulate the calendar, meaning you can do things like look at or create specific events, and do a fair amount of processing server-side without having to download the entire calendar file. But besides that, it’s… just a calendar sync.
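As one example of that server-side processing, RFC 4791 defines a calendar-query REPORT where the client sends a filter and the server returns only the matching events. Here’s a minimal Python sketch that builds such a body for a time range. The build_calendar_query name and the dates are just for illustration, and nothing is sent over the network; the element names and the two XML namespaces are the ones from the RFC.

```python
import xml.etree.ElementTree as ET

DAV = "DAV:"
CALDAV = "urn:ietf:params:xml:ns:caldav"


def build_calendar_query(start_utc, end_utc):
    """Return a calendar-query body asking for VEVENTs that overlap a time range."""
    ET.register_namespace("D", DAV)
    ET.register_namespace("C", CALDAV)
    query = ET.Element(f"{{{CALDAV}}}calendar-query")

    # What to return for each match: its ETag and the iCalendar data itself.
    prop = ET.SubElement(query, f"{{{DAV}}}prop")
    ET.SubElement(prop, f"{{{DAV}}}getetag")
    ET.SubElement(prop, f"{{{CALDAV}}}calendar-data")

    # What to match: VEVENT components inside VCALENDAR, within the range.
    flt = ET.SubElement(query, f"{{{CALDAV}}}filter")
    vcal = ET.SubElement(flt, f"{{{CALDAV}}}comp-filter", {"name": "VCALENDAR"})
    vevent = ET.SubElement(vcal, f"{{{CALDAV}}}comp-filter", {"name": "VEVENT"})
    ET.SubElement(
        vevent, f"{{{CALDAV}}}time-range", {"start": start_utc, "end": end_utc}
    )
    return ET.tostring(query, encoding="utf-8", xml_declaration=True)


# Illustrative range: the week containing the first "Weekly Meeting" above.
print(build_calendar_query("20041206T000000Z", "20041213T000000Z").decode("utf-8"))
```

A client would send this as the body of a REPORT request to the calendar collection’s URL, and get back a 207 Multi-Status containing just the events that fall in the range.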

CardDAV

This one, RFC 6352, defines something similar to CalDAV, except it exchanges contact information between two address books through the use of vCards. Compared to CalDAV, there’s not much server-side processing on this one (though some things, like filtering, are implemented server-side). An address book is a ‘collection’ (folder) that just contains vCard files… with some additional properties thrown in, unlike CalDAV, which defines its own method as well. Now, if you don’t know what one looks like, here’s a vCard, again from the RFC:

BEGIN:VCARD
VERSION:3.0
FN:Cyrus Daboo
N:Daboo;Cyrus
ADR;TYPE=POSTAL:;2822 Email HQ;Suite 2821;RFCVille;PA;15213;USA
EMAIL;TYPE=INTERNET,PREF:cyrus@example.com
NICKNAME:me
NOTE:Example VCard.
ORG:Self Employed
TEL;TYPE=WORK,VOICE:412 605 0499
TEL;TYPE=FAX:412 605 0705
URL:http://www.example.com
UID:1234-5678-9000-1
END:VCARD

Yeah. iCalendar files and vCard files are basically identical formats that just contain a different category of data. You don’t really think about these things much, do you?
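To show just how identical, here’s a small Python sketch of one parser that reads both. It’s deliberately naive: it handles BEGIN/END nesting, NAME;PARAM:VALUE lines, and folded continuation lines, but it would trip over a quoted parameter value containing a colon. The function names are made up, and the two inputs are trimmed-down versions of the examples above.

```python
def unfold(text):
    """Undo line folding: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


def parse_components(text):
    """Parse BEGIN/END-delimited content lines into a list of nested dicts."""
    root = {"name": None, "props": [], "children": []}
    stack = [root]
    for line in unfold(text):
        key, _, value = line.partition(":")
        if key.upper() == "BEGIN":
            comp = {"name": value.upper(), "props": [], "children": []}
            stack[-1]["children"].append(comp)
            stack.append(comp)
        elif key.upper() == "END":
            stack.pop()
        else:
            # "TEL;TYPE=WORK,VOICE" -> name "TEL", params ["TYPE=WORK,VOICE"]
            name, *params = key.split(";")
            stack[-1]["props"].append((name.upper(), params, value))
    return root["children"]


CARD = "BEGIN:VCARD\nVERSION:3.0\nFN:Cyrus Daboo\nTEL;TYPE=WORK,VOICE:412 605 0499\nEND:VCARD\n"
CAL = (
    "BEGIN:VCALENDAR\nVERSION:2.0\nBEGIN:VEVENT\nSUMMARY:One-off Meeting\n"
    "DTSTART:20041207T120000Z\nEND:VEVENT\nEND:VCALENDAR\n"
)

for text in (CARD, CAL):
    for comp in parse_components(text):
        print(comp["name"], comp["props"], [c["name"] for c in comp["children"]])
```

Same function, no special cases: the only real difference is that a calendar nests components (VEVENT inside VCALENDAR) while a vCard is flat.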

But, that’s WebDAV. I know there are a few extensions I’ve not gone over, but this isn’t meant to be exhaustive. It’s just meant to be a little bit of an explanation of what WebDAV is, since I know that if you set up Nextcloud (and I keep coming back to it, since a number of my friends have), you’ll see enough references that you’ll be a bit curious as to what it actually is.