How Time Machine Works

2020-08-02 7 min read Software Tech Tech explained Teknikal_Domain

For those unaware, Time Machine is the built-in backup application within macOS that will take incremental backups of your system to an Apple Time Capsule, or another local disk. Simple premise, almost as simple in execution. Let’s take it apart, shall we?

Usage

Time Machine is meant to be set up to run automatically. It keeps hourly backups for the last 24 hours, daily backups for the last month, and weekly backups for everything older than that. Once the disk that Time Machine is backing up to is full, it’ll start purging the oldest backups to make space.
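To make that schedule a bit more concrete, here is a rough Python sketch of what a thinning pass could look like. This is not Apple's actual logic: the function name backups_to_keep is made up, "a month" is assumed to be 30 days, and "weekly" is assumed to mean one backup per ISO calendar week.

```python
from datetime import datetime, timedelta


def backups_to_keep(timestamps, now):
    """Return the backup timestamps that a thinning pass would keep.

    Assumed policy (a simplification, not Apple's real implementation):
      - younger than 24 hours: keep every backup (these are the hourly ones)
      - younger than 30 days: keep the newest backup of each calendar day
      - older than that: keep the newest backup of each ISO week
    """
    keep = set()
    seen_days = set()
    seen_weeks = set()
    # Walk newest-first so the first backup seen in a bucket is its newest one.
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        if age < timedelta(hours=24):
            keep.add(ts)
        elif age < timedelta(days=30):
            day = ts.date()
            if day not in seen_days:
                seen_days.add(day)
                keep.add(ts)
        else:
            week = tuple(ts.isocalendar()[:2])  # (ISO year, ISO week number)
            if week not in seen_weeks:
                seen_weeks.add(week)
                keep.add(ts)
    return sorted(keep)
```

Anything not returned by the function would be a candidate for deletion. Purging the oldest backups when the disk fills up would be a separate step on top of this.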

The point is that you can pick through any one of these snapshots of your system and pull files back out if you find you need them again… That’s just the point of backups.

Besides letting you back up to your other Macs (say, your laptop to your main iMac), network shares exist for the Time Capsule, a now-discontinued router with NAS features that was part of the AirPort product line. If you couldn’t guess by the name, it was meant to store all your Time Machine backups at a central point in your home network.

Requirements

For reasons that we’ll talk about later, only Journaled HFS+ (HFS+J) volumes can be Time Machine destinations; APFS will not work. Additionally, if going over the network, the destination needs to be either a shared folder from another Mac, a correctly-advertised AFP server, or a correctly-advertised SMB3 server with certain extensions.

Technical Details, Client

Time Machine creates incremental backups via hard linking: essentially, two ‘files’ that point to the same physical piece of data on disk. It’s like sticking two labels on the same drawer: either name, same thing inside.
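If you want to see the "two labels, one drawer" idea for yourself, this small Python sketch creates a file, hard links a second name to it, and checks what the filesystem reports. The file names and the hard_link_demo function are just for illustration, and it assumes a filesystem that supports hard links for regular files (most do).

```python
import os
import tempfile


def hard_link_demo():
    """Create a file plus a hard link to it and report what the filesystem sees."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "report.txt")
        alias = os.path.join(tmp, "report-link.txt")

        with open(original, "w") as f:
            f.write("hello")

        # Add a second name for the same data; nothing is copied.
        os.link(original, alias)
        stat_original, stat_alias = os.stat(original), os.stat(alias)

        # Writing through one name is visible through the other.
        with open(alias, "a") as f:
            f.write(" world")
        with open(original) as f:
            content = f.read()

        return {
            "same_inode": stat_original.st_ino == stat_alias.st_ino,
            "link_count": stat_original.st_nlink,
            "content_via_original": content,
        }
```

Both names report the same inode and a link count of 2, and the data only goes away once every name pointing at it has been removed.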

Time Machine creates a folder for each backup iteration, copying over files that have changed since the last run and hard linking files that haven’t changed to their copies in the last backup, meaning they take up no additional space on the backup destination. It will also create multiple hard links to directories, which is something that most filesystems, APFS included, do not support. Hence the requirement for HFS+J. This also means that copying a backup destination somewhere else isn’t exactly easy: because of that filesystem limitation, you’ll likely destroy most of the data unless you’re copying from HFS+J to HFS+(J).
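The copy-or-link idea can be sketched in a few lines of Python. To be clear, this is only an illustration of the technique, not how Time Machine itself is written: the snapshot function is hypothetical, it decides "unchanged" by comparing file contents, and it only hard links files, since directory hard links are exactly the part most filesystems won't allow.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, destination):
    """Back up `source` into `destination`, reusing unchanged files from `previous`.

    `previous` is the folder of the last backup, or None for the first run.
    Returns a (copied, linked) tuple of file counts.
    """
    copied = linked = 0
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        target_dir = os.path.normpath(os.path.join(destination, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_dir, name)
            old = os.path.join(previous, rel, name) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged: add another name for the data already in the last backup.
                os.link(old, dst)
                linked += 1
            else:
                # New or modified: store a fresh copy.
                shutil.copy2(src, dst)
                copied += 1
    return copied, linked
```

Every backup folder looks like a complete copy of the source, but an unchanged file only occupies disk space once, no matter how many backups it appears in.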

Because of the services running on a standard Mac, the system is already recording file modifications, meaning that Time Machine is able to consult the local system for modified files, a much faster solution than the common one: Scanning every file and folder’s modification time attribute to see if it’s been changed.
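For comparison, here is roughly what that slower, common approach looks like in Python: walk the entire tree and check every file's modification time against the time of the last backup. The function name files_changed_since is made up for this sketch, and it assumes timestamps are plain Unix epoch seconds.

```python
import os


def files_changed_since(root, last_backup_time):
    """Return paths under `root` whose mtime is newer than `last_backup_time`.

    Every file has to be stat'ed on every run, which is why this gets slow
    on large disks compared to asking the system for a list of changes.
    """
    changed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

The cost of this scan grows with the total number of files, not with the number of files that actually changed, which is the part Time Machine gets to skip.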

Changes for Network Backups

When backing up to a network device, the process changes just a little. Instead of copying files over directly, Time Machine creates what’s called a sparse bundle that holds everything, which acts as a layer of isolation so the underlying filesystem of the network share won’t cause any issues.

Sparse Images

A sparse image is, for tech people, essentially a thin-provisioned disk image. For non-tech people, this means it’s a disk image file that grows in size as data is added. As a side note on disk images, which I did cover a little bit before: a disk image is a single file that is essentially treated like an entire hard drive. Every .dmg (disk image) file you download is effectively, once mounted and ready… just a folder. In the case of Time Machine, that folder holds your backups, and the filesystem inside the image is HFS+J, so, again, the underlying filesystem that the image file is stored on is irrelevant. Sparse images are denoted with the .sparseimage extension.

Sparse Bundles

Starting with OS X Leopard, sparse bundles became a thing. Bundles are a macOS feature that I’ve yet to cover, but every app in your Applications directory is a bundle (for Hackintosh people, kexts are also bundles). Bundles are, for all you need to know right now, a folder that macOS just pretends is a file, and lets you, the user, treat it like it’s just a single file.

Sparse bundles are like sparse images, except that instead of being one large file, the disk image is split into multiple 8 MiB (8,388,608 bytes) files called bands. This split allows things like, yes, Time Machine, to operate a little faster: instead of updating and rewriting an entire multi-gigabyte file for any change, only the bands that are affected get written, meaning writes go from thousands of megabytes for an edit to multiples of… 8.
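The band arithmetic is simple enough to sketch in Python. This assumes the disk image's bytes map onto bands in order, 8 MiB at a time, with band 0 covering the first 8 MiB. The function name bands_touched is hypothetical, and the real on-disk layout may carry extra bookkeeping that this ignores.

```python
BAND_SIZE = 8 * 1024 * 1024  # 8 MiB = 8,388,608 bytes


def bands_touched(offset, length, band_size=BAND_SIZE):
    """Return the indices of the bands covered by a write.

    `offset` is the byte position inside the disk image where the write
    starts, and `length` is how many bytes are written.
    """
    if length <= 0:
        return []
    first = offset // band_size
    last = (offset + length - 1) // band_size
    return list(range(first, last + 1))
```

A 4 KiB write 20 MiB into the image lands entirely in band 2, so only that one 8 MiB file changes. A write that straddles a band boundary touches two bands.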

It also makes them easier to copy around, since it’s a lot of smaller file transfers instead, which, in theory, can happen in parallel, speeding up the process too.

Technical Details, Server

For Time Machine to recognize a network share as a valid Time Machine backup destination, it needs to meet a few criteria. Ignoring the “shared folder from another Mac” option, let’s take a look at the requirements otherwise.

Correct Advertising

Time Machine takes advantage of Bonjour, Apple’s name for zero-configuration networking (commonly called zeroconf), to find valid destinations. This takes the form of DNS Service Discovery (DNS-SD) over multicast DNS (mDNS), where each device on the local network can make DNS-like queries to other devices to see what services they offer and some options for them. These queries operate on mostly standard DNS records. The only other piece needed (right now) is that the most common local domain name is “local”. That can change, but it’s a well-used default.

Time Machine will send out a query for a _adisk._tcp.local TXT record, and any device that is advertising shared Apple volumes should respond with something like this: dk0=adVN=Test Time Machine Volume,adVF=0x82
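Pulling that TXT entry apart is mostly string splitting. Here's a minimal Python sketch, with parse_adisk_txt being a made-up helper name. It assumes the exact shape shown above (a disk key, then comma-separated key=value pairs), so a volume name that itself contains a comma would trip it up.

```python
def parse_adisk_txt(entry):
    """Split an entry like 'dk0=adVN=Name,adVF=0x82' into its parts.

    Returns (disk_key, fields), where fields maps names such as
    'adVN' and 'adVF' to their string values.
    """
    # Everything before the first '=' is the disk key, e.g. 'dk0'.
    disk_key, _, rest = entry.partition("=")
    fields = {}
    for part in rest.split(","):
        key, _, value = part.partition("=")
        fields[key] = value
    return disk_key, fields
```

For the example record, this gives the disk key dk0, with adVN set to "Test Time Machine Volume" and adVF set to "0x82".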

Put simply, the adVN (Apple Disk Volume Name?) is what name to show to you in the list, and adVF (Apple Disk Volume Flags?) represents certain flags set on this particular disk. In this case, 0x80 is the flag for SMB, and 0x02 is the flag for Time Machine capable. (I don’t know the flag for AFP.)
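Since adVF is a bit field, checking it is a couple of bitwise ANDs. This Python sketch only knows the two flags named above, and it takes their meanings from this post rather than from any official documentation. The constant and function names are mine, and any other bits get reported as unknown.

```python
FLAG_TIME_MACHINE = 0x02  # "Time Machine capable", per the description above
FLAG_SMB = 0x80           # "SMB", per the description above


def decode_advf(value):
    """Decode an adVF string such as '0x82' into the flags described above."""
    flags = int(value, 16)
    known = FLAG_SMB | FLAG_TIME_MACHINE
    return {
        "smb": bool(flags & FLAG_SMB),
        "time_machine": bool(flags & FLAG_TIME_MACHINE),
        "unknown_bits": flags & ~known,  # bits this sketch cannot interpret
    }
```

So 0x82 is just 0x80 and 0x02 combined: an SMB share that is flagged as Time Machine capable.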

Assuming it gets that, when you tell it to start, it’ll query for a _smb._tcp.local SRV record from that device, which will tell it what port to connect to, usually 445.

After this… it’s just up to the transfer protocol to work its magic. And as I’ve said multiple times, there’s two at work: SMB (version 3) and AFP. AFP is obsolete, but it’s still supported.

SMB

Server Message Block (SMB) is Microsoft’s (actually, originally, IBM’s) remote filesystem protocol, and as such, is commonly found in Windows networks. Besides providing access to file shares, SMB also has support within Windows systems for printer sharing and even IPC.

SMB does support protocol encryption (not TLS), and to work with Time Machine, it needs a few extensions to the protocol, like the AAPL¹ extension set, support for extended file attributes, and F_FULLFSYNC, which… I’m not going to get into here.

AFP

The Apple Filing Protocol (AFP) is, well, at this point, an outdated (but still rather good) protocol, sidelined since OS X 10.9 made SMB the default, that natively supports all the features that macOS and Time Machine want to have. The open-source implementation of AFP is Netatalk, though I don’t know much about AFP or its implementations: I don’t run a Mac network, and given that it’s an outdated protocol now (Macs have moved to SMB for most things), chances are I won’t have much to say for a while.

This post here isn’t covering how to set up a Time Machine server yourself, since I’ll have another one out, hopefully soon, on how to configure a Linux box as a makeshift Time Capsule, but… Samba isn’t happy with me right now, and I’m not going to be putting that one out until I have something that works well enough that I’m comfortable sharing.

¹ Apple uses the abbreviation “AAPL” in a lot of their things, which, funnily enough, is also their stock ticker.

Tek's Domain