Tek's Domain


We Saved Space, but at What Cost?

2021-04-30 · Behind the scenes · Fails · Teknikal_Domain

Anyone notice that the featured images on post headers seemed to be a bit… well, bad? So did I, and I only just realized why that actually happens. And ironically, it was in the name of improvements. Luckily, with a little help from an upscaling AI, that’s not as bad of a problem, for now.

So, for some context: Hugo (the site generator I use) is capable of processing static images before outputting them. In this theme, that's used to rescale featured images to the correct box size. With the theme as it stands right now (with my modifications), that means images get resized to 1000x500 pixels, because the content area, where you can, well, read content, is 1000 pixels wide. By default, however, it's… 700. Which would be fine, except I wanted to save a little space.

Saving Bytes

Everything on here is in a Git repository. Posts, config, theme, everything except the images you see within some post content, since those are generally moved to my CDN. Well, that means that, yes, featured images (and gallery images) have to be in there too. Now, if you're not aware, you usually don't want to put large(-ish) files into Git, since that makes every subsequent clone download every large file you've ever put into the repository. Even if a file got removed in some commit, it's still there in all the previous ones. Even though we do have a way of dealing with this (and if you use Git a lot, you know what I'm going to say), I still made the decision to keep things small. As such, when I have featured images, I crop them down myself, since they come out smaller for the same end result.1 Can you see where this is going?

Before I changed the content area width to 1000 px, I was still cropping things down myself. Which means that after I increased it, Hugo now has to upscale those 700x350 px images to 1000x500. If you don't see the issue yet: that's not going to scale without some noticeable artifacts, like a JPEG gone really, really wrong. Yeah, you're going to see something at that scale. Even worse, I found one image that was 640x360, being scaled up by… well, too much. Everything looked just fine in the past, but changing one Sass variable was all it took to bring that crashing back down to the land of blocky, blotchy, unappealing images.

The Smart Way

The easiest way of solving my original problem, the one I hinted at earlier, is Git LFS. Git LFS is a third-party tool that integrates with Git to sidestep the large-checkout problem I described earlier. With Git LFS, instead of committing the actual entire file to the repository, you commit a pointer of sorts, containing a content hash. The actual file can be uploaded to some other bulk storage service, like Amazon S3. Thanks to the various hook scripts Git lets you execute at certain points in its operation, Git LFS can transparently rewrite large files to pointers before committing, and then replace those pointers with the real files after checkout. The checkouts and clones themselves are as fast as anything else, and at the end, Git LFS downloads all the large files in one pass.
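For reference, turning this on for a file type is just a couple of commands (the `*.png` pattern here mirrors what I track; adjust to taste). There's nothing magical about `git lfs track` either: all it does is write a filter rule into `.gitattributes`, something like this sketch:

```shell
# One-time setup per machine, then track a pattern (commands shown for reference):
#   git lfs install
#   git lfs track "*.png"
# The track command just appends a filter rule to .gitattributes, equivalent to:
echo '*.png filter=lfs diff=lfs merge=lfs -text' >> .gitattributes
cat .gitattributes
```

Once that rule is committed, any PNG you `git add` goes through the LFS clean filter and only the pointer lands in the repository.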

If you want an example, here’s a pointer file that gets committed in place of a large file:

version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
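That pointer is simple enough to reconstruct by hand: it's just the spec version, a SHA-256 of the file's contents, and the size in bytes. A sketch (the file here is stand-in content, not a real image):

```shell
# Build a Git LFS-style pointer for an arbitrary file (stand-in content).
printf 'not really an image' > featured.png
oid=$(sha256sum featured.png | cut -d' ' -f1)
size=$(wc -c < featured.png)
printf 'version https://git-lfs.github.com/spec/v1\noid sha256:%s\nsize %s\n' "$oid" "$size"
```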

That's it. And that's what gets transparently swapped on every git push or git pull. With this, I could easily have just… not cared, and thrown all the full-size images into the repository. I already have Git LFS tracking PNG files, since I need those for galleries and featured images, so it wouldn't be too bad to lump the rest in as well. Well, here's the problem: that data still needs to be downloaded. What that means is that if I'm on a slower connection, not at home, on a different device, it still has to download each and every file to get me to a working state. Sure, the Git repository itself is small, but that step still has to happen. Plus, it means all image requests come from my origin: I have to have the bandwidth for all the image requests, I have to have the storage space, and I'm footing the bill for all the data transfer.

The CDN(-like) solution I went with instead means that Amazon, which has a far larger storage capacity and network pipe, carries the burden of dealing with this, and only to the extent that Cloudflare needs to fetch something that isn't already in its cache. The other reason for a CDN-based solution is that… I don't host just images there. Other files, like zips of STLs for 3D printing, can go there too, same as anything else, with slightly more special handling where needed, while again taking the burden of storage and transfer off me.

The Solution

I don't actually have the originals for half these images at this point, because they were taken so long ago that my Syncthing cluster aged them out and deleted them. What this means is that I have to find some way of making the images bigger without making them look worse: I have to create detail where there is no detail to work from. Naturally, we have AI for this.

There are plenty of AI-upscaling services, free and paid, where you upload your images to the site, it processes them, and it can spit back out an image somewhere around 2x to 8x the original resolution, with little drop in quality. There will be some, since AI is not perfect and things can happen, but it's going to be a bit more intelligent about it than any of the upscaling modes you'll find in, say, Photoshop. I did have to make a small, one-time purchase, just to have enough image credits on one service to get them all processed, but at that point all I had to do was drag and drop them all, optimize for photos (not drawings), set the upscale factor to 2x, and hit go. Of course, once done, I immediately downscaled everything to 1000x500, only because, again, space savings, and I can't really grow my content area much beyond 1000 px without incurring other issues. No images were large enough to warrant a 4x upscale, and some were at just odd enough resolutions that I had to crop them correctly by hand, then save.

But at the end of all this… featured images look good again, and I (mostly) learned my lesson: don't assume you'll never change things (or, at least, keep the originals around in case you do).


  1. Well, and Hugo's own processing can have a few issues: it sometimes crops instead of rescaling, usually lopsided, and not where you wanted the crop in the first place. If you want something done right, do it yourself! ↩︎
