Consider switching SquashFS compression to zstd

Rationale

Compared to our current crazy-size-optimized XZ settings, in theory zstd should give us:

  • faster compression ⇒ improves dev & RM experience
  • vastly faster decompression ⇒ improves performance for users
  • only slightly larger output ⇒ makes install/upgrade downloads take a little longer

Regarding tooling, zstd is supported by:

  • squashfs-tools-ng
  • Linux kernel
  • squashfs-tools 4.4+

Benchmarks

Image size

These tests use defaultcomp (release) builds.

  • xz (feature/trixie at b1bd804b, i.e. without pre-compiled AppArmor policy):
    • 1768435712 bytes
  • zstd (feature/trixie at b1bd804b, i.e. without pre-compiled AppArmor policy + 855e32bd):
    • 1932941312 bytes i.e. +164MB i.e. +9%
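
The size difference quoted above can be re-derived from the two byte counts. This is only a small arithmetic check; the function name is made up for the example.

```python
XZ_BYTES = 1768435712    # xz image size, from the list above
ZSTD_BYTES = 1932941312  # zstd image size, from the list above


def size_increase(old, new):
    """Return (absolute increase in bytes, relative increase in percent)."""
    delta = new - old
    return delta, 100 * delta / old


delta, percent = size_increase(XZ_BYTES, ZSTD_BYTES)
print(f"+{delta} bytes, +{delta / 10**6:.1f} MB, +{percent:.1f}%")
```

The increase is 164505600 bytes, i.e. about 164.5 MB (decimal megabytes) or 9.3%.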

Boot time

Rationale for the process

  • As the baseline, we use 7.0~rc1 with the cached AppArmor policy brought back, because this is more realistic than plain 7.0~rc1.
  • All tests are done with defaultcomp (release), which is what this issue is now about, since we've already switched the fastcomp builds to zstd.
  • We measure boot time the same way we do during manual QA for releases: we're familiar with it, it's fair, and it avoids bias due to first-boot repartitioning.

Process

Download

  1. Download the baseline USB image (built from !2361 (merged)) from https://nightly.tails.net/build_Tails_ISO_21105-bring-back-precompiled-apparmor-policy/lastSuccessful/archive/build-artifacts/
  2. Download the zstd22 USB image (built from !2360 (merged)) from https://nightly.tails.net/build_Tails_ISO_18655-zstd-squashfs-simple/lastSuccessful/archive/build-artifacts/

Test baseline

Prepare the baseline USB stick:

  1. Install the baseline USB image to a USB stick.
  2. Boot the baseline USB stick on a bare-metal computer a first time to trigger re-partitioning.
  3. Wait until you see the Welcome Screen.
  4. Shut down Tails.

Then, on every spare computer that you have access to:

  1. Boot this USB stick a second time, add the login option to GRUB, and measure the boot time (from the GRUB menu until the GNOME desktop is ready).
  2. Take note of the boot time you measured in the table below, in the baseline column.

Test zstd22

Prepare the zstd22 USB stick:

  1. Install the zstd22 USB image to the exact same USB stick that you used to test the baseline image.
  2. Boot the zstd22 USB stick on a bare-metal computer a first time to trigger re-partitioning.
  3. Wait until you see the Welcome Screen.
  4. Shut down Tails.

Then, on every spare computer that you have access to:

  1. Boot this USB stick a second time, add the login option to GRUB, and measure the boot time (from the GRUB menu until the GNOME desktop is ready).
  2. Take note of the boot time you measured in the table below, in the zstd22 column.

Repeat with a different USB stick

Repeat the Test baseline and Test zstd22 sections with another USB stick. And another one. And maybe that'll be enough. Please stay hydrated.

Results

Computer           | USB stick                        | Baseline (s) | zstd22 (s) | Seconds saved | Percent saved
-------------------|----------------------------------|--------------|------------|---------------|--------------
ThinkPad X1C6      | Kingston                         | 72           | 57         | 15            | 21%
ThinkPad X1C6      | Toshiba                          | 78           | 63         | 15            | 19%
ThinkPad X1C6      | Generic                          | 92           | 81         | 11            | 12%
ThinkPad X250      | Kingston                         | 90           | 76         | 14            | 16%
ThinkPad X250      | Toshiba                          | 91           | 77         | 14            | 15%
ThinkPad X250      | Generic                          | 103          | 93         | 10            | 10%
ThinkPad X201      | Kingston                         | 102          | 75         | 27            | 26%
ThinkPad X201      | Toshiba                          | 103          | 76         | 27            | 26%
ThinkPad X201      | Generic                          | 111          | 87         | 24            | 22%
ThinkPad X200      | Kingston DataTraveler 3.0 (PMAP) | 110          | 82         | 28            | 25%
ThinkPad X200      | ADATA USB Flash Drive (1.00)     | 111          | 82         | 29            | 26%
ThinkPad X200      | TOSHIBA TransMemory (1.00)       | 113          | 83         | 30            | 27%
HP EliteBook 840G1 | NVMe in USB enclosure            | 82           | 60         | 22            | 27%
HP EliteBook 840G1 | Kingston DataTraveler 3.0 (PMAP) | 78           | 59         | 19            | 24%
HP EliteBook 840G1 | ADATA USB Flash Drive (1.00)     | 76           | 57         | 19            | 25%
HP EliteBook 840G1 | TOSHIBA TransMemory (1.00)       | 78           | 60         | 18            | 23%
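
The "Seconds saved" and "Percent saved" columns can be recomputed from the two measured columns, which is handy when adding new rows. A minimal sketch, with the measurements copied from the table above and helper names invented for the example:

```python
# (computer, USB stick, baseline seconds, zstd22 seconds), copied from the table
RESULTS = [
    ("ThinkPad X1C6", "Kingston", 72, 57),
    ("ThinkPad X1C6", "Toshiba", 78, 63),
    ("ThinkPad X1C6", "Generic", 92, 81),
    ("ThinkPad X250", "Kingston", 90, 76),
    ("ThinkPad X250", "Toshiba", 91, 77),
    ("ThinkPad X250", "Generic", 103, 93),
    ("ThinkPad X201", "Kingston", 102, 75),
    ("ThinkPad X201", "Toshiba", 103, 76),
    ("ThinkPad X201", "Generic", 111, 87),
    ("ThinkPad X200", "Kingston DataTraveler 3.0 (PMAP)", 110, 82),
    ("ThinkPad X200", "ADATA USB Flash Drive (1.00)", 111, 82),
    ("ThinkPad X200", "TOSHIBA TransMemory (1.00)", 113, 83),
    ("HP EliteBook 840G1", "NVMe in USB enclosure", 82, 60),
    ("HP EliteBook 840G1", "Kingston DataTraveler 3.0 (PMAP)", 78, 59),
    ("HP EliteBook 840G1", "ADATA USB Flash Drive (1.00)", 76, 57),
    ("HP EliteBook 840G1", "TOSHIBA TransMemory (1.00)", 78, 60),
]


def savings(baseline, zstd22):
    """Return (seconds saved, percent saved rounded to the nearest integer)."""
    saved = baseline - zstd22
    return saved, round(100 * saved / baseline)


rows = [(computer, stick, *savings(b, z)) for computer, stick, b, z in RESULTS]
for computer, stick, saved, percent in rows:
    print(f"{computer:<20}{stick:<34}{saved:>3} s {percent:>3}%")

# Aggregate view over all measurements.
total_baseline = sum(b for _, _, b, _ in RESULTS)
total_saved = sum(saved for _, _, saved, _ in rows)
print(f"mean saving: {total_saved / len(rows):.1f} s per boot")
```

Over these 16 measurements, the mean saving is about 20 seconds per boot.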

Compression speed

New tests

Builds take a few minutes less on Jenkins (dragon, iguana). Not a game changer.

Old tests

Note: these are results from the 4.x era.

System             | Compression    | Time to compress SquashFS | Build time saved
-------------------|----------------|---------------------------|-----------------
sib                | xz (release)   | 537s                      | n/a
sib                | zstd (release) | 299s                      | 238s
ant01              | xz (release)   | 404s                      | n/a
ant01              | zstd (release) | 220s                      | 184s
lizard             | xz (release)   | 910s                      | n/a
lizard             | zstd (release) | 519s                      | 391s
iguana             | xz (release)   | 393s                      | n/a
iguana             | zstd (release) | 263s                      | 130s
intrigeri's laptop | xz (fast)      | 265s                      | n/a
intrigeri's laptop | zstd (fast)    | 170s                      | 95s
boyska's desktop   | xz (fast)      | 62s                       | n/a
boyska's desktop   | zstd (fast)    | 33s                       | 29s
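
The table gives absolute savings only; the relative speedup is easier to compare across machines of different speeds. A small sketch, with the timings copied from the table above and a helper name invented for the example:

```python
# (system, build type, xz seconds, zstd seconds), copied from the table above
TIMINGS = [
    ("sib", "release", 537, 299),
    ("ant01", "release", 404, 220),
    ("lizard", "release", 910, 519),
    ("iguana", "release", 393, 263),
    ("intrigeri's laptop", "fast", 265, 170),
    ("boyska's desktop", "fast", 62, 33),
]


def speedup(xz_seconds, zstd_seconds):
    """Return (seconds saved, xz time divided by zstd time)."""
    return xz_seconds - zstd_seconds, xz_seconds / zstd_seconds


for system, build, xz, zstd in TIMINGS:
    saved, ratio = speedup(xz, zstd)
    print(f"{system} ({build}): {saved}s saved, {ratio:.2f}x faster")
```

On these six systems, zstd compresses the SquashFS roughly 1.5 to 1.9 times faster than xz.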

Conclusions

release (defaultcomp)

We cannot switch to zstd as-is: the image grows too much. (UPDATE: actually, we can, see discussion on #21100 (closed). So what's below is moot.)

We have 2 options:

With squashfs-tools

Append, to our SquashFS sort file, the files that are not added to it by boot-profile, grouped by filename extension. This can save about 5% on the image size.
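
To make the idea concrete, here is a hypothetical sketch of how such extra sort entries could be generated. It is not the change that was actually tested. It assumes the mksquashfs sort file format of one "path priority" pair per line, with priorities between -32768 and 32767 and higher priorities placed first; all paths and function names are made up for the example.

```python
import os
from collections import defaultdict

MIN_PRIORITY = -32768  # lowest priority accepted in a mksquashfs sort file


def extend_sort_entries(profiled, all_files):
    """Return sort entries for files missing from `profiled`, grouped by extension.

    profiled: list of (path, priority) pairs already in the sort file
              (in our case, generated by boot-profile).
    all_files: iterable of every path that ends up in the SquashFS.
    """
    known = {path for path, _ in profiled}
    groups = defaultdict(list)
    for path in sorted(all_files):
        if path not in known:
            groups[os.path.splitext(path)[1]].append(path)

    # Start just below the lowest profiled priority, so that the
    # boot-profile ordering still takes precedence.
    priority = min((p for _, p in profiled), default=0) - 1
    extra = []
    for ext in sorted(groups):
        priority = max(priority, MIN_PRIORITY)
        # Every file sharing an extension gets the same priority.
        extra.extend((path, priority) for path in groups[ext])
        priority -= 1
    return extra


if __name__ == "__main__":
    profiled = [("usr/bin/gnome-shell", 32767), ("usr/lib/libfoo.so", 32766)]
    all_files = [
        "usr/bin/gnome-shell", "usr/lib/libfoo.so",
        "usr/share/doc/a.txt", "usr/share/icons/b.png",
        "usr/share/doc/c.txt", "usr/share/icons/d.png",
    ]
    for path, priority in extend_sort_entries(profiled, all_files):
        print(f"{path} {priority}")
```

The intent of grouping by extension is to place similar data next to each other so that the compressor can find more redundancy within a block.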

I tried this and saw no impact at all on the image size:

With squashfs-tools-ng

Branches:

If the former is not sufficient, we could switch to squashfs-tools-ng. If upstream can add advanced compression options (BCJ filters, dictionary) to its zstd compressor, that would give us ISO/USB images not much larger than our current ones, with better performance on users' systems (and hopefully no slower to build on CI and RM's machines). For this to happen:

  • Base this work on feature/bullseye or backport squashfs-tools-ng to Buster
  • Build a version of our live-build fork that uses gensquashfs
  • Advanced zstd compression option: request this from upstream
  • SquashFS file ordering
  • Replace mksquashfs-excludes
    • First, narrow down the problem a little bit (!649 (merged))
      • Update mksquashfs-excludes: some of these excludes are obsolete.
      • Delete as many of these files as possible via config/chroot_local-hooks/
    • Remove as much as possible via config/binary_rootfs/excludes: this covered all the remaining excluded files

dev (fastcomp)

It seems switching has only benefits → !643 (merged).

Edited by intrigeri