Consider disabling CPU vulnerabilities mitigation features in our CI builder/tester VMs

Originally created by @intrigeri on #17387 (Redmine)

Recent kernels (including the 4.9 LTS that we run on our CI builders/testers) offer a mitigations=off kernel command line option that disables all mitigation features for known CPU vulnerabilities. On my local Jenkins instance, I’ve measured that setting this option on both the l0 virtualization host and the l1 builder/tester VMs gives me a 15% test suite performance boost; this cuts down the feedback loop by about ½ hour, which makes a significant difference to my developer experience.

On lizard, I don’t think we can reasonably disable these mitigations on the l0 virtualization host, so this proposal is only about the l1 builder/tester VMs. We won’t know what kind of performance improvement we would get in this context without trying.
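To confirm whether the option actually took effect inside a given VM, one could inspect the kernel command line and the status files the kernel exposes under /sys/devices/system/cpu/vulnerabilities. The following is a minimal sketch, not something we run today: the function names are made up for illustration, and it assumes that when mitigations= appears several times, the last occurrence is the one that counts.

```python
from pathlib import Path

# Standard sysfs location where Linux reports per-vulnerability status.
SYSFS_VULNS = "/sys/devices/system/cpu/vulnerabilities"


def mitigations_setting(cmdline):
    """Return the value of mitigations= on a kernel command line, or None.

    Assumption: if the option is given more than once, the last one wins.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("mitigations="):
            value = token.split("=", 1)[1]
    return value


def vulnerability_status(sysfs_dir=SYSFS_VULNS):
    """Map each vulnerability name to the status line the kernel reports.

    Returns an empty dict if the directory does not exist (for example on
    a kernel that predates this sysfs interface).
    """
    root = Path(sysfs_dir)
    if not root.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(root.iterdir())
        if entry.is_file()
    }


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("mitigations =", mitigations_setting(proc_cmdline.read_text()))
    for name, status in vulnerability_status().items():
        print(f"{name}: {status}")
```

Run in an l1 VM before and after the change, this would show which vulnerabilities switch from a "Mitigation: ..." status to "Vulnerable", which is a quick sanity check before measuring any performance difference.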

These l1 VMs are mostly used to run l2 VMs:

isobuilder:
- The l2 VM is a Vagrant box that has been built by the isobuilder itself; that l2 VM is trusted (otherwise we can’t trust the Tails images we publish).
- Given the kind of input data the l2 VM handles (mostly Debian packages) and how it handles it (mostly APT), it seems very unlikely that vulns such as Spectre and Meltdown can be exploited in there, either to escape to l1 or to gain information about other processes running in the l1 VM.
- The l1 VM runs essentially nothing other than this l2 VM, so a cross-process info leak would have no harmful consequences.

isotester:
- Tails is running in the l2 VM.
- The l2 VM mostly handles trusted input data (e.g. it loads our website in a web browser). Here as well, the way it handles untrusted input data (e.g. gpg --recv-key) does not leave much room for exploiting vulns such as Spectre or Meltdown.
- The l1 VM runs essentially nothing other than this l2 VM, so a cross-process info leak would have no harmful consequences.

If we’re concerned that mitigations=off increases the risk of l2 escape to l1, and thus increases the risk of lateral progression to more sensitive VMs hosted on the same l0 virt host, then we could sandbox the QEMU process that runs l2 in l1 more strictly, by enabling AppArmor for libvirtd in l1.
Given the above, I think this risk increase is too small to bother with, but I could change my mind :)
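If we did go for stricter sandboxing, one way to check that libvirt really confines a given l2 guest with AppArmor would be to look at the seclabel elements in the running domain's XML (as printed by virsh dumpxml). This is only a sketch under that assumption: the helper names and the trimmed-down sample XML are hypothetical, not taken from our setup.

```python
import xml.etree.ElementTree as ET


def seclabel_models(domain_xml):
    """Return the security models named in a libvirt domain XML document."""
    root = ET.fromstring(domain_xml.strip())
    models = []
    for label in root.findall("seclabel"):
        model = label.get("model")
        if model is not None:
            models.append(model)
    return models


def confined_by_apparmor(domain_xml):
    """True if the domain carries a seclabel whose model is 'apparmor'."""
    return "apparmor" in seclabel_models(domain_xml)


# Hypothetical, heavily trimmed sample of what a running domain's XML
# might contain; a real one has many more elements.
EXAMPLE_XML = """
<domain type='kvm'>
  <name>example-l2-guest</name>
  <seclabel type='dynamic' model='apparmor' relabel='yes'/>
  <seclabel type='dynamic' model='dac' relabel='yes'/>
</domain>
"""

if __name__ == "__main__":
    print("AppArmor confinement:", confined_by_apparmor(EXAMPLE_XML))
```

A guest whose XML lacks an apparmor seclabel would be running its QEMU process without that extra layer, which is the case this paragraph proposes to rule out.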

Related issues

Related to tails#17386 (closed)
Blocked by #16960 (closed)

Edited Oct 13, 2021 by intrigeri
