Consider disabling CPU vulnerabilities mitigation features in our CI builder/tester VMs
Originally created by @intrigeri on #17387 (Redmine)
Recent kernels (including the 4.9 LTS that we run on our CI
builders/testers) offer a mitigations=off command-line option that
disables all mitigation features for known CPU vulnerabilities. On my
local Jenkins instance, I’ve measured that setting this option on both
the l0 virtualization host and the l1 builder/tester VMs gives me a
15% test suite performance boost; this cuts down the feedback loop by
about ½ hour, which makes a significant difference to my developer
experience. On lizard, I don’t think we can reasonably disable these
mitigations on the l0 virtualization host, so this proposal is only
about the l1 builder/tester VMs. We won’t know what kind of performance
improvement we would get in this context without trying.
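To verify whether a given l1 VM actually booted with the option, something like the following could be run inside it. This is only a rough sketch: the function names are made up for illustration, it assumes that the last mitigations= token on the command line is the one that takes effect, and it relies on the kernel exposing /proc/cmdline and /sys/devices/system/cpu/vulnerabilities.

```python
from pathlib import Path

# Standard sysfs location where the kernel reports per-vulnerability status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigations_off(cmdline):
    """Return True if the kernel command line requests mitigations=off.

    Assumption: when the option appears several times, the last one wins.
    """
    value = None
    for token in cmdline.split():
        if token.startswith("mitigations="):
            value = token.split("=", 1)[1]
    return value == "off"


def vulnerability_status(vuln_dir=VULN_DIR):
    """Map each vulnerability name to the status string the kernel reports.

    Returns an empty dict if the directory does not exist.
    """
    if not vuln_dir.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(vuln_dir.iterdir())
        if entry.is_file()
    }


if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")
    if cmdline_path.exists():
        print("mitigations=off requested:", mitigations_off(cmdline_path.read_text()))
    for name, status in vulnerability_status().items():
        print(f"{name}: {status}")
```

Comparing the output before and after the change on one isobuilder or isotester would confirm that the option took effect before we start measuring performance.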
These l1 VMs are mostly used to run l2 VMs:
- isobuilder:
  - The l2 VM is a Vagrant box that has been built by the isobuilder itself; that l2 VM is trusted (otherwise we can’t trust the Tails images we publish).
  - Given the kind of input data the l2 VM handles (mostly Debian packages) and how it handles it (mostly APT), it seems very unlikely that vulns such as Spectre and Meltdown can be exploited in there to escape to l1 or to gain information about other processes running in the l1 VM.
  - The l1 VM runs essentially nothing other than this l2 VM, so a cross-process info leak would have no harmful consequences.
- isotester:
  - Tails is running in the l2 VM.
  - The l2 VM mostly handles trusted input data (e.g. it loads our website in a web browser). Here as well, the way it handles untrusted input data (e.g. gpg --recv-key) does not leave much room for exploiting vulns such as Spectre or Meltdown.
  - The l1 VM runs essentially nothing other than this l2 VM, so a cross-process info leak would have no harmful consequences.
 
If we’re concerned that mitigations=off increases the risk of l2
escape to l1, and thus increases the risk of lateral progression to more
sensitive VMs hosted on the same l0 virt host, then we could sandbox the
QEMU process that runs l2 in l1 more strictly, by enabling AppArmor for
libvirtd in l1.
Given the above, I think this risk increase is too small to bother with,
but I could change my mind :)
Related issues
- Related to tails#17386 (closed)
- Blocked by #16960 (closed)