Harden Tails kernel with security-related kernel parameters
Originally created by @cypherpunks on #11143 (Redmine)
There are a few kernel parameters which can be safely added to the Tails boot command line to increase security at little to no cost, and some of them improve security quite noticeably. Here I present a few kernel parameters which can improve the security of Tails against kernel exploits, their rationale, and their rough cost in terms of performance, compatibility, or memory footprint. I have been adding these to Tails manually each time I boot for around a year on various machines and have never had a problem with any of them. I hope you’ll consider using them to harden Tails against kernel exploits. If additional information is needed on any of the options, I will be happy to do more research into them and provide relevant kernel code snippets if necessary.
slab_nomerge
Disables the merging of slabs of similar sizes. Many times some obscure
slab will be used in a vulnerable way, allowing an attacker to mess with
it more or less arbitrarily. Most slabs are not usable even when
exploited, so this isn’t too big of a deal. Unfortunately the kernel
will merge similar slabs to save a tiny bit of space, and if a
vulnerable and useless slab is merged with a safe but useful slab, an
attacker can leverage that aliasing to do far more harm than they could
have otherwise. In effect, this reduces kernel attack surface area by
isolating slabs from each other. The trade-off is a very slight increase
in kernel memory utilization. “slabinfo -a” can be used to tell what the
memory footprint increase would be on a given system.
slub_debug=FZ
Enables sanity checks (F) and redzoning (Z). Sanity checks are
self-evident and come with a modest performance impact, but this is
unlikely to be significant on an average Tails system. The checks are
basic but are still useful both for security and as a debugging measure.
Redzoning adds extra areas around slabs that detect when a slab is
overwritten past its real size, which can help detect overflows. Its
performance impact is negligible. I did consider adding the P value
which enables poisoning. Poisoning writes an arbitrary value to freed
objects, so any modification or reference to that object after being
freed or before being initialized will be detected and prevented. This
prevents many types of use-after-free vulns at little perf cost.
Unfortunately, the default poison value points into userland and might
make exploitation easier on systems without SMAP (aka most systems), so
I excluded the P. I’ll look into it more to see if the trade-offs
(increased vulnerability to dereferencing into userland memory in
exchange for increased resistance to UAFs) are worth it, but until then
I left it out to be safe. An additional note: any time slub_debug= is
put in the kernel command line, slab_nomerge is implied. But having
slab_nomerge explicitly declared can help prevent regressions where
disabling of debugging features is desired but re-enabling of merging is
not.
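
To make the flag letters above concrete, here is a minimal Python sketch that decodes a slub_debug= value. The table covers only a subset of the letters described in the kernel's SLUB documentation, and the function names `describe_slub_debug` and `enables_poisoning` are hypothetical helpers, not part of any existing tool.

```python
# Subset of slub_debug flag letters; other letters are reported as unknown.
SLUB_DEBUG_FLAGS = {
    "F": "sanity checks",
    "Z": "red zoning",
    "P": "poisoning",
    "U": "user tracking",
    "T": "trace",
}


def describe_slub_debug(value):
    """Return (letter, meaning) pairs for a slub_debug= value such as 'FZ'."""
    # The kernel also accepts an optional ',<slab names>' suffix;
    # only the flag letters before the first comma are decoded here.
    flags = value.split(",", 1)[0]
    return [
        (letter, SLUB_DEBUG_FLAGS.get(letter, "unknown to this sketch"))
        for letter in flags
    ]


def enables_poisoning(value):
    """True if the value turns on poisoning (the P flag left out above)."""
    return any(letter == "P" for letter, _ in describe_slub_debug(value))


if __name__ == "__main__":
    for letter, meaning in describe_slub_debug("FZ"):
        print(letter, "=", meaning)
    print("poisoning enabled:", enables_poisoning("FZ"))
```

A check like `enables_poisoning` could be used to catch a configuration that adds P before the userland-pointer concern has been settled.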
vsyscall=none
Virtual syscalls are the obsolete predecessor of vDSO calls.
Unfortunately, both vsyscall=native and vsyscall=emulate (the default)
have a negative security impact, with the latter a little less so.
Namely, they provide a target for any attacker who has control of the
return instruction pointer, which is increasingly common these days now
that attackers need to resort to ROP and similar attacks which target a
process’ control flow. The cost of vsyscall=none is reduced compatibility,
but only legacy statically compiled binaries and old versions of
glibc used vsyscalls. All software on modern Tails uses vDSO instead. If
for some reason a program does try to use a vsyscall, the process will
crash with a memory access violation, and won’t bring the whole system
down.
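
One rough way to see whether the vsyscall page is still exposed to a process is to look for a `[vsyscall]` entry in its memory map. The Python sketch below assumes that the kernel omits this entry from /proc/self/maps when booted with vsyscall=none; that assumption should be verified on the kernel Tails actually ships. `has_vsyscall_mapping` is a hypothetical helper name.

```python
def has_vsyscall_mapping(maps_text):
    """Return True if a /proc/<pid>/maps dump lists the [vsyscall] page."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The mapping name, when present, is the last whitespace-separated field.
        if fields and fields[-1] == "[vsyscall]":
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as maps_file:
            mapped = has_vsyscall_mapping(maps_file.read())
        print("vsyscall page mapped:", mapped)
    except OSError as exc:
        # /proc may be unavailable (for example on a non-Linux system).
        print("could not read /proc/self/maps:", exc)
```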
mce=0
Mostly useful for systems with ECC memory, setting mce to 0 will cause
the kernel to panic on any uncorrectable errors detected by the machine
check exception system. Corrected errors will just be logged. The
default is mce=1, which will SIGBUS on many uncorrected errors.
Unfortunately this means malicious processes which try to exploit
hardware bugginess (such as rowhammer) will be able to try over and
over, suffering only a SIGBUS at failure. Setting mce=0 should have no
noticeable impact in normal use. Any hardware which regularly triggers a memory-based MCE is
unlikely to even boot, and the default is 1 only for long-lived servers.
oops=panic
Sets the kernel to fail-fast, which is highly desirable from a
security perspective (see https://en.wikipedia.org/wiki/Fail-fast for
a succinct explanation of the reasoning). Many kernel exploits hit the kernel hard and fail many times
before finally hitting the sweet spot and gaining full control over
kernel space. A large percentage of these times, the failures result in
a kernel oops, rather than a kernel panic. Setting oops=panic will
trigger a true stop error instead. This may be problematic for machines
using very buggy drivers which cause harmless oopses. These systems will
simply crash. I think this is very unlikely on a Tails system though.
oops=panic can also be set as a sysctl, which may be preferable because
it could also allow a few other panic_on_* features to be enabled
which for some reason do not have their own kernel parameters, such as
panic_on_warn, panic_on_unknown_nmi, and panic_on_io_nmi.
There’s also panic_on_oom which might be useful to prevent the
system from locking up when memory pressure is high and not responding
to a yanked out USB stick, but that’s another discussion…
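
For the sysctl route, a small Python sketch can report the current state of the panic_on_* knobs by reading them from /proc/sys. The keys listed are the ones assumed here to exist under these exact names (`kernel.panic_on_oops`, `kernel.panic_on_warn`, `kernel.panic_on_io_nmi`, `vm.panic_on_oom`); the unknown-NMI knob is left out because its exact sysctl key is not confirmed in the text above. `sysctl_path` and `read_sysctl` are hypothetical helper names.

```python
# Assumed sysctl keys; adjust to whatever the running kernel actually exposes.
PANIC_SYSCTLS = (
    "kernel.panic_on_oops",
    "kernel.panic_on_warn",
    "kernel.panic_on_io_nmi",
    "vm.panic_on_oom",
)


def sysctl_path(name):
    """Map a dotted sysctl key to its file under /proc/sys."""
    return "/proc/sys/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current value as a string, or None if it cannot be read."""
    try:
        with open(sysctl_path(name)) as handle:
            return handle.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    for key in PANIC_SYSCTLS:
        value = read_sysctl(key)
        print(key, "=", value if value is not None else "(not available)")
```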
Summary: slab_nomerge slightly increases memory footprint, but this shouldn’t matter for Tails because it’s not an embedded system. slub_debug=FZ increases memory footprint slightly, and has a moderate performance impact in benchmarks, but is unlikely to have any impact in the real world. Remove the “F” to remove the majority of that perf impact. vsyscall=none breaks very old apps but Tails uses none of these anyway. mce=0 prevents malicious programs from trying to exploit hardware bugs by giving them only one shot at it. oops=panic causes the system to fail-fast, which is desirable from a security perspective. Systems with very buggy drivers may crash with this option set.
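
As a quick way to confirm that the five parameters summarized above actually reached the kernel, the following Python sketch compares a command line string against the list. It is only an illustration: `missing_params` is a hypothetical helper, and it does exact token matching, so a different value such as slub_debug=FZP is reported as missing.

```python
RECOMMENDED = (
    "slab_nomerge",
    "slub_debug=FZ",
    "vsyscall=none",
    "mce=0",
    "oops=panic",
)


def missing_params(cmdline, wanted=RECOMMENDED):
    """Return the wanted parameters that are absent from a kernel command line."""
    # Parameters are separated by whitespace; each one is compared as a whole token.
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as handle:
            missing = missing_params(handle.read())
        print("missing:", " ".join(missing) if missing else "(none)")
    except OSError as exc:
        # /proc may be unavailable (for example on a non-Linux system).
        print("could not read /proc/cmdline:", exc)
```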
Additional options I am looking into are reboot=cold (may make certain types of cold-boot attacks harder if memory is not removed from the system), acpi=copy_dsdt (may harden the system slightly against buggy BIOSes), and elevator=deadline (might reduce kernel attack surface, with a nice side effect of improving USB and SSD performance). I may post the rationale for them as well if they turn out to be useful security-wise.
Feature Branch: feature/11143-harden-kernel