Test suite breakage when restoring an old Bullseye snapshot: "Guest disabled display", "Display output is not active"
Sometimes, when restoring an old snapshot, immediately after our test suite sets the clock to the host's time, the display turns black with this text: "Guest disabled display" (with Buster's libvirt + QEMU) or "Display output is not active" (with buster-backports' libvirt + QEMU).
This breaks tests at least when they next look for GnomeApplicationsMenu.png on the screen.
See 04_13_41_Use_GNOME_Disks_to_unlock_a_basic_VeraCrypt_file_container_with_a_PIM.mkv
Data points
- This looks exactly like a bug in `virtio_gpu` (https://bugs.launchpad.net/qemu/+bug/1882851) that is supposed to be fixed in Linux 5.9-rc5, while we see the problem with 5.10.
- The Journal contains nothing useful.
- The bug occurs immediately after we sync the time from host to guest, that is the clock jumps a couple hours in the future.
- But we did not manage to trigger the bug manually in a VM by bumping the clock a few hours in the future (Bullseye host, virtio-gpu), running `sleep 20; date -s '2022-07-09 12:34'`
- Wayland is affected too.
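The manual reproduction attempt above uses a fixed date. A minimal sketch of the same idea with a jump relative to the current time, which is closer to what the host-to-guest sync does, could look like this. The helper names `future_timestamp` and `jump_clock_forward` are hypothetical, and the sketch assumes GNU `date` (for `-d`) in the guest:

```shell
#!/bin/sh
# Print a timestamp N hours ahead of now, in a format `date -s` accepts.
# Assumes GNU date, whose -d option understands "+3 hours".
future_timestamp() {
    hours="${1:-3}"
    date -d "+${hours} hours" '+%Y-%m-%d %H:%M:%S'
}

# Hypothetical helper: wait a bit (time to switch to the VM display),
# then jump the clock forward. Setting the clock requires root.
jump_clock_forward() {
    hours="${1:-3}"
    delay="${2:-20}"
    sleep "$delay"
    date -s "$(future_timestamp "$hours")"
}

# Example, as root in the guest:
#   jump_clock_forward 3 20
```

This only changes how the target time is computed; it is not known whether a relative jump would trigger the bug any more reliably than the fixed date did.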
Next steps
Gather data
- Check DPMS settings: `xset q`
  - virtio-gpu:
    - "DPMS is Enabled"
    - "Screen Saver: prefer blanking: yes"
  - QXL:
    - "Server does not have the DPMS Extension"
    - "Screen Saver: prefer blanking: yes"
- Test if that's DPMS
  - If `xset dpms force off` triggers the bug, then it's related to DPMS
    - virtio-gpu: screen turns black, virt-viewer says "Display output is not active" (but I cannot reproduce it anymore! same for `xset dpms force suspend` and `xset dpms force standby`, they don't do anything on my system anymore)
    - QXL: `xset` fails with "server does not have extension for dpms option", nothing happens (good)
- Check logind power saving config
  - The only related setting, `IdleAction=`, defaults to `ignore`.
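To repeat the two checks above on several VMs (virtio-gpu vs. QXL), a small sketch like the following could help. It only classifies text that is fed to it on stdin; the function names `dpms_state` and `idle_action` are hypothetical, and the "DPMS is Disabled" string is assumed from xset's usual output rather than taken from the observations above:

```shell
#!/bin/sh
# Classify the DPMS state from `xset q` output read on stdin.
# Prints one of: enabled, disabled, unsupported, unknown.
dpms_state() {
    out="$(cat)"
    case "$out" in
        *"does not have the DPMS Extension"*) echo unsupported ;;
        *"DPMS is Enabled"*) echo enabled ;;
        *"DPMS is Disabled"*) echo disabled ;;
        *) echo unknown ;;
    esac
}

# Print the effective IdleAction= from logind configuration text on stdin.
# Commented-out lines are skipped, the last assignment wins, and an absent
# or empty assignment falls back to the default, "ignore".
idle_action() {
    awk -F= '
        /^[[:space:]]*IdleAction=/ { v = $2 }
        END { print (v == "" ? "ignore" : v) }
    '
}

# Example, inside the guest:
#   xset q | dpms_state
#   systemd-analyze cat-config systemd/logind.conf | idle_action
```

Reading from stdin keeps both helpers usable on saved output, for example text copied out of a failed test run.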
Try potential workarounds
- Test if this also happens with QXL graphics (instead of virtio): !854 (closed)
  - It creates worse problems. We would need to spend more time on it to fix them.
- Test if moving the mouse brings the desktop back: !871 (closed)
  - I still see "Guest disabled display" after sync'ing the time from host to guest. Moving the mouse does not wake up the screen. No error in the Journal.
- Disable DPMS in the VM: `xset -dpms`: !873 (closed)
  - I've seen 1 instance of "Guest disabled display", immediately after sync'ing the time from host to guest.
- Try a bunch of workarounds on !885 (closed)
  - Does not work: there are still instances of "Guest disabled display", immediately after sync'ing the time from host to guest.
  - Host to guest time sync: do it with higher level tool e.g. `timedatectl set-time`
  - `gsettings set org.gnome.settings-daemon.plugins.power idle-dim false`
  - Type a key to wake up the screen (also see !871 (closed))
  - Wake up the display with `xset dpms on`
    - To detect the problem: after we do `$vm.host_to_guest_time_sync`, wait a few seconds, and check if we see "Guest disabled display" on the screen.
    - Or don't try to detect the problem, and "simply" run `xset dpms on` every second for N seconds? It's a cheaper way to check if this would help.
    - Either way, if that works for `amnesia`, applying this to `Debian-gdm` requires finishing the fix for Remote shell cannot run X commands in the Welco... (#19049 - closed).
- Upgrade isotesters QEMU and libvirt to the versions from bullseye-backports
  - No more "Guest disabled display" so far, but instead "Display output is not active", which interestingly is what I got earlier when trying `xset dpms force off`, so it could be worth retrying the various workarounds with this new stack (`xset dpms on`, move the mouse, type a key, etc.)
- Test if this also happens with QXL graphics (instead of virtio) with the new QEMU and libvirt
  - base this on !854 (closed)
  - It (still) creates worse problems. We would need to spend more time on it to fix them.
- Upgraded QEMU and libvirt + a pile of workarounds (!885 (closed))
- Find out what's the power saving feature that makes the display turn off, if it's not DPMS
  - On !885 (closed), I've disabled every power saving, screensaver, etc. feature I could think of ⇒ no luck.
  - Then, I've applied the same settings in a Tails 5.2 VM with virtio GPU, running on a sid host, waited 1h, and the screen remained on. I did not try with a Bullseye host so it's possible that the newer libvirt & QEMU fixes things; if it does, then we'll get the fix for free from bullseye-backports at some point.
- Try to maintain virt-viewer connection
  - It could be that virt-viewer disconnecting triggers the bug
  - Revert 25c45e35: !893 (closed)
- Try again/harder to move the mouse: !871 (closed) didn't work, but maybe we could try again
  - Rationale: on intrigeri's sid system, it's enough to wake up the screen that's gone off while debugging interactively a test suite run.
- Make stuff verbose to try to understand which software turns the display off
- Consider reverting the QEMU and libvirt upgrades
- Detect this in the test suite as a "special" kind of error. It doesn't fix the problem, but it helps when triaging failures. !1014 (merged)
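The "run it every second for N seconds" idea from the workaround list could be sketched as below. The notes write the command as `xset dpms on`; this sketch assumes that `xset dpms force on` is what is meant, since that is the form xset documents for switching the display back on. `wake_display_loop`, `XSET` and `WAKE_INTERVAL` are hypothetical names introduced here so that the loop can be dry-run without an X display:

```shell
#!/bin/sh
# Ask the X server to turn the display on, N times, pausing between tries.
#   $1            number of attempts (default: 10)
#   XSET          command to run instead of xset (hypothetical knob,
#                 e.g. XSET=echo for a dry run)
#   WAKE_INTERVAL seconds between attempts (hypothetical knob, default: 1)
wake_display_loop() {
    count="${1:-10}"
    i=0
    while [ "$i" -lt "$count" ]; do
        # Assumption: "force on" is the intended DPMS request.
        ${XSET:-xset} dpms force on
        i=$((i + 1))
        # Pause between attempts, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "${WAKE_INTERVAL:-1}"
        fi
    done
}

# Example, right after the host-to-guest time sync, in the guest session:
#   wake_display_loop 10
```

As the notes point out, running this for the `Debian-gdm` user would additionally depend on the remote shell being able to run X commands there (#19049).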