keep_node_busy_during_cleanup sometimes loses the race against next test suite job ⇒ test failures
I've seen a pathological case that goes like this:
- a test suite job completes and triggers `keep_node_busy_during_cleanup` and `reboot_node`, as expected
- another test suite job gets scheduled on the same node, while `keep_node_busy_during_cleanup` should run first
- `reboot_node` restarts the node while the 2nd test suite job has started running ⇒ that 2nd job fails
Hypotheses that could be worth exploring:
- We're not sleeping long enough at https://gitlab.tails.boum.org/tails/jenkins-jobs/-/blob/master/macros/test_Tails_ISO.yaml#L71
- The https://plugins.jenkins.io/PrioritySorter/ plugin is buggy.
- The priority values we pass to PrioritySorter are buggy (e.g. higher value might now mean higher priority).
- The way we configure PrioritySorter is buggy.
- Some leftover job in the queue messed things up (this happened after the upgrade of isotesters to Bullseye).
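
To make the priority-value hypothesis concrete, here is a toy sketch in Python. It is not PrioritySorter's code and does not claim to know which ordering the plugin uses: the `next_job` helper and the priority numbers are invented for illustration. It only shows that the same pair of values picks a different job first depending on whether lower or higher values win.

```python
# Hypothetical queue snapshot: (job name, priority value). The numbers are
# made up for illustration and are not the values from our job definitions.
QUEUE = [
    ("test_Tails_ISO", 5),
    ("keep_node_busy_during_cleanup", 1),
]


def next_job(queue, lower_value_wins=True):
    """Return the name of the job that would be started first.

    lower_value_wins selects which interpretation of the priority value
    is in effect; which one the real plugin applies is exactly what the
    hypothesis above questions.
    """
    pick = min if lower_value_wins else max
    return pick(queue, key=lambda job: job[1])[0]


if __name__ == "__main__":
    for lower_value_wins in (True, False):
        print(f"lower_value_wins={lower_value_wins}: "
              f"{next_job(QUEUE, lower_value_wins)}")
```

If the interpretation flipped at some point, the test suite job would be started before `keep_node_busy_during_cleanup`, which would match the failure described above.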
cc @zen
Edited by intrigeri