    jenkins-slave.service: only enable a node if we marked it as temporarily offline (#10601) · 46ad3c62
    intrigeri authored
    The enable_node() method toggles offline status if and only if the master thinks the node
    is currently offline, which can be the case:
     - either because the agent was not started yet (this is always true at this
       stage) *and* the master already understood that the agent that was running
       before the reboot got disconnected: that's racy, so let's not rely on this,
       unlike the implementation before this commit;
     - or because we marked the node as temporarily offline ourselves (before
       rebooting, in test_Tails_*).
    Let's eliminate at least one part of the race condition, by:
     - only fiddling with temporarilyOffline state when we really need to,
       i.e. when we explicitly put the node temporarily offline ourselves;
       this makes us stop relying on the exact timing of when the Jenkins
       master registers the fact that an older agent got disconnected;
     - waiting for the Jenkins master to have registered this state change, before
       we start the agent; I'm not sure this is really needed, but I'm under the
       impression that the master processes requests to put a node back online in
       a non-blocking manner, which would race against the startup of the agent, and
       I'm not willing to bet on the opposite.
    I'm not sure this will be sufficient to fully fix this race condition.
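
    The two steps above can be sketched roughly as follows. This is an
    illustration only, not the code from this commit: it assumes a client
    object exposing get_node_info() and enable_node() in the style of
    python-jenkins, and the helper name, timeout and polling interval are
    hypothetical.

```python
import time


def enable_node_if_marked_offline(client, node, timeout=60.0, interval=1.0,
                                  sleep=time.sleep, clock=time.monotonic):
    """Put `node` back online only if it is marked temporarily offline.

    Returns True if the node was toggled back online, False if it was
    left untouched. `client` is assumed to provide get_node_info(name),
    returning a dict with a "temporarilyOffline" key, and
    enable_node(name).
    """
    info = client.get_node_info(node)
    if not info.get("temporarilyOffline", False):
        # The node was not explicitly put offline, so do not touch its
        # state: whether the master already noticed the old agent's
        # disconnection is a matter of timing we do not want to rely on.
        return False

    client.enable_node(node)

    # Wait until the master reports the state change, so that the agent
    # is only started once the node is no longer temporarily offline.
    deadline = clock() + timeout
    while client.get_node_info(node).get("temporarilyOffline", False):
        if clock() >= deadline:
            raise TimeoutError(
                "node %s is still marked temporarily offline" % node)
        sleep(interval)
    return True
```

    A caller would start the agent only after this function returns;
    whether the polling loop is strictly needed is as uncertain here as it
    is in the description above.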