intrigeri authored
The enable_node() method toggles offline status if and only if the master thinks the node is currently offline, which can be the case:

 - either because the agent was not started yet (this is always true at this stage) *and* the master has already noticed that the agent that was running before the reboot got disconnected: that's racy, so let's not rely on it, unlike the implementation before this commit;
 - or because we marked the node as temporarily offline ourselves (before rebooting, in test_Tails_*).

Let's eliminate at least one part of the race condition, by:

 - only fiddling with the temporarilyOffline state when we really need to, i.e. when we explicitly put the node temporarily offline ourselves; this way we stop relying on the exact timing of when the Jenkins master registers the fact that an older agent got disconnected;
 - waiting for the Jenkins master to have registered this state change before we start the agent; I'm not sure this is really needed, but I'm under the impression that the master processes requests to put a node back online in a non-blocking manner, which would race against the startup of the agent, and I'm not willing to bet on the opposite.

I'm not sure this will be sufficient to fully fix this race condition.
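For illustration, the two steps above could be sketched roughly as follows. This is not the code from this commit, only a minimal sketch under stated assumptions: `client` is assumed to be any object exposing `enable_node(name)` and `get_node_info(name)`, with the latter returning a dict that carries a `temporarilyOffline` key; the function name, the `we_disabled_it` flag, and the timeout values are hypothetical.

```python
import time


def bring_node_back_online(client, node_name, we_disabled_it,
                           timeout=60.0, poll_interval=1.0,
                           sleep=time.sleep, clock=time.monotonic):
    """Re-enable a node only if we disabled it, then wait for the master.

    Hypothetical helper: `client` is assumed to provide enable_node() and
    get_node_info(); the latter is assumed to return a dict containing a
    "temporarilyOffline" boolean.
    Returns True if the node was re-enabled, False if nothing was done.
    """
    # Step 1: only touch temporarilyOffline when we set it ourselves, so we
    # do not depend on when the master notices the old agent disconnected.
    if not we_disabled_it:
        return False

    client.enable_node(node_name)

    # Step 2: poll until the master reports the state change, in case it
    # processes the request asynchronously.
    deadline = clock() + timeout
    while True:
        info = client.get_node_info(node_name)
        if not info.get("temporarilyOffline", False):
            return True
        if clock() >= deadline:
            raise TimeoutError(
                f"{node_name} still temporarily offline after {timeout}s")
        sleep(poll_interval)
```

The caller would start the agent only after this function returns. Whether the polling is actually needed depends on how the master handles the request, which the message above leaves open.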
46ad3c62