Jenkins jobs frequently fail with a timeout when fetching from the puppet-git.lizard:tails Git repository
Impact: this makes our CI too unreliable for the "git push or click build and look at the results tomorrow" dev workflow, because one has to babysit Jenkins jobs long enough to make sure that git fetch succeeded.
See e.g. https://jenkins.tails.boum.org/job/build_website_master/5335/console
17:46:11 > git fetch --tags --force --progress -- gitolite3@puppet-git.lizard:tails +refs/heads/*:refs/remotes/origin/* # timeout=10
17:56:11 ERROR: Timeout after 10 minutes
17:56:11 ERROR: Error cloning remote repo 'origin'
17:56:11 hudson.plugins.git.GitException: Command "git fetch --tags --force --progress -- gitolite3@puppet-git.lizard:tails +refs/heads/*:refs/remotes/origin/*" returned status code 128:
17:56:11 stdout:
17:56:11 stderr: remote: Enumerating objects: 65647
remote: Enumerating objects: 112533
remote: Enumerating objects: 132445
remote: Enumerating objects: 135232
remote: Enumerating objects: 140582
remote: Enumerating objects: 145943
remote: Enumerating objects: 167049
remote: Enumerating objects: 184307
remote: Enumerating objects: 199246
[… hundreds of such lines …]
remote: Enumerating objects: 790054
remote: Enumerating objects: 791571
fetch-pack: unexpected disconnect while reading sideband packet
17:56:11 fatal: protocol error: bad pack header
When this happens, I see an error like this one in the journal of puppet-git.lizard:
gitolite[336896][336896]: system() failed,git,shell,-c,git-upload-pack '/var/lib/gitolite3/repositories/tails.git',-> 36096
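A possible reading of the trailing number, as a minimal sketch: assuming 36096 is the raw wait status that Perl's system() returned inside gitolite (an assumption, not confirmed by the log itself), the exit code sits in the high byte and the terminating signal in the low 7 bits. The helper name decode_wait_status is hypothetical.

```shell
# Hypothetical helper: split a raw wait status (as found in Perl's $?)
# into the child's exit code and the signal that killed it, if any.
decode_wait_status() {
    status=$1
    echo "exit_code=$(( status >> 8 )) signal=$(( status & 127 ))"
}

# The value from the journal line above.
decode_wait_status 36096
```

Under that assumption this gives exit code 141 and no direct signal. A shell conventionally reports 128 + N when its command was killed by signal N, so 141 would correspond to signal 13 (SIGPIPE). That would be consistent with git-upload-pack writing to a connection that the client had already closed, but it does not by itself say which side gave up first.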
On puppet-git.lizard, sudo journalctl | grep -E 'gitolite.*system\(\) failed'
returns 50 results between March 21 and April 4. The journal there only goes back to March 19.
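To see how those failures are spread over time, the same grep can be extended to count matches per day. This is only a sketch: it assumes journalctl's default short output format, where the first two fields of each line are the month and the day, and the function name count_failures_per_day is hypothetical.

```shell
# Hypothetical helper: read journal lines on stdin and print one line
# per day in the form "<month> <day> <number of gitolite failures>".
# Relies on the journal being in chronological order, so that lines
# from the same day are adjacent for uniq.
count_failures_per_day() {
    grep -E 'gitolite.*system\(\) failed' \
        | awk '{ print $1, $2 }' \
        | uniq -c \
        | awk '{ print $2, $3, $1 }'
}
```

It would be run on puppet-git.lizard as: sudo journalctl | count_failures_per_day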