
Build our production website in GitLab CI

Scope

In scope

  • build our live, production website via a GitLab CI job, and then deploy the output to webserver(s) upon success
  • by default, use caching plus ikiwiki --refresh to avoid a large increase in time-to-publication (see the sketch after this list)
  • developers and tech writers can force, via GitLab CI, a full rebuild of the website that bypasses and invalidates the cache
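
A minimal sketch of what such a build job could look like, assuming a hypothetical FULL_REBUILD pipeline variable, a Debian-based image, and a public/ output directory; none of these names are the actual Tails configuration:

```yaml
# Illustrative sketch only: the job name, image, cache paths, the
# FULL_REBUILD variable, and the public/ destdir are assumptions.
build_website:
  image: debian:stable
  variables:
    FULL_REBUILD: "no"    # set to "yes" when running a pipeline to bypass the cache
  cache:
    key: website-build
    paths:
      - .ikiwiki/         # ikiwiki's state between refreshes
      - public/           # previous output, so --refresh only rebuilds changed pages
  script:
    - apt-get update && apt-get install --yes ikiwiki po4a
    # Forced full rebuild: drop the cached state so ikiwiki starts from scratch.
    - if [ "$FULL_REBUILD" = "yes" ]; then rm -rf .ikiwiki public; fi
    - ikiwiki --setup ikiwiki.setup --refresh
  artifacts:
    paths:
      - public/
```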

Out of scope

This issue is not about serving our website via GitLab pages.

Expected benefits

More robust

  • the build happens in a controlled, mostly reproducible environment, so problems caused by transitions between states are less likely
  • the output of the build is published only if it succeeded ⇒ no partly refreshed, half-broken website in production
  • avoid problems caused by incorrect state transitions, like Deleted page still served and indexed by search... (#18065 - closed)

Non-sysadmins have more agency over their work

  • everyone can look at the build output: not only the person who pushed, but also the person who should investigate and debug what happened
  • developers can fix stuff themselves via the GitLab CI config file, if needed
  • developers and tech writers can maintain the configuration themselves: ikiwiki.setup, ikiwiki plugins, and build dependencies such as po4a (Upgrade to po4a 0.62 (tails#18667 - closed) and the upcoming tails#20239 (closed))
    • no need to maintain changes in 2 different versions (tails.git, puppet-tails)
    • no need to coordinate merging branches with deploying updated configuration on the production infra

Recover from broken website refresh/build without sysadmin intervention

In a variety of situations, an ikiwiki refresh triggered by a Git push fails and leaves the wiki in an unclean state; the only way to recover is then to ssh into the machine and manually start a full rebuild. This is painful because:

  • When this happens during a release process, the release can be left half-published until someone fixes this. That’s not fun for the RM.
  • It puts timing/availability/expectations pressure on sysadmins.
  • I suspect our technical writers have grown wary of pushing the kinds of changes that typically trigger this sort of problem. Not being able to do one’s job with a reasonable amount of confidence in oneself and in our infra is surely no fun.

Paves the way towards web server redundancy

Context: #16956 (closed)

E.g. here's how Tor is doing it: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/static-shim#deploying-a-static-site-from-gitlab-ci

Examples: https://gitlab.torproject.org/tpo/web/tpo/-/blob/main/.gitlab-ci.yml?ref_type=heads and https://gitlab.torproject.org/tpo/web/blog/-/blob/main/.gitlab-ci.yml?ref_type=heads

And an example deployment: https://gitlab.torproject.org/tpo/web/tpo/-/jobs/496878
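
Modeled on that approach, the deploy step might look roughly like this; the SSH key variable, target host, and path are illustrative assumptions, not actual Tails infrastructure names:

```yaml
deploy_website:
  stage: deploy
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH   # only deploy the production branch
  script:
    # DEPLOY_SSH_KEY would be a masked CI/CD variable holding a private key
    # authorized (ideally with a forced command) on the webserver.
    - eval "$(ssh-agent -s)"
    - echo "$DEPLOY_SSH_KEY" | tr -d '\r' | ssh-add -
    - rsync -az --delete public/ deploy@www2.example.org:/var/www/tails-website/
```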

Originally created by @intrigeri on #17364 (Redmine)

To-do

  • Build the website in GitLab CI and push it to www2 (tails!1519 - merged)
  • Fix Website builds in GitLab CI sometimes timeout a... (#18086 - closed)
  • Prevent jobs corresponding to older commits from overwriting newer versions of the website (see thread below, and the first sketch after this list)
  • Use our own container image to build the website
  • Pin our GitLab's container registry IP in /etc/hosts of gitlab-runner VMs
  • Figure out how to feed Ikiwiki's PO file updates back to tails.git (see the second sketch after this list)
  • Push the website to www (somehow) and retire tails::website
  • Test changing source string so IkiWiki pushes updated .po files back to the repo
  • Check if there's a better access-token setup than the current one, regarding needed permissions and expiration time
  • Document accordingly
  • Push to www2 via the private network and remove the public access to that VM's SSH service
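
For the "older commits overwriting newer versions" item, one possible approach is to serialize deploy jobs and refuse to overwrite a newer deployment. This is a sketch, not a decision: resource_group is a real GitLab keyword, but the host, path, and .deployed-commit marker file are assumptions:

```yaml
deploy_website:
  resource_group: production    # GitLab runs at most one such job at a time
  variables:
    GIT_DEPTH: "0"              # full history, so the ancestry check below works
  script:
    - deployed=$(ssh deploy@www2.example.org cat /var/www/tails-website/.deployed-commit || true)
    # Skip if our commit is an ancestor of (i.e. older than or equal to) the
    # deployed one; assumes the deployed commit is reachable in this clone.
    - |
      if [ -n "$deployed" ] && git merge-base --is-ancestor "$CI_COMMIT_SHA" "$deployed"; then
        echo "A newer (or identical) commit is already deployed; skipping."
        exit 0
      fi
    - rsync -az --delete public/ deploy@www2.example.org:/var/www/tails-website/
    - echo "$CI_COMMIT_SHA" | ssh deploy@www2.example.org "cat > /var/www/tails-website/.deployed-commit"
```

And for feeding PO file updates back to tails.git, a job could commit and push whatever .po files the build regenerated, authenticating with a project access token stored in a CI/CD variable (PO_PUSH_TOKEN and the committer identity here are hypothetical, and the token would need write access to the branch):

```yaml
push_po_updates:
  script:
    - git config user.name "Website CI"
    - git config user.email "ci@example.org"   # placeholder identity
    - git add -- '*.po'
    - |
      if ! git diff --cached --quiet; then
        git commit -m "Update PO files from website build"
        git push "https://oauth2:${PO_PUSH_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git" "HEAD:${CI_COMMIT_BRANCH}"
      fi
```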