Automate mirror pool management with a server-side redirector

Why
- Main reason
- Additional benefits
Candidates
Implementation
Deployment
Follow-ups

Why

Main reason

When we designed our current mirror pool, we dismissed the server-side options mainly for reasons that are now all moot:

We did not host our website ourselves → we now do.
We had no place of our own to host such a redirector → we now have this.
We lacked sysadmin resources to implement something server-side → we could now decide to prioritize this if we wanted.
We were worried about our web server becoming a SPOF → our current client-side mirror picking code relies on our web server and has the same problem.

So we decided to pick the mirror on the client side. It felt pretty cool at the time, but experience shows that this solution left us with a whole set of other problems to solve:

https://gitlab.tails.boum.org/tails/blueprints/-/wikis/HTTP_mirror_pool#improve-ux-and-lower-maintenance-cost-2021
We have to maintain our mirror pool manually, which is tedious, repetitive, and stressful
We have to write, maintain, and improve custom code to:
- monitor our mirrors (check-mirrors + integration into our infra)
- pick a mirror on the client side, using programming languages very few of us are comfortable with (JavaScript + integration into our Perl Upgrader)
Even seemingly simple changes require careful planning and coordination between developers, technical writers, mirror pool maintainers, and release managers. Quite often the update must be released in several stages. This sometimes leads technical writers to implement JS/CSS hacks that bypass the design instead of integrating their change into it, which suggests the design is not great.
Only 1 person has the big picture in mind.

All of these need more work to make our current implementation sustainable. So I think it's time to consider cutting our costs: stop investing in this design and ditch the whole thing in favor of a server-side redirector.

Additional benefits

As we see all downloads, we can do stats about downloads and upgrades ⇒ UX has more useful data to do their job
We can ditch the DNS fallback pool eventually, so:
- One less piece of infra (Git hook & code that updates the dl.a.b.o DNS zone)
- No more hard-coded IP addresses that may change without upstream notifying us
We can simplify our website: no need to differentiate clients with/without JavaScript anymore, one less piece of JS to integrate

Candidates

The most popular option these days seems to be Mirrorbits (available in Debian Bookworm).

Implementation

Deployment

Stage 0: Infrastructure

Deploy Mirrorbits in rsync.lizard → puppet-tails!99 (merged)
Decide on what hostname to use for download URLs → download.tails.net
Set up a reverse proxy for download.tails.net to rsync.lizard (with TLS and http → https redirection)
Make rsync_url mandatory in mirrors.json (and remove non-compliant mirrors)
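
A minimal sketch of how non-compliant mirrors could be listed before making rsync_url mandatory. Only rsync_url comes from the plan above; the top-level "mirrors" list, the "url_prefix" field, the example hostnames, and the helper name are assumptions made for illustration.

```python
import json


def split_by_rsync_url(mirrors_json: str):
    """Return (compliant, non_compliant) mirror entries.

    Assumes a document shaped like {"mirrors": [{...}, ...]};
    this layout is an assumption, not taken from the real file.
    """
    doc = json.loads(mirrors_json)
    compliant, non_compliant = [], []
    for mirror in doc.get("mirrors", []):
        # A mirror is compliant only if rsync_url is a non-empty string.
        rsync_url = mirror.get("rsync_url")
        if isinstance(rsync_url, str) and rsync_url.strip():
            compliant.append(mirror)
        else:
            non_compliant.append(mirror)
    return compliant, non_compliant


# Hypothetical sample data using reserved example hostnames.
EXAMPLE = json.dumps({
    "mirrors": [
        {"url_prefix": "https://a.example/tails",
         "rsync_url": "rsync://a.example/tails"},
        {"url_prefix": "https://b.example/tails"},
        {"url_prefix": "https://c.example/tails", "rsync_url": ""},
    ]
})

if __name__ == "__main__":
    ok, bad = split_by_rsync_url(EXAMPLE)
    for mirror in bad:
        print("missing rsync_url:", mirror["url_prefix"])
```

Entries reported as missing rsync_url would be the ones to contact or remove before the field becomes mandatory.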

Stage 1: Use the mirror redirector for downloads from our website (simple, low-risk, reverting is cheap)

In an MR targeting master (!945 (merged)), switch from http://dl.amnesia.boum.org to https://download.tails.net in:

Once satisfied that it works well:

Drop the ikiwiki underlay for mirror-pool-dispatcher

Stage 2: Make the Upgrader use the mirror redirector (not entirely simple, reverting needs a release)

MR: !983 (merged)

This is a tricky topic, as the Upgrader we've shipped in already released Tails 5.x must allow upgrading to any future 5.y.

Thankfully, we're in luck:

The transformURL JS function only uses the length of fallback_download_url_prefix (hardcoded to http://dl.amnesia.boum.org/tails in already released 5.x), not the actual contents of the string.
https://download.tails.net/tails is the same length as http://dl.amnesia.boum.org/tails

So, once we generate UDFs that use the new URL prefix, the old Upgrader will keep working exactly the same way as it currently does: it'll replace that new URL prefix with a random one from mirrors.json. And it'll keep falling back to the DNS pool on failure.
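
To make the length argument concrete, here is a small Python illustration of a prefix replacement that depends only on the length of the hard-coded prefix. It is not the actual transformURL JavaScript code; the function name, the mirror prefix, and the file path are hypothetical.

```python
OLD_PREFIX = "http://dl.amnesia.boum.org/tails"
NEW_PREFIX = "https://download.tails.net/tails"


def transform_url(url: str, hardcoded_prefix: str, mirror_prefix: str) -> str:
    """Swap the leading prefix of url for mirror_prefix.

    Only len(hardcoded_prefix) is used: the contents of the hard-coded
    prefix are never compared with the URL.
    """
    return mirror_prefix + url[len(hardcoded_prefix):]


if __name__ == "__main__":
    # Hypothetical mirror and path, for illustration only.
    mirror = "https://mirror.example/tails"
    new_style_url = NEW_PREFIX + "/stable/example.iuk"
    # An old client still has OLD_PREFIX hard-coded, yet the result is
    # correct because both prefixes have the same length.
    print(len(OLD_PREFIX) == len(NEW_PREFIX))
    print(transform_url(new_style_url, OLD_PREFIX, mirror))
```

If the two prefixes differed in length, the slice would cut the path in the wrong place, which is why the equal length matters here.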

To make newer Upgrader versions actually use the mirror redirector:

Ensure the Upgrader follows redirections
- We don't override LWP::UserAgent's default redirection settings, which allow up to 7 redirections (max_redirect = 7).
- This works: PERL5LIB=config/chroot_local-includes/usr/src/iuk/lib DISABLE_PROXY=1 ./config/chroot_local-includes/usr/src/iuk/bin/tails-iuk-get-target-file --uri https://download.tails.net/tails/project/trace --fallback-uri=https://download.tails.net/tails/project/trace --hash-type=sha256 --hash-value=77f44a44a342a12e6894f144bdd92075549bb33b7788ae7557c2bd344c766b27 --output-file=/tmp/trace --size=11
Generate UDFs that point to the mirror redirector
- Code: config/chroot_local-includes/usr/src/iuk/lib/Tails/IUK/UpgradeDescriptionFile/Generate.pm
Update example UDFs in wiki/src/contribute/design/upgrades.mdwn
Drop the "replace URL with a random one from the mirror pool JSON" and "fallback to DNS pool" mechanisms: instead, use the mirror redirector
- Drop config/chroot_local-includes/usr/src/perl5lib/lib/Tails/MirrorPool.pm, which will become unused
Update mirrors design docs accordingly (wiki/src/contribute/design/)
Drop submodules/mirror-pool-dispatcher
Remove obsolete dependencies (Node.js?)
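
The first step above relies on two things: the HTTP client follows the redirection issued by the mirror redirector, and the downloaded file still matches the expected size and hash. A rough Python analogue of that check, assuming nothing about the Upgrader's Perl internals (the class and function names are hypothetical):

```python
import hashlib
import urllib.request


class LimitedRedirectHandler(urllib.request.HTTPRedirectHandler):
    # Roughly mirrors LWP::UserAgent's default cap of 7 redirections
    # (urllib's own default is 10).
    max_redirections = 7


def fetch(url: str, timeout: float = 30.0):
    """Download url, following redirections; return (final_url, body)."""
    opener = urllib.request.build_opener(LimitedRedirectHandler)
    with opener.open(url, timeout=timeout) as response:
        return response.geturl(), response.read()


def matches(data: bytes, expected_size: int, expected_sha256: str) -> bool:
    """Check the payload against the expected size and SHA-256 digest."""
    if len(data) != expected_size:
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()
```

Calling fetch() on a redirector URL and passing the body to matches() with the size and hash from the UDF would correspond to the --size, --hash-type, and --hash-value checks in the command above. The final URL returned by fetch() shows which mirror was selected.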

Wrap up

Consider using LocalJSPath and the bundled fetchfiles.sh to serve JS/CSS/Fonts for Mirrorbits UI from our server. → Tracked separately at: sysadmin#17971 (closed)
Document and plan periodic procedures for:
- Updating the GeoLite2 databases
- Updating the mirror pool
Unsubscribe Zen-Fu from tails-mirrors

What we won't update and why

Pre-existing upgrade YAML files: wiki/src/upgrade (most URLs are broken, and we can start using the new one for next releases)
One very old report: wiki/src/news/report_2013_11.mdwn (historic data)
News entries: wiki/src/news/ (most URLs are broken, and we can start using the new one for next releases)
Sandbox: wiki/src/sandbox.*.po (maybe add the new URL just for reference?)

Follow-ups

These should be tracked elsewhere as they should not block us from closing this issue:

Optimize Upgrader download

Status: done on 653cbdee. If it's any more complicated than this, let's create an issue to track it.

Context: #18263 (comment 193021)

Currently Tails::IUK::TargetFile::Download downloads IUKs using:

    SocksPort 127.0.0.1:9062 IsolateDestAddr IsolateDestPort

We should probably use another SocksPort, one with IsolateDestAddr and IsolateDestPort disabled, so there's at least a chance that the circuit used to connect to Mirrorbits is reused for the actual download, thus benefiting from the fact that Mirrorbits uses GeoIP to select the mirror.
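
The reasoning can be illustrated with a deliberately simplified model of stream isolation. It considers only the two flags discussed here and ignores every other rule Tor applies when choosing circuits; the function names and the mirror hostname are hypothetical.

```python
def isolation_key(host: str, port: int,
                  isolate_dest_addr: bool, isolate_dest_port: bool) -> tuple:
    """Return a key such that streams with different keys never share a circuit.

    Toy model: only IsolateDestAddr and IsolateDestPort are considered.
    """
    key = []
    if isolate_dest_addr:
        key.append(host)
    if isolate_dest_port:
        key.append(port)
    return tuple(key)


def may_share_circuit(a: tuple, b: tuple, addr: bool, port: bool) -> bool:
    """True if the two (host, port) destinations are allowed to share a circuit."""
    return isolation_key(*a, addr, port) == isolation_key(*b, addr, port)


if __name__ == "__main__":
    redirector = ("download.tails.net", 443)
    mirror = ("mirror.example", 443)  # hypothetical mirror chosen via GeoIP
    # With both flags set, the redirect target always gets another circuit.
    print(may_share_circuit(redirector, mirror, True, True))
    # With both flags disabled, the circuit may be reused.
    print(may_share_circuit(redirector, mirror, False, False))
```

In this model, the request to the redirector and the request to the selected mirror can only share a circuit, and therefore an exit node, when both flags are disabled. That is the case in which the GeoIP-based choice stays relevant for the actual download.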

Deprecate the fallback DNS pool

#19333

Rethink how we monitor our mirror pool

#19334 (closed)

Edited Dec 23, 2022 by Zen Fu
