Monitor external broken links on our website
It would be great if someone prepared whatever is needed (scripts, cronjob line, email output) to monitor broken outgoing links on the Tails website, i.e. links on our website that point to a third-party resource that no longer exists, and to regularly send useful reports about them to some email address.
Ideally, it would be good to cache old results in order to report new(ly) broken links, and links that were broken last time already, separately (a bit like apticron is doing).
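To make the "new vs. already broken" idea concrete, here is a minimal sketch in Python. It assumes each run yields a collection of (parent page, broken URL) pairs; the function name `classify_broken_links` and the example URLs are hypothetical and not part of any existing tool.

```python
def classify_broken_links(previous, current):
    """Split the current run's broken links into new and already-known ones.

    Both arguments are iterables of (parent_page, broken_url) tuples; this
    shape is an assumption made for the sketch.
    """
    previous, current = set(previous), set(current)
    return {
        # Broken now, but not in the cached previous run.
        "new": sorted(current - previous),
        # Broken in both runs: reported separately, as apticron does
        # for pending upgrades.
        "still_broken": sorted(current & previous),
        # Broken last time, no longer reported as broken.
        "fixed": sorted(previous - current),
    }


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    last_run = [("https://tails.boum.org/doc/", "http://old.example/gone")]
    this_run = [
        ("https://tails.boum.org/doc/", "http://old.example/gone"),
        ("https://tails.boum.org/", "http://new.example/missing"),
    ]
    for kind, links in classify_broken_links(last_run, this_run).items():
        print(kind, links)
```

The cached previous result could simply be the output file of the last run, kept next to the new one until the report has been sent.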
Once the basics are ready, we will want to turn the whole thing into a Puppet module, and deploy it on our infrastructure, but as a first step, preparing things without Puppet, as long as there is some setup documentation, would be enough.
It’s important to avoid the Not Invented Here syndrome, as we don’t want to maintain a big new chunk of software forever. Most likely, existing tools can be reused extensively. There may even be existing Puppet modules that do the whole thing.
I’m ignoring /contribute for now. Let’s start with more exposed sections of the website and then move on to more internal ones.
Current command line used:
    linkchecker --file-output=csv/tails.csv --no-warnings --check-extern \
        --no-follow-url="https://tails.boum.org/blueprint/.*" \
        --no-follow-url="https://tails.boum.org/news/.*" \
        --no-follow-url="https://tails.boum.org/security/.*" \
        --no-follow-url="https://tails.boum.org/contribute/.*" \
        https://tails.boum.org/
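As a rough sketch of how the resulting CSV could be turned into a set of broken links for later comparison, the Python below may help. It rests on assumptions that should be checked against a real `tails.csv`: that the file is semicolon-separated, that lines starting with `#` are comments, and that the header row contains `urlname`, `parentname` and `valid` columns. The function name `load_broken_links` is hypothetical.

```python
import csv
import io


def load_broken_links(csv_text, delimiter=";"):
    """Return a set of (parent_page, broken_url) pairs from CSV text.

    Assumed layout (verify against real output): '#' comment lines,
    then a header row naming 'urlname', 'parentname' and 'valid'.
    """
    # Drop comment lines before handing the rest to the CSV parser.
    rows = (line for line in io.StringIO(csv_text) if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter=delimiter)
    broken = set()
    for row in reader:
        # Treat a row as broken unless it is explicitly marked valid.
        if (row.get("valid") or "").strip().lower() != "true":
            broken.add((row.get("parentname") or "", row.get("urlname") or ""))
    return broken


if __name__ == "__main__":
    # Path taken from the --file-output option in the command above.
    with open("tails.csv", encoding="utf-8", newline="") as handle:
        for parent, url in sorted(load_broken_links(handle.read())):
            print(f"{url} (linked from {parent})")
```

Keying each entry on both the parent page and the URL means the same dead link is reported once per page that needs fixing.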