Have Tails systems report their version to us in a privacy-preserving way
Before #15281 (closed), as a side-effect of our upgrade design, once connected to Tor, a Tails system would send an HTTPS GET request over Tor to our website, to a URL that included the version of the running Tails. This request ended up in our website logs (without IP address).
Since #15281 (closed), we have lost this source of information.
If we recovered the ability to collect such metrics, we could use them to measure:
- upgrade rate
  - argument for fundraising, if this data shows users don't upgrade enough (or not fast enough), and if user research confirms that this is because of problems in our current upgrade system
- which old versions are still alive in the wild, which impacts for example:
  - which minimal version of the Upgrader we can assume most/all users are running (credits: @BitingBird)
- more accurate metrics, by allowing us to filter out boots of dev builds (e.g. test suite runs)
- how much our betas are being used
Let’s make it a conscious decision rather than an unintended side-effect.
Was the data collection we previously had sufficiently innocuous to be acceptable? Or do we want Tails to hide more information from us? (If so, which information?)
This would be good to have, but it is not worth more than maybe 6h of implementation, and it is probably not worth creating tensions in our community or spending hours on complicated discussions.
Implementation idea: add a ?current_version=$xyz query parameter to the HTTP request sent by the Upgrader to fetch the UDF.
What we're telling our web server (in RAM):
- at $time, someone started Tails initially installed at $version, currently running $current_version
- IP of Tor exit node being used (leaked already == baseline)
What we're storing:
- at $time, someone started Tails (IP not stored) initially installed at $version, currently running $current_version
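To make the implementation idea concrete, here is a minimal sketch in Python of what the request and the stored record could look like. It is not the Upgrader's actual code: the URL `https://example.org/upgrade/3.0/upgrades.yml`, the function names, and the record layout are all hypothetical, and the sketch simply assumes that the initially installed version is already part of the URL path.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Hypothetical UDF URL: we assume the initially installed version (3.0 here)
# is already part of the path. The real layout may differ.
UDF_URL = "https://example.org/upgrade/3.0/upgrades.yml"


def udf_url_with_version(base_url: str, current_version: str) -> str:
    """Return base_url with a current_version query parameter appended."""
    parts = urlsplit(base_url)
    query = urlencode({"current_version": current_version})
    if parts.query:
        query = parts.query + "&" + query
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, query, parts.fragment)
    )


def stored_record(time: str, exit_ip: str, url: str) -> dict:
    """Model what the server would keep: the Tor exit node IP is visible
    while the request is handled, but it is never written to the record."""
    del exit_ip  # seen in RAM only, deliberately not stored
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "time": time,
        "path": parts.path,  # carries the initially installed version
        "current_version": query["current_version"][0],
    }


if __name__ == "__main__":
    url = udf_url_with_version(UDF_URL, "3.11")
    print(url)
    # 192.0.2.1 is a documentation-only address standing in for an exit node.
    print(stored_record("2018-12-01T10:00:00Z", "192.0.2.1", url))
```

The point of `stored_record` is only to show that the exit node IP can be dropped before anything is persisted, which matches the "IP not stored" item above.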
Someone with access to our logs could use the info that "this person uses Tails version X" to incriminate someone. Such info could be correlated with:
- Request/access time
- Wiretap info
- Increases in Tor traffic
Also:
- Such info could help expose users running old Tails versions with vulnerabilities that could be leveraged.
- If our webserver (or a future mirror) is compromised, the config could be changed to turn on IP logging and turn that webserver/mirror into a harvesting point.
- In such a case, the version check page could be replaced by some script that would serve malware suited to the specific Tails version.
This means that a compromise of our webserver/mirror could open up an escalation path to all outdated Tails instances, which is creepy.
One way to mitigate that would be to encrypt the version information to a public key whose secret part is not stored in the webserver. That information would only be decrypted when processing server logs somewhere outside of the webserver. It'd be good to double check how solid the encryption is when encrypting the same small amounts of data so many times. Our understanding is that using a salt (as OpenPGP does) solves that, but it'd be good to assert this more solidly if deciding to go with such mitigation.
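In other words, given the ciphertexts A'* of one plaintext A and the ciphertexts B'* of another plaintext B, OpenPGP should not permit an attacker to discriminate between A'* and B'* any better than random.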
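The worry about encrypting the same small values many times can be illustrated with a toy model. The Python sketch below is not OpenPGP and is not secure: it uses textbook RSA with a deliberately tiny, hard-coded public key, and the list of candidate versions is made up. It only shows why a deterministic public-key scheme would let anyone holding the public key guess the plaintext by trying every known version, and why adding fresh randomness to each message defeats that naive guess. OpenPGP gets this property by generating a fresh random session key for every message. Decryption is left out because it is not needed for the argument.

```python
import secrets

# Toy public key: the product of two known Mersenne primes. Far too small
# for real use; it only makes the arithmetic runnable without dependencies.
N = (2**61 - 1) * (2**89 - 1)
E = 65537

# Hypothetical plaintext space: the handful of versions an attacker would try.
CANDIDATE_VERSIONS = ["3.9", "3.10", "3.11", "3.12"]


def to_int(data: bytes) -> int:
    return int.from_bytes(data, "big")


def encrypt_deterministic(version: str) -> int:
    """Textbook RSA: the same plaintext always gives the same ciphertext."""
    return pow(to_int(version.encode()), E, N)


def encrypt_randomized(version: str) -> int:
    """Prefix 8 fresh random bytes, so equal plaintexts encrypt differently."""
    pad = secrets.token_bytes(8)
    return pow(to_int(pad + version.encode()), E, N)


def dictionary_attack(ciphertext: int):
    """Encrypt every candidate with the public key and look for a match."""
    for candidate in CANDIDATE_VERSIONS:
        if encrypt_deterministic(candidate) == ciphertext:
            return candidate
    return None


if __name__ == "__main__":
    print(dictionary_attack(encrypt_deterministic("3.11")))
    print(dictionary_attack(encrypt_randomized("3.11")))
```

With the deterministic variant the attack recovers the version; with the randomized variant the attacker would also have to guess the 64 random bits, so the simple enumeration finds nothing.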
Note that compression might have an effect though:
$ echo aaaa | gpg -r sajolida@pimienta.org --compression-algo none --encrypt | wc -c
583
$ echo abcd | gpg -r sajolida@pimienta.org --compression-algo none --encrypt | wc -c
583
$ echo aaaa | gpg -r sajolida@pimienta.org --compression-algo ZLIB --encrypt | wc -c
592
$ echo aaaa | gpg -r sajolida@pimienta.org --compression-algo BZIP2 --encrypt | wc -c
628
$ echo aaaa | gpg -r sajolida@pimienta.org --compression-algo ZIP --encrypt | wc -c
586
So we shouldn't use compression.
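The reasoning behind avoiding compression can be sketched outside of gpg. The Python example below uses the standard `zlib` and `bz2` modules on two made-up inputs of equal length, one highly repetitive and one without repeated bytes. It shows that the compressed size depends on the content, so the length of a compressed-then-encrypted message can leak something about the plaintext, whereas uncompressed inputs of equal length stay equal. Whether inputs as short as a version string would actually differ in size is not something this sketch demonstrates.

```python
import bz2
import zlib


def sizes(plaintext: bytes) -> dict:
    """Return the length of the plaintext raw and after each compressor."""
    return {
        "none": len(plaintext),
        "zlib": len(zlib.compress(plaintext)),
        "bzip2": len(bz2.compress(plaintext)),
    }


# Two hypothetical plaintexts of identical length.
repetitive = b"a" * 64      # highly compressible
varied = bytes(range(64))   # no repeated bytes


if __name__ == "__main__":
    for name, data in (("repetitive", repetitive), ("varied", varied)):
        print(name, sizes(data))
```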
See also https://tails.boum.org/blueprint/user_survey/ for the additional user research that we could do about it.