[[!meta title="Tagged snapshots of upstream APT repositories"]]

[[!toc levels=2]]

# Overview

Our tagged snapshots of upstream APT repositories are published on
<http://tagged.snapshots.deb.tails.boum.org/>.

These are _partial_, tagged snapshots of the upstream APT repositories
we need. They allow one to rebuild a released ISO in the future, and
they let us keep the corresponding source code around.

The main goals here are to have reproducible builds some day, and to
comply with various licenses such as the GPL.

These snapshots are partial: in a given snapshot, we import only the
packages needed by a given build of Tails.

The corresponding data shall be backed up, and expired very
cautiously, if ever.

# Source code

* `tails::reprepro::snapshots::tagged` class in
  [[!tails_gitweb_repo puppet-tails]]
* bits scattered in the main Tails Git repository (details below)

# Design notes

## Listing needed packages

To generate partial APT repositories, we need to know what to include
in them. Therefore, we create a _build manifest_ at the end of an ISO
build. It is generated by
[[!tails_gitweb auto/scripts/generate-build-manifest]], thanks to
[[!tails_gitweb data/wrappers/apt-get]] and
[[!tails_gitweb data/debootstrap/scripts/jessie.patch]].

Output:

- for each APT repository we use time-based snapshots for: name, serial
- for each binary package: name, version, architecture
- for each source package: name, version
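
As an illustration only, the three kinds of records listed above could be modelled as follows. This is a minimal sketch: the class names, field names and the serial value are hypothetical, and it does not describe the on-disk format actually written by `generate-build-manifest`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BinaryPackage:
    name: str
    version: str
    arch: str


@dataclass(frozen=True)
class SourcePackage:
    name: str
    version: str


@dataclass
class BuildManifest:
    # Maps the name of each time-based APT repository to the serial
    # that was used at build time.
    origin_serials: dict[str, str] = field(default_factory=dict)
    binary_packages: list[BinaryPackage] = field(default_factory=list)
    source_packages: list[SourcePackage] = field(default_factory=list)


# Made-up example values, for illustration only.
manifest = BuildManifest(
    origin_serials={"debian": "2016010101"},
    binary_packages=[BinaryPackage("a2ps", "1:4.14-1.1+deb7u1", "amd64")],
    source_packages=[SourcePackage("a2ps", "1:4.14-1.1+deb7u1")],
)
```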

In passing, here are some nice side-effects of having this build
manifest:

- It allows us to inspect the diff between the subsets of two
  different snapshots that were used at build time; the benefit is
  quite small as long as we're based on Debian stable (we also fetch
  packages from testing, sid, backports, etc. though), but if/when we
  switch to being based on Debian testing, then we will definitely
  want that.
- If a branch (a topic branch, devel, etc.) introduces a regression
  and changes the set of packages used at build time, we may want to
  check exactly how that set changed. Think "check the diff between
  `.packages`" as we do at release time, but done in a more
  correct way.
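
The kind of comparison described above could be sketched as follows, assuming each manifest has already been reduced to a mapping from `(name, architecture)` to version. The function name and the input shape are hypothetical; no such tool is referenced in this document.

```python
def diff_packages(old, new):
    """Compare two {(name, arch): version} mappings.

    Returns three dicts: packages only in `new`, packages only in `old`,
    and packages present in both with different versions.
    """
    added = {key: new[key] for key in new.keys() - old.keys()}
    removed = {key: old[key] for key in old.keys() - new.keys()}
    changed = {
        key: (old[key], new[key])
        for key in old.keys() & new.keys()
        if old[key] != new[key]
    }
    return added, removed, changed


# Made-up package sets, for illustration only.
before = {("a2ps", "amd64"): "1:4.14-1.1+deb7u1", ("foo", "amd64"): "1.0-1"}
after = {("a2ps", "amd64"): "1:4.14-1.3", ("bar", "amd64"): "2.0-1"}
added, removed, changed = diff_packages(before, after)
```

Here `changed` reports the `a2ps` version bump, while `added` and `removed` list `bar` and `foo` respectively.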

## Importing packages into partial snapshots

### How it's done in practice

* [[!tails_gitweb auto/scripts/tag-apt-snapshots]]
* [tails-prepare-tagged-apt-snapshot-import](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/tagged/tails-prepare-tagged-apt-snapshot-import)
* [tails-publish-tagged-apt-snapshot](https://git-tails.immerda.ch/puppet-tails/tree/files/reprepro/snapshots/time_based/tails-publish-tagged-apt-snapshot)

### A corner case: APT pinning magics

If a (package, version) is seen at build time in 2 or more APT
sources, `tails-prepare-tagged-apt-snapshot-import` injects it
into each of the tagged snapshots corresponding to these sources.

The goal is to avoid the following scenario, which could happen if we
injected each package _only_ into the distribution it was downloaded from:

 - version X of package P is available both in suite S1 on origin O1,
   and in suite S2 on origin O2
 - version Y of package P is available in suite S3 of origin O3
 - our pinning makes us prefer version X of package P *because it's
   available in O1/S1*; otherwise, if it wasn't in there, then our
   pinning would make APT prefer version Y to version X
 - at ISO build time, APT fetches package P version X from O2/S2
 - given this build manifest, we import package P version X into our
   tagged snapshot of O2/S2, but not into our tagged snapshot of O1/S1
 - if we rebuild from the same source tree using that set of tagged
   snapshots, then version Y of package P will be installed
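
The injection rule that prevents this can be sketched as below. This is only an illustration of the idea, not the code of `tails-prepare-tagged-apt-snapshot-import`; the function name and data shapes are hypothetical.

```python
from collections import defaultdict


def plan_imports(seen_in):
    """Decide which (package, version) pairs go into which tagged snapshot.

    `seen_in` maps each (package, version) to the list of (origin, suite)
    sources in which it was seen at build time. Every pair is injected
    into *all* of those sources, not only the one it was downloaded from.
    """
    plan = defaultdict(set)
    for package_version, sources in seen_in.items():
        for source in sources:
            plan[source].add(package_version)
    return dict(plan)


# The scenario above: P version X is visible in both O1/S1 and O2/S2.
plan = plan_imports({("P", "X"): [("O1", "S1"), ("O2", "S2")]})
```

With this rule, `plan` contains P version X for both `("O1", "S1")` and `("O2", "S2")`, so the pinning sees the same picture when rebuilding.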

This scenario can happen in practice:

	# cat /etc/apt/sources.list
	deb http://security.debian.org wheezy/updates main
	deb http://ftp.us.debian.org/debian/ wheezy main
	deb http://ftp.us.debian.org/debian/ jessie main

	# cat /etc/apt/preferences
	Package: *
	Pin: origin security.debian.org
	Pin-Priority: -10

	Package: *
	Pin: release o=Debian,n=wheezy
	Pin-Priority: 990

	Package: *
	Pin: release o=Debian,n=jessie
	Pin-Priority: 700

	# apt-cache madison a2ps
	a2ps | 1:4.14-1.3 | http://ftp.us.debian.org/debian/ jessie/main amd64 Packages
	a2ps | 1:4.14-1.1+deb7u1 | http://security.debian.org/ wheezy/updates/main amd64 Packages
	a2ps | 1:4.14-1.1+deb7u1 | http://ftp.us.debian.org/debian/ wheezy/main amd64 Packages

	# apt-cache policy a2ps
	a2ps:
	  Installed: (none)
	  Candidate: 1:4.14-1.1+deb7u1
	  Version table:
	     1:4.14-1.3 0
	        700 http://ftp.us.debian.org/debian/ jessie/main amd64 Packages
	     1:4.14-1.1+deb7u1 0
	        -10 http://security.debian.org/ wheezy/updates/main amd64 Packages
	        990 http://ftp.us.debian.org/debian/ wheezy/main amd64 Packages

And then, APT will download `a2ps` from security.d.o:

	# apt-get download a2ps --print-uris
	'http://security.debian.org/pool/updates/main/a/a2ps/a2ps_4.14-1.1+deb7u1_amd64.deb' a2ps_4.14-1.1+deb7u1_amd64.deb 956298 sha256:e47d7fe9adb7aa62421108debf425830f4e2385e98151c5cb359d3eb8688eea8

... but if `a2ps` were not available in the regular Wheezy archive,
e.g. because we were using a tagged snapshot that imported `a2ps` only
into the security archive, then APT would prefer `a2ps` from Jessie,
which demonstrates the problem.
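
The effect can be reproduced with a deliberately simplified model of APT's candidate selection: a version gets the highest pin priority among the sources that provide it, versions with a negative priority are never installed, and the highest priority wins. Real APT also compares version numbers and takes the installed version into account, which this sketch ignores. The function name and the source labels are hypothetical.

```python
def candidate(versions, pin_priority):
    """Pick an installation candidate in a simplified APT-like way.

    `versions` maps a version string to the list of sources providing it;
    `pin_priority` maps each source to its pin priority.
    Ties are not resolved by version comparison here (real APT does that).
    """
    best_version, best_priority = None, None
    for version, sources in versions.items():
        priority = max(pin_priority[source] for source in sources)
        if priority < 0:
            continue  # a negative priority prevents installation
        if best_priority is None or priority > best_priority:
            best_version, best_priority = version, priority
    return best_version


pins = {"security wheezy/updates": -10, "wheezy": 990, "jessie": 700}

# a2ps as seen at build time, in all three sources.
at_build_time = {
    "1:4.14-1.3": ["jessie"],
    "1:4.14-1.1+deb7u1": ["security wheezy/updates", "wheezy"],
}
# a2ps as seen if it had been imported only into the security snapshot.
security_only = {
    "1:4.14-1.3": ["jessie"],
    "1:4.14-1.1+deb7u1": ["security wheezy/updates"],
}

build_choice = candidate(at_build_time, pins)
rebuild_choice = candidate(security_only, pins)
```

Under these assumptions `build_choice` is the Wheezy version while `rebuild_choice` is the Jessie one, matching the behaviour described above.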

## Valid-Until

A tagged APT repository snapshot that was used to build a given Tails
release is immutable by design, so it does not need the protections
provided by `Valid-Until`. Besides, not using `Valid-Until` for those
makes it much easier to reproduce a given ISO build in the future.

So, the `Release` files for tagged snapshots have no
`Valid-Until` field.
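
A quick way to check this property on a `Release` file is sketched below. It is a minimal parser for top-level `Field: value` lines, written for illustration; the function names and the sample content are hypothetical and not part of our tooling.

```python
def release_fields(text):
    """Return the top-level fields of a Release file as a dict."""
    fields = {}
    for line in text.splitlines():
        # Continuation lines (e.g. checksum lists) start with whitespace.
        if not line or line[0].isspace() or ":" not in line:
            continue
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields


def has_valid_until(text):
    """Tell whether a Release file carries a Valid-Until field."""
    return any(name.lower() == "valid-until" for name in release_fields(text))


# Made-up Release contents, for illustration only.
tagged_release = "Origin: Debian\nCodename: jessie\nDate: Sat, 02 Jan 2016 00:00:00 UTC\n"
dated_release = tagged_release + "Valid-Until: Sat, 09 Jan 2016 00:00:00 UTC\n"
```

For a tagged snapshot we expect `has_valid_until()` to return `False`, as it does for `tagged_release` here.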

## Garbage collection

We want to keep "forever" the tagged snapshots used by Tails releases.

In practice, "forever" == min(3 years for GPL, how long we want to be
able to reproduce the build of a released ISO) = 3 years.

Depending on the growth rate of our tagged snapshots in practice, we
may or may not need to implement expiration of these snapshots any
time soon. Time will tell.

# Known issues

## Unusable tagged APT snapshots generated for unused APT sources

When an APT source from which we pull no package at ISO build time is
configured in the Tails Git repository, the tagged APT snapshot
generated for that APT source will be unusable, which breaks the
ISO build.

To avoid this problem, ensure we do not enable any unused APT source
at ISO build time.
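
Such a check could be sketched as a set difference between the configured APT sources and the sources that actually provided packages during a build. This is a hypothetical helper, assuming both lists are available as plain source names; the names below are made up.

```python
def unused_sources(configured, used):
    """Return the configured APT sources that provided no package."""
    return sorted(set(configured) - set(used))


# Made-up source names, for illustration only.
leftover = unused_sources(
    configured=["debian", "debian-security", "torproject"],
    used=["debian", "torproject"],
)
```

Any name reported in `leftover` would be a candidate for removal from the build configuration.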