hey,
last week I attended the Reproducible Builds World Summit #3.
It was even better — for me — than last year, because this time I knew
quite a bit more what I was talking about: last year we had not
started our own work on this topic, so all the experience I had was
about making Debian packages build reproducibly.
Here are some very subjective highlights.
Reporting back & sharing knowledge
==================================
I facilitated a session about making system images reproducible.
It was very popular: about 8-10 people attended, mostly Qubes OS and
Debian folks. I mostly pointed out the main known issues and
solutions based on our experience. Personally I did not learn much,
except that there's a tool to create reproducible ext4 FS images,
which can be interesting for our VM images (on the Vagrant and
cloud/infrastructure side) and for the upcoming research process about
revamping our install/upgrade, on-disk format and distributed images
format. Attendees were pretty excited about the fact it now seems
doable to build reproducible system images (disclaimer: we're *not*
the first ones, another Debian Live system did it before).
I also ran a 1-to-1 skill sharing session about our APT snapshots
system, attended by a Qubes OS developer. We came up with an idea that
could allow us to keep the signatures from the Debian archive, even in
partial "tagged" snapshots, instead of shipping our own. It's not
trivial to implement but it would bring quite some value to our
reproducibility feature. I'll keep this in mind for [stage 2].
[stage 2]
https://labs.riseup.net/code/issues/14455
Documentation
=============
With Ulrike and lynxis we've de-Tails-ified the [report] I sent
a few weeks ago about how we made our ISO image build reproducibly.
It's now [live] on the RB website. The goal is to gather and share
knowledge with anyone else who wants to make their own system images
reproducible.
As you know, it is super important to me that we work in a way that's
well connected with the broader ecosystem and communities around us,
and that we share whatever knowledge we learn, whatever tools we
create, in the process. I hope this sets a good example of how this
should be done (no, humility is not my strongest skill, that's a known
issue :)
[report]
https://lists.reproducible-builds.org/pipermail/rb-general/2017-October/000656.html
[live]
https://reproducible-builds.org/docs/system-images/
What can we call "reproducible"?
================================
This was the most important and challenging task I brought with me to
this event. What we can legitimately call "reproducible" is a somewhat
touchy topic in a community whose name includes "reproducible"; makes
sense, doesn't it, uh? The timing was perfect, as our plan is to start
boasting publicly about "Tails is reproducible" in 6 days.
So I initiated a discussion on this topic during the session about
system images, and followed up on it privately with a number of
attendees in order to convince myself we had buy-in from
this community.
At first it was not obvious to me, and to a number of attendees, that
assembling an image from a bunch of binary blobs satisfied the
criteria we have for calling it reproducible. Then, while discussing
this topic, we realized that:
- The compilation process for some packages includes stuff that comes
straight from binary artifacts shipped by other packages
(build-dependencies); I don't recall what exact examples were
provided about C programs, Ximin knows better and it's definitely
outside of my comfort zone.
- The official [definition] of "reproducible builds" that we created at
the same event last year allows one to have build-dependencies.
Arguably the packages we fetch from (our own snapshots of) the
Debian archive are build-dependencies.
So we're good… in principle. But how we shape our communication does
matter: the line between "we're good" and "we're diluting/corrupting
the RB message" is thin and we're walking right along it.
Let me clarify: even though in principle everyone agreed with this,
there was a consensus that when calling such things reproducible, one
must make it clear what the inputs are: source code and
build-dependencies. In our case, we should make it clear that we are
*not* building everything from source, but rather *assembling*
an ISO image, 99% of whose content comes from *binary*
build-dependencies.
[definition]
https://reproducible-builds.org/docs/definition/
End-user policy
===============
We had great discussions about how to convey the value of reproducible
builds to end-users in a meaningful and actionable manner, e.g. "I
want to install only reproducible packages". I'm too lazy to explain
what we came up with now (notes were taken and there will be a global
report from the event), but process-wise I was extremely happy and
somewhat surprised: we quite easily managed to take a user-centric
PoV, to avoid diving into implementation details when it didn't
matter, to think in terms of value proposition, and to avoid
pretending we're good at designing GUIs. We also managed to take
exceptions into account (e.g. packages that are not reproducible yet,
proprietary blobs actual users need to get their Wi-Fi working) which
is often the hard part when designing such systems.
My secret mission about this before going to this event was to ensure
dkg (and not myself) would lead this effort. I'm glad I can say it
worked :)
Now, one person volunteered to implement one of the needed backend
bits, but I don't think anyone formally committed to
facilitating/leading the next steps of this process. It's no big deal
IMO as we were aware
these discussions were kinda premature. I'm confident what we produced
will be a solid basis for whoever comes next, possibly at the next RB
summit or, who knows, earlier.
Interestingly, if/once this gets implemented, this will allow us
(non-trivially again but still) to ensure the packages we install in
our ISO are the same as the ones that N reproducers have built, which
again would be good for our stage 2: even if our indices and
signatures are different, at least we'll have a guarantee that the
*contents* of these packages were not altered on our infrastructure,
which is a pretty good start.
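As a rough illustration of that last point, here is a minimal Python
sketch of such a check. It is hypothetical and not what was designed at
the summit: the function names, the idea of a per-rebuilder SHA-256
report, and the threshold parameter are all assumptions made for the
example.

```python
from __future__ import annotations

import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a package's raw content."""
    return hashlib.sha256(data).hexdigest()


def confirmed_by_rebuilders(
    package: bytes, reported: dict[str, str], threshold: int
) -> bool:
    """Return True if at least `threshold` rebuilders reported the same
    checksum as the package we are about to install.

    `reported` maps a (hypothetical) rebuilder name to the SHA-256 that
    rebuilder obtained when building the package independently.
    """
    digest = sha256_of(package)
    matching = [name for name, checksum in reported.items() if checksum == digest]
    return len(matching) >= threshold


# Usage example with made-up data.
pkg = b"pretend this is the content of a .deb"
reports = {
    "rebuilder-a": sha256_of(pkg),
    "rebuilder-b": sha256_of(pkg),
    "rebuilder-c": sha256_of(b"something else"),
}
print(confirmed_by_rebuilders(pkg, reports, threshold=2))  # True
print(confirmed_by_rebuilders(pkg, reports, threshold=3))  # False
```

Only the package contents are compared here, which is why differing
indices and signatures would not get in the way of such a check.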
Questions are welcome.
cheers,
--
intrigeri