[Tails-dev] Tails persistence use case

Author: anonym
Date:  
To: The T(A)ILS public development discussion list
Subject: [Tails-dev] Tails persistence use case
Hi,

I've started working on todo/persistence [1] in live-boot upstream, so I
think it's time for us to settle on exactly what we want. My intention
is to update the RFC for persistence related changes in live-boot [2]
with any conclusions reached in this thread, and then discuss them on
the Debian live mailing list, and then (if accepted) implement them.

[1] https://tails.boum.org/todo/persistence/
[2] http://live.debian.net/devel/rfc/persistence/

Requirements
============

From the roadmap and various other places in our todo item [1] our
high-level requirements seem to be these:

* Persistent application-specific configurations.
* Persistent user data store.
* Ability to mount persistent media in read-only mode.
* Protection against excessive write wear on flash media.

Current state of live-boot
==========================

live-boot enables persistence in four ways:

* live-rw: an overlay that is mounted on /.
* live-sn: a snapshot that is copied over / on boot, and is updated
with changes on / on shutdown.
* home-{rw,sn}: like the above two, respectively, but only for /home.

These options are too crude for us -- we need something more
flexible, which allows making only specific directories persistent.
Only a certain type of snapshot has support for that currently, namely
the cpio.gz type (i.e. a cpio archive that is gzipped). Since I've
implemented read-only mode for snapshots [3], it seems we're almost there!

[3]
http://git.immerda.ch/?p=tails_live-boot.git;a=commit;h=c69497ad905cb09df4ced7e418fbef45e579bd3e

The problems with snapshots
===========================

But cpio.gz snapshots have some issues:

1. It is very unfriendly to flash-based storage if we only make minor
changes to our persistent data, since *all* persistent data is
written back to the physical storage at *every* shutdown. I'm afraid
minor changes are the more typical usage, which is where it matters.
Imagine having 100 MB of emails, fetching maybe 50 KB worth of new
mail from your inbox, and then syncing *all* 100 MB worth of old,
unmodified emails back as well. That causes pretty significant write
wear compared to how much data was added to the snapshot.
2. On boot all snapshots' files are synced into the tmpfs, so they're
stored in precious RAM. Hence snapshots cannot be very large
(specifically, the maximum is ~ ${RAM_SIZE}/2, and that leaves no
space for other file system modifications).
3. Syncing on shutdown may take a long time since all persistent data
is synced at the same time, not continuously over the whole session.
4. Crash or emergency shutdown => certain data loss.
5. Doesn't currently support file deletion.

2 and 4 are inherent issues with snapshots, and cannot be avoided. The
rest are only implementation issues: 1 may be fixed if we ditch the
compression, and use "tar -u" style appending of changes, which also may
improve 3 somewhat (don't know). 5 can be fixed in various ways.
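
To put rough numbers on issue 1, here is a small shell sketch of the
arithmetic for the mail example. It assumes 1 MB = 1024 KB and ignores
compression and filesystem overhead, neither of which is specified
above; the variable names are illustrative only.

```shell
#!/bin/sh
# Rough write-amplification estimate for a full-snapshot sync.
# Assumptions (not from live-boot): 1 MB = 1024 KB, no compression,
# no filesystem overhead.
old_kb=$((100 * 1024))   # 100 MB of existing, unmodified mail
new_kb=50                # 50 KB of newly fetched mail

# A snapshot writes everything back, old and new alike.
written_kb=$((old_kb + new_kb))

# Integer ratio of data written to data actually added.
amplification=$((written_kb / new_kb))

echo "written: ${written_kb} KB, new: ${new_kb} KB, amplification: ~${amplification}x"
```

Under these assumptions each shutdown writes roughly two thousand
times more data than was actually added.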

The case for overlays
=====================

To me it seems like overlays are inherently nicer than snapshots since
none of the above affects them. Sure, write wear is also a problem
for them, but I think we get the best of all worlds if we use overlays
with flash-friendly filesystems like yaffs/logfs/ubifs or whatever.
Support for that needs to be added then, of course.

At the very least the "persistent user data store" cannot use snapshots
due to point 2, so an overlay is the only option and thus must support
our requirements (e.g. "read-only" and "arbitrary directories", both of
which are currently not implemented for overlays).

The overlay's only limitation is that it has a static size and cannot
automatically grow like a snapshot file can (snapshot partitions can't
either, for the same reason). Trying to anticipate how much space is
required for each "persistent application-specific configuration" is
not a path I'd like to tread, whether the estimate is made by us or by
our users. Instead, the configurations need to share the space of an
underlying overlay.

Proposed solution: locally specified inclusions
===============================================

We make home-{rw,sn} obsolete; only live-{rw,sn} are considered by
live-boot (or more correctly, the scripts it adds to the initramfs).
When a persistent medium (with label/filename "live-rw") is found by
live-boot, it looks for a file called .live-persistence.includes (but
I'll continue calling it just ".includes") in its root. If it's not
there, then it mounts the medium (using aufs) on / just like it does for
live-rw currently. But if .includes is present, then it doesn't mount
anything on /; instead it bind-mounts the directories listed in
.includes to their specified destinations.

In-depth example
----------------

live-boot scans $dev and mounts it on some $mnt (current behaviour). It
finds $mnt/.includes, which consists of:

/home/amnesia/.gnupg
/var/lib/tor

This translates into live-boot doing:

  mount -o bind $mnt/home/amnesia/.gnupg  $root/home/amnesia/.gnupg
  mount -o bind $mnt/var/lib/tor          $root/var/lib/tor


where $root is what will become the filesystem root after initramfs.
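
As a rough illustration of the translation above, the following shell
sketch only prints the bind-mount commands instead of running them, so
it can be tried without root. The function name print_bind_mounts and
the example paths /mnt/live-rw and /root are hypothetical, and skipping
blank lines is an assumption; this is not live-boot's actual code.

```shell
#!/bin/sh
# Dry-run sketch: print the bind mounts implied by an .includes file.
# print_bind_mounts is a hypothetical helper, not a live-boot function.
print_bind_mounts() {
    includes=$1   # path to the .includes file
    mnt=$2        # where the persistent medium is mounted
    root=$3       # what will become the filesystem root

    # Read one directory per line; also handle a last line without newline.
    while IFS= read -r dir || [ -n "$dir" ]; do
        # Assumption: blank lines are ignored.
        [ -n "$dir" ] || continue
        printf 'mount -o bind %s%s %s%s\n' "$mnt" "$dir" "$root" "$dir"
    done < "$includes"
}

# Example run against a throwaway .includes file.
tmp=$(mktemp)
printf '%s\n' /home/amnesia/.gnupg /var/lib/tor > "$tmp"
print_bind_mounts "$tmp" /mnt/live-rw /root
rm -f "$tmp"
```

Printing the commands rather than executing them keeps the sketch safe
to run on a normal system; a real implementation would also have to
create missing destination directories and handle mount failures.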

Snapshots
---------

.includes could also be used for all types of snapshots, making the
rather awkward (and only available for cpio.gz type snapshots)
/etc/live-snapshot.list obsolete. I say it's awkward because it's
*inside* the live system, so if you want to change it you need to build
a new image.

Note: for snapshots the .includes file wouldn't be limited to
directories but could also handle individual files. Overlays cannot
handle individual files as long as we rely on bind-mounting.

Backwards-compatibility
-----------------------

*If* we care about backwards-compatibility, we'd have to allow for an
extended syntax which allows specifying source-destination pairs. To get
the old home-{rw,sn} type persistence, .live-persistence.includes should
then look simply like this:

  .    /home


which translates to:

  mount -o bind $mnt/. $root/home

If live-boot finds a home-{sn,rw} labeled/named medium, it could create
the above file and everything would work exactly like before.
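
A possible way to handle both the plain and the extended syntax is
sketched below, again as a dry run that only prints commands. The
helper name print_pair_mounts is hypothetical. It assumes that fields
are separated by whitespace (so paths containing spaces are not
supported) and that a line with a single field uses the same path as
source and destination.

```shell
#!/bin/sh
# Dry-run sketch for the extended "source destination" syntax.
# print_pair_mounts is a hypothetical helper, not a live-boot function.
print_pair_mounts() {
    includes=$1   # path to the .includes file
    mnt=$2        # where the persistent medium is mounted
    root=$3       # what will become the filesystem root

    # read splits on whitespace: first field is the source, the rest
    # (if any) is the destination.
    while read -r src dst || [ -n "$src" ]; do
        [ -n "$src" ] || continue          # skip blank lines
        [ -n "$dst" ] || dst=$src          # one field: same path on both sides
        # Strip one leading slash so both "/var/lib/tor" and "." can be
        # joined onto their base directory in the same way.
        printf 'mount -o bind %s/%s %s/%s\n' \
            "$mnt" "${src#/}" "$root" "${dst#/}"
    done < "$includes"
}

# Example: the home-{rw,sn} compatibility line plus a plain entry.
tmp=$(mktemp)
printf '%s\n' '  .    /home' '/var/lib/tor' > "$tmp"
print_pair_mounts "$tmp" /mnt/home-rw /root
rm -f "$tmp"
```

With this approach the plain one-column format stays valid, so old and
new style entries could be mixed in the same file.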

Beyond backwards-compatibility, some people may find this syntax more
convenient. I'm not sure it is worth the effort of implementing it,
though. There are other UI changes in live-boot's persistence handling
that will break stuff, so people will have to be prepared to fix
their old setups anyway.

Conclusion
==========

I think this will be pretty easy to implement, consistent between
snapshots and overlays, and quite powerful in all its simplicity. Thoughts?

Cheers!