Hi!
28/05/13 20:47, winterfairy@??? wrote:
> The following patches introduce support for persisting /var/lib/tor. The
> primary benefit of this is the improved security/anonymity by keeping ones
> Tor entry guards. But there is bootstrap and circuit speed benefits too.
Nice work! It's awesome that you're tackling this long-overdue task, but
your patches overlook some issues. I hope that this very long email
won't scare you off. :) Your contribution is greatly appreciated!
First, you did not update the documentation in
doc/first_steps/persistence/configure.mdwn. In general we won't merge
anything before the corresponding user/design docs (if applicable) are
updated. Contributing the docs yourself isn't a requirement, but it
will definitely speed up the process. For this particular preset, the
docs should clearly warn (see that page for examples) about the
"physical location tracking via guard node set fingerprinting" issue
described in the ticket [1], and make it clear that any Vidalia/torrc
configuration changes (e.g. bridges) are *not* made persistent. Linking
to the Tor FAQ's entry about entry guards [0] may also be worthwhile.
[0]
https://www.torproject.org/docs/faq.html.en#EntryGuards
Second, your patches do not deal with the issues listed on the
ticket [1].
[1]
https://tails.boum.org/todo/persistence_preset_-_tor/
The first issue, about tracking the physical movement of a particular
Tails user, isn't something we can do much about (except documenting)
without patching Tor. I suppose one could consider adding a Tails
Greeter option for temporarily disabling Tor data dir persistence for
the current session (to be used when the user has moved to a location he
or she doesn't want to be associated with) but I don't deem that
necessary for merging this.
The second issue, breakage of some of Tails' scripts, *must* be fixed
before this feature can be merged.
The easy part is to fix tor_is_working() (used by the Unsafe Browser and
the "tordate" time syncing script [2]) in the Tails Tor shell library
[3]. It simply checks if cached-microdescs{,.new} exist, so with a
persistent Tor data dir, it will always return true (after the first
bootstrap).
[2] in the Tails sources:
config/chroot_local-includes/etc/NetworkManager/dispatcher.d/20-time.sh
[3] in the Tails sources:
config/chroot_local-includes/usr/local/lib/tails-shell-library/tor.sh
A more reliable approach (in the Tails Tor shell library) is:
    tor_is_working() {
        [ "$(tor_control_getinfo status/circuit-established)" = 1 ]
    }
If the testing results are ok (no regressions in tordate or the Unsafe
Browser) a patch for this would be individually accepted.
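For callers that need to block until Tor is usable, the check above
could simply be polled. A minimal sketch: tor_control_getinfo is
replaced by a stub here so that the snippet runs on its own, and
wait_until_tor_is_working is a hypothetical helper name, not something
taken from the Tails sources.

```shell
# Stand-in for the real tor_control_getinfo from the Tails Tor shell
# library, so this sketch runs on its own. It pretends that circuits
# become available on the third query. The counter lives in a file
# because the function is called inside a command substitution.
TOR_STUB_COUNTER=$(mktemp)
echo 0 > "$TOR_STUB_COUNTER"
tor_control_getinfo() {
    n=$(cat "$TOR_STUB_COUNTER")
    n=$((n + 1))
    echo "$n" > "$TOR_STUB_COUNTER"
    if [ "$n" -ge 3 ]; then echo 1; else echo 0; fi
}

tor_is_working() {
    [ "$(tor_control_getinfo status/circuit-established)" = 1 ]
}

# Hypothetical helper: poll tor_is_working until it succeeds, or give
# up after max_tries attempts with delay seconds between them.
wait_until_tor_is_working() {
    max_tries=$1
    delay=$2
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if tor_is_working; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}
```

Something like `wait_until_tor_is_working 60 1` would then give Tor
roughly a minute to establish a circuit.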
What remains is tordate's usage of the *-microdesc-consensus files, and
this will require some serious thinking and investigation about *how* to
fix it, and probably a much more complex fix in the end (this is what's
been blocking us from implementing this persistence preset ourselves).
In hope that we may get help with this issue (perhaps from you?) I'll try
to outline the issue in general terms that don't require a detailed
study of how the tordate hack works. It will also serve as a basis for
updating the ticket with some more detailed info if this isn't resolved
soon. It may help to have a look at the Time syncing design
documentation [4] though.
[4]
https://tails.boum.org/contribute/design/Time_syncing
tordate will, in general, run wait_for_tor_consensus(), which waits
until one of {cached,unverified}-microdesc-consensus exists. If the
system time skew is large enough to make the consensus look invalid,
the system time is set to the consensus' valid-after + 30 minutes via
maybe_set_time_from_tor_consensus(). This guarantees that Tor will be
able to build circuits, so htpdate can run later on and set the system
time to something accurate, which is beside the point but explains the
motive for tordate.
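To make the "valid-after + 30 minutes" step concrete, here is a rough
sketch of the arithmetic. It is not the actual
maybe_set_time_from_tor_consensus() code; the function names are made
up for illustration, and GNU date is assumed for the -d option.

```shell
# Print the "valid-after" timestamp of a consensus file, e.g.
# "2013-05-28 18:00:00" (consensus timestamps are in UTC).
consensus_valid_after() {
    sed -n 's/^valid-after \(.*\)$/\1/p' "$1" | head -n 1
}

# Print valid-after + 30 minutes as seconds since the epoch.
# Fails if the file has no valid-after line or date cannot parse it.
consensus_target_epoch() {
    va=$(consensus_valid_after "$1")
    [ -n "$va" ] || return 1
    va_epoch=$(date -u -d "$va" +%s) || return 1
    echo $((va_epoch + 30 * 60))
}
```

The resulting value could then be passed to `date -u -s "@<epoch>"`
(as root) to actually set the clock.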
If Tor's data dir is persistent, an old consensus may exist when tordate
starts, which contradicts the implicit assumption that only a *fresh*
consensus may exist at that point (which is the case without
persistence). Hence a persistent Tor data dir can result in a
non-working Tor. Steps to trigger the bug (not tested, would be great if
you could try it):
1. Boot Tails with Tor data dir persistence enabled.
2. Let Tor bootstrap.
3. Shut down Tails.
4. Wait for 3+X hours (where X >= 0), so the persistent, cached
consensus definitely is invalid.
5. Boot Tails with Tor data dir persistence enabled.
6. Now tordate will immediately find the persistent, cached consensus
fetched in step 2 and set the system time based on it. The system
clock should now be around 2+X hours late.
7. Tor detects a time skew and that our consensus is old, so a new one
is fetched (overwriting our old one), but since our new system
time will be before the valid-after of the new consensus, Tor will
refuse to use it.
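What "invalid" means in step 4 can be sketched as a check against the
consensus' valid-after and valid-until lines. The function names are
hypothetical and GNU date is assumed. The time to test is passed in
explicitly, because the system clock is exactly what cannot be trusted
here.

```shell
# Print one timestamp line of a consensus file (valid-after,
# fresh-until or valid-until) as seconds since the epoch (UTC).
consensus_time_epoch() {
    ts=$(sed -n "s/^$2 \(.*\)\$/\1/p" "$1" | head -n 1)
    [ -n "$ts" ] || return 1
    date -u -d "$ts" +%s
}

# Succeed if the given time (epoch seconds) lies within the
# consensus' [valid-after, valid-until] interval.
consensus_is_valid_at() {
    start=$(consensus_time_epoch "$1" valid-after) || return 1
    end=$(consensus_time_epoch "$1" valid-until) || return 1
    [ "$2" -ge "$start" ] && [ "$2" -le "$end" ]
}
```

For the bug to trigger, the consensus cached in step 2 has to fail this
check for the real time at step 5.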
Steps 6-7 are potentially (probably not realistically, though) prone to
race conditions, e.g. if you have a fast Internet connection the
consensus fetched in 7 may actually be fetched in between 5 and 6, which
would not trigger this bug. A robust fix would make it so that 6 can
only happen when one of the following conditions is met, no
matter the system clock skew:
1. We didn't have a consensus before starting Tor, but now we've
fetched one and it's written to
{cached,unverified}-microdesc-consensus.
2. We had an old enough consensus for Tor to fetch a new one, and we
waited until that finished and
{cached,unverified}-microdesc-consensus is updated with it.
3. We had an old consensus, but it's still fresh, so no new consensus
is fetched.
Our current code only deals with 1. The task is to extend
wait_for_tor_consensus() so that if we have a consensus before Tor is
started, we wait until exactly when 2 or 3 happens.
* The "if we have a consensus before Tor is started" part cannot be
tested in 20-time.sh *only* since that would be prone to race
conditions. Instead we could write a state file, e.g.
/tmp/tor_had_no_persistent_consensus, in 10-tor.sh, if that's the
case right before we start Tor, and then test for its existence in
20-time.sh and act accordingly.
* The "we wait until exactly when 2 [...] happens" part is easy using
`inotifywait` to monitor the consensus file for write changes.
* The "we wait until exactly when [...] 3 happens" part is hard. What's
difficult with this compared to 2 is to detect "no write changes to
the consensus file" without using something like an `inotifywait`
timeout (which is an unacceptable solution). We have to look
elsewhere, maybe grepping Tor's log (ugly, and likely complex since
Tor may log different things depending on the clock skew), or using
Tor's control port. The latter may involve monitoring for some event
(also ugly) since no one-off GETINFO command (which would be ideal)
seems to contain the information we need, but I'm not even sure that
any of the events or the Tor log can help us here.
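A rough sketch of the first two points, not tested against Tails. The
function names are made up, and the state file path is the one
suggested above. Watching the directory rather than the file itself is
an assumption, in case Tor replaces the file by renaming a temporary
one (not verified here). Events that occur before the first
inotifywait call, or between two calls, would be missed, so this is
only a starting point.

```shell
# Overridable defaults; the state file path is the one suggested above.
TOR_DIR=${TOR_DIR:-/var/lib/tor}
STATE_FILE=${STATE_FILE:-/tmp/tor_had_no_persistent_consensus}

# For 10-tor.sh, right before Tor is started: create the state file
# only if no consensus exists yet.
record_consensus_state() {
    rm -f "$STATE_FILE"
    if [ ! -e "$TOR_DIR/cached-microdesc-consensus" ] &&
       [ ! -e "$TOR_DIR/unverified-microdesc-consensus" ]; then
        : > "$STATE_FILE"
    fi
}

# For 20-time.sh: succeed if Tor was started without any consensus.
tor_had_no_persistent_consensus() {
    [ -e "$STATE_FILE" ]
}

# Case 2: block until a consensus file in $TOR_DIR is written or moved
# into place. Without -m, inotifywait exits after the first event, so
# it is restarted for events concerning other files.
wait_for_consensus_update() {
    while name=$(inotifywait -q -e close_write -e moved_to \
                     --format '%f' "$TOR_DIR"); do
        case "$name" in
            cached-microdesc-consensus|unverified-microdesc-consensus)
                return 0
                ;;
        esac
    done
    return 1
}
```

20-time.sh would then keep its current behaviour when
tor_had_no_persistent_consensus succeeds, and otherwise has to choose
between cases 2 and 3, which is the part that remains unsolved.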
So, hopefully some investigation into the hints given above will lead to
a solution satisfying the above criteria. If not, we may have to wait with
this persistence preset until we replace the tordate hack with something
better [5].
[5]
https://tails.boum.org/todo/robust_time_syncing/
An entirely different alternative to making all of `/var/lib/tor`
persistent, which wouldn't be affected by any of the above-mentioned
issues, would be to
only make `/var/lib/tor/state` persistent (for an example of how to make
a single file persistent, see how the Iceweasel bookmarks persistence
preset works). I'm not sure how much I like this idea, though, since a
persistent consensus has some worthwhile performance benefits, and since
this hack may have its own load of unintentional side-effects (so asking
on the tor-talk mailing list and investigating the Tor sources would be
required).
> I also tested bridge mode, and it seems not to break with this enabled,
> but of course it is useless in bridge mode and just leaves unnecessary
> traces.
See Tor's NetworkManager hook [6]. When Tails' bridge mode is enabled,
Tor's data dir is cleared, which is a bad idea when it's persistent.
[6] in the Tails sources:
config/chroot_local-includes/etc/NetworkManager/dispatcher.d/10-tor.sh
The comment suggests it's a workaround for Tor bug #2355, but I'm not
sure this is the case any more. If I recall correctly this was a hackish
workaround used in our early experiments with bridge mode, that also
used `ReachableAddresses reject *:*` (which was unnecessary, but which
had some bug that required the data dir clearing), so I think the
`rm -f /var/lib/tor/*` stuff can be removed from 10-tor.sh. Obviously, bridge
mode with this change should be tested carefully, both with and without
a persistent /var/lib/tor. If the testing results are ok a patch for
this would be individually accepted.
> 0001-Add-preset-for-persisting-Tor-entry-guards-and-Tor-c.patch
>
> + name => $self->encoding->decode(gettext(q{Tor Entry Guards})),
> + description => $self->encoding->decode(gettext(
> + q{Keep entry guards for better anonymity}
> + )),
As you have pointed out yourself, persistent entry guards aren't the
only consequence of (and reason for) a persistent `/var/lib/tor`. For
instance, people using a slow dialup connection or an expensive mobile
data plan would benefit from skipping Tor's bootstrap phase, and I'd
like to explicitly support this use case. Also, "better anonymity" isn't
as clear-cut as we may wish it to be due to the physical location
tracking issue. It all depends on the user's threat model, and we have
to guide them so they can make an intelligent decision.
What about:
    Tor data directory
    Tor performance and anonymity benefits at the cost of
    geographical tracking
As always, reading the docs will give the full information, and
hopefully the "geographical tracking" part of the description will make
users who haven't read the docs think twice.
> + options => [ 'source=tor-state' ],
For consistency with how we (and torrc and Tor's man page) refer to this
directory, and to prevent confusion with `/var/lib/tor/state`, I'd
prefer `source=tor-data`.
> 0001-Fix-ownership-of-var-lib-tor-after-login-before-Tor-.patch
I'd prefer not to bloat Tails Greeter further when it can be avoided.
In this case that code can be put in Tor's
NetworkManager hook [6], before it starts Tor. I can't see any reason
not to, but I may have overlooked something (?).
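If the ownership fix moves into the hook, it could be as small as the
following sketch. The function name is hypothetical, and the defaults
assume that Tor runs as Debian's debian-tor user and group.

```shell
# Hypothetical helper for the NetworkManager hook: make sure the Tor
# data dir and everything in it belong to the user Tor runs as.
# Arguments: [dir] [owner] [group]
fix_tor_data_dir_ownership() {
    dir=${1:-/var/lib/tor}
    owner=${2:-debian-tor}
    group=${3:-debian-tor}
    [ -d "$dir" ] || return 1
    chown -R "$owner:$group" "$dir"
}
```

It would be called without arguments right before the hook starts Tor.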
Cheers!