Re: [Tails-dev] Please review and test feature/tordate

Author: anonym
Date:  
To: The Tails public development discussion list
New topics: Re: [Tails-dev] Please review and test feature/tordate
Subject: Re: [Tails-dev] Please review and test feature/tordate
01/29/2012 10:46 PM, anonym:
> Now, in Tails this error only occurs if htpdate fails (and this should
> be unlikely nowadays) but I think this potential problem still warrants
> for us not setting time to the middle of [valid-after, fresh-until].
> Setting it to fresh-until (time error 0 to 60 minutes in the future) or
> up to one hour later would be safe though. I guess it's best to have a
> margin in both ways, so our old middle of [valid-after,valid-until]
> seems like the safest choice.


It turns out this isn't correct, if my analysis below holds. I was
updating our Time syncing documentation with the hopes of formally
deriving safe numbers for the parameters in tordate (which interval to
just accept as good enough, and what to set the time to in case it's
not). It turned out that if we want a long, stable Tor session with a
time only handled by tordate (like when htpdate fails), then the only
really safe thing to do is to *always*, no matter what, set the time to
fresh-until.

This problem is partially based on Tor's extreme sensitivity to clocks
that are behind, for which a potential fix is discussed at the end of
the analysis. If you agree with my analysis I'm gonna send a bug report
with the relevant parts.

Here's the full analysis (which is sort of like a proof by induction),
in markdown:

## Reliability of tordate

What we need to decide is:

* When run, which time range would `tordate` consider to be *good
enough* so that the time doesn't have to be changed to get a
properly working Tor? Let's call this interval `[V, W]`, and note
that we must have `valid-after <= V < W <= valid-until`.

* When the time is incorrect, what should `tordate` set the clock to?
Let's call this `N`, and `V <= N <= W` seems like a pretty
reasonable constraint for it.

Observe that the first consensus we download (during the Tor network
bootstrap) will be valid no matter how we choose `V`, `W` and `N` as
long as we follow the above constraints, so we will always get a
working Tor initially for `valid-until - W` time. Because of this we
probably want to constrain `W` such that `W <= valid-until - 30 min`
so we always get a Tor that won't need a new consensus for 30
minutes.
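
As a rough illustration, the constraints so far can be written down as a small check. This is only a sketch in Python and is not taken from `tordate`; the function name `check_parameters` and the `min_headroom` parameter are made up for this example.

```python
from datetime import datetime, timedelta


def check_parameters(valid_after, valid_until, v, w, n,
                     min_headroom=timedelta(minutes=30)):
    """Return the list of violated constraints (empty if all hold).

    v, w and n correspond to V, W and N in the text; min_headroom is a
    hypothetical name for the 30 minute margin before valid-until.
    """
    problems = []
    if not (valid_after <= v < w <= valid_until):
        problems.append("need valid-after <= V < W <= valid-until")
    if not (v <= n <= w):
        problems.append("need V <= N <= W")
    if w > valid_until - min_headroom:
        problems.append("need W <= valid-until - 30 min")
    return problems


# Example with a consensus valid from 21:00 to 00:00 (three hours).
valid_after = datetime(2012, 1, 29, 21, 0)
valid_until = valid_after + timedelta(hours=3)
print(check_parameters(valid_after, valid_until,
                       v=valid_after,
                       w=valid_until - timedelta(minutes=30),
                       n=valid_after + timedelta(minutes=90)))
```

An empty list means the chosen `V`, `W` and `N` satisfy everything stated up to this point; the later sections tighten these bounds further.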

What we need to make sure of is that consensus updates will not cause
trouble.

Without loss of generality, let's assume the real time is `T` and that
our client has an incorrect time `T+E` (which was previously set to
`N`, or already was within `[V, W]` when we started `tordate`) where
`E` is the time skew, and let's fix `valid-after`, `fresh-until` and
`valid-until` to the values of our client's current consensus.
Furthermore, since we're only dealing with consensus updates, and
these are made against directory mirrors, we need to take into account
the delay `D` for the consensuses to reach any of the mirrors we
try to fetch an update from.

* Assume `valid-after <= T+E <= T < valid-until`, i.e. `E <= 0` so we
have a clock that's back in time, but still correct enough for our
current consensus to be valid.

  - Assume that `T < fresh-until + D`. A consensus update will fetch
    the same still valid consensus as we already have so all is good.


  - Assume that `T >= fresh-until + D`. A consensus update will fetch
    a new consensus that's valid starting at `fresh-until` or later. If
    we fetch while `T+E < fresh-until` the consensus is not valid for us
    yet, and Tor is unable to build new circuits for `fresh-until -
    (T+E)` time. We need `V,N >= fresh-until` to prevent this. (It
    seems to be a bug/"feature" in Tor that the current consensus can
    be overwritten by a new consensus that is not yet valid. See the
    end of this analysis for more information.)


* Assume `valid-after < T <= T+E <= valid-until`, i.e. `E >= 0` so we
have a clock in the future, but still correct enough for our current
consensus to be valid.

  - Assume `T >= fresh-until + D`. A consensus update will fetch a new
    consensus that's valid for at least an hour longer than the current
    one so all is good.


  - Assume `T < fresh-until + D`. A consensus update will fetch the
    same consensus as our current one. If `fresh-until + D - T >
    valid-until - (T+E)`, which is the same as `E > 2 hours - D`, we
    have an issue: we won't be able to get a new consensus until after
    ours expires, so we're stuck with an invalid consensus for
    `E + D - 2 hours` time. To prevent this we need the constraint
    `N,W <= valid-until - (2 hours - D) = fresh-until + D`.


So we need `fresh-until <= V <= N <= W <= fresh-until + D` to ensure
smooth updates of consensuses and avoid Tor outages. Depending on
`D`, this may not give us much headroom. Directory mirrors update
their consensuses at random times each hour so we may get `D ~= 0` (first
mirror we pick fetched the new consensus more or less immediately
after it was published by the authorities), `D ~= 1 hour` (after all
mirror fetches either failed or resulted in a consensus we already
have, we pick a mirror that updates at the end of the hour) and
everything in between. The point is that `D` is unpredictable, and we
cannot say much about it and hence shouldn't depend on it.
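
To make the two problem cases in the analysis concrete, here is a small numeric sketch in Python. It only restates the formulas from the text; the function names are made up, and all times are plain minutes counted from `valid-after` (so `fresh-until = 60` and `valid-until = 180`, assuming the current three hour consensus lifetime).

```python
FRESH_UNTIL = 60    # minutes after valid-after
VALID_UNTIL = 180   # minutes after valid-after


def outage_clock_behind(t, e, d):
    """Clock behind (e <= 0): minutes Tor cannot build circuits if a
    consensus update happens at real time t with mirror delay d."""
    if t < FRESH_UNTIL + d:
        return 0  # the mirror still serves the consensus we already have
    # We got the new consensus, valid from fresh-until, but our clock
    # shows t + e, which may still be before that.
    return max(0, FRESH_UNTIL - (t + e))


def outage_clock_ahead(e, d):
    """Clock ahead (e >= 0): minutes we are stuck with an expired
    consensus before a new one can be fetched."""
    new_available_at = FRESH_UNTIL + d   # real time
    ours_expires_at = VALID_UNTIL - e    # real time at which T+E hits valid-until
    return max(0, new_available_at - ours_expires_at)


print(outage_clock_behind(t=70, e=-30, d=5))   # 20
print(outage_clock_ahead(e=150, d=45))         # 75, i.e. E + D - 2 hours
print(outage_clock_ahead(e=60, d=45))          # 0, since E <= 2 hours - D
```

Note that both results depend on `d`, which is exactly the quantity we cannot predict.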

**Conclusion:** The safest choice seems to be to set `V = N = W =
fresh-until`, which effectively removes the "*good enough* time" check --
no matter what, `tordate` should make sure a consensus is fetched and
we should always set the time to its `fresh-until`, no more, no less.
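
A minimal sketch of what "always set the time to `fresh-until`" amounts to, written in Python for illustration only. It assumes the consensus document contains `valid-after`, `fresh-until` and `valid-until` lines with UTC timestamps of the form `YYYY-MM-DD HH:MM:SS`; the function names and the sample text are made up, and the sketch only computes the target time instead of touching the system clock.

```python
import re
from datetime import datetime, timezone


def consensus_times(consensus_text):
    """Extract the three validity timestamps from a consensus document."""
    times = {}
    for key in ("valid-after", "fresh-until", "valid-until"):
        match = re.search(
            rf"^{key} (\d{{4}}-\d\d-\d\d \d\d:\d\d:\d\d)$",
            consensus_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"missing {key} line")
        times[key] = datetime.strptime(
            match.group(1), "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return times


def target_clock(consensus_text):
    """With V = N = W = fresh-until there is no 'good enough' check:
    the clock is always set to fresh-until."""
    return consensus_times(consensus_text)["fresh-until"]


# Made-up sample header, not a real consensus.
SAMPLE = """network-status-version 3
vote-status consensus
valid-after 2012-01-29 21:00:00
fresh-until 2012-01-29 22:00:00
valid-until 2012-01-30 00:00:00
"""
print(target_clock(SAMPLE).isoformat())  # 2012-01-29T22:00:00+00:00
```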

It is really unfortunate that we cannot safely set `V = valid-after`
and `N = middle of [valid-after, fresh-until]`. That would:

1. allow us to keep the "*good enough* time" check. This specific
interval would be really good since it would accept all initial
clocks that are more or less correct, so they can just ignore
`tordate` and quickly get a working Tor. It wouldn't handle clocks
that are even just a bit behind right after a new consensus has
been published, but Tor has severe problems with this case as it
is.

2. ensure that when `tordate` has to set the clock to `N`, the clock
will be at most 30 minutes incorrect in either direction, which in
turn would ensure that hidden services running on Tor versions prior to
2.3.x-alpha will not refuse our connection due to too incorrect
clocks (with `N = fresh-until` we only get this right 50% of the
time).

We would be able to do this if Tor could handle clocks that are behind
in time better. Currently, if Tor fetches a consensus update that is
not yet valid (but signatures and everything else are good) Tor may
overwrite a current, still valid consensus with the new not yet valid
one (at least according to my logs), which prevents Tor from building
new circuits until the new consensus becomes valid.

Tor could instead discard such a consensus, but then we risk being
without any valid consensus when our current one expires, so Tor would be
malfunctioning until we find a mirror with the most recent consensus,
which could take some time. An alternative would be that Tor saved any
new consensus with `current_valid-until <= new_valid-after` and
switched to it when `now >= new_valid-after`, but that doesn't work
for bootstrapping, so it wouldn't make much sense for normal Tor
users, only `tordate` users.

Instead I suggest that Tor keeps an internal *clock skew correction*
value `C` that starts as 0. The idea is that whenever we fetch a new
consensus with `T+E < new_valid-after`, we know that our clock is
behind, so we should correct it when dealing with Tor related time
checks, at least those necessary for the consensus (maybe other time
checks benefit from this correction too, but I don't know and won't
consider it here). In practice this means that we set `C =
new_valid-after - now`, and whenever we compare `valid-after`,
`fresh-until` or `valid-until` with `now` (or something derived from
it), we take `C` into account.

With this fix, Tor will work no matter how badly the clock is
behind. But that will allow replaying any consensus with
`valid-until` in the interval `[T+E, T + 3 hours]`, which is terrible
(with a big enough time skew *any* consensus ever published could be
replayed), instead of `[T+E, T+E + 3 hours]` like it is with the
current behaviour. Therefore we must limit `C` to never be allowed to
be larger than some `L`, and then the interval becomes
`[T+E, T+E + L + 3 hours]` which should be fine for a reasonably low
`L`. I think `L = 2 hours` is a good choice; with a correct clock
(which still of course is what all Tor users should aim at), the
consensuses that can be replayed are the two previous ones that are
valid anyway.
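
The clock skew correction idea could look roughly like the following Python sketch. This is not Tor code: the class and method names are invented, and rejecting a consensus that would need a correction larger than `L` (instead of, say, clamping `C`) is my own assumption about how the limit would be enforced.

```python
from datetime import datetime, timedelta

MAX_CORRECTION = timedelta(hours=2)  # the limit L suggested in the text


class SkewCorrectedClock:
    """Keeps the correction C and applies it to consensus time checks."""

    def __init__(self):
        self.correction = timedelta(0)  # C starts as 0

    def corrected(self, now):
        return now + self.correction

    def observe_consensus(self, now, new_valid_after):
        """Update C for a freshly fetched consensus.

        Returns False if the consensus would need C > L (assumed here to
        mean it is rejected), True otherwise.
        """
        if self.corrected(now) >= new_valid_after:
            return True  # the consensus is not in our future; nothing to do
        needed = new_valid_after - now
        if needed > MAX_CORRECTION:
            return False
        self.correction = needed
        return True

    def is_valid(self, now, valid_after, valid_until):
        """Validity check that takes C into account."""
        return valid_after <= self.corrected(now) <= valid_until


clock = SkewCorrectedClock()
now = datetime(2012, 1, 29, 12, 0)       # local clock, one hour behind
hour = timedelta(hours=1)
print(clock.observe_consensus(now, now + hour))       # True, C becomes 1 hour
print(clock.is_valid(now, now + hour, now + 4 * hour))  # True
print(clock.observe_consensus(now, now + 3 * hour))   # False, would need C = 3 hours
```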

Alternative solution to clock skew correction: make all consensuses
valid for four hours (which luckily also divides 24) and set:

    fresh-until = valid-after + 2 hours
    valid-until = valid-after + 4 hours = fresh-until + 2 hours


and when a consensus is generated, `new_valid-after = old_valid-after
+ 1 hour`. If this is easy to deploy (e.g. clients and directory
mirrors don't sanity check that `valid-until - valid-after == 3
hours` or similar) this would allow all deployed Tor clients to work
with a clock that is up to 1 hour behind.
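
Here is a small Python sketch of the arithmetic behind the four hour alternative. It rests on one assumption that is not spelled out above: a new consensus becomes available when its predecessor reaches `fresh-until`, i.e. one hour after the new consensus' own `valid-after`. The function names are made up.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)


def next_consensus(old_valid_after):
    """Timestamps of the following consensus under the four hour scheme."""
    valid_after = old_valid_after + HOUR
    return {
        "valid-after": valid_after,
        "fresh-until": valid_after + 2 * HOUR,
        "valid-until": valid_after + 4 * HOUR,
    }


def usable_on_arrival(old_valid_after, skew):
    """Is the next consensus already valid for a client whose clock is off
    by `skew` (negative means behind) at the moment it becomes available?"""
    new = next_consensus(old_valid_after)
    # Assumption: published when the old consensus reaches fresh-until.
    published_at = old_valid_after + 2 * HOUR
    local_time = published_at + skew
    return new["valid-after"] <= local_time <= new["valid-until"]


old_valid_after = datetime(2012, 1, 29, 21, 0)
print(usable_on_arrival(old_valid_after, -HOUR))                  # True
print(usable_on_arrival(old_valid_after, -timedelta(minutes=61)))  # False
```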

Bonus: here are two simple things Tor could do to make it possible to
detect *some* consensus replay attacks, for whatever that's worth:

* If we have a current, *valid* consensus we could also check if
`current_valid-until < new_valid-after`; in that situation we know
that at least the current consensus was replayed unless our clock
jumped (so we'd need to keep track of that).

* We could save the time when Tor was compiled as an internally
accessible constant and use it as a cheap detection mechanism for
*really* old consensuses; if `valid-after < build_time` we know
the consensus has been replayed.
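
The two checks above could be sketched like this in Python. Again, this is only an illustration with invented names; in particular `clock_jumped` stands in for whatever mechanism would track clock jumps, and `build_time` for the compile-time constant.

```python
from datetime import datetime


def replay_hints(new_valid_after, current_valid_until=None,
                 current_is_valid=False, clock_jumped=False,
                 build_time=None):
    """Return human-readable hints that a consensus was replayed."""
    hints = []
    # Check 1: we hold a currently valid consensus, yet the new one only
    # starts after ours ends, so (barring a clock jump) ours was stale.
    if (current_is_valid and not clock_jumped
            and current_valid_until is not None
            and current_valid_until < new_valid_after):
        hints.append("current consensus was replayed")
    # Check 2: a consensus older than this build cannot be a recent one.
    if build_time is not None and new_valid_after < build_time:
        hints.append("new consensus predates build time")
    return hints


print(replay_hints(datetime(2012, 1, 29, 16, 0),
                   current_valid_until=datetime(2012, 1, 29, 15, 0),
                   current_is_valid=True))
print(replay_hints(datetime(2011, 12, 1, 0, 0),
                   build_time=datetime(2012, 1, 1, 0, 0)))
```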