Re: [Tails-dev] Serious issue: fail-safe and hotplugging [Wa…

Author: anonym
Date:  
To: The Tails public development discussion list
Subject: Re: [Tails-dev] Serious issue: fail-safe and hotplugging [Was: MAC spoofing: current status?]
30/12/13 13:48, intrigeri wrote:
> anonym wrote (29 Dec 2013 21:21:35 GMT) :
>> 27/12/13 18:05, intrigeri wrote:
>> Approach 1
>> ----------
>
>> A seemingly obvious fix would be to move the fail-safe from its current
>> location, tails-unblock-network, into tails-spoof-mac, which is run by
>> the MAC spoofing udev hook when network devices are added. The fail-safe
>> would then act on a per-device basis, and it would be closer to the
>> spoofing, which both are nice (bonus: the problem you raised about
>> "macchanger can't retrieve the permanent MAC address" would be really
>> easy to fix).
>
> I like this approach, and I hope we can make it work fine. Let's see.
>
>> However, a big issue with this approach is that if NetworkManager is
>> running when tails-spoof-mac is run by the udev hook (which will be the
>> case every time a device is hotplugged after TG login) then there's a
>> race: will NM spawn network activity before the fail-safe is triggered
>> in case of a MAC spoofing error? This doesn't feel robust at all.
>
> Are you sure NM picks up newly added devices before udev has finished
> adding it to the system (which, I hope, is indicated by the fact that
> all udev rules have completed their job, which, I hope, includes
> running all hooks)?


I wasn't sure, but I've now verified that there indeed is *no* race with
NM. I verified by adding a sleep(1) hooked via a udev rule, and NM is
delayed for the duration of the sleep. It should be noted, though, that
the device is fully operational during the sleep; I could get the
network up via e.g. `dhclient eth0`. Hence, if "something else" can
cause network activity at that time, then there's a race between that
"something else" and the spoofing instead. (In case you wonder why I
didn't investigate the NM race possibility earlier, it was because I
first wanted to explore Approach 2 (which took quite some time) since
that would eliminate this class of issues.)
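
For reference, a minimal sketch of how such a test can be set up. The
rule file name, the 10 second delay and the /bin/sleep path are
assumptions for illustration, not necessarily what I used:

```shell
# Hypothetical helper: write a udev rule that delays the "add" event of
# every network device, so one can observe whether NM waits for it.
write_delay_rule() {
    cat > "$1" <<'EOF'
# Delay processing of every newly added network device by 10 seconds.
SUBSYSTEM=="net", ACTION=="add", RUN+="/bin/sleep 10"
EOF
}
```

E.g. `write_delay_rule /etc/udev/rules.d/00-delay-net.rules` as root,
then make udev reload its rules, hotplug a NIC and watch when NM reacts.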

It remains to investigate why NM waits for all udev hooks as we don't
want to depend on voodoo. Any ideas?

> If NM waits for this to be done (and I would hope so), then this
> approach is actually not racy at all, is it?


Correct. Well, at least as long as we consider NetworkManager as the
only possible (and automatic) cause of network activity.

> If my guess is wrong, then this approach is definitely racy. We could
> mitigate this a little bit by (first thing) disabling the just-added
> interface in the script run by the udev hook, with something like:
>
> nmcli dev disconnect iface $IFACE
>
> Potential problems:
>
>   * I see this is available in current sid's NM, no idea if that's the
>     case on Squeeze.
>   * I see no way to re-enable the device with nmcli, so it would
>     probably require reloading or SIGHUP'ing or (worst case)
>     restarting NM once we have spoofed the new NIC's MAC address.

>
> Another similar hack could be to add the newly plugged device to the
> list of unmanaged ones, see "Unmanaged devices" on
> https://wiki.gnome.org/Projects/NetworkManager/SystemSettings.
>
> Of course, this would still be a bit racy, but perhaps acceptable.
>
> Another hack could be to add to /etc/network/interfaces, at Greeter
> time, a broken static configuration snippet for all possible names
> that could be used for NICs plugged in the future. IIRC, NM fully
> ignores devices that are listed in that file unless they're configured
> to use DHCP (not sure what's the current state of the art in Wheezy
> and later, but I expect README.Debian to document this). Then, when
> a new NIC is plugged in and we have successfully spoofed its MAC
> address, then we can remove the corresponding configuration snippet
> and gently ask NM to manage it, with a NM reload or SIGHUP or — worst
> case — restart. Ugly and not that easy to get right (listing the names
> would be a pain, to start with), yeah, but probably more robust in the
> end than the other ideas raised in approach #1. Still, I hope we can
> avoid this.


Right. I had also considered at least the "unmanaged devices in NM"
solution, but I *really* think we should avoid adding such complex and
possibly fragile hacks if possible.

>> Approach 2
>> ----------
>
>> An alternative fix that would keep the robustness of the current
>> implementation (in fact, very little in the current implementation would
>> change, at least compared to Approach 1) would be to disallow
>> hotplugging of network devices after TG login. While this will add some
>> user inconvenience, I think it's acceptable
>
> Fully agreed.
>
> More specifically though, this would bring one (minor) UX issue:
> careful people have got used to 1. boot Tails; 2. login; 3. stop NM;
> 4. plug their NIC; 5. spoof the MAC for their NIC; 6. start NM. So,
> they would need to change their existing habits entirely, and plug the
> NIC before booting. I think that can very well be addressed by a tiny
> bit of documentation, which we'll need anyway if we go with
> this approach.


A notification ("Hotplugged network device was ignored...") would also
be helpful, but whether we're able to do this depends entirely on how
the hotplug blocking is implemented.

>> (also, it's pretty close to
>> some of the proposed solutions for bug #5451: protect against external
>> bus memory forensics).
>
> First, I expect that most such hotplugged NICs are either USB, or
> plugged into a bus we want to disable post-login as part of #5451.
> The latter doesn't matter much IMO yet (since it'll be disabled at
> a lower level at some point, and until we complete #5451, there's
> little point in protecting against this at the NIC level only).
> For USB NICs, that's a different matter.
>
> On the one hand, we have not considered disabling USB devices hotplug
> as part of #5451, mostly because 1. USB doesn't do DMA (right?) and 2.
> many usecases we want to support, e.g. Tails Installer, depend on the
> ability to do so. So, I doubt we'll ever want to disable USB hotplug
> entirely post-login.
>
> On the other hand, assuming we disable USB NICs hotplugging via udev,
> but keep supporting hotplug for other kinds of USB devices, then we
> don't add much protection: the recent flow of issues discovered by
> Kees Cook in Linux regarding USB drivers were mostly found in the HID
> layer, and I suspect the underlying USB controller drivers may have
> similar issues.
>
> So, I must say I don't buy this side argument (closeness with #5451)
> at all. Of course, I understand you didn't mean this to be a decisive
> one :)


Since I was talking about "user inconvenience" I was taking the user's
perspective, not a technical one. In #5451 we're already considering
forbidding plugging of some devices after TG login, so from the user's
perspective it's "close" (if not even the same thing). Or am I
misunderstanding what you mean?

Anyway, slightly off-topic, but regarding "USB doesn't do DMA" I think
that's actually false. A quick search yielded:

    Because FireWire and USB were designed with the intention of
    connecting high-speed disk drives, both specifications have
    provisions for DMA. This means that, under many circumstances, a
    device that's plugged into a FireWire or USB interface has the
    ability to read and write to individual physical memory locations
    inside a the host computer. Such access necessarily bypasses the
    host operating system and any security checks that it might wish to
    implement.


Source: <http://www.csoonline.com/article/print/220868>

(Also reported by Schneier:
<https://www.schneier.com/blog/archives/2006/06/hacking_compute.html>)

>> The problem with this approach is how to disallow hotplugging. Simply
>> restoring the blacklist isn't very robust; since the blacklist works on
>> the module loading (modprobe) level, devices that happen to use the same
>> module as a device that was added before TG login can then be
>> successfully hotplugged even after TG login.
>
> I don't think we should disregard this idea entirely. I think the
> problematic situations are very much corner cases in practice:
>
>   1. It's quite unlikely to happen in practice: first, it requires
>      this special combination of NICs, that's itself pretty unlikely;
>      second, if MAC spoofing worked for the NIC plugged in early,
>      there are great chances it works too for the NIC plugged
>      post-login, since both use the same driver. So, the lack of
>      a fail-safe in this edge case doesn't seem critical to me.


Sure, but we'd still have the NM (or "something else") vs. MAC spoof
race from the discussion of Approach 1.

>   2. The failure mode I would expect to be the most frequent (MAC
>      spoofing fails for both the NIC plugged pre-login and the one
>      plugged post-login, since they use the same driver) is not
>      a problem in practice: if MAC spoofing fails for a NIC plugged
>      pre-login, then we already unload the corresponding module with
>      our fail-safe code. This, combined with restoring the blacklist,
>      would prevent the post-login NIC to be enabled.

>
>   3. For the worst failure mode (MAC spoofing works for the NIC
>      plugged pre-login, but fails for a NIC hotplugged later even though
>      they use the same driver), that I expect to be very rare, then
>      I expect the kind of (power-)users who play with multiple NICs to
>      be able to use rfkill as a minimal safe-guard, or refrain from
>      plugging a network cable too early, and then verify themselves
>      that MAC spoofing was successful, especially once this potential
>      problem is documented.

>
> So, I think it would be acceptable to document as a known issue that
> NIC hotplug works after login (while it ideally shouldn't) *and* lacks
> the "fail-safe" verification *if* it uses the same kernel driver as
> another NIC that was already plugged in at Greeter time. I'm playing
> my "let's be pragmatic" magic card!


I agree with everything you say here, but before settling on this I want
to explore the possibility of a "perfect" solution, with no odd corner
cases like this. I would sleep much better at night if we could
completely disregard this class of issues, like we would with a
lower-than-module-based blocking or similar.

>> Does any one know of an alternative to "ignore_device"?
>
> I couldn't find any :(


After reading up on recent (and future) developments of udev, it seems
its purpose is moving towards something that doesn't support these kinds
of things. Therefore I now give up on a udev-based solution.

>> What to do?
>> ===========
>
>> If we cannot solve the problems in Approach 1 or 2 (and cannot come up
>> with a superior Approach 3) then I guess we will have to pick whatever
>> we consider the least bad of those two, and perhaps document that
>> hotplugging after TG login isn't safe w.r.t. MAC spoofing.
>
> Another approach (that only works for USB devices) worth looking at is
> the Linux USB authorization support:
>
> * Documentation/usb/authorization.txt
> * http://www.irongeek.com/i.php?page=security/plug-and-prey-malicious-usb-devices#3.2_Locking_down_Linux_using_UDEV
>
> So, my current position is: if approach #1 is doable in a non-racy way
> (i.e. if NM does not pick up a new device before udev is done with
> it), then let's KISS and just do it. Else, either go with approach #2,
> or with approach #1 (in a bit racy way), that both require documenting
> the limitations in rare, edge cases.


So for KISSing and picking Approach #1 it all boils down to whether we
consider NM as the only possible/reasonable participant in the
aforementioned race. I guess it is, but who knows? Given how complex a
modern Linux system is, it's kind of hard to exhaustively rule out
everything about anything. :)

Hmm. I just think I came up with a fix that makes Approach #1 robust (it
can be used for Approach #2 too, but it doesn't make as much sense): we
use ferm/iptables to drop all outgoing traffic from interfaces that have
not been explicitly said to be "ok" by the fail-safe code.
Implementation-wise I'm thinking something like this:

--- config/chroot_local-includes/etc/ferm/ferm.conf
+++ config/chroot_local-includes/etc/ferm/ferm.conf
@@ -6,6 +6,13 @@
 # IPv4
 domain ip {
     table filter {
+        chain drop_bad_ifs {
+            outerface `cat /tmp/good` {
+                RETURN;
+            }
+            DROP;
+        }
+
         chain INPUT {
             policy DROP;


@@ -22,6 +29,8 @@ domain ip {
             # Established outgoing connections are accepted.
             mod state state (RELATED ESTABLISHED) ACCEPT;


+            jump drop_bad_ifs;
+
             # White-list access to local resources
             outerface lo {
                 # White-list access to Tor's SOCKSPort's


So, all outgoing traffic is dropped, except that from interfaces listed
by name in /tmp/good (which obviously needs a better location and name).
The only code that remains to be added is:

* The fail-safe code would `echo $INTERFACE >> /tmp/good` after
it has verified that MAC spoofing worked (or always do it if MAC
spoofing is disabled).

* A udev hook that on ACTION=="remove" runs
`sed -i "/^$INTERFACE\$/d" /tmp/good` (double quotes so the shell
expands $INTERFACE, anchored so that only the exact name is removed).

In particular, no code is necessary for reloading the ferm rules as we
already have a NM hook that does that (00-firewall.sh) when any device
is brought up.
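
A minimal sketch of those two pieces as shell helpers. The function
names and the ALLOWED_NICS variable are hypothetical, and I use
whole-line fixed-string matching with grep instead of sed so that
removing e.g. "eth0" can never touch an "eth01" entry:

```shell
# Hypothetical allow-list helpers; the names and the ALLOWED_NICS
# variable are illustrative and not part of any existing Tails code.
ALLOWED_NICS="${ALLOWED_NICS:-/tmp/good}"

# Called by the fail-safe once MAC spoofing has been verified.
# Appends the interface name unless it is already listed.
allow_nic() {
    grep -qxF -- "$1" "$ALLOWED_NICS" 2>/dev/null ||
        printf '%s\n' "$1" >> "$ALLOWED_NICS"
}

# Called from the udev "remove" hook. Drops exactly that interface
# name, then replaces the list with a single rename.
forget_nic() {
    [ -f "$ALLOWED_NICS" ] || return 0
    tmp="$(mktemp "${ALLOWED_NICS}.XXXXXX")" || return 1
    # grep exits non-zero when nothing is left to print; that is fine.
    grep -vxF -- "$1" "$ALLOWED_NICS" > "$tmp" || true
    mv "$tmp" "$ALLOWED_NICS"
}
```

The fail-safe would then call `allow_nic "$INTERFACE"` and the remove
hook `forget_nic "$INTERFACE"`.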

Random notes:

* I had to define the new chain drop_bad_ifs instead of just putting
something like

      outerface ! `cat /tmp/good` { DROP; }


directly in our existing "chain OUTPUT" block (at the same place I
put the "jump") since ferm unfortunately doesn't support negating
arrays.

* Ferm will fail if /tmp/good doesn't exist, but we will always need to
put "lo" in there anyway (as a config/chroot_local-includes I
guess) so we should be fine. We probably want to handle the failure
mode better anyway (for a fail-closed behaviour) by adding a DROP
policy to all chains in case ferm fails. So yeah, this is some more
coding not mentioned above (although very little). IMHO that would
be a good addition even without all this.

* What would a good name for /tmp/good be? /var/lib/tails/allowed_nics?
We may want to make all of /var/lib/tails completely inaccessible by
non-root users to not leak that MAC spoofing is enabled in the error
case.
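
The fail-closed idea from the second note could look roughly like the
sketch below. The function name is made up, it only covers the IPv4
built-in chains of the filter table, and it doesn't flush rules that are
already loaded, so take it as an illustration only:

```shell
# Hypothetical wrapper: load the ferm configuration given as $1, and if
# that fails, set a DROP policy on the built-in IPv4 filter chains.
apply_firewall_fail_closed() {
    ferm "$1" && return 0
    for chain in INPUT OUTPUT FORWARD; do
        iptables -P "$chain" DROP
    done
    return 1
}
```

00-firewall.sh could then call something like
`apply_firewall_fail_closed /etc/ferm/ferm.conf` instead of calling ferm
directly.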

So, am I just chasing ghosts (i.e. problems that we're sure don't
exist now or in the future) or is this something worthwhile adding? It
may not be super-KISS, but unless I've missed anything it will be quite
simple to fit it into Approach #1.

Cheers!