Re: [Tails-dev] TrueCrypt/VeraCrypt volumes undetectable

Author: segfault
Date:
To: tails-dev
Subject: Re: [Tails-dev] TrueCrypt/VeraCrypt volumes undetectable
anonym:
> segfault:
>> Hi,
>>
>> so I started working on the project a few days ago and now I realize
>> that we might have a problem, because TC/VC volumes are undetectable.
>> Unlike LUKS encrypted volumes, TC/VC encrypted volumes don't have a
>> cleartext header. Instead, the TC/VC header is encrypted with the
>> passphrase / keyfile provided when unlocking it. This means that TC/VC
>> volumes are indistinguishable from random data.
>
> There is a probabilistic algorithm for detecting TrueCrypt container *files* since they exhibit certain characteristics [0]. It's not foolproof, and can produce both false positives and false negatives, so it might not be suitable at all, and I don't know if it identifies VeraCrypt volumes. Furthermore, I believe one of the steps (chi-square distribution test [1]) is computationally too heavy to run on each file encountered by GNOME Files.
> [0] implemented by TCHunt: https://github.com/stephenjudge/TCHunt
> [1] and it's a bit ironic that the "pass chi-square distribution test"
> is one of the things identifying a TC volume -- I suspect TC ensures its
> volumes pass that test in order to "look random" (which the test more or
> less is about) instead of "encrypted"... but little other header-less
> data passes that test. :)

Even if it were not too computationally heavy, GNOME Files currently
does not automatically open and (partly) read every file it encounters
(I verified that using strace), and I don't think we should change that.

> However, it might be feasible for media scanning; this requires that the algorithm can be re-adapted to identify TC partitions instead of container files [2], and we'd only run the computationally heavy operation on a subset of the disk data (e.g. the first megabyte of the device/partition), if that doesn't break the algorithm. IMHO this is worth looking into, possibly also asking the author for some tips.
> [2] for encrypted storage we have two cases: (1) the whole device is
> TC encrypted, so there's not even a real partition table, and (2) there
> is a partition table, and the TC volume is one of its partitions. For
> neither of these cases do I think the two criteria about file size can
> make sense, but it deserves some thinking/experimenting.

I reimplemented the chi-square calculation (see attachment) in order to
test how long the calculation takes and which results it gives for TC/VC
devices and other kinds of devices. I only inspect the first 512 bytes
of the device, because I expect this to usually contain either a
partition table or something like a filesystem superblock, neither of
which is random. This also works for LUKS encrypted devices, which
start with a LUKS header.

The results seem good. Some examples:

$ ./chi_square truecrypt_device
chi square: 264.000000
execution time: 0.000036

$ ./chi_square luks_device
chi square: 50663.000000
execution time: 0.000127

$ ./chi_square ext4_device
chi square: 130560.000000
execution time: 0.000036

$ ./chi_square ntfs_device
chi square: 6864.000000
execution time: 0.000033

$ ./chi_square fat32_device
chi square: 10100.000000
execution time: 0.000035

$ ./chi_square lvm2_device
chi square: 130560.000000
execution time: 0.000017

So the execution time is low enough to perform the calculation for every
device.
I also implemented an entropy calculation on the same data, which
produces what looks like equally good results to me, with about the same
execution time. So I think we could also use entropy as an indicator.

The only problematic result I got is from my encrypted swap partition:
chi square: 244.000000
execution time: 0.000035

This is because the swap partition is encrypted with dm-crypt in plain
mode, i.e. non-LUKS mode, so there is no LUKS header. As a result, all
volumes encrypted in dm-crypt plain mode are indistinguishable from
random data, just like TC/VC volumes. The same is true for cryptoloop
(loop-AES) volumes.

Since both the dm-crypt plain mode and loop-AES encryption are also
supported by cryptsetup, we could solve this by letting the user choose
the encryption mode when they try to unlock a volume that looks like
it's encrypted. They would have to select from all of TrueCrypt,
VeraCrypt, dm-crypt plain, and loop-AES.

> Returning back to container files vs GNOME Files, I think it's enough if we simply consider those that have the .tc or .hc extensions. One could consider running the TCHunt algorithm on files with those extensions, but given the risk for false negatives/false positives that seems like it doesn't add anything (contrary to the partition case, where that is all we have).


I'm not sure what you have in mind for how we should handle container
files in GNOME Files. Is this about double-clicking a file in order to
decrypt it? I don't know exactly how to handle this case, but I don't
think it should be GNOME Files specific. Instead, we should register a
mimetype and a default handler for .tc and .hc files. Note that this is
currently not supported for LUKS files either.

>> I see some problems resulting from this:
>>
>> 1. We won't be able to detect TC/VC partitions on removable media
>> plugged in and ask the user to unlock it (like it is done for LUKS
>> partitions).
>
> So, we actually might be able to do this, although not with 100% certainty. So...


You're right, it might be possible to do this, but I think it will be a
UX challenge.

>> 2. We have to treat all unidentified partitions as possibly TC/VC
>> encrypted and allow users to try to unlock them, which might be
>> confusing if the partition is not actually a TC/VC volume.
>
> ... yes, no matter what, we'll need to support this. However, there's a problem, but let's first recognize (unless I'm completely wrong here) that we have two different cases for TC/VC on storage media (as opposed to container files):
>
> 1. the whole storage device (e.g. /dev/sda) consists of a TC/VC volume, so there is probably no identifiable partition table.
>
> 2. only some partition (e.g. /dev/sda1) is a TC/VC volume, so there is a partition table, and the partition probably has no identifiable filesystem header.
>
> Note the word "probably" in both of these cases; given the random + headerless nature of TC/VC volumes, it could be that those first random bytes happen to coincide with what would look like a valid partition table in 1, and a valid filesystem header in 2. In those cases these would no longer be unidentifiable (GNOME Disks uses the term "unknown", FWIW), so in fact we have to support opening *any* device/partition as a TC/VC volume, no matter what udisks thinks it is.
>
> However, I have no idea of how probable any of these are. Perhaps they
> are just too improbable (e.g. in the same order of probability as a
> gamma ray flipping a bit in the controller firmware that bricks the
> device) so we don't have to care? It needs to be investigated!

This is a very good point. I confirmed that by changing only two bytes,
I could make udisks think my VC volume had a filesystem. That would mean
that at least every few thousand times, a TC/VC volume would be wrongly
detected as a valid filesystem. But we could use the chi square or
entropy indicator to detect that the volume does not actually contain a
valid filesystem. I doubt that there is a filesystem which allows the
first block to consist of random data, but we would have to confirm this.
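
To turn the chi square value into a yes/no indicator, one option is to
compare it against the range expected for uniformly random bytes. The
sketch below assumes 256 byte values (255 degrees of freedom), for which
the statistic has a mean of 255 and a variance of roughly 2 * 255 = 510.
The 4-sigma tolerance and the function name chi_square_looks_random are
arbitrary assumptions, not tested thresholds, and with only 512 bytes
(an expected count of 2 per byte value) the approximation is coarse.

```c
#include <assert.h>
#include <stdbool.h>

#define CHI_DF     255.0  /* degrees of freedom: 256 byte values - 1 */
#define CHI_SIGMAS 4.0    /* assumed tolerance, in standard deviations */

/* Return true if chi_square lies within CHI_SIGMAS standard deviations
 * of the mean expected for uniformly random bytes. Comparing squared
 * values avoids a sqrt() call. */
bool chi_square_looks_random(double chi_square)
{
    double deviation = chi_square - CHI_DF;
    double variance = 2.0 * CHI_DF;

    assert(chi_square >= 0.0);
    return deviation * deviation <= CHI_SIGMAS * CHI_SIGMAS * variance;
}
```

With these assumed bounds, the values measured above for the TrueCrypt
device (264) and the plain dm-crypt swap partition (244) count as
random, while the LUKS, ext4, NTFS, FAT32 and LVM2 values do not.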

>> 3. We probably won't be able to display TC/VC partitions in the quickly
>> accessible sidebar of GNOME Files, because this seems to be reserved for
>> special volumes (I couldn't figure out yet which criteria a volume must
>> currently meet to be displayed there, but it does include LUKS
>> partitions from removable media).
>
> Ok. I'm a bit lost on this point, so unless you can clarify it, I'll just skip it.


I think we might have different understandings of what "extending GNOME
Files for TC/VC" means. My understanding is that we want to show TC/VC
devices in the sidebar, like it is currently the case for LUKS devices
(see attached screenshot). If you click on such a LUKS device, it
prompts for the password and unlocks it.

>> I'm wondering if we could integrate TC/VC support into GNOME Files in
>> another way. We could display all possible TC/VC partitions in "Other
>> Locations" and allow unlocking there. But I'm not sure if this adds any
>> value if you could also use GNOME Disks instead.
>
> That location normally doesn't list unknown media, so how exactly do you envision this working? That each unknown medium is shown there, and clicking it asks if you want to try opening it as a TC/VC volume? Or that there is an "Open TrueCrypt/VeraCrypt volume" action that then will list all possible devices/partitions? Neither of these sounds particularly attractive to me.


Well, it does list many different partitions on my system. So I thought
this would be a better place to list a possible false positive than the
sidebar. And yes, I want a click to unlock the TC/VC volume. But with a
good indicator, I think we could also use the sidebar.

> A closing thought on all this is that, after all, the difficulty of identifying TC/VC volumes is actually a feature, one that some users appreciate. As such, I think it'd be fine to give the users the control of when they want the identification of the volume to be easy vs hard.


I think only a few TC/VC users actually know about this feature; most
use TC/VC for other reasons. So I don't think this is a good argument
to offload work to the users - I'd prefer it if there was another way.

> For the file container case, that is whether they use the .tc/.hc extension; for TC/VC media it's less clear, since I guess there's no standard of how to make that obvious, unlike the file extension case. Perhaps we can invent one (and preferably get VeraCrypt on-board)? E.g. that the volume must have a GPT label with some specific string (so we explicitly disallow case 1, which is fine since the user doesn't care about hiding the volume)?


Like you said, it is a feature that TC/VC volumes are indistinguishable
from random data. I expect that the VeraCrypt developers don't want to
change this.

> Anyway, it's worth looking up if there actually is a standard for
> this (what does VeraCrypt do when encrypting a device/partition? I never
> tried it or TC on anything besides file containers, so I have no clue).

What do you mean? A standard for what? VeraCrypt doesn't know which
devices contain VeraCrypt volumes, the user must choose the device (but
they can add devices as "favorites").

Cheers!

#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>

/* Compute the chi square statistic of the byte distribution in the
 * first 512 bytes of the file or device at path. */
float get_chi_square(const char *path) {
  int fd = -1;
  unsigned char buf[512];
  unsigned int symbols[256] = {0};
  float chi_square = 0.0f;
  /* Expected count per byte value if the data is uniformly distributed */
  float e = sizeof(buf) / 256.0f;
  unsigned int i;

  fd = open(path, O_RDONLY);
  if (fd == -1) {
    fprintf(stderr, "Error opening file\n");
    exit(2);
  }

  if (read(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
    fprintf(stderr, "Error reading from file\n");
    close(fd);
    exit(3);
  }

  close(fd);

  /* Count the occurrences of each byte value */
  for (i = 0; i < sizeof(buf); i++)
    symbols[buf[i]]++;

  /* Sum the squared deviations from the expected count */
  for (i = 0; i < 256; i++)
    chi_square += (symbols[i] - e) * (symbols[i] - e) / e;

  return chi_square;
}

int main(int argc, char *argv[]) {
  float chi_square;

  if (argc < 2) {
    fprintf(stderr, "Usage: chi_square FILE\n");
    exit(1);
  }

  /* clock() measures CPU time used by the process, not wall-clock time */
  clock_t begin = clock();
  chi_square = get_chi_square(argv[1]);
  clock_t end = clock();
  double time_spent = (double)(end - begin) / CLOCKS_PER_SEC;

  printf("chi square: %f\n", chi_square);
  printf("execution time: %f\n", time_spent);

  return 0;
}