A random tidbit on non random data

I was recently talking with somebody who felt that TrueCrypt hidden volumes were the bee’s knees. The scenario they used, and which I myself have read ‘musings’ about, involved a laptop carrying sensitive corporate data being seized by customs. The laptop drive gets “reviewed”, the secret container is not seen, and the laptop passes as normal and uninteresting. Big deal. The bigger deal is if you have 007-style data and that guy in the uniform is pretty certain you have it as well. My colleague’s version of the story ends with an almost Hollywood-style exhalation of breath and a cinematic zoom out to the hero walking out the door. That’s not how it would probably pan out…

TrueCrypt volumes, which are essentially files, have certain characteristics that allow programs such as TCHunt to detect them with a high *probability*. The most significant, in mathematical terms, is that their file size in bytes modulo 512 is 0. Now it is certainly true that TrueCrypt volumes do not contain known file headers and that their content is indistinguishable from random data, so it is difficult to definitively prove that a given file is a TrueCrypt volume. However, the very presence of such files can provide reasonable suspicion that they contain encrypted data.

The actual math behind this is interesting. TrueCrypt volume files have file sizes that are evenly divisible by 512 and their content passes chi-square randomness tests. A chi-square test is any statistical hypothesis test in which the sampling distribution of the test statistic is a chi-square distribution* when the null hypothesis is true, or any in which this is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-square distribution as closely as desired by making the sample size large enough. Here the null hypothesis is that every byte value is equally likely, which is exactly what well-encrypted data looks like.
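To make the two checks concrete, here is a minimal Python sketch of the idea. It is not TCHunt's actual code: the function names, the sample size and the acceptance band (`lo`, `hi`) are my own illustrative assumptions, picked loosely around the mean of 255 you would expect for a chi-square statistic with 255 degrees of freedom.

```python
import os

SECTOR = 512  # candidate volumes have sizes that are multiples of this


def chi_square_bytes(data: bytes) -> float:
    """Chi-square statistic of the byte frequencies against a uniform distribution."""
    if not data:
        raise ValueError("need at least one byte")
    expected = len(data) / 256  # expected count for each of the 256 byte values
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return sum((c - expected) ** 2 / expected for c in counts)


def looks_like_encrypted_volume(path, sample_size=1 << 20, lo=180.0, hi=340.0):
    """Heuristic only: size divisible by 512 AND byte counts that look uniform.

    lo/hi are illustrative thresholds (an assumption), not values from any tool.
    """
    size = os.path.getsize(path)
    if size == 0 or size % SECTOR != 0:
        return False
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    # Rule of thumb: the test is unreliable below ~5 expected hits per byte value.
    if len(sample) < 5 * 256:
        return False
    return lo <= chi_square_bytes(sample) <= hi
```

A True result only means "random-looking and sector-aligned", which is the point of the paragraph above: suspicion, not proof. Compressed archives and other encrypted formats can look much the same.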

So what does this all mean? Really nothing for us normal people. For those of you for whom I have built custom STSADM containers to secure your backups and exports, your data is still secure and will stay that way indefinitely. For those running across the border, a forensic analysis will reveal the presence of encrypted data, TrueCrypt volumes or otherwise, but not much more. Sometimes that’s enough to start asking questions or poking further. With the forensic tools, not the dentistry kit.

* A skewed distribution whose shape depends on the number of degrees of freedom. As the number of degrees of freedom increases, the distribution becomes more symmetrical.

http://www.truecrypt.org/
http://16systems.com/TCHunt/

SharePoint Disaster Recovery: A moment

Disk space is cheap. We all hear it and see it, but plenty of you out there seem to ignore this fact. Yes, there can be a cost associated with maintaining the extra volumes in your data plan, but does there really have to be?

Let’s face it, the average hard disk has a stated MTBF that is just ridiculous. Oft misinterpreted, and more generally misunderstood, the numbers range upward of 50+ years. They are sourced roughly with the following logic: if a drive has an MTBF rating of 300,000 hours and a service life of 5 years, then a large group of these drives, each replaced at the end of its service life, should rack up about 300,000 combined drive-hours of service per failure. It does not mean a single drive will run for 300,000 hours (about 34 years). Needless to say, the unknown unknowns can interfere… The key point here is that, as standalone devices, they are supposed to be, and typically are, rock solid and reliable. Paired with a drive of equal properties from a different manufacturer or, if from the same one, a different production batch, your odds of losing both at once are reduced even further. Right now an external 1TB drive with USB or FireWire will run you less than $150. Buy two and you’re still under $300. Total cost for electricity, ~$50 a year? That’s cheap.
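For the curious, the MTBF arithmetic can be sketched in a few lines of Python. This assumes a constant failure rate (the usual simplification behind an MTBF figure) and that the two drives fail independently, which is the reason for buying from different manufacturers or batches. The function names are mine, purely for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8760


def annual_failure_probability(mtbf_hours: float) -> float:
    """Chance that one drive fails during a year of continuous use.

    Assumes a constant failure rate (exponential model), which is a simplification.
    """
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures(mtbf_hours: float, drives: int, years: float) -> float:
    """Expected number of failures across a fleet: total drive-hours / MTBF."""
    return drives * years * HOURS_PER_YEAR / mtbf_hours


p = annual_failure_probability(300_000)
print(f"one drive failing in a year:   {p:.2%}")      # a bit under 3%
print(f"both of two drives, same year: {p * p:.3%}")  # assumes independence
print(f"failures, 100 drives, 5 years: {expected_failures(300_000, 100, 5):.1f}")
```

So a 300,000 hour rating works out to roughly a 3% chance of a given drive dying in any one year, and the chance of both drives in a pair dying in the same year is well under a tenth of a percent, provided their failures really are unrelated.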

Now why don’t people just hook one of these to a server (networked would be a bonus) and add it in as an additional backup location? Some do, but they are the exception, not the norm. More than once, though sometimes it took some “cajoling”, clients of mine have seen the merits of extra, cheap storage that STSADM can dump data onto securely and that can be retrieved from quickly and easily. I’m a firm believer that the more baskets you have, the fewer broken eggs you have.

Needless to say you can secure these drives with something like this…

dd: clean your drive securely

Now like anybody I’m a BIG fan of wiping old drives using dd, but sometimes there’s a tool out there that will do most if not all of the work for you. Cue DBAN. Or, as the site says:

Darik’s Boot and Nuke (“DBAN”) is a self-contained boot disk that securely wipes the hard disks of most computers. DBAN will automatically and completely delete the contents of any hard disk that it can detect, which makes it an appropriate utility for bulk or emergency data destruction.

Complemented with TrueCrypt you will have a mighty secure setup. Possible / definite paranoia issues too… But your data will be secure. For the more command-line orientated, the old reliable dd if=/dev/urandom of=/dev/disk bs=1k is good enough imho, where /dev/disk stands in for the actual device you want to wipe. (It puts random bits in place as opposed to a regular pattern. Not that it will stand up to NSA level scrutiny but it’s more than enough to defeat most data recovery…)
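If you want to see what that dd line is doing without pointing anything at a real disk, here is a rough Python equivalent that overwrites a regular file (say, a disk image) in place with random bytes. The function name and the block size are my own choices, and this is a sketch rather than a vetted wiping tool: on journaling filesystems and SSDs, old copies of the data may survive elsewhere on the media.

```python
import os


def overwrite_with_random(path: str, block_size: int = 1024 * 1024) -> int:
    """Overwrite an existing regular file in place with random bytes.

    Returns the number of bytes written. Intended for regular files only;
    os.path.getsize() does not report a useful size for raw devices.
    """
    size = os.path.getsize(path)
    written = 0
    with open(path, "r+b") as f:  # r+b: write over the data without truncating
        while written < size:
            chunk = os.urandom(min(block_size, size - written))
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return written
```

Call it as overwrite_with_random("old-backup.img"), where the file name is just an example. For whole drives, dd or DBAN remain the right tools.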

For more go to:

DBAN: http://www.dban.org/
TrueCrypt: http://www.truecrypt.org/