Monday, October 20, 2008

The Limits of Rainbow Tables

In a standard offline password cracking attack you have a hash that you are trying to crack, (for example 7c6a180b36896a0a8c02787eeafb0e4c), and you need to guess the password that generated it, (in the above case it would be 'password1'). To do this you make a lot of guesses, hash them and then compare the guess's hash to the hash you are trying to crack. If they match, you have "cracked" the password.

What this means is that you often spend a majority of your time generating hashes. Since the MD5 hash of 'password1' will always hash to 7c6a180b36896a0a8c02787eeafb0e4c though, some smart people thought it would be a great idea to make all our guesses, hash them, and then save the results. That way when we want to crack a password, we just do a lookup on our table of precomputed hashes.

A defense against this is to use a password "salt" which is a random value added to the password before it is hashed. For example if you salted the password 'password1' with the salt '122342' the resulting password hash would be cff2402abb06e3015d70619a9310930b. If everyone uses a different salt, precomputed hashes do not work. It's a simple solution, but unfortunately from what I've seen with the disclosed password lists we deal with, very few people actually use salts.

Now the above method of saving passwords is called a "Hash Lookup Table". They work really well, but for the most part are very unpopular because they can grow to an enormous size. Even a simple hash lookup table can be several terabytes large. To solve this problem though an algorithm was developed to dramatically shrink the size of these Hash Lookup Tables by a factor of several thousand, (or more). Lookup tables generated this way are called "Rainbow Tables" and they are extremely popular in password cracking. That being said, you rarely get anything for free, and they have some severe shortcomings compared to traditional Hash Lokup Tables.

So yah, all the above was a long introduction to the meat of this post. I just wanted to talk about some of the tradeoffs between choosing Rainbow Tables vs. Hash Lookup Tables. As disk space gets cheaper and if people continue to avoid using salts, I think it will be important as a community to be able to weigh the merits of the two so we can focus on the best approach to spend our free CPU cycles generating.

Rainbow Tables Pluses:

  1. Takes up much, much less disk space than Hash Lookup Tables
  2. Nothing else, but I want to stress, point #1 is really important

Rainbow Tables Minuses:

  1. Rainbow tables usually take much longer to crack an individual password than a Hash Lookup Table. The more you compress a rainbow table, the longer it takes.
  2. It takes a Rainbow table twice as much time to crack two passwords as it takes to crack one password. While this generally applies to Hash Lookup Tables as well, (depends on your search algorithm), it is much worse with a Rainbow Table attack due to point #1. In fact, if you have several thousand passwords to audit, it can actually be faster to do traditional password cracking, (where you generate all the hashes), instead of using a Rainbow Table.
  3. When you create Rainbow Tables, you do it by randomly generating guesses in a repeatable fashion. What this means is that you are not guaranteed to have a password guess in your Rainbow Table. For example, if you generate a Rainbow Table that contains all words of length 8, it might not include the word 'password'. With a Hash Lookup Table you don't have that problem
  4. As a side note to point #3, Rainbow Tables take much longer to create than Hash Lookup Tables because you end up generating and hashing many duplicate guesses to increase the probability of covering the entire keyspace. If it takes you a week to brute-force a keyspace, it would probably take you several months to generate a Rainbow Table that does the same thing.
  5. It's trivial to add new guesses to a hash lookup table, while with Rainbow Tables it is much more complicated.

In short, Rainbow Tables are not perfect, but they are convenient since they can be used by people who don't have several hundred terabytes of free diskspace to spare. For smaller keyspaces though, (for example, a list of the most popular passwords), Hash Lookup Tables can be very useful.

Tuesday, October 14, 2008

Quantum Snake Oil

Seeing things like this happen makes me sad about the security industry:

Massive Quantum Network Unveiled

I could devote an entire blog just to debunking Quantum Cryptography. Back in 2005 I worked with a team to evaluate if Quantum Cryptography was a technology that was worth investing in. My recommendation was a resounding no. Since then I have to say that my answer hasn't changed.

First some background. To get a general understanding of Quantum crypto, you need to know that it works on the idea of probability. If Bob sends a message to Alice she will only be able to receive 50% of that message. If Mallory is sitting in the middle and intercepts the message, he also only gets 50% of the message, but due to the fact that Bob is sending photons instead of 1's and 0's Mallory can not resend the entire message to Alice. So the best Mallory can do is send 50% of the message on, and then fill in the other 50% with random gibberish. This means that Alice will only be able to get around 25% of the actual message from Bob, (50% of 50%). If Alice sees her error rate go up, she knows that someone is tapping the line. How does this work? Magic. Honestly, it really doesn't matter due to all the problems with the fundamentals of Quantum Crypto.

First, let's assume that this link is totally unbreakable. That's not true....

Laser cracks 'unbreakable' quantum communications

But still, I'm not ready to argue quantum mechanics with a physics major so I'll give them that. The problem is that quantum crypto also relies on an out of band channel that at best uses traditional crypto to communicate which bits were received on the quantum channel. That channel is still subject to normal attacks. In fact, since the companies making quantum crypto devices are so focused on the "Gee Whiz we're using photons", they have been badly neglecting this side channel, making their implementations much weaker than traditional VPNs

Quantum Cryptography: Researchers Break 'Unbreakable' Crypto

That being said, even if Quantum Crypto was "unbreakable" the costs of it are huge. The boxes themselves are not the problem, but running dedicated fiber lines between the different sites is horribly expensive. A much cheaper solution would be to hire a trusted person to take a stack of Cd's, (or heck spluge on BlueRay), filled with AES keys to the different sites, and just cycle through a new key a minute. I guess what I'm trying to say is traditional crypto has been banged away at for a long time by some really smart people and been found to be secure if used correctly, (admittedly a big if). Quantum crypto has not. Just because it sounds like magic does not mean you should trust it.

As a side note, it annoys me when people say that Quantum Crypto was developed as a response to Quantum Computers. A marketing response yes, but it is not a technology response. Quantum Computers have the potential to break certain algorithms like RSA faster than traditional computers. People realized that, and as Quantum Computers become more mature, you might not want to rely on RSA. Algorithms like AES on the other hand are no more vulnerable to Quantum computing than traditional computing. Quantum Computers, while nice for certain tasks, will not be a win button for crypto breaking.

Thursday, October 9, 2008

Password Cracking Geekiness

Since I'm stuck in the terminal on my way to Boston I figure I might as well be contrarian and post about something besides the stupidity of Airport security.  On that note though, why we as a society haven't risen up and revolted against having to take our shoes off I will never know...

There are really two approaches to dictionary attacks in password cracking.  It's kind of appropriate that John the Ripper (JtR) and Cain and Able (C&A) take different sides in that divide considering their user-bases get along about as well as Mac and Windows users.   As you probably know, in a standard dictionary attack you take dictionary words and mangle them in a predefined way.  For example you take the dictionary word "password" and turn it into "P@ssWord99".  Where the two approaches differ though is in what order they apply the mangling rules to dictionary words.

The first approach, which JtR takes, runs through the rules in order.  It applies each individual mangling rule to all the dictionary words before moving on to the next rule.  For instance, if you have two dictionary words "foo" and "bar" it would create guesses in the following order:
foo1
bar1
foo2
bar2
FOO
BAR
....

C&A on the other hand applies all the mangling rules to each individual dictionary word before trying the next word.  Once again using "foo" and "bar" C&A would make the following guesses:
foo1
foo2
FOO
bar1
bar2
BAR

Now ideally if you have time to run through all your mangling rules/dictionary words then both options are equally valid.  The thing is, you generally want to crack passwords as quick as possible so you can move on to your next case.  So which approach is better?  I tend to favor the first approach as that assumes that the words in your dictionary have about the same probability, but the rules vary widely in how likely they are to generate a "hit".  In C&A, the view is you want to exhaustivly test the highest probability words first.  Now I admit, some words are much more likely to generate hits than others, "monkey, password, swear words, football, etc".  It's pretty easy to deal with that in JtR though as all you need to do is do two runs, the first one with a "high priority" dictionary and the second one with a larger less specific dictionary.

The main problem with C&A's approach is that you tend to get stuck on certain word/rule combinations.  For example, C&A gives you the option to try every single case combination, (aka PaSSwoRd).  Unless you prune your dictionary though, you can find yourself spending all day/week on words like "supercalifragilisticexpialidocious".  Also, if your dictionary is a million words long, you usually are better off trying the highly likely rules  first on all the dictionary words before you start doing things like adding four digits to the end of passwords.

That being said, the biggest advantage of C&A's approach from a generating/coding standpoint is that it is very hard to do comprehensive case mangling or character insertion in the approach JtR takes.  You would need a rule for every possible option, which can quickly balloon into 1,000's of rules.  Also since your dictionary words are of different lengths, you waste some time trying to mangle words only to find out they aren't long enough for that rule to apply to them.  In short, that's one huge advantage C&A has over JtR.  When cracking passwords, I'll often run them through C&A on one computer just to take advantage of it's case mangler even though I do a majority of my cracking with JtR.  That's annoying to say the least, and C&A doesn't give you much flexibility in the rules you apply to it, (aka you can't do case mangling AND add a number to the end) so I'm working on a script to allow the use of a hybrid approach in JtR.  I'm tempted to call it Mowndo the word mutilator (It's got electrolytes... See the movie idiocracy if you don't get it).  It takes a JtR config file and applies the standard JtR rules in order, (like normal), but it also performs case mangling like C&A.  Once again using "foo" and "bar" it would produce:
foo1
Foo1
fOo1
foO1
bar1
Bar1
bAr1
baR1
foo2
Foo2
fOo2
.....

Now for the most part this doesn't matter since only about 8% of normal password contain an uppercase letter.  Of those 8%, a vast majority of them either capitalize the first letter, capitalize everything, or cap vowels.  What we are really interested in though is trying to crack those "strong" passwords.  When targeting them, this approach might help us a bit.  In short, it's not something you would want to do at the beginning of your cracking session, but it would be useful to use before you give up and resort to bruteforcing the password.

Monday, October 6, 2008

Secure Programming

I write most of my code in C and occasionally C++. I know Perl, (or Java), would be better in many cases but all my programming classes, (the school kind), require C so that's what I'm banging out most of the time. Add to that the fact that most password cracking programs are written in C/C++, (JtR written in C), (L0phCrack written in C), (rcrack written in C++), (Access Data's PRTK's cracking engine is written in C), so C tends to be my language of choice.

Well, today I came smack up against the fact that the strnstr() function isn't widely supported across platforms. Yes, I know the "n" functions aren't much better than the other string functions. Heck, I've had more than enough segfaults even when I thought I was using them correctly. At the same time though, they are "easy" to use and supported everywhere, (with the exception of strnstr). I know about the "l" functions, but once again they are mainly used on OpenBSD and aren't available on many flavors of *nix. The other option everone seems to mention is that I could just write my own string handling classes, but that's just asking for trouble. Finally, I could ditch straight C and use the C++ string class but a lot of my code still needs to be ANSI C combatable so that doesn't work. What we really need is a library that meets the following requirements:
  1. Portable - (Ideally it should work across all ANSI C compatible compilers)
  2. Secure - (otherwise we would just use the string.h library)
  3. Flexible - (It needs to be able to handle the tasks I need it to do)
  4. Well Known - (If people don't know about it, they aren't going to use it)

That gets to the core of my problem. Why is it that Microsoft is the only group that has released a secure string handling library, "strsafe.h", that is actually good? With the exception of option #1 above, it meets all the requirements. It's a rhetorical question, but why aren't "secure" versions added to string.h, (or at least the GLibc library)? It's no wonder programmers are still making the same mistakes they were 20 years ago if they haven't been given the tools to write better code.

For my part as a side project I'm going to see if there is a port of "strsafe.h" to the Linux environment. BTW, if you are programming in a Windows environment, you would be crazy not to use "strsafe.h". It's easy to make the transition, and it's even fairly straightforward to update most legacy code since there's a one to one correspondence between most of the string.h and strsafe.h functions.

If I can't find that port though, my options boil down to finding a library on sourceforge or writing my own. Unfortunatly, that fails option #4, and probably option #2 since writing secure code is HARD. Even the best programmers often miss weird cases, so I generally don't trust something until a lot of people have had the chance to bang away at it.

The ironic thing is that for the programs I'm writing, security isn't a requirement. That being said, I've found that trying to write secure code generally makes my programs more robust, and it's good experience. Also, it's hard for me to give people heck when I'm still using strcpy...

If anyone knows of a better solution I would love to hear it. On that note though, if you are programming in C/C++ I have to recommend the following link..

http://msdn.microsoft.com/en-us/library/bb288454.aspx

Don't say I didn't warn you...

Statement of Goals
For a long time I've kept an e-mail list of friends where we would discuss security issues, both computer related and not. With my password cracking webpage starting to see some traffic , I figure I might as well move some of my ramblings from bar napkins to the internet. Not that's an improvement mind you, but I'm always worried about finding myself in an echo chamber and I would really like some feedback. That, and the remote-exploit forums aren't always the best place to post random ideas.

What You Can Expect
Infrequent posts first of all... But most of my posts will also probably center around my current research in password cracking. I'm interested in network security and debunking bad crypto as well, though I know enough to realize I personally can't do good crypto. On that same note, I'm willing to admit that there's a lot about computer security I don't know so please take everything I write with a grain of salt.