Thursday, October 29, 2009

Analysis of 10k Hotmail Passwords - Even More Brute Force

A reader asked me through e-mail how much better John the Ripper's Markov models were compared to pure brute force or letter frequency analysis. I knew there was a reason why I put my e-mail address on the side of this blog. That's a great question, since while I'd always had more success with Markov models vs letter frequency analysis, (and certainly brute force), I had never measured the difference before. What type of researcher am I? I better fix that, so let's check it out.

Test 4: Markov Models vs. Letter Frequency Analysis vs. Pure Brute Force

So in this test I reused the data collected previously in Test 1 using JtR's -incremental mode targeting lowercase letters and numbers, (a-z0-9). I then used the popular tool crunch to run both the brute force and letter frequency analysis, (which I'm going to call LFA), attacks since JtR doesn't support pure brute force, (well there is a bit of a hack, but crunch is easier). For the pure brute force attack I ran:

./crunch 6 8 abcdefghijklmnopqrstuvwxyz0123456789

In short, I'm creating brute force guesses between 6 and 8 characters long containing a-z0-9, starting with aaaaaa and theoretically ending with 99999999, (in this run it never finished all of the six character long words). This is also the default character set that the popular password cracker Cain & Abel uses.
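To put that keyspace in perspective, here is a quick back-of-the-envelope sketch in Python. It is my own illustration and has nothing to do with how crunch works internally; the function name `keyspace` is made up for this example.

```python
# Size of the a-z0-9 keyspace for each guess length covered by the crunch command.
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"


def keyspace(charset_size: int, length: int) -> int:
    """Number of distinct strings of exactly `length` characters."""
    return charset_size ** length


if __name__ == "__main__":
    for length in range(6, 9):
        print(f"{length} characters: {keyspace(len(CHARSET), length):,} guesses")
```

The six character words alone account for 36^6 = 2,176,782,336 guesses, which is why a run on the order of a billion guesses never gets past them.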

For the LFA attack, I then ran it using the numbers based on my analysis of the phpbb.com password list. As we'll see, you can do better if you base your LFA on analysis of Spanish/Portuguese passwords, but I don't have another dataset of Spanish/Portuguese passwords so this is the best I can do without cheating. The command I used was:

./crunch 6 8 smp1abctdkjlhrfgnw2ei0ov3q457z968yux

A couple of notes. I used the first character LFA charset since crunch increments its values from right to left; aka it will try 01, 02, 03, 04 etc. This means using the above character set, it will initially crack all the words starting with 's', then move on to words starting with an 'm' and so on. When there is no password creation policy, (aka users aren't forced to include numbers in their passwords), this is slightly better than password crackers such as Cain & Abel which increment their values from left to right, aka 10, 20, 30, 40 ... The reason for this is that people are slightly more predictable in how they start words/passwords, vs how they end them. For example, from the phpbb.com dataset, over 51% of the users chose one of the top ten characters to start their passwords. Likewise, 46% of users chose one of the top ten characters to end their passwords. It's not a big difference, but every little bit helps. If there is a password creation policy though, this quickly falls apart, and it's vastly preferable to use a left to right approach since people almost always put numbers/special characters at the end of their passwords. For example, just the number '1' appears 21% of the time at the end of stronger passwords.
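If the right to left ordering is hard to picture, the following Python sketch mimics it. This only imitates the ordering described above and is not crunch's actual implementation; the names `guesses` and `first_guesses` are hypothetical helpers for this example.

```python
from itertools import islice, product

# First character LFA ordering taken from the phpbb.com analysis above.
PHPBB_FIRST_CHAR_ORDER = "smp1abctdkjlhrfgnw2ei0ov3q457z968yux"


def guesses(charset: str, length: int):
    """Yield fixed-length guesses with the rightmost position changing fastest."""
    # itertools.product advances its last element first, which matches
    # the right to left incrementing described in the text.
    for combo in product(charset, repeat=length):
        yield "".join(combo)


def first_guesses(charset: str, length: int, count: int) -> list:
    """Return the first `count` guesses without generating the whole keyspace."""
    return list(islice(guesses(charset, length), count))


if __name__ == "__main__":
    print(first_guesses(PHPBB_FIRST_CHAR_ORDER, 6, 5))
```

Because the leftmost position changes last, every six character word starting with 's' (36^5 of them) is tried before the first word starting with 'm'. That is why the order of the charset string matters so much.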

Also, the more astute, (and/or skeptical), readers might notice the order I'm using is slightly different from the one I published before. That's because I've cracked more passwords from the phpbb.com dataset, (I'm nearly up to 98% of the total set cracked), so I've updated some of my statistics. That's another thing I need to get around to posting up here...

Results: Note, you can click on the image for a bigger version of it.


Analysis: The first thing that stands out is that my letter frequency analysis percentages are way off. I actually do worse initially than if I had chosen a pure brute force approach. I know it's cheating, but going back to my original analysis of the Hotmail list, and comparing it to the phpbb list, you get the following:

Hotmail list: a1mbc2sp0lterdjfgn3hi6k759vo48ywzuqx
Phpbb list : smp1abctdkjlhrfgnw2ei0ov3q457z968yux

Looking at this, the best choice you could have made was to start with the letter 'a', which is coincidentally what the pure brute force approach did. The reason for the huge bump in the LFA attack around the 220 million guess mark is that it hit the '1' as the first character, followed by an 'a', which was about the equivalent of a perfect storm since people started their passwords with a number, followed by the actual word they were using. Even with my probabilities being off though, LFA still beat out pure brute force, so if you aren't at least using that, well I don't know what more I can say to convince you. That being said, this really shows how superior using Markov models is to both attacks, with the alphanumeric Markov attack cracking more than twice the number of passwords that the LFA did. I'd also like to point out that if you look at the first several million guesses, a Markov based attack performs so much better than LFA it's ridiculous.

Test 5: LFA and Brute Force Attacks Against Different Length Passwords

The next question is how do LFA and Brute Force attacks perform as the password gets longer? As I said before, the above test only shows them attacking 6 character long passwords. What happens when we bump up the length to 7 or 8 characters minimum? To simulate this, I once again used crunch with the above settings except for increasing the minimum size of the guesses allowed. I also used the phpbb LFA probabilities, since I hate training and testing on the same dataset, (also I had already run the tests before I graphed out the results and realized how off those probabilities were...)

Results:



Analysis: I didn't even bother to put the eight character one up there since it was pretty boring. If I'm generous the pure brute force attack would crack one eight character long password, (it took slightly more than one billion guesses to do so), and the LFA attack would crack two. This shows though that while increasing the character set quickly makes brute force and LFA attacks infeasible, which is what you'll see in all the literature, even longer passwords are vulnerable against Markov model based attacks. One way to increase the effectiveness of LFA style attacks is to not try the full alpha numeric character set, (aka drop qxz etc). That's another test I'll try to run later. Even so, once again this shows that Markov models are the way to go. I'd like to point out too, that JtR's Markov model attack almost certainly would perform better if trained against similar passwords. That's another test I'll need to run ... after Halloween weekend.

A Quick Note on Time Required: I think it's important to state that these test runs are extremely short since I didn't want to run each for an entire day just to get a few more data points. My update schedule is already too slow ;) To give you a better idea how long it would take to bruteforce the entire key space, here are some numbers I ran using Cain & Abel. Note, Cain & Abel is much slower than JtR, but it has a really nice time estimator, so please consider these numbers a worst case scenario.

Computer Specs:
2.4 GHz Core Duo
3 Gigs of RAM
NVIDIA GeForce 8800 GTS
Windows 7 - 64 bit

-All tests were run against cracking plain MD5 hashes

Alpha-Numeric passwords: a-z0-9
  • 6 characters: 14 minutes
  • 7 characters: 8 hours
  • 8 characters: 12 days
The Entire US Keyboard
  • 6 characters: 3 days
  • 7 characters: 272 days
  • 8 characters: 71 years
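For anyone who wants to sanity check those estimates, here is a rough Python sketch. Everything about the guess rate is an assumption on my part: I back it out of the "6 characters: 14 minutes" figure, assume each figure covers only guesses of exactly that length, and assume "the entire US keyboard" means the 95 printable ASCII characters. The helper names are made up for this example.

```python
def seconds_to_exhaust(charset_size: int, length: int, guesses_per_second: float) -> float:
    """Time needed to try every string of exactly `length` characters."""
    return charset_size ** length / guesses_per_second


def humanize(seconds: float) -> str:
    """Render a duration using the largest unit that fits."""
    units = (("years", 365 * 86400), ("days", 86400), ("hours", 3600), ("minutes", 60))
    for unit, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.0f} seconds"


# Assumption: guess rate implied by 36^6 guesses taking 14 minutes.
RATE = 36 ** 6 / (14 * 60)

if __name__ == "__main__":
    # 95 is an assumed size for the full US keyboard character set.
    for label, size in (("a-z0-9", 36), ("US keyboard (assumed 95 chars)", 95)):
        for length in (6, 7, 8):
            estimate = humanize(seconds_to_exhaust(size, length, RATE))
            print(f"{label}, {length} characters: {estimate}")
```

The implied rate works out to roughly 2.6 million guesses per second. Since the character set size and rate are guesses, expect the output to land in the same ballpark as the Cain & Abel numbers rather than match them exactly.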
Oh, and keep sending me ideas, either via e-mail or in the comments of these posts.

Sunday, October 18, 2009

Analysis of 10k Hotmail Passwords Part 3 - Brute Force

As promised, let's see how these Hotmail passwords would fare in a real password cracking attack. This post will cover brute force attacks, and I'll make a later post going over the effectiveness of dictionary based attacks. As always, if there is any further attacks/analysis you would like to see run against these passwords, please let me know in the comments.

Test 1: John the Ripper Incremental Modes

The first test I wanted to run was to use John the Ripper's brute force attack, (aka -incremental). As I mentioned in a previous post, JtR's incremental attack is very powerful and probably as close to an "industry" standard as you will find. For this test, I ran JtR's four different built-in incremental modes, (All, Alpha, Digits, Alnum), against the password set, and let each one run for one billion guesses. While in a real password cracking attack, the attacker would probably run their attacks much longer, (aka try trillions of guesses), I figured a run of one billion guesses was enough to give people a general idea of the effectiveness of these attacks. Since the minimum password requirement for Hotmail passwords required legitimate passwords to be at least six characters long, I modified my JtR config file so it would only try guesses of six characters or more. Also, since the built in brute force attack in JtR won't try passwords longer than eight characters long, for the Digits only attack I ran it through Middle Child to add two digits to the end of each guess. This is because JtR would only make one-hundred million guesses otherwise, (aka all digits between 0 and 99,999,999.) I know I can patch JtR to try longer guesses, but I'm lazy.

Character Set of Each Attack Type:
  • Alpha: a-z, all lowercase
  • Alnum: a-z 0-9, all lowercase
  • Digits: 0-9
  • All: a-z A-Z 0-9 and keyboard characters such as !@#$%
Results:



Analysis: As you can see, bruteforcing lowercase characters and numbers was the most effective strategy, cracking a hair over 30% of the passwords. It shouldn't be surprising that it did better than the "All" attack since less than 10% of the passwords contained an uppercase character or a special character. By including those in your attack you drastically increase the search space you are attacking. It's only the fact that JtR's Markov models are so good that the "All" attack remains competitive.

It may appear that the digits attack stopped cracking passwords after around 300 Million guesses, but the truth is that it was running out of passwords to crack. After 1 billion guesses, there were 36 passwords that contained only numbers that hadn't been cracked. This also shows how weak numeric passwords can be, (since there are only ten digits vs. 26 letters).

Test 2: Running a longer cracking session

Next I wanted to try the "All", and the "Alpha Numeric" attacks again, but let them run much longer. In this case I let them run for 25 billion guesses. This took a while even without having to do any hashing just because JtR had to spend time generating the guesses, I had to check/record those guesses, (and I was running it on my laptop).

Results:

Analysis: I think this shows fairly dramatically how front-loaded a password cracking session can be. An attacker will initially see very good results, but after the easiest passwords are cracked it requires many more guesses to crack successive passwords. It almost looks like a logarithmic function. Near the end it was taking over fifty million guesses to crack each additional password. Even if that rate remains constant, (in real life it almost certainly will get worse), you're looking at having to make over 56 billion guesses total to crack 50% of the Hotmail passwords using the Alpha Numeric character set.
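The extrapolation in that last sentence is simple arithmetic, and a small Python sketch makes it easy to redo with your own numbers. The cracked and target counts below are placeholders I picked so the math is easy to follow; they are not the actual counts from this test, and the function name is hypothetical.

```python
def total_guesses_needed(guesses_so_far: int, cracked_so_far: int,
                         target_cracked: int, guesses_per_crack: int) -> int:
    """Extrapolate total guesses, assuming a constant cost per additional crack."""
    remaining = max(0, target_cracked - cracked_so_far)
    return guesses_so_far + remaining * guesses_per_crack


if __name__ == "__main__":
    # Placeholder inputs: 25 billion guesses made, 4,400 of 10,000 passwords
    # cracked, and 50 million guesses per additional password.
    total = total_guesses_needed(25_000_000_000, 4_400, 5_000, 50_000_000)
    print(f"{total:,} guesses to reach the target")
```

Keep in mind this is a best case for the attacker's estimate, since the cost per additional password tends to climb as the session goes on.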

Test 3: Effectiveness of password length against brute force attacks

One person wondered in the comments about how password length affects dictionary based password cracking attacks. I'll get to that question in a later post, but I figured I should run some tests to demonstrate how minimum password length requirements can drastically increase the cost of brute force attacks. For this test, I ran John the Ripper's -incremental mode several times, using the Alphanumeric character set. Each session I modified the config file so it would only attempt to crack passwords of a specified length, from 6 to 8 characters long. The results of this, (measuring percentage of passwords of that length cracked), can be seen below.

Results:

Analysis: Quite honestly, if your password is six characters or less, chances are, it's going to be cracked in under a billion guesses. Please note, none of the attacks will ever reach 100% of the passwords cracked since I'm only bruteforcing lowercase letters and numbers, (and there are a few passwords that used uppercase letters and symbols). Once again, we see that cracking 30% of the passwords is fairly easy even when the password length is longer, but as the attacker tries to start cracking 40-50+% of the passwords the effort required really starts to ramp up. This also shows that the minimum password requirement should be at least seven characters long to defend against any sort of offline password cracking attack.

One interesting thing that is hard to see on the graph is that I only needed 31 thousand guesses to crack 5% of the six character long passwords. I needed 60 thousand guesses to crack 5% of the seven character long passwords. Finally I needed 536 thousand guesses to crack 5% of the eight character long passwords. The reason why I'm pointing this out is to look at the password's resistance to an online attack, (aka where the attacker does not have the password hash and is trying to log into the system). Now admittedly in most online attacks the attacker is almost certainly going to use a dictionary attack vs. brute force. That being said, it was surprising that seven character long passwords only took twice as many guesses to hit that 5% mark, and the difficulty level didn't really ramp up until eight character long passwords were required.

Up Next: Dictionary Based Attacks! If there are any dictionaries you would like to see me use, please let me know, either via e-mail or in the comments. (Yes, I'll run the Wiki dictionary). As a quick preview, most of my input dictionaries have been performing horribly since they aren't targeting Spanish/Portuguese/French speakers, and the foreign language dictionaries I've found online have been ... lacking in quality. What I'm going to do is start building a custom dictionary based on these passwords to use in future research, (I've had great success with that when attacking Finnish and Swedish passwords), but that's not really an option here for obvious reasons. "Hey look, I cracked 100% of the passwords using an input dictionary created from those passwords!"

Saturday, October 17, 2009

Analysis of 10k Hotmail Passwords Part 2

There's been a lot of discussion and analysis of this list on various other sites over the last week. That's actually why I'm so interested in it. It isn't the size. Ten thousand passwords aren't that hard to come across on the net, (as scary as that is). The nice thing though is this password list is becoming sort of a common data-set anyone can work on. This keeps us researchers honest, (if I mess up my analysis someone can easily call me on it), and it gives us a way to test competing password cracking techniques in a public environment.

First off, I'd like to give Google, eBay, and Facebook credit for how they handled this. All three sites suspended user accounts which appeared on the list, (and in the additional 20k list which I'll get to in a second), pending user verification. I don't know how many hoops a user will have to go through to reactivate their account, but this step was necessary to protect them. Unfortunately, according to this writeup, Microsoft didn't lock several of the Swedish Hotmail accounts that appeared in the 20k list, (perhaps international .live accounts are handled by a different internal group in Microsoft), and Yahoo did an even worse job. Oh, and apparently the Yahoo accounts were filled with password reset messages, as thieves used the Yahoo accounts to compromise other online accounts belonging to those users.

As to the additional 20k list, I finally managed to grab a hold of it. Looking through it, it's fairly obvious that the list was collected via a different attack, and probably was posted online by a different hacker. It just happened to be another list that was posted around the same time on pastebin. I'm going to delay doing analysis of it for now, and focus on the Hotmail list to keep things simple. Later on, once I figure out which tests/analysis prove to be the most useful to people, I'll run them on the 20k list and perhaps show some head-to-head comparisons with the Hotmail, phpbb.com, webhosting talk, myspace, etc lists.

So on to the analysis. As promised, here is a graphical breakdown of the password lengths, with the longest password being 30 characters long, (though that looks like it may have been a typo by the user, since they typed their password in three times with a varying degree of 'o's after it. The actual password was probably 17 characters long). On a side note, I really should put all of this in a pdf whitepaper, since the graphs are fairly hard to read here.


As you can sort of see, (once again, I'm sorry about the blurriness and size), 80% of the passwords fell between 6 and 10 characters long. Manually looking at the invalid passwords, (the ones which were blank and/or less than six characters long), I saw that most of the e-mail addresses associated with them also had an entry containing a valid password, (at least six characters long). I didn't check the whole list, (I do sometimes try to have a life), but I saw 50 invalid passwords that matched a valid username/password combo, and 31 examples where I couldn't match an invalid login attempt to a valid username/password. The 31 unmatched invalid username/password combos almost certainly overstate the case, since for most of those the user might have typed their username in wrong as well. Aka the username listed could be 'bob@hotmail.com', but while there were no other login attempts with 'bob@hotmail.com' there would be several accounts for 'bob.lastname@hotmail.com', 'bob.lastname2@hotmail.com', etc. I also saw similar results for people who typed in their e-mail address wrong, (such as bob@homail.com). All this is just a longer way of saying that there probably was some form of authentication in the collection of this password list. Either they were collected via a keystroke logger, or the phishing site attempted to log into Microsoft's servers using the entered credentials and presented the results to the user.

So the next question is how would these passwords fare in a real password cracking attack? Well, it's late and I need to get some sleep so that will have to wait till part 3 of this writeup.

Wednesday, October 7, 2009

Analysis of Hotmail Passwords by Other People

As the title implies, there have been a few other people who have commented on this dataset.

Link:
Comments:
At first I was a bit worried because their numbers didn't match up with mine. It wasn't until later that I realized all of their analysis was only on unique passwords. Aka if three people used the password "monkey123," they would only use that password once when generating their statistics. I disagree with some of their conclusions, but it's certainly worth a read since I can be wrong.

Link:
Comments:
The above post was written as a response to Acunetix's analysis, and I think it brings up some good points on how we need to stop blaming the user for not having some 14 character complex password for every single website they go to. There are some details that once again I don't fully agree with, (I'm much more accepting of people writing their passwords down), but it's a good read. I completely agree that the most important thing is to have three classes of passwords, one that you use everywhere, one for medium sensitive sites, (such as Facebook), and (at least) one strong password for your e-mail and online banking. The simple fact is that phishing and password disclosure, (such as one of the sites being hacked), are a much bigger danger to users than password cracking attacks.

Link:
Comments:
This one is in Russian, (you can use Babel Fish if you want), and it's quite good. It has some excellent graphs. In fact, I'm probably going to steal his idea and make some graphs of my own in a followup post since they convey the information a lot better than raw numbers. Another nice thing he did was exclude all the invalid passwords, (such as passwords that didn't meet Hotmail's password creation policy), in his analysis. The best graph was the one comparing the Hotmail passwords to two other Russian password datasets. In short, this was a really good writeup.

Tuesday, October 6, 2009

10k Hotmail Passwords

Ed note: Ok, I probably should update the title to "30k E-mail Passwords". That teaches me to take my time writing these posts ;) An updated article talking about the additional passwords floating around can be found here. All of the below only deals with the initial 10K passwords that were posted originally, since I haven't been able to find the second 20k list yet. According to Google, there's a third list as well. It's still unknown at this point if all the lists are related or not.

So as some of you may have heard, a little over 10,000 hotmail e-mail account/password combinations were publicly posted online around October 1st, with the first news report about it surfacing around October 5th. First off, I'd like to give special thanks to Steve Gadd and Ilya Sokolov for alerting me about this dataset. I'm always open to any help I can get.

Luckily I managed to snag a copy of the list before it was deleted from Google cache, though I've seen some other copies floating around. The site it was posted on, pastebin.com, has since been taken down by the owner due to the large amount of attention, (and traffic), it has received over the incident. It's also been mentioned that pastebin is putting filters in place to prevent this from happening again. If they look through their old archives though, I think they will be surprised to find that pastebin has been one of the primary sites for distributing passwords for a while. On just about every password cracking forum I've gone to, I've seen people posting their password lists, (both the raw hashes, and the cracked passwords), on pastebin. That's because most forum software won't allow you to add several thousand lines of hashes into a single post. Also pastebin's original goal, of providing a way for developers to easily distribute and modify code, makes it real easy for several people to update a huge list of password hashes with the ones they have cracked. I'm not blaming pastebin for this. It's just interesting to me how many technologies can have a dual use.

The passwords posted only cover a small range of users. Specifically the list only covers users whose e-mail addresses fell into the range ara*** to bla***. It should go without saying then that this is only a small sample of the entire list that the attacker collected. While I don't know how the list was first discovered, my guess is that this list was posted online by the attacker who was trying to sell the entire list, and used this snippet to prove they actually had the goods. Looking at some of the other e-mail lists I've collected through my research, that range, (ara-bla), constitutes around 4% of all the total e-mail addresses. This means that the 10k sample probably represents around a 250k password set. Note, I based this on primarily English e-mail addresses, and the list mostly contains Spanish/Portuguese/French users so this number may be wildly off. That number sounds about right though, since 250k would make a nice round number to sell off in one chunk, (the attacker probably has collected many more passwords and is saving the extra for a different sale).
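The 250k estimate is just the sample size divided by the share of addresses that fall in the ara-bla range. Here is a minimal Python sketch of that reasoning. The helper names and the way I compare three character prefixes are my own assumptions about how you might measure that share on a reference e-mail list.

```python
def fraction_in_range(addresses, low: str = "ara", high: str = "bla") -> float:
    """Share of addresses whose leading characters fall between `low` and `high`."""
    addresses = list(addresses)
    if not addresses:
        return 0.0
    # Compare only as many leading characters as the range prefixes have.
    hits = sum(1 for a in addresses if low <= a.lower()[:len(low)] <= high)
    return hits / len(addresses)


def estimate_total(sample_size: int, fraction: float) -> int:
    """Scale a sample up to the full list it was presumably cut from."""
    return round(sample_size / fraction)


if __name__ == "__main__":
    # 10k sample, with the range covering about 4% of addresses.
    print(estimate_total(10_000, 0.04))
```

As noted above, the 4% figure came from mostly English address lists, so measuring `fraction_in_range` on Spanish/Portuguese/French addresses could move the estimate quite a bit.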

It's important to realize how valuable your webmail account is. Once someone's e-mail account is compromised, you can take over every other web account they own. Banks, Paypal, Facebook, Twitter, amazon.com, the list goes on and on. That's because every other site relies on your e-mail account to send a reset password to if you forget your current one. An attacker doesn't even need to know which online bank someone uses. All they need to do is just go to all of the online banks, enter in the compromised e-mail address and click "I forgot my password". A couple of minutes later, the bank that the user has an account with will e-mail a new password to the compromised account. I wish I could say I'm exaggerating the problem, but your e-mail address is the key to your whole online existence.

So the next question is: how did the attacker get the e-mail addresses/passwords in the first place? It wasn't from Microsoft, that's for sure. I'm very confident about this due to numerous reasons, (invalid e-mail accounts, several gmail accounts being in the list, blank passwords, and passwords that didn't meet the minimum password requirements, etc). That means that this list was either collected through a phishing attack, or through malicious software, (keystroke loggers), installed on users' computers. Luckily, it's looking more and more like it was from a phishing attack, which means that Microsoft's quick efforts to disable the accounts and alert the users will pay off. If it was a keystroke logger, (and by keystroke logger I mean software set to harvest passwords), most of the users would reset their passwords only to have them stolen again. I originally suspected that this was due to infected computers, simply because there weren't any vigilante posts that you normally see due to phishing attacks. Aka passwords such as "F**K YOU, YOU F**KING HACKERS!!!" As Mr. Gadd pointed out to me though, one of the usernames was
still can not believe this, tell me whether you think is real
The lack of vigilante posts though seems to point to the fact that if this was a phishing scheme, it was a very convincing one. Upon doing some further research, it looks like this list may have been collected from a MSN Instant Messenger scam. I'm still a little weak on the details, but the short answer is that the scam site would send out e-mails saying that they could discover who had blocked you on your instant messenger client. The user would then go to the website and log in using their Microsoft account, (supposedly so the site could run its tests to see who was ignoring you). The scammers would then send out the same advertisement to everyone in the user's address book, saying that the e-mail was from them. There are some other details as well that I'm fairly foggy on, (they would try to sell the user some free software, and have the user send them a SMS message so they could split the profit of the SMS message with the phone company).
As a clarification, the only reasons I suspect that the above phishing attack and this password list might be related, is that the first public comment about this list that I can find appeared in a discussion about that scam on October 2nd, and the list mainly contains Microsoft e-mail addresses. There's a very high chance these various scams might not be related.

So on to the analysis:
  • Total Passwords: 9,845 - This number excludes all the e-mail addresses that had blank passwords
  • Average Password Length: 8.7 characters long
  • Percentage that contained an UPPERCASE letter: 7.2%
  • Percentage that contained a special, (aka !@#$), character: 5.2%
  • Percentage that contained a digit: 51.7%
  • Percentage that only contained lowercase letters: 43.3%
  • Percentage that only contained digits: 17.6%
  • Percentage that started with a digit, (aka '1password'): 25.0%
  • Percentage that ended with a digit, (aka 'password1'): 44.1%
  • Percentage that started with a special character: 0.5%
  • Percentage that ended with a special character: 2.2%
  • Percentage that started with an uppercase letter: 6.1%
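If you want to run the same kind of breakdown on another list, a minimal Python sketch could look like the following. This is not the script I used for the numbers above. In particular, treating Python's `string.punctuation` as the set of "special" characters is an assumption, and `password_stats` is a made-up name.

```python
import string


def password_stats(passwords) -> dict:
    """Compute basic composition statistics, skipping blank passwords."""
    passwords = [p for p in passwords if p]
    if not passwords:
        raise ValueError("no non-blank passwords to analyze")
    total = len(passwords)

    def pct(predicate) -> float:
        """Percentage of passwords for which `predicate` is true."""
        return 100.0 * sum(1 for p in passwords if predicate(p)) / total

    return {
        "total": total,
        "average_length": sum(len(p) for p in passwords) / total,
        "has_upper": pct(lambda p: any(c in string.ascii_uppercase for c in p)),
        "has_special": pct(lambda p: any(c in string.punctuation for c in p)),
        "has_digit": pct(lambda p: any(c in string.digits for c in p)),
        "only_lower": pct(lambda p: all(c in string.ascii_lowercase for c in p)),
        "only_digits": pct(lambda p: all(c in string.digits for c in p)),
        "starts_with_digit": pct(lambda p: p[0] in string.digits),
        "ends_with_digit": pct(lambda p: p[-1] in string.digits),
    }


if __name__ == "__main__":
    for name, value in password_stats(["password1", "123456", "Monkey!", "abcdef", ""]).items():
        print(f"{name}: {value}")
```

Note that this counts every entry, duplicates included, which matters when comparing against analyses that only look at unique passwords.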
Letter Frequency Analysis:
Note, for some reason I can't get a couple of the characters to display correctly on my Mac so I'm cutting off several of the non-ascii ones that are only used once or twice:

Overall letter frequency analysis:
aeoi1r0ln2st9mc83765u4dbpghyvfkjAzEIOxRLwSNq.MTC_DB-UP*G@H/ZYF+VJK,\$&X!Q=W?'#")(%^][}< {`>
First character, letter frequency analysis:
a1mbc2sp0lterdjfgn3hi6k759vo48yAwMzBSCuqPLExJRTFDGNV*HOZYKI\W@/-+(.$U&?Q^[,#
Last character, letter frequency analysis:
aos01326e57849nrilydzmtuAhbO.gck*SxpfE@+LvjNRw_-I?/$q!ZX)YKH"UPMDCB#GF'&%}T,]\VJ(
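Strings like the three above are easy to generate yourself. Here is a small Python sketch of one way to do it, where the function names are hypothetical and ties between equally common characters are broken alphabetically, (an arbitrary choice on my part):

```python
from collections import Counter


def frequency_order(strings) -> str:
    """Return every character seen, ordered from most to least frequent."""
    counts = Counter("".join(strings))
    # Sort by descending count; break ties by character for repeatable output.
    return "".join(sorted(counts, key=lambda c: (-counts[c], c)))


def lfa_charsets(passwords) -> dict:
    """Build overall, first character, and last character frequency orderings."""
    passwords = [p for p in passwords if p]
    return {
        "overall": frequency_order(passwords),
        "first": frequency_order(p[0] for p in passwords),
        "last": frequency_order(p[-1] for p in passwords),
    }


if __name__ == "__main__":
    print(lfa_charsets(["abc1", "abd1", "b2"]))
```

The resulting strings can be handed straight to a tool that takes a custom character set, which then tries the most common characters first.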
Short Analysis:
Overall, the passwords in this list were fairly strong considering Microsoft only had a weak password creation policy in place, (the password had to be at least six characters long). What was also surprising was the number of passwords that only contained numbers. -See, I told you it would be a short analysis. I'll try to post a more detailed analysis, (such as nationality/language breakdown, effectiveness of input dictionaries, etc), later.

Cracking Passwords with Middle Child

Yup, it's been a month since my last post. Believe it or not, I've actually been fairly busy, both in working on my probabilistic password cracker, and writing up several research papers. That doesn't even begin to get into the stacks of disclosed passwords I've managed to accumulate but I still need to do some analysis of. Of course it's fairly hard to complain about having too many passwords to crack/analyze. It's sort of like having too many girls ask you out. It's a good problem to have.

Wow, after rereading the last couple of sentences, I really need to get my priorities in order ;)

On that off-topic note though, I figured I should actually make some more of my tools available to the public rather than just boring everyone by talking about crypto. Middle Child is a tool designed to aid in targeted brute force password cracking attacks.

The short summary is that I love using John the Ripper's -incremental mode, which incorporates the most sophisticated Markov modeling I've yet to see in a password cracker. JtR uses 2nd order Markov models, (where it models the probability of letter triplets appearing), and it actually keeps separate probability information for different length words, (aka the probability matrix it uses for a 5 letter word is different from the one it uses for an 8 letter word). To better illustrate this, below are the first 10 guesses JtR will create when running a brute force attack, (using only a lower alpha key-set):
  1. bara
  2. sandy
  3. shanda
  4. sandall
  5. starless
  6. dog
  7. bony
  8. bool
  9. boon
  10. stark
As you can see, it looks a lot like a dictionary attack, but it's actually using brute force. The advantage is that if you let it run long enough it will completely cover the keyspace. For example, it will eventually try unlikely guesses such as 'zqzqqqzz' which a normal dictionary attack will never try. The problem is, while Markov models are very efficient at generating guesses that look like human generated words, they still can have trouble cracking any halfway decent password, (due to time constraints). Middle Child is an attempt to apply additional logic to JtR's -incremental option to more efficiently attack stronger passwords.
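To give a feel for what a 2nd order Markov model does, here is a toy Python sketch. It is emphatically not JtR's implementation: it ignores word length and position, uses add-one smoothing that I chose for simplicity, and all the names are made up. It only shows the core idea of scoring a guess by how likely each letter is given the two letters before it.

```python
import math
from collections import Counter, defaultdict

START = "^"  # padding symbol, assumed not to appear in the training words


def train(words):
    """Count how often each character follows each two character context."""
    model = defaultdict(Counter)
    for word in words:
        padded = START * 2 + word
        for i in range(2, len(padded)):
            model[padded[i - 2:i]][padded[i]] += 1
    return model


def log_probability(model, word: str, alphabet_size: int = 26) -> float:
    """Log-probability of `word`, with add-one smoothing for unseen triplets."""
    padded = START * 2 + word
    total = 0.0
    for i in range(2, len(padded)):
        context = model.get(padded[i - 2:i], Counter())
        seen = context[padded[i]]
        total += math.log((seen + 1) / (sum(context.values()) + alphabet_size))
    return total


if __name__ == "__main__":
    model = train(["sandy", "sandal", "stark", "star"])
    for candidate in ("sand", "zqzq"):
        print(candidate, log_probability(model, candidate))
```

Trying guesses in descending order of a score like this is what makes the early guesses look like dictionary words, while still eventually reaching unlikely strings.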

-Ed note: Yes that is the "short" summary...

So what is the additional logic that Middle Child uses? Mostly just basic modeling of how people create passwords. For example, people often put numbers at the end of a password, and if they bother to hit the shift key, capitalize the first letter. Another optimization is if users incorporate special characters and numbers into their passwords, they generally use a special character followed by a number and not the other way around. Really all Middle Child does is apply word mangling rules to brute force guesses. By applying these rules to the password guess as a whole, it allows an attacker to target what are traditionally considered stronger passwords using a brute force attack. This is important since most input dictionaries will generally only crack around 30-50% of an average password set even in a best case scenario. That number isn't made up, (like 74% of all statistics...). If you are interested, check out my Defcon16 slides for a further explanation of how I came up with that 30-50% range. Note, I haven't tried to run the numbers again with Sebastien Raveau's humongous wikipedia wordlist yet since it was so large it broke my parsing tool.
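As a rough illustration of applying word mangling rules to a brute force guess, here is a Python sketch. These are not Middle Child's actual rules or code. The special character list, the number of appended digits, and the function name are all assumptions I made to show the three habits described above, (digits at the end, capitalized first letter, special character before the number).

```python
from itertools import product

DIGITS = "0123456789"
SPECIALS = "!@#$"  # hypothetical short list of special characters


def mangle(guess: str, digit_count: int = 2):
    """Yield variants of a brute force guess using common password habits."""
    # Try the guess as-is and with its first letter capitalized;
    # dict.fromkeys drops the duplicate when capitalizing changes nothing.
    bases = dict.fromkeys([guess, guess.capitalize()])
    suffixes = ["".join(d) for d in product(DIGITS, repeat=digit_count)]
    for base in bases:
        for suffix in suffixes:
            yield base + suffix  # digits appended to the end
            for special in SPECIALS:
                yield base + special + suffix  # special character, then digits


if __name__ == "__main__":
    variants = list(mangle("bony", digit_count=1))
    print(len(variants), variants[:6])
```

Each base guess fans out into a fixed number of variants, so the cost of the attack is the brute force keyspace multiplied by the size of the rule set. That is why the rules need to be picked carefully.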

Ok, back on track: Now admittedly when you start to use multiple input dictionaries you can achieve better results, (especially input dictionaries created from previously cracked passwords), but after a certain point that starts to resemble brute force as well. I guess what I'm trying to say is that brute force attacks still play a very important role in password cracking. It just means you have to be smart about how you go about using brute force style attacks.

Now, I've actually gotten into arguments over whether this can still be called a brute force attack, (since I'm using Markov models and word mangling rules). Can you say, "Nerd fight"? My response is anything that doesn't use an input dictionary technically is a brute force attack. Personally I like the term "Targeted brute force attacks", though I've seen people call it "Indexed attacks" as well. In the end, it is what it is, and I find it useful so I don't care if you call it a brute force attack or not. I've posted before about how troublesome it is that the password cracking field doesn't have standard definitions...

You can get the tool, (along with some more info and a short user's guide), here.