Saturday, January 30, 2010

Out of Context Graph Challenge #1

I'm struggling with the best way to graph some new data I just analyzed based on the RockYou list. Since I'm also too lazy to write up a full post on it right now, I thought I might as well throw it out as a challenge to the five or so people who read this blog. I'll buy a beer for the first person who can correctly state what the following graph shows.


The beer is redeemable at any conference I happen to meet you at, (For example: Shmoocon). Here are a few hints:
  1. It is based on a subset of one million passwords from the RockYou set
  2. It has to deal with a project I am working on
  3. There is one word you MUST include in your submission for it to be valid
Answers will only be accepted in the comments. This contest will run until someone gets it right or I actually get around to writing a post on this. Imaginary bonus points are applied if you have any suggestions on a better way to graph the data.

Tuesday, January 26, 2010

More Analysis of the Rockyou Password List - Strong Passwords

So it's been an interesting last couple of days. First of all, it's a bit amazing how popular the Rockyou list has become after it was mentioned in the New York Times article. While I'm not going to provide a link, let's be honest, if you can't find it, you are not looking. The thing that keeps going through my head is that we may have just narrowly missed having a black swan event, (Ok, Mr. O'Conner just posted about those in his blog so the term is stuck in my head, even though I'm using it wrong.) Can you imagine what would have happened if the public RockYou list had contained e-mail addresses? While lists of this size have been distributed before, (and black hats have been able to obtain the whole list + email addresses), I really don't know what a public disclosure of a list this size containing e-mail addresses and passwords would be like. It probably won't lead to an internet apocalypse, but we would have people looking up friends/co-workers. The 4chan crowd, and associated griefers would of course get involved. How would sites deal with locking accounts/resetting passwords? When the Hotmail + various other passwords were leaked back in October, e-bay/webmail providers/etc had a hard time dealing with even 30k leaked passwords. Will secret questions be enough to re-verify people? What if the secret question answers from the hacked site are disclosed as well?

As I was joking with someone else, "Does the security of the Internet really depend on Facebook not getting completely 0wned?"

Ok, the above is overly pessimistic - but an event like this is going to happen, and I feel we as a security community need to start planning how we're going to respond instead of just making stuff up as we go along. I certainly hope Microsoft/Google/Banking websites/etc, have a coherent plan on what to do anyway.

So enough doom and gloom, on with the analysis. I was shooting e-mails back and forth today with Per Thorsheim about the Rockyou dataset and we both pretty much agreed that as great as 32 million passwords sounds, it's very hard to draw conclusions from the set since we know so little about it. In fact, he has a good writeup of the problem plus analysis of Imperva's white-paper which he posted here.

Basically the list gives us 32 million values. Are these values real passwords? Well there's numerous URL's included in the list, (several that are over 100 characters long and contain search strings for Rockyou applications). In fact, I just did a search using awk and came up with 684 passwords that were longer than 100 characters. Should we count those as legitimate passwords? How about 484 passwords that were only 1 character long? How many of the passwords are from the same person? What password policies were in place at the different sites? Why am I asking so many questions in this post?

So yah, I'm struggling to make sense of it all. Now on to some answers instead of more questions. Per wrote:

Regarding your own analysis of the RockYou password list as well as the analysis done by others, what strikes me is the "negativity" of the results. I had a long chat with a colleague/friend of mine who is also assisting me in my various analysis, and we agreed that we wanted to know a little more about the positive parts of the RockYou list..

He then went on to ask some specific questions. Once again I have to agree with him. The interesting part of this list isn't that thousands of people used '123456' as their password. We already knew that. The tougher passwords, now that's interesting.

Q) What's the longest password found? (# characters)

A) As I mentioned, that's hard to say since there's a lot of really long values in the list that probably aren't passwords as we consider them in the traditional sense. Excluding passwords with non-ascii characters, (they gave awk a bit of a problem since it counted them as two or more characters), I found 27,337 passwords that were longer than 21 characters long. Glancing through the results, most of them appeared real. I'll get more into their composition in the next question.

Q) What's the most complex password found? (all character groups, randomness, length etc)

A vast majority of what would be considered complex passwords turn out to be e-mail addresses. Some of them are even mangled, such as alice@example.com123. In other news, people still apparently hate using spaces in passphrases as well, (or more likely they don't realize they can use spaces). That all being said, very few of the passwords would meet a corporate password requirement, aka >8 characters, containing an uppercase/lowercase/special/digit. That's to be expected since I doubt any of the sites that rockyou collected the passwords for enforced such a requirement.

Q) Percentage of passwords longer than 8?

A hair over 30% of the passwords were longer than eight characters long. This is actually worse than the hotmail dataset where close to 40% of the passwords were longer than eight characters. That can probably be explained by the high number of rockyou only accounts in the list, (heck, 'rockyou' was the #8th ranked password). I don't know about you, but I certainly wouldn't use my A-game password there.

There still are a couple other questions I haven't had a chance to answer, but they will have to wait for another blog post.


Sunday, January 24, 2010

New York Times Article

When I was interviewed a week and a half ago by a reporter from the New York Times about the Rockyou hack, I honestly never expected for it to end up on the front page of the newspaper, but there you go. As a friend mentioned though, it doesn't really count since it's below the fold ;)

While I'm ecstatic about being quoted, there are a few things I wish could have been changed. Just saying that makes me feel like the guy who won the million dollar lottery, but is annoyed that he didn't get the 10 million jackpot. That being said, I have a blog, and this is the internet so I might as well complain away ;) First of all, I feel the need to explain my quote. Here is an excerpt from my conversation with the reporter:

Matt's Brain: "Don't say anything stupid. Don't say anything stupid. Don't say anything stupid..."
Reporter: "I take it this is the largest password list ever stolen right?"
Me: "Well, it's the largest one ever publicly disclosed. There's been a lot of larger or similar sized databreaches in the past."
Reporter: "But this is like the mother load for you guys?"
Me: "Yes this was the mother load.'
Matt's Brain: "Doh!!! I hope he doesn't use that..."

I'm sure the reason I wasn't quoted further is because A) I'm just a college student, B) I tend to be a bit overly verbose as you may have noticed... and C) What I was saying didn't fit into the narrative he was trying to tell.

I completely agree with the reporter on part A, and B. If you have a choice of quoting the CTO of a security company or a college student, you go with the CTO. My suspicions about Part C requires further explanation.

Throughout the entire interview the reporter tried multiple times to get me to criticize users for picking weak passwords. I refused to do that because quite honestly I don't blame the users. This may be a little controversial, but for a majority of users and a majority of accounts, password strength does not matter. Let's be honest: while everyone loves to talk about online attacks, (where the attacker is trying to guess the user's password to gain access to their account), online attacks generally are too expensive, (from a time/resources perspective), to perform on all but the most valuable accounts. Online attacks are something you have to worry about if you are well known, it is a corporate account, or you hosting a publicly available server, (for the last example, one of my old co-workers got 0wned by it), but the average user doesn't have to worry about that. Phishing attacks and malware/keystroke loggers are much more prevalent threats, and the reason they are is because they can be highly automated by the attacker, (aka cheap), and they don't depend on the user picking a weak password. Aka, I could have a 50 character passphrase, but if my computer is infected by the zeusbot and is recording all my keystrokes, it doesn't matter.

Likewise, this Rockyou list shows another reason why we shouldn't blame the users. The list was all in plaintext. I don't have to crack anything. This gets to the heart of my argument. I'm not saying everyone should start using '123456' as their password. What I am saying is that the security of the system is much more dependent on the user recognizing other threats, (such as that fake security e-mail asking them to re-verify all their info), and for sites to practice better security, (hashing the lists with a salt, having a password blacklist, limiting online guesses, etc). We tend as a security community to get caught up on the fact that users pick bad passwords. What we need to do as a community is to move on and say what are we going to do about that.

At the end of the interview the reporter asked if I had anything else to say, which I responded, "If there is anything people should know, it is the importance to have at least two passwords. You can't expect people to remember a different password for every site, but this Rockyou hack demonstrates, you should use a different password for your bank/webmail vs. the password you use on facebook and other social networking sites." I'm happy the reporter phrased this much better and included it in the article. That's also the reason why only the passwords, (and not the e-mail addresses), were publicly disclosed. The passwords by themselves are not that valuable to an attacker. The passwords + e-mail address are since they allow an attacker to quickly compromise many more accounts. I'm not just making this up. In the original forum postings the attacker essentially said, "These passwords are free, but if you want the e-mail address you'll have to pay me a lot of money."

Considering my writeup is longer than the entire NYT's article, once again I'm not surprised I wasn't quoted more ;) If you disagree with me please let me let me know since as my last post pointed out, I certainly can be wrong.

One last thing. I've been getting a whole lot of e-mails from people asking me to send them a copy of the Rockyou list. Please don't take this personally, but I can't determine from an e-mail if you are a legitimate white-hat security researcher or a script-kiddie, (ok some are fairly easy to pick out as script-kiddies, but identifying the good guys is much harder to do). The answer is almost certainly going to be no.

Analysis of 10k Hotmail Passwords Part 6: Markov Model Showdown 2 - The Rematch

I know, soon my titles will get so long I won't be able to fit them into a Twitter post. It was all I could do to leave off a tagline such as "Revenge of the Incremental Mode."

I just received an e-mail from SolarDesigner, the creator of John the Ripper, who promptly set me straight on a few points about how the incremental attack works. I'm going to break down the e-mail into two parts. The first part is as follows:

Matt,
In your very interesting blog post at:
http://reusablesec.blogspot.com/2009/11/analysis-of-10k-hotmail-passwords-part.html

you made some incorrect statements/guesses about the incremental mode:

"Unfortunatly it doesn't take into account the previous trigraphs that appeared before it,"

Not true.

"(except when calculating the overall probability)."

No idea what you mean here.

"For example, if the first trigraph is "auq", the next trigraph's probability isn't increased if it starts with a 'u',"

Actually, if the first trigraph is "auq", then as far as inc.c is concerned the next trigraph starts with "uq", and there's one letter to be determined.

"(even though 'u' almost always follows 'q')."

If "u" commonly follows "uq" at that character position in passwords of that length in the training set, then it will be tried by JtR early on.

"Therefore, the incremental attack doesn't measure the conditional probability between the third and fourth letter, and the sixth and seventh letter of a password guess."

That's not true. It treats trigraphs at all positions in the same way. It also has special handling for the first and the second character position, before the first trigraph can be formed, but then it just proceeds with overlapping trigraphs till the full length is reached.

"The biggest advantage of using the incremental attack is that it creates highly optimized guesses while still retaining the ability to eventually cover the entire key-space if you give it enough time."

Right, and that's probably the cause of the difference you're seeing as well. This ability is achieved by having the same number of character indices possible for each character position. This works great early on, but it might be suboptimal after a while - where it might make sense to try, say, 50 different character indices for a certain position, but only 10 for another one. This is cured to a limited extent with the "cracking order" table (which includes not only the length and the character count, but also the "fixed character index" position number), but perhaps further improvement in this area is desired.

I think you could hack inc.c such that you'd achieve results similar to those of the Markov mode. Specifically, drop expand() and adjust inc_key_loop() to remove the assumption that characters exist for all indices. It'd be tricky to skip those incomplete-info indices quickly, though - but in terms of the first 1G guesses produced (if you don't care how much time it'd take to produce those), I think you'd achieve results very similar to Markov's.

I'm very happy I was corrected about this, and I have modified my previous post to be less wrong. The original copy of my post can still be found in the comments since I do occasionally try to practice some journalistic ethics. Also, if I ever say anything else that doesn't match up with reality, please let me know. I'd rather be shown to be wrong than to remain ignorant. That's why I have a blog with comments and an e-mail address on the side ;)

Now onto the second portion of SolarDesigner's e-mail:
Another related thought I had:
When you compare Incremental vs. Markov (or the like), you could want to run through --single and at least a small wordlist such as password.lst with rules first. That would reflect the real-world case those modes need to be optimized for. It's certainly what I was optimizing for. I had slightly different revisions of the incremental mode early on that would happen to crack more passwords sooner when run with empty john.pot, but would work worse after single and wordlist w/rules; I rejected those revisions (never released).

I don't expect this change in your testing to affect your results in a major way (I pointed out another likely cause for the difference you've observed above), yet this is something for you to try next time (for both modes, indeed).

Similarly, when you compare different wordlists and different wordlist rulesets, it makes sense to do so after having run --single first (keep and reuse the same john.pot with after-single results). Again, this is what I had in mind when experimenting with the default wordlist ruleset (over 10 years ago).
That's a good point, which of course leads us to more tests:

Test 9: Testing JtR's Markov and Incremental Modes After a Simple Dictionary Attack

Unfortunately I'm not going to run --single mode first for this test. As a quick explanation, single mode uses the username/e-mail addresses/related information as a very small targeted input dictionary and then runs a large number of advanced word mangling rules against the password set. By default, this is the first attack that John the Ripper runs during a password cracking session. The reason I'm not going to run it is that I don't store the user information with the password sets I analyze. This isn't because of some secret l33t password cracking technique, but instead due to the fact that I'm trying to remain ethical and respect the privacy of the people in these lists. While I will look at username/password info initially when I get a new list so I can get a better handle on how the list is structured, I separate the info as soon as possible. That's because I don't want to know that bob@yahoo.com's password is 'Ihatemyboss'. In short, I'd rather penalize my cracking techniques than deal with those issues. I might grab the original list and run a -single attack in the future since as SolarDesigner points out, that is very useful information, but right now it's just too much of a hassle, which is the way I meant it to be.

So that disclaimer aside, the next question is how I should run my dictionary attack. To keep things simple, for this first test I used the default John the Ripper ruleset and the default passwords.lst wordlist. With that dictionary/rule combo, I ended up cracking 512 passwords from the Hotmail dataset using just 139k guesses. What can I say? It's a small input dictionary and JtR's default rules are highly targeted, (aka they are designed to finish quickly). Also, most of the Hotmail passwords are Spanish/Portuguese, and passwords.lst is and English dictionary. I'll get more into that once I finally write that post I've been promising you about dictionary attacks on this list. Just for comparison, below is a graph representing the passwords cracked using a dictionary attack, vs an incremental attack and a Markov attack as detailed in Test 8 from my previous post. Aka, both of the brute force attacks were trained on the Phpbb password set, and the Markov attack used a limit of 215 with a maximum of 8 characters.

Results Part 1:


Analysis Part 1: It shouldn't come as a surprise that the dictionary attack performed better. As I stated previously there were a lot of factors that kept it from performing as well as it would against an English based password list, (the main one being it helps to use a dictionary targeted against the language of the set. It's a complicated concept I know). That being said, notice the sharp drop off in passwords cracked. That shows the dictionary attack was able to crack a lot of passwords using no mangling rules, or just adding '1' to the end of the guess. That's partially an artifact of the dictionary, as it has many common password pre-mangled in it, such as "abc123", and the fact that many people didn't use any mangling rules to create their passwords. That's actually one of the nice things about the dictionary passwords.lst since having pre-mangled passwords in the dictionary helps the attacker crack the easiest passwords quicker. Oh, and for the Markov mode, no I didn't forget to add it to the graph. It only cracked 4 passwords in the first 140k guesses since I didn't re-target it for the smaller cracking session. As I said before, depending on how you set the limit size it takes a while to catch up to other cracking methods.

Results Part 2: Ok, so after running the dictionary attack, I then re-ran the Markov and Incremental attacks as described previously on the uncracked passwords. Here are the results - Note, I attributed the 512 passwords previously cracked to both of them:


Analysis Part 2: Well, the results are pretty much the same. This is to be expected since the dictionary attack only cracked 512 passwords, (aka the largest possible improvement would occur if all of those passwords were guesses that Markov mode tried, but Incremental mode didn't. Even that would only help Incremental mode catch up by 512 passwords). In fact, most of the passwords that were cracked in the dictionary attack would also have been cracked in the Markov/Incremental attacks given 1 billion guesses. By this I mean the Dictionary attack only cracked 28 passwords the Markov attack didn't, and 4 passwords the Incremental mode didn't. Please note, this isn't a knock against dictionary based attacks. What a dictionary attack lets you do is crack those passwords much, much sooner, which is the reason why I included the previous graph in part 1. That leads us to the next test...

Test 10: Testing JtR's Markov and Incremental Modes After a Longer Dictionary Attack

Everything pretty much stays the same as the previous test, except that I ran the hotmail list through multiple dictionaries first using John the Ripper's default rule sets. The dictionaries include the previously mentioned passwords.lst, along with dic-0294, the Spanish wordlist off of John the Ripper's ftp site, and a Spanish and Portuguese wordlist gathered off of wiktionary, (this isn't Sebastian Raveou's one though I'll get to a specialized dictionary he provided me in a later post).

Results: I'm not going to include a graph of the dictionary attack since I was lazy and didn't bother to merge the dictionaries and remove duplicate words, (I just ran each one through independently). All together they cracked 2,104 passwords from the Hotmail set. Once that was done I re-ran the Markov and Incremental mode attacks against the list. Here are the results:

Analysis: Once again, the results look pretty much the same, though this time the dictionary attack cracked many more passwords that were not cracked during the brute force attacks. In fact, this run raised the success rate of the dictionary attack + Markov mode to slightly more than 50% of the total set cracked.

Coming Up: I know interest is mostly focused on the Rockyou list right now, but I do want to continue with the Hotmail list simply because combined with all the previous analysis, I think we're getting a better view of the effectiveness of different password cracking techniques. Aka, when I finally get around to doing a write-up on dictionary based attacks you will be able to compare it to all the previous brute force methods I've looked at. Don't worry, I'll still cover the Rockyou list. It's just that I might be switching between the two lists for the next couple of posts.

Friday, January 22, 2010

From the "That's Just Not Cool" Department

So it looks like a spammer managed to modify the Hackers for Charity webpage so they could put all their fake drug medication links into it. Hint, view source-code and then scroll down to the very bottom.

This does lead to some interesting observations though:
  1. The person who did this has no idea about the webpage they were hacking. If it was a targeted hit, (think ZF0), they probably would have done some visible defacing. If it is someone just looking to make money, there's no way they would knowingly tangle with all the heat that is probably going to be coming their way soon.
  2. Web page security is really hard. Over the last 6 months we've seen a large number of people in the security field have their webpages get hacked. Heck, even the NSA's main webpage was defaced.
  3. What does this say about the white-hat security community? As a member of that community this drives home the point that humility is important in this line of work.
I expect that the Hackers for Charity webpage will be fixed soon so if anyone is interested in doing some additional analysis, here are two of the spammer links, (they all pretty much are the same). I also have the entire webpage source-code available on request. Note, I changed the http to hxxp, and the www to aaa to avoid further helping the links advance their Google ranking.
  • hxxp://aaa.oaregion3.org/events/old_files/_vti_cnf/general/buy-acomplia-online-no-prescription.html
  • -- Buy Acomplia Online no Prescription
  • hxxp://aaa.oaregion3.org/events/old_files/_vti_cnf/general/take-acomplia-cheap.html
  • -- Take Acomplia Cheap
Now back to writing the post I was planning to put up here...

Sunday, January 17, 2010

State of The Blog: 2010

I know most people normally do this at the beginning of January, but like so much else I'm running behind ;) What I really wanted to do was give you readers, (all 10 or so of you now), an update on where I'm at, my goals for the following 6 months, and why my update schedule might be kind of wacky.

Goal 1: Graduate
Yes, I have been researching password cracking since August 2007, and as much as I love college, it's time to move on to the real world. Or I hope it's time, as I still have that pesky dissertation to write. Getting that done is my #1 priority right now, and to be honest I don't know how that will impact this blog. The dissertation itself will mostly cover my work developing a probabilistic password cracker, though I'll also be covering some of my other tools such as my dictionary based rainbow tables. I'm a little ashamed I haven't posted more about my probabilistic password cracker here since I've become a true believer in it. It's just that I've kind of enjoyed using this blog as an excuse to research other things, (there's only so much time I can spend talking about parsing grammars). With every addition I make to it, it gets meaner and nastier, and quite honestly I need to take the time to clean up the latest version and post it online. Here is a quick rundown of using a probabilistic approach:
  • This is a way to create password guesses. Aka, it works against salted hashes and full hard-drive encryption. It's not a rainbow table, and instead is designed to work with traditional password cracking tools.
  • Instead of using standard word mangling rules, it assigns probabilities to the various ways people create passwords. These range from dictionary words, ('password' is more common than 'zebra'), word mangling rules, (people tend to uppercase the first letter), specific replacements/additions, (aka dates are very common), etc. It even takes into account the password length, (aka people generally make passwords between 6 and 8 characters long).
  • It then uses these probabilities to create very fine grained word mangling rules on the fly based on probability order. By fine grained, I mean it will literally create millions of rules if you give it the chance.
  • The algorithm it uses is very fast, and extremely parallelizable, which is necessary for a password cracker.
  • Because of this, it becomes conceptually easy to create a customized attack profile for an individual target, (or group). You can create a custom dictionary for them based on kids names, important dates, files found on their computer, and then just give those values a higher probability. If there's a password creation policy for the site? No problem, just exclude rules that don't meet it, and/or only train the password cracker on similar passwords that meet those rules. The main problem going forward with all of this has been developing the GUI believe it or not.
  • I'm currently adding support for brute force as well, (hence my focus on brute force techniques over the last couple of posts). This way it will automatically switch between limited brute force and dictionary based attacks depending on the current probability of the guess it is working on.
I've been running it in head to head comparisons against existing password crackers, and it has been beating the socks off of them. In short, it generally will crack more passwords with less guesses than existing methods.

Goal 2: Get Some Work Done with Pass-Phrases
This actually was the original goal of my research. The main problem up till now is I haven't had many examples of pass-phrases, (beyond the 'AliceLovesJoe' variety). With the RockYou list though, that has changed, and I look forward to adding support for pass-phrases to my probabilistic cracker. There's a couple of attack methods I plan to test, ranging from a mad-libs approach to using pass-phrase input dictionaries.

Goal 3: Finish Up Some of These Dangling Posts
Going through my archives, there's a horrible amount of posts where I started talking about something, such as the ZFO dataset gathered from people in the security community, and never got around to posting any follow-ups. I realize that's something I need to change.

Goal 4: Work on a GPU Assisted Password Cracker
Most of the work is being done by another graduate student, which means this actually might get done and be relatively bug free. I'm very excited about this, as it will help move a lot of my research from the theoretical to actual implementation where it will help law enforcement officials. We're using OpenCL right now, (I know CUDA currently is faster, but OpenCL is much more portable), and we're initially targeting TrueCrypt, (though hopefully once we get the basic framework finished we can add additional encryption types).

Goal 5: No More Public Speaking for the Next Six Months
I really need to focus on actually doing stuff vs. talking about it. I'm also cutting down on the conferences I'm attending, though I'm making an exception for Shmoocon since I love that one. BTW, anyone interested in doing the Ghost in the Shellcode competition? My hacking skills are really lame, but I still love hacking competitions.

Goal 6: Finish up my Paper on NIST's use of Password Entropy
An academic publication on the investigation of the effectiveness of using entropy to justify password creation policies. While I have a lot of preliminary data already, I don't want to spoil my objectivity by posting my initial conclusions quite yet. The hardest part is the fact that password entropy is used as an average value across the entire password set, so finding the best way to fairly map the effectiveness of a real attack against it is tricky. Aka if following NIST's calculations it would take an attacker two years to crack a password via an online attack, but I find that an attacker could crack 5% of the passwords in an hour, what does that mean exactly?

Conclusion:
It should be an interesting year ;)

Sunday, January 10, 2010

A Quick Preview

There's been a bunch of requests about this, so I thought I'd post a couple of screenshots of something I've been working on over the holiday break.















JtR Config Generator