- It is based on a subset of one million passwords from the RockYou set
- It has to deal with a project I am working on
- There is one word you MUST include in your submission for it to be valid
Saturday, January 30, 2010
Tuesday, January 26, 2010
Regarding your own analysis of the RockYou password list as well as the analysis done by others, what strikes me is the "negativity" of the results. I had a long chat with a colleague/friend of mine who is also assisting me in my various analysis, and we agreed that we wanted to know a little more about the positive parts of the RockYou list..
He then went on to ask some specific questions. Once again I have to agree with him. The interesting part of this list isn't that thousands of people used '123456' as their password. We already knew that. The tougher passwords, now that's interesting.
Q) What's the longest password found? (# characters)
A) As I mentioned, that's hard to say since there's a lot of really long values in the list that probably aren't passwords as we consider them in the traditional sense. Excluding passwords with non-ascii characters, (they gave awk a bit of a problem since it counted them as two or more characters), I found 27,337 passwords that were longer than 21 characters long. Glancing through the results, most of them appeared real. I'll get more into their composition in the next question.Q) What's the most complex password found? (all character groups, randomness, length etc)
A vast majority of what would be considered complex passwords turn out to be e-mail addresses. Some of them are even mangled, such as email@example.com. In other news, people still apparently hate using spaces in passphrases as well, (or more likely they don't realize they can use spaces). That all being said, very few of the passwords would meet a corporate password requirement, aka >8 characters, containing an uppercase/lowercase/special/digit. That's to be expected since I doubt any of the sites that rockyou collected the passwords for enforced such a requirement.
Q) Percentage of passwords longer than 8?
A hair over 30% of the passwords were longer than eight characters long. This is actually worse than the hotmail dataset where close to 40% of the passwords were longer than eight characters. That can probably be explained by the high number of rockyou only accounts in the list, (heck, 'rockyou' was the #8th ranked password). I don't know about you, but I certainly wouldn't use my A-game password there.
There still are a couple other questions I haven't had a chance to answer, but they will have to wait for another blog post.
Sunday, January 24, 2010
Matt,In your very interesting blog post at:http://reusablesec.blogspot.com/2009/11/analysis-of-10k-hotmail-passwords-part.htmlyou made some incorrect statements/guesses about the incremental mode:"Unfortunatly it doesn't take into account the previous trigraphs that appeared before it,"Not true."(except when calculating the overall probability)."No idea what you mean here."For example, if the first trigraph is "auq", the next trigraph's probability isn't increased if it starts with a 'u',"Actually, if the first trigraph is "auq", then as far as inc.c is concerned the next trigraph starts with "uq", and there's one letter to be determined."(even though 'u' almost always follows 'q')."If "u" commonly follows "uq" at that character position in passwords of that length in the training set, then it will be tried by JtR early on."Therefore, the incremental attack doesn't measure the conditional probability between the third and fourth letter, and the sixth and seventh letter of a password guess."That's not true. It treats trigraphs at all positions in the same way. It also has special handling for the first and the second character position, before the first trigraph can be formed, but then it just proceeds with overlapping trigraphs till the full length is reached."The biggest advantage of using the incremental attack is that it creates highly optimized guesses while still retaining the ability to eventually cover the entire key-space if you give it enough time."Right, and that's probably the cause of the difference you're seeing as well. This ability is achieved by having the same number of character indices possible for each character position. This works great early on, but it might be suboptimal after a while - where it might make sense to try, say, 50 different character indices for a certain position, but only 10 for another one. This is cured to a limited extent with the "cracking order" table (which includes not only the length and the character count, but also the "fixed character index" position number), but perhaps further improvement in this area is desired.I think you could hack inc.c such that you'd achieve results similar to those of the Markov mode. Specifically, drop expand() and adjust inc_key_loop() to remove the assumption that characters exist for all indices. It'd be tricky to skip those incomplete-info indices quickly, though - but in terms of the first 1G guesses produced (if you don't care how much time it'd take to produce those), I think you'd achieve results very similar to Markov's.
Another related thought I had:When you compare Incremental vs. Markov (or the like), you could want to run through --single and at least a small wordlist such as password.lst with rules first. That would reflect the real-world case those modes need to be optimized for. It's certainly what I was optimizing for. I had slightly different revisions of the incremental mode early on that would happen to crack more passwords sooner when run with empty john.pot, but would work worse after single and wordlist w/rules; I rejected those revisions (never released).I don't expect this change in your testing to affect your results in a major way (I pointed out another likely cause for the difference you've observed above), yet this is something for you to try next time (for both modes, indeed).Similarly, when you compare different wordlists and different wordlist rulesets, it makes sense to do so after having run --single first (keep and reuse the same john.pot with after-single results). Again, this is what I had in mind when experimenting with the default wordlist ruleset (over 10 years ago).
Friday, January 22, 2010
- The person who did this has no idea about the webpage they were hacking. If it was a targeted hit, (think ZF0), they probably would have done some visible defacing. If it is someone just looking to make money, there's no way they would knowingly tangle with all the heat that is probably going to be coming their way soon.
- Web page security is really hard. Over the last 6 months we've seen a large number of people in the security field have their webpages get hacked. Heck, even the NSA's main webpage was defaced.
- What does this say about the white-hat security community? As a member of that community this drives home the point that humility is important in this line of work.
- -- Buy Acomplia Online no Prescription
- -- Take Acomplia Cheap
Sunday, January 17, 2010
- This is a way to create password guesses. Aka, it works against salted hashes and full hard-drive encryption. It's not a rainbow table, and instead is designed to work with traditional password cracking tools.
- Instead of using standard word mangling rules, it assigns probabilities to the various ways people create passwords. These range from dictionary words, ('password' is more common than 'zebra'), word mangling rules, (people tend to uppercase the first letter), specific replacements/additions, (aka dates are very common), etc. It even takes into account the password length, (aka people generally make passwords between 6 and 8 characters long).
- It then uses these probabilities to create very fine grained word mangling rules on the fly based on probability order. By fine grained, I mean it will literally create millions of rules if you give it the chance.
- The algorithm it uses is very fast, and extremely parallelizable, which is necessary for a password cracker.
- Because of this, it becomes conceptually easy to create a customized attack profile for an individual target, (or group). You can create a custom dictionary for them based on kids names, important dates, files found on their computer, and then just give those values a higher probability. If there's a password creation policy for the site? No problem, just exclude rules that don't meet it, and/or only train the password cracker on similar passwords that meet those rules. The main problem going forward with all of this has been developing the GUI believe it or not.
- I'm currently adding support for brute force as well, (hence my focus on brute force techniques over the last couple of posts). This way it will automatically switch between limited brute force and dictionary based attacks depending on the current probability of the guess it is working on.