Tuesday, December 29, 2009

The RockYou 32 Million Password List Top 100

But first, a quick response to one of the previous comments (since it really did merit a front-page post).

Tfcx posted:
The initial vulnerability was posted 29th November on a hacking forum called darkc0de here: http://forum.darkc0de.com/index.php?action=vthread&forum=11&topic=13082
Thanks, as that really helps narrow down the timeframe (and reading that post and related posts was interesting, if a bit depressing). The hack itself appears pretty straightforward once you see it (like most things, it's easy once the solution is presented to you, but finding it in the first place is hard). I'm still interested in the hacker Igigi and have been tossing about all sorts of theories, but I'll refrain from posting them here since they are all pure WAGs right now.

Now on to the main topic: Per Thorsheim wrote:
I would like to see a comparison of Twitters 370 banned passwords against the top 370 or so passwords stolen from rockyou (http://www.techcrunch.com/2009/12/27/twitter-banned-passwords/)
Happy to oblige! To start things off, here are the top 100 most frequently used passwords from the RockYou list. I then bolded any of the passwords that did not appear in Twitter's blacklist. As a side note: yes, I do realize I need to modify my website code to support expandable post summaries. Wow, do I ever miss LiveJournal...

Rank | Num of Occurrences | Password
--------------------------------------------------------------
1 290729 123456
2 79076 12345
3 76789 123456789
4 59462 password
5 49952 iloveyou
6 33291 princess
7 21725 1234567
8 20901 rockyou
9 20553 12345678
10 16648 abc123
11 16227 nicole
12 15308 daniel
13 15163 babygirl
14 14726 monkey
15 14331 lovely
16 14103 jessica
17 13984 654321
18 13981 michael
19 13488 ashley
20 13456 qwerty
21 13272 111111
22 13134 iloveu
23 13028 000000
24 12714 michelle
25 11761 tigger
26 11489 sunshine
27 11289 chocolate
28 11112 password1
29 10836 soccer
30 10755 anthony
31 10731 friends
32 10560 butterfly
33 10547 purple
34 10508 angel
35 10167 jordan
36 9764 liverpool
37 9708 justin
38 9704 loveme
39 9610 fuckyou
40 9516 123123
41 9462 football
42 9310 secret
43 9153 andrea
44 9053 carlos
45 8976 jennifer
46 8960 joshua
47 8756 bubbles
48 8676 1234567890
49 8667 superman
50 8631 hannah
51 8537 amanda
52 8499 loveyou
53 8462 pretty
54 8404 basketball
55 8360 andrew
56 8310 angels
57 8285 tweety
58 8269 flower
59 8025 playboy
60 7901 hello
61 7866 elizabeth
62 7792 hottie
63 7766 tinkerbell
64 7735 charlie
65 7717 samantha
66 7654 barbie
67 7645 chelsea
68 7564 lovers
69 7536 teamo
70 7518 jasmine
71 7500 brandon
72 7419 666666
73 7333 shadow
74 7301 melissa
75 7241 eminem
76 7222 matthew
77 7206 robert
78 7148 danielle
79 7116 forever
80 6979 family
81 6775 jonathan
82 6658 987654321
83 6653 computer
84 6647 whatever
85 6598 dragon
86 6570 vanessa
87 6554 cookie
88 6547 naruto
89 6501 summer
90 6420 sweety
91 6390 spongebob
92 6320 joseph
93 6272 junior
94 6215 softball
95 6131 taylor
96 6111 yellow
97 6080 daniela
98 6079 lauren
99 6068 mickey
100 6027 princesa


Analysis: I'm hesitant to post the top 370 passwords due to privacy concerns (also, 100 is such a nice round number), but I figure this should give a good overview of the coverage of the 370 passwords that are blacklisted by Twitter. A grand total of 38 of the top 100 passwords did not appear in the Twitter blacklist. That actually really surprised me, as I expected the Twitter blacklist to perform better. So I guess what I'm trying to say is: good question ;)
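For anyone who wants to repeat the comparison, here is a minimal Python sketch of the check. It assumes rows in the same "rank count password" layout as the table above, and it uses a made-up two-entry blacklist purely as a placeholder (it is not Twitter's actual list). The function names are my own.

```python
def parse_rows(text):
    """Parse 'rank count password' lines into (password, count) pairs."""
    rows = []
    for line in text.splitlines():
        # maxsplit=2 keeps the password intact even if it contains spaces.
        parts = line.split(maxsplit=2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
            rows.append((parts[2], int(parts[1])))
    return rows


def missing_from_blacklist(rows, blacklist):
    """Return the passwords, in rank order, that the blacklist does not cover."""
    return [password for password, _count in rows if password not in blacklist]


# First five rows of the table above.
table = """\
1 290729 123456
2 79076 12345
3 76789 123456789
4 59462 password
5 49952 iloveyou
"""

# Placeholder blacklist for illustration only; NOT Twitter's real list.
placeholder_blacklist = {"123456", "password"}

rows = parse_rows(table)
missing = missing_from_blacklist(rows, placeholder_blacklist)
print(missing)
```

Swapping in the full top-100 rows and the real 370-entry blacklist would give the list of passwords that were bolded in the original post.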

Just to save everyone from the math, if an attacker tried the top 100 passwords as guesses, they would have been able to crack 1,483,668 passwords from the dataset, or 4.5% of the total passwords. If the Twitter blacklist had been in place, and the attacker still tried the same 100 guesses, they would have only cracked 475,046 passwords, or 1.4% of the total passwords. So a blacklisting approach certainly can help against online password attacks (where the attacker is severely limited in the number of guesses they can make). That being said, the Twitter list probably shouldn't be considered the gold standard, as there are a lot of improvements that can be made to it.
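The arithmetic behind those numbers is just a sum of occurrence counts over the guesses that are not blacklisted. A rough Python sketch, using only the first five rows of the table and a made-up blacklist (again, not Twitter's real one), might look like the following. Like the figures above, it assumes the attacker keeps the same guesses and ignores which passwords users would have picked instead of the blacklisted ones.

```python
def cracked_by_guesses(counts, guesses, blacklist=frozenset()):
    """Count accounts that fall to the given guesses.

    counts    -- mapping of password -> number of accounts using it
    guesses   -- passwords the attacker tries
    blacklist -- passwords the site rejects, so no account is assumed to use them
    """
    return sum(counts.get(g, 0) for g in set(guesses) if g not in blacklist)


# First five rows of the table above.
sample_counts = {
    "123456": 290729,
    "12345": 79076,
    "123456789": 76789,
    "password": 59462,
    "iloveyou": 49952,
}

# Placeholder blacklist for illustration only; NOT Twitter's real list.
sample_blacklist = {"123456", "password"}

guesses = list(sample_counts)
without_blacklist = cracked_by_guesses(sample_counts, guesses)
with_blacklist = cracked_by_guesses(sample_counts, guesses, sample_blacklist)
print(without_blacklist, with_blacklist)  # prints "556008 205817"
```

Dividing either result by the total number of passwords in the dataset gives the percentage cracked.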

Well, that's one question down. Keep them coming!

3 comments:

Per Thorsheim said...

Thanks Matt, highly appreciated!

More questions:
1. Number of occurrences of standard English words written backwards? (using standard wordlists)

2. Number of occurrences of surnames based on national statistics for popularity? (This should get you started: http://www.ssb.no/navn_en/)

3. Any double passwords? (passwordpassword)

Regards,
Per Thorsheim
http://securitynirvana.blogspot.com/

Matt Weir said...

No problem. I'll get to work on those stats. Just an FYI, there will probably be some false positives/negatives in the results. For example, with reversed words, dog=god. Also, I might miss some names that have mangling rules applied, or double passwords with mangling rules applied in the middle, e.g. password1password.
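To make the double-password caveat concrete, here is a minimal Python sketch of the simplest possible check: a string that is exactly some shorter string repeated twice. The function name is made up for illustration, and this is only one way the stat could be computed.

```python
def is_doubled(password):
    """True if the password is some non-empty string repeated exactly twice."""
    half, odd = divmod(len(password), 2)
    return half > 0 and not odd and password[:half] == password[half:]


print(is_doubled("passwordpassword"))   # True
print(is_doubled("password1password"))  # False: mangling in the middle is missed
print(is_doubled("111111"))             # True: arguably a false positive
```

The last two lines show both failure modes: a mangled double slips through, while a plain repeated-digit password gets counted as a "double".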

NaturallySelectable said...

I realize I'm late to the game here, but it wouldn't be hard, Matt, to eliminate false negatives on backwards words. The function would be like this in pseudo-code:

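# is_word(s): true if s appears in the wordlist
if (not is_word(pass)) {
    return is_word(reverse(pass))   # e.g. "drowssap" reversed is a word
} else {
    return false                    # already a forward word, e.g. "dog"
}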

Although I do see your point; you have no way of identifying the user's intent with a password like "dog".
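A runnable Python version of that idea, as a sketch only: the tiny wordlist below is a stand-in, and a real run would load a standard wordlist instead. The function name is hypothetical.

```python
def is_reversed_word(password, wordlist):
    """True if the password is not itself a word but its reversal is."""
    candidate = password.lower()
    if candidate in wordlist:
        # Already a forward word (e.g. "dog"), so don't count it as reversed.
        return False
    return candidate[::-1] in wordlist


# Stand-in wordlist for illustration; use a real dictionary file in practice.
words = {"dog", "god", "password", "monkey"}

print(is_reversed_word("drowssap", words))  # True
print(is_reversed_word("dog", words))       # False
print(is_reversed_word("abc123", words))    # False
```

This sidesteps the dog/god ambiguity by always treating such passwords as forward words, which still says nothing about what the user actually intended.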