Frequency Analysis for Stronger Passwords
As a commenter pointed out on my last post, the previous frequency analysis was based on a set of passwords created with no strong password creation policy in place. What happens when you look only at "strong" passwords? I went through the MySpace list, the Phpbb.com list, and the Finnish list and extracted every password that would meet a stronger password creation policy: at least 8 characters long, containing at least 1 lowercase letter, 1 uppercase letter, 1 digit, and 1 special character. This gave me a grand total of 214 passwords (an impressive number, I know...). I belatedly realized that I forgot to copy a couple of other lists (such as the ones from Millw0rm, singles.org, etc.) from my school computer back in Tallahassee, so I'll try to get someone to send them to me so I can update this post with a larger data set.
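For anyone who wants to repeat the extraction on their own lists, here is a minimal sketch of that filter. It is not the script I actually used; the function names are made up for illustration, and it assumes "special character" means any ASCII punctuation character, which may be narrower or wider than the rule you care about.

```python
import string


def is_strong(password: str, min_length: int = 8) -> bool:
    """Return True if the password meets the policy described above."""
    return (
        len(password) >= min_length
        and any(c in string.ascii_lowercase for c in password)
        and any(c in string.ascii_uppercase for c in password)
        and any(c in string.digits for c in password)
        # Assumption: "special" means ASCII punctuation (string.punctuation).
        and any(c in string.punctuation for c in password)
    )


def filter_strong(passwords):
    """Keep only the passwords that pass is_strong, preserving input order."""
    return [p for p in passwords if is_strong(p)]
```

Calling `filter_strong` on a list loaded from a password dump returns the subset that would survive the stronger policy.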
As you can see below, uppercase letters dominate the first character position, while digits and special characters dominate the last character position ("1" and "!" alone end nearly 40% of the passwords). Admittedly, this is a small sample size. If anyone has a better data set or can point me in the right direction, I'd love to take a look at it. Oh, and keep those good comments coming ;)
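The tables that follow are simple per-position character counts turned into percentages. A rough sketch of how such tables could be computed is below. The function names are hypothetical, and the tie-breaking rule (reverse character order among equal counts) is only an assumption chosen because it matches the ordering visible in the tables.

```python
from collections import Counter


def frequency_table(chars):
    """Return (char, percent) pairs, most frequent first."""
    counts = Counter(chars)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by count descending; ties fall back to reverse character order
    # (an assumption that happens to match the tables in this post).
    ranked = sorted(counts.items(), key=lambda kv: (kv[1], kv[0]), reverse=True)
    return [(ch, 100.0 * n / total) for ch, n in ranked]


def analyze(passwords):
    """Build overall, first-character, and last-character tables."""
    passwords = [p for p in passwords if p]  # skip empty entries
    overall = frequency_table("".join(passwords))
    first = frequency_table(p[0] for p in passwords)
    last = frequency_table(p[-1] for p in passwords)
    return overall, first, last


def charset(table):
    """Collapse a frequency table into a frequency-ordered charset string."""
    return "".join(ch for ch, _ in table)
```

The charset strings are just the table's characters joined in order, which makes them easy to paste into a cracking tool that accepts a custom character order.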
Here is the data:
Overall Character Frequency Charset:
1ear!i0t2soln3#dbA4mcu5$h89S7y*kgPCD@_w-TG6EB.pRHxvQFMLJqYONfKW%VI/&zZXj^U}]\[)(:,+
First Character Frequency Charset:
SPDFCAGT1MBRLNKJVE*!mdcQIH$tqbO432#}zvusk^YXW960-(
Last Character Frequency Charset:
1!5*327$r94#0e.%kdba]8-wuomlgc^\ZVTSQNKHCA6/
Overall Character Frequency Analysis (% of all characters):
1 5.81745
e 5.2658
a 4.81444
r 3.91174
! 3.81143
i 3.36008
0 3.15948
t 2.70812
2 2.65797
s 2.60782
o 2.60782
l 2.25677
n 2.10632
3 2.05617
# 2.00602
d 1.80542
b 1.80542
A 1.65496
4 1.65496
m 1.60481
c 1.60481
u 1.55466
5 1.50451
$ 1.45436
h 1.40421
8 1.40421
9 1.35406
S 1.25376
7 1.20361
y 1.15346
* 1.15346
k 1.10331
g 1.10331
P 1.10331
C 1.10331
D 1.05316
@ 1.05316
_ 1.00301
w 0.952859
- 0.952859
T 0.852558
G 0.802407
6 0.802407
E 0.752257
B 0.752257
. 0.752257
p 0.702106
R 0.651956
H 0.651956
x 0.601805
v 0.601805
Q 0.601805
F 0.601805
M 0.551655
L 0.551655
J 0.551655
q 0.501505
Y 0.501505
O 0.501505
N 0.501505
f 0.451354
K 0.451354
W 0.401204
% 0.401204
V 0.351053
I 0.351053
/ 0.351053
& 0.351053
z 0.300903
Z 0.250752
X 0.200602
j 0.150451
^ 0.150451
U 0.150451
} 0.100301
] 0.100301
\ 0.100301
[ 0.100301
) 0.100301
( 0.100301
: 0.0501505
, 0.0501505
+ 0.0501505
----------------------------------------
First Character Frequency Analysis (% of passwords):
S 7.94393
P 7.00935
D 6.07477
F 5.14019
C 5.14019
A 5.14019
G 4.6729
T 4.20561
1 3.73832
M 3.27103
B 3.27103
R 2.80374
L 2.80374
N 2.33645
K 2.33645
J 2.33645
V 1.86916
E 1.86916
* 1.86916
! 1.86916
m 1.40187
d 1.40187
c 1.40187
Q 1.40187
I 1.40187
H 1.40187
$ 1.40187
t 0.934579
q 0.934579
b 0.934579
O 0.934579
4 0.934579
3 0.934579
2 0.934579
# 0.934579
} 0.46729
z 0.46729
v 0.46729
u 0.46729
s 0.46729
k 0.46729
^ 0.46729
Y 0.46729
X 0.46729
W 0.46729
9 0.46729
6 0.46729
0 0.46729
- 0.46729
( 0.46729
----------------------------------------
Last Character Frequency Analysis (% of passwords):
1 21.4953
! 18.2243
5 6.07477
* 4.6729
3 4.20561
2 3.73832
7 3.27103
$ 3.27103
r 2.80374
9 2.80374
4 2.80374
# 2.80374
0 2.33645
e 1.86916
. 1.86916
% 1.40187
k 0.934579
d 0.934579
b 0.934579
a 0.934579
] 0.934579
8 0.934579
- 0.934579
w 0.46729
u 0.46729
o 0.46729
m 0.46729
l 0.46729
g 0.46729
c 0.46729
^ 0.46729
\ 0.46729
Z 0.46729
V 0.46729
T 0.46729
S 0.46729
Q 0.46729
N 0.46729
K 0.46729
H 0.46729
C 0.46729
A 0.46729
6 0.46729
/ 0.46729
Comments
8=====>~O~o
Which ironically meets many "strong" password requirements.
~ ~<====3
8====> )0(
8===D
(^_-) ~ ~ <===3
but for the most part they were variations of the same theme. You could include different shaft lengths and "external graphics" to account for most of the different ways people would type them. You can also set some bounds. For example, it is unlikely someone would use
8=>
or
8============================>
so you can set some reasonable limits to generate your dictionary with. This doesn't just have to apply to genitalia. There are a lot of ASCII art combos people can use. I've seen emoticons used as part of a regular password before. They can also be used standalone. For example:
/><{{{{"> fish
///\oo/\\\ spider
_/\__/\__0> worm
----{,_,"><",_,}---- two mice
...---... S-O-S
»-(¯`·.·´¯)-> heart with an arrow through it
d[ o_0 ]b robot
Well, you get the idea. And THANK YOU so much for giving me the chance to draw genitalia on my blog ;)
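The bounded-length idea above can be sketched as a small generator. This is only an illustration: the function name, the default bounds of 3 to 12 repeats, and the particular tails and extras are assumptions picked from the examples in this thread, not tuned values.

```python
from itertools import product


def art_candidates(head="8", body="=", tails=(">", "D"),
                   extras=("", "~O~o"), min_len=3, max_len=12):
    """Yield head + body*n + tail + extra for each n in [min_len, max_len]."""
    # The bounds skip implausibly short or long variants such as "8=>".
    for n, tail, extra in product(range(min_len, max_len + 1), tails, extras):
        yield head + body * n + tail + extra
```

With the defaults this produces 40 candidates, which can be written out as a dictionary file or fed to a rule engine. Other ASCII art patterns with a repeated middle section can be covered by changing `head`, `body`, and `tails`.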