Reusable Security

OMEN Improvements

2025-08-09T17:52:00.000-07:00

“If I had an hour to solve a problem, I would spend 55 minutes thinking about the problem and five minutes finding the solution.”

- Proverb falsely attributed to Albert Einstein

Introduction:

I'm a big fan of graphing password cracking sessions. It's a good way to figure out what's working and what isn't by highlighting trends that get lost in the final "cracking success" number. The very first thing I look for in these graphs is saw-tooth steps. This is an easy way to spot potential improvements. If you suddenly see a quick run of cracks in your password cracking success rate, which is what these saw-tooth steps represent, that implies you can optimize your cracking session by moving that attack earlier in your workflow. Now you need to temper that with the realization that no two password sets are exactly the same, you don't want to overtrain your cracking sessions on one particular dataset, and often these improvements come about because you learn some target specific information part-way through your cracking session. But all that being said, these saw-tooth steps are a great place to start your investigations.

These saw-tooth steps are very evident in the current OMEN cracking sessions as you can see in the graph below. This post will cover my investigation into making OMEN better based on these observations. But if you take anything away from this post, it's really that you should graph your cracking sessions, (ideally using a linear and not logarithmic scale), as chances are it will help you optimize your cracking techniques as well.

OMEN Background:

At a high level OMEN is simply another Markov based password guess generator. What makes it stand out from other Markov approaches though is in how it calculates probability thresholds for generating guesses. Rather than ordering likely "next characters" in an array or queue and selecting the next most likely option (such as Hashcat's Markov or "roughly" like JtR's Incremental), or multiplying probabilities together and using a probability threshold (like JtR's --Markov option), OMEN instead assigns each transition a "cost" between 0-10, and then allocates a total "guessing budget" for generating guesses based on the current "level" it is at. So for an OMEN level 4 guess, it would have a "budget" of 4 it needs to spend on all the transitions when creating a guess. The nice thing about this is that when calculating an OMEN guess, you only need to do integer addition. This is a huge bonus from a speed perspective since multiplying floats (like JtR's Markov) is expensive. This approach also gives you much more granular control about the different probability costs associated with a transition compared to using an array based ranking which is what Hashcat does.

The key thing to keep in mind is that there are 4 items that OMEN can spend its budget on, of which 3 are currently in use:

Initial Probability (IP): This is the first N-1 characters to start a password guess with.

If you have a NGram length of 4, this is the first 3 characters of your password guess.
The IP is trained from the start of all passwords in the training set
Because the IP isn't trained on the full password, there tends to be a lot of gaps. For example, when trained on a 1 million subset of RockYou with the default alphabet of 72 characters, the total keyspace would be around 373k lines, but the OMEN IP list only contains roughly 90k lines.

Length (LN): The total length of the password guess

OMEN prioritizes guesses based on their length so if a password guess has a "non-standard" length it is penalized with a higher cost.
The reason I want to highlight this is because **Spoiler** this is where some of the improvements can be made in the standard OMEN algorithm.

Conditional Probability (CP): This is the likelihood of the "next" character appearing.

For example if the previous three characters are "que", the CP cost of the next letter being '[e,s]' might be 0, and the next letters being '[a,n]' might have a CP cost of 1.

End Probability (EP) (NOT USED): The probability of the last couple of characters in a password

Note: Neither the Rub-SysSec nor the Py-OMEN implementation currently uses EP, even though both tools still calculate it. This is because testing revealed that using EP makes the guesses significantly worse. I know that sounds counter-intuitive, but that's why we run tests!
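To make the "budget" idea concrete, here is a minimal sketch of a level-based enumerator. It is a toy, not the Rub-SysSec or py-omen code: the function names, the NGram length of 3, and every cost in the tables are made up for illustration. The point is only that a guess is emitted at a given level when its IP, LN, and CP costs add up to exactly that level, using nothing but integer arithmetic.

```python
CONTEXT = 2  # toy NGram length of 3, so each transition sees the previous 2 characters

# Hypothetical cost tables (invented for this example, not trained on real data)
IP_COSTS = {"pa": 0, "qu": 1}
LN_COSTS = {3: 0, 4: 1}
CP_COSTS = {
    "pa": {"s": 0, "t": 1},
    "as": {"s": 0},
    "at": {"h": 0},
    "qu": {"e": 0},
    "ue": {"s": 1},
}


def _extend(prefix, length, budget, cp_costs):
    """Append characters until `length` is reached, spending `budget` on transitions."""
    if len(prefix) == length:
        if budget == 0:  # only emit guesses that spend the budget exactly
            yield prefix
        return
    context = prefix[-CONTEXT:]
    for char, cost in cp_costs.get(context, {}).items():
        if cost <= budget:
            yield from _extend(prefix + char, length, budget - cost, cp_costs)


def enumerate_level(level, ip_costs, ln_costs, cp_costs):
    """Yield every guess whose IP + LN + CP costs sum to exactly `level`."""
    for ip, ip_cost in ip_costs.items():
        for length, ln_cost in ln_costs.items():
            budget = level - ip_cost - ln_cost
            if budget < 0 or length < len(ip):
                continue
            yield from _extend(ip, length, budget, cp_costs)


for level in range(3):
    print(level, list(enumerate_level(level, IP_COSTS, LN_COSTS, CP_COSTS)))
```

Note that zero-cost transitions can be chained for free, which matters later when looking at how length is priced.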

OMEN Investigation:

To better investigate OMEN attacks, there are three repos I'll be using:

Py-OMEN: A python implementation of the Rub-SysSec OMEN attack

Link: https://github.com/lakiw/py_omen
Comments: I've been fixing this up so it actually works. This is the same algorithm as the OMEN implementation in the PCFG toolset but it is standalone without any of the other PCFG code. So if you are running real cracking sessions I'd still recommend using the PCFG toolset instead, but this standalone version is easier to work with when investigating improvements to OMEN.

Password Research Tools: Allows graphing and investigating the effectiveness of password attacks.

Link: https://github.com/lakiw/Password_Research_Tools
Comments: After many years, I'm also updating this toolset to add a few more features. Specifically I can send it debug statements from my guess generator and it'll include when in the guessing process the statement was outputted. This is helpful for investigating when OMEN levels change. All graphs here are generated using that tool + Excel.

Pretty Cool Fuzzy Guesser (PCFG): My main password guess generating toolset.

Link: https://github.com/lakiw/pcfg_cracker
Comments: Hey, if I'm going to make an improvement in how guesses are generated, of course I'm going to use that to upgrade my main toolset!

The first thing I did when investigating the saw-tooths in OMEN attacks was to run a cracking session and save all the CRACKED passwords to file as well as collect data for my graphs. That way when I wanted to dig into a particular run of cracked passwords from looking at the graph I can see what those passwords are. An example cracking session can be seen as follows:

python3 enumNG.py -r mod_rockyou1 | python3 ../Password_Research_Tools/checkpass.py -m 1000000000 -t ../../research/password_lists/rockyou/rockyou_32.txt --cracked_file rockyou1_vs_rockyou32.cracked

Note: You can also do something similar with Hashcat giving it the flag "--outfile-format=2,4" which will output the plaintext password followed by the guess number. Another option to make parsing easier is to save the hex_plain vs. the raw plaintext using the flag "--outfile-format=3,4". This can be annoying as it requires the extra step of decoding the plaintext password to do any manual analysis, but that's often a lot less annoying than dealing with parsing issues.
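For the hex_plain option, the decoding step can be sketched in a few lines of Python. This assumes the outfile uses Hashcat's default ":" separator and that each line is the hex-encoded plaintext followed by the guess number, as described above. The function name is hypothetical.

```python
def parse_outfile_line(line, separator=":"):
    """Split one 'hex_plain:guess_number' line into (plaintext, guess_number)."""
    # Hex output never contains the separator, so splitting on the last one is safe
    hex_plain, guess_number = line.rstrip("\r\n").rsplit(separator, 1)
    # errors="replace" keeps odd encodings from aborting a long analysis run
    plaintext = bytes.fromhex(hex_plain).decode("utf-8", errors="replace")
    return plaintext, int(guess_number)


print(parse_outfile_line("70617373776f7264:42"))
```

Because the plaintext is hex-encoded, passwords that themselves contain a ":" no longer break the parsing.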

Hashcat Feature Request: (NOTE: This may be outdated since the new version of Hashcat has a ton of improvements) It would be nice if Hashcat honored the ordering of the --outfile-format listing so I could put in "--outfile-format=4,2" and it would print the guess number first. Right now, it ignores the ordering and will put the plaintext password first if you use that command.

Moving on, one nice thing about using py-omen is it has a "test" mode which allows you to paste in strings and see how they would be parsed by OMEN. For example, this is me analyzing the cracked password "talishka"

Here you can see that a length 8 guess had a cost of 1. The initial IP "tal" had a cost of 2. The rest of the costs are from the transitions, with ISH->K being the big cost with a price of 3. The total level that this guess would be generated at would be 8 (aka 1 + 2 + 1 + 1 + 3).
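As a rough illustration of what the parse does, the sketch below splits a password into its IP and its NGram transitions for NGram=4, then adds up the costs quoted above. The helper name is hypothetical and this is not py-omen's actual test-mode code. The transitions that do not show up in the sum are assumed to cost 0.

```python
def split_for_omen(password, ngram=4):
    """Return the IP (first ngram-1 characters) and the (context, next_char) transitions."""
    context_len = ngram - 1
    ip = password[:context_len]
    transitions = [
        (password[i - context_len:i], password[i])
        for i in range(context_len, len(password))
    ]
    return ip, transitions


ip, transitions = split_for_omen("talishka")
print(ip)
print(transitions)

# Costs taken from the walkthrough above: length 8 costs 1, the IP costs 2,
# and the non-zero transitions cost 1, 1, and 3 (the 3 being ish -> k).
length_cost = 1
ip_cost = 2
nonzero_transition_costs = [1, 1, 3]
level = length_cost + ip_cost + sum(nonzero_transition_costs)
print(level)
```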

I then put "break points" in my graph by having my OMEN implementation output a debug statement whenever it made a major transition (first when the level increased, and then when things like the IP or length changed). This allowed checkpass to note in a cracking session when those transitions occurred. This was super helpful since I could put dots on my graph and start to really understand what was happening when those sawtooth improvements kicked in.
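The break point idea can be sketched as follows. This is not how checkpass is actually implemented. The "##DEBUG##" prefix and the function name are assumptions made up for the example. The guess generator mixes marker lines into its output, and the checker records the guess count at which each marker arrived instead of treating it as a guess.

```python
DEBUG_PREFIX = "##DEBUG##"  # hypothetical marker, not the real checkpass format


def run_session(lines, targets):
    """Count guesses, record cracks, and note the guess number of each debug marker."""
    remaining = set(targets)
    guess_count = 0
    cracked = []
    markers = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(DEBUG_PREFIX):
            # A marker is not a guess, so it does not advance the counter
            markers.append((guess_count, line[len(DEBUG_PREFIX):].strip()))
            continue
        guess_count += 1
        if line in remaining:
            remaining.discard(line)
            cracked.append((guess_count, line))
    return cracked, markers


stream = ["##DEBUG## level=0", "pas", "##DEBUG## level=1", "pat", "pass"]
cracked, markers = run_session(stream, {"pass"})
print(cracked, markers)
```

One caveat with an in-band marker like this is that a real guess starting with the same prefix would be misread, so the prefix needs to be something unlikely to show up in a password.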

Looking through the sawtooth portions of the graph for an attack using the RockYou1 training set and the RockYou32 test set, I noticed that the transitions and IP tended to be all over the place, but that the length costs were relatively low. Basically the OMEN guesser did really well when it spent its level price on anything BUT length. What this seemed to imply was that the cost for longer passwords was not being weighted in an optimal fashion. Or to put it another way, making longer guesses probably needed to be more costly.

So let's test this out. As a totally non-scientific test to basically "muck around with the data to quickly test a hypothesis", I manually modified the OMEN ruleset trained on the RockYou1 training data and increased the cost of any length that wasn't already at a 0 cost by 1. So if the cost was 0 it stayed 0, but if the cost was 1, it was now 2. If it was 2, it was now 3. Etc. I then reran a test against the RockYou32 test set and then compared the results.
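The manual edit amounts to the following transformation. The function name and the example costs are hypothetical. The real change was made by hand in the trained ruleset files.

```python
def bump_nonzero_costs(length_costs, bump=1):
    """Raise every non-zero length cost by `bump`, leaving zero-cost lengths alone."""
    return {
        length: cost + bump if cost > 0 else cost
        for length, cost in length_costs.items()
    }


# Made-up costs purely to show the effect
print(bump_nonzero_costs({6: 1, 7: 0, 8: 1, 12: 2}))
```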

It's not pretty, but it certainly represents an improvement. Now this is only one test against one dataset. And it was a short test at that. But as a first quick test the results were promising enough that it convinced me it was worth spending a bit more time refining the improvement before running longer and more diverse tests.

Increasing the Cost of OMEN Lengths:

At this point, my theory was that the cost of longer passwords was too low in the current OMEN implementation. In the default OMEN algorithm, the "Length Cost" is based on how likely passwords of that length are to be found in the training set. For example: If length 7 passwords are the most common in the training set, then the OMEN "Length Cost" for 7 character passwords would be 0. Let's assume though that length 6 and length 8 passwords showed up at relatively the same frequency in the training set. In this example, they might be frequent enough that they would be assigned an OMEN "Length Cost" of 1. This means the OMEN cost of generating a length 6 and 8 password would be the same, even though it seems like you'd have better success making a length 6 guess vs. a longer length 8 guess when using brute force techniques.

One area where the longer OMEN generated passwords might get more cost is that the "Conditional Probability" costs of adding extra letters may not be 0. But many of the high probability CP costs ARE zero, which means you can add as many as you want and they won't increase the overall cost. This leaves us two options to make it so that in the above example length 8 guesses have a higher cost than length 6 guesses:

We can directly add additional cost to the "Length Cost" to make longer passwords more costly
We can remove 0 cost "Conditional Probability" transitions, so that way any added character is guaranteed to have a cost associated with it.

There's plusses and minuses to both approaches. I wish I could say I weighed those out, ran multiple tests, and settled on a specific implementation, but I picked option #1 of adding an additional cost for each extra character to the "Length Cost" since it was the easiest to do. The new approach is described below:

Updated Length Cost Training Algorithm:

Calculate the length cost as in the base OMEN algorithm.

    ##--Calculate the probi values
    probi = base_count / total_count
    probi *= level_adjust_factor
    probi += 0.00000000001
    
    ##--Now calculate the level
    level = math.floor(-1 * math.log(probi))

Next add an additional cost of the length * (cost_factor) to each Length Cost. So if 6 character passwords had a starting cost of 2, and the cost_factor was 1, then the total cost would now be [2 + 6 x 1] which is 8.
Finally find the lowest length cost, and set that cost to 0, and reduce the cost of all the other length costs by the original amount of the lowest length cost. This was just because I like the OMEN algorithm to start at cost 0 so at least one length cost needed to be 0 to allow that.
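Putting the three steps together, a minimal sketch might look like the code below. It is not the code that shipped in py-omen or the PCFG trainer. The function name, the default level_adjust_factor of 1.0, and the clamp to the 0-10 range are assumptions, and the length counts are invented. Rounding the per-character cost down matches how the fractional cost factor is described later in this post.

```python
import math


def length_costs(length_counts, cost_factor=1.0, level_adjust_factor=1.0, max_level=10):
    """Return {length: cost} using the base OMEN level plus a per-character cost."""
    total_count = sum(length_counts.values())
    costs = {}
    for length, base_count in length_counts.items():
        # Step 1: base OMEN level, as in the snippet above
        probi = base_count / total_count
        probi *= level_adjust_factor
        probi += 0.00000000001
        level = math.floor(-1 * math.log(probi))
        level = min(max(level, 0), max_level)  # assumption: keep the base level in 0-10

        # Step 2: add length * cost_factor, rounded down to an integer
        costs[length] = level + math.floor(length * cost_factor)

    # Step 3: shift everything so the cheapest length costs 0
    lowest = min(costs.values())
    return {length: cost - lowest for length, cost in costs.items()}


toy_counts = {6: 300, 7: 400, 8: 300}  # hypothetical training set length counts
print(length_costs(toy_counts, cost_factor=0))
print(length_costs(toy_counts, cost_factor=1))
```

With a cost factor of 0 the toy lengths 6 and 8 tie, which is the situation described above. With a cost factor of 1 the length 8 guesses become more expensive than the length 6 guesses.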

OMEN Improvement Results:

For my next test, I used the updated Length Cost calculation, and assigned it a cost-factor of 1. Therefore each additional character added to an OMEN guess increased the cost of the guess by 1. This is in addition to the other weighting/cost factors so there is still a "base" extra cost for non-frequent lengths as well. Running the same test as above, with training on the RockYou1 dataset and testing against the RockYou32 set yielded the following results:

I'll be up front, I was not expecting this much of an improvement. Now on a longer cracking session, I expect the two lines to converge more. This modification doesn't impact the types of guesses OMEN generates. It just makes OMEN generate shorter guesses sooner. Still, cracking more passwords quickly is always a welcome change (from the perspective of the red team member that is).

But if we're getting nitpicky, (and this whole post is evidence that I am), even in this new graph there still are sawtooth steps. They are much smaller, but they are still there. Let's see then if modifying the length cost factor improves things even more. I wanted to check if a cost factor of 1 was either too high or too low, so I ran two more tests, one with a cost factor of 2 (so each additional character added 2 to the cost), and the other with a cost factor of 0.5 (where it would be rounded down to the nearest integer). Here are the results:

Giving OMEN a length cost factor of 1.0 worked the best, though 0.5 was close. This implies there's still a lot of value in trying length 6/7/8 passwords vs. overly focusing on passwords length 5 and lower. Now this doesn't answer the question about how to smooth out the remaining sawtooth steps, but this seems good enough for now. The next thing to do is to run this test on a different password dataset. For this test, I ran the base OMEN algorithm against an OMEN with a length cost factor of 1, both trained on RockYou1 against the full list of cracked LinkedIn passwords.

Once again, the results point to a noticeable improvement. The LinkedIn set is a much tougher one to crack, as evidenced by the significantly lower success rate, so seeing the modified OMEN attack do better against it is a good indicator that the modifications represent a real improvement vs. being a fluke.

Next, I was excited to see if I could fold these improvements into my PCFG toolset. The PCFG toolset also uses OMEN for its brute-force guess generation so it can create "words" not seen in the training set. Therefore I was able to copy paste the changes from py-OMEN into the PCFG code and train the OMEN portion using a length cost of 1. When I then ran a cracking session (trained on RockYou1) against the LinkedIn list using the "base" PCFG ruleset and the modified PCFG ruleset the following results were produced:

Breaking down these results, the base PCFG does better than the previously modified OMEN attack. That's not surprising, since the PCFG guess generator uses a lot of mangling rules that make it hard for any pure-brute force attack to keep up with it, (at least for shorter cracking sessions). But by adding a length cost factor into the OMEN algorithm that the PCFG toolset uses, I was really impressed by how much more effective it made the PCFG attack.

This seems like a clear win, so I pushed these changes to the core PCFG toolset and they will be available starting with version 4.8 of the pcfg-trainer. I also updated the Default PCFG ruleset to include these changes. That way if you run a standard attack the changes will already be applied. If you are using a custom ruleset that you trained yourself, you'll need to retrain that custom ruleset for the changes to take effect though.

The TLDR of this entire blog post is that the PCFG password cracker has gotten better. But as I said at the start of the blog post, I hope if you take anything away from this entry, it is the value of graphing out cracking sessions to understand what is going on. There is still a lot of room for improvement. Finding those improvements though really depends on someone going "huh, that looks weird" and digging into it.

Analyzing Tokenizer Part 2: Omen + Tokenizer

2024-12-04T16:15:00.000-08:00

“I have not failed. I've just found 10,000 ways that won't work”

- Thomas Edison

Introduction:

This is a continuation of a deep dive into John the Ripper's new Tokenizer attack. Instructions on how to configure and run the original version of Tokenizer can be found [Here]. As a warning, those instructions need to be updated as a new version of Tokenizer has been released that makes it easier to configure. The first part of my analysis can be found [Here].

This is going to be a bit of a weird blog entry as this is a post about failure. Spoiler alert: If you are reading this post to learn how to crack passwords, just go ahead and skip it. My tests failed, my tools failed, and my understanding of my tools failed. A disappointing number of passwords were cracked in the creation of this write-up. I'll admit, I was very tempted to shelve this blog post. But I strongly believe that documenting failures is important. Often when reading blog posts you don't really see the messy process that is research. Stuff just doesn't work, error messages that are obvious in retrospect are missed, and tests don't always turn out the way you expect. So as you read this, understand that it's more a journal of troubleshooting research tests when they go wrong, vs. a documentation of what to do.

To put it another way, the main audience for this blog post is:

My future self. I'm really annoyed at myself for not better documenting some of my past work. I'll go into that in more detail later.
Password cracking tool developers. This post won't help you crack more passwords or develop better password security strategies. Hopefully it will help those who are developing the tools to help you crack those passwords though.
People who play dwarf fortress and like [Fun].

Question 1: Why are my Tokenizer's "first 25 guesses" different from Solar Designer's

Background:

In response to my previous blog entry, Solar Designer wrote: "One thing that surprised me is that your top 25 for training on RockYou Full (including dupes, right?) is different from what I had posted in here at all (even if similar)." [Link].

That's a good question, and one that I had been wondering about as well. There are a couple of things that could be causing this, from the way our Linux shells handle character encoding, to the order of our training lists, to differences in our training lists. Or it could be something totally different that I'm not imaginative enough to come up with yet. At a high level, it's probably not that big of a deal since our experiences running Tokenizer attacks seem roughly the same (Solar Designer has posted tests comparing it to Incremental mode, and they roughly match what I've been seeing). But this can be a useful rabbit hole to dive down since it can expose some optimizations or environmental issues that could cause problems as more people start to use this tool. There's a big gulf between "it works on my machine" and "it's easy for anyone else to run".

Conclusion Up Front:

Tokenizer (and base-Incremental mode) seem resilient to the order of the passwords they are trained on, and setting 'export LC_CTYPE=C' did not seem to impact guess generation.

Bonus Finding:

When manually analyzing password guesses DO NOT pipe the output of a password cracking session into "less". At least in my WSL Ubuntu shell, this seemed to add artifacts into the guesses I was creating, which gave me bad data. Note: This doesn't impact running actual password cracking sessions.

Instead, when using John the Ripper, make use of the "--max-candidates" option. Aka:

./john --incremental=tokenize1 --external=untokenize1 --stdout --max-candidates=25

Discussion

This was an area where my analysis setup really let me down so I chased a lot of unproductive leads before I was able to find the ground truth. For my first test I sorted Rockyou1 and compared a Tokenizer attack trained on it to a Tokenizer attack trained on an unsorted Rockyou1 training set. Initially they appeared to generate different guesses. For example:

This led down an unproductive rabbit hole where I ended up generating a lot of different character sets for Incremental mode to try and track down what was causing the differences in guess generation. It wasn't until I got really frustrated and ran a "diff" on the different .chr files that Incremental mode uses and found they were EXACTLY THE SAME that I realized the problem might be in how I was displaying the guesses.
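The same sanity check can be scripted so it runs before any time is spent comparing guesses. The sketch below uses Python's standard filecmp module to do a byte-for-byte comparison. The function name and the .chr file names in the usage comment are hypothetical.

```python
import filecmp


def same_bytes(path_a, path_b):
    """Return True if the two files have identical contents."""
    # shallow=False forces a real content comparison instead of trusting os.stat()
    return filecmp.cmp(path_a, path_b, shallow=False)


# Example usage with made-up file names:
# same_bytes("tokenize_sorted.chr", "tokenize_unsorted.chr")
```

If this returns True, any difference in the displayed guesses has to come from the display or environment, not from the training.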

Still, I learned a few new things, and improved my testing process. So it wasn't a complete waste.

Question 2: How does the PCFG OMEN mode attack differ from the original Rub-SysSec version?

Background:

This question was inspired by a comment by @justpretending on the Hashcat Discord channel.

OMEN stands for Ordered Markov ENumerator and the original paper/implementation can be found [Here]. I became interested in it after it was presented at PasswordsCon where it was shown to be more effective than my PCFG attack and could generate guesses at speeds making it practical. That's certainly one way to get my attention! To better understand the OMEN attack I took the Rub-SysSec OMEN code and re-implemented it in Python. The standalone version of the python code (py-omen) is still available [Here]. Liking what I saw, I then replaced the existing Markov attack (based on JtR's --Markov mode) in the PCFG toolset with OMEN for the PCFG version 4 rewrite.

That's a lot of words to say that while the different implementations weren't 1 to 1, I expected my version of OMEN to be "mostly" similar to the original Rub-SysSec version. But it appears there are differences, so let's look into them!

Challenges with Unsupported Tools:

The first challenge I ran into was getting the stand-alone version of py-omen to run. For example, I get the following error when I try to generate guesses with py-omen:

I vaguely remember having to update my ConfigParser calls in the PCFG toolset, so that error tracks. My guess is if you ran py-omen with Python3.6 it would work, but it looks like it isn't compatible with Python3.12. While it is tempting to fix this bug now as having py-omen working would be nice for doing more experimentation with OMEN, it's really outside the scope of this investigation and the error message brings me joy. Long story short, I'm going to defer that work until a later point.

The important tool to run though is the C OMEN version developed by Rub-SysSec. The "make" build process worked without a hitch, but when I tried to run it the following error was displayed:

Looking into the open issues for the code I found one [Link] that highlighted that this problem occurs when you run it from an Ubuntu system. I verified this happens both with a WSL install of Ubuntu as well as a copy of Ubuntu running on bare metal. When I installed a Debian WSL environment though, I was able to get the original OMEN code to work.

Another challenge I ran into with the Rub-SysSec OMEN code was that by default it only generates 1 billion guesses and then stops. I "believe" you can override this using the "-e" endless flag, but I didn't figure this out until I had run my tests, so the following tests only display a 1 billion guess session vs. the 5 billion guess sessions I used in my previous blog posts.

Test 2) How does the original OMEN code perform compared to the PCFG OMEN code?

Test 2 Design:

Training the Rub-SysSec OMEN code on the RockYou1 training set (1 million random RockYou passwords), an attack will be run using a variation of the following command. Disclaimer, I didn't actually use the "-e" flag but I'm including it here to make it easier the next time I need to copy/paste a command from this blog into a terminal.

./enumNG -p -e | python3 ../Password_Research_Tools/checkpass.py -m 1000000000 -t ../../research/password_lists/rockyou/rockyou_32.txt -o ../../research/blog/tokenize/OG_omen_rock1_rock32_stats.txt

You'll notice I don't specify a specific training ruleset for enumNG since the Rub-SysSec code only supports one training set at a time (aka you need to retrain it every time you want to use a different ruleset).

As to the target sets, I'm going to run enumNG against both the RockYou32 test set (a different set of 1 million passwords from RockYou), and the LinkedIn password dump that I used in the previous blog posts.

Test 2 Results:

Test 2 Discussion:

While I expected to see differences between the PCFG OMEN and the Rub-SysSec OMEN, I was still surprised by how much they differed. I obviously made some improvements in the PCFG version of OMEN while totally forgetting what they were. As you can see from these tests, the original Rub-SysSec OMEN performs comparably to the new JtR Tokenize attack (the original OMEN did better on RockYou, but roughly the same or worse against LinkedIn).

The PCFG OMEN did much better though. These two attacks should be "mostly" the same! This difference in performance is like a grain of sand in my boot and I'd really like to better understand what makes them different. You'll notice both attacks have the "sawtooth" pattern though so there's certainly room for optimizing the underlying OMEN attack regardless of the implementation.

My first thought was these two tools used different Markov orders (or NGrams). Both of the tools can have the length of NGrams be specified via command line options during training, so having different default settings was a likely source of differences. Unfortunately when looking at their settings, both the Rub-SysSec and the PCFG OMEN use a default of NGram=4 (same as a 3rd order Markov chain). So that's ruled out.

Another source of differences could be the alphabets each OMEN attack uses. The alphabet is the set of characters OMEN selects from when generating guesses. One change I made to the PCFG OMEN code was to allow for a variable number of characters in the alphabet based on the training data (as well as support for different character encodings such as UTF-8). You can see the differences between the two different tools, which were both trained on the RockYou1 training data, below:
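As a rough idea of what "an alphabet based on the training data" can mean, the sketch below ranks characters by how often they appear in the training passwords and optionally keeps only the most frequent ones. This is an assumption for illustration. The actual selection rule used by the PCFG trainer is not described here, and the function name is hypothetical.

```python
from collections import Counter


def build_alphabet(passwords, max_size=None):
    """Return the characters seen in training, most frequent first."""
    counts = Counter(char for password in passwords for char in password)
    # Break frequency ties by character so the result is deterministic
    ranked = sorted(counts, key=lambda char: (-counts[char], char))
    return "".join(ranked[:max_size])


print(build_alphabet(["aab", "abc"]))
```

A fixed alphabet, by contrast, stays the same size no matter what the training data contains.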

While the different alphabets are probably the cause of some of the differences, given that the "extra" characters in the PCFG OMEN are unlikely to be in many passwords, this doesn't explain the entire difference. My current theory is that the probability smoothing and how the PCFG toolset uses the "Initial Probability" (IP) may be the source of many of the other differences. Side note: Neither Rub-SysSec nor PCFG OMEN uses "Ending Probability" (EP) by default.

What is very annoying though is I don't have any notes on what I did differently for smoothing, so to better understand this I need to dig back into the original Rub-SysSec version of OMEN as well as my own code. So this is something I need to research, but I'm going to defer most of that investigation to a later blog post.

TLDR: The Rub-SysSec and PCFG toolsets both use the OMEN attack, but there are implementation differences which cause them to behave very differently.

Question 3) Can the Tokenize approach be applied to the PCFG OMEN attack?

This test was brought up by Solar Designer on the John-Users mailing list and we discussed what this attack might look like [Here]. The proposed approach to test this can be summed up as follows:

Use the tokenize.pl script to "modify" the training data with variable order Markov chains and create the "untokenize" JtR external mode, just like when running a normal tokenize attack with JtR's Incremental mode.
Train a PCFG OMEN attack on this "modified" training data
Use the PCFG OMEN attack to generate guesses, but before hashing/checking the password guesses, pipe the guesses into JtR --stdin mode to apply the --external=untokenize logic to convert the tokenize placeholder characters back to the multi-character strings.

This should be a good enough "smoke test" to see if there might be value in adding tokenize support to the main PCFG toolset.
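The tokenize and untokenize steps of that pipeline can be illustrated with a toy round trip. This is only a sketch of the idea: in the real workflow tokenize.pl chooses the tokens and emits a sed script plus a JtR external mode, while the token table, placeholder characters, and function names below are all made up.

```python
# Hypothetical token table: multi-character strings mapped to single placeholder
# characters that do not otherwise appear in the training data
TOKENS = {"ing": "\x01", "123": "\x02", "in": "\x03"}


def tokenize(password, token_map):
    """Replace each known substring with its placeholder, longest substrings first."""
    for substring in sorted(token_map, key=len, reverse=True):
        password = password.replace(substring, token_map[substring])
    return password


def untokenize(guess, token_map):
    """Expand placeholder characters in a generated guess back into their substrings."""
    reverse = {placeholder: substring for substring, placeholder in token_map.items()}
    return "".join(reverse.get(char, char) for char in guess)


encoded = tokenize("loving123", TOKENS)
print(len(encoded))
print(untokenize(encoded, TOKENS))
```

The guess generator only ever sees the shorter encoded strings, so a fixed NGram length effectively covers more of the original password. It also means the length the generator sees is not the length of the final guess.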

Test 3 Training:

For this test, I'm finally updating my Tokenize code to the latest version in the John the Ripper Bleeding Jumbo github repository. To make things easier to compare against previous runs, I'll be training the new version Tok-OMEN on the RockYou1 1 million training subset. Here are the commands I used:

Run tokenize.pl to generate the sed script and the external mode code:

./tokenize.pl rockyou_1.txt

Convert the training file to make use of tokens by piping it through the sed command. Note: The new tokenizer sed script also converts the output to JtR pot format.

IMPORTANT: Remove the s/^/:/ from the end of the sed command so the results will be saved as a password only file. This is because the pcfg trainer no longer has the ability to train from a potfile (long story: It was a buggy feature. I need to revisit this feature as it is useful).
cat rockyou_1.txt | sed 'REALLY_LONG_SED_COMMAND' > tok_omen_rockyou1.training

Also copy the External mode script generated by tokenize.pl into your john-local.conf file as:

[List.External:Untokenize_omen]

Run the PCFG trainer on the tokenized training file.

-t = Training file. -r = Rule name. -c 0.0 = Only generate OMEN guesses
python3 trainer.py -t tok_omen_rockyou1.training -r tok_omen -c 0.0

This appeared to work correctly as can be seen when I view the CP.level (basically 4 letter substrings OMEN uses for NGRAM=4) in Visual Studio Code:

Test 3 Design:

For a target set of passwords, I'm going to use the same RockYou32 1 million subset of passwords (different from the training passwords), and the LinkedIn password dump. This will allow me to directly compare this attack to the previous attacks I ran.

To test the attack and generate the first 25 guesses I used the following command. Please note: It is very important to pipe the result of the pcfg_guesser into JtR to make use of the associated untokenize_omen rule.

python3 pcfg_guesser.py -r tok_omen | JohnTheRipper/run/john --stdin --external=untokenize_omen --stdout --max-candidates=25

The top 25 generated guesses were:

((Blank))
k05icty
butt
13babyg
john
6543
5555
b05ic1800ol1
bigb
18chsadfg
fu13cky
p18chsasw
blue
h0708ok
13mommy
s13coob
budd
augu
7777
6666
sn13oop
07or765
13compu
lu13cky
mysg

Note: Since OMEN works using "levels" it doesn't generate the most probable guesses first. Instead it generates all guesses at a specific level first. So for level 1 there are 104 guesses that can be created with this training set. You can see the keyspace per level in the PCFG rules file under Omen/omen_keyspace.txt. Still, this looks weird, and [[Spoiler Alert]] indicated a deeper problem with this attack run.

To actually run (and record) the attack I used the following command for the RockYou32 test set:

python3 pcfg_guesser.py -r tok_omen | ./john --stdin --external=untokenize_omen --stdout | python3 checkpass.py -t rockyou_32.txt --max_guesses 5000000000 -o tokomen_rock1_rock32_stats.txt

Test 3 Results:

Test 3a Discussion:

I ended up not running the test against the LinkedIn passwords, because ... Yikes. The TokOMEN attack only cracked 20,418 passwords. I was actually expecting it to struggle based on the first 25 passwords it created, but this was way worse than I expected.

As a quick check, I ran two short attacks (10 million guesses). One was a "normal" TokOMEN attack as above, and the other one was an intentionally "broken" TokOMEN attack without using the JtR External mode. I expected the second "broken" attack to totally fail as the tokenized guesses will be "junk", but that would at least tell me if the JtR External mode was working. The results of this were the "normal" attack cracking 15,920 passwords and the "broken" attack cracking 14,934 passwords. So the JtR External mode appears to be working to a degree.

Still, not great. Thinking it might be an issue with the current NGRAM setting, I reran the PCFG trainer using -n 2 (NGRAM=2). That's when I looked at the trainer output and noticed something going horribly wrong...

TLDR: The file encoding autodetect was going south due to the new tokens in the training data. This caused the pcfg-trainer to horribly misread many of the training passwords. The real results are even worse than the errors suggested since there are a million total training passwords, so many of the different training passwords were also incorrectly "merged". No wonder things went so wrong! Teaches me to ignore error messages past-me put into the code.

Setting the encoding to UTF-8 or ASCII made ... negative progress. The PCFG trainer was rejecting most of the Tokenized training data. Looking at my code, I quickly realized the cause of most of these errors was my "dirty training dataset reject function". Basically the tokenized training data looks like all the "junk" that normally shows up in real password dumps. The PCFG trainer includes logic to reject these "junk" passwords to generate more effective rulesets for cracking real passwords. Below is an example of some of the logic that the PCFG trainer uses to clean up training datasets.

Removing the sanity checks from the training data helped a bit, but ended up causing a problem when the PCFG trainer was trying to save the ruleset and write the OMEN data to disk:

Messing around with various options, I was able to get "different" errors, but no successful training runs. So there currently isn't an easy way to apply the Tokenizer attack to the current PCFG OMEN code.

Based on this, I started looking at the Rub-SysSec OMEN code, but that had similar issues. These issues were also compounded by the fact that the Rub-SysSec OMEN alphabet was hardcoded. So there isn't an easy option with that toolset either.

Test 3 Conclusion:

There currently isn't an easy way to apply the Tokenizer attack to current OMEN implementations. I think there is a lot of potential in incorporating "variable Markov order" aka "variable NGRAM" functionality into OMEN. I'm not convinced that a Tokenizer trainer such as tokenize.pl is the best way to go about that though, considering password length is a component in the OMEN level calculations. But I think the lessons learned by looking at tokenize.pl and seeing it applied to JtR's Incremental mode will carry over to however this approach is incorporated into OMEN.

Question 4) When performing research (or running real cracking sessions), what is a "good" ruleset and dictionary to use?

This question was inspired by the following comment/question I received on the Hashcat Discord channel.

The follow-up discussion led to a really good conversation where @Br0ken shared their cracking techniques, rules, and wordlists they used. This is a good example of where I personally really benefit from writing these blog posts since I learn a lot from the comments/discussions they generate.

Because of that I wanted to share/document some of the advice and links that came out of that conversation.

Good Wordlists:

Ignis-10m combolist

Link: https://github.com/ignis-sec/Pwdb-Public/tree/master/wordlists
Description: This is a combo-list made up of 19 million passwords from different data breaches. It can be easier to use than larger uncultivated combo lists such as the full Hashmob cracked list

Discussion of Rulesets and Wordlists:

Hashcat Discord Wordlist Daily Quiz

Link: https://docs.google.com/spreadsheets/d/1qQNwggWIWtL-m0EYrRg_vdwHOrZCY-SnWcYTwQN0fMk/edit?gid=524870023#gid=524870023
Description: This is a Google document that describes various tests that some members of the Hashcat Discord channel have run. It can provide a good starting point to better understand what wordlists and rulesets real password crackers are actually using.

Future Research Ideas:

The list of topics keeps growing and growing. Here are new items to throw on my backlog:

Figure out how to integrate variable length NGrams into OMEN attacks

This is a big take-away from this blog post. I don't know if these attacks will be better than current OMEN implementations, but it'll certainly be interesting to see their results.

Fix up py-omen so it actually runs

I'm kind of tempted to spend time on this so I can experiment with OMEN without having to deal with the full PCFG toolset

Perform tests to identify good rulesets and dictionaries for password cracking attacks

It may be important to separate this into good rulesets for research vs. real cracking attacks. The reason for this is that many of the newer rules and dictionaries are based on well documented password lists like RockYou and LinkedIn. That's perfectly ok for real cracking sessions, but can cause problems for research when you want to use those datasets to test against.

Analyzing JtR's Tokenizer Attack (Round 1)

2024-11-17T14:11:00.002-08:00

Introduction / Goals / Scope:

This is a follow-up to my previous blog post looking at how to install/run the new John the Ripper Tokenizer attack [Link]. The focus of this post will be on performing a first pass analysis about how the Tokenizer attack actually performs.

Before I dive into the tests, I want to take a moment to describe the goals of this testing. My independent research schedule is largely driven by what brings me joy. Because of that I'm trying to get better at scoping efforts to something I can finish in a couple of days. It's easy to be interested in something for a couple of days! Therefore, my current plan is to run a couple of tests to get a high level view of how the Tokenizer attack performs and then see where things go.

To that end, this particular blog post will focus on three main "tests" to answer a couple of targeted questions.

Test 1: Analyze how sensitive Tokenizer is to the size of the training data

Question: How sensitive is the Tokenizer attack to being trained on 1mil, or 30+ mil passwords?
Impact: Knowing this is important since it determines if the Tokenizer attack can be effective when trained on smaller datasets. This could be a community or language specific target, or a dataset targeting a specific password creation policy.
Secondary Reason: Identifying early on how sensitive Tokenizer is to the training size will help inform other testing options I have available to me. For example, can I train it on a subset of RockYou passwords, and then test it against a different subset from that same breach? Also, full disclosure, I made a mistake somewhere along the line of training the Tokenizer in my previous blog post that led me to think it was more sensitive to the training data size than it actually was.

Test 2: Compare a short (5 billion guess) Tokenizer attack against Incremental and OMEN.

Question: How does the Tokenizer attack compare to other Markov based attacks?
Impact: This will provide a quick gut check on if there is value in the tokenizer as-is or if this is more an academic tool to learn from. Aka should I start to incorporate it into my password cracking attacks now, or is it more like the neural network GAN attacks [Link] which were interesting research and a basis to build upon, but are worse than current methods in every way?
Limits on Scope:

I'm sticking to OMEN and Incremental since they are very similar attacks to tokenizer.
There absolutely are other attack types I could run, such as Hashcat's Markov, mask attacks, JtR's --Markov mode, PRINCE, etc. To address this, I'm going to use standard training/test datasets so that way you can compare these other attacks to the Incremental/OMEN results to extrapolate how they would perform compared to Tokenizer.
There are also a ton of variations of these attacks! For example I could use reduced character sets such as "lowernum" vs. just training on the full set of passwords in the training lists. I'm going to defer that type of experimentation for now and hopefully revisit it when digging into how to optimize cracking sessions.

Test 3: Compare Tokenizer and CutB as Part of a Larger Password Cracking Session

Question: How does Tokenizer fit in with a larger password cracking session where various wordlist attacks have already been run?
Impact: "Brute-Force" attacks like Incremental are usually run after wordlist attacks have been exhausted. Therefore it's important to understand how Tokenizer performs after all the "easy" passwords have already been cracked.
Note 1: I'm going to be comparing Tokenizer against CutB since that is often used in "throw the kitchen sink" sessions such as those in EvilMog's random ad-hoc methodology described [here].

Tests:

Note on Testing Tools:

The primary testing tool suite I'm using to analyze password cracking success is checkpass.py [Link]
Checkpass works using plaintext passwords and generates statistics about how effective a password cracking session is. I can then paste those statistics into Excel to generate graphs.
When performing analysis on hashed passwords, this means I need to crack them first. This can be done in a couple of different ways:

If I've performed a lot of password cracking on the list before and have it at around 96% success rate I can generally use those plains without having to worry too much about the 4% of uncracked passwords
I can also download wordlists from hashmob [Link] that will often achieve a high success rate since most of the lists I deal with are already on hashmob's public cracking targets.
Finally, I can simply run the attacks I'm analyzing twice, once as a real password cracking attack, and the second time against the plains using checkpass to make some nice graphs.

Below is an example of how I run checkpass.py and use that to generate these graphs. Note: Checkpass can also create a list of uncracked passwords. This is helpful since it lets me chain together different attacks to simulate more complex cracking sessions.
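For readers who want the gist without the tool, here is a minimal Python sketch of the kind of bookkeeping described above. It is not checkpass.py: the function name score_guesses, the report_every parameter, and the choice to count duplicate target passwords as separate cracks are all assumptions made for illustration.

```python
from collections import Counter


def score_guesses(guesses, targets, report_every=1000):
    """Replay guesses against plaintext targets.

    Returns (stats, remaining): stats is a list of
    (guess_count, total_cracked) rows suitable for graphing, and
    remaining is a Counter of the passwords that were never guessed.
    """
    remaining = Counter(targets)  # duplicates count as separate accounts
    stats = []
    cracked = 0
    count = 0
    for count, guess in enumerate(guesses, start=1):
        # pop() removes the password so a repeated guess is not counted twice.
        cracked += remaining.pop(guess, 0)
        if count % report_every == 0:
            stats.append((count, cracked))
    if count % report_every != 0:
        stats.append((count, cracked))  # final partial interval
    return stats, remaining


stats, uncracked = score_guesses(
    ["123456", "password", "zzz"],
    ["password", "password", "letmein"],
    report_every=2,
)
print(stats)
print(sorted(uncracked.elements()))
```

The leftover Counter plays the role of the "uncracked passwords" list: feeding it in as the targets of a second call simulates chaining one attack after another.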

Test 1: Analyze how sensitive Tokenizer is to the size of the training data

Training: RockYou

Note on RockYou Dataset: The RockYou dataset contains duplicate passwords as well as all the encoding weirdness found in the original dump. I randomized the order of the passwords in it to avoid any correlations between passwords present in the original dump, and split it into 32 1-million subsets to allow training/testing against different passwords.

Tokenize1: Trained on a 1 million subset of RockYou
Tokenize2: Trained on a different 1 million subset of RockYou
Tokenize_Full: Trained on the full set of 32 million+ RockYou passwords
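The shuffle-and-split step behind these subsets can be sketched in a few lines of Python. This is an illustration rather than the script actually used: the function name shuffled_chunks and the seed parameter are hypothetical, and reading the dump from disk (including its encoding quirks) is left out.

```python
import random


def shuffled_chunks(passwords, chunk_size, seed=None):
    """Shuffle a password list and split it into fixed-size subsets.

    Duplicates are kept on purpose so that each subset preserves the
    frequency information of the original dump.
    """
    pool = list(passwords)
    random.Random(seed).shuffle(pool)  # a seed makes the split repeatable
    return [pool[i:i + chunk_size] for i in range(0, len(pool), chunk_size)]


demo = shuffled_chunks([str(n) for n in range(10)], chunk_size=4, seed=1)
print([len(chunk) for chunk in demo])
```

With chunk_size set to 1,000,000, the full RockYou list would yield 32 full subsets plus one smaller remainder; any two of them can then serve as disjoint training and test sets.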

Testing: LinkedIn 2012 Data Breach

Notes on LinkedIn 2012 Dataset:

Origin: There are several different LinkedIn datasets from the 2012 LinkedIn data breach [Link]. For this test, I'm going to use the original dump that only included around 6.4 million hashes. This dump also had malformed hashes where the first 5 hex characters of the hashes were replaced by 0's. I'm using this dataset vs. some of the later (and larger) datasets since it's been analyzed in many different academic papers.
Obtaining the List: You can download the list from skullsecurity [Link]. I probably should compare my copy of the list to that one, so there might be some differences, but I figure it's important to point out where other researchers can get a copy.
Cracking the List: You can crack the list using the default Hashcat raw-sha1 format since by default Hashcat ignores the first five hex characters of the hash. I wrote about that more [here]. If you are cracking these hashes in John the Ripper, you need to use the format "raw-sha1-linkedin".
Obtaining plains: For this attack I was curious how effective the Hashmob plains list would be. Hashmob is a collaborative password cracking site that has some very skilled members (they won this year's CMIYC competition). So I decided to try it out and promptly fell down a rabbit hole. Before I detour into that research, let me finish up the dataset description.
Size of Dataset vs. Cracks: 6,458,020 passwords / 5,980,436 cracked. 92% success rate.
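As a rough illustration of what checking a guess against one of these malformed hashes involves, here is a Python sketch. It assumes the zeroed region is the first five hex characters of the SHA-1 digest and that candidates are UTF-8 encoded; the function name linkedin_match is hypothetical and this is not how Hashcat or John the Ripper implement the comparison.

```python
import hashlib


def linkedin_match(candidate, dumped_hash, masked_hex=5):
    """Compare a guess to a dumped SHA-1 while ignoring the masked prefix."""
    digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
    # Only the unmasked tail is compared, so both intact and
    # zero-prefixed hashes can be tested the same way.
    return digest[masked_hex:] == dumped_hash.lower()[masked_hex:]


full = hashlib.sha1(b"password").hexdigest()
masked = "00000" + full[5:]  # simulate an entry with a zeroed prefix
print(linkedin_match("password", masked), linkedin_match("passw0rd", masked))
```

The encoding assumption matters for the non-ASCII passwords discussed in this post: the same visible characters hash differently under different encodings.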

Total Side Tangent on LinkedIn List + Hashmob Wordlists:

I'll be up front: Given the age of this dataset and the speed of the underlying hashing algorithm (raw-sha1), I was expecting the hashmob wordlist to crack over 96% of the hashes. So after seeing so many uncracked passwords, I decided to run a standard PCFG attack against the remaining hashes just to perform a sanity check. To my surprise I got a few quick hits almost immediately:

Noticing all the new cracks had non-ASCII characters, I then started up a new attack using the included Russian ruleset:

These aren't complicated passwords. For example, I believe снейка means "snake" in Russian. (EDIT: I've since been informed that Google Translate was wrong and that is just a made-up word). Wanting to dig into this more, I then ran my cracked list from 2014 when I was investigating this list against the left list.

The actual cracked list was much longer, but what's interesting was that almost all of the new (or really old depending on how you look at it) cracks were of e-mail addresses. I talked with a couple other researchers, one of which graciously provided me his cracked list, and I saw similar results. More e-mail addresses and more non-ASCII cracked passwords.

Current Theory: I suspect the Hashmob team strips e-mail addresses from the plain/cracked wordlists they provide to the public. I also suspect they run into issues creating a wordlist with all the weird encoding issues found with passwords in the wild, so their wordlist has gaps in non-ASCII cracks. I want to stress, all of these gaps are 100% totally reasonable, and when it comes to stripping e-mail addresses, commendable! But it's something to keep in mind when using these lists to conduct academic research.

Impact to these tests: While I'd like to have a higher crack percentage, given the fact that so many of the uncracked passwords likely contain non-ASCII characters or are e-mail addresses, this shouldn't have a big impact when analyzing how tokenizer performs. This is because as configured, my tokenizer attacks are unlikely to crack very many of these uncracked passwords. In the future I might run another "real" test of tokenizer against these hashes, but I'm going to put that off until I spend more time validating/improving my testing tools.

Test 1 Results:

Test 1 Analysis:

The two tokenizer attacks trained on 1 million passwords performed very similarly (you almost can't see the second line on the graph). This is a good result since it points to being somewhat resilient to minor differences in the training data. You will notice though that the tokenizer attack trained on the full 32 million RockYou passwords does perform noticeably better.

There's a lot of additional questions that come to mind about this, but I'm going to let these results stand alone for your interpretation and move on to the next set of planned tests.

Bonus Analysis and Correction:

In my previous post I posted the first 25 guesses my training of tokenizer produced, and it looked "weird". SolarDesigner replied with what they were seeing when running their own copy which was very different (and looked more like what I originally expected) [Link]. I reran all my training, and then started getting similar results to Solar. Long story short, somewhere along the way with my troubleshooting and figuring out this attack I made a mistake. Here are the updated results of the first 25 guesses generated by tokenizer with the Rockyou training data above, along with the results Solar provided:

The guesses highlighted in green are guesses that were shared with one of the other training runs.

Test 2: Compare a Tokenizer attack against Incremental and OMEN

Training:

All three attack modes were trained on the same 1 million subset of RockYou passwords

Tokenizer1: Trained on a 1 million subset of RockYou as described previously
OMEN1: Trained on the same 1 million subset of RockYou passwords, using the OMEN attack mode built into the PCFG toolset [Link]. While you can specify during training to only generate guesses using OMEN, I took a shortcut and just modified the grammar.txt file of the ruleset to only include "M" (Markov) replacements. This way the PCFG cracker will only generate guesses using OMEN. As a disclaimer, while the PCFG OMEN implementation uses the same underlying principles as the original Rub-SysSec version, they differ in a number of ways and will generate different guesses.
Incremental=Rockyou1: Trained Incremental mode on the same 1 million subset of RockYou passwords. This is roughly equivalent to Incremental=ASCII since I didn't apply a filter, which means guesses included upper/lower alpha as well as digits and special characters.

Testing:

Test 2a: Testing against a different 1 million password subset of the RockYou list. Aka this is a different subset than what the attacks were trained upon
Test 2b: Testing against the 2012 LinkedIn list (I wasn't planning on running this test, but after looking at the results of Test #1, I was really curious).

Test 2a Results:

This was interesting, but you really can't see what's going on at the start of the password cracking session. So the next graph is the same test/data, but just zoomed in to the first 20 million guesses.

Test 2b Results:

Test 2(AB) Analysis:

Not a lot of surprises here, which is good. OMEN is a very effective attack mode so that was always a tough one to beat. The challenge with OMEN is the lack of an indexing function (aka being able to tell it "generate password at position 2941932"), which leads to complications with pausing/restarting cracking sessions. So I generally use Incremental mode in my real password cracking sessions. It's just easier. Which means that having the Tokenize attack improve upon standard Incremental mode is a big deal.

Side note: I try to point this out whenever talking about OMEN, but you'll notice the sawtooth success rate as it tends to crack more passwords at the start of each OMEN "level". This highlights significant room for improvement if any researchers want to look into this. Ideally you'd like to have a smoother graph that frontloads all your effective guesses near the beginning of your cracking session.

Test 3: Compare Tokenizer and CutB as Part of a Larger Password Cracking Session

For this last test I wanted to simulate a larger cracking session. For this I'm loosely going to base my attacks on EvilMog's "Random AD Methodology" described [Here]. By loosely I mean I'm just going to simulate the first three steps:

run rockyou with -g 100000 or all the rulesets combined
(Comparison point) run expander (modified to max at 8 or 10), and then run -a1
(Comparison point) run cutb with -a1

For the first step, I'm going to use the full RockYou wordlist (only unique words) and the "Hashcat" ruleset in John the Ripper. I figure that gets close to the intention of step #1 without having to resort to making 100k random rules up on the spot.

The John the Ripper "Hashcat" ruleset is actually a collection of rules from the Hashcat repo modified to work with JtR:

[List.Rules:hashcat]
.include [List.Rules:best64]
.include [List.Rules:d3ad0ne]
.include [List.Rules:dive]
.include [List.Rules:InsidePro]
.include [List.Rules:T0XlC]
.include [List.Rules:rockyou-30000]

.include [List.Rules:specific]

The challenge from an analysis perspective is that these attacks generate an absolute ton of guesses! The main reason for the large number of guesses is there are a lot of rules in all of these rulefiles and the RockYou input wordlist at 14 million+ words is pretty hefty. There is room for improvement though since this combined mangling rule list isn't optimized. For example, all of these rules files are designed to be run individually. So there is a significant overlap in mangling rules between them which generates a large number of duplicate guesses. A smaller nitpicky point is that none of these attacks have "reject" functions built into them, so every mangling rule is applied to every input word regardless of whether the mangling rule would actually change that word. The reason I'm highlighting this isn't to criticize the rules. I simply want to point out there are areas to improve if anyone wants to dive into that (spoiler: I do not).

Ignoring that digression, I guess what I'm trying to say is if I ran this attack with the Rockyou wordlist on my research laptop and piped it into checkpass.py (which itself can be a bit slow), the attack would take me around two weeks to complete. To that end, I ran a "quick" attack of just 5 billion guesses which gets through the best64 ruleset and into d3ad0ne ruleset using checkpass.py simply because I wanted to compare that to my previous graphs. I then launched all these attacks for real on a different computer to create a potfile of all the passwords cracked using these attacks.

(Future Improvement): Hashcat supports the ability to record "guess position" in the outfiles (potfiles) it generates. I've never really used that, but I plan on looking into that feature in a future "improve my testing process" research sprint. For now though, it's just easier to launch JtR and let it run while I do other things.

While I could be more scientific about it, given the 14 million+ word wordlist (Rockyou-Unique) and the Best64 ruleset (which has slightly more than 64 rules), the Best64 ruleset finishes up somewhere around 1 billion guesses, which is pretty evident from the graph above. The other Hashcat rulesets are not nearly as optimized. This does highlight though that starting a password cracking session off with a "smart" dictionary attack is still one of the best ways to crack passwords quickly.

As I mentioned, I then ran the full cracking session to completion using John the Ripper against the hashed LinkedIn passwords. I'll be using the found/non-found lists from that full run in the following tests. The results of running the full Hashcat rules attack vs. LinkedIn can be seen below.

Success Ratio for Full Hashcat Rules vs. LinkedIn:

3,140,344 of 6,458,020 passwords cracked. (48.64% success rate)
As comparison: with all my attacks and the wordlists downloaded from Hashmob, I have 5,980,436 passwords cracked. So this attack is respectable, but there's certainly room to crack more passwords.

Introduction to Hashcat Utils:

For this test, steps #2 and #3 involve using expander and cutb. If you are not familiar with these tools, they are part of Hashcat Utilities [Link].

While you can build the tools in Hashcat Utilities from source [Link], the latest release binaries are available [Here].

As to what Hashcat Utilities are, you can get more detailed information from the first link above, but at a high level they are a set of tools that each perform one specific task. Many of them can be chained together (or used stand-alone) to create targeted wordlists which is how we'll be using them in this experiment.

Expander: This tool mangles and creates new combinations of words from individual characters found in each word in the input dictionary. The actual operation is a bit weird, but imagine you wrote the input word on a piece of paper and then folded the paper into a circle so the word is like a bracelet. Expander then creates new words by taking cuts out of that bracelet. So "password123" can generate the guess "3pas" as it wraps around. By default it will generate all 1-4 letter combinations from the input wordlist that is piped to it. Here is an example of me running expander with one input "word".

echo password123 | ./expander.bin

Expander will then return the following output (only showing a sample as the full output is 40 unique words):

p
a
<Cut>
3
pa
ss
wo
<Cut>
ssw
ord
123
ass
<Cut>
pass
word
assw
<Cut>
ord1

Side note: I was really surprised by guesses Expander didn't make. For example "23pa" was not generated. So it's not an exhaustive list and there are some exceptions in the substrings it generates.
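The bracelet picture can be written down directly as a Python sketch. This is an idealized model of the description, not a port of expander's source, and the function name bracelet_cuts is hypothetical. Because it takes every wrap-around cut, it produces a superset of what the real tool printed in the run above, including "23pa".

```python
def bracelet_cuts(word, min_len=1, max_len=4):
    """Return every circular substring of `word` with the given lengths."""
    cuts = set()
    n = len(word)
    for length in range(min_len, min(max_len, n) + 1):
        for start in range(n):
            # The modulo wraps past the end of the word, like a bracelet.
            cuts.add("".join(word[(start + i) % n] for i in range(length)))
    return cuts


cuts = bracelet_cuts("password123")
print(len(cuts), "3pas" in cuts, "23pa" in cuts)
```

Comparing this idealized set against real expander output is a quick way to see exactly which cuts the tool leaves out.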

While Expander will by default only generate 1-4 letter guesses, you can increase this by changing a macro variable in the source and recompiling it. Some people will have multiple versions of expander built with the length of guesses they generate appended to the filename. For example "expander8.bin". Another approach to make longer guesses without having to recompile the code is to combine multiple runs of "length 4" expander using Hashcat's combinator mode (attack mode "-a 1") to generate longer password guesses.

Expander is the basis of what's been called a "Fingerprint" attack. This was first described by pure_hate in the following blogpost where they used it as part of the 2010 CMIYC competition [Link]. A more modern take and example of using a Fingerprint attack can be found [Here].

Now, you generally need to be selective in the input wordlists you feed to Expander since this attack can very quickly get to the point where it's almost equivalent to a full dumb brute-force attack. You also need to make sure you "sort -u" the outputs of Expander since it often generates a ton of duplicate guesses. Because of this, I generally wouldn't recommend using Expander on normal password cracking wordlists. Instead, people will often use Expander on previously cracked passwords to get new cracks. For example:

Remove the hashes from a standard hashcat potfile and save the results in plains.txt. Note: Unlike John the Ripper's "--show" command, this will output everything in the potfile vs. generating individual lines for each target hash.

cat hashcat.potfile | cut -d: -f2- | sort -u > plains.txt

Pipe the plains into expander to create the "base" wordlist.

cat plains.txt | expander | sort -u > plains_expanded.txt

Run a basic hashcat combinator attack (-a 1) using the plains_expanded.txt wordlists

hashcat -m HASH_MODE -a 1 TARGET_HASHES.hash plains_expanded.txt plains_expanded.txt

To continue to build this out and target passwords greater than 8 characters long you can re-run variations of the above commands as follows:

Generate a wordlist of all 8 character long Expander generated words:

hashcat --stdout -a1 plains_expanded.txt plains_expanded.txt | sed -n '/.\{8\}/p' | sort -u > plains_expanded_8.txt

Generate guesses 9-12 characters long in Hashcat

hashcat -m HASH_MODE -a 1 TARGET_HASHES.hash plains_expanded_8.txt plains_expanded.txt

You can keep building this process out for longer guesses. Now you know how to run a fingerprint attack!
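For anyone unfamiliar with what the combinator step is doing, the Python sketch below mimics it on a tiny list. It only illustrates the idea behind "-a 1" and the length filter; the function names combinator and exact_length are hypothetical, the guess order does not claim to match Hashcat's, and real runs should use Hashcat itself for speed.

```python
from itertools import product


def combinator(left_words, right_words):
    """Yield every left+right concatenation (the idea behind -a 1)."""
    for left, right in product(left_words, right_words):
        yield left + right


def exact_length(words, length):
    """Keep only unique words of one length, like the sed + sort -u step."""
    return sorted({w for w in words if len(w) == length})


base = ["pass", "word", "12"]          # stand-in for plains_expanded.txt
combos = list(combinator(base, base))  # candidates of 4 to 8 characters here
eight = exact_length(combos, 8)        # stand-in for plains_expanded_8.txt
print(len(combos), eight)
```

The number of candidates is the product of the two list sizes, which is why the size of the expanded wordlist matters so much for this attack.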

CutB: This tool allows you to "cut" substrings from an input wordlist for use in hashcat combinator and hybrid (rule-based) attacks. It's a lot easier than piping your wordlists into sed, awk, or other Linux tools to retrieve substrings. I'd recommend checking out the Hashcat wiki for info on how to use it, but at a high level you can give it two numbers on the command line to specify which substrings you want to extract. Aka:

echo password123 | ./cutb.bin 0 4

Result: pass

echo password123 | ./cutb.bin 4

Result: word123

echo password123 | ./cutb.bin -4

Result: d123

Often CutB will be run in a script to generate many, many, different subsections of a password guess. You may notice that CutB is pretty similar in operation to Expander, but it allows you much more flexibility to be somewhat targeted about how you apply your cuts.

Side note: CutB's code is weird, and it won't always perform like you'd expect. For example:

echo password123 |./cutb.bin -5

Result: rd123

echo password123 |./cutb.bin -6

Result: ord12

echo password123 |./cutb.bin -7

Result: word

I really don't know what's going on with those two last guesses.....
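For comparison, here is what plain string slicing would give for the same inputs. This Python sketch is an idealized model of "offset plus optional length", not CutB's actual logic, and the function name ideal_cut is hypothetical.

```python
def ideal_cut(word, offset, length=None):
    """Slice `word` from `offset` (negative counts from the end)."""
    piece = word[offset:]
    return piece if length is None else piece[:length]


for args in [(0, 4), (4,), (-4,), (-5,), (-6,), (-7,)]:
    print(args, ideal_cut("password123", *args))
```

Under this model the offsets -6 and -7 would return "ord123" and "word123", so the "ord12" and "word" results above are where CutB departs from simple slicing.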

Description of Test 3 Attacks:

Tokenizer_RockyouFull:

I'm going to use the version of Tokenizer trained on the full list of 32 million+ Rockyou Passwords

Tokenizer_LinkedinPot:

This version of Tokenizer is going to be trained on the LinkedIn passwords cracked during the Hashcat rules wordlist attack using the Rockyou_Unique wordlist. Aka I'm training it on the potfile from a previous attack.
I'm including duplicated guesses in the training set by generating a list using "./john --show --format=raw-sha1-linkedin --pot=TESTING_POTFILE"
The goal of this attack is to try and make a direct comparison of Tokenizer to CutB and Expander

Expander:

This attack will use Hashcat Utils: Expander to create a wordlist based on uniquely cracked passwords from the Hashcat rules wordlist attack against Linkedin.

The resulting wordlist (after sort -u is run on it) has 1,854,331 lines.
Pow(96 characters, 4) = 85 million(ish), and this wordlist included non-ASCII characters as well. This means while it is large, the wordlist generated by Expander still represents a significant reduction from a true brute force attack.

This attack will be run using Hashcat's combinator attack "-a 1" as described above.

I'm only doing this first run of expander that will create guesses 2-8 characters long since even this basic attack won't complete in the first 5 billion guesses.
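The numbers behind that statement can be checked with some quick arithmetic in Python. The sketch below uses only figures quoted in this post and assumes the combinator attack pairs the Expander wordlist with itself, so its keyspace is the list size squared; the variable names are just for illustration.

```python
PRINTABLE = 96                 # character count used in the estimate above
MAX_CUT_LEN = 4                # expander's default maximum length
EXPANDER_WORDS = 1_854_331     # unique lines in the Expander wordlist
GUESS_BUDGET = 5_000_000_000   # guesses allowed per attack in this test

# Every possible 4-character string over 96 symbols.
brute_force_len4 = PRINTABLE ** MAX_CUT_LEN

# Combining the wordlist with itself multiplies the list sizes.
combinator_keyspace = EXPANDER_WORDS ** 2

# Fraction of the combinator keyspace the guess budget can reach.
share_covered = GUESS_BUDGET / combinator_keyspace

print(f"4-char brute force keyspace: {brute_force_len4:,}")
print(f"combinator keyspace:         {combinator_keyspace:,}")
print(f"share covered by the budget: {share_covered:.4%}")
```

The squared wordlist lands in the trillions, so a 5 billion guess budget reaches well under one percent of it. That is worth keeping in mind when reading the Expander results.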

CutB:

This is going to use CutB trained on uniquely cracked passwords from the Hashcat rules wordlist attack against Linkedin.
Following the cutb.sh script [Link] in Evilmog's Hashcat scripts, cutb will create two lists that take cuts from both the front and back of the input words. Pseudocode below:

for x in range(1,8): cutb 0 x
for x in range(1,8): cutb -x

The lists will then be combined and run through "sort -u" to remove duplicates.

The resulting wordlist contains 7,476,636 lines. These lines range from 1 to 8 characters long. So this is a bigger wordlist than Expander, but it also can generate longer guesses.
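The list-building step can be sketched in Python as follows. This is not the cutb.sh script: it uses plain slicing instead of calling cutb, so any quirks in the tool's negative offsets are not reproduced. The function name front_back_cuts is hypothetical, and the inclusive upper bound of 8 is an assumption taken from the line lengths reported above.

```python
def front_back_cuts(words, max_len=8):
    """Collect unique prefixes and suffixes of 1..max_len characters."""
    cuts = set()
    for word in words:
        for n in range(1, max_len + 1):
            cuts.add(word[:n])    # front cut, like "cutb 0 n"
            cuts.add(word[-n:])   # back cut, like "cutb -n"
    return sorted(cuts)           # set + sorted stands in for sort -u


print(front_back_cuts(["password123"], max_len=3))
```

Words shorter than the requested cut simply contribute themselves, which the set then deduplicates.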

The actual attack will be run using the default PRINCE settings in John the Ripper. For more information about PRINCE, see my blogpost [Here].

Description of Test 3 Target:

All attacks will be run against the remaining uncracked passwords from the 2012 LinkedIn password list after the JtR Hashcat rules with Rockyou-Unique wordlist have been run against it. Each attack will be run for 5 billion password guesses. This is a very short runtime for these attacks. Normally these attacks will generate trillions of password guesses. Future testing might include Hashcat's outfile debugging formats or running the attacks for a set time (days/weeks), but I figure 5 billion guesses can start to indicate how these attacks will compare to each other.

Test 3 Results:

Quick summary of results:

Tokenize RockyouFull: 8,423 cracked
Tokenize LinkedInPot: 14,984 cracked
Expander: 141 cracked!!!
CutB: 7,344 cracked

I didn't expect Expander to do very well given the short number of guesses, but this low number really shocked me. I'm pretty sure just creating random wordlist rules using "hashcat -g 100000" would be more effective.

As for the graph of the results, see below. As a disclaimer, due to the small number of cracks vs. the total size of the list, don't read too much into it:

Analysis of Test 3 Results:

While it's never fun to say that the biggest finding is that your test setup is flawed, that's my main takeaway from these tests. When looking at the results, 5 billion guesses is way too low a number to analyze these attacks after trillions of guesses have been made running wordlist attacks. Going back to Test 2 (quick disclaimer: this is not a direct comparison due to different training sets for Tokenizer), Tokenizer cracked over 1 million passwords when it was run as the first attack. So when it cracks just 14k passwords beyond what the Hashcat Rules based attack already found, that shows a strong overlap in the guesses that these two attacks are making.

This is a long way of saying, after an initial very long run using the Hashcat Rules attack against LinkedIn, I don't expect any non-wordlist based attack to do very well for just 5 billion guesses. So while it's easy for me to make fun of Expander, I really can't make any definitive statement about how these attacks perform in real life unless I run a cracking session that represents several days with a GPU.

Looking at the bright side, I'm glad I ran this test. It forced me to better understand some of the tools in Hashcat Utilities, and helped me start to identify what future tests should look like as well as gaps in my testing strategies.

Future Research Ideas:

I'll be up front: The holidays are coming up, and I have a lot of other research items I'm working on that I would like to finish up [Spoiler/Link]. This basically means that while there are a ton of unanswered questions from this blog post, I'm probably not going to get around to investigating them anytime soon. As a note to my future self though, here are a couple of topics that jump out to me:

Develop a process to track/analyze longer password cracking sessions.

My gut feeling is this will require using Hashcat's output options to print guess positions for new cracks. I spent a lot of time looking at JtR's log format but I don't think I saw an equivalent guess position option.
This is a general problem for academic research. Either the sessions modeled are very short (several billion guesses), or some alternative method such as Monte Carlo estimation is used to predict how effective a longer password cracking session would be. Disclaimer: I'm very skeptical about the accuracy of the Monte Carlo estimations. But I'm willing to be convinced otherwise if someone can run a real session and the results roughly match the estimates.

Investigate how to optimize the "guessing budget" in OMEN levels to smooth out its cracking graph and move more effective rules earlier in the cracking session.

As I mentioned earlier, that sawtooth graph highly implies that there is a lot of room for optimization within the OMEN attacks.

Incorporate the Tokenizer approach into OMEN.

Besides the general OMEN improvements above, I think the Tokenizer approach shows a lot of promise for improving Markov based attacks by adding variable length Markov orders into them.
There's probably an academic paper that can be written on this. If you are a research student thinking about this and want an advisor or consult, drop me a line as I have a lot of thoughts about this.

Further test/improve Tokenizer attacks

That was the original goal of this blogpost before I totally got sidetracked!
I think this attack is cool and I might start to incorporate it into my normal password cracking workflow. So any improvements to make it more effective are always welcome!

Make Tokenizer Attacks Easier to Run.

I think the Tokenizer attack is a really cool improvement to John the Ripper's Incremental mode attacks. Using this attack will improve your password cracking success rate.
The challenge is that, due to the complications of getting this to run, I'm very doubtful about how many people will take advantage of this improvement.
Ideally tokenizer attacks should be run exactly like Incremental attacks, and the "external mode" requirement should be hidden from the user.
I'd also like to make training a new tokenizer attack easier:

It would be nice to train Incremental mode attacks from a list of plaintext passwords as well as from a potfile.
There's a couple of different manual steps required to train a Tokenizer attack. It would be helpful to combine them together so only one command needs to be run, (besides updating your john-local.conf to include the attack).

Create a John the Ripper "Optimized" version of the Hashcat Ruleset

Delete duplicate rules between the different modes
Re-organize a lot of the rules to make them easier to see, and make use of JtR's rule preprocessor
Add reject functions to the rules so they won't be run if they wouldn't modify the input word.

Figure out what's behind the "weirdness" in the guesses CutB and Expander generate

Both of these tools don't generate guesses the way I'd expect them to based on their readmes. Examples of that can be found in my write-up above.
I don't know if this "weirdness" is intentional, but it might be useful to look into them to see if there might be improvements that can be made.

Running JtR's Tokenizer Attack

2024-10-29T19:08:00.000-07:00

Disclaimer 1: This blog post is on a new and still under development toolset in John the Ripper. Results depict the state of the toolset as-is and may not reflect changes made as the toolset evolves.

Disclaimer 2: I really need to run some actual tests and password cracking sessions using this attack, but I'm splitting that analysis up into a separate blog post. Basically I have enough forgotten drafts sitting in my blogger account that I didn't want to add another one by trying to "finish" this post before hitting publish. So stay tuned for new posts if you want to see how effective this attack really is.

Introduction:

It's been about 15 years since I last wrote about John the Ripper's Markov based Incremental mode attacks [Link] [Link 2]. 15 years is a long time! A lot of work has been done applying Markov based attacks to password cracking sessions, ranging from the OMEN approach to Neural Network based password crackers. That's why I was so excited to see a new proof of concept (PoC) enhancement of JtR's Incremental mode that was just published by JtR's original creator SolarDesigner.

The name of this enhancement is likely going to change over time. Originally it was described in an e-mail thread as "Markov phrases". That's not a bad descriptor, but doesn't really get to the heart of the current PoC. Therefore in this blog post I'm going to call this attack after the new script SolarDesigner released (tokenize.pl) and just refer to it as a "Tokenizer Attack". I think this gets closer to conveying how the underlying enhancement differs from the original Incremental attack which the tokenizer attack is built on.

The next question of course is "What does this new enhancement do?" To take a step back, what makes Markov attacks "Markovian" is they represent the conditional probability of tokens appearing together. For example, if you see the letter 'q' in an English word, the next letter is very likely to be a 'u'. How far back you look to apply this probability is measured by the "order" of the Markov chain. So the above example would describe a first-order Markov process. If we extended this and said that given a 'q' and then a 'u', the next most likely character would be an 'e', now we are talking about a second-order Markov process. Where this starts to play out compared to a first-order process is that 'qu' -> 'e' with a high probability but the highest probability next character for the substring 'tu' might be 'r'. Therefore the highest probable letter following a 'u' might change based on the letter before the 'u'. Basically the order represents the "memory" of the Markov chain. It can remember X previous tokens of the string it is generating.
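To make the "order" idea a bit more concrete, here is a minimal Python sketch that counts next-character frequencies for a given order. The word list and function names are made up for illustration, and this is not how JtR's Incremental mode is implemented. It just shows how the most likely letter after a 'u' can change once the model remembers one more character.

```python
from collections import Counter, defaultdict

def train(words, order):
    """Count which character follows each context of `order` characters."""
    model = defaultdict(Counter)
    for word in words:
        for i in range(order, len(word)):
            context = word[i - order:i]
            model[context][word[i]] += 1
    return model

def most_likely_next(model, context):
    """Return the most frequent next character for a context, or None."""
    counts = model.get(context)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Toy training data (an assumption, chosen only to illustrate the point).
WORDS = ["queen", "quest", "query", "quench", "turn", "turf", "turbo"]

first_order = train(WORDS, 1)    # remembers one previous character
second_order = train(WORDS, 2)   # remembers two previous characters

print(most_likely_next(first_order, "u"))    # one answer for every 'u'
print(most_likely_next(second_order, "qu"))  # answer depends on what came before 'u'
print(most_likely_next(second_order, "tu"))
```

With this toy data the first-order model has to pick a single winner for everything that follows a 'u', while the second-order model can give different answers for 'qu' and 'tu'.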

The reason I bring this up is that JtR's Incremental mode works with trigraphs which "roughly" can be represented by second-order Markov processes. I use "roughly" since like most Markov implementations, Incremental mode contains nuances and differences from the abstract/academic definition of Markov based processes. Those nuances aren't super important for this current investigation/blog-post so I'm going to gloss over them for now.

One interesting area of research is implementing "variable order" Markov processes. Aka some particularly likely substrings might be created by a third/fourth/fifth/sixth order Markov process, but other less likely transitions might be implemented in a lower order (first/second) Markov process. As an example of that, if you were using a second order Markov chain the initial letters 'or' might generate a next letter of 'k' with a high probability, but if you take a larger step back and see the earlier letters are 'passwor' then the next letter will almost certainly be a 'd' instead. Based on this, it's tempting to "apply the largest memory possible" to your Markov processes. The limiting factor though is if you extend things out too much you run into overtraining issues, not being able to generate substrings not seen in the training data, and general implementation issues (the size of your Markov grammar explodes). So being able to dynamically switch how much memory/state you are keeping track of can be very helpful when generating password guesses. While more research is needed, this is my current theory why the CMU Neural Network Markov based attacks [Link] outperform other Markov implementations. I strongly suspect the Neural Network makes use of a variable order Markov process when generating guesses.

That's a lot of words/background to say that JtR's Tokenizer attack can be thought of as a way to incorporate variable length Markov orders into JtR's current Incremental mode attack. It does this by identifying certain likely substrings (aka "tokens") and then replacing them in the training set with a "placeholder" character. So for example the likely substring "love" might be replaced with the hex value of x15. This would normally be an unprintable character in ASCII (NAK), but it allows JtR's normal Incremental mode charset to be trained using these replacements. The resulting Incremental charset will have probability information for generating the character "x15" (NAK) as part of a password guess. Now (NAK) isn't actually part of a real password (unless the user did something really cool). This means when running a password cracking session, you'll need to then apply an External mode to translate these guesses back into the full ASCII text outputs. For example the guess "I(x15)MyWife" would be translated by the External mode into "IloveMyWife" which can then be fed into a real password cracking session.
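As a rough illustration of that round trip, here is a small Python sketch. It assumes a hypothetical one-entry token table ("love" mapped to 0x15, as in the example above). The real table comes from tokenize.pl, and the real reverse translation is done by the generated External mode, not by Python.

```python
# Hypothetical token table for illustration; tokenize.pl generates the real one.
TOKEN_TO_BYTE = {"love": 0x15}
BYTE_TO_TOKEN = {value: key for key, value in TOKEN_TO_BYTE.items()}

def tokenize(password: str) -> bytes:
    """Replace each known token with its single placeholder byte."""
    data = password.encode("latin-1")  # assumes single-byte characters
    for token, placeholder in TOKEN_TO_BYTE.items():
        # Simplification: this replaces every occurrence of the token.
        data = data.replace(token.encode("latin-1"), bytes([placeholder]))
    return data

def untokenize(guess: bytes) -> str:
    """Expand placeholder bytes back into the substrings they stand for."""
    return "".join(BYTE_TO_TOKEN.get(b, chr(b)) for b in guess)

tokenized = tokenize("IloveMyWife")
print(tokenized)              # the training set sees the placeholder byte
print(untokenize(tokenized))  # the cracking session sees the full text again
```

The point of the sketch is that the Incremental charset only ever sees the short, tokenized form, so a single "character" transition can stand in for a whole likely substring.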

While I can certainly dive more into the details of the tokenizer attack, this intro is already too long. So let's instead look at how to run it!

References:

Post by SolarDesigner announcing the release of the tokenize.pl script: [Link]
John the Ripper Bleeding Edge: [Link]

Configuring The Attack:

1) Update JtR Bleeding Edge:

The new code, (including tokenize.pl) is available in the newest version of JtR Bleeding Jumbo.
To make use of it, clone the GitHub repository and make sure you are on the "bleeding-jumbo" branch which is the default. [Link]
(Optional) Rebuild JtR from source.

This is optional since the new code "should" work with older versions of JtR. But you might want to rebuild JtR to be on the safe side.

2) Create a Custom Incremental .CHR File:

Run the new tokenize.pl script on your training passwords.

Notes:

Save the full output of this tool, as you'll need to run the Sed command to parse/tokenize your training set, and you'll need to paste the External "Untokenize" script to your john.conf file. Both of these steps will be described later.

Example:

./tokenize.pl TRAINING_PASSWORDS.txt

Example output when training on one million passwords from RockYou. You may note that the results are slightly different than when SolarDesigner trained on the full RockYou list:

Re-run the sed script on the training set to generate a custom training file.

Example:

cat TRAINING_PASSWORDS.txt | sed {{REALLY LONG SED COMMAND}} > new_training.txt

Closer to Real Example:

cat TRAINING_PASSWORDS.txt | sed 's/1234/\x10/; s/love/\x15/; s/2345/\x1b/; s/3456/\x93/; s/ilov/\xe5/; s/123/\x6/; s/234/\x89/; s/ove/\x8c/; s/lov/\x90/; s/345/\x96/; s/456/\xa4/; s/and/\xb2/; s/mar/\xc1/; s/ell/\xd9/; s/199/\xdf/; s/ang/\xe0/; s/200/\xe7/; s/ter/\xe9/; s/198/\xee/; s/man/\xf4/; s/ari/\xfb/; s/an/\x1/; s/er/\x2/; s/12/\x3/; s/ar/\x4/; s/in/\x5/; s/23/\x7/; s/ma/\x8/; s/on/\x9/; s/el/\xb/; s/lo/\xc/; s/ri/\xe/; s/le/\xf/; s/al/\x11/; s/la/\x12/; s/li/\x13/; s/en/\x14/; s/ra/\x16/; s/es/\x17/; s/re/\x18/; s/19/\x19/; s/il/\x1a/; s/na/\x1c/; s/ha/\x1d/; s/am/\x1e/; s/ie/\x1f/; s/11/\x7f/; s/ch/\x80/; s/10/\x81/; s/00/\x82/; s/te/\x83/; s/ve/\x84/; s/as/\x85/; s/ne/\x86/; s/ll/\x87/; s/or/\x88/; s/ta/\x8a/; s/st/\x8b/; s/is/\x8d/; s/01/\x8e/; s/ro/\x8f/; s/20/\x91/; s/ni/\x92/; s/at/\x94/; s/34/\x95/; s/45/\x97/; s/it/\x98/; s/08/\x99/; s/mi/\x9a/; s/ca/\x9b/; s/ic/\x9c/; s/da/\x9d/; s/he/\x9e/; s/21/\x9f/; s/nd/\xa0/; s/me/\xa1/; s/ng/\xa2/; s/mo/\xa3/; s/ba/\xa5/; s/sa/\xa6/; s/ti/\xa7/; s/56/\xa8/; s/sh/\xa9/; s/ea/\xaa/; s/ia/\xab/; s/ol/\xac/; s/se/\xad/; s/ov/\xae/; s/be/\xaf/; s/de/\xb0/; s/co/\xb1/; s/ss/\xb3/; s/99/\xb4/; s/to/\xb5/; s/22/\xb6/; s/oo/\xb7/; s/02/\xb8/; s/ke/\xb9/; s/ee/\xba/; s/ho/\xbb/; s/ey/\xbc/; s/ck/\xbd/; s/ab/\xbe/; s/et/\xbf/; s/ad/\xc0/; s/13/\xc2/; s/07/\xc3/; s/pa/\xc4/; s/09/\xc5/; s/06/\xc6/; s/ki/\xc7/; s/98/\xc8/; s/hi/\xc9/; s/th/\xca/; s/05/\xcb/; s/14/\xcc/; s/25/\xcd/; s/ay/\xce/; s/ce/\xcf/; s/89/\xd0/; s/ac/\xd1/; s/os/\xd2/; s/ge/\xd3/; s/03/\xd4/; s/ka/\xd5/; s/ja/\xd6/; s/bo/\xd7/; s/do/\xd8/; s/04/\xda/; s/e1/\xdb/; s/nn/\xdc/; s/em/\xdd/; s/31/\xde/; s/15/\xe1/; s/18/\xe2/; s/ir/\xe3/; s/91/\xe4/; s/om/\xe6/; s/90/\xe8/; s/30/\xea/; s/nt/\xeb/; s/di/\xec/; s/si/\xed/; s/ou/\xef/; s/un/\xf0/; s/24/\xf1/; s/us/\xf2/; s/88/\xf3/; s/ai/\xf5/; s/78/\xf6/; s/y1/\xf7/; s/so/\xf8/; s/pe/\xf9/; s/ot/\xfa/; s/ga/\xfc/; s/ly/\xfd/; s/16/\xfe/; s/ed/\xff/' > new_training.txt

Convert the new_training.txt file to a John the Ripper potfile format

Reason: Currently JtR's Incremental mode character sets can only be trained from cracked passwords in a potfile. You can't just give it a set of raw plaintext passwords.

Enhancement: You can totally combine this step with the previous one by piping the output of the original Sed command into this second one. I'm keeping these steps separate for now since my recommendation is going to be to update the tokenize.pl script to incorporate this step into the Sed command it generates. Basically I'm hopeful I can just delete this step in the future.

Example:

cat new_training.txt | sed 's/^/:/' > training.pot
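If you would rather do the tokenizing and potfile conversion in a single pass, the two Sed steps above can be sketched in Python. This is only a sketch under some assumptions: the function names and the short substitution list are mine (a real list would be copied from the tokenize.pl output), and I'm assuming the Sed expressions behave as written, meaning each one is applied in order and replaces only the first match on a line since there is no "g" flag.

```python
def tokenize_to_pot(lines, substitutions):
    """Yield potfile lines (':' + tokenized password) from raw byte lines."""
    for line in lines:
        password = line.rstrip(b"\r\n")
        for pattern, placeholder in substitutions:
            # count=1 mirrors sed's s/x/y/ without the g flag.
            password = password.replace(pattern, placeholder, 1)
        # The leading ':' is the same thing sed 's/^/:/' adds.
        yield b":" + password

# Shortened, illustrative subset of the substitutions shown above.
SUBSTITUTIONS = [(b"1234", b"\x10"), (b"love", b"\x15"), (b"123", b"\x06")]

def write_pot(src_path, dst_path, substitutions=SUBSTITUTIONS):
    """Read raw passwords from src_path and write a training potfile."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for pot_line in tokenize_to_pot(src, substitutions):
            dst.write(pot_line + b"\n")

# Example call (file names taken from the commands above):
# write_pot("TRAINING_PASSWORDS.txt", "training.pot")
```

Working on bytes instead of text keeps the placeholder values (0x01 to 0xff) from being mangled by any character encoding.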

Run the JtR Incremental mode training program on the new training file.

Flags:

--make-charset= : the filename here is the name of the Incremental character file you want to create. Warning: It will overwrite existing files!
--pot= : In this example, this is the potfile to read in training passwords from. In "normal" usage this is the file where your cracked passwords will be stored.

Example:

john --make-charset=tokenize.chr --pot=training.pot

Example output of running the training session on one million training passwords:

3) Update John the Ripper's Config File

Add the following to John the Ripper's "john.conf" config file

New Values:

[Incremental:Tokenize]
File = $JOHN/tokenize.chr

Notes:

If you save your Incremental character file as "custom.chr" you can use the default custom Incremental mode that is already included in john.conf instead.
Feel free to update the name of this attack (in this example it's "Tokenize") to whatever identifier you want to use.

Add the "Untokenize External Mode" config generated when you ran tokenize.pl to your john.conf

Notes:

I don't think there are many restrictions where you paste this in, as long as it is not in the middle of another configuration item or at the very end.

Below is a screenshot of me running the tokenize.pl script again, with the External mode section highlighted.

Next is a screenshot of me pasting the resulting External mode into my John the Ripper config file.

Running The Attack:

To run the attack, you can use the following commands in addition to your normal attack commands:

Arguments:

--incremental=Tokenize : You can set this to be whatever name you specified in your john.conf file to use the .chr file you generated with tokenize.pl

--external=untokenize : This calls the external mode to convert the tokenized output of the Incremental mode attack into a normal password guess

Example:

./john --incremental=tokenize --external=untokenize --stdout

Here is a screenshot comparing the first 25(ish) guesses generated by both the new Tokenize attack and the default JtR Incremental attack. It's interesting to note that the default Incremental attack seems to start off stronger with more likely passwords. What's important though is longer password cracking sessions, which I'll investigate in future tests.

Conclusion:

The next step will be to run some actual tests. The experiments SolarDesigner ran seemed to imply a significant improvement when running the tokenize attack over the first two billion guesses compared to a standard Incremental attack. This assumes you run these attacks after cracking the "common" passwords first using a more traditional wordlist attack. I'm having to restrain myself from speculating about the results of future tests that I'm planning on running (better to just run them), but hopefully this blog post is helpful for anyone else who also wants to experiment with this new attack. It's a cool attack and a neat approach so I'm looking forward to seeing how it evolves going forward.

Extracting Secrets from Packet Captures (A CMIYC2024 Story)

2024-08-26T19:52:00.000-07:00

"Interest is the most important thing in life; happiness is temporary, but interest is continuous."

- Georgia O'Keeffe

Introduction:

The focus of this blog entry will be on tools and scripts to analyze packet captures. This is the result of falling down a rabbit hole when writing the previous tutorial on the CMIYC 2024 WIFI cracking challenge: [Link]. In that writeup I realized I hadn't been keeping up on the state of automated tooling to help extract secrets and interesting data from packet captures. So I asked for tips and suggestions on what I could use. And you all responded! This is another reason why these blog posts are really beneficial to me. I learn so much writing them, so thank you!

As a disclaimer, while I will be using the CMIYC2024 dataset to explore using some of these tools, these tools are not really suited for password cracking competitions. For short competitions, you are better off performing manual analysis of the data. As a spoiler, none of the tools I looked at identified many of the CMIYC2024 secrets out of the box. I needed to look through the packet captures myself to figure out those clues. Instead as you read through this blog entry, I'd like you to consider analyzing real world data which can take up Gigabytes of disk space. That's where these tools can be helpful.

Important Links, Tools, and References for this Post:

Previous Blog Entry: CMIYC2024 WIFI Cracking Challenge
Link: https://reusablesec.blogspot.com/2024/08/cmiyc2024-wifi-cracking-challenge.html
Reason: I'm not going to cover cracking WIFI passwords or decrypting encrypted traffic in Wireshark in this post since I already covered those topics in this previous entry. So consider that entry a prerequisite (or at least an earlier chapter) to this blog post.
PCredz
Code Link: https://github.com/lgandx/PCredz
Documentation Link: https://shellcode33.github.io/CredSLayer/index.html
Reason: A popular tool to sniff network traffic and capture plaintext credentials as well as other interesting secrets such as credit card numbers
CredSLayer
Link: https://github.com/ShellCode33/CredSLayer
Reason: An enhancement on PCredz with a focus on making it easier to add support for extracting secrets from new protocols.
DSniff
Link: https://github.com/tecknicaltom/dsniff
Reason: A very popular tool to sniff network traffic and extract plaintext credentials.

Manual Analysis:

To start things off, I wanted to create a control set of myself manually analyzing a packet capture file. That way I knew what to expect when I start trying to process the same data with automated tools. For this blog post I'm going to use CMIYC2024 contest's packet capture challenge since it includes different types of secrets, and quite honestly this provides a good incentive to spend more time trying to solve the other CMIYC2024 challenges as well.

To reiterate from my previous post on how I manually analyzed the data, I opened up the packet capture in Wireshark, decrypted the WPA1 encrypted traffic, and then manually walked through the different TCP streams using the Wireshark filters.

Note: There was UDP traffic as well, (which you can walk through in a similar fashion). Looking through the UDP traffic there wasn't anything that I saw that was interesting to this contest. It was mostly "plumbing" traffic like DHCP address leases. When doing this yourself though, make sure you don't forget to look at UDP traffic since there can often be some really interesting findings in it.

As far as content went, there were four main types of sessions that Korelogic included in the contest:

FTP Sessions (Covered in the previous blog post)

Interesting Secrets:

Username/password to log into the FTP server
Each session downloaded a passwords file containing the same three passwords.

Number of Sessions:

10 FTP protocol (Packet capture missed one of the FTP protocol sessions)
11 FTP Data

HTTP Sessions (Covered in the previous blog post)

Interesting Secrets:

HTTP Basic Auth: Username/password
Each session displayed a "password vault" which was just a plain HTML page containing the same four passwords

Number of Sessions:

Telnet

Interesting Secrets:

Username/password to log into the telnet server

Number of Sessions:

Email

Interesting Secrets:

This was a password reset e-mail for the *WORST* password reset service. So it included the original username/password combo and the new username/password combo as text inside the e-mail.

Number of Sessions:

I talked about most of these session types in the previous blog post, but I wanted to highlight the password reset e-mails that Korelogic created for this contest:

Getting a user's previous password as well as what the password reset service reset it to will probably be helpful for cracking other passwords during this competition! You may also notice that the new password matches the HTTP Basic Auth password highlighted in the previous blog post. So Korelogic is working to try and tell a consistent story here.

To actually save the data, I copy/pasted interesting fields from Wireshark into an Excel Workbook.

In data analysis there is a general idea/term called "Time to Excel" that asks how long it takes to import data into Excel and display it in a way you can understand it. The idea behind it is that Excel is the best general purpose data analysis tool out there. The competition isn't even close, (Kind of like how ThatOnePasswordWas40Passwords dominated the CMIYC2024 Street contest). I would not have graduated with my doctorate without Excel. So if you are looking to use automation or a different data analysis platform/tool you need to compare them to the "Time to Excel" and see if they save you any time/work. If they don't, then you are better off sticking with large Excel worksheets. For this effort here was my manual "Time to Excel" metrics:

Time spent on task:

Roughly an hour total to extract the data from 40 sessions and copy/paste them into Excel

This is a little more than a minute a session

Notes on time spent on task:

The first session type (FTP/Email/Telnet/HTTP) I encountered took more time than all the follow up sessions

Ignoring taking screenshots and blog related notes, it probably took me a couple of minutes for each new session type encountered to look through them.

There was a lot of duplicate data that I could have skipped if I was doing this during the contest. I probably could have completed the entire task in under 30 minutes if I was rushing.
The initial time analyzing a session is probably more representative of real life data as it isn't as scripted as the CMIYC2024 contest material. So my gut says 5 minutes per "interesting" session is a reasonable general estimate of time for manual analysis.

Putting it all together, here are all the "secrets" and interesting information I found manually going through the packet capture:

Stats on Total Secrets Found:

85 total secrets

4 HTTP Basic Auth
16 passwords in e-mail HTML code
10 FTP Logins
23 FTP Data (Text File)
8 Telnet Logins
14 Passwords found in E-mail Text (original and reset password)

Saving Decrypted Packet Data:

The first thing we need to do when performing automated analysis of the traffic is to save the decrypted WPA1 packets. Up till now, I've been using Wireshark to display the decrypted traffic but the tools I'll be using may not have the ability to decrypt the traffic themselves. Therefore I need to export the plaintext packets and create a new packet capture file containing them.

Disclaimer: I'm going to skip a lot of the troubleshooting and debugging that I had to do for this next step and instead focus on the current "solution" I found that mostly works. One thing that this research has highlighted for me is how *weird* this use-case is. Normally you want to feed decrypted traffic directly to your analysis tools by forwarding it to another interface or loopback after it has been decrypted by your wireless card. Basically you want your tools to work off a tap on the LAN. Likewise capturing WPA1 traffic and cracking the password often happens as its own step, after which you can connect to the network and start sniffing for secrets using the "LAN Tap" strategy. It's very rare to need to feed a packet capture of encrypted traffic through these tools. It just doesn't fall into many real life workflows outside of a contest, and the dodgy tool support for this use-case reflects that. Lots of words to say that if you find yourself doing the below as part of a real world assessment, you are probably doing something wrong and should rethink your collection strategy.

Task 1) Strip off all the datalink layer content from the packet capture using the "Strip Headers" option in Wireshark.

Without going too deep into the weeds about networking 101 or side-tangents about how horrible the OSI network model is, what you really need to know is that WPA1 encryption happens at the data link layer. Higher level data like the TCP/IP information is encapsulated (and in the case of WPA1, encrypted) in a data link packet. This way your FTP server doesn't need to know anything about the fact that a client is connecting over a wireless network. What this means is that if we rip out all the datalink layer content and rewrite the TCP/IP data to a new "fake" plaintext datalink packet we can then save the fully decrypted packet capture to disk for analysis by other tools. Wireshark supports this using the "Strip Headers" feature.

The procedure to "Strip Headers" from a packet capture is as follows:

Load up the encrypted packet capture in Wireshark and decrypt it using the steps described in the previous blog post.

The traffic MUST be decrypted for the next step to work.
If step 3 returns NO packets, one possible cause is you didn't first decrypt the traffic in Wireshark

In Wireshark, click File->Strip Headers
In the pop-up dialog box, make sure you select "IP" in the drop down box and then click ok

I first tried selecting "Ethernet" and it didn't return any traffic. Also the fact that they reference "Ethernet" vs. the more generic "Data Link" is also weird.
Long story short, there is a significant degree of "magical incantations" vs. understanding what Wireshark is doing under the hood. But this seems to work, so I'd recommend treating it as a magical spell and following these steps.

The traffic should now be decrypted

Task 2) Save the "stripped down" packet capture to disk.

I would *STRONGLY* recommend saving it as the older/more-generic "wireshark/tcpdump" format with a .pcap file extension vs. the newer .pcap-ng file extension. While the newer file extension is nice and supports some nifty features, keeping it more generic is helpful when running it through non-Wireshark tools.

Side tangent on what NOT to do: A lot of the online documentation recommends that you use the Wireshark "Export PDUs to File" option. In fact, the links I found didn't even mention the "Strip Headers" option. While Export PDUs does save the traffic without any layer2 encryption, it saves that decrypted data in a format that only Wireshark can process. I spent hours troubleshooting different options to make the resulting PDUs play nice with other tools. This includes passing the PDU packet captures through tcpreplay, tcprewrite, text2pcap, and various other tools trying to rebuild the data link layer encapsulation in a format non-Wireshark tools could process. I totally failed. Then I found out that strip headers does exactly what I needed it to do. So this is a note to you as well as my future self saying don't use "Export PDUs to File".

Google SEO: Unable to process unsupported DLT type: (null) (0xfc). If you reached this page looking for a solution to this error, please read above.

Viewing the Decrypted Packet Data:

While stripping the data link layer is helpful for parsing the data using external tools, I want to take a moment to talk about what we are losing. The first thing you'll notice when looking at the new packet capture is that all the 802.11 WIFI traffic is gone. This actually makes it a bit easier to manually find interesting traffic for contest related activities (you don't need to apply any additional filters), but we also can no longer snoop on what Korelogic set their WIFI router's SSID up as. The other major change is that you'll notice the Data Frame now uses encapsulation type 7, "Raw IP". This is a fancy way of saying there is literally no Layer 2 data link layer encapsulation/frames in the packet. This goes beyond not having any MAC addresses. As the name implies, it truly is "Raw IP". Here is a screenshot of a packet displayed in Wireshark:

The bits on the wire (or in this case the packet capture) start at the IP layer. While this isn't a problem for some tools to display, this can cause problems with other tools (such as PCredz) that expect that data link layer to be present. I'll talk more in the tool section for how to handle that.
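If you want to check what a tool is going to see before you feed it a capture, the link-layer type is stored in the 24 byte header at the start of a classic .pcap file. Below is a minimal Python sketch that reads it. The function name and the small lookup table are my own, it does not handle .pcap-ng files, and the numbers are the link-layer type values as I understand them from the tcpdump.org registry (these are file format values, which are not the same as Wireshark's internal "encapsulation type" numbering).

```python
import struct

# A few link-layer type values from the tcpdump.org LINKTYPE registry.
LINKTYPE_NAMES = {
    1: "Ethernet",
    101: "Raw IP",
    105: "IEEE 802.11",
    127: "IEEE 802.11 + radiotap",
    252: "Wireshark Upper PDU",  # 0xfc
}

def pcap_linktype(header: bytes) -> int:
    """Return the link-layer type from a classic pcap global header."""
    if len(header) < 24:
        raise ValueError("a classic pcap global header is 24 bytes long")
    magic = header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian file (microsecond or nanosecond)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a classic pcap file (pcap-ng is not handled)")
    (network,) = struct.unpack(endian + "I", header[20:24])
    return network & 0xFFFF  # upper bits can carry FCS information

def describe_pcap(path: str) -> str:
    """Read a capture's header and describe its link-layer type."""
    with open(path, "rb") as handle:
        linktype = pcap_linktype(handle.read(24))
    return LINKTYPE_NAMES.get(linktype, f"unknown ({linktype})")

# Example call (file name taken from the commands in this post):
# print(describe_pcap("decrypted_pcap.pcap"))
```

A value of 252 (0xfc) would line up with the "unsupported DLT type" error mentioned above, while a capture saved after "Strip Headers" should report a raw IP type with no Ethernet framing.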

Evaluating Network Credential Dumping Tools:

Finally we are at the point where we can start looking at automated tools to parse the packet capture file for secrets! For each of the tools I'm going to break it down into three different sub-sections: 1) How to install the tool, 2) Results of parsing the packet capture with the tools, and 3) Any closing thoughts about using the tool for other real-life activities.

DSNIFF

I'm sure there were tools to extract credentials from network taps before DSniff, but it was one of the cornerstone hacking tools back in the day. Usually DSniff would be paired with Ettercap to redirect traffic to your local system. Then while running a person in the middle attack against all switched traffic you would turn on DSniff to save any plaintext credentials people were using to a file. It was very simple to run, and quite effective against a lot of the earlier Internet protocols.

Installation:

DSniff hasn't been updated in over 14 years so you don't have to worry about missing out on features by installing the pre-built binary from your local package manager. So for example, installing it on Ubuntu I just did:

sudo apt install dsniff

Operation and Results:

To run DSniff against a saved packet capture, simply use the '-p' flag with the path to the packet capture to read in. For example:

dsniff -p decrypted_pcap.pcap

As far as the effectiveness, DSniff did an excellent job of picking out the FTP, Telnet, and HTTP authorization usernames + passwords, but missed picking out any secrets from the contents of the traffic (such as the e-mail, the passwords in the HTTP body, and the password file downloaded by FTP). To put it another way, DSniff works great to capture credentials that are part of a protocol, but it does not run any regular expressions to pull out secrets or interesting data from protocol's traffic after the initial authentication is complete.
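To show what I mean by running regular expressions over the content, here is a tiny Python sketch of the kind of check DSniff does not do. The pattern and function name are hypothetical and deliberately simplistic. It is not taken from any of the tools in this post, and a real scanner would need many more patterns and protocol-aware decoding.

```python
import re

# Hypothetical pattern: "password", "passwd" or "pass", a ':' or '=', then a value.
PASSWORD_HINT = re.compile(r"(?i)\bpass(?:word|wd)?\b\s*[:=]\s*(\S+)")

def find_content_secrets(payload: str) -> list:
    """Return values that follow a password-like label in decoded payload text."""
    return PASSWORD_HINT.findall(payload)

sample = "Hello,\nYour new password: Hunter2!\nThanks"
print(find_content_secrets(sample))
```

Even something this crude would flag a reset e-mail that spells out a password in its body, which is exactly the sort of secret that lives outside the protocol's own authentication exchange.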

Overall Review:

It warms my heart to see DSniff continue to work after all these years. DSniff was a breeze to get up and running which certainly makes it attractive to someone looking to get a basic capability operational as quickly as possible. If you just want to catch low hanging fruit and basic misconfigurations, DSniff is still a respectable option.

Where the gaps start to show is DSniff's lack of support for extracting secrets from the content of anything besides Telnet sessions. Another major challenge is that DSniff doesn't have a standardized format to save the content of different secret types. This makes it harder to pull these secrets out and create an input dictionary for password cracking, or an Excel spreadsheet for analysis. While you can save the results to disk vs. stdout, DSniff saves log files using a Berkeley database format, limiting your ability to manually open the save file up with a text editor (or script) for further processing.

Conclusion: DSniff works, but it feels like a 20-year-old tool. 3.5 out of 5 stars.

PCredz

This tool was recommended to me by various people and seems to be the main credential dumper being used by red team members and penetration testers. It has a specific focus on credit card numbers which I'm sure makes it popular with the PCI enforcement crowd.

Installation:

The following was taken from the PCredz Linux installation instructions and validated on a WSL Ubuntu system.

Clone the PCredz repository to where you want to save/install it

git clone https://github.com/lgandx/PCredz.git

sudo apt install python3-pip
sudo apt-get install libpcap-dev
pip3 install Cython
pip3 install python-libpcap
If you are parsing a raw IP packet dump, you will then need to modify the PCredz code as described in the next section.

If I have time I may submit a merge request to help fix this problem for others

Operation and Results:

To run PCredz against a packet capture file you simply need to give it the -f option. For example:

python3 Pcredz -f decrypted_pcap.pcap

Unfortunately when I tried this with the decrypted packet capture it failed.

Luckily, PCredz is a python3 program, and that's something I can troubleshoot pretty easily! Looking through the code, I quickly found that it was expecting a Layer2 dataframe, and without that encapsulation it was failing. I quickly put together a very hacked together fix as shown below to force it to parse the data as an IPv4 packet.

This is why I'm a menace on development teams by the way. The good news though is this fix "worked" and I could now extract secrets from the contest packet capture file.

This is much better! Like DSniff, PCredz extracts the login information for FTP and HTTP sessions. It missed the Telnet sessions though along with e-mail, HTTP body, and FTP Data secrets. I was really surprised by the omission of Telnet credentials. Looking at the supported protocols, (you can see that list when PCredz starts up) Telnet is not included.

Where PCredz does shine compared to DSniff is that it can use regular expressions to look for specific data inside the packets, and PCredz is written in Python so you can add your own regular expressions to it with only a few lines of code.
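To make the "few lines of code" point concrete, here is a minimal standalone sketch of regex-based secret hunting over raw payload bytes. This is not PCredz's actual code or API. The pattern names, the patterns themselves, and the find_secrets helper are all made up for illustration, and real traffic would need more careful patterns.

```python
import re

# Hypothetical patterns for illustration only; these are not PCredz's rules.
PATTERNS = {
    "http_form_password": re.compile(
        rb"(?:password|passwd|pwd)=([^&\s]+)", re.IGNORECASE
    ),
    "email_address": re.compile(
        rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
    ),
}


def find_secrets(payload: bytes) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs found in a raw payload."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(payload):
            # Use the capture group when the pattern defines one,
            # otherwise fall back to the whole match.
            value = match.group(1) if pattern.groups else match.group(0)
            hits.append((name, value.decode("latin-1")))
    return hits


if __name__ == "__main__":
    sample = b"POST /login HTTP/1.1\r\n\r\nuser=alice&password=hunter2"
    print(find_secrets(sample))
```

Adding a new secret type in a design like this is just one more dictionary entry, which is roughly the kind of extensibility that makes a Python-based tool attractive here.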

Overall Review:

PCredz is more of a developer tool than DSniff. It provides a lot of extensibility and the ability to tailor it to a particular problem that you are trying to solve. Since it's just Python3 code, you can modify its detection logic as well as how it outputs the data it collects, making it a great starting point to develop a pipeline for more advanced analysis. It's not a one-stop shop though, and I'd probably direct newer users to DSniff instead.

Conclusion: For a new user I'd give PCredz 3 stars out of 5. For a developer it earns 4 stars out of 5 as it is a lot nicer than playing around with direct calls to SCAPY (another nice Python network tool).

CredSLayer

CredSLayer was inspired by PCredz with a focus on extensibility. It is able to use Wireshark dissectors and can support many more protocols out of the box.

Installation:

While CredSLayer is written in Python, it relies upon tshark (the Wireshark command line interface) to do all the heavy lifting for parsing the packets. The following instructions are for getting it to work on a WSL Ubuntu install:

Make sure you have Python >= 3.4 installed
Install tshark

sudo apt install tshark

pip3 install credslayer

Operation and Results:

To run CredSLayer against a packet capture file, just run it with the packet capture file specified:

credslayer decrypted_pcap.pcap

When I ran it, it worked perfectly the first time I tried, and extracted the same data that DSniff did (Telnet, HTTP Auth, and FTP Auth), while missing the same "content" secrets that DSniff missed. The biggest difference though was that CredSLayer printed its results in a well defined, easily parsed format.

Now you might be wondering, "How can I make this fit even better into my workflows", or "Can I create a pipeline to make use of the found credentials". Heck, I can even make a callback to the beginning of this post: "Can I reduce my Time-To-Excel to as close to zero as possible?" The good news is that not only can you import CredSLayer into your other Python programs, but it has documentation on how to do that as well!
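As a rough idea of what the last mile of such a pipeline could look like, here is a minimal sketch that takes already-extracted credentials and turns them into a CSV (for the Time-To-Excel crowd) and a de-duplicated wordlist. It deliberately does not call CredSLayer itself. The record layout, field names, and helper names are assumptions made up for this example, so check the tool's documentation for the real objects it returns.

```python
import csv
import io

# Hypothetical records; in practice these would come from whatever tool
# extracted the credentials. The field names are invented for illustration.
found = [
    {"protocol": "ftp", "username": "alice", "password": "hunter2"},
    {"protocol": "telnet", "username": "bob", "password": "letmein"},
    {"protocol": "http", "username": "alice", "password": "hunter2"},
]


def to_csv(records: list[dict]) -> str:
    """Render credential records as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["protocol", "username", "password"]
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_wordlist(records: list[dict]) -> list[str]:
    """Collect unique passwords, keeping first-seen order, for a wordlist."""
    return list(dict.fromkeys(record["password"] for record in records))


if __name__ == "__main__":
    print(to_csv(found))
    print("\n".join(to_wordlist(found)))
```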

What about adding in your own regular expressions to look for data in the session contents? CredSLayer has you covered there as well!

Overall Review:

I'm a huge fan of CredSLayer. It's not magic. For example, manual analysis is still the way to go during these competitions since none of the automated tools found the password reset e-mails, the HTTP "Password Vault" or the download of the FTP password list. But given the constraints that "secrets" are highly subjective and context sensitive, CredSLayer offers an impressive mix of built-in capabilities and extensibility. I like the code, I like the architecture, and I probably owe the developer a beer. This is code that brings me joy.

Conclusion: Five out of five stars. No notes. Will recommend. Will use in my day job.

CMIYC2024: Wifi Cracking Challenge

2024-08-18T20:34:00.000-07:00

"It is never too late to be who you might have been."

- George Eliot

Introduction:

This is a continuation of my write-up about this year's Crack Me If You Can challenges. You can view my previous two write-ups using the following links. Each one covered a specific challenge of the CMIYC contest: [Striphash] and [Radmin3 hashes].

I'll admit, in my previous posts I was focusing on the plumbing of the challenges. Aka how to extract the hashes and get them in a format that you can run password cracking attacks against. But I danced around how to run successful cracking sessions against those hashes. There's a lot of reasons for that, but the biggest one is that I wasn't very successful during the contest itself. I needed time to step back, and start investigating all the challenges and hints that Korelogic gave out during the contest but I didn't have time to really dig into. Then with sleep and no pressing deadlines I could start to solve, understand, and then incorporate these challenges into my cracking session. That's a lot of words to say that I didn't solve the wifi cracking challenge during the contest, but I felt it would be worthwhile to look into it afterwards and document how I went about working through it. This will hopefully be one of the more day-to-day practical write-ups as well since cracking wifi passwords is something that can be pretty common during pen-test engagements if you can line up the appropriate permissions.

Important Links, Tools, and References for this Post:

Rapid7 Writeup: Poorly Purged Medical Devices Present Security Concerns After Sale on Secondary Market
Link: https://www.rapid7.com/blog/post/2023/08/02/security-implications-improper-deacquisition-medical-infusion-pumps/
Reason: This is why I normally don't have to crack wifi passwords during my research. There are usually other ways to gain access to networks than having to start up Hashcat. Also this report is really interesting and I'd like more people to be aware of it.
Cap-2-Hashcat
Link: https://hashcat.net/cap2hashcat/
Reason: This site will extract WPA handshakes from wireless packet captures and convert them into a format you can crack with Hashcat. I'm really hesitant to include this link since while it is very good for helping out during password cracking competitions, please don't ever use this site for any sort of real life penetration testing assistance. You are sending the data to the "cloud" and you should always be mindful about doing that with a sponsor's traffic.
HCXTools
Link: https://github.com/ZerBea/hcxtools
Reason: This is the proper way to parse packet captures and extract password hashes to crack on your own machines. It takes more work than Cap2Hashcat, but it's the proper way to treat sponsor data, and it gives you more flexibility to troubleshoot when something goes wrong.
Wireshark Wiki: How to Decrypt 802.11 Traffic
Link: https://wiki.wireshark.org/HowToDecrypt802.11
Reason: Once you've cracked the WPA1-PSK password, you'll want to view the decrypted traffic. Wireshark is one of the easier ways to do that.
Example Hashcat Formatted Hashes
Link: https://hashcat.net/wiki/doku.php?id=example_hashes
Reason: You really should have this page bookmarked regardless of if you are competing in a competition or not. Whenever I'm starting a non-standard password cracking session I find myself referring back to this site to try to figure out what type of hash I'm dealing with, or to understand how I need to format it so I can crack it with Hashcat.

Background on cracking WPA sessions:

As a bit of backstory for myself, I got my start in computer security being a wireless penetration tester/red-teamer. I had perfect timing since new tools to crack WEP sessions had just been released and a Symantec Antivirus remote exploit had become available. So by pressing a few buttons I could look like a super L33t hacker without really knowing anything.

Ever since then, I've had a soft spot in my heart for wireless hacking. The challenge though is I rarely do any wireless hacking in my day job. Yes, I do vulnerability impact analysis research, but I usually start with a white card that assumes I already have access to the wireless network. How do I justify that? Well let me refer you to my talk at BSidesLV2023: Passwords911 Authentication Adventures in Healthcare. To be more specific, I end up buying a lot of used medical equipment off of eBay (my job is so weird but awesome). Often these medical devices still have hospital credentials such as wireless passwords, and Active Directory tokens, as well as patient records still on them.

Don't just take my word for it. The security company Rapid7 did a similar analysis and found over half of all wireless infusion pumps that they purchased on the secondary market had sensitive data still on them [Link]. As a disclaimer, I haven't seen any reporting of threat actors doing this in the wild. This particular attack requires a lot of luck and physical proximity to the institution that sold the medical equipment. Basically it would be a huge pain to try and pull off in real life. But it does work pretty well when I'm requesting a white-card so that way I can focus on the part of the assessment that I really want to dig into.

Now that the "cooking recipe back-story" of this post is out of the way, let's look at the challenge itself. The first hint file released after the contest started, "cmiyc-2024_street_files_2", included a "street.pcap" file along with the following note:

-----

From: Jarlaxle

To: Tiamat

Subject: We located the journalist
We found the journalist and sent a drone to do some recon at his house. We cracked his home wifi and have been monitoring all of Julian's communications. We'll soon learn which staff he's been communicating with and whose accounts he's been using. We'll put a stop to his investigation.

Jarlaxle

-----

Extracting the WPA1 Hash:

Figuring out accounts sounds promising when it comes to cracking the challenge passwords. So let's open up the pcap in Wireshark.

Ok, so this pcap looks like something we can use to crack the WPA1 pre-shared key. The next question is, how do we get the WPA1 hash?

One option is to use the excellent and easy to use cap2hashcat site run by Hashcat [link]. You just need to upload the packet capture and a couple of seconds later it returns a download of the hash to crack. For a contest like this, it is super easy and absolutely the way to go so you can focus your time and effort on other tasks.

The problem is you can get in big trouble if you use this site for real world penetration testing engagements. You are uploading sponsor data to a cloud based hacking site probably being monitored by who knows what threat actors. I'm not throwing shade at the Hashcat team for offering this service. It's a great service and I really appreciate they provide it. But you don't want to be uploading client data to other reputable cyber security sites such as VirusTotal either. Basically if you are getting paid to do this type of analysis, you are also getting paid to learn to dump the hashes from a packet capture file using your own systems.

One solution for processing this data locally is to install and run HcxTools on your own system. The cap2hashcat service uses HcxTools as its backend so the results will be exactly the same. The downside though is that getting HcxTools installed and working can often be a multi-hour process. So it would certainly be a task that you need to do in preparation for a contest or a red-team engagement vs. something you want to do during an event.

"Quick" HcxTools Install Guide for WSL2:

Install prerequisite libraries

sudo apt-get install pkg-config libpcap-dev libcurl4-openssl-dev

Remove the old version of libssl-dev if you have it already

sudo apt remove libssl-dev

Manually install libssl-dev to be a more modern version (the Ubuntu WSL version is way out of date and the toolset requires a version of libssl > 3.0)

Note: I'm using the instructions from the following webpage: [Link]
cd /usr/local/src/
wget https://www.openssl.org/source/openssl-3.0.8.tar.gz
tar xzvf openssl-3.0.8.tar.gz
cd openssl-3.0.8
./config shared zlib
make
make test (only do this if you have 2 hours to let it run instead of just YOLOing it and doing the next step)
sudo make install

Modify your system PKG_CONFIG_PATH and LD_LIBRARY_PATH to include the path to your new libssl-dev

export PKG_CONFIG_PATH=/usr/local/lib64/pkgconfig
export LD_LIBRARY_PATH=/usr/local/lib64

Get the newest version of HcxTools

cd [MAIN INSTALL DIR]
git clone https://github.com/ZerBea/hcxtools.git

Go into the hcxtools directory
Build and install hcxtools

make -j $(nproc)
sudo make install

Ignoring all the Googling and troubleshooting, the process to get HcxTools installed and working was a breezy two hours or so. I don't know why people don't do this instead of using Hashcat's easy to use online service...

I will say, these write-ups are mostly for myself. I guarantee sometime in the future I'll need to install HcxTools, so it is nice to have someplace I can refer back to vs. having to do all that Googling and troubleshooting again. So now that we have HcxTools installed, let's use it to extract the WPA hash from the packet capture.

Using HcxTools to Manually Dump the WPA1 Hash:

Once you have HcxTools installed and working, it's a pretty straightforward process to dump the password hash. Before that though, I should take a moment and highlight that HcxTools is a very powerful toolsuite and has a number of advanced options to deal with large packet captures containing hundreds of wireless networks. You know those people walking around the Defcon security conference with a "wifi cactus"? Or in a more general case, old school wardrivers sniffing traffic from all over a city. Well they are probably making use of HcxTools' advanced features to sort through the outputs and create custom hash lists. For this contest though, the pcap is small and there's only one network, so we can use the basic options:

hcxpcapngtool -o [FILE TO SAVE THE HASH] [PCAP FILE]

e.g.: hcxpcapngtool -o ./challenge_files/CMIYC2024_Street/hashes/pcap.hash ./challenge_files/CMIYC2024_Street/hashes/cmiyc-2024_street_files_2/street.pcap

I'd include a screenshot of this, but it looks almost exactly like the online service that Hashcat provides in the picture included above.

Using John the Ripper's wpapcap2john to Manually Dump the WPA1 Hash:

If you are going to use John the Ripper to crack the WPA1 hash, there is an easier option available to you. Included in the Jumbo version of JtR is a tool called wpapcap2john that is super easy to use and will save the resulting hash in a format that John the Ripper can crack.

Example:

/mnt/c/github/JohnTheRipper/run/wpapcap2john ./hashes/cmiyc-2024_street_files_2/street.pcap

Question: Is there a way to convert hashes back and forth between Hashcat formats and JtR formats?

Reason: The formats that JtR and Hashcat use to crack WPA1-PSK hashes are very different. Therefore I need to use HcxTools for Hashcat and wpapcap2john for JtR. I suspect there is some flag in HcxTools that will do the conversion but I don't know what it is. It's not a big deal, but it would be a nice quality of life improvement for when I need to crack more complex Wifi passwords.
Update/Answer: To save JtR formatted hashes using hcxpcapngtool, use --john to specify the output file instead of -o. For example:

hcxpcapngtool --john [FILE TO SAVE THE HASH] [PCAP FILE]

Cracking the WPA1 Hash:

The WPA1-PSK hash file formats for both John the Ripper and Hashcat are different but once you have the hashes you can pick which tool you want to crack them with. I personally like using John the Ripper since I don't have a lot of GPU power and I like JtR's rule logic better. But Hashcat is much preferable if you do have the compute power to really throw at the problem. I'm going to show examples of using both tools, since luckily the street password that Korelogic provided for this file was fairly simple to crack.

Cracking with John the Ripper:

The hash format for John the Ripper is "wpapsk" so I used the following command to run my initial attack in Batch mode.

john --format=wpapsk ./hashes/pcap2.hash

Cracking with Hashcat:

The first step with cracking the WPA1-PSK hash in hashcat is figuring out what format to target. There's a lot of different wifi cracking modes! Now if you are lazy you can simply skip the '-m' option and let Hashcat autodetect the password hash which works pretty well for well defined hash types. The other option is you can refer to the list of Hashcat hash mode examples. The hash dumped from HcxTools starts with "WPA*02*" and when looking through the example hashes that matches up to "-m 22000"
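If you want to sanity check what you dumped before starting a cracking session, the "WPA*..." lines are just "*" separated fields. The sketch below assumes the commonly documented mode 22000 layout (type, PMKID or MIC, AP MAC, client MAC, hex encoded ESSID, and so on), and the parse_22000_line helper name is my own. Treat it as a quick inspection aid and not a validator.

```python
def parse_22000_line(line: str) -> dict:
    """Pull the human-interesting fields out of one WPA*..* hash line.

    Assumed field order (Hashcat mode 22000 style):
    WPA * type * PMKID-or-MIC * AP MAC * client MAC * ESSID (hex) * ...
    """
    fields = line.strip().split("*")
    if len(fields) < 6 or fields[0] != "WPA":
        raise ValueError("not a WPA*...* hash line")

    kinds = {"01": "PMKID", "02": "EAPOL"}
    return {
        "type": kinds.get(fields[1], "unknown"),
        "mac_ap": fields[3],
        "mac_client": fields[4],
        # The network name is stored hex encoded, so decode it for display.
        "essid": bytes.fromhex(fields[5]).decode("utf-8", errors="replace"),
    }


if __name__ == "__main__":
    # Synthetic example line, not taken from the contest capture.
    demo = "*".join([
        "WPA", "02", "00" * 16, "aabbccddeeff", "112233445566",
        "ExampleNet".encode().hex(), "11" * 32, "22" * 8, "02",
    ])
    print(parse_22000_line(demo))
```

Having the decoded ESSID handy is also useful later on, since decrypting the capture needs the network name as well as the password.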

Now it's just up to us to run a cracking session. To keep things simple, I used the standard JtR password.lst wordlist since that's a pretty good small one, and I picked the d3ad0ne.rule included in Hashcat as that tends to be my default go-to ruleset to use. In retrospect I should have kept it smaller with something like best64.rule (you can tell the reason when you look at my anemic password cracking GPU in the screenshot below), but it was a very weak password so it didn't make much of a difference this time.

Example Hashcat Command:

hashcat -m 22000 -a 0 ./hashes/pcap.hash ~/repos/john/run/password.lst -r ~/tools/hashcat/rules/d3ad0ne.rule
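For anyone newer to rule based attacks, the idea behind the "-r" option is that every word in the wordlist gets mangled by every rule in the rule file to produce the actual guesses. The toy interpreter below only understands a tiny subset of Hashcat-style rule functions (":" no-op, "l", "u", "c", "r", "$X" append, "^X" prepend) and is in no way how Hashcat implements its rule engine. It just shows why a larger ruleset multiplies the number of guesses.

```python
from typing import Iterable, Iterator


def apply_rule(word: str, rule: str) -> str:
    """Apply a tiny subset of Hashcat-style rule functions to one word."""
    i = 0
    while i < len(rule):
        op = rule[i]
        if op in ": ":          # ':' is a no-op; spaces separate functions
            pass
        elif op == "l":         # lowercase everything
            word = word.lower()
        elif op == "u":         # uppercase everything
            word = word.upper()
        elif op == "c":         # capitalize first letter, lowercase the rest
            word = word.capitalize()
        elif op == "r":         # reverse the word
            word = word[::-1]
        elif op in "$^":        # '$X' appends X, '^X' prepends X
            i += 1
            if i >= len(rule):
                raise ValueError(f"rule function {op!r} is missing its character")
            char = rule[i]
            word = word + char if op == "$" else char + word
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
        i += 1
    return word


def candidates(words: Iterable[str], rules: Iterable[str]) -> Iterator[str]:
    """Yield one guess per (word, rule) pair."""
    rules = list(rules)
    for word in words:
        for rule in rules:
            yield apply_rule(word, rule)


if __name__ == "__main__":
    print(list(candidates(["password", "love"], [":", "c", "c$1", "$1 $2"])))
```

With W words and R rules you end up with W x R guesses, which is why a big ruleset hurts so much against a slow hash like WPA on a weak GPU.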

Decrypting the Packet Capture:

The easiest way to decrypt the encrypted session in the packet capture is to use Wireshark. I included a link to the official tutorial on how to do that at the top of this blog entry since I always have to look it up every time I do this. But here is a quick list of the steps:

Open the packet capture in Wireshark
In Wireshark go to Edit->Preferences
Next go to Protocols and expand it out. Select "IEEE 802.11". Note: I always forget it starts with IEEE.
Next to "Decryption Keys" click "Edit"
Press the "+" button to add a new key and pick "WPA-pwd". Then enter in the PASSWORD:SSID and click "save"

Trust me on taking the time to enter in the PASSWORD:SSID vs. just the password. It'll usually work with just the password, but I've wasted time troubleshooting when problems popped up due to me being lazy and just YOLOing in the password by itself.

Quick Disclaimer: I don't know if Wireshark is smart enough to realize the last ":" is the delimiter or not, so it might not actually mess up your decryption if a ":" is in the password. Setting up a wifi network and testing that out will need to be a rabbit hole for a different day.
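As an aside on why the SSID matters at all: WPA/WPA2-PSK derives its pairwise master key with PBKDF2-HMAC-SHA1 using the passphrase as the password, the SSID as the salt, 4096 iterations, and a 32 byte output. The sketch below shows that derivation with Python's standard library. The wpa_pmk helper name and the example values are my own, and it only covers the master key step, not the full handshake.

```python
import hashlib


def wpa_pmk(passphrase: str, ssid: str) -> bytes:
    """Derive the 256-bit WPA/WPA2-PSK pairwise master key (PMK)."""
    return hashlib.pbkdf2_hmac(
        "sha1",
        passphrase.encode("utf-8"),  # the wifi password
        ssid.encode("utf-8"),        # the network name acts as the salt
        4096,                        # iteration count used by WPA-PSK
        32,                          # PMK length in bytes
    )


if __name__ == "__main__":
    # Made-up values: same password, different network names.
    print(wpa_pmk("example-password", "NetworkA").hex())
    print(wpa_pmk("example-password", "NetworkB").hex())
```

The same password on two differently named networks gives two different keys, so any tool doing the decryption needs to know the SSID one way or another. Those 4096 iterations per guess are also a big part of why wifi cracking is slow.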

Once you do this, Wireshark will automatically decrypt the traffic making it really easy to start digging into it and looking for fun conversations and artifacts.

Analyzing the Packet Capture:

I'm sure there is some toolset or script out there that will parse through this decrypted traffic and pull out interesting passwords and information from it. My first instinct would be to go with DSniff, but looking at the repo it hasn't been updated in the last 14 years. (Side note: This also gave me the realization "OMG I'm old").

Question: What tools do you recommend to parse through packet captures and extract passwords and keys?

Reason: This manual process works for grabbing a couple of passwords from the packet capture but it really doesn't scale and I could easily be missing things.
Update: Thanks for all the suggestions everyone. I wrote a follow-up blog entry diving into some of the recommended tools [here].

Since I was curious and didn't want to spend six hours digging into another side-quest of investigating packet capture analysis tools, I'll admit for this first pass I just started scrolling through the decrypted traffic to see if there was anything interesting. And the TLDR was, yes, there was a lot of interesting traffic.

FTP Credential Analysis: I need to dig into this more, but at least in the initial street file hashes, there does not appear to be a user with the name "yashica". I might need to throw their password (along with others extracted from this packet capture) into a general wordlist and run it against all the uncracked hashes I have. That is, unless there is another hint file that provides further context. So the next question is: what about that "passwords-street" file they are downloading?

Yup, that could be useful as well! Now, I want to preface this next point by once again saying that there must be a better way to analyze these packet captures, and doing it by hand is tedious and error prone. But if you are going to parse it by hand like I am doing, you can look at the filter that gets applied in Wireshark when you follow a TCP stream and manually increment it by one to view the next TCP session. It's not a pretty way to do things, but it's much better than hunting and pecking and scrolling through the full packet capture.
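If the clicking gets old, the same "increment the stream number" trick can be scripted. When you follow a TCP stream, Wireshark applies a display filter of the form "tcp.stream eq N", and tshark offers a matching "-z follow,tcp,ascii,N" statistic. The sketch below only builds the filter strings and command lines. The helper names are made up, and it assumes tshark can already see decrypted traffic (for example a previously decrypted capture, or the same decryption key configured in the Wireshark profile that tshark uses).

```python
def display_filter(stream_index: int) -> str:
    """The Wireshark display filter that isolates one TCP stream."""
    return f"tcp.stream eq {stream_index}"


def follow_stream_command(pcap_path: str, stream_index: int) -> list[str]:
    """Build a tshark command that prints one TCP stream as ASCII."""
    return [
        "tshark",
        "-r", pcap_path,   # read from a saved capture
        "-q",              # suppress the per-packet listing
        "-z", f"follow,tcp,ascii,{stream_index}",
    ]


def follow_all_commands(pcap_path: str, stream_count: int) -> list[list[str]]:
    """One command per stream index, from 0 up to stream_count - 1."""
    return [follow_stream_command(pcap_path, i) for i in range(stream_count)]


# Example of actually running one of these (requires tshark to be installed):
#   import subprocess
#   result = subprocess.run(follow_stream_command("street.pcap", 0),
#                           capture_output=True, text=True)
#   print(result.stdout)
```

The number of streams is left as a parameter here since the simplest way to learn it is still to look at the capture.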

Other interesting side quests: There is one session where the user grabs the following webpage: "/obscure/path/for/extra/security/passwords-street.html".

Conclusion:

I know I started this post by saying that I needed to get away from all the plumbing of extracting/formatting hashes and start actually cracking the challenge passwords. Unfortunately, it looks like I totally failed in that task again. Now that we have some hint files and sample passwords though, making use of this data will hopefully be the subject of a follow up blog post. For now though I really enjoyed writing this blog post and getting back into practicing my wifi hacking skills. So thanks again to Korelogic for including this challenge in this year's contest.

CMIYC 2024: RAdmin3 Challenge

2024-08-14T22:04:00.000-07:00

"Nothing is more permanent than a temporary solution."

- Russian Proverb

Introduction:

This is a continuation of my write-up about this year's Crack Me If You Can challenge. You can view the previous entry focusing on the StripHash challenge [here]. Like the last write-up, this one is going to focus on one specific hash format (RAdmin3), details about that hash format, and how to load those hashes into a cracking session. I'm going to defer most of the actual cracking of these passwords to a later writeup though since running a successful cracking session relies on solving other challenges found throughout the contest.

Important Links, Tools, and References for this Post:

Synacktiv's blog post: Cracking Radmin Server 3 Passwords
Link: https://www.synacktiv.com/en/publications/cracking-radmin-server-3-passwords
Reason: This is really an amazing blog post going into dumping Radmin password hashes, reverse engineering their hashing algorithm, and then cracking them. I can't recommend this write-up enough if you want to crack Radmin password hashes.
radmin3_to_hashcat.pl
Link: https://github.com/hashcat/hashcat/blob/master/tools/radmin3_to_hashcat.pl
Note: This tool is part of the base Hashcat install
Reason: This is the primary tool to convert the Radmin3 dump that Korelogic provided to hashes that Hashcat can crack.
Example Hashcat Formatted Hashes
Link: https://hashcat.net/wiki/doku.php?id=example_hashes
Reason: You really should have this page bookmarked regardless of if you are competing in a competition or not. Whenever I'm starting a non-standard password cracking session I find myself referring back to this site to try to figure out what type of hash I'm dealing with, or to understand how I need to format it so I can crack it with Hashcat.

Background on Radmin3 Hashes:

I'll admit that before this contest, I had never encountered Radmin3 password hashes. That probably reflects on my lack of recent full scope pentest experiences. On the flip side, their inclusion in this year's CMIYC challenge also highlights the contest organizer's (Korelogic) experiences with pentesting since the entire challenge revolves around limitations in current public open-source tools supporting Radmin3 cracking.

To learn more about the Radmin3 hash itself, how to obtain them, and how they are used, I *strongly* recommend you check out Synacktiv's excellent blog post on this subject (linked above). The TLDR summary is that Radmin3 hashes are generated by a Windows remote server management tool (Radmin Server 3). My understanding is these hashes reside on the server itself (vs. being cached on client side systems), but this isn't a centrally managed tool so gaining access to one server and cracking an administrator's password will likely grant you access to other servers during an engagement. Also, having an administrator's cracked password is good for all sorts of other "fun" as well.

The hashes themselves are stored in the Windows registry and can be dumped using standard Windows registry tools. I'll dig more into that later as this was a core component of this year's CMIYC challenge. The actual hashing algorithm is weird. It starts out with a fairly standard:

sha_hash = SHA1(salt + SHA1("username:password"))

But then goes a bit crazy with:

Hash = pow(g,int(sha_hash.hex(),16),modulus)

The Radmin Server development team obviously has an aspiring cryptographer on their staff! Now while such an optimization is annoying/challenging to someone encountering it the first time, the Synacktiv team managed to find a number of attack optimizations that limit the value of that last step. This just goes to show how hard it is to develop a secure hashing algorithm.
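To make the two steps above a little more concrete, here is a minimal Python sketch of that shape of computation: a salted double SHA1 followed by a modular exponentiation. The generator, modulus, salt, and string encoding below are toy placeholders I picked so the example runs quickly. They are not the real Radmin parameters, so see the Synacktiv write-up for the actual values.

```python
import hashlib

# Toy placeholder parameters, NOT the real Radmin values.
TOY_GENERATOR = 5
TOY_MODULUS = 2**127 - 1
TOY_SALT = bytes.fromhex("00112233445566778899aabbccddeeff")


def radmin3_style_hash(username: str, password: str,
                       salt: bytes = TOY_SALT,
                       g: int = TOY_GENERATOR,
                       modulus: int = TOY_MODULUS) -> int:
    """Salted double SHA1, then g raised to that value modulo a big number."""
    # Step 1: sha_hash = SHA1(salt + SHA1("username:password"))
    # Encoding the string as UTF-8 is an assumption made for this sketch.
    inner = hashlib.sha1(f"{username}:{password}".encode("utf-8")).digest()
    sha_hash = hashlib.sha1(salt + inner).digest()

    # Step 2: Hash = pow(g, int(sha_hash.hex(), 16), modulus)
    return pow(g, int(sha_hash.hex(), 16), modulus)


if __name__ == "__main__":
    print(hex(radmin3_style_hash("admin", "example-password")))
```

Python's three argument pow() does the modular exponentiation efficiently, but with a realistically large modulus that last step still costs noticeably more per guess than the two SHA1 calls, which is where attack optimizations earn their keep.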

Extracting the Radmin3 Hashes:

In the first PGP encrypted file KoreLogic provided for the CMIYC contest there was a file called radmin.reg that was 6473 lines long and formatted like a Microsoft Windows registry dump of the keys/values in:

[HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Radmin\v3.0\Server\Parameters\Radmin Security\]

Looking at this data, the filename itself, and the fact that Radmin3 hashes were listed in Korelogic's contest scoreboard it's pretty easy to guess there are Radmin3 password hashes in this file. So how do we convert these to a format we can crack? The short answer for me was to do a lot of Googling, but the more repeatable answer is that there is a tool called radmin3_to_hashcat.pl that comes bundled with Hashcat. A direct link to it is listed above as well. So let's try to use that Perl script on the challenge registry file!

Not really surprising the script errored out. Korelogic certainly doesn't have a track record of making things easy for players. Let's look at the radmin3_to_hashcat.pl script to understand what may be breaking.

And let's compare it to the start of our challenge registry file.

Oh hey, it looks like some extra registry keys and debugging info were included in the file by the Windows Registry Editor used to dump the registry keys. Let's delete those key/values and try to run the radmin3_to_hashcat.pl script again.

So it extracted one hash at least. Looking at the registry file though, there are a ton of hashes in it. Unfortunately it looks like the Perl script is not looping through all of those hashes.

Digression/Side-Note: This is more gossip than a technical point, but this challenge design feels like one of the Korelogic staff encountered this problem on a pen test engagement, was very annoyed, and thought "oh we can torture players with this and someone might fix the Perl script for us at the same time!" Props to Korelogic if this is the case. I mean I can't judge. I've certainly designed CTF challenges to get players to solve problems for me before. If you ever participate in the Biohacking Village CTF, the lateral movement challenge from a physical infusion pump falls into this category. I had no idea how to solve it due to a buggy BusyBox telnet binary (those words together should concern you since this is a medical device) so I thought "hey, let's just throw it into the CTF with a disclaimer and see if someone else can do it." Research and development through CTFs is a powerful tool!

Ok, back to the technical challenge at hand! Let's take stock of where we are:

We have a bunch of hashes in a registry file
We have a way to extract the first hash from the registry file

That's a pretty good spot to be. From here, the following two paths could be taken to be able to extract all the Radmin3 password hashes from the registry file:

Option 1) I could modify the Perl script to include a loop so that it would repeat the parsing logic on each registry key.

Plus: The resulting Perl script would be helpful for people after the contest.
Plus: I could add additional logic to dump the entry number of the registry entry as a userid in case there were some association attacks enabled by other Korelogic challenges.
Minus: I couldn't simply add in a top level loop due to some design decisions of the script. So I'd also need to add logic to detect when it was done parsing the hash file, which involved writing some basic Perl code.
Minus: I really didn't want to write any Perl code

Option 2) I could write a quick script to cut up the registry file into many, many different files. Each file would contain one registry value. I could then run the Perl script inside another script so that I could automate it running against all those files and collect the output.

Minus: This is janky as all get out
Minus: This is the type of solution that you deploy as a temporary fix, it ends up in production, and years later someone looks at it and goes "what the hell happened here?!"
Minus: This isn't really useful to the larger password cracking community
Minus: This will probably take longer to implement than writing around 10 lines of Perl code to accomplish it using method #1
Plus: I could implement this in Python

Solution: I obviously picked Option #2!

I'm not proud of this, and looking back at the challenge I really should have picked option #1. Now Option #2 would be more justifiable if I had scripted up an elegant solution like player 64_nickels described on the CMIYC Contest Discord server:

My solution was ... not that clean.

I can make a lot of excuses for my ugly approach, but I have to say, it did work!
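For anyone who wants to try the Option #2 route themselves, here is a rough sketch of the "cut the registry file into one value per chunk" idea. To be clear, this is neither my contest script nor 64_nickels's solution. It assumes a text-format .reg export where each value starts with a quoted name, long hex values continue on following lines after a trailing backslash, and key names sit in square brackets. The split_reg_values helper and the choice to repeat the header in every chunk are assumptions, and whether the Perl script accepts the result is something you would need to verify.

```python
REG_HEADER = "Windows Registry Editor Version 5.00"


def split_reg_values(reg_text: str) -> list[str]:
    """Split .reg text into chunks that each hold a single registry value.

    Every chunk repeats the header and the key the value lives under, so
    each one can be saved to its own file and handed to a one-hash-at-a-time
    parser.
    """
    chunks: list[str] = []
    current_key = ""
    value_lines: list[str] = []

    def flush() -> None:
        # Emit the value collected so far (if any) as its own chunk.
        if value_lines:
            chunks.append(
                "\n".join([REG_HEADER, "", current_key, *value_lines, ""])
            )
            value_lines.clear()

    for line in reg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A new registry key begins; close out the previous value first.
            flush()
            current_key = stripped
        elif stripped.startswith('"') or stripped.startswith("@"):
            # A new named (or default) value begins.
            flush()
            value_lines.append(stripped)
        elif value_lines and value_lines[-1].endswith("\\"):
            # Continuation of a multi-line hex value.
            value_lines.append("  " + stripped)
        # Anything else (the original header, blank lines) is skipped.
    flush()
    return chunks
```

From there it is a loop that writes each chunk to a temporary file, runs the extraction script on it, and collects the output. Keep in mind that regedit exports are often UTF-16 encoded, so the file may need to be decoded accordingly before passing its text to a function like this.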

Cracking the RAdmin3 Hashes:

So this part is going to be pretty boring. The Perl hash extraction script came from Hashcat's repository, so it stands to reason that Hashcat can crack these hashes. Looking at the different Hashcat modes (link above) you can see there is an entry for Radmin3 hashes (Mode 29200). Looking at the example hash, it looks very similar to what we were able to dump:

Without going into too many more details about the cracking session, I can verify that Hashcat can crack these hashes using that mode, but that it's a slower attack, and I didn't crack many of these hashes using "standard" password cracking wordlists/rulesets for slow attacks.

So you might be wondering, can you crack these hashes with John the Ripper? The short answer is no, (John only supports Radmin2 hashes), but this provides a good opportunity to talk about a John the Ripper feature I learned about after my last blog post. You can have John run through a mixed hash list and output all the possible hash types in JSON format using the --show=format command. For example you can run the following command on the main CMIYC2024 cmiyc-2024_street_passwd_1 mixed hash list:

./john --show=format cmiyc-2024_street_passwd_1

That command will cause John the Ripper to parse the hash file and return the following output which is pretty cool and can then be read into other analysis programs:

Long story short, if you do this on the Radmin3 hashes, John the Ripper will report that it doesn't support them. But this was a feature of John that I wasn't aware of before, so I wanted to share it since I think it's super neat. Also it may point to some future blog posts :)

Conclusion:

I need to give Korelogic credit. This was a fun challenge and I'm glad they included it in this year's contest. I'm probably going to wait a bit to see if someone else beats me to it, but if nobody else steps up I may take a swing at modifying the Radmin3 extraction program to be a bit more robust to registry dump headers, unassociated keys, and to support multiple Radmin3 hashes. So fingers crossed someone else is working on it!

CMIYC 2024: Striphash Challenge

2024-08-12T20:34:00.000-07:00

"I can accept failure. Everyone fails at something. But I can't accept not trying."

- Michael Jordan

Introduction:

First off, I really want to thank the team over at Korelogic for putting together a truly impressive contest. Korelogic always uses the CMIYC contest to push for change/improvements in password cracking tools and this event in particular was jam packed with different challenges that forced teams to really stretch their skills vs. letting their GPUs go Brrrrrrrrrrrr.

Second, I'd like to compliment the skill shown by all the players. One thing Korelogic mentioned after the contest was that the Street challenges were the same level of difficulty as those given to the Pro teams. Looking at the scoreboard and seeing street teams succeed like they did highlights the level of ability on display. As far as the number of players who just popped on to learn something new, that's also impressive. There's a ton of stuff going on during Defcon and the fact that someone decided to try their hand at a password cracking competition vs. one of the other million neat things happening is really special.

One thing that really struck me about this year's CMIYC competition was the amount of content in it. In past years I felt like I had a chance to really experience about 50-70% of the contest (the slow hashes always gave me a problem). This year I felt the number was closer to 10%. I'd see a new PCAP get dropped and go "Wow, that would be fun to decrypt. I'm sure it would give tips and/or new hashes to crack, but I have no free time to mess with that so moving on..." That's where having a blog is nice. I can revisit these challenges at my leisure, as well as incorporate lessons learned from other players' experiences and writeups. To that end I'm going to try and make a number of smaller blog posts focusing on individual aspects of the contest. Now there is going to be a lot of overlap since everything is related, but hopefully this can keep things more manageable for me so that I actually post something vs. having an entry sit in my draft folder for the next several years.

Important Links, Tools, and References for this Post:

My JupyterLab Password Cracking Framework

Link: https://github.com/lakiw/Jupyter-Password-Cracking-Framework
Reason: This contest was the first time I really got to use the Jupyter Lab Password Cracking Framework in real time during a competition. While there were a lot of rough spots, it proved very helpful for dealing with the Radmin and Striphash challenges. A lot of the code I'm going to show in this blog entry is a screenshot from the tool which you can download. My apologies for not including text, but I don't know of a way to reliably render code blocks in Google blogger. Long story short, being able to download, run, and view the JupyterLab notebooks will help add context around a lot of the items discussed in this blog post.
Disclaimer: I need to clean my Notebooks from the contest up before I push them to Github. So it may be a week or so before I actually update the repo with the examples I'm showing here.

Example Hashcat Formatted Hashes

Link: https://hashcat.net/wiki/doku.php?id=example_hashes
Reason: You really should have this page bookmarked regardless of if you are competing in a competition or not. Whenever I'm starting a non-standard password cracking session I find myself referring back to this site to try to figure out what type of hash I'm dealing with, or to understand how I need to format it so I can crack it with Hashcat.

Example John the Ripper Formatted Hashes

Link 1: https://openwall.info/wiki/john/sample-hashes
Link 2: https://pentestmonkey.net/cheat-sheet/john-the-ripper-hash-formats
Reason: Just like with Hashcat, it's helpful to have some examples when trying to crack newly encountered hash formats in John the Ripper. These sites are a bit harder to read and search than Hashcat's site, but they are still a super helpful resource.

Korelogic Score Page

Link: https://contest-2024.korelogic.com/stats.html
Reason: This was how I figured out what hash types were being provided to us to crack

ADSync Hash Format

Link: https://aadinternals.com/talks/Attacking%20Azure%20AD%20by%20abusing%20Synchronisation%20API.pdf
Reason: Not super applicable to these challenges, but I wanted to put it here so I won't forget about it when talking about cracking ADSync hashes later.

ThatOnePasswordWas40Passwords CMIYC2024 Writeup:

Link: https://github.com/ThatOnePasswordWas40Passwords/crackmeifyoucan/tree/main/2024
Reason: An excellent writeup by the winner of this year's Street competition. I highly recommend checking it out. I'm certainly still going through it and trying to learn from their experiences!

Loading the Mixed Hash Lists:

Description of Hash Lists:

At the start of the contest Korelogic provided two PGP password encrypted files as well as the decryption password. The first file, cmiyc-2024_street_passwd_1, was a standard mixed hash list consisting of various different hash types in the general format:

username:hash

The hashes themselves were not identified so it was up to players to figure out what hash format they were targeting. Admittedly you can often let your cracking tool autodetect hash formats, but like everything this contest provided some "twists" which meant you couldn't just rely on your password cracker's autodetect feature.

The second challenge file was a PGP password encrypted tar file, which Korelogic also provided the password for. This tar file contained a number of different encrypted hint files as well as a Windows registry entry containing Radmin password hashes. For now though, let's focus on the first mixed hash list.

Identifying Hash Types and Loading Them In the Framework:

One quick way to know what hash formats might show up in that mixed hash list is to simply look at the Korelogic scoreboard. There they listed a number of different formats:

I'll admit while some of these formats were familiar, a lot of them such as striphash, radmin3 and adsync I've never really heard of or played with before. So at the start of the contest I ended up spending a lot of time trying to figure out how to identify the hash formats and add them to the autodetection scripts in the JupyterLab Framework. To do that I heavily relied on the Hashcat and John the Ripper hash example websites listed above, as well as a healthy degree of googling for things like "hashcat adsync".

To share my findings, in the JupyterLab Framework there is a file called hash_fingerprint.py that is responsible for detecting the hash type given a raw password hash. This python code uses two different ways to detect a password hash type. The first method is to look for distinctive formatting options of the hash itself. This basically boils down to a lot of "if/then" statements such as what's below:

elif raw_hash.startswith("$radmin3$"):
    hash_info['jtr_mode'] = "unknown"
    hash_info['hc_mode'] = "29200"
    hash_info['type'] = "radmin3"
    hash_info['cost'] = "medium"

This code will identify the following hash as type Radmin3:

$radmin3$75007300650072006e0061006d006500*c63[Results Truncated]

The second way to identify hash types is based on the length of the hash. This is much more problematic since many different hash types have the same length. For example MD4 and MD5 hashes have the exact same length. Therefore you can feed the Python script a list of the hash types you expect to find in the dataset you are processing to help deconflict hashes of the same length.
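To make the two detection methods concrete, here is a minimal sketch of prefix matching followed by length matching with an "expected types" filter. This is not the actual hash_fingerprint.py code; the function name, the tables, and the return values are assumptions made up for illustration.

```python
# Hypothetical sketch of prefix-then-length hash fingerprinting.
# The tables below are illustrative and far from complete.
PREFIX_SIGNATURES = {
    "$radmin3$": "radmin3",
    "$1$": "md5crypt",
    "$2a$": "bcrypt",
    "$2b$": "bcrypt",
    "$2y$": "bcrypt",
}

# Several hash types share a length, so each length maps to a list.
LENGTH_CANDIDATES = {
    32: ["raw-md5", "raw-md4"],
    40: ["raw-sha1"],
}

HEX_CHARS = set("0123456789abcdefABCDEF")


def fingerprint(raw_hash, expected_types=None):
    """Return a hash type name, or "unknown" if it cannot be decided."""
    # Method 1: distinctive formatting of the hash itself.
    for prefix, hash_type in PREFIX_SIGNATURES.items():
        if raw_hash.startswith(prefix):
            return hash_type

    # Method 2: length, which only makes sense for plain hex strings.
    if not raw_hash or not set(raw_hash) <= HEX_CHARS:
        return "unknown"
    candidates = LENGTH_CANDIDATES.get(len(raw_hash), [])
    if expected_types is not None:
        # Use the caller's list of expected types to deconflict.
        candidates = [c for c in candidates if c in expected_types]
    if len(candidates) == 1:
        return candidates[0]
    return "unknown"
```

Without a list of expected types a 32 character hex string stays "unknown" in this sketch, since MD4 and MD5 cannot be told apart by length alone.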

Now this isn't ideal. First of all, this is code I'm maintaining so there are huge gaps in the hashes it supports. I basically ended up spending a bunch of time at the start of the contest updating the script to support new hash types vs. actually doing cracking. Also, the detection logic and code quality leave a lot of room for improvement. That's one area where I was really excited to learn from the "That One Password was 40 Passwords" writeup (linked above). They mentioned a Python library called "Name That Hash" [Link] that I was not previously aware of. I need to check that out to see if I can outsource some of that work to other tools and limit the amount of work I need to do to intake contest hashes.

Question: Are you aware of any other scripts or libraries for autodetecting hash formats that don't involve just running Hashcat/JtR/MDXFind and letting them try to load and crack the hashes?

Reason: I'd love to make use of an existing toolset/effort vs. trying to implement and maintain my own Python code. So if you have any suggestions, please let me know!

That's a lot of words to say that once I loaded cmiyc-2024_street_passwd_1 into my JupyterLab framework I got three main warnings/errors.

It did not find any Radmin3 hashes despite the fact that I put them down as a potential target

This makes sense, since there is a huge file of Radmin3 hashes in the second challenge file (the tar archive).

No hashes of type Striphash were found

This also makes sense since I couldn't find any reference to an official hash format called striphash. So I hadn't added a signature for it into the hash autodetect script

An absolute ton of uncategorized hashes were found, ranging from 33 to 40 hexadecimal characters long.

Based on the name "Striphash", I suspected that striphash represented truncated password hashes.

To that end I created a custom length helper for each length of striphash (aka striphash34, striphash35, etc), since I figured there might be some unique features of cracking each hash length so I wanted to keep them in different categories for cracking/analysis. Once I did that I was able to read in the full mixed hash list into my JupyterLab framework. This was nice since not only did it let me quickly be able to access the hashes and metadata and write custom Python scripts, but I could also print out a scoreboard and track how many hashes were in each category without having to depend on the public Korelogic scoreboard. I'll talk about this more later, but this helped when I was troubleshooting different attacks against the striphashes.
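A rough sketch of what a per-length helper plus a local scoreboard could look like is below. The function names and the 33 to 39 character range are assumptions for illustration, not the Framework's actual code.

```python
from collections import Counter


def striphash_label(raw_hash, min_len=33, max_len=39):
    """Label a short hash by its length, e.g. "striphash39".

    Returns None for anything outside the assumed length range.
    """
    if min_len <= len(raw_hash) <= max_len:
        return f"striphash{len(raw_hash)}"
    return None


def local_scoreboard(hashes):
    """Count how many hashes fall into each striphash category."""
    labels = (striphash_label(h) for h in hashes)
    return Counter(label for label in labels if label is not None)


# Usage with made-up placeholder hashes:
for label, count in sorted(local_scoreboard(["a" * 39, "b" * 39, "c" * 38]).items()):
    print(f"{label} :{count}")
```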

Cracking Striphash:

Identifying the Base Hashing Algorithm:

Now, my original theory was that Striphash was a set of truncated hashes, but that still left the question open: What was the original hashing algorithm? How can we figure that out?

Option #1: Use MDXFind to identify unknown hashes

MDXFind is an amazing password cracking tool for cracking unknown and nested hashes. If you don't know how a salt is applied to a hashing algorithm, MDXFind is the go-to tool for cracking that hash. Now MDXFind isn't magic. It still requires you to configure it to run an effective/successful cracking session. But MDXFind automates all the various hash format mangling tasks that you might have to do manually otherwise.
Tutorial: I have some base instruction on how to install and use MDXFind in the blog post linked below. It is under "Tip #4: Leverage MDXFind to identify unknown hash types".
Link: https://reusablesec.blogspot.com/2022/05/password-cracking-tips-crackthecon.html

Option #2: Look at the hash length, and then manually YOLO it

I may have taken this approach during the contest. I'm not proud of it. But I looked at the longest Striphash being 40 characters, went "hey that's how long a Raw-SHA1 hash is!" and then threw it into a John the Ripper cracking session using that format to see if I cracked anything.
John the Ripper's Batch mode attack can be very good for this. It combines attacks against the username (if one exists), and a pretty good "base" wordlist and mangling rules that usually will crack something on larger password dumps. It will then switch to Incremental (Markov enhanced brute force) if those attacks are not successful, giving you an even better chance of cracking a weak password.

Here is a screenshot of me YOLOing it:

Given these results, my going assumption was that Striphash was a bunch of truncated Raw-SHA1 hashes. This was additionally boosted by the very famous 2012 LinkedIn breach where in the initial dump many of the hashes had their first five hex characters replaced by zeros [Link]. As a spoiler, this assumption wasn't exactly right.... But it was a good place to start from.

Cracking StripHash39:

Given that the 39 character hashes were likely missing a single hex character, I could recreate the hashes by simply adding an extra character to them. That increases the size of the target list by a factor of 16 [0-F], but is totally doable for the 1222 StripHash39 hashes. In the JupyterLab Framework I have a function called ServiceMgr.create_left_list(format, file_name, hash_type, metadata) which returns a list of hashes I haven't cracked yet while also allowing me to further limit the results based on hash types and metadata filters. I can then write a quick Python script to append a hexadecimal character to the end and write it back to a file to crack.
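A minimal sketch of that padding step is below. It does not call the Framework; it assumes you already have the uncracked 39 character hashes in a Python list, and the function names and lowercase hex choice are assumptions for illustration.

```python
HEX_DIGITS = "0123456789abcdef"  # assumes the hash list uses lowercase hex


def append_hex_digit(short_hash):
    """Return the 16 candidates made by appending one hex digit."""
    return [short_hash + digit for digit in HEX_DIGITS]


def write_padded_hashes(short_hashes, out_path):
    """Write every padded candidate to a file, one hash per line."""
    with open(out_path, "w", encoding="utf-8") as out_file:
        for short_hash in short_hashes:
            for candidate in append_hex_digit(short_hash):
                out_file.write(candidate + "\n")
```

For 1222 input hashes this produces 1222 * 16 = 19552 lines, which is still a small target list.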

Easy Peasy Lemon Squeezy. The only thing remaining was to try my YOLO hash detection method (run a basic attack against this using JtR's Batch mode) and see if I get any cracks....

Annnnnd nothing...

Thinking for a second, I realized maybe I should have inserted the extra hexadecimal character on the front of the hash like in the LinkedIn breach. It was a quick one line change to modify the file write in the script above. Then all I had to do was rerun the cell and conduct another YOLO JtR Batch attack. This time it yielded better results.

Looking at my potfile of cracked passwords also revealed something interesting.

Looks like all of the cracked hashes started with a '0'. That would seem to make things a lot easier, as rather than increase the size of my cracking file by a factor of 16, it doesn't increase the size at all. Still, something didn't sit right with me. Going back to the hash distributions above the number of each length strip hashes was:

raw-sha1 :884
striphash39 :1222
striphash38 :785
striphash37 :332
striphash36 :87
striphash35 :16
striphash34 :4
striphash33 :1

If striphash39 was just all the hashes with a leading 0, it should be roughly 1/15th the size of striphash40 (raw-sha1). But there are more hashes for it. I'll admit in the moment I shrugged off that bad feeling thinking that Korelogic must have just increased the included hashes with a leading zero so they would be worth more relative points for figuring out how to add the 0 to the front. But I should have listened to that bad feeling more. But we'll get back to that!

Cracking StripHash38:

Given the success with StripHash39, it seems like it would be straightforward to generate hashes for StripHash38. Just add two 0's in the front of the hashes! But doing that didn't yield any cracks. Same problem with other variations, such as adding two 0's to the end of the hash, or one 0 at the front and the end. What's going on?

I was obviously missing something and that's when I thought back to the length distributions. It didn't make sense. So manually looking at the hashes I started to get a hunch. To verify that hunch I wrote a quick and dirty script to see if there were other positions besides the first hexadecimal number where 0's did not show up in the Strip40 hashes.
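A minimal sketch of such a position scan is below. It is a reconstruction of the idea rather than the notebook code, and the function name is hypothetical.

```python
def zero_free_positions(hashes, length=40):
    """Return the zero-indexed positions where no full-length hash has a '0'.

    Hashes that are not exactly `length` characters long are ignored.
    Note that with no full-length hashes every position is reported.
    """
    full_hashes = [h for h in hashes if len(h) == length]
    return [
        position
        for position in range(length)
        if all(h[position] != "0" for h in full_hashes)
    ]


# Toy usage with 4 character strings instead of real hashes:
print(zero_free_positions(["1020", "a0b0"], length=4))
```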

I'm not proud of the code, but it works and when I'm troubleshooting a problem I figure it's better to get something running vs. trying to make it pretty. Going through each position in the hash I found that there were no 0's in any even spot of the Strip40 hashes (zero-indexed). Or in the way we speak about it, there were no 0's in any odd spot (aka the 1st character, the 3rd character, the 5th character, etc).
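This finding can be sanity checked against the length distribution listed earlier. The sketch below assumes a simple model that was not confirmed by Korelogic: every SHA1 hash has 20 bytes, the first hex digit of each byte is 0 with probability 1/16 independently, and every such 0 was stripped. Under that model the number of stripped characters follows a binomial distribution.

```python
from math import comb

# Observed counts by hash length, copied from the distribution above.
OBSERVED = {40: 884, 39: 1222, 38: 785, 37: 332, 36: 87, 35: 16, 34: 4, 33: 1}


def expected_length_counts(total, slots=20, p_zero=1 / 16):
    """Expected number of hashes per length under the binomial model.

    `slots` is the number of positions a zero can be stripped from
    (one per byte), so a hash missing k zeros has length 2*slots - k.
    """
    expected = {}
    for k in range(slots + 1):
        probability = comb(slots, k) * p_zero**k * (1 - p_zero) ** (slots - k)
        expected[2 * slots - k] = total * probability
    return expected


total = sum(OBSERVED.values())
expected = expected_length_counts(total)
for length in sorted(OBSERVED, reverse=True):
    print(f"length {length}: observed {OBSERVED[length]:5d}, expected {expected[length]:8.1f}")
```

With these assumptions the model predicts roughly 1,220 hashes of length 39, which lines up with the 1222 observed and explains why striphash39 is larger than raw-sha1.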

This meant I was missing out on attacking a large number of the Strip39 hashes since that 0 could have been stripped out of 19 other locations. Modifying my code was pretty easy and now I could start targeting all of the Strip39 hashes along with Strip38 hashes, all the way to Strip33. Below is an example of my Strip38 hash generation code:
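The notebook code is not reproduced as text here. As a stand-in, the following is a minimal sketch of the same idea, assuming the stripped 0's can only come from even zero-indexed positions of a 40 character hash. The function name and the pruning option are hypothetical and are not taken from the Framework.

```python
from itertools import combinations


def restore_zeros(stripped, full_length=40, prune=True):
    """Return every full-length candidate for a stripped hash.

    Zeros are re-inserted only at even (zero-indexed) positions. With
    prune=True, candidates that still hold a '0' at some other even
    position are dropped, since under the assumed model that zero
    would have been stripped as well.
    """
    missing = full_length - len(stripped)
    even_slots = range(0, full_length, 2)
    candidates = []
    for zero_positions in combinations(even_slots, missing):
        zeros = set(zero_positions)
        remaining = iter(stripped)
        chars = []
        for position in range(full_length):
            chars.append("0" if position in zeros else next(remaining))
        candidate = "".join(chars)
        if prune and any(
            candidate[p] == "0" for p in even_slots if p not in zeros
        ):
            continue
        candidates.append(candidate)
    # Drop duplicates while keeping the generation order.
    return list(dict.fromkeys(candidates))
```

A 39 character hash yields at most 20 candidates and a 38 character hash at most 190 (20 choose 2), so the blow-up stays manageable even for the shorter striphashes.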

Note: Some of the complexity in the hash generation was me trying to get fancy and update the target user to include information on which 0's were added to the hash. I was worried about how to convert these hashes back so I could submit them to Korelogic for points. Later on when I went to submit them though I found a much more elegant (and easy) solution. Since I had all the username/hashes loaded up in the Framework, I could simply generate a list of users/cracked-passwords using John the Ripper's show function. For example you can run:

john --show --pot=[PATH TO YOUR POTFILE] [HASHLIST]

This will print all your cracked hashes along with the usernames associated with those hashes from the [HASHLIST]. Here is me running that command:

The first value in the username listed there is the JupyterLab Framework's internal ID for a hash since I figured the same user might have multiple different hashes in a challenge set. If I had my act together I would have included the real username in the cracking files to aid in Batch mode attacks, but this helped me read the results of the "john --show" command back in and associate the cracks with the truncated hashes Korelogic expected us to submit to them.
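As a rough illustration of reading those results back in, the sketch below parses "john --show" text into a dictionary. It assumes the hash list was in plain username:hash form, so each output line looks like login:plaintext, and it treats the whole login field as the lookup key. The function name and the sample IDs are made up.

```python
def parse_john_show(output_text):
    """Map each login field to its cracked plaintext.

    Lines without a colon (blank lines and the trailing
    "N password hashes cracked, M left" summary) are skipped.
    """
    cracked = {}
    for line in output_text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, since plaintexts may contain ':'.
        login, plaintext = line.split(":", 1)
        cracked[login] = plaintext
    return cracked


sample = "17:password1\n42:pass:word\n\n2 password hashes cracked, 1 left\n"
print(parse_john_show(sample))
```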

Once I had the cracked hashes back in the JupyterLab framework I could then use my normal scripts to generate a submission of newly cracked hashes to Korelogic.

Addendum: Cool Hashcat Fact

One thing I learned after the contest is that Hashcat will ignore the first five hex characters of a raw-SHA1 hash. This is a holdover of a modification to allow it to natively crack the original mangled LinkedIn dump, but it also means you could have used it to crack at least some of the Striphash challenges without really knowing what was going on under the hood. Here is a forum post talking about this [Link].

Quoting the example in the article:

SHA1(testing123) = 4c0d2b951ffabd6f9a10489dc40fc356ec1d26d5
But hashcat will find this hash for testing123:
hashcat -m 100 -a3 00000b951ffabd6f9a10489dc40fc356ec1d26d5 testing?d?d?d
00000b951ffabd6f9a10489dc40fc356ec1d26d5:testing123
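To see what "ignoring the first five characters" means in practice, here is a small sketch that compares a computed SHA1 against a target while skipping a prefix. It only illustrates the comparison described above and is not how Hashcat implements it internally; the function names are assumptions.

```python
import hashlib


def sha1_hex(plaintext):
    """Return the SHA1 of a string as lowercase hex."""
    return hashlib.sha1(plaintext.encode("utf-8")).hexdigest()


def matches_ignoring_prefix(candidate_hex, target_hex, skip=5):
    """Compare two hex digests while ignoring the first `skip` characters."""
    return candidate_hex[skip:] == target_hex[skip:]


# The mangled target from the forum example quoted above.
target = "00000b951ffabd6f9a10489dc40fc356ec1d26d5"
print(matches_ignoring_prefix(sha1_hex("testing123"), target))
```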

More discussion was had about this in the after-the-contest Korelogic CMIYC Discord channel where it was revealed that most password crackers ignore a lot of the bits of different hash types when doing comparisons. In practice this isn't a problem (and it yields significant speed bonuses) since the chances of a random collision occurring are astronomically low. But now that Korelogic knows about that, well it's something to keep in mind for next year's contest!

Actually Running Cracking Sessions:

I know this entire blog post focused on setting up cracking sessions for targeting the Striphashes, but I haven't really talked about how to go about actually cracking them once all the plumbing has been taken care of. I'm going to defer that conversation to a later blog post since:

This post is already too long
I'm still learning about how these plaintexts were generated myself. Most of my attacks were not as successful as I expected they would be. I need to dig more into why that's the case.

Now I have a pretty good suspicion that the solution to cracking these hashes was scattered throughout all the hints and sub-challenges Korelogic released during the contest. While I didn't have time to dig into those hints/challenges during the contest, I can certainly investigate them now even though the contest is over. So that's a lot of words to say, I'll try to address this in a future blog post.

Conclusion:

I hope this was helpful. The striphashes weren't worth a lot of points, so spending time on these was more of a side-quest than trying to rank up in the contest itself. Still I had a lot of fun trying to figure this challenge out. In the upcoming weeks I will try to post more challenge focused blog entries (Next up: The RAdmin hash challenge) as well as a more general overall writeup of my experiences competing in the contest as a whole. I will also try to update the JupyterLab framework with the lessons learned from this contest. Some of the big items are John the Ripper log cleanup scripts (since the logs take too long to parse for my liking), better support for viewing cracked hashes, and better support for automatic hash submission. I'd also like to try out some 3rd party libraries for hash identification vs. using my own code for that. Anyways, it gives me an excuse to do something other than update my PCFG code which is something I've been procrastinating on for (cough) years ;p

Tutorial for CMIYC2024: Registering a Team and Cracking Test Hashes

2024-08-04T22:14:00.000-07:00

Whenever a thing is done for the first time, it releases a little demon.

- Emily Dickinson

Getting Ready For Crack Me If You Can 2024

August is like New Years for me. With BSides Vegas, Defcon, and all the other hacker conferences, I find myself taking stock of the previous year, realizing all the plans I did not accomplish, and I find myself setting goals for the next year. One of the things I am happy about though is that I now have a password cracking framework to help me compete in this year's Crack Me If You Can competition. Like the last couple of years I plan on competing by myself on team "Reusablesec" since I think it's important to try (and probably fail) vs. just talk about password security.

I'd also like it if new players competed and learned from this contest. The problem is, the Crack Me If You Can contest can be intimidating, and the barrier to entry can be high. I mean, you need to figure out how to use PGP, which isn't easy, and that is on top of installing/configuring/using password cracking programs! To that end, this blog entry is focused on highlighting how you can use the Password Cracking Framework to get your team registered, and will walk you through how to crack the test hashes and how to submit them. That way you can focus on the fun parts such as cracking hashes and trying to figure out whatever Led Zeppelin themed challenge Korelogic is sure to throw at you this year.

Note: While the test hashes will be "spoiled" in this blog entry, I WILL NOT be posting any actual contest solutions until after the competition is over. If you want to attempt to crack the test hashes yourself (which would be awesome) feel free to skip this blog entry and come back to it after you are done so I don't spoil the challenge for you.

Resources:

Official Contest Resources:

Korelogic Contest Website: [Link]
Korelogic Registration Info: [Link]
Korelogic Hash Downloads: [Link]
Korelogic Scoreboard: [Link]

Password Cracking Framework Resources:

Github Site for Code: [Link]
Previous CMIYC Tutorial (Part 1): [Link]
Previous CMIYC Tutorial (Part 2): [Link]
Previous CMIYC Tutorial (Part 3): [Link]
Previous CMIYC Tutorial (Part 4): [Link]

Using The Framework:

The rest of this tutorial is really going to focus on using the Password Cracking Framework to help automate some of the tasks in the CMIYC contest. All the Pro teams have their own frameworks, so I developed this framework to be something anyone could use. The framework is written as a Python backend for a JupyterLab Notebook. Why JupyterLab? Well I never want to write a GUI so JupyterLab takes care of much of the frontend for me. Also, I've found the interactive shell into my Python backend to be very helpful when playing around with data.

You can download and install the framework from the link above. The real source of truth will be the CMIYC_2024_Examples.ipynb Notebook vs. this blog entry, as I will probably be making changes and updates to it throughout the contest. While I won't be posting any contest hashes, I will likely update the Framework to handle file formats that Korelogic throws at the street teams. When you are using this Notebook, I'd recommend using the example one as a tutorial, but then creating your own Notebook to manage your own personal cracking sessions. That way you can arrange your Notebook the way that works best for you, and if I make changes to the tutorial/example during the contest you don't have to worry about git merge conflicts.

For the rest of this blog entry I'm largely going to be referencing the Notebook, so I'd recommend taking the time to download and install it before continuing.

Registering a Team:

Korelogic requires teams to perform all communications using PGP. For most people, this probably means using the GNU PGP client. For me, GNU stands for "Going to Never Use it" since the tutorials and man pages are often horrible. So the first order of business for me getting ready for this year's CMIYC was to automate PGP key generation, encryption, and decryption using the Python library PGPy [Link].

If you want to use a different PGP program to create a team, absolutely go ahead and do that! But in the Password Cracking Framework I created a Python class called PGPMgr that you can call through the Jupyter Notebook instead.

The first function is generate_read_pgp_private_key. This will create a new PGP key (if one does not already exist) and write it to the key_file name that you send to it. If a PGP key already exists it'll validate the key and then return it. This way you aren't creating a new key every time this is run. Feel free to skip using this function if you already have a PGP key you want to use, but even the conference organizers recommend using a contest specific PGP key to make it easier to partner with other players. I'd also recommend backing up your key in case something happens during the contest. Here is the output of running it in the Notebook:

My apologies that it's a screenshot vs. text, but that's why it's probably easier to use this in the framework.

To encrypt/decrypt messages I then created the PGP_Mgr class in the Framework. I figured a class would be better than having individual functions since I didn't want to have to keep passing in my private key as well as KoreLogic's public key.

While I could try to integrate this with individual e-mail services, that seems like a lot of work so PGP_Mgr will output encrypted messages to this workbook as well as save the messages to a file. It's up to you then to actually copy/paste (or attach) that message into an e-mail and send it to KoreLogic at sub-2024@contest.korelogic.com

Hopefully you are getting an idea about what this framework is now! I could keep copying and pasting screenshots from it, but I really recommend you just check it out instead! Even if all you use this for is just to register a team and not to crack any actual passwords, I hope this Framework helps you out. As a side tangent, I also need to come up with a better name for this Framework. I'm thinking CRIPTs (Collaborative Research Into Password Threats), but I'm open to other suggestions!

Cracking the Test Hashes:

For the test hashes, there are two different challenge files:

test_1_passwd.hash

Contains 5 different hashes

1 raw-md4
1 nsldaps
2 md5crypt
1 bcrypt which is worth ALL THE POINTS

test_1.zip

An encrypted zip file containing a note

Observation 1: It looks like we will need to crack encrypted files again this year to unlock additional hashes and get tips on how to crack the tougher hashes. If you want to crack encrypted files used in past CMIYC contests, I strongly recommend this blog post: [Link]

As far as cracking those hashes, you'll want to specify the correct format when running John the Ripper and Hashcat. Rather than look that up in the help files, I made a function in the Framework that displays which format name/number to use:

The framework unfortunately doesn't crack any passwords for you. You'll need to do that yourself. Luckily if you run default attacks against the raw-md5 hash and the nsldaps hashes using John the Ripper you should be able to crack them. Their plaintexts are:

Raw-MD5: 3-2-1

NSLDAPS: Contact

If you are a child of the late 70s and 80s you might remember "3-2-1 Contact" as a great PBS show teaching kids science (think of it as a nerdier Sesame Street). That seems like it might be useful information when creating a custom wordlist to target the other hashes. Doing some Googling led to the following link containing the show's lyrics: [Link].

Using a wordlist created from that, I managed to crack the zip password.

PKZip: When everything happens

Unzipping the file, we can see the following message:

-----------------------------------------------------------------------------------

Do not submit the zip passphrase, it is not worth anything.

user5's password is the sentence spoken by the synthesized voice

introducing itself in episode 101.

-----------------------------------------------------------------------------------

Observation 2: There will probably be a "Google/Bing" section of the contest where you will need to solve puzzles and create customized wordlists

Wonder if I can use ChatGPT to solve this problem?

Uggg! Well I guess I don't have to worry about AI taking my job for another year. On the plus side, if KoreLogic makes me watch PBS science shows all weekend I really can't complain too much. To watch Episode 101 of 3-2-1 Contact yourself you can view it at the following [Link]. I forgot how much this show rocks!

Cracking the BCrypt Hash:

As far as actually cracking the passphrase for the BCrypt hash, the text in question is:

"I am a computer at bell laboratories and I am learning to talk"

It's pretty easy to throw that in a wordlist and try to crack the password. But, it doesn't work.... So there obviously is something wrong with the phrase. Maybe it is capitalization?

One option is to apply passphrase mangling rules that you can find in the following passphrase cracking github repo [Link Here]. Besides common passphrase mangling rules, this site also has a huge collection of passphrase wordlists which is nice for normal cracking sessions. The mangling rules you can download from that repository will generate guesses such as the following:

take the red pill
take-the-red-pill
take.the.red.pill
take_the_red_pill
taketheredpill
Take the red pill
TAKE THE RED PILL
tAKE THE RED PILL
Taketheredpill
tAKETHEREDPILL
TAKETHEREDPILL
Take The Red Pill
TakeTheRedPill
Take-The-Red-Pill
Take.The.Red.Pill
Take_The_Red_Pill
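To show the kind of transformations involved, here is a small Python sketch that produces a subset of the variants listed above. It is not the rule file from that repository (those are cracker rules, not Python); the function name and the choice of variants are assumptions for illustration.

```python
def passphrase_variants(phrase):
    """Generate separator and capitalization variants of a passphrase."""
    words = phrase.lower().split()
    variants = []
    for separator in (" ", "-", ".", "_", ""):
        # All lowercase, e.g. take-the-red-pill
        variants.append(separator.join(words))
        # First letter of every word capitalized, e.g. Take-The-Red-Pill
        variants.append(separator.join(word.capitalize() for word in words))
    # Only the first letter of the phrase capitalized.
    variants.append(" ".join(words).capitalize())
    # Everything uppercase.
    variants.append(" ".join(words).upper())
    # Remove duplicates while keeping the order stable.
    return list(dict.fromkeys(variants))


for guess in passphrase_variants("take the red pill"):
    print(guess)
```

Note that none of these variants capitalize only selected words or add trailing punctuation, which is the sort of gap that matters for the passphrase discussed next.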

Unfortunately when I applied these rules to the sentence spoken by the computer, I still was unable to crack the password. I'll admit this was an area where Ivan from team John_Users also beat me to it. Turns out, I needed to capitalize 'Bell Laboratories', and then add a period at the end. So the actual password was:

I am a computer at Bell Laboratories and I am learning to talk.

Those are some of the gotchas that can get you when cracking passwords. Close doesn't cut it. Also this shows I need to up my passphrase mangling rules!
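To make the mangling idea concrete, here is a minimal Python sketch that reproduces the style of guesses listed above (separator swaps combined with a handful of capitalization patterns). This is not the ruleset from the linked repository; the function names and the optional suffix list are my own assumptions for illustration.

```python
from itertools import product

# Separators seen in the example guesses above ("" glues the words together).
SEPARATORS = [" ", "-", ".", "_", ""]


def case_variants(words):
    """Return the capitalization patterns shown in the example list."""
    lower = [w.lower() for w in words]
    upper = [w.upper() for w in words]
    title = [w.capitalize() for w in lower]            # Take The Red Pill
    first_cap = [lower[0].capitalize()] + lower[1:]    # Take the red pill
    first_low = [upper[0][0].lower() + upper[0][1:]] + upper[1:]  # tAKE THE RED PILL
    return [lower, first_cap, upper, first_low, title]


def mangle_passphrase(phrase, suffixes=("",)):
    """Generate de-duplicated guesses for one passphrase.

    `suffixes` is a hypothetical extension: pass ("", ".") to also try a
    trailing period.
    """
    words = phrase.split()
    if not words:
        return []
    guesses = {}
    for cased, sep, suffix in product(case_variants(words), SEPARATORS, suffixes):
        guesses[sep.join(cased) + suffix] = None  # dict keeps insertion order
    return list(guesses)


if __name__ == "__main__":
    for guess in mangle_passphrase("take the red pill"):
        print(guess)
```

Note that generic rules like these would still have missed the real password, since capitalizing only 'Bell Laboratories' requires knowing which words are proper nouns. The trailing period is the part a suffix rule can catch.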

Speaking about how awesome Ivan was though, while I won't be competing with team John_Users this year, I really want to wish them the best of luck and state they are the Pro team I'll be rooting for! Go Team John_Users!!!! Also if you want to compete with them, they are still taking new members. This is an amazing opportunity to learn from some really advanced password cracking experts.

John_Users sign-up link: https://openwall.info/wiki/john/contests

Update: After the Contest Spoiler

So what about those md5crypt hashes? Well there is a good lesson to be learned from that as well. Here are their solutions given by the contest organizers:

user3:$1$mjlHQY$7ofnyhSdWnn6HyIMf8GIE0:It's the secret
user4:$1$h2lb97w8$rSnzQB3FHv9o4VXaOTEyx.:It's the moment

So that's really weird, since I mentioned I had added the show lyrics to a cracking wordlist. In fact, the wordlist is available in the JupyterLab example [Here]. Looking at that wordlist, it has the following two lines:

It’s the secret
It’s the moment

Question: Can you spot the difference?

Answer: It's the secret, it's the moment, but really it is the apostrophe.

Observation 3: This is a real learning lesson for me. I need to be more careful about how I read in and normalize input scraped from webpages when creating wordlists for these contests. Changes like using the wrong apostrophe are really hard to spot if you are manually looking at the copied content, but it makes all the difference when cracking passwords. Just like with the bcrypt passphrase, getting close doesn't count. You need to make a guess exactly the right way. I guess that complication is part of what makes password cracking so fun!
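As a minimal sketch of the kind of normalization I mean, the Python below maps a few common "smart" punctuation characters to their plain ASCII equivalents and keeps both the original and the normalized line in the wordlist. The character table is my own assumption of what is worth covering; it is not exhaustive.

```python
# Map "smart" punctuation that web pages often use to the plain ASCII
# characters people actually type. Extend this table as needed.
SMART_PUNCTUATION = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (curly apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u00a0": " ",   # non-breaking space
}
_TABLE = str.maketrans(SMART_PUNCTUATION)


def normalize_line(line):
    """Return the line with smart punctuation replaced by ASCII."""
    return line.translate(_TABLE)


def expand_wordlist(lines):
    """Yield each original line, plus its normalized form when it differs."""
    for line in lines:
        yield line
        fixed = normalize_line(line)
        if fixed != line:
            yield fixed


if __name__ == "__main__":
    scraped = ["It\u2019s the secret", "It\u2019s the moment"]
    for candidate in expand_wordlist(scraped):
        print(candidate)
```

Keeping the original line alongside the normalized one is deliberate: you usually don't know ahead of time which form the contest organizers used.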

Jupyter Lab Framework Example: Revisiting CMIYC2022

2023-11-08T20:05:00.006-08:00

Everything that happens once can never happen again. But everything that happens twice will surely happen a third time.
-- Paulo Coelho

Introducing the JupyterLab Password Cracking Framework:

For the last couple of months, I've been (slowly) working on building out a new backend/framework to be able to manage password cracking sessions using JupyterLab as the frontend/GUI. The current version of this framework is available [here].

This project is under active development (well active for me anyways), and I'd really appreciate feedback and suggestions on how to extend and improve it. My goal is to have an opensource, community driven alternative for Team Hashcat's List Condense (LC) collaboration server ready by CMIYC2024.

About The Framework:

I view JupyterLab as a stone soup. It provides a good interface, interactive Python debugger, and a way to save and share analysis results. But it is still up to you to do all of the backend analysis. That became very evident when I used JupyterLab in the CMIYC2023 contest (as detailed in the previous three blog posts on this site). I spent a lot of time debugging my code and messing with data structures when I really needed to be focused on cracking passwords. I quickly realized that for this approach to be effective in future password cracking contests, I'd need to develop back-end data structures and classes to better organize all the data and provide built-in features to aid in common tasks.

Backing up a bit, in the past I've really enjoyed using MITRE CRITs (Collaborative Research Into Threats) [Link]. Unfortunately CRITs is no longer being maintained (the switch from Python2 to Python3 killed the project), but it was a tool for CSOC (Cyber Security Operation Center) team members to collaborate with each other when analyzing intrusion sets and threat actors. CRITs organized Top Level Objects (TLO) into the following buckets/categories:

Actors
Campaigns
Certificates
Domains
Emails
Events
Indicators
IPs
PCAPs
Raw Data
Samples
Targets

It then made the data available in these different buckets cross linkable as well as accessible to various plugins. Following that approach, I created several different TLOs for this password cracking framework:

Hashes: Contains information about the hashes including plaintext values, hash types, etc
Targets: Contains information about particular targets/users and metadata
Sessions: Contains information about a cracking session. The closest CRITs equivalent would be Campaigns.
PWCrackerMgr: Not really a data structure, but a way to translate between Hashcat and John the Ripper cracking sessions

In the future as this toolset becomes more developed, I may end up taking a lot of the metadata out of Targets and putting it into its own TLO much like CRITs did. Also longer term, I may end up incorporating disk storage or a database, but for now I'm keeping this focused on helping with password cracking competitions, (vs. activities like pen-testing and managing ALL my password lists). You can download the framework right now, and it already has some example Notebooks in it that show how to use it on the CMIYC2023 challenges.

What this framework WILL NOT do is crack hashes or run your actual cracking sessions. I may add some scripting/logging support in the future, but this framework is focused on helping with data analysis as well as automating some of the busywork/repetitive tasks in password cracking competitions such as creating custom wordlists, creating left lists, translating between JtR and Hashcat, and hash submission.

Using the Framework to Crack CMIYC2022 Challenges:

A big question I have is how effective will this framework be in the NEXT password cracking challenge? Since I can't predict what KoreLogic is going to do (beyond the fact that Bon Jovi references will somehow be involved), probably the best thing I can do is look back at past competitions. To that end I figured I might dig up the 2022 contest hashes and attempt to crack them again.

To obtain the challenge files, you can visit KoreLogic's 2022 contest page [here].

Side note: Mad props to KoreLogic for continuing to host past contest files. I really appreciate it!

Also, I had written a short blog post about my experiences in the contest, which is available [here].

I'll admit, it's always hard looking back on past work/documentation, but I'm a bit annoyed with my past self. I think that blog post has a lot of good information in it, but it really doesn't go into too much detail about the contest itself. On the plus side, that will make using this framework for this challenge more "realistic" since I can't just rely on my past documentation, and I have very hazy memories of what happened over a year ago.

Unpacking the Contest Files:

The CMIYC2022 contest had three file "drops" over the course of the challenge. Aka not all the hashes were released at the start, so this gave teams something constantly new to bang their heads against.

The contest files are PGP encrypted and as a player you need to decrypt them with the password that KoreLogic provided. Since the first thing I always do is Google "How do you decrypt PGP files" I'm going to put the command here for future me. Also this will hopefully disabuse you early of the false idea that I know what I'm doing ;p

gpg -o <output_file> -d <input_file>.pgp

Do that for all three files, saving them with .tgz extensions. Then unzip the files using the command:

tar -xvzf <input_file>.tgz

This creates three different directories filled with different encrypted file types. These are:

cmiyc-2022_street_1/

list18-Thursday17January2021.odt
list17-TOWMINTP.hashes.gpg
list19-paidanextra500000.zip
list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z
list24-ThisYearsWorst.pdf
list16-FL_kdIZUGpI.zip

cmiyc-2022_street_2/

DEFCON-Street.kdbx

cmiyc-2022_street_3/

rar.sdrawkcab
1991whattimeisit.tgz

Cracking the Encrypted File Containers (Part 1: File Extraction):

This competition is starting to come back to me. The main issue with this challenge was to figure out how to crack the various encrypted file types. Once you cracked the top level file, it would present you an internal hash_list of fast to compute hashes that you need to crack for actual points. For the street teams, the top level file hashes are encrypted with fairly easy to guess passwords. For the pro teams ... not as much.

In my previous writeup [here] the first two "Tips" cover how to set up John the Ripper to crack these files. I'm going to largely skip those tips here, but assuming you followed them, the following commands can be used to extract the hash for each of the above files and save them all to a single file you can crack. I'm appending them all to an "encrypted_file_hashes.hash" file that I can load into JtR. If you want to use Hashcat to crack the hashes instead, you'll need to do some additional fixups to remove things like the username (or run hashcat with the --username flag). Side note, I also highly recommend checking out [this] external blog entry about using John the Ripper to crack different file formats. I heavily leveraged it since it highlights things like the fact that to crack .odt files you need to use libreoffice2john.

list16-FL_kdIZUGpI.zip

zip2john list16-FL_kdIZUGpI.zip >> hash_files/encrypted_file_hashes.hash

list17-TOWMINTP.hashes.gpg

gpg2john list17-TOWMINTP.hashes.gpg >> hash_files/encrypted_file_hashes.hash

list18-Thursday17January2021.odt

libreoffice2john.py list18-Thursday17January2021.odt >> hash_files/encrypted_file_hashes.hash

list19-paidanextra500000.zip

zip2john list19-paidanextra500000.zip >> hash_files/encrypted_file_hashes.hash

list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z

7z2john.pl list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z >> hash_files/encrypted_file_hashes.hash

DEFCON-Street.kdbx

keepass2john DEFCON-Street.kdbx >> hash_files/encrypted_file_hashes.hash

rar.sdrawkcab

rar2john rar.sdrawkcab >> hash_files/encrypted_file_hashes.hash

1991whattimeisit.tgz

Trick challenge here! You can unzip/untar this with normal commands. Aka:

gunzip 1991whattimeisit.tgz
tar -xvf 1991whattimeisit.tar

The real challenge is to decrypt a gocrypt message which isn't directly supported by either John the Ripper or Hashcat.
I'm going to skip this challenge for now. I have vague memories of downloading a gocrypt command line utility and writing a quick script to pass passwords into it from an external file. But since I'm focusing on the JupyterLab Framework, that yak shaving task is out of scope for this writeup.

list24-ThisYearsWorst.pdf

pdf2john.pl list24-ThisYearsWorst.pdf >> hash_files/encrypted_file_hashes.hash
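Since I always forget which *2john tool goes with which file, here is a small Python sketch that builds the extraction commands listed above from the filenames. It only generates the command strings; it doesn't run anything. The function name and the lookup-table layout are my own, and the tool names are simply the ones used in the commands above (your JtR install may name or locate them differently).

```python
import shlex

# Extension -> extraction tool, taken from the commands listed above.
EXTRACTORS = {
    ".zip": "zip2john",
    ".gpg": "gpg2john",
    ".odt": "libreoffice2john.py",
    ".7z": "7z2john.pl",
    ".kdbx": "keepass2john",
    ".pdf": "pdf2john.pl",
}

# Files whose name hides their real type (rar.sdrawkcab is "backwards.rar").
OVERRIDES = {"rar.sdrawkcab": "rar2john"}


def extraction_command(filename, out_file="hash_files/encrypted_file_hashes.hash"):
    """Return the shell command to append this file's hash to out_file.

    Returns None when no known extractor applies (e.g. the gocrypt challenge).
    """
    tool = OVERRIDES.get(filename)
    if tool is None:
        for ext, candidate in EXTRACTORS.items():
            if filename.endswith(ext):
                tool = candidate
                break
    if tool is None:
        return None
    return f"{tool} {shlex.quote(filename)} >> {shlex.quote(out_file)}"


if __name__ == "__main__":
    for name in ["list16-FL_kdIZUGpI.zip", "DEFCON-Street.kdbx",
                 "rar.sdrawkcab", "1991whattimeisit.tgz"]:
        print(name, "->", extraction_command(name))
```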

Cracking the Encrypted File Containers (Part 2: JtR Cracking):

While I could certainly load these hashes into the JupyterLab Framework ... I don't see a lot of value doing so as there isn't a lot of "analysis" to do on them. Perhaps if/when I add the ability to keep track of targeted attacks, wordlists, and mangling rules run against a hash then it will make sense, but for now let's just crack these hashes so we can get access to the larger hash lists which we will be able to leverage the current JupyterLab Framework against. Below I'm going to list the JtR command I used (including mode) to crack the password as well as the plaintext. To help with spoilers, I'm setting the background of the plaintext to be black, but to read it you can just highlight the text and copy/paste it somewhere else.

list16-FL_kdIZUGpI.zip

john --pot=../pot_files/cmiyc2020_john.pot --format=pkzip encrypted_file_hashes.hash
Note: As I said, the street file passwords are pretty easy to guess...
Hackers

list17-TOWMINTP.hashes.gpg

john --pot=../pot_files/cmiyc2020_john.pot --format=gpg --wordlist=../wordlists/hacker_movies.txt --rules=Single encrypted_file_hashes.hash
Note: This one required a non-default attack, partially because GPG is such a slow format to make guesses against, and partially since the base word wasn't in JtR's default wordlist. Based on the other hashes I cracked I figured it'd be a Vegas themed password or from a hacker movie so I made a small wordlist based on that.
DEFCON

list18-Thursday17January2021.odt

john --pot=../pot_files/cmiyc2020_john.pot --format=odf --wordlist=password.lst --rules=":c;:u" encrypted_file_hashes.hash
Note: This was a slow enough hash that I couldn't do all the rules in Single mode in a reasonable timeframe. Also the base word wasn't in my targeted dictionary. But given the other cracked passwords I ran two rules against the default JtR dictionary on the command line (Capitalize and Uppercase). I wrote more about how to specify rules on JtR's command line [here].
Sunday

list19-paidanextra500000.zip

john --pot=../pot_files/cmiyc2020_john.pot --format=pkzip encrypted_file_hashes.hash
Note: I cracked this one in the same session as list16, which is why it is nice to save all these hashes to the same hash-file
Swordfish

list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z

john --pot=../pot_files/cmiyc2020_john.pot --format=7z --wordlist=password.lst --rules=":c;:u" encrypted_file_hashes.hash
Note: Same constraints and attack as I ran against list18.
Queen

DEFCON-Street.kdbx

john --pot=../pot_files/cmiyc2020_john.pot --format=KeePass encrypted_file_hashes.hash
Note: This attack froze up my laptop trying to run it under WSL, so I copied it over to my server to run which worked a lot better, with almost an instant crack. The funny part was, ((Spoiler Alert)) the "username" attack in JtR's Single mode cracked the password and not a normal dictionary attack
Street

rar.sdrawkcab

john --pot=../pot_files/cmiyc2020_john.pot --format=RAR5 encrypted_file_hashes.hash
Note: I'll admit I got thrown a bit with the format (trying --format=rar first) until I looked at the saved hash. Otherwise this was a simple crack with a default attack
drowssap

1991whattimeisit.tgz

Didn't try this one for this writeup

list24-ThisYearsWorst.pdf

john --pot=../pot_files/cmiyc2020_john.pot --format=pdf encrypted_file_hashes.hash
Note: ((Spoiler)) Surprisingly it wasn't the username attack that got this one. The base word was in password.lst which is JtR's default wordlist.
Worst

Extracting the "REAL" Hash Lists:

The next step is to decrypt/unzip/open all the files and save the hashes from them. You may laugh, but I always forget the command line options to do this, so I'm listing how I did that below for future me. I'm also listing what types of hashes were in each file. To determine the hash type I mostly relied on looking at the scoreboard that KoreLogic published [link] and matching it to the hash. If that wasn't provided though, there would be several ways to figure out the hash such as using mdxfind.

list16-FL_kdIZUGpI.zip

unzip list16-FL_kdIZUGpI.zip
Contains: 2766 half-md5 hashes
JtR Mode: Not supported
HC Mode: 5100

list17-TOWMINTP.hashes.gpg

gpg -d list17-TOWMINTP.hashes.gpg > ../../hash_files/list17.txt
Note: I needed to pipe the results to a file to save them
Contains: 2933 raw-md5 hashes
JtR Mode: raw-md5
HC Mode: 0

list18-Thursday17January2021.odt

Open it on a Linux system using LibreOffice and paste the hashes into list18.hash
Contains: 5456 raw-sha1 hashes
JtR Mode: raw-sha1
HC Mode: 100

list19-paidanextra500000.zip

unzip list19-paidanextra500000.zip
Contains: 4997 raw-sha256 hashes
JtR Mode: raw-sha256
HC Mode: 1400

list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z

7z x list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z
Contains: 10004 raw-sha384 hashes
JtR Mode: Raw-SHA384
HC Mode: 10800

DEFCON-Street.kdbx

Open the file in keepassx and then paste contents in list21.hashes
Note: There is a command line version but that became too big of a pain to figure out how to export the hashes properly
Contains: 10812 mssql05 hashes
JtR Mode: mssql05
HC Mode: 132

rar.sdrawkcab

unrar x rar.sdrawkcab
Contains: 4214 mysql CRAM hashes
JtR Mode: mysqlna
HC Mode: 11200

list24-ThisYearsWorst.pdf

Open in a PDF viewer and copy/paste the hashes into list24.hashes
Contains: 2000 SSHA/nsldaps hashes
JtR Mode: Salted-SHA1
HC Mode: 111
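To keep the JtR and Hashcat mode names straight, the table above can be captured in a small lookup. The Python below is a sketch of that idea and is not part of the framework; the dictionary layout and function names are my own. The mode values are the ones listed above, and the command flags (-a, -m, --format, --wordlist) are the same ones used elsewhere in this post.

```python
# container file -> (hash description, JtR format or None, Hashcat mode)
HASH_MODES = {
    "list16-FL_kdIZUGpI.zip": ("half-md5", None, 5100),
    "list17-TOWMINTP.hashes.gpg": ("raw-md5", "raw-md5", 0),
    "list18-Thursday17January2021.odt": ("raw-sha1", "raw-sha1", 100),
    "list19-paidanextra500000.zip": ("raw-sha256", "raw-sha256", 1400),
    "list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z":
        ("raw-sha384", "Raw-SHA384", 10800),
    "DEFCON-Street.kdbx": ("mssql05", "mssql05", 132),
    "rar.sdrawkcab": ("mysql CRAM", "mysqlna", 11200),
    "list24-ThisYearsWorst.pdf": ("SSHA/nsldaps", "Salted-SHA1", 111),
}


def hashcat_command(container, hash_file, wordlist):
    """Build a basic Hashcat wordlist attack as an argument list."""
    _, _, hc_mode = HASH_MODES[container]
    return ["hashcat", "-a", "0", "-m", str(hc_mode), hash_file, wordlist]


def john_command(container, hash_file, wordlist):
    """Build a basic JtR wordlist attack, or None if JtR has no format."""
    _, jtr_format, _ = HASH_MODES[container]
    if jtr_format is None:
        return None
    return ["john", f"--format={jtr_format}", f"--wordlist={wordlist}", hash_file]


if __name__ == "__main__":
    print(" ".join(hashcat_command("DEFCON-Street.kdbx", "list21.hashes", "password.lst")))
    print(john_command("list16-FL_kdIZUGpI.zip", "list16.txt", "password.lst"))
```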

Creating a Config For the JupyterLab Framework:

Now that we have tens of thousands of password hashes to crack, it's time to use the JupyterLab password cracking framework. To do this, we'll first need to load the hashes into it, and to do that we'll need to configure the config file.

The framework uses the YAML file format for its configs. That means spaces/whitespace is important, but it is also a pretty flexible file format. For our config, we'll define our JtR and hashcat potfile locations, how much the hashes are worth, and where and how to load the hashes. I'll be including # Comments as well to help explain why I'm doing what I'm doing.

---
# This defines where the potfiles are for this contest. This lets you load cracked hashes from them
# as well as keep your JtR and HC potfiles synced.
jtr_config:
  main_pot_file: "./challenge_files/CMIYC2022_Street/jtr_cmiyc2022.pot"

hashcat_config:
  main_pot_file: "./challenge_files/CMIYC2022_Street/hc_cmiyc2022.potfile"

# This is information on where to load the challenge files. If they have additional metadata you
# may need to write a custom function to import them, but in this case they are pure raw hash lists
# so we can use the "plain_hash" plugin to import them.
# This plugin requires the hash type to be specified (if not it will default to "unknown"). Typically
# I use the JtR naming format for the hash type. The "source" field is used to list a source for the
# hashes in the framework. Basically it's a note for you later when looking at the cracked hashes.
challenge_files:
  list14:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list14-4214-BrunnersMentalPrisoner.hashes"
    format: "plain_hash"
    type: "mysqlna"
    source: "list14-BrunnersMentalPrisoner"
  list16:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list16-FL_kdIZUGpI.txt"
    format: "plain_hash"
    type: "half-md5"
    source: "list16-FL_kdIZUGpI"
  list17:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list17.txt"
    format: "plain_hash"
    type: "raw-md5"
    source: "list17"
  list18:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list18.hash"
    format: "plain_hash"
    type: "raw-sha1"
    source: "list18"
  list19:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list19-paidanextra500000.hashes"
    format: "plain_hash"
    type: "raw-sha256"
    source: "list19-paidanextra500000"
  list20:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.hashes"
    format: "plain_hash"
    type: "raw-sha384"
    source: "list20-Authoritiesappearto"
  list21:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list21.hashes"
    format: "plain_hash"
    type: "mssql05"
    source: "list21"
  list24:
    file: "./challenge_files/CMIYC2022_Street/sample_hashes/list24.hashes"
    format: "plain_hash"
    type: "ssha"
    source: "list24"

# The score info is taken from the Korelogic scoreboard. This isn't necessary, but it is nice to have
# a local count of what your score should be so you can compare it to the official score to validate that
# you are submitting your cracks properly
score_info:
  raw-sha384: 46
  mysqlna: 17
  raw-sha256: 13
  mssql05: 9
  raw-sha1: 5
  ssha: 5
  half-md5: 3
  raw-md5: 1

Loading the Challenge Files into JupyterLab Framework:

This is the easy part for these hashes. The biggest challenge is setting up the config file. Once that's done you can just load it up in the current framework. At this point I should mention that I'll be eventually including the Notebook in this blog post in the example files in the framework github repo.

Once it is loaded you can run the built in tools to merge Hashcat and JtR potfiles, display cracked passwords, and calculate your expected score.

Next Steps - Cracking Some Passwords:

There's not a ton more analysis to do (besides look at the cracks). And don't worry, we'll get to that! But this is where I'm glad that I decided to go back through these old CMIYC challenges. The last contest in 2023 was very heavy in its use of metadata. So I of course focused on metadata analysis for this framework. But for this contest, there is very little metadata. The focus instead was on cracking encrypted containers and (cough, SPOILER ALERT, cough) building custom wordlists from online articles. This highlights other new capabilities that I really want to add into this framework. For example, going through logfiles, extracting the rule/dictionary that cracked a password, and then storing that data with the hash in this framework. That would be super cool! How about doing google searches on passwords for you? That would be cool too. That points to the stone-soup approach of this framework. Once we have these hashes/passwords/metadata stored in a searchable framework with a Python3 backend we can build upon this to add new capabilities.

Enough talking! Let's look at some cracks. Looking above, I managed to crack over 50% of the ssha hashes in a JtR cracking run that took under a minute. I wonder what those plaintexts look like. I can use SessionMgr.print_all_plaintext(meta_fields=['source']) to display them.

You don't have to be a master password cracker to design an attack against these. That being said, going back to the score graph, they aren't worth a lot of points. Even if you cracked all 2k ssha hashes (which is totally doable) you would only get 10k points. That's a lot less than most of the other hash types. So this is a list to play around with when you don't have anything better to crack and need that serotonin hit to see cracks flash across your terminal.
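The point math is simple enough to sketch in a few lines of Python. This is not the framework's own scoring code; the function names are hypothetical. The point values come from the score_info section of the config, and the list sizes come from the hash counts listed earlier.

```python
# Points per cracked hash, from the score_info section of the config.
SCORE_INFO = {
    "raw-sha384": 46, "mysqlna": 17, "raw-sha256": 13, "mssql05": 9,
    "raw-sha1": 5, "ssha": 5, "half-md5": 3, "raw-md5": 1,
}

# Number of hashes in each list, from the extraction notes earlier.
LIST_SIZES = {
    "raw-sha384": 10004, "mysqlna": 4214, "raw-sha256": 4997, "mssql05": 10812,
    "raw-sha1": 5456, "ssha": 2000, "half-md5": 2766, "raw-md5": 2933,
}


def expected_score(cracked_counts, score_info=SCORE_INFO):
    """Sum points for a {hash_type: number_cracked} mapping."""
    return sum(score_info[hash_type] * count
               for hash_type, count in cracked_counts.items())


def max_points_by_type(sizes=LIST_SIZES, score_info=SCORE_INFO):
    """Return (hash_type, max_points) pairs, most valuable list first."""
    totals = {t: n * score_info[t] for t, n in sizes.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(expected_score({"ssha": 2000}))  # every ssha hash cracked
    for hash_type, points in max_points_by_type():
        print(f"{hash_type:12} {points}")
```

Sorting the lists by maximum possible points is a quick way to decide where your limited cracking time is best spent.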

That leads us to the other hash lists. Let's check out the raw-sha1 hashes:

At least for the limited cracks so far, it looks like these plaintexts were generated using some common words that have various mangling rules applied to them. You can start to see why I want to automate pulling out dictionaries + rules from JtR and HC logfiles to make it easier to reverse engineer these rules vs. having to do it by hand.

Actually, this might be a good time to end this blog entry and work on some of those new features.... All the code is uploaded to the gitlab site and I'll upload some sample hashes as well so you can follow along. But for now I probably need to look into parsing some Hashcat log files.

Hashcat Tips and Tricks for Hacking Competitions: A CMIYC Writeup Part 3

2023-08-30T20:51:00.008-07:00

I want to know1
and understand1
But I will not1
-- Hashes cracked from the KoreLogic CMIYC 2023 competition

In the previous two posts on the CMIYC competition [Part 1, Part 2], I had focused on how to integrate data science tools into your password cracking workflow and showed how to crack passwords on limited hardware (E.g. my laptop without using a GPU). Of course it's better to have some firepower to crack hashes! One of the hurdles to overcome is I don't have a lot of firepower at my disposal. Despite being super interested (OK, obsessed) about password cracking, I've never invested in a dedicated cracking rig. Still, when I do get serious about cracking passwords I turn to Hashcat and GPU based attacks to do the heavy lifting even if I only have a single NVIDIA GeForce GTX 1070 GPU. That's still significantly faster than trying to run CPU only attacks.

To that end, let's talk about how to leverage Hashcat when competing in these competitions. Full disclaimer: I'm going to go full spoiler in how I'm approaching my cracking. At this point, I've been running cracking sessions way longer than the competition would have lasted if I had competed. Also, I've been on the various Discord and Twitter conversations about the contest this year and know how the hashes were generated. Heck, KoreLogic even posted themselves how they created the challenges [Full Spoiler Link]. So I'm not going to even pretend that this post represents how I would have done. Instead I want to focus on "given what we know, how can someone use Hashcat to crack those hashes".

Using Hashcat and John the Ripper Together

One issue that pops up a lot for me when using both John the Ripper and Hashcat to crack hashes is that while their file formats are *mostly* the same, they are not directly compatible. This goes for how these tools expect hashes to be formatted when loading them up, and for the .pot file formats they save their cracked passwords to.

The hash format in particular has been a long source of annoyance for me, and writing this blog post inspired me to finally submit a github issue about it to the hashcat repo. The long story short is that John the Ripper uses hash type identifiers that Hashcat doesn't recognize. For example, here is a raw-md5 hash (from the CMIYC2023 contest) that John the Ripper can load:

jithakur:$dynamic_0$38bb03886dd4fbda5a780f0617847e4c

And here is the same hash format that Hashcat expects:

jithakur:38bb03886dd4fbda5a780f0617847e4c

Side note, while you can have usernames in your hash lists, Hashcat won't load the hashes unless you include the "--username" flag on the command line telling Hashcat to strip/ignore those usernames. E.g.:

hashcat --username -a 0 -m 0 hashfile.txt dictionary.txt

What this really means is that to support both John the Ripper and Hashcat, I now have two sets of hash lists and two sets of pot files. It would be nice to incorporate some scripts in my Jupyter Notebook to sync up both of the pot files between them so I'm not cracking the same hashed password twice. Given that's a rabbit hole which would totally side-track any hash cracking, I'm going to push that project off for another day. For now I'm just going to use Hashcat, and I modified my Notebook to support the Hashcat file formats, (mostly by copying and pasting the JtR code into another cell and then making small modifications). Once again, this is one of the super-powers of using Jupyter notebooks. I can load up my JtR cracked hashes, then write and load up my Hashcat plaintexts, and perform analysis on both in a very short period of time. It's not pretty but it works.
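For raw-md5 specifically, the translation is mostly a matter of dropping JtR's "$dynamic_0$" tag. Below is a rough Python sketch of that conversion and of a simple potfile merge. It is not the code from my Notebook, and it assumes each pot line looks like "hash:plaintext", with the JtR side carrying the "$dynamic_0$" tag as in the example hash above. Other hash types would need their own handling.

```python
JTR_RAW_MD5_TAG = "$dynamic_0$"


def jtr_to_hashcat_raw_md5(line):
    """Drop JtR's raw-md5 tag so Hashcat can load the line.

    Works for both hash-list lines ("user:$dynamic_0$hash") and
    pot lines ("$dynamic_0$hash:plain"). Lines without the tag are
    returned unchanged.
    """
    line = line.rstrip("\r\n")
    prefix, tag, rest = line.partition(JTR_RAW_MD5_TAG)
    if not tag:
        return line
    return prefix + rest


def merge_raw_md5_pots(jtr_pot_lines, hashcat_pot_lines):
    """Return {bare_hash: plaintext} combining both potfiles."""
    cracked = {}
    for line in list(jtr_pot_lines) + list(hashcat_pot_lines):
        bare = jtr_to_hashcat_raw_md5(line)
        # Split on the first colon only: plaintexts may contain colons.
        digest, sep, plaintext = bare.partition(":")
        if sep:
            cracked[digest.lower()] = plaintext
    return cracked


if __name__ == "__main__":
    print(jtr_to_hashcat_raw_md5("jithakur:$dynamic_0$38bb03886dd4fbda5a780f0617847e4c"))
```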

Running Basic Hashcat Attacks

The commands to run Hashcat are very different than those to run John the Ripper. There are pros and cons to both methods. File autocomplete works much better with Hashcat's command line and Hashcat does directory inclusion (such as use all wordlists in a directory) better. But John the Ripper's is less position dependent, has a ton of super powerful features for different attack modes on the command line, and quite honestly I'm just used to it more.

The basic command line for hashcat is:

hashcat -a ATTACK_TYPE -m HASH_TYPE HASH_LIST [ATTACK_OPTIONS]

So for a standard wordlist + rules attack you can run

hashcat -a 0 -m 0 uncracked_hashes.txt ../../wordlists/Alter-Hacker_Sorted-Cleaned.txt -r ../../repos/hashcat/rules/d3ad0ne.rule

To break this down:

-a 0: The attack mode. In this case, wordlist + rules. Also supports stdin input if a wordlist is not specified
-m 0: The hash type to crack. In this case it is targeting Raw-MD5 hashes
uncracked_hashes.txt: The list containing all the hashes I want to crack. Hashcat will load everything that looks like an MD5 hash from it.
../../wordlists/Alter-Hacker_Sorted-Cleaned.txt: A common password cracking dictionary/wordlist.

It's one of the bigger wordlists that is not based on pure cracked hashes which isn't 100% filled with junk. It used to be pretty easy to find online. But now that I'm looking for it again most of the links have dried up.
Side note: It used to be hosted on KoreLogic's dictionary list available here. Also, I forgot they hosted a ton of wordlists. I need to check them out again to see if they are helpful in this competition. (Spoiler, these dictionaries were not that helpful).

-r ../../repos/hashcat/rules/d3ad0ne.rule: The mangling rules. d3ad0ne.rule is a pretty decent set to use if you can make a lot of guesses

Running variations of the above attack using standard large dictionaries and a few other hashcat rules cracked a few more MD5 passwords but not many....

One cool feature of Hashcat is that you can specify a directory instead of a wordlist though. So you can use the following command to run a quick set of mangling rules against all of your dictionaries:

hashcat -a 0 -m 0 uncracked_hashes.txt ../../wordlists/ -r ../../repos/hashcat/rules/best66.rule

When running these attacks, the hashes.org-2020 wordlist did the best. It's a super effective wordlist to use in general and can be obtained from hashmob [link]. Side note, I'm not using Hashmob's own cracked wordlists for this blog post since I'm pretty sure the contest hashes were uploaded to them.

Given the limited success of these attacks (a few raw-MD5 cracks aren't going to give a lot of points), there are really three paths that I can take.

1. I can analyze the cracks and try to construct custom attacks. THIS IS THE BEST OPTION.
2. I can run my existing wordlists but have Hashcat auto-generate rules for me
3. I can start brute-forcing key-spaces with smart masks and Markov attacks.

Side note: Options #2 and #3 are generally the ones picked on real dumps as the individual passwords are only loosely related to each other. Also password crackers (at least me) are lazy.

Going with the lazy options first, let's dive in on how to run them. To auto-generate rules you can use the --generate-rules=X option where X is the number of rules to generate. For example:

hashcat -a 0 -m 0 --debug-mode=5 --debug-file debug.txt --generate-rules=1000000 uncracked_hashes.txt ../../wordlists/hashes.org-2020.txt

When you do this, and I can't stress this enough, enable --debug-mode=5. Also log that info to a file using the --debug-file debug.txt option. This will output both the rule that successfully cracks a password as well as the plain-text word. Don't get lazy, and do not skip this option. In fact, you probably should be running that for all your password cracking sessions.

Now you may be asking yourself, why "--debug-mode=5"? It's because the debug info will append itself to the debug-file (vs. overwrite it) and you'll be running a lot of cracking sessions. Going back and remembering which dictionary created which cracked password is super helpful. You want all that info. Why throw that info away with a lower debugging option?

Long story short, if you don't know what to do, a default option can be to generate rules for a dictionary you've had some success with, log the results, and then turn the successful rules into a contest specific ruleset to use with other dictionaries.

But what if your input dictionaries are the problem? That's where brute-forcing small key lengths can be helpful using masks.

Cracking Contest Hashes with Hashcat Masks

I'll admit, I started to go into a long, long diversion about the mechanics behind Hashcat's Masks and Markov optimizations. I really hate calling what Hashcat does a Markov attack and there's a ton of optimizations that Hashcat developers can make to it. But that's totally besides the point if you are trying to crack passwords RIGHT NOW. So I'll save that side tangent for a different post and instead focus on cracking these contest hashes.

Masks are one area where having more computational power makes a huge difference. They let serious cracking rigs just chew through keyspace without requiring much skill or ability from their operators. Contest organizers know this and tend to create passwords that are resistant to un-optimized mask attacks. This means going through the entire key-space for 5/6/7/8 character passwords is unlikely to be very successful.

(Not recommended): hashcat -a 3 -m 0 hash_list.txt ?a?a?a?a?a?a?a

As an example of that, I left Hashcat running for a couple of hours brute forcing all ASCII passwords of length 1 through 7 for the raw-MD5 hashes. I didn't crack a single new hash that wasn't caught by earlier runs I had performed with John the Ripper. Going back to my Jupyter Notebook I decided to display password cracks by length, and then also the number of ASCII only (aka no Cyrillic) passwords cracked by length.
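To put some numbers on why full brute force runs out of steam so quickly, here is a rough Python sketch that counts how many guesses a mask generates. The built-in character set sizes are Hashcat's standard ones (?l and ?u = 26, ?d = 10, ?s = 33, ?a = 95). The function names and the guessing speed in the example are made up for illustration, and this is a straight count of candidates, not the value Hashcat's own --keyspace option reports.

```python
# Sizes of Hashcat's built-in mask character sets.
BUILTIN_SIZES = {"l": 26, "u": 26, "d": 10, "h": 16, "H": 16,
                 "s": 33, "a": 95, "b": 256}


def mask_guess_count(mask, custom_sizes=None):
    """Count the candidate guesses a mask generates.

    custom_sizes maps custom set ids ("1".."4") to how many characters
    each set holds. Literal characters contribute a factor of 1, and
    "??" is a literal question mark.
    """
    sizes = dict(BUILTIN_SIZES)
    sizes.update(custom_sizes or {})
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            if key != "?":
                total *= sizes[key]
            i += 2
        else:
            i += 1
    return total


def hours_to_exhaust(guess_count, guesses_per_second):
    """Time to try every guess at a given (assumed) speed."""
    return guess_count / guesses_per_second / 3600


if __name__ == "__main__":
    speed = 10_000_000_000  # hypothetical raw-MD5 guesses per second
    for length in range(5, 10):
        count = mask_guess_count("?a" * length)
        print(f"length {length}: {count:,} guesses, "
              f"{hours_to_exhaust(count, speed):,.2f} hours")
```

Each extra ?a multiplies the work by 95, which is why the fast hashes fall over at short lengths while anything longer (or any slow hash) needs a much more targeted mask.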

You probably don't have the GPU power to brute force 8-9 character passwords during the contest, and you certainly don't have that for the high value hashes that are worth a lot of points. Therefore to be successful in a contest with Hashcat Masks you need to tailor them to find gaps in base-words or mangling-rules that you have already identified. I talked about this earlier with the attacks I ran using John the Ripper in Part 2 of these write-ups. For example, if you were looking to find more base-words for Sales passwords where many of them started with '2023' and ended with a special character, then you could try something like:

hashcat -a 3 -m 0 -1 ?l?u -2 cmiyc_sales_end.hcchr uncracked_hashes.txt 2023?1?l?l?l?l?l?l?2

There's a lot going on in the above command. Let's break this command down by parts:

hashcat -a 3 -m 0

The standard hashcat command targeting raw-md5 hashes (-m 0), and using mask mode (-a 3)

-1 ?l?u

I'm setting a custom mask character set here that includes two built in character sets [?l = all lowercase letters, and ?u = ALL UPPERCASE LETTERS]
In the actual mask you can refer to this custom character set as ?1 (that's the number 1)
You can specify up to 4 custom character sets for your mask mode [1 - 4]. This is a hard limit. I wish you could do more actually, but that's how Hashcat is programmed.

-2 cmiyc_sales_end.hcchr

Rather than type out the characters for the mask on the command line, you can also save them to a *.hcchr file and read them in.
This is super helpful when you are targeting special characters that just don't play well on the command line and you don't want to mess with escaping them. For example '!,$.
The format for .hcchr files is just all the characters you want to target on the first line. E.g.:

uncracked_hashes.txt

Once again, just the hash-list of the hashes you are targeting

2023?1?l?l?l?l?l?l?2

The actual mask to run. Breaking it down further

2023: Simply starts every guess with the string "2023"
?1: Use the first custom character set. I know, it's hard to see the difference between the number 1 and the letter l. The above uses the number one. In this case it tries all lower and uppercase letters.
?l?l?l?l?l?l: Try 6 lower case characters
?2: Try the second custom character set. This appends common special characters I found when cracking other sales passwords.
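To get a feel for why a tailored mask is so much cheaper than a blind brute force, here is a minimal Python sketch that counts how many candidates a mask generates. The function names are my own for illustration, and since I'm not showing the contents of cmiyc_sales_end.hcchr, the three characters used for the second custom set ('!', ',' and '$') are an assumption for the sake of the example. It also doesn't try to handle every corner of Hashcat's mask syntax.

```python
# Sizes of the hashcat built-in character sets referenced in this post.
BUILTIN_SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "a": 95}


def charset_size(spec):
    """Count the characters in a custom set such as '?l?u' or '!,$'."""
    size, i = 0, 0
    while i < len(spec):
        if spec[i] == "?" and i + 1 < len(spec) and spec[i + 1] in BUILTIN_SIZES:
            size += BUILTIN_SIZES[spec[i + 1]]  # a built-in set like ?l
            i += 2
        else:
            size += 1  # a literal character
            i += 1
    return size


def mask_keyspace(mask, custom):
    """Number of candidates for a mask, given custom sets keyed '1' to '4'."""
    total, i = 1, 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            key = mask[i + 1]
            if key in custom:
                total *= charset_size(custom[key])
            elif key in BUILTIN_SIZES:
                total *= BUILTIN_SIZES[key]
            # anything else (such as '??') is treated as one literal character
            i += 2
        else:
            i += 1  # a literal character contributes exactly one choice
    return total


# Custom set 2 is an assumed example, not the real .hcchr contents.
custom_sets = {"1": "?l?u", "2": "!,$"}
targeted = mask_keyspace("2023?1?l?l?l?l?l?l?2", custom_sets)
brute = mask_keyspace("?a?a?a?a?a?a?a", {})
print(f"targeted mask: {targeted:,} candidates")
print(f"7 x ?a mask:   {brute:,} candidates")
print(f"ratio: {brute / targeted:,.0f}x")
```

The targeted mask produces 12 character guesses, which would be hopeless to brute force blindly, yet its key-space is far smaller than the 7 character ?a mask.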

That's great, but what if you want to try 5 lower case characters vs. 6? Running these attacks by hand is a pain so it's nice to queue up a bunch of mask attacks at once using a saved mask file (e.g. a .hcmask file). Unfortunately, the format is a bit different so let's look at how we can do that next. First, here is the hashcat command line to run a .hcmask file:

hashcat -a 3 -m 0 uncracked_hashes.txt sales.hcmask

You'll notice that all the mask info has been removed from the command line and instead I'm calling an external sales.hcmask file. Let's take a look at what's in that file:

?l?u,!\,$,2023?1?l?l?l?2
?l?u,!\,$,2023?1?l?l?l?l?2
?l?u,!\,$,2023?1?l?l?l?l?l?2
?l?u,!\,$,2023?1?l?l?l?l?l?l?2
?l?u,!\,$,2022?1?l?l?l?2
?l?u,!\,$,2022?1?l?l?l?l?2
?l?u,!\,$,2022?1?l?l?l?l?l?2
?l?u,!\,$,2022?1?l?l?l?l?l?l?2

Breaking this file format down:

Each line defines a single mask to run. Lines starting with '#' are comments.
Each line will be run in order. Generally it helps to put the quick masks first so if you decide to cancel the job you have a better idea of how much key-space you checked.

I know, I didn't follow my own advice in this example...

Each line must define any custom character sets, and unlike with the command line you can't define them in external files.

Each custom character set (up to 4) is specified by putting a comma ',' after it.
In the above example this means the 2 custom character sets are:

?l?u
!\,$

For the second custom character set I wanted to include a comma, which is a problem because it's a delimiter. So I needed to escape it with a backslash. Aka: '\,'
You can read more about the hcmask file format here.
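Typing these files out by hand gets old quickly, so here is a minimal Python sketch that generates the lines for a .hcmask file. The helper names are hypothetical, and it only escapes commas (which is all this example needs), so treat it as a starting point rather than a full implementation of the file format. Unlike my example file, it emits the shortest masks first.

```python
def escape_hcmask(charset):
    """Escape commas so they are not read as field separators."""
    return charset.replace(",", "\\,")


def build_masks(prefixes, min_len, max_len, first_set, last_set):
    """Yield hcmask lines: prefix + ?1 + N lowercase letters + ?2."""
    # Shortest masks first, so the quick ones finish before the slow ones.
    for n_lower in range(min_len, max_len + 1):
        for prefix in prefixes:
            mask = prefix + "?1" + "?l" * n_lower + "?2"
            yield ",".join([first_set, escape_hcmask(last_set), mask])


lines = list(build_masks(["2023", "2022"], 3, 6, "?l?u", "!,$"))
print("\n".join(lines))
```

Redirecting that output to a file gives you something you can pass to hashcat in place of a mask on the command line.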

With all of that, I managed to identify a couple more base words to use targeting sales passwords. This in turn allowed me to target higher value hashes more easily. The same can be done by targeting known words to find the mangling rules. E.g.:

?d?s,2022Sales?1?1?1

Yes you can also do that with a wordlist and mangling rules, but if you only have a couple of words you want to check it can sometimes be easier to do that with Masks instead. Now if you have a lot of words you want to try, then you can look into Hashcat's "-a 6" (Wordlist + Mask) and "-a 7" (Mask + Wordlist) attack modes. John the Ripper doesn't have this specifically because *cough cough* its rule preprocessor supports masks already in its normal mangling rules. But these attack modes can be very helpful if you are using Hashcat.

One thing you'll notice though with the hybrid -a [6/7] attacks is that you can't mangle or apply masks to both sides of a guess at the same time. Also, unlike with standard wordlist modes (-a 0) you can not pipe a wordlist in to -a [6/7] modes via stdin. This is a problem. The whole reason you are using Masks is probably because you don't know what mangling rules have been applied to the base-word.

The key then is to create custom word-lists that contain one side of the mangling rules. I'd recommend picking the "shorter" of the mangling rules to limit how much you write to disk. This is super annoying, but it works. So for example if you want to prepend 2022 and 2023 to a word and then append a mask attack you could do something like first creating a word-list containing all the words with 2022 and 2023 prepended to them (this only doubles the size of the original input dictionary). In this case I'm accomplishing this by using Hashcat's rules and saving the results to disk. To do that, and then run the resulting full Mask attack, you can use the following commands:

Rule file: append_year.rule (Capitalize word and prepend 2022 and 2023).

c^2^2^0^2
c^3^2^0^2

Generate wordlist command:

hashcat -a 0 --stdout ./sales_words.txt -r append_year.rule

Now that we have a wordlist containing words like 2023Sales, run the mask hybrid attack:

hashcat -a 6 -m 0 -1 ?l?u uncracked_hashes.txt ./sales_words.txt '?1?1?1'

Is all of this a pain? Absolutely! But it can be very effective so it's usually worth creating these temporary wordlists for your attacks and then combining them with masks.

Hashcat Association Attacks (Getting Big Points with BCrypt)

As mentioned earlier, the whole reason to try different "spray and pray" attacks against fast hashes is to crack enough to identify how the passwords were created and develop highly targeted attacks against expensive and high value hashes like BCrypt. The mangling rule that received the most post-contest conversation among all of the teams was that several users' passwords were their creation time (found in their metadata) converted to Unix epoch timestamps.

Creating a wordlist of all the various timestamps is certainly one way to go, but what we really want to do is crack bcrypt hashes. This is a perfect opportunity to talk about association (-a 9) attacks in Hashcat. Association attacks take one word per hash and target that hash with it. The word in association attacks can be combined with rules as well. This is a huge improvement when targeting a large number of salted hashes where you may have some idea what the plaintext for each account might be.

To perform an association attack you need to create a hashlist of the hashes you want to target, and then have a 1 to 1 mapping to a wordlist you want to target those hashes with. So for example you might have two files:

HashList.txt:

user1:$2a$:<rest of the hash here>
user2:$2a$:<rest of the hash here>
user3:$2a$:<rest of the hash here>

Wordlist.txt:

Word1
Word2
Word3

For this particular challenge I created the wordlists + uncracked bcrypt hashlist using the following python script in Jupyter Notebook:
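As a rough idea of what that kind of script does, here is a minimal sketch and not my actual notebook code. The record layout, the field names ("username", "created", "hash", "cracked"), the timestamp format, and the assumption that the timestamps are UTC are all hypothetical stand-ins for whatever your parsed metadata looks like. The important part is that the two output lists stay in exactly the same order.

```python
from datetime import datetime, timezone

# Hypothetical records: the field names and timestamp format are assumptions.
users = [
    {"username": "user1", "created": "2023-01-01 00:00:00",
     "hash": "<bcrypt hash 1>", "cracked": False},
    {"username": "user2", "created": "2023-01-01 00:00:01",
     "hash": "<bcrypt hash 2>", "cracked": True},
    {"username": "user3", "created": "1970-01-01 00:00:00",
     "hash": "<bcrypt hash 3>", "cracked": False},
]


def to_epoch(created):
    """Convert a 'YYYY-MM-DD HH:MM:SS' string (assumed UTC) to epoch seconds."""
    parsed = datetime.strptime(created, "%Y-%m-%d %H:%M:%S")
    return str(int(parsed.replace(tzinfo=timezone.utc).timestamp()))


def association_lists(records):
    """Build a hashlist and a wordlist with a strict 1 to 1 line mapping."""
    hashes, words = [], []
    for rec in records:
        if rec["cracked"]:
            continue  # only target hashes that are still uncracked
        hashes.append(f"{rec['username']}:{rec['hash']}")
        words.append(to_epoch(rec["created"]))
    return hashes, words


hash_lines, word_lines = association_lists(users)
for hash_line, word in zip(hash_lines, word_lines):
    print(hash_line, "<->", word)
```

Writing hash_lines and word_lines out to two separate files, one entry per line, gives you the pair of inputs that an association attack expects.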

Next, let's run some attacks. First, let's just do a quick naïve attack using (-a 0) and the timestamps as a normal wordlist.

Running this attack for an hour and a half isn't the end of the world. But this is a contest. You are a busy hacker. You have hashes to crack and other wordlists to run. Let's try Hashcat's association attack. Here is the command I ran:

hashcat -o cmiyc2023_hc.potfile -a 9 -m 3200 bcrypt_datetime.txt unix_timestamps_bcrypt.txt

ONE IMPORTANT THING TO KNOW: By default '-a 9' mode will not save to your standard .potfile. So if you want to capture these hashes you MUST specify an output file on the command line using the '-o FILENAME' option. I learned this fact the hard way when none of my cracks were showing up. I asked some Hashcat developers about this and they said there's still some "weirdness" with '-a 9' mode. For example, it will "recrack" hashes you have already cracked and post duplicate cracks/plaintexts to your potfile. So if you are running this attack it is probably good to run it on a new potfile vs. your global one, and then merge the new cracks back into your main potfile after the fact.
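The merge step is simple enough to do in a notebook cell. Here is a minimal sketch, assuming each potfile line is a plain "hash:plaintext" entry that can be compared as an exact string. The function name and the sample entries are made up for illustration.

```python
def merge_potfiles(main_lines, new_lines):
    """Return the main entries plus any new entries not already present."""
    seen = set()
    merged = []
    for line in list(main_lines) + list(new_lines):
        entry = line.rstrip("\n")
        if entry and entry not in seen:  # skip blanks and exact duplicates
            seen.add(entry)
            merged.append(entry)
    return merged


# Placeholder entries standing in for real "hash:plaintext" lines.
main_pot = ["hashA:plain1\n", "hashB:plain2\n"]
new_pot = ["hashB:plain2\n", "hashC:plain3\n", "hashC:plain3\n"]
merged = merge_potfiles(main_pot, new_pot)
print("\n".join(merged))
```

In practice you would read both lists from the files with open(...).readlines() and write the merged result back out to your main potfile.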

And here's the results:

Over 100 Bcrypt hashes cracked in a couple of seconds! That's super fun. As some backstory, association attacks are amazing if you have known passwords for users. Aka you obtained passwords from a different password dump and you are attacking the fact that users re-use passwords between multiple sites. Leveraging association attacks, you can run common mangling attacks against those known passwords to crack computationally expensive hashes for a subset of users.

Cracking Multi-Words With Hashcat

The next area to focus on is multi-words and phrases. Korelogic gave out a hint during the contest that several of the Engineering passwords were created from phrases taken from sci-fi books and movies, with the number '1' appended on the end [Link]. This can be seen in some of the cracks I made earlier:

Going back to the hash breakdown by department, Engineering is also a huge department to target:

The approach here then is to crack as many hashes as possible with fast hashing algorithms to try and figure out the source materials. Then we need to target high-value hashes in the engineering department using phrases from those source materials. Basically dumb, untargeted attacks first, then smart attacks later. Let's start with those dumb untargeted attacks!

At a high level this looks like a Correct Horse Battery Staple problem. To target that, let's try all the common English words in two and three word phrases and add the number '1' to the end. For a dictionary we can use the following corpus which contains various word-lists of 10k English words sorted in probability order [Link]. The first really "just get it to work" option I selected was to write a quick python program that loops through the word-list and outputs possible phrases while appending the number 1 to them. I then used the fact that if you do not specify a dictionary, Hashcat's '-a 0' mode will read in words from stdin. So I can run my attack using the following command:

(Editor note: This option is bad. Keep reading for a better one) python3 word_combinator.py | hashcat -a 0 -m 0 uncracked_hashes.txt
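For reference, a program like word_combinator.py can be very short. The following is a minimal sketch of the idea and not the exact script I ran. The function name, the choice to join words with a space, and the tiny demo word-list are all assumptions for illustration.

```python
from itertools import product


def word_phrases(words, min_words=2, max_words=3, sep=" ", suffix="1"):
    """Yield every phrase of min_words to max_words words, plus a suffix."""
    for count in range(min_words, max_words + 1):
        for combo in product(words, repeat=count):
            yield sep.join(combo) + suffix


# Tiny stand-in for a real 10k English word-list.
demo_words = ["the", "ship", "star"]
for guess in word_phrases(demo_words, 2, 2):
    print(guess)
```

With a real word-list you would read the words from a file and print every guess to stdout so it can be piped into hashcat. Note how quickly this grows: a list of N words yields N squared two word phrases and N cubed three word phrases.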

This wasn't pretty, but it did crack a number of hashes. Still, my guess generation was super slow as it runs a slow python script and then pipes those guesses into hashcat (piping guesses is also slow). Raw-MD5 is fast to compute. Basically this option wastes a lot of time and limits the key-spaces I can search. How about we speed this up using Hashcat's combinator attack?

Hashcat's combinator attack '-a 1' allows you to combine two dictionaries together to target multi-word passwords. For example, let's assume you have the following two word-lists

dict1.txt

fluffy
scary
cuddly

dict2.txt

cat
bat
rat

If you run the following command:

hashcat --stdout -a 1 dict1.txt dict2.txt

You'll get the following output:

fluffycat
fluffybat
fluffyrat
scarycat
scarybat
scaryrat
cuddlycat
cuddlybat
cuddlyrat

You can also apply one (AND ONLY ONE) rule to each dictionary if you want using the '-j' (applied to left word list) and '-k' (applied to right word list). So for example if you use the following command:

hashcat --stdout -a 1 -j '$ ' -k '$1' dict1.txt dict2.txt

It'll create the following guesses

fluffy cat1
fluffy bat1
fluffy rat1
<you get the idea>

As reference the '$' rule appends a character to the end of a guess. So '$ ' appends a space, and '$1' appends a '1'. I think you might see where this is going....
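If it helps to see the logic spelled out, here is a minimal Python sketch that mimics what the combinator attack does with a simple append rule on each side. It only models the '$' append case, not Hashcat's full rule engine, and the function and parameter names are my own.

```python
def combinator(left_words, right_words, j_append="", k_append=""):
    """Mimic '-a 1' with one append-only rule per side (-j left, -k right)."""
    for left in left_words:
        for right in right_words:
            yield left + j_append + right + k_append


dict1 = ["fluffy", "scary", "cuddly"]
dict2 = ["cat", "bat", "rat"]

# Equivalent in spirit to: -j '$ ' -k '$1'
guesses = list(combinator(dict1, dict2, j_append=" ", k_append="1"))
print("\n".join(guesses))
```

Every word on the left gets paired with every word on the right, so the number of guesses is simply the product of the two word-list sizes.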

The problem is, this works great for two word phrases. But what about three and four word phrases? I wish I knew of a better solution, but the short answer is I hope your cracking system has some free hard-drive space! You can only use combinator with two input dictionaries, and you can't pipe guesses into hashcat if you are using '-a 1' mode. The fastest option then is to create a word-list of all two word phrases. If you don't want to write a custom program to do this, you can always use hashcat and pipe the guesses to a file. For example:

hashcat --stdout -a 1 -j '$ ' english_words.txt english_words.txt > two_words.txt
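Before kicking that off, it's worth a quick estimate of how big the resulting file will be. Here is a minimal sketch that works it out from the word-list itself, assuming every pair is written as "word1 word2" followed by a newline. The placeholder list of 10,000 six letter words is an assumption to get a ballpark figure, not a measurement of a real corpus.

```python
def two_word_file_size(words):
    """Return (line count, size in bytes) of the all-pairs two word file."""
    n = len(words)
    total_len = sum(len(w.encode("utf-8")) for w in words)
    lines = n * n
    # Each word appears n times on the left and n times on the right,
    # and every line adds one space and one newline.
    size_bytes = 2 * n * total_len + 2 * lines
    return lines, size_bytes


# Stand-in for a 10k word-list: 10,000 placeholder words of 6 letters each.
placeholder_words = ["x" * 6] * 10_000
lines, size_bytes = two_word_file_size(placeholder_words)
print(f"{lines:,} lines, about {size_bytes / 1e9:.1f} GB")
```

Under those assumptions the two word file is 100 million lines and a bit over a gigabyte, which is manageable. Doing the same thing for three words on disk is a very different story, which is why the three and four word attacks reuse the two word file with the combinator instead.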

Then to try three words you can run

hashcat -m 0 -a 1 -j '$ ' -k'$1 ' uncracked_hashes.txt two_words.txt english_words.txt

To try four words you can simply run

hashcat -m 0 -a 1 -j '$ ' -k'$1 ' uncracked_hashes.txt two_words.txt two_words.txt

Side note, I also had success by capitalizing the first letter by changing the -j rule to:

-j'c$ '

This attack yielded a ton of cracks. Looking through them I started trying to find "unique" and "odd" phrases to try and figure out where the source material came from. This is because while the above attack works great against fast hashes like raw-md5, it will not scale against slow hashes like Bcrypt. We need to further optimize our attacks. Given that, here is a subsection of my cracked passwords:

Most of these phrases were spectacularly unhelpful. But some of them stood out such as 'watch your food'. Running a quick Google search on that + "scifi" highlighted Project Hail Mary [link]. That was a book I loved and hated in equal parts so it brought up a number of mixed feelings, but it certainly seems like a good candidate. The challenge is that the book isn't in the public domain. Still, let's try and create a dictionary of quotes copied from that article.

Next step was to create a janky Python program that would output all 2, 3, and 4 word phrases from the book paragraphs I had found. I know janky Python programs are slow, but so is cracking Bcrypt hashes. In this case it is better to minimize the number of guesses I make vs. focusing on how fast those guesses are generated.

Side note: I apologize for putting this as a screenshot. I really wish Google's blogger had a code insert option...
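In text form, a phrase generator along these lines can be sketched as follows. This is a minimal illustration and not the exact program from my notebook. The lowercasing, the punctuation stripping, and the appended '1' (based on the Korelogic hint) are assumptions about how you might want to normalize the text, and the sample sentence is a made-up placeholder rather than a quote from the book.

```python
import re


def ngram_phrases(text, sizes=(2, 3, 4), suffix="1"):
    """Yield each unique run of consecutive words of the given sizes."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    seen = set()
    for size in sizes:
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size]) + suffix
            if phrase not in seen:  # avoid wasting slow guesses on repeats
                seen.add(phrase)
                yield phrase


# Placeholder text standing in for paragraphs copied from the source material.
sample_text = "Please watch your food closely."
for guess in ngram_phrases(sample_text):
    print(guess)
```

Because it only emits phrases that actually appear in the text, the guess count stays small enough to be practical against slow hashes.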

Running this through hashcat again yielded a new cracked hash!

That's also a pretty unusual phrase, so I have high confidence that Project Hail Mary is one of the sources for the plain-texts. Let's try this against the bcrypt hashes!!!!!

Annnnd nothing cracked.......

This was disappointing, but it's probably because I was only using two paragraphs from the book. I need to find a better source to grab quotes from.

Let me take a step back and say, this workflow loop is one of the keys to this contest. If the cracked fast hashes (raw-md5, raw-sha1, etc) are any indication, around 1/3rd of the high value hashes are phrases taken from books and movies.

Key workflow for CMIYC 2023:

Find the source material for passphrases by analyzing your cracks against fast hashes
Create input dictionaries by scraping webpages of book and movie quotes and screenplays
Run those input dictionaries against the slow high-value hashes.
Repeat

The problem for me is that workflow is manually intensive, time consuming, and quite frankly boring as hell. During a competition it can be fun to get that dopamine hit as you crack new bcrypt hashes. After the contest, I'm simply wasting time while running up my power bill. So the question is, can I automate this at all? My power bill will still be high, but at least then I can watch new episodes of Ahsoka vs. staring at my computer screen! How about I train my PCFG guess generator on cracked passphrases and let it crunch away at generating guesses? I mean, it worked for the Hashcat team! [Link].

There's various ways to create the training set, but given how Korelogic generated these passwords, and the plain-text values I was seeing, I just threw everything that had a 'space' into a training file using the following command line:

cat cmiyc2023_.potfile | grep ' ' | awk -F ':' '{print $2}' > passphrase_cracked.txt

I know, I could have done the word-list generation much better as a short python script in my Jupyter Notebook, but I got places to be and Star Wars episodes to watch! Now that I had a good training set, I then trained a PCFG grammar on it using the following command:

python3 ../../repos/pcfg_cracker/trainer.py -c 1 -r CMICY23_Passphrase -t passphrase_cracked.txt

I set coverage (-c) to be 1 so the PCFG guesser will not generate any brute force (OMEN) guesses. I then gave this attack a test run against raw-sha256 hashes using the following command:

python3 ../../repos/pcfg_cracker/pcfg_guesser.py -r CMICY23_Passphrase | hashcat -m 1400 -a 0 uncracked_hashes.txt

And.... Yup this looks promising:

Let's see how it does with Bcrypt using the following command:

python3 ../../repos/pcfg_cracker/pcfg_guesser.py -r CMICY23_Passphrase | hashcat -m 3200 -a 0 uncracked_hashes.txt

Success! Limited Success!

There is still a ton of optimization I could do. You'll notice I haven't re-added / merged my potfiles in from the previous cracking of the Unix Epoch timestamp hashes. I also am targeting all of the Bcrypt hashes vs. just the ones in the engineering department. By reducing the target hashes I could easily double the speed of plain-text guesses I am making against the target hash list. I also don't want to give the false impression that this is the best attack method for these hashes. It's not. You would be much more successful by trying to find the source material and creating custom word-lists from that. What this attack workflow has going for it though is it is one of the most automatable options. You can let this run while trying to figure out better methods. Or... you can go do something else besides crack passwords. Call your parents maybe? I'm sure they would appreciate it!

I think this is a good spot to end this blog post. Looking back at it, I somehow managed to cover every attack mode in Hashcat. There's still more techniques to dig into, and there's a ton of uncracked hashes left in this contest. But I might leave that for a future post. If you have any tips, suggestions, or comments, feel free to leave them in the comments. Good luck, and I hope to see everyone at CMIYC 2024! Also thanks once again to the KoreLogic team for putting together such a great contest!

Using JupyterLab to Manage Password Cracking Sessions (A CMIYC 2023 Writeup) Part 2

2023-08-22T10:20:00.003-07:00

“Tools?" scoffed Kalisti, "Tools are for people who have nothing better to do than think things through and make sensible plans.”
― Laini Taylor, Muse of Nightmares

When we left off in Part 1 of my CMIYC2023 Writeup, I had cracked a measly 437 passwords. Yes I had a Jupyter Notebook set up to perform analysis, but what I really needed was more cracked passwords to do analysis on. To that end, I started off doing some basic exploratory attacks very similar to what I detailed in previous competitions [Crack the Con, CMIYC2022].

These included running JtR Single Mode with the RockYou, dic-0294, Alter_Hacker, and the sraveau-Wikipedia wordlists. Basically these attacks are about as dumb and untargeted as you can get. But they are also easy and quick to run against fast hash types. And they can be helpful! The Wikipedia wordlist in particular highlighted that Cyrillic passwords would likely play a role in this competition.

Running an attack using a Russian wordlist (from here) didn't yield many additional cracks though. Doing some Googling, it looks like some of these are Ukrainian words, not Russian words so I tried this dictionary as well [link]. One thing though is I was able to see all the usernames are also Cyrillic. This will probably be useful to target tougher hashes.

Around this time, I also created a second Notebook in JupyterLab to automate some of the common tasks that I am always doing. For example, running loopback attacks against fast hashes using previously cracked passwords is a great source of new cracks. So that's a good task to automate since I'm constantly doing that in the background.

I also created cells for similar activities that I'm constantly doing such as creating a single list of all the plaintext cracked passwords I can use for training. This in turn allows me to quickly create a PCFG training set on the full list for cracking fast and medium-speed hashes. Aka:

python3 trainer.py -r CMIYC2023 -t ../../research/cmiyc/2023/all_plains.txt
python3 ../../../github/pcfg_cracker/pcfg_guesser.py -r CMIYC2023 | ../../../github/JohnTheRipper/run/john --format=raw-sha256 -stdin raw_sha256_hashlist.txt

This also netted me my first two bcrypt cracks

Since I'm doing this after the contest and can't submit cracks, I also created a quick "scoreboard" in Jupyter to estimate how I'd stack up to the other street teams:

I want to stress again. I'm cheating. These teams actually competed in the competition. I'm leisurely sitting down for short cracking sessions while writing this blog post. Which is another way of saying my numbers are even worse than they appear ;p

Well, I need to do something about that! Next step is to look through my updated meta/cracked list and try to spot patterns:

I now have cracks for every department, but it doesn't seem like the individual departments follow a set pattern. There's a couple of different ways to go from here. The first thing to do is create some custom rules to match patterns that I'm seeing in the plaintexts.

Running the rules above and several others yielded a few more cracks. The other thing I did was create some custom PRINCE wordlists from the cracked passwords using PRINCE-LING from the pcfg toolset.

I created both a full wordlist (as seen above) and also wordlists containing only 500 values for targeting slower hash-types. Using Prince attacks then identified a few more rules that yielded additional cracks.

Another area seemed to be to target non-ascii usernames with non-ascii guesses.

Side note: If you ever have to identify non-ASCII characters using Python the following check is highly effective as it strips non-ASCII characters and then sees if the word shrunk:

if len(key.encode("ascii", "ignore")) < len (key):
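Wrapped up as a small helper (the function name is just for illustration), that check looks like the sketch below. If you are on Python 3.7 or newer, the built-in str.isascii() method gives you the same answer more directly.

```python
def has_non_ascii(key):
    """True if dropping non-ASCII characters makes the string shorter."""
    return len(key.encode("ascii", "ignore")) < len(key)


for word in ["password", "Пароль", "naïve"]:
    print(word, has_non_ascii(word), not word.isascii())
```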

In addition, I later added in the GivenName and SurName fields from the metadata (not depicted above) which helped a lot too.

As we're continuing to throw things against the wall, let's try and build more wordlists from all the metadata. Many of the companies seem to be two words concatenated together. Let's strip them out and break them up.

And ... using dictionaries based on the company metadata was totally unhelpful. I did not get a single additional crack using those dictionaries.

After all of that, let's check where I'm at with my score:

I really need to switch things up because that's not even close to respectable. Looking through the list of cracked passwords no new patterns stood out, but then I decided to map out my progress targeting each department in my Jupyter Notebook.

OMG. This hadn't been apparent as I was looking through the previous view of cracked hashes since I forgot how many more IT hashes there were than any other group, but what I had been doing is basically cracking Sales passwords. I had that pattern down pat, and my PCFG cracker was trained on mostly Sales passwords. How about I run a PCFG attack, trained on CMIYC2023 plains against Bcrypt Sales-only hashes? To create the hashlist using Jupyter was easy:
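A minimal sketch of that kind of filtering cell is shown here. The record layout and the field names ("username", "department", "hash") are hypothetical stand-ins for however you parsed the metadata, and it simply treats anything starting with "$2" as a bcrypt hash.

```python
# Hypothetical records: the field names are assumptions for illustration.
users = [
    {"username": "user1", "department": "Sales",
     "hash": "$2a$<rest of the hash here>"},
    {"username": "user2", "department": "IT",
     "hash": "$2a$<rest of the hash here>"},
    {"username": "user3", "department": "Sales",
     "hash": "<a raw-md5 hash>"},
]


def department_hashlist(records, department, hash_prefix="$2"):
    """Return 'user:hash' lines for one department and one hash family."""
    return [
        f"{rec['username']}:{rec['hash']}"
        for rec in records
        if rec["department"] == department
        and rec["hash"].startswith(hash_prefix)
    ]


sales_bcrypt = department_hashlist(users, "Sales")
print("\n".join(sales_bcrypt))
```

Writing those lines to a file gives you a department specific hashlist to hand to your cracker.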

I then ran a pcfg attack against the sales-only bcrypt hashes using the following command:

python3 pcfg_guesser.py -r CMIYC2023 --skip_brute | john --stdin --format=bcrypt sales_hashes.txt

That was super effective!

I want to stress, this was all using John the Ripper on a single laptop. And these are BCrypt hashes I'm targeting. They are super slow and annoying even with better cracking rigs! After about an hour of cracking time my score had significantly increased.

I then started running the same attack against other hash types such as sha256crypt and had similar success.

Next I realized I needed to build up my dictionary. To do this I took a common prefix (2023), added a '!' to the end, and used JtR's Mask mode to exhaust alpha characters for the remaining letters, capitalizing/lowercasing the first letter and having the rest lowercase:

john --mask=2023[a-Z]?l?l?l?l?l?l! --format=raw-md5 raw_md5_hashlist.txt

I did several variations of the above and found a few new base words but not many. I then re-ran the PCFG trainer to update my sales ruleset and then re-ran my attacks to net a few more hashes cracked. Still, there was certainly room for improvement and I was semi-happy with my wordlist so the next thing to do was look for new mangling rules. To that end I re-ran JtR's mask attacks with a known plaintext word "Sales". For example:

john --mask=?a[Ss]ales?a?a?a?a --format=raw-md5 raw_md5_hashlist.txt

This didn't yield more cracks so I was very kerfluffled about that. There's obviously a good chunk of passwords I'm still missing in this group. Still, running my PCFG attack again (retrained with the newly found base words) against the tougher "Sales Only" hashes had a big impact on my score.

I know these calculated scores have real "if the game lasted one more round I would have won" little brother energy. I have a ton of respect for everyone who did compete in CMIYC and want to stress if I was doing this live, in Vegas, with a million other things going on I would have done much, much worse. But it's helpful to know that without solving any of the "real" challenges in the contest there were still ways to make progress with limited cracking resources.

Conclusion:

I think this is a good place to wrap up this blog post. All these attacks were run on my laptop with John the Ripper, and I think it shows a base level of ability that anyone can do. I also hope this highlighted the value of using Jupyter Notebooks, not just for password cracking, but in any data analysis task you might find yourself doing.

I'd specifically like to thank the KoreLogic team for running yet another great contest! I had a ton of fun digging into it and even more fun talking to everyone at their booth at Defcon! These contests require a ton of work to set up and they help the community so much so I really appreciate all the work their team puts into this.

One thing I'd like you to walk away from these entries with is how useful a tool JupyterLab can be for all your data analysis tasks. It was a long time before I started using it myself, and I'm constantly surprised by how easily it integrates into my workflows and how much more productive it makes me. I highly recommend checking it out, even if you aren't cracking passwords.

I have a lot of ideas of where to go next. I'm tempted to write a follow-up blog entry that goes full spoiler into this contest, looks at how the plains were generated and then uses Hashcat on a more powerful machine to show how to target them (such as using Hashcat's amazing -a 9 association mode). I also have some improvements I need to make to my PCFG toolset based on feedback from other people using it during this contest. Finally I want to clean up my Jupyter notebook that I created for these posts and make it available on Github. When I finally get around to that I'll post a link on this blog entry. These were fun entries to write, and this was a great contest to (belatedly) participate in. Thanks again to KoreLogic and congratulations to all the teams that participated.

Using JupyterLab to Manage Password Cracking Sessions (A CMIYC 2023 Writeup) Part 1

2023-08-19T21:20:00.009-07:00

“We become what we behold. We shape our tools, and thereafter our tools shape us.”
-- Marshall McLuhan

This year I didn't compete in the Defcon Crack Me If You Can password cracking competition. It was my wife's first Defcon, so there was way too much stuff going on to sit around our hotel room slouched over a computer. But now that a week has passed and I'm back home, I figure the CMIYC Street Team Challenge would be a great use-case to talk about data science tools!

Big Disclaimer: I've read spoilers from other teams and have participated in the post-contest Discord server. I'm totally cheating here. The focus is on how you can use JupyterLab to perform analysis while cracking passwords. Not my problem solving skills (or lack thereof).

Initial Exploration of the Challenge Files:

The CMIYC challenge file for street teams is available here. It's a pgp encrypted file so the first thing to do is decrypt it with the password KoreLogic provided.

gpg -o cmiyc_2023_01_street.yaml -d cmiyc-2023_01_street.yaml.pgp

Looking at the file in a text editor, you can quickly see that at first glance it appears to be a yaml file.

Of course, you shouldn't trust anything that the contest organizers throw your way! Next up is to validate the yaml format and see if there is anything obviously wrong with it. A quick way to do that is using yamllint. To install and run yamllint:

pip install yamllint
yamllint cmiyc_2023_01_street.yaml

And the results are .... ok there's a lot of long lines....

Luckily you can easily tune any of the checks that yamllint performs. To hide these errors you can set the max line length to 130 and run yamllint again using the command:

yamllint -d "{extends: default, rules: {line-length: {max: 130}}}" cmiyc_2023_01_street.yaml

This time, the file validated without any warnings. So it looks like the CMIYC challenge file is a valid YAML file. That doesn't mean that there isn't anything sneaky in it, but it makes data parsing a much easier task.

Next, let's quickly glance at the yaml contents. Opening up the file again, I see that it has 260424 lines. But each user entry has a variable number of fields associated with it. To get a quick idea of how many hashes I'm dealing with I used grep on PasswordHash. I then did a quick grep to see how many users there were by leveraging the fact that the YAML secondary category will start with a " - ".

Luckily the two numbers matched so that means I'm looking to crack at least 29,847 password hashes. It also means that every user probably has one password hash associated with them.
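The same sanity check is easy to reproduce in Python if you'd rather keep everything in one place. Here is a minimal sketch that counts both markers in a list of lines. The sample YAML layout and field names below are a made-up illustration and not the real challenge file, so the exact matching may need tweaking for your data.

```python
# Made-up sample layout: not the real challenge file.
sample_yaml = """\
users:
  - Username: user1
    Department: Sales
    PasswordHash: "<hash 1>"
  - Username: user2
    PasswordHash: "<hash 2>"
"""


def count_entries(lines):
    """Count password hash fields and list entries (lines starting '- ')."""
    hashes = sum(1 for line in lines if "PasswordHash:" in line)
    users = sum(1 for line in lines if line.lstrip().startswith("- "))
    return hashes, users


hash_count, user_count = count_entries(sample_yaml.splitlines())
print(f"{hash_count} hashes, {user_count} users")
```

If the two counts ever disagree, that is your cue to go look for users with missing or multiple hashes before trusting any downstream parsing.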

So now that we have the file and have looked around a bit, it seems like it's time to extract the hashes and crack some passwords! My default "Quick and Dirty" approach is to write a short awk script such as the following:

cat cmiyc_2023_01_street.yaml | grep PasswordHash: | awk -F": " '{print "1:"substr($2,2, length($2)-2)}' > greped_password_hashes.txt

The problem with this approach is that it dumps all the hashes into the same file, doesn't separate them by type, and I lose all that user and metadata associated with them. The loss of metadata is a real problem since I suspect it will play a very important role in actually cracking hashes for this contest. I'd really like to have a better way to create hash-lists and manage cracking sessions! This leads us to the next section of this writeup!

Creating a JupyterLab Notebook:

JupyterLab notebooks are a way to organize and document all the random code you write while analyzing data. The name Jupyter stands for the programming/scripting languages it supports: [Julia, Python, R]. I think a better description of JupyterLab is that it's a stone soup. If you are on your own and doing a task only once, then it doesn't really add a whole lot. You're just drinking muddy water and it's a lot of extra pain to set it up and use it. The thing is, you are very rarely on your own, and almost no task is done only once. Heck, I've probably only written one hello world program from scratch in my life. Every other program I've worked on since then I've copied off previous efforts. The documentation JupyterLab provides makes it easier to remember what you've done and build upon it for future efforts.

Long story short, I've never regretted starting a Jupyter Notebook. Somehow that soup is full of delicious ingredients at the end of the day!

Installing Jupyter is super easy. I primarily use Python (I really need to start moving into R), so all I need to do is install it via pip:

pip install jupyterlab

To run it locally you can start it up with the following command and then connect to it with a web-browser:

jupyter lab

Wait! Web-browser?! Yup, it runs as a server-side application which makes it very easy for teams to collaborate using it. Enabling remote access requires a few more steps (such as configuring authentication) which I'll leave for you to Google yourself (the documentation to do this is not great). For this tutorial I'm going to stick with local access.

Starting up Jupyter in the directory for this challenge, initially it's pretty boring. It's just a stone sitting in the bottom of an empty cauldron. I can see the YAML file and open it up, but even then, by default Jupyter doesn't have a lot of built-in functionality to start carving it up.

Things start getting a *little* more interesting when you go ahead and create a Notebook. A Notebook lets you combine Markdown blocks with code blocks that you can execute. Basically it's a wiki where rather than post code snippets you post programs that you can run and save their results.

Ok, so that's great and all. It's a fancy wiki. But time is ticking and we still haven't extracted those password hashes and started cracking them yet! Let me get off my data-scientist high horse and say, by all means, take a moment to use a messy awk/grep script, create a hashlist, and start your systems running default rules against the faster hashes. But once those GPUs are cracking let's come back to this Jupyter Notebook. The first question that is useful to answer is what types of hashes do we need to crack? Now, for this competition the hash types are easy to figure out since KoreLogic posts them on their score-page:

The question is, how many hashes are of each type? One option is you can load them up in your cracker of choice and see which ones get processed. In fact, that's what I did initially, and it "works". But it's still nice to visually see the breakdown as well as understand how many points each category is worth. So let's write a quick Python script!

The first thing to do is load in the YAML file. Jupyter Notebooks are based around the concept of cells. Each cell can contain markdown or code, and can be executed independently, but the results persist until the cell is re-run. I know this is confusing and I apologize, but let me try to explain this with an example. I'm going to make the script to load the YAML file into Python its own cell. This is because this operation takes a bit (it's a big challenge file). This also brings up one of the huge advantages of using Jupyter and what makes it more than just a "fancy wiki with code". It's that variables are saved once a cell is run. This means I only need to load that data file up once, and I can then access it from code snippets in other cells as I advance my analysis of this file.

Cell layout, breaking up your code, running cells in the correct order. These are all issues you'll encounter as you use Jupyter Notebooks more often. But the key here is I don't want to write my entire analysis program at one time. I don't know what I'll encounter during this challenge. But Jupyter saves my execution environment so time intensive tasks like this only need to be run once (or until the underlying data changes). To demonstrate this, let's access that data and try to figure out the breakdown of hash types in a different cell.
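As a rough sketch of what such a second cell could look like: the code below assumes the earlier cell left the parsed data in a list of dicts called users, each with a PasswordHash key. The variable name, the guess_hash_type helper, and the sample values are all made up for illustration, and guessing the type from a hash's prefix and length is only a heuristic that covers the types named in this post.

```python
from collections import Counter
import re

def guess_hash_type(hash_str):
    """Guess a hash type from its shape (a heuristic, not authoritative)."""
    if re.match(r"^\$2[abxy]\$\d\d\$", hash_str):
        return "bcrypt"
    if re.fullmatch(r"[0-9a-fA-F]+", hash_str):
        by_len = {32: "raw_md5", 40: "raw_sha1", 64: "raw_sha256"}
        return by_len.get(len(hash_str), "unknown")
    return "unknown"

def count_hash_types(users):
    """Count hash types over records that actually carry a PasswordHash."""
    return Counter(
        guess_hash_type(u["PasswordHash"]) for u in users if "PasswordHash" in u
    )

# Hypothetical stand-in for the data loaded by the earlier cell.
users = [
    {"PasswordHash": "5f4dcc3b5aa765d61d8327deb882cf99"},
    {"PasswordHash": "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"},
    {"PasswordHash": "$2a$05$" + "." * 53},  # placeholder, not a real bcrypt hash
]
for hash_type, count in count_hash_types(users).most_common():
    print(hash_type, count)
```

Because users lives in the notebook's shared state, this cell can be tweaked and re-run as often as needed without paying the YAML load time again.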

Now I have a count for each hash type, and I can see the hashes are fairly equally distributed. While the hash types are roughly equally distributed, the total points per hash type are not. One bcrypt hash is worth roughly 16 million times more points than a raw_md5. This highlights that the key to this contest is to find patterns by cracking fast hashes, but then focus on cracking the slower high-value hashes. Aka the fast hashes on their own are basically worthless from a point perspective, but cracking them can allow better targeting of high-value hashes.

Side note: The top street team (Hashmob Users) cracked 500 bcrypt hashes. That's almost twice as many as the next nearest player/team, but still only around 15% of the total possible bcrypts.

The next step is to make better hash_lists so we can actually start cracking effectively. As I mentioned earlier, I already parsed out the data so all I need to do is to create a new cell that saves the contents to disk.
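A minimal sketch of such a "save to disk" cell is below. It assumes the parsed data has already been reduced to (username, hash_type, hash) tuples; that tuple layout and the write_hashlists helper are illustrative assumptions, not the notebook's actual code. The file naming follows the raw_sha256_hashlist.txt pattern used later in this post.

```python
from collections import defaultdict
from pathlib import Path
import tempfile

def write_hashlists(entries, out_dir):
    """Write one 'username:hash' file per hash type and return the paths.

    entries is an iterable of (username, hash_type, hash_str) tuples.
    Usernames containing ':' would break the format and are not handled here.
    """
    grouped = defaultdict(list)
    for username, hash_type, hash_str in entries:
        grouped[hash_type].append(f"{username}:{hash_str}")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = {}
    for hash_type, lines in grouped.items():
        path = out_dir / f"{hash_type}_hashlist.txt"
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        written[hash_type] = path
    return written

# Hypothetical sample entries, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_hashlists(
        [("alice", "raw_md5", "5f4dcc3b5aa765d61d8327deb882cf99"),
         ("bob", "raw_sha1", "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8")],
        tmp,
    )
    print(sorted(p.name for p in paths.values()))
```

Keeping the username in front of each hash is what lets the cracked plaintexts be tied back to the account metadata later on.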

At this point, I want to stress, it's really time to stop messing with this notebook and start focusing on cracking some hashes. Let's take a break and run some default cracking sessions against the raw hashes (md5, sha1, and sha256).

For the cracking sessions I simply ran the default John the Ripper attack (default dictionary, rules, and incremental mode) for a couple minutes each on my laptop. Aka:

john --format=raw-sha256 raw_sha256_hashlist.txt

Unsurprisingly this was not very effective, cracking a total of 437 passwords across the three raw hash-types. This is where the CMIYC contest really starts. Next step usually is to start running more complicated attacks, look at the cracks to identify base words and mangling rules, and build upon that. And if I was really competing in this competition that's exactly what I'd do. But as those attacks are running let's go back to JupyterLab and see if we can optimize how we're analyzing those cracks.

To analyze the cracks we need to see which passwords we cracked. John the Ripper has a great option called "show" which allows you to give it a hash-list and it'll output all of the hashes it has cracked. Side note, if you run "show=left" it instead outputs all the hashes it *hasn't* cracked.

But wait, it looks like I forgot something when I created my hash-lists since JtR's show option does not work with my raw-MD5 and raw-SHA256 hashes....

The problem is in how I created those hash-lists since JtR's show option isn't that smart. It needs to be told explicitly what the hash-type is and those hashes are ambiguous. Now I could specify the hash-type on the command line, but for future analysis I want to look at cracked hashes across multiple hash-types so it's easier to recreate the hash-types with the proper hash designator, such as "$dynamic_0$" for raw-md5, included in them.

This is actually good since it provides a learning example on how you can update your code in jupyter (That's how I'm going to spin this oversight anyways ;p). First let's modify the parsing code to add the required fields to the hashes.
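A small sketch of that kind of tweak follows. The only designator taken from this post is "$dynamic_0$" for raw-md5; the FORMAT_TAGS table and the tag_hash helper are hypothetical, and tags for other hash types would need to be looked up in the John the Ripper documentation rather than guessed.

```python
# Only the raw-md5 tag comes from the write-up; extend this table yourself
# after checking the designator JtR expects for each additional format.
FORMAT_TAGS = {"raw_md5": "$dynamic_0$"}

def tag_hash(hash_type, hash_str, tags=FORMAT_TAGS):
    """Prefix a hash with its format designator, if one is known."""
    tag = tags.get(hash_type, "")
    if tag and not hash_str.startswith(tag):
        return tag + hash_str
    return hash_str  # unknown type, or already tagged

print(tag_hash("raw_md5", "5f4dcc3b5aa765d61d8327deb882cf99"))
```

Making the helper idempotent matters in a notebook, since re-running the cell on already-tagged data should not stack a second prefix on top of the first.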

I then re-ran this cell, and after that, re-ran the cell that wrote the hashes to disk for cracking. Once I did this the john --show option worked great for the raw-md5 and raw-sha256 hash-types.

One key thing I want to highlight is you can re-run each cell independently of each other, so if you want to make a quick modification you can without having to run the whole notebook again. What you need to keep in mind though is all the variables are global so the order you run your cells in is very important. Aka running one cell can change the variables that other cells use. So this is a very powerful feature of Jupyter notebooks allowing you to quickly tweak your code. But it's also a very dangerous feature so you need to be a bit careful when using it. If at any point things get wonky you can instead re-run all your cells in the notebook from scratch to reset the global state of things. Ok, enough harping about that, but spoiler alert, I'm going to be tweaking my code a lot as things progress.

While I can perform analysis of the cracked passwords, what I'm really interested in is how the metadata associated with each account matches up to the cracks. How about we do a quick analysis of the variation of the company and department metadata?

Looking at the company info ... it looks like the companies themselves are pretty random.

Two companies stand out though. Let's dig into this and look for companies that have over a hundred users:
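One way to sketch that filter in a cell is with collections.Counter. The field name "Company", the big_groups helper, and the toy records below are assumptions for illustration only; the real YAML keys and counts may differ.

```python
from collections import Counter

def big_groups(users, field, min_users=100):
    """Return (value, count) pairs for values shared by more than min_users users."""
    counts = Counter(u.get(field, "<missing>") for u in users)
    return [(name, n) for name, n in counts.most_common() if n > min_users]

# Toy data: the counts here are invented purely to exercise the filter.
users = [{"Company": "GHosting"}] * 150 + [{"Company": "Tiny LLC"}] * 3
print(big_groups(users, "Company"))
```

The same helper works for the department breakdown by passing a different field name.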

So of all the companies, GHosting and Dandy might be ones to dig into more later.

Now let's do the same for departments:

Now this is something I can work with! Next step, let's see how our cracks break down by department. To do this, we need to import our cracks back into JupyterLab.

You'll notice in the second cell I also realized I needed a quick lookup based on username so I added that in as well. Now that we have the plains we can start printing out cracks based on Department.
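A hedged sketch of those two steps is below. It assumes the hash-lists were written as username:hash, so that each cracked line from john --show comes back as username:plaintext, followed by a blank line and a summary line. The helper names, the "Department" key, and the sample output are made up for illustration.

```python
from collections import defaultdict

def parse_show_output(text):
    """Map username -> plaintext from `john --show` output (user:hash input assumed)."""
    cracks = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # blank separator or the trailing summary line
        username, plaintext = line.split(":", 1)  # plaintexts may contain ':'
        cracks[username] = plaintext
    return cracks

def cracks_by_department(cracks, users_by_name, field="Department"):
    """Group cracked plaintexts by a metadata field, via a username lookup."""
    grouped = defaultdict(list)
    for username, plaintext in cracks.items():
        record = users_by_name.get(username, {})
        grouped[record.get(field, "<unknown>")].append(plaintext)
    return dict(grouped)

# Hypothetical sample output and metadata lookup.
show_output = "alice:Summer2023!\nbob:pass:word\n\n2 password hashes cracked, 1 left\n"
users_by_name = {"alice": {"Department": "Sales"}, "bob": {"Department": "IT"}}
cracks = parse_show_output(show_output)
print(cracks_by_department(cracks, users_by_name))
```

In a notebook, the text passed to parse_show_output could come from a saved file or from capturing the command's output; either way, re-running the cell picks up any new cracks.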

There are more cracked passwords of course that I'm not showing since the picture would be too large, but the following password makes me feel personally attacked...

I kid as I know it wasn't intentional :) One thing that I should mention though is these lists can be easily updated as I crack more passwords. All I need to do is re-run these cells in the Notebook. Looking at the plains, you can start to see that certain mangling rules and dictionaries start to pop out as areas of future exploration.

Backing up though, there probably is some other metadata that I could use to further refine attacks on these lists. In Team Hashcat's excellent writeup (available here) they talked about an enhancement to their team collaboration tools called "Metaf****r" that displayed all the metadata next to the plaintexts. Can we replicate this in JupyterLab? Absolutely!!

I'm going to end this post here as I think this starts to show the value of JupyterLab Notebooks. I know, I didn't really crack that many hashes! For my next post I'll leverage this Notebook to actually create wordlists + mangling rules to target hashes and start to show the real value of data analysis when performing password cracking attacks.

More Password Cracking Tips: A Defcon 2022 Crack Me If You Can Roundup

2022-08-21T22:16:00.002-07:00

“We do not learn from experience... we learn from reflecting on experience.”

-- John Dewey

Introduction:

KoreLogic's Crack Me if You Can (CMIYC) is one of the oldest and most established password cracking competitions. Held every year at Defcon, it serves as a great way to pull together password enthusiasts from all over the world and provides a shared use-case that drives password cracking tool development throughout the rest of the year.

This year I competed as a street team and managed to finish in 12th place:

Now that I've had a week to look back on things, there certainly are strategies where I could have done better. The first is with my cracking setup. I had two systems I used. My primary cracking system was still my laptop running an Ubuntu VM utilizing WSL on a Windows 11 install. My secondary system was the computer I described setting up in this blog post.

Primary Laptop:

CPU: i7-8640U CPU
RAM: 16 GB
Storage: 500GB SSD

Desktop Computer:

CPU: Intel i5-7600k, 1 processor; 4 cores
RAM: 16GB
Storage: 500GB SSD
GPU: GeForce GTX 1070

I really didn't do a good job of splitting my work between both these systems and making sure that my limited GPU was always working. For example, I had a bad habit of running JtR sessions on my desktop computer. Long story short, one week later I have a lot of ideas for future projects to improve my cracking skills, and I'm super excited to start working on them, which is the real benefit of competing in contests like this. Rather than go through a blow for blow recount of the contest, I'll instead try to highlight a couple of tips and lessons I learned along the way.

Core Contest Techniques:

Before diving into this write-up, I HIGHLY recommend reading my previous write-up for the CrackTheCon contest which is available here.

I'm going to skip most of the techniques covered there, but I will say they all applied to the KoreLogic contest as well. It really surprised me how much I referred back to that article when I was competing in this contest.

Contest Overview:

At a high level the contest consisted of cracking a variety of encrypted files, each of which would have individual hashes to crack. For the street teams, the passwords to crack the encrypted files were fairly simple, so the real challenge there was getting your tooling set up properly to handle those files.

Once the encrypted files were cracked, the unencrypted files could be opened up to reveal a set of very quick to compute hashes. As someone who doesn't have a lot of compute resources to throw at the problem, I really appreciated the fact that the hashes were so fast! Cracking these hashes was all about trying to figure out the base words used to construct them, as well as the mangling rules that were applied. One thing I will say is that the selection of mangling rules Korelogic picked made "loopback" style attacks significantly less effective than the CrackTheCon contest. Don't get me wrong, loopback attacks were still very powerful! But as a player I really needed to analyze the passwords and figure out the underlying mangling rules vs. using loopback as a crutch.

Long story short, I thought that KoreLogic outdid themselves when it came to creating a fun challenge. I thought the contest had a good difficulty scaling to make it approachable to a wide variety of players while still providing areas of growth and frustration to more experienced players.

Tip #1: Make use of John the Ripper *2John utilities to crack encrypted files

Password cracking programs don't need to use the entire encrypted file. Just think about it: would you really want to have your cracking program parse a 100 GiB file every time it makes a guess? What cracking programs really need is a "hash" to make a guess against. To extract that "hash", and to save it in a format that password cracking programs can utilize, John the Ripper comes with a large selection of helper programs in the /john/run/ directory which are identifiable by the '2john' suffix. You can see this below:

The main challenge is to figure out which helper program you want to use. For example, here is me running pdf2john to extract the password hash from the list23-ThisYearsWorst.pdf challenge:

Rather than having it print out to your screen, I'd recommend piping the output of this into a file which you would then load in as the target hash file for your cracking program. One important thing: If you are cracking multiple encrypted files at once, you can store all of these hashes in the same file, just like with any other John the Ripper hash format. Many of these hashes are also supported by Hashcat too, so once you extract them using the 2john helper utilities, don't feel like you have to stick to using John the Ripper to crack the hashes.

Tip #2: Make sure you compile John the Ripper with all the optional libraries to enable cracking encrypted files.

One downside of the flexibility that John the Ripper provides by being able to compile and run on just about anything is that it will gladly compile without certain features and cracking modes being enabled if you don't have the correct libraries present when building it. This can be very hard to diagnose after the fact beyond "For some reason JtR doesn't seem to recognize a particular hash type" style errors.

This happened to me in the previous CrackTheCon contest where I couldn't get John the Ripper to crack an encrypted Zip file. Luckily for this contest I realized what was going on and was able to fix it, but I really need to update my JtR install instructions here with the new information.

That being said, here are all the additional libraries I needed to have before running './configure' to build John the Ripper (with Ubuntu 18) to enable support for cracking encrypted files used during the CMIYC contest:

sudo apt-get install libz-dev bzip2 yasm libgmp-dev libpcap-dev libnss3-dev libkrb5-dev pkg-config libbz2-dev zlib1g-dev libcompress-raw-lzma-perl

Most of these libraries were specifically for cracking the 7zip and zip files. The perl library was for being able to successfully run 7zip2john.pl.

What this process really highlighted for me is that I really should create an Ansible Playbook to configure a system to run John the Ripper. Going back through all my write-ups in the past to figure out the different dependencies is no fun, and causes a lot of problems when I accidentally miss one of them. Unless I get distracted, watch this space as I'll probably end up posting about that Ansible playbook here, and posting it to github.

Tip #3: Save your John the Ripper rules in an external file

Let's face it, John the Ripper's default config file has grown way too large and unwieldy to effectively edit during a password cracking competition. Instead, I highly recommend including your custom rules in an external file to make it easier to quickly find the rules you want to edit or modify. Another advantage of this approach is if you upgrade your copy of John the Ripper, and the config file changes, your old rules will still be saved.

The first step to do this is to include a link in your john.conf file to your custom .conf file by inserting the line:

.include <FILENAME_OF_YOUR_CONFIG_FILE>

Here is a snapshot of my john.conf file I used for this contest:

And here is a subset of the rules in my custom "cmiyc.conf" file for targeting challenge 20 hashes:

You'll notice I still have individual rule sets in my custom configuration. This way I can perform quick cracking runs to figure out new rules, (or pipe the output to other John sessions. See Tip #4), and then have longer runs to perform on new dictionary words that I later identify.

Tip #4: Use the --stdout and --pipe options to combine multiple cracking rules

In the screenshot above of my rules for targeting challenge 20, you'll see similar blocks of rules where the only difference is the first mangling rule, (either nothing, 'c', or 'u'). 'c' stands for Capitalizing and 'u' stands for UPPERCASE. The proper way to handle this would be to leverage John the Ripper's rule preprocessor to try combinations of different rules. The rule preprocessor is one of those killer features that JtR has but Hashcat doesn't. With it you can try multiple rule types, (such as capitalization and uppercasing), by including them between brackets []. For example:

[cu]

Here is a screenshot of that in action:
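To make the expansion concrete, here is a small Python sketch of the idea behind the brackets. This is a simplified illustration and not John the Ripper's actual preprocessor: it only handles plain character lists and simple ranges like [0-9], ignores escapes, and the order of the expanded rules is just whatever itertools.product yields.

```python
import itertools
import re

def _expand_class(body):
    """Expand the inside of a [...] class, e.g. 'cu' or '0-9', into characters."""
    out = []
    i = 0
    while i < len(body):
        if i + 2 < len(body) and body[i + 1] == "-":
            out.extend(chr(c) for c in range(ord(body[i]), ord(body[i + 2]) + 1))
            i += 3
        else:
            out.append(body[i])
            i += 1
    return out

def expand_rule(rule):
    """Turn one rule containing [...] classes into the list of plain rules it implies."""
    parts = re.split(r"(\[[^\]]*\])", rule)
    choices = []
    for part in parts:
        if part.startswith("[") and part.endswith("]") and len(part) > 2:
            choices.append(_expand_class(part[1:-1]))
        else:
            choices.append([part])  # literal text between classes
    return ["".join(combo) for combo in itertools.product(*choices)]

print(expand_rule("[cu]"))
print(len(expand_rule("[cu] $[0-9]")))
```

The second call shows why this is so handy: one line with two bracket groups stands in for twenty individual rules (two case choices times ten appended digits).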

Still, there are times when you have a larger set of rules you quickly want to apply one or more additional mangling rules to. One of the easier ways to do this is to pipe one instance of JtR or Hashcat into another instance of your cracking program of choice.

The format for doing this with both JtR and Hashcat is slightly different. With JtR, the base generating instance will have the '--stdout' flag in place of a hashfile. You can then pipe '|' the results into another JtR instance that has the '--pipe' flag instead of a wordlist. Note: You will want to use the '--pipe' command and not the '--stdin' command so that the rules of the second instance are applied to every word sent to it. For example:
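If you'd rather drive this from a script or notebook than type it by hand, the pipeline can be sketched with Python's subprocess module. The --wordlist, --rules, --stdout, and --pipe options are real JtR flags; the ruleset names FirstPass and SecondPass, the file names, and the helper functions are placeholders invented for this example.

```python
import shlex
import subprocess

def build_pipeline(wordlist, first_rules, second_rules, hashfile, john="john"):
    """Build argv lists for 'john ... --stdout | john --pipe ... hashfile'."""
    generator = [john, f"--wordlist={wordlist}", f"--rules={first_rules}", "--stdout"]
    cracker = [john, "--pipe", f"--rules={second_rules}", hashfile]
    return generator, cracker

def run_pipeline(generator, cracker):
    """Run both JtR instances, feeding the first one's guesses to the second."""
    gen = subprocess.Popen(generator, stdout=subprocess.PIPE)
    crack = subprocess.Popen(cracker, stdin=gen.stdout)
    gen.stdout.close()  # so the generator sees a closed pipe if the cracker exits
    crack.wait()
    gen.wait()
    return crack.returncode

# Only prints the equivalent shell command; call run_pipeline() to execute it.
gen_cmd, crack_cmd = build_pipeline("words.txt", "FirstPass", "SecondPass", "hashes.txt")
print(shlex.join(gen_cmd), "|", shlex.join(crack_cmd))
```

The first instance has no hash file at all, while the second has no wordlist, which mirrors the '--stdout' and '--pipe' roles described above.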

You can also pipe guesses into Hashcat instead of John the Ripper. This is a very powerful technique because you can take advantage of John the Ripper's rule preprocessor, (or features such as its better Incremental Markov mode, or built-in Prince mode), but still have Hashcat take advantage of your GPUs when cracking hashes. All you need to do in Hashcat is not enter in a wordlist file and it will automatically accept guesses from stdin. This tends to work better if you also have a large number of mangling rules in Hashcat to help keep those GPUs of yours busy since you want to limit the amount of time transferring information from your CPU to the GPU. Aka if you can transfer a limited number of base "words" from the CPU and expand them via additional mangling rules in the GPU, you'll achieve a higher guess per second rate. Below is a screenshot of using this approach. Ignore the '--force' option as I took the screenshot on my laptop vs. my desktop which I normally run my Hashcat sessions from.

Tip #5: For password cracking competitions, perform web searches on "interesting" words

This was the piece of advice I wish I could build a time machine and send back to my past self. I really didn't do a good job of this during the contest. This is despite the fact that Saturday night I finally googled some of the words for challenge #20 and found that creating wordlists from articles discussing a high schooler hacking the Homecoming queen prom vote were extremely effective. In fact, I had the biggest jumps in my score thanks to finding those articles.

This is an area ripe for tool development. Admittedly it likely won't have much real world applications. But for contests, having a tool or process to automate the identification of sources of wordlists would be super helpful. In my head, the tool would take the following approach:

Use the PCFG trainer to create an input wordlist of the base words in cracked passwords
Identify words that weren't in the "top 500 English words" or in John the Ripper's "password.lst" wordlist
Perform a google search and identify results that contained [all/most] of the words to identify possible sources of the wordlist
Scrape the sites and build a custom dictionary.
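The first two steps of that idea can be sketched in a few lines of Python. To be clear about the shortcuts: the regex below is a crude stand-in for the PCFG trainer's base-word extraction, the common_words set stands in for the "top 500" and password.lst lists, and the searching and scraping steps are left out entirely. All names and sample data are hypothetical.

```python
import re

def unusual_base_words(plaintexts, common_words, min_len=4):
    """Rank alphabetic runs from cracked passwords that are not in a common-word list."""
    common = {w.lower() for w in common_words}
    found = {}
    for plain in plaintexts:
        for word in re.findall(r"[A-Za-z]+", plain):
            word = word.lower()
            if len(word) >= min_len and word not in common:
                found[word] = found.get(word, 0) + 1
    # Most frequent first, ties broken alphabetically.
    return sorted(found, key=lambda w: (-found[w], w))

cracked = ["Homecoming2022!", "homecoming#1", "password123", "Queen99"]
print(unusual_base_words(cracked, common_words={"password", "queen"}))
```

The words that survive the filter are the candidates worth feeding into a web search.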

Who knows, maybe I'll get motivated and have this done before next year's CMIYC?

Tip #6: Use Linux's 'alias' command to make your commands shorter

I'll admit I don't always do this, (for example see all of the screenshots above), but rather than type the full path for John the Ripper or Hashcat, you can use Linux's 'alias' command to link to them. For example:

alias john=/mnt/c/github/JohnTheRipper/run/john
alias hashcat=/mnt/c/tools/hashcat/hashcat.bin

With the above, now you can simply type 'john' or 'hashcat' to invoke them. Note: This works better than trying to add the John the Ripper or Hashcat directories to your command path as John the Ripper specifically gets weird when you do that. This probably won't help you crack more passwords, but it is a nice quality of life improvement, especially if you have different directories you are maintaining for contest hash lists and dictionaries.

Tip #7: Modify the PCFG's multiword detector to identify shorter words

Of course I need to make a new tip utilizing the PCFG toolset! The PCFG trainer is a really powerful tool to create input dictionaries from cracked passwords. During this contest, one thing I noticed from the passwords I was cracking was that KoreLogic added a large number of two/three letter prefixes/suffixes to the base word. For example, here is some of the mangling rules I started using.

One problem I had utilizing the PCFG trainer on these passwords was that its multiword detector enforced a minimum length of five characters for detecting base words. This was to reduce false positives. Or to put it another way, if you are parsing 60 million passwords, if you reduced the minimum base-word length to three characters, everything would look like a multiword!
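To illustrate the tradeoff, here is a toy word splitter with a configurable minimum length. This is not the PCFG toolset's multiword detector, just a sketch under the assumption that known_words is a set of lowercase words already seen in training; the function name and examples are invented.

```python
def split_multiword(text, known_words, min_len=5):
    """Return known words (each >= min_len) that exactly tile text, or None."""
    text = text.lower()
    best = {0: []}  # end position -> one segmentation that reaches it
    for end in range(1, len(text) + 1):
        for start in range(0, end - min_len + 1):
            if start in best and text[start:end] in known_words:
                best[end] = best[start] + [text[start:end]]
                break
    parts = best.get(len(text))
    return parts if parts and len(parts) > 1 else None

words = {"cat", "dog", "car", "pet"}
print(split_multiword("catdog", words, min_len=5))  # too short to be detected
print(split_multiword("catdog", words, min_len=3))  # now found as two words
print(split_multiword("carpet", words, min_len=3))  # the false-positive risk
```

With a small contest-sized training set the extra matches are mostly useful; with tens of millions of passwords, splits like the last one start to swamp the real multiwords.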

The difference during a competition is that your training list is not 60 million passwords long (unless you are doing really, really well!). Therefore it was helpful for me to modify my code to detect multiwords that were only three characters in length. I eventually plan on releasing a patch to the PCFG toolset to make this a command line option, but until then you can make the changes yourself here in the code:

Conclusion:

As I continue to reflect on this contest, I'll probably keep adding to the list of tips above. Even as I write this conclusion other ideas are popping into my head (such as using the online version of Microsoft OneNote to pass documentation and commands between different computers). But I want to conclude by saying I hope these blog posts are helpful, and that I really wanted to thank the KoreLogic team once again for running an amazing contest.

Password Cracking Tips: A CrackTheCon Roundup

2022-05-03T19:51:00.011-07:00

“It is common sense to take a method and try it. If it fails, admit it frankly and try another. But above all, try something.”― Franklin D. Roosevelt

CrackTheCon, a password cracking contest run by CynoSurePrime, just finished. I competed as a Street team and I was really impressed. This was a well run contest, and I felt was very friendly to new and experienced password crackers alike. At least from a player's perspective, the infrastructure was rock solid, there was a great variety of challenges, and the difficulty level had a good gradient. Thanks to everyone who helped put this contest together!

My computer setup for this challenge was limited. I performed all my cracking on one laptop with no GPU support. You read that right, I was rolling old school with a pure CPU cracking session. Because of that, my primary password cracking program was John the Ripper, which has a ton of features that I prefer when I can't just let HashCat burn through some GPUs. While my operating system was Windows, I used Windows Subsystem for Linux to run John the Ripper and perform analysis on the cracked passwords. You can read about how to configure JtR and WSL here.

This led to a modest performance of 9th place:

GPUs are nice, and this certainly shows it! If you have some GPUs available I highly recommend using them along with HashCat. As some backstory, I still have my main password cracker set up to run medical security capture the flag events, and I was too lazy to get it reconfigured for this contest.

Therefore you should probably take everything I say with a healthy degree of skepticism. Based on the chat on Discord afterwards though I realized there are a few password cracking tips that might be helpful to share. One important point I want to stress is that anyone can make use of these tips. You don't need a fancy GPU hash-cracking monster to crack passwords. In fact, most of my attacks were "semi-automated" with very little manual analysis of the cracked passwords. So you can apply all of these techniques yourself regardless of your past level of experience.

Tip #1: Make sure your John the Ripper build is based off Bleeding-Jumbo, and update it regularly!

Even if you normally use Hashcat, JtR is a very powerful password cracking tool that has a lot of nice "research friendly" features. This makes it an extremely useful tool to have in your toolbox. As a general rule of thumb, if I'm cracking passwords with a GPU I use Hashcat. If I'm leveraging my CPU I use JtR. The key to JtR is you need to use the Bleeding-Jumbo version of it. The "main" branch prioritizes compatibility with different architectures, but the Bleeding-Jumbo branch goes all in on features. As an example, over the last couple of months they added "duplication detection" to the early portions of a password cracking session (to help with slow or salted hashes), and performed a complete rework of the included rulesets. What I do is use Git to clone JtR from its github repo at: https://github.com/openwall/john, check out the "bleeding-jumbo" branch, and then periodically pull down updates and rebuild it, (roughly once a month). This makes a huge difference!

As to the deeper question of "Why would you ever crack passwords on a CPU and not GPU", that gets more complicated... At a high level, I do a lot of password cracking research from a researcher and hobbyist viewpoint, so a CPU based approach makes it easier to tailor attacks. The real reason though is I don't own a massive cracking setup. From a training perspective, this means even if you only have a Raspberry Pi, you can pretty much recreate all of the techniques described here. That being said, sometimes the features of John the Ripper still outweigh the speed that a GPU provides, and at the very least it's a good tool to run in parallel on a VM or research computer while running longer GPU sessions with Hashcat on your main cracking box.

Tip #2: Use the '--loopback' option to leverage previously cracked passwords in your rules

The clickbait title was going to be: "This one simple trick is like a cheat code for password cracking competitions!" That's not much of an exaggeration. Full disclaimer, this technique probably resulted in around 50% of my successful password cracks in the CrackTheCon competition. I'd periodically wander over to my laptop, and feel l33t by hitting the enter key to kick off a new loopback session. So if you only follow one of these tips, this is the one to pay attention to.

As to the actual technique itself, John the Ripper's '--loopback' option tells JtR to use previously cracked passwords as a wordlist in a cracking session. Hashcat supports loopback attacks as well. There's a million different names for this approach, which by itself should tell you how powerful it can be. You can further optimize this attack by specifying a different .pot file from your main one such as '--loopback=Challenge1.pot'. This can be helpful if you are keeping your .pot files separate for different challenges, (I don't actually do this, but some people might). Once you are using --loopback to generate your base words, you can then apply mangling rules to them like a normal wordlist. Aka by also adding: '--rules=hashcat'.
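Conceptually, loopback treats the plaintext side of the .pot file as a wordlist. The sketch below shows that idea in Python; it is not how JtR implements it, and it assumes simple pot lines shaped like hash:plaintext where the hash part contains no ':' of its own. The helper name and sample lines are made up.

```python
def pot_plaintexts(pot_lines):
    """Yield unique, non-empty plaintexts from lines shaped like 'hash:plaintext'."""
    seen = set()
    for line in pot_lines:
        line = line.rstrip("\r\n")
        if ":" not in line:
            continue
        plaintext = line.split(":", 1)[1]  # everything after the first ':'
        if plaintext and plaintext not in seen:
            seen.add(plaintext)
            yield plaintext

sample_pot = [
    "$dynamic_0$5f4dcc3b5aa765d61d8327deb882cf99:password\n",
    "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8:password\n",
    "0123456789abcdef0123456789abcdef:amanita:1\n",  # fake hash, plaintext with ':'
]
print(list(pot_plaintexts(sample_pot)))
```

A list like this is also handy on its own, for example as input to other tools when you want to analyze base words outside of a cracking session.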

What this means was that my typical cracking session would start by running fairly basic attacks to generate an initial set of cracked passwords. For example, I'd run '--incremental' to brute force shorter passwords. I'd run a quick cracking session using the wordlist 'dic-0294' and hashcat + single rules to get slightly more complicated passwords. And I'd run a quick PCFG guessing session as well. After that initial set of passwords were cracked, loopback became one of my main attacks. And as you can see from the results, it was very effective.

Now in the real world, loopback attacks, while still powerful, aren't nearly as game breaking as they are in a password cracking competition. Real users don't exclusively pick their passwords from a list of fungi names. But even then, loopback can still be useful to help augment your other cracking sessions.

Tip #3: John the Ripper supports dynamic hash formats on the command line. No need to modify a kernel or look through lots of documentation!

This being a CynoSurePrime cracking competition, there were bound to be weird hashtypes to crack. This problem also pops up from time to time in real life cracking situations where some vendor decides to roll their own password hashing function. This can be a challenge since writing your own Hashcat kernel is not a lot of fun. One area where JtR really shines is its extensive "Dynamic" hash type support. You can see the main formats that JtR supports by specifying '--list=formats' on the command line. That only shows the "mainstream" formats though. If you really want to see all the various formats supported by "Dynamic" mode you can specify '--list=subformats' on the command line.

There's a lot of them included, and sometimes even that is a pain to look through and remember. One feature of JtR most people don't know about though is you can specify the hash details directly on the command line. For example, Challenge2 of the CtC contest was five rounds of MD5. To crack this with John the Ripper I simply needed to specify the following command:

./john '--format=dynamic=md5(md5(md5(md5(md5($p)))))'

The single quotes around the format are important so that your shell doesn't misinterpret the parentheses (). Basically though, you can specify the hash type, and how the password ($p) is applied, along with any salt ($s) as well. Dynamic mode supports multiple types of hash primitives, so for example, with Challenge4 which was a sha256 of a md5 hash I was able to use the following command:

./john '--format=dynamic=sha256(md5($p))'
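If you want to sanity-check what one of these expressions computes, a few lines of Python with hashlib will do. This sketch assumes that each inner hash is passed to the next one as its lowercase hex string, which is my understanding of how these dynamic expressions behave by default; the nested_hash helper is made up for illustration.

```python
import hashlib

def nested_hash(password, algorithms):
    """Apply hashes innermost-first, feeding each lowercase hex digest to the next."""
    data = password.encode("utf-8")
    for name in algorithms:
        data = hashlib.new(name, data).hexdigest().encode("ascii")
    return data.decode("ascii")

# md5(md5(md5(md5(md5($p))))) and sha256(md5($p)) for a test password.
five_md5 = nested_hash("password", ["md5"] * 5)
sha256_of_md5 = nested_hash("password", ["md5", "sha256"])
print(five_md5)
print(sha256_of_md5)
```

Hashing a known password this way and loading the result into JtR is a quick check that the format string you typed is the one you meant.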

Long story short, if you ever find yourself needing to crack a weird hash type, don't forget about John the Ripper's Dynamic formats.

Tip #4: Leverage MDXFind to identify unknown hash types

I'll be up-front. I did not follow this tip and I'm really kicking myself over it. To guess the hash types, I relied on trying the suggestions provided by John the Ripper, and when that failed, I manually tried different hashing functions using the command line dynamic mode (Tip #3). Don't be like me. If you are dealing with an unknown hash, the tool you want to use is MDXFind. You can obtain it here: https://www.techsolvency.com/pub/bin/mdxfind/. If I had followed this advice, I probably would have ranked higher as I never figured out that Challenge #5 was:

--format=dynamic=sha256(sha1($p))

To get MDXFind running on an Ubuntu image running on Windows Subsystem for Linux (WSL2):

Download mdxfind.1.116.bin
sudo apt-get install libjudy-dev
sudo apt-get install libmhash-dev
sudo apt-get install librhash-dev

Here is an example leveraging MDXFind to identify the hash type for Challenge #5. The hashes look like SHA256, so the command I'd start with would be:

./mdxfind.1.116.bin -h 'SHA256' -f Challenge5.txt wordlist.txt

-h 'SHA256': is the base hash type to use
-f Challenge5.txt: is the hashlist
wordlist.txt: is the wordlist

And the results...

It quickly identified SHA256(SHA1($p)) in 4 seconds... Yeah that would have been nice to use.
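To make the idea concrete, here is a toy Python sketch of the "try every nesting and see what sticks" approach. To be clear, this is not how MDXFind works internally (it is far faster and supports many more algorithms). The primitive list, the depth limit, and the function names are all assumptions for illustration.

```python
import hashlib
from itertools import product

# Assumption: a tiny set of primitives; real tools try far more
PRIMITIVES = ("md5", "sha1", "sha256")

def apply_chain(word, chain):
    """Hash word through chain (innermost first), passing hex digests along."""
    data = word.encode("utf-8")
    for name in chain:
        data = hashlib.new(name, data).hexdigest().encode("ascii")
    return data.decode("ascii")

def identify_chains(target_hashes, words, max_depth=2):
    """Return {chain: hit_count} for every chain that cracks at least one hash."""
    targets = {h.lower() for h in target_hashes}
    hits = {}
    for depth in range(1, max_depth + 1):
        for chain in product(PRIMITIVES, repeat=depth):
            count = sum(1 for w in words if apply_chain(w, chain) in targets)
            if count:
                hits[chain] = count
    return hits

# Toy data: a "mystery" hash that is really sha256(sha1($p))
mystery = [apply_chain("password", ("sha1", "sha256"))]
print(identify_chains(mystery, ["123456", "password"]))
```

The point is that a handful of common words plus a brute-force search over nestings is usually enough to reveal the hash type, as long as at least one password in the list is weak.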

Tip #5: John the Ripper supports mangling rules on the command line

In password cracking competitions one of the keys is to try and identify mangling techniques and create rulesets to target them. Now, I'll admit that for this competition, I mostly relied on the included rulesets (Tip #6), and using the PCFG Toolset to autodetect and create rulesets (Tip #7). A hands-on approach is more effective, but it quickly becomes annoying to have to constantly open up your ruleset file to modify it. This may sound like a minor nitpick, but your analysis time is valuable. One hidden feature John the Ripper supports is creating rulesets right on the command line. This is a huge timesaver, and in my opinion one of the killer features of John the Ripper. For example, let's say you want to duplicate a word and then add two digits to the end of it. Your JtR command might look like:

./john --wordlist=somelist.txt '--rules=:d$[0-9]$[0-9]' hashlist

Key points:

You need to include --rules in single quotes. Aka '--rules...'
Your rule needs to start with ':' which is JtR's "no-op"
You can include multiple rules separated by a ';'. For example: '--rules=:d$[0-9]$[0-9];:$[a-z]'

This is also very useful to test the output of your rules. To do this you can feed in a single word via stdin, and then you can apply rules to it using JtR's --pipe option. So for example:

echo test | ./john --stdout --pipe '--rules=:d$[0-9]$[0-9]'

It may seem weird, but this is one of those tricks that makes me smile every time I use it.
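If you want to reason about how many guesses a rule like that produces per input word, here is a rough Python equivalent of "duplicate the word, then append two digits". It only mimics the end result. The function name is made up, and the order of the candidates may not match the order JtR's rule preprocessor generates them in.

```python
from itertools import product
import string

def duplicate_and_append_digits(word, digits=2):
    """Mimic the effect of the rule d$[0-9]$[0-9] on a single word."""
    base = word + word  # 'd' duplicates the word
    for combo in product(string.digits, repeat=digits):
        yield base + "".join(combo)  # each '$[0-9]' appends one digit

candidates = list(duplicate_and_append_digits("test"))
print(len(candidates), candidates[:3])
```

Each [0-9] range multiplies the candidate count by ten, so this single rule line turns every dictionary word into 100 guesses. That is worth keeping in mind before stacking several ranges against a slow hash.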

Tip #6: Making use of John the Ripper's mangling rulesets

John the Ripper includes a ton of wordlist mangling rules. Given this contest was run by CynoSurePrime, I figured there would be heavy hashcat users on the hash creation side, so I primarily used the 'hashcat' ruleset. Aka:

./john --wordlist=somelist.txt --rules=hashcat

This runs through some of the main rules included in Hashcat such as:

[List.Rules:hashcat]
.include [List.Rules:best64]
.include [List.Rules:d3ad0ne]
.include [List.Rules:dive]
.include [List.Rules:InsidePro]
.include [List.Rules:T0XlC]
.include [List.Rules:rockyou-30000]
.include [List.Rules:specific]

Other useful rulesets (though not as useful for this particular competition)

--rules=phrase: Great for attacking passphrases
--rules=l33t: Good for attacking l33tsp33k passwords
--rules=ShiftToggle: Good for attacking weird capitalization
--rules=by-score: A good set of rules to use for fast hashes
--rules=by-rate: A good set of rules to use for slower hashes

Tip #7: Using the Pretty Cool Fuzzy Guesser (PCFG) Toolset

Of course I was going to mention the PCFG toolset! I just recently released version 4.3 and it has a ton of expanded documentation, plus better support for cracking Russian passwords. You can get it here:

https://github.com/lakiw/pcfg_cracker

The default PCFG ruleset is usually decent, but not great when it comes to password cracking competitions. This is because contest passwords don't resemble RockYou passwords which the default ruleset was trained on. The real value is using the PCFG trainer to learn new rules and create new wordlists based on cracked passwords. The trainer does a lot of cool stuff in the backend such as multiword detection, keyboard walk identification, and other mangling rule generation. It can also be fairly effective even if you only have a couple of hundred cracked passwords.

To train a PCFG ruleset:

You first need to create the training list. Adding support for JtR pot files has been on my todo list forever, but currently you need to strip the hash information off of your cracked passwords. For example, if I wanted to create a training list for Challenge #2 which was five rounds of md5 I ran: "cat john.pot | grep 'md5(md5(md5(md5(' | awk -F':' '{print$2$3$4$5$6}' > plains_2.txt" Yes this is a horribly inefficient way to do this, but it printed out all of the hashes, then only printed ones with the correct hashtype, then stripped off the hash, and then saved the results to plains_2.txt.
Run the PCFG trainer on the set. For example: "python3 trainer.py -r Challenge2 -t plains_2.txt"
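If the cat/grep/awk one-liner makes you wince, the pot file stripping step can be sketched in a few lines of Python. This assumes each pot line has the form hash_field:plaintext and that the hash field itself contains no ':' (true for the unsalted dynamic formats I was cracking, but an assumption in general). The sample lines below are made up, and plains_from_pot is a hypothetical helper, not part of the PCFG toolset.

```python
def plains_from_pot(pot_lines, marker):
    """Return plaintexts from pot lines whose hash field contains marker."""
    plains = []
    for line in pot_lines:
        line = line.rstrip("\r\n")
        if ":" not in line:
            continue
        # Split on the FIRST ':' only, so passwords containing ':' survive intact
        hash_field, plain = line.split(":", 1)
        if marker in hash_field:
            plains.append(plain)
    return plains

# Made-up example lines in the general shape of a pot file
sample = [
    "@dynamic=md5(md5(md5(md5(md5($p)))))@0123456789abcdef0123456789abcdef:amanita:42",
    "@dynamic=sha256(md5($p))@" + "ab" * 32 + ":chanterelle",
]
print(plains_from_pot(sample, "md5(md5(md5(md5("))

# To build a training list from a real pot file:
# with open("john.pot", encoding="utf-8", errors="replace") as pot:
#     plains = plains_from_pot(pot, "md5(md5(md5(md5(")
# with open("plains_2.txt", "w", encoding="utf-8") as out:
#     out.write("\n".join(plains) + "\n")
```

One difference from the awk version: splitting on only the first ':' keeps any colons that are part of the password, where concatenating awk fields silently drops them.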

Once you have the training set you can do a couple of things:

You can run a PCFG attack against the challenge using the new ruleset. For this I recommend disabling Markov generation using the --skip_brute option. For example: python3 pcfg_guesser.py --skip_brute -r Challenge2 | ../JohnTheRipper/run/john --stdin '--format=dynamic=md5(md5(md5(md5(md5($p)))))' Challenge2.txt
Another good option is to use princeling to generate a wordlist optimized for PRINCE attacks. You can also use this as a normal wordlist as well. For example, this will create a 50k word dictionary: python3 prince_ling.py -r Challenge2 --size 50000 -o new_wordlist.txt
You can manually go through the generated rules file to identify mangling rules. A good option to open up is: Rules/<RULENAME>/Grammar/grammar.txt

Summing all of this up, 99% of my cracking sessions for this contest were:

Identify the correct hashtype
Run a default attack against it using the dict0294 wordlist and the hashcat rules
At the same time run a JtR bruteforce Incremental attack "./john --incremental=All". That's the nice thing about CPU cracking. I have enough cores that I can run around three sessions at the same time on my laptop before things get really slow.
Run a couple of loopback attacks using the hashcat rules
Train a PCFG ruleset and run a PCFG cracking session until it gets to around 95% coverage. You can see the coverage by hitting enter while running it.
Run a couple more loopback attacks
Re-Train the PCFG ruleset
Create a wordlist using the PCFG prince_ling
Run a PRINCE cracking session using the wordlist and JtR
Run a normal cracking session using the prince wordlist and the hashcat ruleset
Re-Train the PCFG ruleset and run a PCFG cracking session
Repeat. Maybe run a longer incremental session, or try another input dictionary.

Following these steps, you too can get 9th place in a password cracking competition!

Installing John the Ripper on Microsoft's Windows Subsystem for Linux (WSL)

2019-08-01T19:40:00.000-07:00

"I see my path, but I don't know where it leads. Not knowing where I'm going is what inspires me to travel it." --Rosalía de Castro

Introduction:

With great regret I finally decided to retire my 10-year-old MacBook Pro as my personal travel laptop. Part of that is I'll be attending Defcon this year to help out #IAmTheCavalry and the #WeHeartHackers initiative by volunteering in the Defcon Biohacking village. Side note, if you are in Vegas, feel free to drop by and we can talk about cyber security in a clinical setting. Doctors and nurses hate passwords too!

Getting back on track, I wanted something a bit more modern to participate in this year's Crack Me If You Can competition, as well as to play around in the various hacking villages, so I bought myself a Microsoft Surface Book. The challenge was that while Hashcat has a native Windows build, my experiences getting John the Ripper (JtR) running on Windows in the past have been ... troubled. That's part of why I loved my old MacBook. It just worked (sorry Linux), and JtR ran great on it. Now I could re-image my laptop with Linux or dual boot it, but having Excel and Notepad++ makes my life so much better. Plus, I'm really digging the tablet. So before I went ahead and installed VirtualBox and ran JtR in a VM I figured I'd try and install JtR using the new Windows Subsystem for Linux (WSL). Long story short, it worked great and was straightforward to do, so I figured I'd share my experiences.

Other Options for Running John the Ripper on Windows

If you want to skip this guide and instead install a pre-built executable of JtR, you can obtain a relatively up-to-date version here: https://github.com/claudioandre-br/packages/releases/tag/jumbo-dev

Note: I've never run these, so I'm not very familiar with how they perform.

Other options include installing JtR using Cygwin. A guide for doing so is available here: https://openwall.info/wiki/john/tutorials/win64-howto-build

Finally, a very common option that I referenced above is to simply install VirtualBox, and then run JtR in a VM.

Windows Subsystem for Linux:

If you are wondering what WSL is, you are not alone! At a high level, it lets you run Linux programs on Windows without having to recompile them or run them in Cygwin. To steal Microsoft's own words:

The Windows Subsystem for Linux lets developers run a GNU/Linux environment -- including most command-line tools, utilities, and applications -- directly on Windows, unmodified, without the overhead of a virtual machine.

You can:

Choose your favorite GNU/Linux distributions from the Microsoft Store.

Run common command-line free software such as grep, sed, awk, or other ELF-64 binaries.

Run Bash shell scripts and GNU/Linux command-line applications including:

Tools: vim, emacs, tmux

Languages: Javascript/node.js, Ruby, Python, C/C++, C# & F#, Rust, Go, etc.

Services: sshd, MySQL, Apache, lighttpd

Install additional software using your own GNU/Linux distribution package manager.

Invoke Windows applications using a Unix-like command-line shell.

Invoke GNU/Linux applications on Windows.

The mechanics of it are complicated with significant differences between WSLv1 and WSLv2. This guide was written with WSLv1, though if I get adventurous before Defcon I may try to upgrade to WSLv2.

Enabling WSLv1 and Install a Linux Distro:

The first thing you need to do is enable WSLv1 as it is disabled by default. As a fair warning, this will require a reboot.

There are several ways to enable WSLv1. I opted to use PowerShell. The first step then is to open an Administrative instance of PowerShell.
Run the following command (ref):

Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux

Reboot your system when prompted to.
Once your computer starts back up, the next step is to pick a Linux distro. Open the Microsoft Store and type Linux in the search menu.

Side note: You'll be happy to know that Kali Linux is rated "E for Everyone"!

Important Note: All the Linux distros I looked at in the Windows Store, (including Kali), are barebones and do not include graphical desktops, or many tools or installed libraries. It's not like installing a Kali live boot image.
Because Kali doesn't come with any tools preconfigured, I opted to go with a base Ubuntu build. That's also partially because Kali and Hashcat in the past haven't been an ideal match, so I tend to stay away from it on my desktop builds.

Once you install Ubuntu, you'll still need to initialize it. To do this open PowerShell again, though this time you can run it as a standard user. For Ubuntu, simply type 'ubuntu'

You'll be prompted to create a user account. Go ahead and do so.
Congratulations, you are now running Linux on Windows!

Installing John the Ripper

This guide was written using the bleeding-jumbo version of John the Ripper, which is available here: https://github.com/magnumripper/JohnTheRipper
How to install and use Git on Windows is beyond the scope of this guide, (I personally like GitKraken). While you can download the source-code as a zip file, I highly recommend downloading it using git to make keeping it up to date much easier. With WSLv1, it's recommended that you install the code somewhere besides your new Linux filesystem. I put it in c:\github\JohnTheRipper\. With WSLv2 that changes, but I'll cross that bridge when I try that out. You could also probably install git into Ubuntu and download it that way, but I didn't try that.
The next step is to install all the required libraries in WSLv1 Ubuntu. Run all the following commands in the PowerShell window above after starting Ubuntu. If you ever close your window, you can restart PowerShell and type "ubuntu" to restart Ubuntu.
Update your package libraries. If you don't do this, the following installs will not work, (as seen in all the errors above the command in the below screenshot)

sudo apt update

Install GCC. Select 'Y'es when prompted. The install will take a while.

sudo apt install gcc

Install Make

sudo apt install make

Install various libraries required/recommended for JtR Bleeding-Jumbo

sudo apt install libssl-dev
sudo apt install libgmp-dev
sudo apt install libkrb5-dev

Navigate to your Windows drive where you installed the John the Ripper source-code, and change into its src directory. You can access your C:\ drive under the /mnt/c directory. Run the following command to build JtR

./configure && make

The build process will likely take around 10-15 minutes. After it is done you should see the following. If there are any errors, something went wrong so you will likely need to perform additional troubleshooting.

Finally navigate to the run directory '../run/' and try to start John the Ripper:

./john

Congratulations! You are now running John the Ripper on Windows!

Performance:

If you are curious, here is a short snippet of me benchmarking JtR on my PC. Note, this is only running on a single core. I should have also included the --fork=8, which I'll admit I didn't realize worked with the --test option before writing this guide.

Laptop Specs:

Microsoft Surface Book 13 Inch,
Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
16.0 GB Ram

Test command: ./john --test

Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (8xOMP) DONE

Speed for cost 1 (iteration count) of 32

Raw: 6344 c/s real, 790 c/s virtual

Benchmarking: Raw-MD5 [MD5 256/256 AVX2 8x3]... DONE

Raw: 61074K c/s real, 61074K c/s virtual

Benchmarking: scrypt (16384, 8, 1) [Salsa20/8 128/128 AVX]... (8xOMP) DONE

Speed for cost 1 (N) of 16384, cost 2 (r) of 8, cost 3 (p) of 1

Raw: 280 c/s real, 35.0 c/s virtual

Benchmarking: LM [DES 256/256 AVX2]... (8xOMP) DONE

Raw: 121470K c/s real, 15241K c/s virtual

Configuring a Password Cracking Computer

2018-09-11T20:02:00.001-07:00

“Be willing to be a beginner every single morning.” —Meister Eckhart

Disclaimer: While the reason I'm writing this is because I was lucky enough to win a new cracking rig from Netmux's Hash Crack Challenge, I want to state for the record that he never asked me to blog about it, and all of the good things I say are 100% of my own choosing and not contingent on me receiving any prize.

2nd Disclaimer: I plan on this being a "living" blog entry as I continue to update and use my new computer. Since install procedures change over time, for the record I started to perform my install on September 7th 2018. I'll try to date my entries as I write them to help anyone trying to follow this so they can estimate how useful these instructions are.

ChangeLog:

September 12, 2018, (rearranged sections, added MDXFind, updated installing OpenCL instructions)

September 7, 2018 (Computer Arrives):

Wow, I suddenly and unexpectedly found myself in possession of a dedicated password cracking machine! For more background on how that happened, please refer to my post on Netmux's Hash Cracking Challenge here. For the record, Netmux was amazing when it came to promptly shipping my portable cracking rig and keeping me in the loop. I'll admit I was a bit hesitant to hand out my home address to a professional pen-tester and password cracker I met on the internet, but I've made a lot worse threat modeling decisions in the past, (There is a story behind the first picture that gets everyone who knows and cares about me legit angry for the stupid trust I've put in absolute strangers before). Long story short, Netmux was professional in shipping the server, kept me in the loop, and when it showed up I was super excited! As some background, while I study password cracking, develop and analyze password cracking tools, and participate in password cracking challenges, I've never been willing to personally invest in a dedicated password cracking rig. Mostly I've made do with a 2010 MacBook Pro, and a Windows machine with a GTX970 that I'll freely admit spends more time running Excel and playing World of Warcraft than cracking hashes. Which is another way of saying please take all my advice with a grain of salt, and the understanding that I'm planning on using this new server for research. I'm not optimizing it as a pure password cracking rig. But also this is a way of saying that I no longer have any excuses in how much I contribute in password cracking challenges in the future! This gift has inspired me to start a few new research projects so I want to give yet another huge thanks to Netmux!!! If you see me post additional blog content in the next few months or update my PCFG cracker, please give credit to him!

A Quick Aside on my New Password Cracking Rig:

Let me first say that it arrived in perfect shape so of course the first thing I did was crack it open and look at the inside...

My new rig from Netmux

Super excited!!!

The wiring was very well done, the whole rig is water cooled, the case certainly adds hacker creds, and little things were taken care of such as having good filters over the air vents which is pretty much a make or break requirement for this cat owner. I'm *very* happy with it, and would recommend it to someone else.

As far as the specs go:
CPU: Intel i5-7600k, 1 processor; 4 cores
RAM: 16GB
Storage: 500GB SSD
GPU: GeForce GTX 1070

Installing the OS:

Netmux's cracking rig came pre-installed with Ubuntu, but I figured I might as well re-install everything from scratch. After consulting with several password cracking experts I'm lucky to know, my end decision was to re-install Ubuntu. The version I used was 18.04.1 LTS. I plan on using this server for research as well so I went with a full graphical desktop. If you are hardcore and want 100% of your machine devoted to cracking then by all means go with a server deployment, but this guide probably won't help you too much since I *love* GUIs. Spoiler alert, I recommend installing a GUI git client like GitKraken, so that's where this guide is taking you.

Building the Boot USB (September 7, 2018):
Like anyone has a DVD anymore... The very first step I took was to create a bootable USB.

Steps:

You can download an Ubuntu ISO from here
Since I already was running Ubuntu, I could use Startup Disk Creator to create a bootable USB drive. You can perform a search, (use the Windows key), for that application if you are running Ubuntu already.
Follow the options to create a bootable USB using the ISO that you previously downloaded

Installing Ubuntu from USB (September 7, 2018):

Use multiple swear words and reboot several times until you find the BIOS option to change your boot preference to start with your USB drive. In my case it was hitting F2.
Once you boot from the USB, follow the steps in the Ubuntu installer and configure it how you want.
If you are going to configure full hard drive encryption, (this will be a real portable rig that will potentially be unattended in your car when you make a restroom stop, or you are worried about legal issues), this is the time to configure full hard drive encryption. Just saying.

Core OS Drivers and Important Tools for Other Capabilities:

Installing OpenCL drivers (Originally installed September 7, 2018, updated September 12):

Special thanks to WinXP5421. The following section was written by him, though I tested it on my system and made minor edits based on my experiences and formatting it for this blog

Download the appropriate OpenCL drivers for your system. We are specifically looking for “Intel® Xeon™ Processors OR Intel® Core™ Processors OpenCL runtime” drivers.

Drivers at the current time of writing this are located here: https://software.intel.com/en-us/articles/opencl-drivers
The current version at the time of writing this can be found: http://registrationcenter-download.intel.com/akdlm/irc_nas/12556/opencl_runtime_16.1.2_x64_rh_6.4.0.37.tgz
Run: wget http://registrationcenter-download.intel.com/akdlm/irc_nas/12556/opencl_runtime_16.1.2_x64_rh_6.4.0.37.tgz

Extract the archive:

tar -xvzf opencl_runtime*.tgz

The opencl runtime requires `lsb-core` to be installed on the ubuntu machine:

sudo apt install lsb-core

Now install the drivers:

Go to the intel directory that you extracted in step #2
sudo ./install.sh
Work your way through the installer answering questions as needed. The install script will complain that your Ubuntu operating system is not supported. This is fine; continue with the installation anyway.

Let’s verify we have a working OpenCL environment by installing and running `clinfo`

Note: clinfo was already installed on my machine, but one of the other tools I installed later may have installed it -- Matt
sudo apt install clinfo
clinfo
The output of clinfo should display detailed information about each CPU core you have on your system. Simply put, “Lots of output = all good”. If OpenCL did not install properly you will see short and specific errors after running clinfo.

Installing NVidia Drivers (September 7, 2018):

Run: ubuntu-drivers devices
Select the driver from the list you want to install. In my case it was:

sudo apt-get install nvidia-driver-396

Install basic GIT (September 7, 2018):

I usually only use a command line git when something goes horribly wrong, but having it ready helps a lot when that happens.

sudo apt-get install git

Install a GUI GIT Client (September 7, 2018):

I've used a lot of git GUIs in the past. The following is purely personal preference, but I would highly recommend using a graphical git GUI if you are doing any development. Having the ability to easily view changes, manage merge requests, fork, etc, I've found to be invaluable in all my work.

My favorite git GUI of all time has been the official github client from several years ago. Unfortunately, they have since rebuilt everything around a web layout, which completely broke my workflow. I've tried to use Atlassian's SourceTree, but after a few horribly failed merges was told to never use it again by several co-workers. I currently use GitKraken, and am very happy with it. GitKraken is not free for commercial use. I've been told to use SmartGit by several people but don't have experience with it. If you are using this tutorial for commercial use and don't have funding to pay for GitKraken, please check SmartGit out. Otherwise, I've found GitKraken to be great for non-profit and personal use.

Install GitKraken from https://www.gitkraken.com/
Run the following command or GitKraken will never actually start: sudo apt install libgnome-keyring0
Once GitKraken is installed, log in to your github account using it
Now add your computer's SSH key to your github account using: File->Preferences->Authentication->Github.com->Add_SSH_Public_Key

Installing Password Cracking Programs:

Install Hashcat (September 7, 2018):

Yes there are pre-built binaries for Hashcat, but I highly recommend using the github based source code to stay up to date with all the latest changes, fixes, and features.

Install Hashcat using your git tool of choice. If you are using GitKraken, import the following repo: git@github.com:hashcat/hashcat.git
Full instructions for installing Hashcat can be found at: https://github.com/hashcat/hashcat/blob/master/BUILD.md
You'll need to update the OpenCL Headers submodule. This can be done in GitKraken by importing Hashcat using the above link, finding SubModules in the left-hand side panel, right-clicking on deps/OpenCL-Headers, and selecting "Create" or "Update". If you are not using GitKraken, follow the instructions listed in step #2.
In a terminal, run "make", and then "make install"
By building from source, you can periodically pull from the Hashcat repository and re-build it to add new features before an "official" release is published

Benchmarking Hashcat With New Install, (and gratuitous plug for NetMux's Hashcracking Manual which is awesome)

Install John the Ripper (September 7, 2018):

John the Ripper is my favorite password cracking program. If you are doing any sort of academic research or tool development, I can't suggest it enough. I'll admit though that if I'm only concerned with cracking standard hashes I generally use Hashcat instead. Regardless, I'd recommend installing John the Ripper on any password cracking rig you configure. Furthermore, you really need to install the magnum-ripper bleeding edge version of John the Ripper since the base version hasn't been updated in years. New patches, fixes, and features are normally pushed weekly, so building it from source, and constantly re-building it is highly recommended.

Install the following branch of John the Ripper: https://github.com/magnumripper/JohnTheRipper
Install SSL libraries: sudo apt-get install libssl-dev
cd ./JohnTheRipper/src/
./configure
Note: The following does not have OpenCL support. I'll try to circle back to this later to figure out how to add it.
make -s clean && make -sj4
cd ../run/
./john --test

Install MDXFind (September 12th 2018):

I've been told I really need to start using MDXFind so since I'm starting a new cracking platform this is certainly the right time to install it.

A quick aside, most people might question why I need three different password cracking programs on the same computer. I'm sure it's a lot like how chefs view their kitchen knife collection. Yes they all cut, but the right one depends on what you are trying to do.

While certainly not set in stone, as a general rule of thumb I use John the Ripper for research, CPU cracking sessions, cracking file encryption "hashes", and a few other hash types that don't translate well to GPU like SCrypt/BCrypt. It also has the best support for non-English data-sets.

I use Hashcat for most GPU cracking that I do. Yes, John the Ripper GPU support has been getting more robust, but I've had better luck with Hashcat. For example, if I'm cracking large lists of unsalted MD5, Hashcat is my go-to cracking program.

MDXFind seems tailored to cracking large "messy" data-sets. Think of a lot of the major password dumps that become public. It's fast and can handle data-sets going into the millions of password hashes. It also has support for cracking nested hashes which have a way of ending up in some of these dumps. Oh, and it seems to be the password cracking tool of choice for CynoSurePrime and they know a few things...

Obtain the latest copy of MDXFind from https://hashes.org/mdxfind.php

MDXFind is only provided as a pre-compiled binary so you don't need to build it. Grab the 64bit Linux variant.
Download and copy the file to the directory you want to install MDXFind into

Make MDXFind executable

chmod +x mdxfind

Install required dependencies

sudo apt install libjudydebian1 libmhash2 librhash0

Test MDXFind

./mdxfind

Other Quality of Life Installations:

Install Text Editor:

I like Kate. To install it: sudo apt-get install kate
You might also want to install Atom which has more features. I'm hesitant to recommend it with Microsoft buying GitHub, but it is free and has a ton of features: https://atom.io/

Change Login Background (September 7th 2018):

Not really important, but I always do this because it helps my gumption level:

Find a picture you want to see when typing your login password.
sudo cp Pictures/FILENAME_OF_PICTURE_YOU_WANT_TO_USE /usr/share/backgrounds/login.jpg
sudo vim /etc/alternatives/gdm3.css
Find: #lockDialogGroup { background: #2c001e url(resource:///org/gnome/shell/theme/noise-texture.png); background-repeat: repeat; }

Replace it with

#lockDialogGroup {  background: #2c001e url(file:///usr/share/backgrounds/login.jpg);
  background-repeat: no-repeat;
  background-size: cover;
  background-position: center; }

Netmux's Hash Crack Challenge Writeup

2018-09-03T21:03:00.004-07:00

"Good luck is when opportunity meets preparation, while bad luck is when lack of preparation meets reality" -Eliyahu Goldratt

This last week I participated in Netmux's Hash Crack Challenge, and this happened:

HASH CRACK CHALLENGE Hash #2 has been cracked by @lakiw. Congrats to our winner and thanks to everyone that participated! It was a bumpy ride but a lot of fun to create and host. Thanks again to everyone and look for the final write-up in the coming days!
— Netmux (@netmux) September 1, 2018

So I figured the least I could do was make a blog posting about it along with my analysis of Netmux's One Time Grids, which the challenge was based on.

TLDR/Bottom Line(s) Up Front (BLUF):
I was lucky enough to be checking Twitter right when Netmux posted his final hint, and that was the only reason I won. As to the security of One Time Grids, they share a lot of similarities with other password books, which can be good or bad depending on your threat model. Compared to other physically written down password books, the One Time Grid approach pushes users to stronger passwords at the expense of usability. It is *very* secure against your typical online hacker, but shares the weakness of other password books in that it may be weak against people in physical proximity to you, (such as ex-boyfriends, nosy parents, nosy children, etc). I didn't find any weaknesses that could be exploited by an online attacker. Long story short, I wouldn't recommend it due to the usability issues, but if you have fun with it, feel free to use it.

What is a One Time Grid and how does that apply to the contest?
Netmux does a better job explaining it in his blog here, but it basically is a password creation book that you can buy from Amazon, available here, that provides a bunch of One Time Grids for creating and storing passwords. The contest was an attempt to crack two different raw-SHA1 password hashes generated using a One-Time-Grid. They were:

Hash1: fe0c9f335b35c45e92d5e7d07c5933b6c4c0a522

Hash2: 120c249bc0f301ef3cba7a0fcbff463aaaded486

As to the One Time Grids themselves, they are either a 7x7 grid filled randomly with one of the following 84 characters:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890-!@#$%^&*=?[](),.;{}:+

One Time Grid used in the contest

Or a 3x26 grid filled with random words:

Example word based One Time Grid. Not used in the contest

The One Time Grid used in the challenges was composed of random letters, so this blog post will focus on that. When it comes to the security of a One Time Grid though, most of the statements I'll make will apply to both unless otherwise specified.

Netmux also suggests three different ways to turn a One Time Grid into a password: a "basic" random grid, a "pattern" random grid, and a "scatter" random grid. Only pattern and scatter were used in the contest, so I'll focus on them, but a "basic" grid is simply a "pattern" with no bends. Aka all walks go in a straight line. Below are examples he gave for pattern and scatter on his site. Note, these examples do not use the contest One Time Grid.

Example "Pattern" Password Creation rules

Scatter One Time Grid password creation, taken from Netmux's site

Contest Start:
The first thing that should be apparent is that without the One Time Grid that a password was based on, no attack can be run that has a hope of being successful against passwords longer than 9 characters. Even 8 characters would require significant horsepower. 84^8 is roughly 2.5 quadrillion, which is quite a big keyspace, even for GPUs. This assumes that the One Time Grids are generated using a true random number generator, yada yada yada, but for the purposes of this contest, no effective attacks could be started. Which is ok, because it gave me time to prep some tools and do some research.
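To put numbers on that, here is a quick back-of-the-napkin sketch of the keyspace for a fully random password drawn from the 84 character alphabet. The function name is just something I made up for illustration, and it assumes every character is picked independently and uniformly.

```python
# Keyspace of a random password drawn from the 84-character grid alphabet.
# Assumption: each position is chosen independently and uniformly.
ALPHABET_SIZE = 84


def keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of candidate passwords of exactly `length` characters."""
    return alphabet_size ** length


if __name__ == "__main__":
    for length in (8, 9, 10):
        print(f"length {length}: {keyspace(length):,} candidates")
```

Each extra character multiplies the work by 84, which is why a blind attack stops being realistic so quickly.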

Side note, I'll give Netmux credit that doing a "search inside" check of his Amazon One Time Grid book didn't accidentally share any of the real grids. Not that I've abused that feature in other contexts before...

First Clue: "Pattern" & "Scatter"
Sometime around this point Netmux released his first clue: "Pattern" & "Scatter". This pretty clearly indicated that the above two methods were used to generate the password, so I started to develop some scripts to generate walks of One Time Grids in anticipation of when the actual grid would be released. I originally started out investigating if I could use a custom keyboard layout with Hashcat's kwprocessor, which generates keyboard walks, but quickly realized I would have to significantly modify it to target One Time Grids. That's because kwprocessor was set up to crack 4 row keyboards vs 7x7 grids, along with some other optimizations it made for keyboard quirkiness which is great for normal cracking, but would cause problems with what I wanted it to do. So I wrote my own script, which I posted on github and is available here. It admittedly went through several rounds of improvement throughout the contest, but here is a general overview of how it works, and the constraints I added to reduce the key-space:

one_time_grid_walker.py only targets the "Pattern" random grids. "Scatter" random grids need a lot more information to effectively target them. I'll dig into that more later.
The first constraint I added to it was that all "walks" had to start and end on the edge of a grid. This was based on my reading of Netmux's examples and how I expected a typical user to interpret his suggestions. Examples of "valid" and "invalid" walks can be seen below.

Valid walk of contest grid

Invalid walk of contest grid

The second constraint I added was that a walk could not double back on itself or cross a part of itself. In the above example, a walk could not go "8oyIyo8". This admittedly was a naive assumption on my part, but I made it once again to reduce the keyspace and based it on my reading of the examples given.
The third constraint, which I struggled with but felt I needed when coding up my script, was to limit the maximum length of a walk. As the maximum length increased, so did the keyspace, which would cause problems later when running a combinator/Prince attack. Len8 = 4081, Len9 = 7268, Len10 = 12011, Len11 = 19131. On its own this would be trivial, but when you start combining multiple walks together it can become significant. For example, 19131^2 is roughly 366 million and 19131^3 is roughly 7 trillion. This admittedly was where I probably made my biggest mistake: prematurely optimizing this.
Skipping ahead a bit, I later optimized my approach further to limit the number of "bends" that a walk could make. If I only allowed one "bend", (or change in direction), there were only 575 possible walks for the current grid. This made combining many different walks practical. I felt that for a typical user following the advice given, this represented what I would expect to see them do.
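To make the constraints above concrete, here is a minimal sketch of a constrained walk generator. This is not the actual one_time_grid_walker.py code. The function names, the restriction to up/down/left/right moves, and the tiny 3x3 grid are all assumptions made for illustration, so the counts it produces will not match the numbers quoted above.

```python
from typing import Iterator, List, Optional, Tuple

# Assumed move set: up, right, down, left (no diagonals).
MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]


def on_edge(r: int, c: int, rows: int, cols: int) -> bool:
    """True if the cell sits on the outer border of the grid."""
    return r in (0, rows - 1) or c in (0, cols - 1)


def walks(grid: List[str], min_len: int, max_len: int, max_bends: int) -> Iterator[str]:
    """Yield every walk that starts and ends on an edge, never revisits a
    cell, stays within the length limits, and bends at most max_bends times."""
    rows, cols = len(grid), len(grid[0])

    def extend(path: List[Tuple[int, int]], heading: Optional[Tuple[int, int]], bends: int):
        r, c = path[-1]
        if len(path) >= min_len and on_edge(r, c, rows, cols):
            yield "".join(grid[pr][pc] for pr, pc in path)
        if len(path) == max_len:
            return
        for move in MOVES:
            nr, nc = r + move[0], c + move[1]
            # Stay on the grid and never double back or cross the walk.
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in path:
                continue
            # A change of direction after the first step counts as a bend.
            new_bends = bends + (heading is not None and move != heading)
            if new_bends <= max_bends:
                yield from extend(path + [(nr, nc)], move, new_bends)

    for r in range(rows):
        for c in range(cols):
            if on_edge(r, c, rows, cols):
                yield from extend([(r, c)], None, 0)


if __name__ == "__main__":
    demo_grid = ["abc", "def", "ghi"]  # hypothetical grid, not the contest grid
    for walk in walks(demo_grid, min_len=3, max_len=3, max_bends=0):
        print(walk)
```

Writing the walks out one per line gives a wordlist that can be fed to a combinator attack, and tightening max_len or max_bends is the knob that keeps the combined keyspace manageable.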

As far as weaponizing this goes, I was tempted to use the Prince attack, but when talking with Chick3nman, he gave the helpful advice that if you didn't need the optimizations that Prince uses, a straight combinator attack with Hashcat was much faster for easy hashes like raw-sha1.

And then I pretty much waited. Well, in reality I tried some attacks against the sample One Time Grids to bide my time, but I didn't expect those to crack the first hash. I was a bit cocky though, and expected that I'd crack the first hash within minutes of the real grid being released.

Second Clue: One-Time Grid attached below

Yes! The target one time grid was finally released. I'll admit I said a few choice words that it was released as a picture though, which led to some squinting and me questioning if letters were lower or uppercase. Oh, and also one typo when entering it into my code that I nearly missed, but luckily Hops pointed it out to me. In any future contests, it would be really nice if items like this could be released as text that allowed copying/pasting.

Another challenge I ran into was that I wasn't at my cracking computer, so couldn't run any effective attacks myself. Luckily Chick3nman agreed to run my script and try to crack the first hash for me. Unfortunately he wasn't successful. I want to stress that was my fault since he was running my scripts and attacks.

There was a lot of head scratching, and variations of walks plus the suggested PIN and random word, but long story short, even when I got back to my computer and ran attacks myself, I was completely ineffective at cracking that first hash. I'll admit it really annoyed me in a good way like any fun problem does. I want to give a huge shout out to Boursier Etienne, who actually managed to crack it first. I'd love to hear what Boursier did.

Third Clue: Birthday Paradox
I may have uttered a few more choice words over this clue. I'm well versed in the birthday problem, but that doesn't seem to be applicable to One Time Grids. Yes some individual characters appear more often than others, but the heart of the "scatter" problem is a "Choose X with no replacement" problem. Aka, the first character has 49 different options. The second character has 48 different options. The third character has 47 different options. And so on. This is not related to generating collisions between multiple inputs as far as I can see.
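As a quick sanity check on the "choose with no replacement" framing, the sketch below counts ordered picks of distinct cells from a 7x7 grid and compares them to picks with replacement. The function names are made up for this example, and it counts cell selections rather than distinct password strings, so it ignores the fact that some cells hold the same character.

```python
import math

GRID_CELLS = 49  # 7x7 One Time Grid


def scatter_orderings(length: int) -> int:
    """Ordered picks of `length` distinct cells: 49 * 48 * 47 * ..."""
    return math.perm(GRID_CELLS, length)


def with_replacement(length: int) -> int:
    """Ordered picks if a cell could be reused."""
    return GRID_CELLS ** length


if __name__ == "__main__":
    for length in (3, 8, 14):
        print(length, scatter_orderings(length), with_replacement(length))
```

The gap between the two counts grows with length, but for the lengths in play here it is nowhere near enough to change the practicality of an attack.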

Fourth Clue: Are all cell values equally probable?
I see where Netmux was going with this. For a scatter password, if you were modeling it, cells 3/26, 6/25, and 7/23 all contained periods ".". If you selected any of them when generating a password guess, it didn't matter which order you picked them in, which can reduce the effective keyspace. The problem comes when trying to weaponize this info. I did some back of the napkin calculations and if your guess generator took into account the "choose and no replacement" aspects along with the "several characters show up several times", you could reduce the keyspace by roughly a factor of 10 for the password lengths I thought the password might be. This sounds great, but one problem I've run into many times before is that more effective guess generators take time to generate guesses. So while a script that I coded might reduce the keyspace by 10x, it would probably take 100x more time to generate a guess against a raw-sha1 hash than just using a custom mask. Therefore trying to optimize my solution would actually make it worse.

Now admittedly someone could take the time to create a custom solution in Hashcat or John the Ripper that would be fast, but that wasn't going to happen in the time this contest ran. More importantly though, for a 10 character password generated by a "scatter" method, it didn't matter. The keyspace was so large that even a 10x speedup wouldn't be enough to make it practical.

Fifth Clue: str(PIN)[:-1]
This hint was a good clue that the PIN, minus the last character of the PIN, was part of one or both of the passwords. Aka "71997" could be found in the password. This was good info to have when trying to crack the password, but I'll admit I was a little annoyed since guidance to apply mangling rules like this wasn't in the instructions for using One Time Grids. By that I mean, it's totally within the bounds of someone doing this in real life. In fact, I'd recommend it, as it explodes the keyspace of One Time Grids. But based on the instructions I wouldn't expect a typical user of One Time Grids to apply a mangling rule like "remove the last character of the PIN". Now, most of my password cracking techniques are based on targeting "typical users". If everyone was unique I'd be the worst password cracker out there. But people typically follow standard behavior patterns which makes password cracking possible. I'm biased, but I like to see that reflected in contests. Needless to say though, this wasn't enough information to crack either one of the two password hashes.

Sixth Clue: scatter_cells + str(PIN)[:-1]
This clue said that the PIN-1 would be at the end of the scatter cells password, which was helpful without being useful. The keyspace for likely scatter cells passwords was so large that knowing any additional mangling didn't make a difference.

Seventh Clue: Use seven of the possible ten "repeats" to mask your way to the other half of the scatter_cells solution.
This provided a lot of useful information without being actionable. It said the "scatter" portion of the password was 14 characters long, with 7 of those characters being a repeat item, and the other 7 being unique characters. This meant 7 characters had 10 possible values, and the other 7 had 29 possible values. What's more, the second set was a pure choose with no replacement, so the 7th character would technically only have 23 possible options. The problem once again was making use of this information. For example, I didn't know which positions would take from either set. So for a 14 character password, that increases the keyspace by up to 2^14 = 16,384, which is a problem because the current mask setups for JtR and Hashcat don't support that kind of selection. In retrospect, I realized I could have created a script to generate all 16k masks and feed them into Hashcat, but during the contest that didn't occur to me. Long story short, this was the point where if given six months it's possible someone could have cracked the second hash, but it was unrealistic to do it in a day or two.
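For what it's worth, a mask generating script along those lines could look something like the sketch below. It assumes ?1 is a custom character set holding the "repeat" characters and ?2 is a second custom character set holding the unique ones; the function name and defaults are hypothetical. One thing worth noting: if exactly seven positions come from each set, the number of masks is "14 choose 7" = 3,432, which is smaller than the 2^14 upper bound you get when the split is unknown.

```python
from itertools import combinations
from math import comb
from typing import Iterator


def position_masks(length: int = 14, repeats: int = 7,
                   repeat_cs: str = "?1", unique_cs: str = "?2") -> Iterator[str]:
    """Yield one mask per way of choosing which positions use the repeat set."""
    for repeat_positions in combinations(range(length), repeats):
        chosen = set(repeat_positions)
        yield "".join(repeat_cs if i in chosen else unique_cs
                      for i in range(length))


if __name__ == "__main__":
    masks = list(position_masks())
    print(len(masks), "masks, expected", comb(14, 7))
    print(masks[0])
```

Each mask still covers 10^7 * 29^7 candidates before accounting for the no replacement rule, so this only makes the attack expressible, not cheap.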

Eighth Clue: Hash #2 = print(len(scatter_cells + str(PIN)[:-1])) = 19
While this made explicit that there were no other mangling rules or surprises for the second password hash, it didn't make the problem more crackable compared to the previous clue.

Ninth Clue: No cell values have been reused in the composition of scatter_cells.
“q$*????????)wc” + str(PIN)[:-1]
This is where I got really lucky. I managed to check Twitter at the exact right time and saw the following tweet by Netmux:

T-minus 15 minutes until the release of the final Hash Crack Challenge clue!#hashcrack #passwords pic.twitter.com/EBPfiPXc9R
— Netmux (@netmux) September 1, 2018

Therefore I was at my computer and ready to go for the final hint. When he posted it, I quickly created the following mask attack using hashcat:

hashcat64.exe -m100 -O -a 3 ..\contests\netmux\netmux.hsh -1 IA9GV8oyILM.!03WKH+epP{TxJz3hbu\? q$*?1?1?1?1?1?1?1?1)wc71997

By Netmux giving me 6 of the scatter characters used I only had to bruteforce an 8 character password, and there were only 32 possible characters per position, making this significantly easier than a Lanman password hash. All told, it took me around 5 minutes to crack the password hash, which admittedly was a heart pounding five minutes since I was sure other people were running the same attack as I was. I was sweating the whole time and my adrenaline was pumping. As proof of the timing to run the attack, here is me re-running the cracking attack on my system. It took 9 minutes to exhaust the whole keyspace, but I got my crack around five minutes in.

Cracking the 2nd Hash. Path information and the actual hash plaintext redacted.
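The arithmetic behind that run is simple enough to sketch. The numbers below come straight from the description above (32 characters per position, 8 unknown positions, roughly 9 minutes to exhaust); the resulting guess rate is only an implied average and not a benchmark.

```python
# Figures taken from the attack described above.
CHARSET_SIZE = 32        # possible characters per unknown position
UNKNOWN_POSITIONS = 8    # positions left to brute force
EXHAUST_SECONDS = 9 * 60 # approximate time to exhaust the keyspace

keyspace = CHARSET_SIZE ** UNKNOWN_POSITIONS
implied_rate = keyspace / EXHAUST_SECONDS  # average guesses per second

if __name__ == "__main__":
    print(f"keyspace: {keyspace:,} candidates")
    print(f"implied rate: {implied_rate / 1e9:.2f} billion guesses/second")
```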

For comparison, I have a single Nvidia GTX 970 in my computer. Not even a Ti. Really what it comes down to was that I was very lucky, to the point where I feel a little bit guilty about it. In the future I'd advise contest creators to publish set times when they will release hints so that everyone is on an even field when it comes to making use of this information.

Conclusion:
First of all, I'd like to give thanks to Netmux for putting on this competition. I had a lot of fun and I hope this blog post points that out. There's many "contests" out there but putting my time into this was way more enjoyable than dealing with the drama of hacking Bitfi. Also dealing with a new type of bounded problem like One Time Grids was very interesting.

I'd also like to thank Chick3nman, Hops, and Royce Williams, for lending cracking hardware, giving advice, and all the heckling ;p

As to the security of One Time Grids, let me back up a bit.

When doing any threat analysis or security review my first step is to categorize the adversary. A good rule of thumb brought up by James Mickens is the "Mossad vs. not-Mossad" categorization. I highly recommend following that link because the write-up is hilarious, but it boils down to if you are worried about the Mossad, well there's nothing you can do because you are going to f***ing die. But if your adversary is someone else, there's effective strategies you can take to protect yourself. Now admittedly there's variations of this, but basically if you are worried about nation level attackers, then don't use One Time Grids. If you are worried about typical hackers though, One Time Grids can be extremely effective. I'll freely admit that I'm not the best password cracker out there, but the fact remains that if Netmux hadn't given me the One Time Grid, along with 11 characters of a 19 character password, I'd never have cracked it. Also One Time Grids are such a niche technique that even after this contest I don't see myself incorporating the lessons learned into any of my normal cracking strategies.

There's two major problems I see with One Time Grids though. The first is they don't produce memorable passwords. If you don't want to write the passwords down, you'll need to take your book with you, which is a pain. And if you do write your passwords down, I'd recommend using a traditional password manager instead. Most of which have built in random password generation tools which are just as effective as One Time Grids for creating strong passwords.

The second problem is that One Time Grids share the same issue as many other password "books". They have the potential for horrible failure if your adversary is someone you know and/or love who has access to it directly. Ex-boyfriends/girlfriends/husbands/wives are the big ones, but nosy children or parents also pop up. I'm always very sensitive to this threat vector since while dealing with an abusive ex is bad, dealing with an abusive ex who has access to your e-mail and facebook is way worse. Password management programs can help in this regard, but written down books are problematic. Yes, someone could avoid writing down their "patterns" for One Time Grids, but that doesn't scale, as having unique passwords for sites is more important than strong passwords in my opinion. You have no idea how sites are storing their passwords, so the best way to minimize your risk of a site storing your password in plaintext is to use different passwords for different sites.

I guess what I'm trying to say is I'm a big believer in hike your own hike. If you enjoy using One Time Grids, I haven't seen anything to caution against it. You are probably way more secure than most people who don't do anything special. While I'm biased to suggest standard password management programs like 1password, I'll readily admit that programs like 1password have usability problems too. If you really want to have a physical password book, free options include diceware, but if you like the idea of One Time Grids, quite simply, I'm not going to crack those passwords without a whole lot of help.

Bonus Snark

While doing research on One Time Grids, I came across the following on Amazon and my first thought was, "I bet whoever owned that copy previously was *really* important!!!" /jk

Only $4.67 for shipping though...

Creating Long Term SSL Certificates

2018-03-11T15:13:00.000-07:00

"It's constantly fascinating for me that something that feels absolutely right one year, 12 months later feels like the wrong thing to do." --Damian Lewis

Often I find myself having to create my own SSL certificates. Be it an internal web-server, or two scripts that need to communicate to each other, SSL is the easiest way to encrypt network traffic. Unfortunately it's also one of the most dangerous encryption methods. If you make a mistake setting it up it usually works ... at least for a little while.

Ignoring the client SSL checks for now, (hint if your script is using SSL and it works the first time, you probably are not checking SSL correctly), one area of danger is having your SSL certificates expire. As an example of that, recently every Oculus Rift broke because a code signing certificate expired. Admittedly this was a different type of certificate, but the same thing tends to happen with internal SSL deployments. People do not remember to update them, and when they expire things tend to break, (at least if your clients are checking SSL properly). The problem is when you use the standard OpenSSL libraries to create your certificates, there's three places that you need to specify certificate lifetimes. If you forget to specify any of the three, the certificate will be valid only for the default which is set to be "365 days".

These lifetime checks are:

The Certificate Authority has an expiration date
The actual certificate you are using has an expiration date
The CA signature for the certificate has an expiration date

Since most stack-overflow posts don't cover this, and Linux man pages are not helpful unless you already know what you are doing, I wanted to share my cheat sheet for creating long term, (valid for one thousand years), SSL Certificate Authorities and signing certs. This script was born from many previous failed efforts, and to be honest I'm still not sure I have it perfectly right. If you notice any improvements that could be made, please let me know!

Requirements/Comments:

These instructions were written for CentOS. It should work for most other Linux flavors without any changes. If you are using Windows, good luck!
OpenSSL
Whenever you see 365000 in the command that's the expiration date. I'm using 365*1000 as shorthand for one thousand years. Yes I realize that isn't exactly accurate. Feel free to change this to the time period you want to use.
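If you are curious how far off the 365*1000 shorthand is, here is a small sketch that works out the resulting expiration date. The function name is made up, and the issue date used in the demo is simply the date of this post.

```python
from datetime import date, timedelta

DAYS = 365 * 1000  # the value passed to -days in the commands below


def expiry(issued: date, days: int = DAYS) -> date:
    """Calendar date that falls `days` days after the issue date."""
    return issued + timedelta(days=days)


if __name__ == "__main__":
    issued = date(2018, 3, 11)
    print("expires:", expiry(issued))
    # Leap days mean 365,000 days falls a little short of 1000 calendar years.
    print("approx years:", round(DAYS / 365.2425, 2))
```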

Creating the Certificate Authority: (If you already have a CA ignore this, but you might want to check the valid lifetime for that CA)

Generate the key for the CA using 4096 RSA. Note the key will be cakey.pem so protect that!

openssl req -new -newkey rsa:4096 -nodes -out ca.csr -keyout cakey.pem

Create the CA's public certificate which will be called "cacert.pem". Note the '-days' field:

openssl x509 -trustout -signkey cakey.pem -days 365000 -req -in ca.csr -out cacert.pem

Important: When you run the openssl req command above, you'll be asked a list of questions. Note, for many SSL deployments you *must* have the Country, City, State, and Organization match between your CA and the certificates you are signing. Does this make sense? Of course not! The domain can be pretty important as well depending on what you are doing.

Next you need to copy the CA info and create the required files into where OpenSSL expects them. Yes if you know what you are doing you can override the defaults, but if not here's what to do:

If the /etc/pki/CA directory does not exist, create it
mv cakey.pem /etc/pki/CA/private/cakey.pem (the default CentOS openssl.cnf looks for the CA key under private/)
touch /etc/pki/CA/index.txt
create or edit /etc/pki/CA/serial using the text editor of your choice
In this file put a list of all the serial numbers you want to assign certificates, separated by a newline. For example:

It is *highly* recommended that you set permissions on the /etc/pki/CA directory so only the user you want to sign certificates has access to it.

Note, cacert.pem is not used for signing SSL certificates, but you'll need to push it to clients that are verifying the certificates

Creating and Signing a SSL Certificate:

Create the certificate private key using RSA 4096. It is named client.key in this example. Make sure you protect this!

openssl genrsa -out client.key 4096

Create the certificate request. Note the "days" field.

openssl req -new -key client.key -out client.csr -days 365000

Important: Remember for the questions it asks you, the Country, City, State, and Organization *must* match between your CA and the certificates you are signing. In addition, the domain can be pretty important depending on if you are checking that with your client or not

Create the actual client certificate. Once again, note the '-days' field

openssl ca -in client.csr -out client.pem -days 365000 -notext

Resulting Files:

Public Client Certificate: client.pem
Client private key: client.key (Only deploy on the server that owns this key)
Public CA certificate: cacert.pem
Private CA key: cakey.pem (Protect this one!!)

Solving Problems with Unknown Constraints

2017-08-16T12:31:00.000-07:00

"Software constraints are only confining if you use them for what they're intended to be used for"

-- David Byrne (Of the Talking Heads)

I recently had an ongoing conversation that spanned several days about the subject of solving mazes. A friend casually mentioned the "Same Wall Rule", (also known as the "Right Hand Rule"), for solving a maze. This is where if you want to find the exit of a maze you should pick a wall and follow it, with the assumption that you will eventually find the exit this way.

Same Wall Rule for Solving a Maze

I pointed out that while this rule generally works, you can't count on it as it can fail spectacularly. For example, what if you start out next to a free-standing wall?

Same Wall Rule Failing Horribly
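To show both behaviors side by side, here is a minimal sketch of a right hand wall follower on a small grid maze. The maze layout, the function name, and the loop detection are all made up for this example. It assumes the maze is fully enclosed by '#' walls apart from the exit 'E', and that the walker starts with a wall on its right.

```python
from typing import List, Optional, Tuple

# Headings in clockwise order so that +1 is a right turn and -1 a left turn.
HEADINGS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # north, east, south, west

MAZE = [
    "#######",
    "#     #",
    "#     #",
    "#  #  E",   # free-standing pillar in the middle, exit on the east wall
    "#     #",
    "#     #",
    "#######",
]


def follow_right_wall(maze: List[str], start: Tuple[int, int],
                      heading: int) -> Optional[List[Tuple[int, int]]]:
    """Keep the right hand on the wall. Returns the path to 'E', or None
    if the walker ends up repeating the same cell and heading."""
    pos, seen, path = start, set(), [start]
    while (pos, heading) not in seen:
        seen.add((pos, heading))
        r, c = pos
        if maze[r][c] == "E":
            return path
        for turn in (1, 0, -1, 2):  # prefer right, then straight, left, back
            new_heading = (heading + turn) % 4
            dr, dc = HEADINGS[new_heading]
            if maze[r + dr][c + dc] != "#":
                pos, heading = (r + dr, c + dc), new_heading
                path.append(pos)
                break
        else:
            return None  # walled in on all four sides
    return None  # same cell and heading seen twice: walking in circles


if __name__ == "__main__":
    # Starting in the corner with the outer wall on the right finds the exit.
    print(follow_right_wall(MAZE, (1, 1), heading=3))
    # Starting next to the free-standing pillar just circles it forever.
    print(follow_right_wall(MAZE, (2, 3), heading=1))
```

The only difference between the two runs is the starting position and which wall the walker happens to be touching, which is exactly the hidden assumption in the rule.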

After that our conversation turned to other things but the next day my friend came back and said "I found the problem! The Same Wall Rule will work, but you have to start at the beginning of the maze! Then you can be guaranteed that you won't hit a free-standing wall".

Which is true in most cases, but what if you are looking for an exit in a free-standing section of the maze? For example what if the treasure is in the middle or you are dealing with a 3-dimensional maze?

Same Wall Rule Failing to Find Treasure

This reminded me of a paper that Cormac Herley recently wrote titled: Justifying Security Measures. I highly recommend reading it. It points out that in the security community we often say:

Security(X) > Security(~X)

When we really mean:

Outcome(X|ABCD) > Outcome(~X|ABCD).

Which is a fancy way of showing that when we say doing X is more secure than not doing X, there usually is a large number of assumptions, (ABCD....), that we're leaving out. Where this directly relates to the main topic of this blog, (password security), is that Herley specifically calls out the password field for the practice of ignoring constraints in our security advice. Or, to quote his paper:

"Passwords offers a target-rich environment for those seeking tautologies and
unfalsifiable claims."

Now back to the issue of maze solving, the same problem often arises. When we make a maze solving algorithm, we're making certain assumptions about the rules of the game. For example, the next iteration of a mapping algorithm might involve marking rooms that you have been in before to detect loops. Well there is a certain fairy-tale where that approach failed due to the marks being destroyed by a 3rd party actor:

Hansel and Gretel showing that marks aren't always permanent

Even assuming you can safeguard your marks in the maze, that approach may still not be effective if the maze moves while you are traversing it.

I've never seen such an amazing premise turned into such a boring book

Note, these assumptions go both ways. For example if you are designing a super hard maze, a snarky player can often do something completely unexpected.

Seriously, why would you want to go through the maze?

I'd argue that coming up with a perfect maze solver that works for all mazes with no constraints is a near impossible problem. If you can design an algorithm, chances are someone else can come up with a situation where it will fail. On the plus side, the same goes for maze designers. If you come up with a maze with constraints, someone probably can solve it even if it's not how you expected the maze to be solved.

This is a point that I'm actually optimistic about. We deal with imperfect knowledge of the rules we're operating under every day. That's part of the human condition! Tying this back in with Herley's paper, I think there's some things to keep in mind.

When giving advice to end users, I think it's fair to leave implied constraints out as long as the person giving the advice keeps them in mind. Aka telling your kids to follow the right hand wall to get through a corn maze is perfectly reasonable. Telling your kids this assumes there are no minotaurs or evil clowns waiting in the maze to eat them probably will not result in the end state you are aiming for.
Unfortunately following the above can lead to those constraints being forgotten over time and that advice being applied to situations where it is no longer helpful.
Therefore you need to be willing to question previously held beliefs and come up with new approaches when reality doesn't match your expected experiences.

The question then is, how do you discover/rediscover unknown constraints when you start experiencing issues?

One way to deal with this is through experimental design along with making hypotheses about what the results of those experiments will be before you run them. That's something I'm trying to get better at doing as seen in my previous blog post.

As an example: Herley raises the question "Are lower-case pass-phrases better or worse than passwords with a mix of characters". If I construct an experiment I have to specify a set of constraints that experiment will run under. Now do those constraints match up with the real world use-cases? Of course not! But the fact that there are constraints can help myself and other people interpret how to use those results. Likewise before running an experiment it's important to have a theory and make a hypothesis about what the results will be. Once that's done, running the experiment can validate or falsify the hypothesis. I can then update the theory as needed and the process continues.

To put it another way, I think there are a lot of areas where the academic side of computer security can help improve the practical impact that computer security choices impose on the end user ;p

Evaluating the Value of the (@)Purge Rule

2016-08-14T21:12:00.002-07:00

“Only sometimes when we pick and choose among the rules we discover later that we have set aside something precious in the process.”

― Helen Simonson, Major Pettigrew's Last Stand

Background and Problem Statement:

I was recently asked the following question: "Is there any value in supporting the character purge rule in Hashcat?" The purge rule '@x' will remove all characters of a specific type from a password guess. So for example the rule '@s' would turn 'password' into 'paword'. The full thread can be found on the Hashcat forum here. The reason behind this inquiry was that while the old version of Hashcat implemented the character purge rule, GPU versions of Hashcat and Hashcat 3.0 dropped support for it. Since then, At0m added support for the rule back in the newest build of Hashcat which makes this question much less pressing. That being said, similar questions pop up all the time and I felt it was worth looking into if only to talk about the process of investigating problems like this.

Side note, as evidence that any change will break someone's workflow, when researching this topic I did find one user who stored passphrase dictionaries with spaces left intact. They would then use the purge rule to remove the spaces during a cracking session so they wouldn't have to save a second copy of their passphrase wordlist without spaces. For that reason alone I think there is some value in the purge rule.

The Purge Rule Explained:

Hashcat Rule Syntax: @X where (X) is the character you want to purge from the password guess
Example Rule: @s
Example Input: password
Example Output: paword
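In code terms the purge rule is about as simple as a mangling rule gets. Here is a rough Python equivalent of what it does to a single guess. This is an illustration of the behavior described above, not how Hashcat implements it, and the passphrase in the demo is a made-up example.

```python
def purge(word: str, char: str) -> str:
    """Mimic the '@X' rule: drop every occurrence of `char` from the guess."""
    return word.replace(char, "")


if __name__ == "__main__":
    print(purge("password", "s"))        # the '@s' example from above
    print(purge("correct horse", " "))   # stripping spaces from a passphrase
```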

Hypothesis:

My gut feeling is that the purge rule will have limited impact on a cracking session. I base that on a rule of thumb that mangling rules work best if they mimic the thought process people use when creating passwords. For example, people often start with a base word and then append digits to it, replace letters with L33t replacements, etc. Therefore rules that mimic these behaviors tend to be more successful. I just don't see many people removing character classes from their password.

Now if you are a Linux fan, you'll realize Linux developers *love* removing characters from commands. Do you want to change your password? Well "passwd" is the command for you! Maybe Linux developers use the same strategy for their passwords? So I certainly could be wrong. That being said, the whole idea of a hypothesis is to go out on a limb and make a prediction on how an existing model will react so here I go:

My hypothesis is that the purge rule will crack less than 1 thousand passwords of a 1 million password dataset, (0.1%). Of those passwords cracked, a vast majority (95%), will be cracked due to weaknesses of the input dictionary vs. modeling how the user created the password. For example, 'paword' might be a new Pokemon type that didn't show up in the input dictionary vs being created by a user taking the word 'password' and then removing the S's.

Short Summary of Results:

The purge ruleset cracked 164 passwords (0.016% of the test set). This was slightly better than just using random rules, which in a test run cracked 23 passwords, but not by much. Supporting this rule is unlikely to help to any noticeable degree with your cracking sessions.

Experimental Setup:

Test Dataset: 1 million passwords from the newest MySpace leak. These were randomly selected from the full set using the 'gshuf -n 1000000' command.

Reason: Truth be told, the main reason I used the MySpace passwords was I'm getting tired of using the RockYou dataset for everything. That being said, it's useful for this experiment that all of the passwords in that dataset have been converted to lowercase since I don't have to worry about combining case mangling rules with the purge rules.

Tools Used: Hashcat for the cracking, and John the Ripper for the --status option

Rulesets Used: Hashcat's D3ad0ne mangling rules. I broke it up into two different rulesets with one containing the purge rules, (along with a few append/prepend '@' rules that snuck in), and the other one containing all the other mangling rules.

Reason: D3ad0ne's mangling rules contain about 34 thousand individual mangling rules. Due to its size and the fact that it is included with Hashcat it should make a good example of a ruleset that many Hashcat users are likely to incorporate in their cracking sessions. I initially split the base ruleset into two different subsets, with all rules including the '@' into one ruleset called d3ad0ne_purge, and all the other rules into another one called d3ad0ne_base. I then started manually going through d3ad0ne_purge and placing rules such as "append a @" into the d3ad0ne_base, but with over 1k rules in d3ad0ne_purge I quickly decided to remove the results of the append/prepend '@' after the fact instead of trying to fully isolate only purge rules in their own ruleset.
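The initial split described above boils down to a one line test per rule. Here is a minimal sketch of it; the function name and the handful of sample rules are made up for illustration, and it is not the exact script I used. It also shows why append/prepend '@' rules end up in the purge set: a plain substring check can't tell them apart from a real purge.

```python
from typing import Iterable, List, Tuple


def split_ruleset(rules: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split rules into (anything containing '@', everything else)."""
    purge_like: List[str] = []
    base: List[str] = []
    for rule in rules:
        (purge_like if "@" in rule else base).append(rule)
    return purge_like, base


if __name__ == "__main__":
    sample = ["@s", "$@", "^@", "c", "$1 $2 $3"]  # hypothetical sample rules
    purge_like, base = split_ruleset(sample)
    print("purge set:", purge_like)  # includes the append/prepend '@' rules
    print("base set: ", base)
```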

Dictionary Used: I used dic-0294 as my wordlist. Yes there are better input dictionaries out there, but this is a common one and strikes a good balance between size and coverage, plus it is public vs other dictionaries I have that are based on cracked passwords.

Experimental Results:

Step 1) Run a normal cracking session on the 1 million MySpace passwords using dic-0294 and D3ad0ne_base. This is important since the purge rule will likely crack many passwords that would be cracked normally with other rules. Running a normal cracking session first removes those passwords so we can focus on passwords that would only be cracked by the purge rules. The command I ran is below, (note, I'm editing some of the path information out of the commands for clarity's sake).

./hashcat -D1 -m 100 -a 0 --remove myspace_rand_1m_hc.txt -r rules/d3ad0ne_base.rule dic-0294.txt

A couple of notes about the above command. I'm using a version of Hashcat that I updated on August 10th 2016. I ran it on a very old MacBook Pro so the -D1 is telling it to use CPU only, (since the GPU doesn't have enough memory). The -m 100 is telling it to crack unsalted SHA-1 hashes. The -a 0 is to do a basic dictionary attack. --remove was to remove any cracked hashes so they aren't counted twice in future cracking sessions. myspace_rand_1m_hc.txt is my target set, rules/d3ad0ne_base.rule is my ruleset, and dic-0294.txt is my input dictionary. Below are the results of running this first attack.

With 36% of the passwords cracked by a very vanilla attack on a slow computer, that isn't bad. Next up is running the purge rules.

Step 2) Delete the previous hashcat.pot file. Run a cracking session on the remaining passwords using the purge ruleset. The command I ran was very similar to the one above:

./hashcat -D1 -m 100 -a 0 myspace_rand_1m_hc.txt -r rules/d3ad0ne_purge.rule dic-0294.txt

Note, I took off the --remove option since I didn't care about removing cracked hashes for this. I also deleted the previous .pot file of cracked passwords since I only wanted to store passwords associated with this test. Here is a screenshot I took partway through the cracking session:

As you can see, many of the cracked passwords were due to "insert a @ symbol" vs. using the purge rule. Here are the final results:

The session managed to crack 405 unique hashes. I then went into the pot file and deleted any password containing the '@' character so what was left was due to the purge rule. This left me a list containing 128 unique passwords. A screenshot is shown below:

Now it's hard to tell what people were thinking when they created these passwords, but glancing through the list, it certainly appeared that most of the cracked passwords were simply due to limitations in my input dictionary vs users purging characters from their passwords. I was actually surprised 'jayden' and 'fatguy' weren't in dic-0294 but after double checking it they were in fact missing from it.

Now, input dictionaries are always going to be limited to a certain extent so these cracks absolutely count. They only represent unique cracked hashes though. For example, if 20 people used the password 'imabear' it would only be counted once. To figure out how many total accounts would have been cracked, I re-ran the above dictionary through John the Ripper against the myspace_1m_rand list. This was to get the files into John's cracked file (pot) format. For example here is 'imabear' in john.pot:

{SHA}QiPoQuc4sqqs3J+OulWLt3H09kY=:imabear

The reason I did this was because JtR has a really cool feature '-show' that will match up cracked passwords with the accounts in the target set. Running the command:

./john -format=raw-sha1 -show myspace_rand_1m_clean.txt

resulted in the following output:

Therefore the purge rules cracked a total of 164 passwords from the test set, or 0.0164% of the total. That's a really small amount. Admittedly every password cracked is nice, but still I was curious if the purge rules were better than just running random mangling rules instead. Luckily, Hashcat supports a command to test that out:

./hashcat -D1 -m 100 -a 0 myspace_rand_1m_hc.txt -g 500 dic-0294.txt

The only difference between the above command and the previous Hashcat commands I ran was that instead of a rules file I specified '-g 500'. What that does is tell Hashcat to generate 500 random rules to run on the input dictionary. I chose that number since there were over a thousand rules in my D3ad0ne_purge ruleset and I guesstimated that about half of them were actual purge rules. When I ran the above I ended up cracking 23 more passwords. That's significantly fewer than the 164 the purge rules cracked, but in the grand scheme of things it was about the same in effectiveness. Considering some of those rules were likely duplicates of rules in the D3ad0ne_base ruleset as well, I'd argue that running a purge rule is about equivalent to running a random mangling rule. In fact if you don't already have purge rules in your mangling set, I'd probably recommend not worrying about it and just running a brute force method like Markov mode to stretch your dictionary instead.
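The post-processing from Step 2 (throw away every crack whose plaintext contains '@', then count accounts and not just unique hashes) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not what Hashcat or JtR do internally. It assumes a simple `hash:plaintext` pot format, and the hash values and sample data below are placeholders.

```python
from collections import Counter


def drop_at_plaintexts(pot_lines):
    """Keep 'hash:plaintext' pot entries whose plaintext has no '@' in it."""
    kept = {}
    for line in pot_lines:
        # Split on the first ':' only, so plaintexts containing ':' survive.
        hash_value, _, plaintext = line.rstrip("\n").partition(":")
        if plaintext and "@" not in plaintext:
            kept[hash_value] = plaintext
    return kept


def count_accounts(cracked, target_hashes):
    """Count every account that used a cracked hash, duplicates included."""
    occurrences = Counter(target_hashes)
    return sum(occurrences[h] for h in cracked)


# Placeholder data: 'h1'..'h4' stand in for real SHA-1 hashes.
pot = ["h1:imabear", "h2:pass@word", "h3:jayden"]
targets = ["h1"] * 20 + ["h2", "h3", "h4"]

cracked = drop_at_plaintexts(pot)
total_accounts = count_accounts(cracked, targets)
print(len(cracked), "unique hashes,", total_accounts, "accounts")

# The percentage quoted in the post: 164 accounts out of 1 million.
percent = 100 * 164 / 1_000_000
print(f"{percent:.4f}% of the test set")
```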

Conclusion:

For once my gut feeling was right and the value of Hashcat's purge rule '@' was limited in the tests that were run. That's not to say that it's not useful. It may help when targeting certain users or aid in keeping the size of your dictionary files on disk manageable. But at the same time, it's not a major feature that other password crackers should rush to mimic. I hope this blog post was informative in helping show different ways to evaluate the effectiveness of a mangling technique. If you have any questions, comments or suggestions please feel free to leave them in the comments section.

Cracking the MySpace List - First Impressions

2016-07-07T17:52:00.000-07:00

Alt Title: An Embarrassment of Riches

Backstory:

Sometime around 2008, a hacker or disgruntled employee managed to break into MySpace and steal all the usernames, e-mails, and passwords from the social networking site. This included information covering more than 360 million accounts. Who knows what else they stole or did, but for the purposes of this post I'll be focusing only on the account info. For excellent coverage of why the dataset appears to be from 2008 let me refer you to the always superb Troy Hunt's blog post on the subject. Side note, most of my information about this leak also comes from Troy's coverage.

This dataset has been floating around the underground crime markets since then, but didn't gain widespread notoriety until May 2016 when an advertisement offering it for sale was posted to the "Real Deal" dark market website. Then on July 1st, 2016, another researcher managed to obtain a copy and then posted a public torrent of the entire leak for anyone to download. That's where things stand at this moment.

Unpacking the Dataset:

The first thing that stands out about the dataset is how big it is. When uncompressed the full dump is 33 Gigs. Now, I've dealt with database dumps of similar size but they always included e-mails, forum posts, website code, etc. The biggest password dataset I previously had the chance to handle was the RockYou set which weighed in at 33 million passwords and took up 275 MB of disk. Admittedly that didn't include user info and passwords were stored as plaintext, (the plaintexts are generally shorter than the hex representation of hashes), but still that's a huge leap in data to process. Heck, even the full RockYou list is a bit of a pain to process.

Let me put this another way. Here is a simple question, "How many accounts are in the MySpace list?" Normally that's quick and easy. Just run:

wc -l

And then you wait ... and wait ... and wait ... and then Google if there is a faster way to count lines ... and then wait. 16 minutes and 24 seconds later, I found out there were 360,213,049 lines in the file. Does that equal the number of total accounts or is there junk in that file? Well, I don't want to spend the 30+ minutes to run a more complicated parser so that sounds about right to me ¯\_(ツ)_/¯. Long story short, doing anything with this file takes time. Eventually I plan on moving over to a computer with an SSD and more hardware which should help but it's something to keep in mind.

That being said, the next question is "What does the data look like?" Well here is a screenshot of the first couple of lines.

As you can see, it takes the form of a unique ID that increments, e-mail address, username, and then two hashes. All of the fields except the unique ID can be blank. To answer the next question, "Why two hashes?" well ... ¯\_(ツ)_/¯. That's something I plan on looking at but I haven't gotten around to it yet.

Update: 7/7/16: Just as I was finalizing this post, I ran across CynoSure Prime's analysis where they managed to crack almost every single hash in this dataset. You can find their blog post here. It turns out the second hash is actually the original password, (full length with upper case characters) salted with the user_id. I'm going to leave most of this blog entry unmodified even though how to parse the list can certainly be optimized based on this new info. </Update>

Other random tidbits: The final unique ID is 1005290998. That's significantly higher than the number of accounts in this dataset so there are large chunks of accounts that were deleted at some point in time. My guess is when a user deleted their MySpace account it really was deleted in which case, kudos to MySpace for doing that! That's just a guess though. As you would expect the first accounts were administrative accounts and system process accounts. I know I blocked out the user e-mails but I will admit I googled the first name. When I found his LinkedIn profile my first reaction was, "Wow, he needs to brag about his accomplishments more than just saying:"

Developed, and launched the initial Myspace community which currently has over 100 million members and was acquired by Fox Corp. for $580 million.

I mean if it was me I would post that database dump on my resume! Of course further googling led me to the book "Stealing MySpace." Reading about all the drama that went on and suddenly there went my evening. Needless to say, the general layout of the dataset looks legit but one more interesting fact was all those gmail accounts. MySpace was created in 2003, Gmail opened for invitation access in 2004, and the lead engineer of MySpace left in 2003. So employees were able to update their accounts after they had left the company. Once again, kudos to MySpace but that was surprising.

Password Hash Format:

I initially learned from Troy Hunt's posts that the hashes were unsalted SHA1 with the plaintext lowercased and then truncated to 10 characters long. Therefore the password:

123#ThisIsMyPassword

would be saved as:

123#thisis

I've heard some people say that this means hackers can just brute force the entire key-space. If I was feeling nit-picky I could argue *technically* that's beyond the reach of commercial setups as 70^10 is still a really big number (27 characters + 10 digits, + 33 special characters). In reality though by intelligently searching the key-space, (who uses commas in their password?), a vast majority of unsalted password hashes can be cracked under that format. It's a bit of a moot point though since the real issue is using such a fast unsalted hash. Ah 2008, when it was still acceptable to claim ignorance for using a bad hashing set-up.
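To put a rough number on "a really big number", here is a quick back-of-the-envelope calculation in Python. The 70-symbol alphabet and the 10-character limit come from the paragraph above. The guessing rate is a made-up assumption, picked only to show the arithmetic, and is not a measured benchmark of any real hardware.

```python
ALPHABET_SIZE = 70   # the post's figure for letters + digits + special characters
MAX_LENGTH = 10      # plaintexts were truncated to 10 characters

# Candidates that are exactly 10 characters long: the 70^10 quoted above.
full_length = ALPHABET_SIZE ** MAX_LENGTH

# Candidates of every length from 1 up to 10 characters.
all_lengths = sum(ALPHABET_SIZE ** n for n in range(1, MAX_LENGTH + 1))

# ASSUMPTION: a hypothetical sustained rate of 10 billion SHA-1 guesses per second.
GUESSES_PER_SECOND = 10_000_000_000
seconds = full_length / GUESSES_PER_SECOND
years = seconds / (365 * 24 * 3600)

print(f"{full_length:,} candidates of length {MAX_LENGTH}")
print(f"{all_lengths:,} candidates of length 1..{MAX_LENGTH}")
print(f"roughly {years:.1f} years to exhaust length {MAX_LENGTH} at the assumed rate")
```

Swap in your own rate and alphabet to see how quickly the picture changes once you drop the characters almost nobody uses.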

Long story short, from my experiments so far I can confirm that it appears all the hashes had their plaintexts lowercased and truncated to 10 characters. Also, yes, serious attackers are very likely to crack almost every password in this list.
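As a minimal sketch of that format, assuming the password bytes were UTF-8 encoded before hashing (the encoding is my assumption, and the function names are just for illustration):

```python
import hashlib


def myspace_normalize(password):
    """Lowercase the plaintext, then truncate it to 10 characters."""
    return password.lower()[:10]


def myspace_hash(password):
    """Unsalted SHA-1 of the normalized plaintext, as a hex string.
    ASSUMPTION: UTF-8 encoding; the post does not say which encoding was used."""
    normalized = myspace_normalize(password)
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


print(myspace_normalize("123#ThisIsMyPassword"))
# Any two passwords that agree on their first 10 characters, ignoring case, collide:
print(myspace_hash("123#ThisIsMyPassword") == myspace_hash("123#THISISsomethingelse"))
```

The practical upshot is that a cracker only ever needs to guess lowercase candidates of 10 characters or fewer against this hash.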

Cracking MySpace Passwords With John the Ripper (Take 1):

After glancing around the dataset, the next thing I wanted to do was start cracking. To do this, I needed to extract and format the hashes. My first attempt to do this yielded the following script:

cat Myspace.com.txt | awk -F':' '{if (length($2) > 3) {print "myspace_big_hash1:" substr($4,3); if (length($5) > 3) {print "myspace_big_hash2:" substr($5,3)}}}' > myspace_clean_big.hsh

To point out a couple of features, I was labeling my data-sets so they are correctly identified in my input file, (I maintain different input files for different data sets but still having that name there has saved me trouble in the past), and I was removing blank hashes. Also I was stripping the username and e-mail addresses since I really didn't want to see passwords associated with names. The problem was the resulting file was huge. I didn't save it, but it was bigger than the original list! I couldn't afford the full naming convention. Therefore I switched to the following script:

cat Myspace.com.txt | awk -F':' '{if (length($2) > 3) {print substr($4,3); if (length($5) > 3) {print substr($5,3)}}}' > myspace_temp.hsh

And then to remove duplicates I ran:

sort -u myspace_temp.hsh > myspace_big.hsh

The resulting file was a little under 8 gigs which was better. Problems occurred though when I tried to load the resulting hash file into JtR. More specifically after letting it run overnight, JtR still hadn't loaded up the password list and started making guesses. That kind of makes sense. That's way more passwords than normal to parse and my laptop only had 8 gigs of RAM so even in an ideal case the whole list probably couldn't be stored in memory. That's not an ideal cracking situation. Being curious, I then decided to try and load it up in Hashcat.

Cracking MySpace Passwords With Hashcat:

Loading up the dump in Hashcat was interesting since it gave me warnings about records in the dataset that weren't parsed correctly.

Regardless, once all was said and done, I ended up with the following error:

ERROR: cuMemAlloc() 2

Doing some quick Googling, I found out the cause was that the GPUs ran out of memory trying to load the hashes. Not surprising but it meant I had to take a different approach if I wanted to crack any hashes from this set.

The easiest way to do this was to split the full list up into smaller chunks and then crack each section by itself. One way to do that is with the split command

split -l 5000000 myspace_big.hsh myspace_split_

This will break up the list into 5 million hash chunks that follow the line of myspace_split_aa, myspace_split_ab .... The downside is since you have to crack each file individually, the total cracking time has been increased by close to a factor of 40. I'd recommend playing with the file size to maximize the total number of hashes per file that your GPU supports. On the plus side, after all that I can now finally crack passwords!
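The "factor of 40" is just the number of chunks. Here is a small Python sketch of the arithmetic plus a generator that does the same job as `split -l`. The 195 million hash count is an estimate I'm assuming for illustration (a bit under 8 GB at 41 bytes per line, i.e. 40 hex characters plus a newline). It is not a number measured in the post.

```python
from itertools import islice


def chunk_count(total_hashes, chunk_size):
    """Number of files needed, rounding up for the final partial chunk."""
    return -(-total_hashes // chunk_size)  # integer ceiling division


def iter_chunks(lines, chunk_size):
    """Yield successive lists of at most `chunk_size` lines, like `split -l`."""
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# ASSUMPTION: roughly 195 million unique hashes in the de-duplicated file.
ESTIMATED_HASHES = 195_000_000
CHUNK_SIZE = 5_000_000
print(chunk_count(ESTIMATED_HASHES, CHUNK_SIZE), "chunks to crack one after another")

# Tiny demonstration of the chunking itself.
print(list(iter_chunks(range(12), 5)))
```

Since every chunk has to see the full set of guesses, doubling the chunk size your GPU can hold roughly halves the total cracking time.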

Finally cracking passwords

One issue I had was that there were so many hashes cracking all the time that it was hard to see the status of my session. It's not that my attack was effective, but with a list that large it's hard not to crack something. I belatedly realized I could pause hashcat, print the status and then resume. Or, as Jeremi Gosney replied on Twitter, I could have used the following switch with Hashcat:

-o /dev/null

Closing Thoughts:

I'll admit I'm writing this conclusion with CynoSure Prime's analysis fresh in my mind. While the MySpace list is great for giving me a real world challenge to knock my head against, I'm not sure how useful it'll be from a research perspective. The 66 million salted hashes that were created from the original plaintexts will be nice for new training and testing sets so researchers don't have to keep using RockYou for everything. That being said, MySpace is actually an older list than RockYou. Also I fully expect there to be a lot of overlap in the passwords between the two datasets. RockYou's entire business model was allowing apps to work across multiple social networking sites in the era before federated logins. RockYou was storing MySpace + LiveJournal + Facebook passwords in the clear so its app could cross-post across all of them. Statistically I expect MySpace and RockYou to be very similar.

What worries me though, and what makes the MySpace list special, is it has user information associated with all those 360 million accounts + password hashes. Just about everyone who did any social networking and is between the ages of 24 and 40 is in this dump. I realize this list has been in the hands of criminals for the last eight years and a lot of the damage has already been done. Still, now that this list is public it enables many more targeted attacks to be carried out by malicious actors from all over the internet. How long before we start seeing the top 100 celebrity passwords posted on sites like Gawker? What about ex's using this information against former partners? Previous public password dumps have been much more limited or didn't contain e-mail addresses. I really don't know what will happen with this one. Hopefully I'm being overly paranoid but it's hard not to think about the downsides associated with this dump being widely distributed. On the plus side, hopefully this is the only mega-breach we'll see with weak password storage. Sites like Google and Facebook are now using very strong hashes which will limit a lot of damage if their user information is disclosed in the future.

Getting Started With Quantum Computing

2016-05-11T18:53:00.000-07:00

“More often than not, the only reason we need experiments is that we're not smart enough.” ― Scott Aaronson

IBM is currently offering free time on one of their quantum computers for interested researchers. Yup, you can program a real life quantum computer right now! In fact, I highly recommend signing up which you can do here. Go ahead and check it out. It took me about 24 hours to get my account approved so you can come back here afterwards to finish reading this post.

What got me interested in this opportunity was that while I have tried to keep up on the field of quantum computing, it basically is magic to me. I've been building up some general rules in my head about quantum systems, but any sort of question about them that did more than scratch the surface left me shrugging my shoulders. Also it was hard to separate fact from fiction.

Quantum Laws (in Matt's head):

Quantum is a system like everything else.

A quantum state is a configuration of the system.

A quantum state changes; it naturally wants to evolve, but it can always be undone.

Evolution of a closed system is a unitary transformation on its Hilbert space.

Only the Keeper can block quaffle shots thrown by the opposing team

Do not feed your qubits after midnight

That's why IBM's offer interested me so much. Let's be honest, there's always going to be some magic when it comes to quantum systems, but the opportunity to actually get hands on time programming one would at least turn the whole experience into alchemy if not science for me.

Participating in IBM's Quantum Experience:

After your account is approved you immediately have access to a research portal which IBM calls the "Quantum Experience". It's currently in Beta, but beyond a few bugs in the composer, (which I'll talk about in a bit), it's a very well polished site.

Are you ready to experience some quantums!?

The portal is divided into three tabs, "User Guide", "Composer", and "My Scores". The User Guide is fairly self explanatory but actually impressed me more than the quantum computer itself. I'm still making my way through it but the authors deserve a pat on the back since it's some of the best technical writing I've seen in a while. What's more, there are multiple links to the quantum simulator with examples for each section so you can read about a particular operation or theory and then run a simulation of it and check the results. You can then modify the example, re-run it, and in general play around with the concept before going back to where you were in the user's guide.

Don't worry, it starts out with simpler concepts.

The Composer is the programming interface for the quantum computer. It is attached to a quantum simulator as well. In it, you write "Scores" which are basically circuits to run on the quantum computer. IBM calls them scores since with five qubits to work with it looks like sheet music. That's also how the composer got its name.

An example score in the composer. Yes, this is the default example for Grover's algorithm, but I renamed it since it's all about how you frame the problem.

You can simulate a given quantum score as much as you'd like. When doing so, (or creating a new score), you have the option of choosing an ideal or real layout. The difference is that there are physical limitations of the real quantum computer which directly impact how you design your score.

Red pill or blue pill?

Qubit 2 is the gatekeeper

That's one of the neat things about using this service vs a standard quantum simulator. You can see some of the limitations that current implementations have to deal with. For example, Qubit 2 is the only qubit that can talk to other qubits, so if you want to perform operations like conditional NOTs, (CNOT), that has a huge impact.

Running it For Real:

That's all fun but the real reason you are probably using IBM's service is to actually run programs on their quantum computer. I'll admit the "good old days" of punch card mainframes were before my time but the whole setup is somewhat similar. You are given "Units" which are then used up when you run a program vs simulate it. IBM currently is being very generous with giving them out and you can request more for free. The typical program uses around 3 units to run. The results are probabilistic, so each run can be made up of multiple "shots", and then in the end the average of the results is presented to you. Further display options, such as Bloch spheres where the results are plotted as a vector on a 3D sphere, take even more shots to generate.

I feel a bit guilty about not just running this once, but it's the same price!

To further help you save units, as well as get you the results sooner, if your quantum score has previously been run by someone IBM will give you the option to see the saved results vs re-running it yourself.

Well I guess I wasn't that original...

If you do choose to run your program you are added to the queue. So far, most of my results have been available within a couple of minutes.

Someone else is quantuming ahead of me

Looks like the plain-text is '00'. As I said, it's all about framing the problem.

Remember earlier when I said the runs were probabilistic? You can really see that in the results above. The correct answer was '00', but around 24% of the time a different answer was chosen.
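If the idea of "shots" is unclear, here is a purely classical Python sketch of the bookkeeping: draw many samples from a noisy outcome distribution and report how often each answer shows up. This is not a quantum simulation, and the distribution is an assumption. The only number taken from the run above is that the correct answer '00' came up about 76% of the time. Spreading the remaining 24% evenly across the other outcomes is my invention.

```python
import random
from collections import Counter


def run_shots(probabilities, shots, seed=None):
    """Sample `shots` outcomes and return the observed frequency of each one."""
    rng = random.Random(seed)
    outcomes = list(probabilities)
    weights = [probabilities[o] for o in outcomes]
    counts = Counter(rng.choices(outcomes, weights=weights, k=shots))
    return {o: counts[o] / shots for o in outcomes}


# ASSUMPTION: '00' is right 76% of the time, errors are spread evenly.
noisy = {"00": 0.76, "01": 0.08, "10": 0.08, "11": 0.08}

freqs = run_shots(noisy, shots=1000, seed=1)
for outcome, freq in freqs.items():
    print(outcome, f"{freq:.3f}")
print("most common answer:", max(freqs, key=freqs.get))
```

A single shot can easily give you the wrong answer, which is why the results only make sense once you average over many of them.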

Issues With the Composer:

I need to submit bug reports, (the bug icon is prominently displayed in the lower right corner of the screen on the portal site), but I've been hesitant to since all the issues I've run into have been very minor. Sometimes the composer gets a bit wonky, (gates get stuck or aren't saved when you run your simulation), but the problem goes away when I refresh my screen. Also, it would be nice if the transitions between composer and users guide were quicker or I could have them open side by side, (opening multiple browser windows does not work). All in all though, I haven't run into any major issues considering this program is currently a beta release.

Summary:

You are not going to be able to hack any Gibsons with IBM's quantum computer. It's very limited, but that is kind of the point. It shows where the field of quantum computing is right now. IBM is providing an amazing free learning opportunity with this service and if you are at all interested in the future of computing I highly recommend checking it out.

Challenges with Evaluating Password Cracking Algorithms

2015-08-19T19:37:00.003-07:00

"In theory, theory and practice are the same. In practice they are not" -Quote from somebody on the internet. Also attributed to Albert Einstein but I've never been able to find the original source to back that up.

Back-story:

Currently I'm writing a post looking into Hashcat's Markov mode but I found myself starting off by including several paragraphs worth of disclosures and caveats. Or to put it another way:

It was a valid point Jeremi brought up and it's something I'm trying to avoid. After thinking about it for a bit I figured this topic was worth its own post.

Precision vs Recall:

Part of the challenge I'm dealing with is I'm performing experiments vs writing tutorials. That's not to say I won't write tutorials in the future but designing and running tests to evaluate algorithms is fun and what I'm interested in right now. Why this can be a problem though is that I can get so deep into how an algorithm works that it's easy to lose sight of how it performs in a real life cracking session. I try to be aware of this, but an additional challenge is representing these investigations to everyone else in a way that isn't misleading.

This gets into the larger issue of balancing precision and recall. In a password cracking context, precision is modeling how effective each guess is when it comes to cracking a password. The higher your precision, the fewer guesses on average you need to make to crack a password. As a rule of thumb if you see a graph with number of guesses on the X axis and percentage of passwords cracked on the Y axis, it's probably measuring precision.

An example of measuring the precision of different cracking techniques

Recall on the other hand is the percentage of passwords cracked during a cracking session in total, regardless of how many guesses are made. Usually this isn't represented in a graph format, and if it is, the X axis will be represented by "Time", and not number of guesses.

Courtesy of Korelogic's Crack Me If You Can contest. This represents a Recall based graph

It's tempting to say that "Precision" is a theoretical measurement and "Recall" is the practical result. It's not quite so clear cut though since the "time" factor in password cracking generally boils down to "number of guesses". In an online guessing scenario an attacker may only be able to make 10-20 guesses. With a fast hash, offline attack, and a moderate GPU setup, billions of guesses a second are possible and an attack might run for several weeks. Therefore recall results tend to be highly dependent on the particular situation being modeled.
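Here is a minimal Python sketch of how the two measurements fall out of the same cracking session, using the post's working definitions: the precision view is the cracked percentage at a given number of guesses, and recall is the final cracked percentage. The function name, guess list, and target passwords are all made up for illustration.

```python
from collections import Counter


def cracked_curve(guesses, target_passwords, checkpoints):
    """Return ({guess_count: fraction cracked}, final fraction cracked).

    The dictionary is the 'precision' style curve (guesses on the X axis);
    the final fraction is the 'recall' for the whole session."""
    remaining = Counter(target_passwords)   # accounts per password
    total = sum(remaining.values())
    wanted = set(checkpoints)
    cracked = 0
    curve = {}
    for n, guess in enumerate(guesses, start=1):
        # pop() so a repeated guess never counts the same accounts twice
        cracked += remaining.pop(guess, 0)
        if n in wanted:
            curve[n] = cracked / total
    return curve, cracked / total


# Hypothetical data: six accounts, three of which share '123456'.
targets = ["123456"] * 3 + ["password", "letmein", "zxcvbn"]
guesses = ["123456", "password", "qwerty", "letmein"]

curve, recall = cracked_curve(guesses, targets, checkpoints=[1, 2, 4])
print(curve)
print(f"recall: {recall:.2f}")
```

Two attacks can end at the same recall while having very different curves, which is why the graph is usually more informative than the final number.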

Now it would be much easier to switch between "Precision" and "Recall" if there was a direct mapping between number of guesses and time. The problem is, not all guesses take the same amount of time. A good example of that is CPU vs GPU based guessing algorithms. Going back to John the Ripper's Incremental mode, I'm not aware of any GPU implementation of it so guesses have to be generated by the CPU and then sent to the GPU for hashing. Meanwhile Hashcat's Markov mode can run in the GPU itself, and in Atom's words "it has to create 16 billions candidates per 10 milliseconds on a single GPU. Yes, billions". Therefore this can lead to situations, such as in the case of a very fast hash, where certain attacks might have a higher precision, but worse recall.

Amdahl's law and why I find precision interesting

When trying to increase recall an attacker generally has two different avenues to follow. They can increase the number of guesses they make or they can increase the precision of the guesses they make. These improvements aren't always exclusive; many times you can do both. Often though there is a balancing act as more advanced logic can take time and may be CPU bound. What this means is that you might increase precision only to find your recall has fallen since you are now making fewer guesses. That being said, if the increase in precision is high enough, then even an expensive guessing algorithm might do well enough to overcome the decrease in the total number of guesses it can make.

Often in these optimization situations Amdahl's law pops into my head, though Gustafson's law might be more appropriate for password cracking due to the rate of increase in the number of guesses. Amdahl's law in a nutshell says the maximum speedup you can have is always limited by the part of the program you can't optimize. To put it another way, if you reduce the cost of an action by 99%, but that action only accounts for 1% of the total run-time, then your maximum total speedup no matter how cool your optimization is would be no more than 1%.
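The 99%/1% example above is easy to check with the standard form of Amdahl's law. A minimal Python sketch (the function name is mine):

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# The example from the text: cutting an action's cost by 99% makes it 100x
# faster, but the action is only 1% of the total run time.
speedup = amdahl_speedup(fraction=0.01, factor=100)
time_saved = 1.0 - 1.0 / speedup

print(f"overall speedup: {speedup:.4f}x")
print(f"run time saved:  {time_saved:.2%}")

# Even an infinitely fast optimization is capped by the untouched 99%.
print(f"upper bound:     {1.0 / (1.0 - 0.01):.4f}x")
```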

Where this applies to password cracking is the cost of a guess in an offline cracking attack can be roughly modeled as:

Cost of making the plain-text guess + cost of hashing + general overhead of the cracking tool

Right now the situation in many cases is that the cost of hashing is low thanks to fast unsalted hashing algorithms and GPU based crackers. Therefore it makes sense to focus on reducing the cost of making the plain-text guesses as much as possible since that will have a huge impact on the overall cost of making a guess. Aka, trading precision for speed in your guessing algorithm can have a significant impact on the total number of guesses you can make. If on the other hand a strong hash is used, (or you at least are trying to crack a large number of uniquely salted hashes), the dominant factor in the above equation becomes the hashing itself. Therefore a speedup in the plaintext generation will not have as much impact on the overall cost and therefore precision becomes more important.
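A rough Python sketch of that cost model shows why the same guess-generation speedup matters a lot against a fast hash and barely at all against a slow one. Every number below is a made-up placeholder in nanoseconds, chosen only to illustrate the shape of the tradeoff, and not a measurement of any real tool or hash.

```python
def cost_per_guess(generate_ns, hash_ns, overhead_ns):
    """Cost of one guess: plaintext generation + hashing + tool overhead."""
    return generate_ns + hash_ns + overhead_ns


def speedup_from_faster_generation(generate_ns, hash_ns, overhead_ns, factor):
    """Overall speedup if only the guess generation gets `factor` times faster."""
    before = cost_per_guess(generate_ns, hash_ns, overhead_ns)
    after = cost_per_guess(generate_ns / factor, hash_ns, overhead_ns)
    return before / after


# HYPOTHETICAL costs in nanoseconds.
GENERATE_NS = 100
OVERHEAD_NS = 10
FAST_HASH_NS = 10          # stand-in for a fast unsalted hash on a GPU
SLOW_HASH_NS = 100_000     # stand-in for a deliberately expensive hash

fast = speedup_from_faster_generation(GENERATE_NS, FAST_HASH_NS, OVERHEAD_NS, factor=2)
slow = speedup_from_faster_generation(GENERATE_NS, SLOW_HASH_NS, OVERHEAD_NS, factor=2)

print(f"fast hash: {fast:.3f}x more guesses from 2x faster generation")
print(f"slow hash: {slow:.5f}x more guesses from 2x faster generation")
```

When the hash dominates the cost, the only lever left is making each guess count, which is the precision side of the tradeoff.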

As a researcher, precision is very interesting for me. From a defensive standpoint a good starting place is "use a computationally expensive salted hash". If you aren't at least doing that then the chances are you aren't interested in doing anything more exotic. Also when it comes to contributing to the larger research community, well my coding skills are such that I'm not going to be making many improvements to the actual password cracking tools. Evaluating and improving the precision of different attacks is much more doable.

Carnegie Mellon's Password Guessability Service:

One cool resource for password security researchers is the new Password Guessability service being offered by the CUPS team over at Carnegie Mellon. I'm going to paraphrase their talk, but basically their team got tired of everyone comparing their password attacks to the same default rulesets of John the Ripper so they created a service for researchers to model more realistic password cracking sessions. If you are interested their USENIX paper describing their lab setup can be found here. Likewise if you want to see a video of their Passwords15LV talk you can view it here. More importantly, if you want to go to their actual site you can find it here:

https://pgs.ece.cmu.edu/

The service itself is free to ethical security researchers and is run by students so don't be a jerk. The actual attacks they run are bound to change with time, but as of right now they are offering to model several different default password cracking attacks consisting of around 10 trillion guesses each. These cracking attacks use the public TrustWave's JtR KoreLogic Rulelist, several different HashCat rulesets, an updated Probabilistic Context Free Grammar attack, and another custom attack designed by Korelogic specifically for this service. All in all, if you need to represent an "industry standard" cracking session it's hard to do better. In fact it probably represents a much more advanced attacker than many of the adversaries out there if you assume the target passwords were protected by a hashing algorithm of moderate strength.

I could keep on talking about this service but you really should just read their paper first. I think it's a wonderful resource for the research community and I have lots of respect for them offering this. So the next question of course is what does that mean for this blog? I plan on using this service as it makes sense without hogging Carnegie Mellon's resources. I need to talk to them more about it but I expect that I'll have them run it against a subset of the RockYou list and then use, and reuse, those results to evaluate other cracking techniques as I investigate them. If I attack some other dataset though I may just run a subset of the attacks myself, unless that dataset and the related tests are interesting enough to make using CM's resources worth it.

Fun with Designing Experiments:

When designing experiments there's usually a couple of common threads I'm always struggling with:

Poor datasets. I know there's a ton of password dumps floating around, but often due to the nature of their disclosure there are massive problems or shortcomings with most of them. For example most of the dumps on password cracking forums or pastebin have only unique hashes, so '123456' only shows up once, and there is no attribution. Gawker was a site most people didn't care about, and the hashing algorithm cut off the plaintext after 8 characters and replaced non-ASCII text with a '?'. A majority of the passwords in the Stratfor dataset were machine generated. Myspace, well that was a result of a phishing attack so it has many instances of 'F*** You You F***ing Hacker'. Even with RockYou the dataset is complicated as it contained many passwords from the same users for different sites, but since there were no usernames connected with the public version of it, it can be hard to sort out. Then there is the fact that most of these datasets were for fairly unimportant sites. I'm not aware of any confirmed public Active Directory dump, (though there are a large number of NT hashes floating about and this whole Ashley Madison hack may change things with the Avid Life Media NT hashes there). Likewise, while there are some banking password lists, the amount of drama surrounding them makes me hesitant to use them.
Short running time. Personally I like keeping the time it takes to run a test to around an hour or so. While I can certainly run longer tests, realistically anything over a couple of days isn't going to happen since I like using my computers for other things and truth be told, it always seems like I end up finding out I need to run additional tests or I messed something up in my original setup and need to re-run it. Shorter tests are very much preferred. Add to that, the fact that most of the time I'm modeling precision and running my tests on a CPU system means most of my tests will not be modeling GPU cracking vs fast hashes.
What hypothesis do I want to test, and can I design an experiment to test it? I'll admit, sometimes I'll have no clue what the results of a test will be so I'll pull a YOLO, throw some stuff together and just run it to see what pops out. That's not ideal though as I usually like to try and predict the results. I'm often wrong, but that at least forces me to look deeper into what assumptions I held were wrong, and hey that's why I run tests in the first place.

Furthermore, for at least the next couple of tools I'm investigating I plan on using both Hashcat and John the Ripper as much as possible. While it might not always make sense to use both of them as often there isn't an apples to apples comparison, I do have some ulterior motives. Basically it helps me to use both of these tools in a public setting and I've already gotten a lot of positive feedback from my PRINCE post. It's pretty amazing when I can have a creator of a tool tell me how I can optimize my cracking techniques. My secondary reason for this is to make people more aware of both of these tools. When it comes to the different attack modes I've found there's a lot of misunderstandings of what each tool is capable of.

That being said, I explicitly don't want to get into "Tool A is better than Tool B" type debates. Which tool you use really depends on your situation. Heck, occasionally I'm glad I still have Cain and Abel installed. I'll admit, this is going to get tricky when I'm doing tests such as comparing Hashcat's Markov mode to JtR's Incremental mode, but please keep in mind that I want to make all the tools better.

Enough talk; Give us some code or graphs or GTFO:

Thanks for putting up with all of that text. In the spirit of showing all my research I'm sharing the tool that I wrote to evaluate password cracking sessions which I'll be using in this blog. The code is available here:

https://github.com/lakiw/Password_Research_Tools

The specific tool I'm talking about, (in the hope that I release multiple tools in the future so it isn't obvious ;p), is called checkpass2.py. It's a significantly faster version of the old checkpass program I had used and released in the past. The options on how it works are detailed in the -h switch, but basically you can pipe whatever password guess generation tool you are using into it and it'll compare your guesses against a plaintext target list and tell you how effective your cracking session would have been. For example if you were using John the Ripper you could use the -stdout option to model a cracking session as follows:

./john -wordlist=passwords.lst -rules=single -stdout | python checkpass2.py -t target.pws -o results.txt

It also has some options like limiting the maximum number of guesses or starting a count at a specific number if you want to chain multiple cracking sessions together. There are certainly still a lot of improvements that need to be made to it, but if you like graphs I hope it might be useful to you. Please keep in mind that this isn't a password cracker. That is, it does not do any hashing of password guesses. So if you want to model a password cracking session against a hashed list you'll need to run two sessions: one to crack the list using the tool of your choice, and a second to use the checkpass2.py tool to model your cracking session against the cracked passwords. Since both John the Ripper and Hashcat have logging options you might want to consider using them instead to save time. Where checkpass2 is nice for me anyway is the fact that I can quickly edit the code depending on what I need, so it's easier to do things like format the output for what I'm doing. Long story short, I hope it is helpful but I still strongly recommend looking into the logging options that both John the Ripper and Hashcat offer.
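To make the idea concrete, here is a minimal sketch of what a guess evaluator of this kind does. This is not the actual checkpass2.py code; the function name model_session and its parameters are made up for illustration, and the real tool reads guesses from stdin and has more options than shown here.

```python
import io
from collections import Counter


def model_session(guesses, targets, max_guesses=None):
    """Replay a stream of guesses against a plaintext target list.

    Returns a list of (guess_number, total_cracked) pairs, one per guess
    that cracked at least one target password.
    """
    # Count duplicates so a popular password such as '123456' is credited
    # once for every account that used it.
    remaining = Counter(line.rstrip("\r\n") for line in targets)
    remaining.pop("", None)  # ignore blank lines in the target list
    cracked = 0
    progress = []
    for number, line in enumerate(guesses, start=1):
        if max_guesses is not None and number > max_guesses:
            break
        guess = line.rstrip("\r\n")
        hits = remaining.pop(guess, 0)  # a repeated guess cracks nothing new
        if hits:
            cracked += hits
            progress.append((number, cracked))
    return progress


# Tiny in-memory stand-ins for the piped guesses and the target file.
demo_targets = io.StringIO("123456\npassword\n123456\nmonkey1\n")
demo_guesses = io.StringIO("password\n123456\nletmein\n123456\n")
demo_progress = model_session(demo_guesses, demo_targets)
print(demo_progress)
```

The (guess number, passwords cracked) pairs are exactly what you would plot to get a cracking session graph. In a real pipeline you would pass sys.stdin as the guesses and an open file as the targets.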

Tool Deep Dive: PRINCE

2014-12-22T08:03:00.004-08:00

Tool Name: PRINCE (PRobability INfinite Chained Elements)
Version Reviewed: 0.12
Author: Jens Steube, (Atom from Hashcat)
OS Supported: Linux, Mac, and Windows
Password Crackers Supported: It is a command line tool so it will work with any cracker that accepts input from stdin

Blog Change History:

1/4/2015: Fixed some terminology after talking to Atom
1/4/2015: Removed a part in the Algorithm Design section that talked about a bug that has since been fixed in version 0.13
1/4/2015: Added an additional test with PRINCE and JtR Incremental after a dictionary attack
1/4/2015: Added a section for using PRINCE with oclHashcat

Brief Description:

PRINCE is a password guess generator and can be thought of as an advanced Combinator attack. Rather than taking as input two different dictionaries and then outputting all the possible two word combinations though, PRINCE only has one input dictionary and builds "chains" of combined words. These chains can have 1 to N words from the input dictionary concatenated together. So for example if it is outputting guesses of length four, it could generate them using combinations from the input dictionary such as:

4 letter word
2 letter word + 2 letter word
1 letter word + 3 letter word
1 letter word + 1 letter word + 2 letter word
1 letter word + 2 letter word + 1 letter word
1 letter word + 1 letter word + 1 letter word + 1 letter word
..... (You get the idea)
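If you want to see the full list of length patterns, they are just the ordered ways of splitting a number into positive parts. The sketch below enumerates them; the function name length_patterns is my own, and this only models the lengths of the elements in a chain, not how PRINCE itself builds or orders them.

```python
def length_patterns(total):
    """Yield every ordered way to split `total` characters into word lengths."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        # Pick the length of the first element, then split what is left.
        for rest in length_patterns(total - first):
            yield (first,) + rest


patterns = list(length_patterns(4))
for pattern in patterns:
    print(" + ".join(f"{n} letter word" for n in pattern))
print(len(patterns))
```

For guesses of length four there are eight such patterns, and the count doubles with every extra character, which is part of why the number of elements per chain needs an upper limit.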

Algorithm Design:

As of this time the source-code of PRINCE has not been released. Therefore this description is based solely on At0m's Passwords14 presentation, talking to At0m himself on IRC as well as running experiments with various small dictionaries using the tool itself and manually looking at the output.

As stated in the description, PRINCE combines words from the input dictionary to produce password guesses. The first step is processing the input dictionary. Feeding it an input dictionary of:

a
a

resulted in it generating the following guesses:

a
a
aa
aa
aaa
aaa
aaaa
aaaa
...(output cut to save space)

Therefore, it's pretty obvious that the tool does not perform duplicate detection when loading a file.

Finding #1: Make sure you remove duplicate words from your input dictionary *before* you run PRINCE
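One way to do that without losing the ordering of your dictionary, (which matters if it is sorted by popularity), is sketched below. The helper name unique_words is hypothetical, and this assumes the whole dictionary fits in memory.

```python
def unique_words(lines):
    """Drop repeated words while keeping the first occurrence's position."""
    stripped = (line.rstrip("\r\n") for line in lines)
    # dict preserves insertion order, so fromkeys() acts as an ordered set.
    return list(dict.fromkeys(word for word in stripped if word))


cleaned = unique_words(["a\n", "a\n", "bb\n", "a\n"])
print(cleaned)
```

To clean a real dictionary you would pass an open file object in and write the returned words back out, one per line.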

After PRINCE reads in the input dictionary it stores each word, (element), in a table consisting of all the words of the same length. PRINCE then constructs chains consisting of 1 to N different elements. Right now it appears that N is equal to eight, (confirmed when using the --elem-cnt-min option). It does this by setting up structures of the different tables and then filling them out. For example with the input dictionary:

a

It will generate the guesses:

a
aa
aaa
aaaa
aaaaa
aaaaaa
aaaaaaa
aaaaaaaa

This isn't to say that it won't generate longer guesses since elements can be longer than length 1. For example with the following input dictionary:

a
bb
BigInput

It generates the following guesses:

a
aa
bba
aabb
bbabb
bbbbbb
abbbbbb
BigInput
BigInputbb
bbBigInputbb
bb
aaa
...(output cut to save space)

Next up, according to slide 35 of the Passwords14 talk it appears that PRINCE should be sorting these chains according to keyspace. This way it can output guesses from the chains with the smallest keyspace first. This can be useful so it will do things like append values on the end of dictionary words before it tries a full exhaustive brute force of all eight character passwords. While this appears to happen to a certain extent, something else is going on as well. For example with the input dictionary:

a
b
cc

It would output the following results:

a
b
cc
cca
cccc
ccacc
cccccc
acccccc
cccccccc
cccccccccc
cccccccccccc
aa
ccb
acca
ccbcc
aacccc
bcccccc
aacccccc
.....(Lots of results omitted).....
aaaabbbb
baaabbbb
abaabbbb
bbaabbbb
aababbbb
bababbbb
abbabbbb
bbbabbbb
aaabbbbb
baabbbbb

This is a bit of a mixed bag. While it certainly saved the highest keyspace chains for the end, it didn't output everything in true increasing keyspace order since elements of length 1, (E1), had two items, while elements of length 2, (E2), only had one item, but it outputted E1 first. I have some suspicions that the order it outputs its chains is independent of how many items actually are in each element for that particular run, (aka as long as there is at least one item in each element, it is independent of your input dictionary). I don't have anything hard to back up that suspicion though beyond a couple of sample runs like the one above. Is this a problem? Quite honestly, I'm not really sure, but it is something to keep in mind. When I talked to Atom about this he said that password length compared to the average length of items in the training set also influenced the order in which chains were selected, so that may have something to do with it.

Finding #2: PRINCE is not guaranteed to output all chains in increasing keyspace order, though it appears to at least make an attempt to do so
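For reference, here is what a strict keyspace ordering would look like for the a, b, cc dictionary above. This is only a sketch of the idea from the slides, not PRINCE's actual selection logic, (which as noted also appears to weigh other factors). The names chain_keyspace and words_per_length are my own.

```python
from collections import Counter
from math import prod


def chain_keyspace(pattern, words_per_length):
    """Number of guesses a chain produces: the product of its table sizes."""
    return prod(words_per_length.get(length, 0) for length in pattern)


words = ["a", "b", "cc"]
words_per_length = Counter(len(word) for word in words)  # E1 has 2 items, E2 has 1

# A handful of chains, written as the lengths of their elements.
patterns = [(1,), (2,), (1, 1), (2, 1), (1, 2), (2, 2)]
ordered = sorted(patterns, key=lambda p: chain_keyspace(p, words_per_length))
for pattern in ordered:
    print(pattern, chain_keyspace(pattern, words_per_length))
```

Under a strict ordering the E2 chain, (keyspace 1), would come out before the E1 chain, (keyspace 2), which is the opposite of what the tool actually printed.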

Additional Options:

--elem-cnt-max=NUM: This limits the number of elements that can be combined to NUM. Aka if you set NUM to 4, then it can combine up to 4 different elements. So if you had the input word 'a' it could generate 'aaaa' but not 'aaaaa'. This may be useful to limit some of the brute forcing it does.

The rest of the options are pretty self explanatory. One request I would have is for PRINCE to save its position automatically, or at least print out the current guess number when it is halted, to make it easier to restart a session by using the "--skip=NUM" option.

Performance:

PRINCE was written by Atom so of course it is fast. If you are using a CPU cracker it shouldn't have a significant impact on your cracking session even if you are attacking a fast hash. For comparison's sake, I ran it along with JtR's incremental mode on my MacBook Pro.

Prince:
run laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/passwords_top10k.txt | ./john --format=raw-sha1-linkedin -stdin one_hash.txt
Loaded 1 password hash (Raw SHA-1 LinkedIn [128/128 SSE2 intrinsics 8x])
guesses: 0 time: 0:00:02:00 c/s: 1895K trying: asdperkins6666 - bobperkins

JtR Incremental Mode:

run laki$ ./john -incremental=All -stdout | ./john --format=raw-sha1-linkedin -stdin one_hash.txt

Loaded 1 password hash (Raw SHA-1 LinkedIn [128/128 SSE2 intrinsics 8x])

guesses: 0 time: 0:00:00:14 c/s: 2647K trying: rbigmmi - rbigm65

Using PRINCE with OCLHashcat:

Below is a sample screen shot of me using PRINCE as input for OCLHashcat on my cracking box, (it has a single HD7970 GPU). Ignore the --force option as I had just installed an updated video card driver and was too lazy to revert back to my old one that OCLHashcat supports. I was also too lazy to boot into Linux since I was using Excel for this post and my cracking box also is my main computer...

What I wanted to point out was that for a fast hash, (such as unsalted SHA1 in this case), since PRINCE is not integrated into OCLHashcat it can't push guesses fast enough to the GPU to take full advantage of the GPU's cracking potential. In this case, the GPU is only at around 50% utilization. That is a longer way of saying that while you can still totally make use of OCLHashcat when using PRINCE, it may be advantageous to also run dictionary based rules on the guesses PRINCE generates. Since those dictionary rules are applied on the GPU itself you can make a lot more guesses per second to take full advantage of your cracking hardware. This is also something Atom recommends and he helpfully included two different rulesets with the PRINCE tool itself.

Side note: PRINCE plows through the LinkedIn list pretty effectively. To get the screenshot above I had to run the cracking session twice since otherwise the screen would have been filled with cracked passwords.

Big Picture Analysis:

The main question of course is how does this tool fit into a cracking session? Atom talked about how he saw PRINCE as a way to automate password cracking. The closest analogy would be John the Ripper's default behavior where it will start with Single Crack Mode, (lots of rules applied to a very targeted wordlist), move on to Wordlist mode, (basic dictionary attack), and then try Incremental mode, (smart bruteforce). Likewise with PRINCE, depending on how you structure your input dictionary it can act as a standard dictionary attack, (appending/prepending digits to input words for example), combinator attack, (duh), and pure brute force attack, (trying all eight character combos). It can even do a limited passphrase attack though it gets into "Correct Horse Battery Staple" keyspace issues then. For example, with the input dictionary of:

Correct
Horse
Battery
Staple

It will generate all four word combinations such as:

HorseHorseCorrectBattery
HorseHorseBatteryBattery
HorseCorrectCorrectHorse
HorseBatteryCorrectHorse
HorseCorrectBatteryHorse
HorseBatteryBatteryHorse
CorrectCorrectHorseHorse
BatteryCorrectHorseHorse
CorrectBatteryHorseHorse
BatteryBatteryHorseHorse

When talking about passphrase attacks then, keep in mind it doesn't have any advanced logic so you are really doing a full keyspace attack of all the possible combinations of words.
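The keyspace math here is simple but worth spelling out: with repeats allowed, a dictionary of n words yields n^k chains of k words. The sketch below checks that for the four word dictionary above; chain_count is a made-up helper name, and the 10,000 word figure is just an illustrative dictionary size.

```python
from itertools import product

words = ["Correct", "Horse", "Battery", "Staple"]

# Every ordered four word combination, repeats allowed.
four_word_guesses = ["".join(combo) for combo in product(words, repeat=4)]
print(len(four_word_guesses))


def chain_count(dictionary_size, words_in_chain):
    """Size of the full keyspace for chains of a fixed number of words."""
    return dictionary_size ** words_in_chain


print(chain_count(len(words), 4))
print(chain_count(10_000, 4))
```

Four words give only 256 combinations, but a modest 10,000 word dictionary already gives 10^16 four word chains, which is why this only works as a "limited" passphrase attack.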

The big question then is how does it compare against other attack modes when cracking passwords? You know what this means? Experiments and graphs!

I decided I would base my first couple of comparisons using the demos Atom had listed in his slides as a starting point. I figure no-one would know how to use PRINCE better than he would. Note: these are super short runs. While I could explain that away by saying this simulates targeting a slow hash like bcrypt, the reality is Atom made some noticeable changes in PRINCE while I was writing this post, (yay slow update schedule). I figured it would be good to make some quick runs with the newer version to get a general idea of how PRINCE performs and then post a more realistic length run at a later time. Also, this way I can get feedback on my experiment design so I don't waste time running a longer cracking session on a flawed approach.

Experiment 1) PRINCE, Hashcat Markov mode, and JtR Incremental mode targeting the MySpace list

Experiment Setup:
The input dictionary for PRINCE was the top 100k most popular passwords from the RockYou list, as this is what Atom used. For Hashcat I generated a stats file on the full RockYou list and used a limit of 16. For JtR I ran the default Incremental mode using the "All" character set. The target list was the old MySpace list. The reason why I picked that vs the Stratfor dataset which Atom used was simply because there are a ton of computer generated passwords, (aka default passwords assigned to users), in the Stratfor dataset so it can be a bit misleading when used to test against.

Cracking Length: 1 billion guesses

Commands used:
laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

laki$ ../../../John/john-1.7.9-jumbo-7/run/john -incremental=All -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

laki$ ../../../hashcat/statsprocessor-0.10/sp64.app --threshold=16 ../../../hashcat/statsprocessor-0.10/hashcat.hcstat | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

Experiment Results:

Click on the graph for a zoomed in picture. As you can see, PRINCE did really well starting out but then quickly became less effective. This is because it used most, (if not all), of the most common words in the RockYou list first so it acted like a normal dictionary attack. At the same time, Incremental Mode was starting to catch up by the end of the run. While I could continue to run this test over a longer cracking session, this actually brings up the next two experiments....

Experiment 2) PRINCE and Dictionary Attacks targeting the MySpace list

Experiment Setup:
This is the same as the previous test targeting the MySpace dataset, but this time using dictionary attacks. For JtR, I stuck with the default ruleset and the more advanced "Single" ruleset. I also ran a test using Hashcat and the ruleset Atom included along with PRINCE, (prince_generated.rule). For all the dictionary attacks, I used the RockYou top 100k dictionary to keep them comparable to the PRINCE attack.

Cracking Length: I gave each session up to 1 billion guesses, but the two JtR attacks were so short that I only displayed the first 100 million guesses on the graph so they wouldn't blend in with the Y-axis. The Hashcat attack used a little over 700 million guesses, so I annotated its final results on the graph. Side note, (and this merits another blog post), Hashcat performs its cracking sessions using word order, vs JtR's rule order. I suspect this is to make Hashcat faster when cracking passwords using GPUs. You can read about the difference in those two modes in one of my very first blog posts back in the day. What this means is that Hashcat's cracking sessions tend to be much less front loaded unless you take the time to run multiple cracking sessions using smaller mangling rulesets.
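The difference between the two orderings boils down to which loop is on the outside. Here is a toy sketch, with plain Python functions standing in for mangling rules, (these are not real JtR or Hashcat rule syntax, and the function names are my own):

```python
def rule_order(words, rules):
    """Apply one rule to the whole wordlist before moving to the next rule."""
    for rule in rules:
        for word in words:
            yield rule(word)


def word_order(words, rules):
    """Apply every rule to one word before moving to the next word."""
    for word in words:
        for rule in rules:
            yield rule(word)


words = ["password", "monkey"]
rules = [lambda w: w, lambda w: w + "1", str.capitalize]

print(list(rule_order(words, rules)))
print(list(word_order(words, rules)))
```

Both produce the same set of guesses, just in a different sequence. With rule order the most productive rules, (assuming they are listed first), hit every word early in the session, which is what makes those sessions look front loaded on a graph.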

Commands used:
laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

laki$ ../../../John/john-1.7.9-jumbo-7/run/john -wordlist=../../../dictionaries/Rockyou_top_100k.txt -rules=wordlist -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

laki$ ../../../John/john-1.7.9-jumbo-7/run/john -wordlist=../../../dictionaries/Rockyou_top_100k.txt -rules=single -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

laki$ ../../../hashcat/hashcat-0.48/hashcat-cli64.app --stdout -a 0 -r ../../../Tools/princeprocessor-0.12/prince_generated.rule ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000

Experiment Results:

As you can see, all of the dictionary attacks performed drastically better than PRINCE over the length of their cracking sessions. That's to be expected since their rulesets were crafted by hand while PRINCE generates its rules automatically on the fly. I'd also like to point out that once the normal dictionary attacks are done, PRINCE keeps on running. That's another way of saying that PRINCE still has a role to play in a password cracking session even if standard dictionary attacks initially outperform it. All this test points out is if you are going to be running a shorter cracking session you would be much better off running a normal dictionary based attack instead of PRINCE. This does lead to my next question and test though. After you run a normal dictionary attack, how does PRINCE do in comparison to a Markov brute force based attack?

Experiment 3) PRINCE and JtR Wordlist + Incremental mode targeting the MySpace list

Experiment Setup:
Based on feedback from Atom I decided to restructure this next test. First of all, Atom recommended using the full Rockyou list as an input dictionary for PRINCE. Since that is a larger input dictionary than just the first 100k most frequent passwords, I re-ran JtR's single mode ruleset against the MySpace list using the full Rockyou dictionary as well. I also used the most recent version of JtR, 1.8-jumbo1 based on the recommendation of SolarDesigner. This cracked a total of 23,865 passwords from the MySpace list, (slightly more than 64%). I then ran PRINCE, (the newer version 0.13) with the full RockYou dictionary, (ordered), and JtR Incremental=UTF8, (equivalent to "ALL" in the older version of JtR), against the remaining uncracked passwords. I also increased the cracking time to 10 billion guesses.

Side note: I ran a third test with PRINCE using the RockYou top 100k input dictionary as well since the newer results were very surprising. I'll talk about that in a bit...

Cracking Length: 10 billion guesses

Commands used:
laki$ ../../../John/john-1.8.0-jumbo-1/run/john -wordlist=../../../dictionaries/Rockyou_full_ordered.txt -rules=single -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -u uncracked_myspace.txt

laki$ ../../../Tools/princeprocessor-0.13/pp64.app < ../../../dictionaries/Rockyou_full_ordered.txt | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865

laki$ ../../../Tools/princeprocessor-0.13/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865

laki$ ../../../John/john-1.8.0-jumbo-1/run/john -incremental=UTF8 -stdout | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865

Experiment Results:

I'll guiltily admit before running this test I hadn't been that impressed with PRINCE. That's because I had been running it with the top 100k RockYou dictionary. As you can see, with the smaller dictionary it performed horribly. When I ran the new test with the full RockYou dictionary though, PRINCE did significantly better than an Incremental brute force attack. Yes, cracking 1.5% more of the total set might not seem like much, but it will take Incremental mode a *long* time to catch up to that. Long story short though, PRINCE's effectiveness is extremely dependent on the input dictionary you use for it.

Like most surprising test results, this opens up more questions than it answers. For example, what exactly is going on with PRINCE to make it so much more effective with the new dictionary? My current hypothesis is that it is emulating a longer dictionary attack, but I need to run some more tests to figure out if that's the case or not. Regardless, these results show that PRINCE appears to be a very useful tool to have in your toolbox if you use the right input dictionary for it.

Current Open Questions:

What is the optimal input dictionary to use for PRINCE? Yes the full RockYou input dictionary does well but my gut feeling is we can do better. That leads me to the next open question...
Can we make PRINCE smarter? Right now it transitions between dictionary attacks and brute force automatically, but beyond sorting the chains by keyspace it doesn't have much advanced logic in it. Perhaps if we can better understand what makes it effective we can make a better algorithm that is even more effective than PRINCE.

Other References: