tag:blogger.com,1999:blog-4964515364938053712024-03-12T15:56:44.137-07:00Reusable SecurityPassword Cracking, Crypto, and General Security ResearchMatt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.comBlogger95125tag:blogger.com,1999:blog-496451536493805371.post-43489082490150727052023-11-08T20:05:00.006-08:002023-11-08T20:13:48.364-08:00Jupyter Lab Framework Example: Revisiting CMIYC2022<p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBc5CIftOkjEYeVfN9CjBIr72UfY5n9hHfHAgkZpJxd9mPGr07fvL7-CQhhXP271nlXjun4T-2V8GTYD-0Eal58oSlsfBybUOVIHeCeBqhgbU0307URgMkr0jVRrGoAgmAS04t7GDSzXhGch6CsyxpnB8xJiQeuVQNW79UFq3u5VdLRmQZt_ZnuaRqJlk/s1456/jtr_hashcat_cmiyc2022.png" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img alt="Midjourney generated picture of a cat in a tophat thowing playing cards" border="0" data-original-height="832" data-original-width="1456" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhBc5CIftOkjEYeVfN9CjBIr72UfY5n9hHfHAgkZpJxd9mPGr07fvL7-CQhhXP271nlXjun4T-2V8GTYD-0Eal58oSlsfBybUOVIHeCeBqhgbU0307URgMkr0jVRrGoAgmAS04t7GDSzXhGch6CsyxpnB8xJiQeuVQNW79UFq3u5VdLRmQZt_ZnuaRqJlk/s16000/jtr_hashcat_cmiyc2022.png" title="Going back to Vegas 2022!" /></a></p><div><br /></div><div></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><i><b>Everything that happens once can never happen again. But everything that happens twice will surely happen a third time.</b></i><br />-- Paulo Coelho</blockquote><h2 style="text-align: left;">Introducing the JupyterLab Password Cracking Framework: </h2><div>For the last couple of months, I've been (slowly) working on building out a new backend/framework to be able to manage password cracking sessions using JupyterLab as the frontend/GUI. 
The current version of this framework is available <a href="https://github.com/lakiw/Jupyter-Password-Cracking-Framework">[here]</a>.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEkhoCBmj2qkmUHgwacPeYdnqu3I8um3rpdpkw27ZlcsjtYb3-UOPCkW3JKiFTNl0zzWd9ZEqUwl6vhPldG8NI9B5gqQ1x4OqM0_pXthTPaumhS_JIJ3XK40cGmDz8nJ9lngAM94xjbOatGHR0znw-ZtufmQmKxUyDOvThnYZP5-UG1PMRGhytcnyRAao/s1947/quick_screenshot.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Screenshot of the JupyterLab Framework" border="0" data-original-height="1220" data-original-width="1947" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhEkhoCBmj2qkmUHgwacPeYdnqu3I8um3rpdpkw27ZlcsjtYb3-UOPCkW3JKiFTNl0zzWd9ZEqUwl6vhPldG8NI9B5gqQ1x4OqM0_pXthTPaumhS_JIJ3XK40cGmDz8nJ9lngAM94xjbOatGHR0znw-ZtufmQmKxUyDOvThnYZP5-UG1PMRGhytcnyRAao/s16000/quick_screenshot.png" title="It's like an interactive wiki for your cracking sessions" /></a></div><br /><div>This project is under active development (well, active for me anyway), and I'd really appreciate feedback and suggestions on how to extend and improve it. My goal is to have an open source, community-driven alternative to Team Hashcat's List Condense (LC) collaboration server ready by CMIYC2024.</div><div><h2>About The Framework:</h2></div><div>I view JupyterLab as a stone soup. It provides a good interface, an interactive Python debugger, and a way to save and share analysis results. But it is still up to you to do all of the backend analysis. That became very evident when I used JupyterLab in the CMIYC2023 contest (as detailed in the previous three blog posts on this site). I spent a lot of time debugging my code and messing with data structures when I really needed to be focused on cracking passwords. 
I quickly realized that for this approach to be effective in future password cracking contests, I'd need to develop back-end data structures and classes to better organize all the data and provide built-in features to aid in common tasks.</div><div><br /></div><div>Backing up a bit, in the past I've really enjoyed using MITRE CRITs (Collaborative Research Into Threats) <a href="https://crits.github.io/">[Link]</a>. Unfortunately CRITs is no longer being maintained (the switch from Python2 to Python3 killed the project), but it was a tool for CSOC (Cyber Security Operation Center) team members to collaborate with each other when analyzing intrusion sets and threat actors. CRITs organized Top Level Objects (TLO) into the following buckets/categories:</div><div><ul style="text-align: left;"><li>Actors</li><li>Campaigns</li><li>Certificates</li><li>Domains</li><li>Emails</li><li>Events</li><li>Indicators</li><li>IPs</li><li>PCAPs</li><li>Raw Data</li><li>Samples</li><li>Targets</li></ul>It then made the data in these different buckets cross-linkable as well as accessible to various plugins. Following that approach, I created several different TLOs for this password cracking framework:</div><div><ul style="text-align: left;"><li>Hashes: Contains information about the hashes including plaintext values, hash types, etc.</li><li>Targets: Contains information about particular targets/users and metadata</li><li>Sessions: Contains information about a cracking session. The closest CRITs equivalent would be Campaigns.</li><li>PWCrackerMgr: Not really a data structure, but a way to translate between Hashcat and John the Ripper cracking sessions</li></ul><div>In the future as this toolset becomes more developed, I may end up taking a lot of the metadata out of Targets and putting it into its own TLO much like CRITs did. 
Also longer term, I may end up incorporating disk storage or a database, but for now I'm keeping this focused on helping with password cracking competitions (vs. activities like pen-testing and managing ALL my password lists). You can download the framework right now, and it already has some example Notebooks in it that show how to use it on the CMIYC2023 challenges.</div></div><div><br /></div><div>What this framework WILL NOT do is crack hashes or run your actual cracking sessions. I may add some scripting/logging support in the future, but this framework is focused on helping with data analysis as well as automating some of the busywork/repetitive tasks in password cracking competitions, such as creating custom wordlists and left lists, translating between JtR and Hashcat, and submitting hashes.</div><div><h2>Using the Framework to Crack CMIYC2022 Challenges: </h2></div><div>A big question I have is how effective this framework will be in the <b>NEXT</b> password cracking challenge. Since I can't predict what KoreLogic is going to do (beyond the fact that Bon Jovi references will somehow be involved), probably the best thing I can do is look back at past competitions. To that end I figured I might dig up the 2022 contest hashes and attempt to crack them again.</div><div><br /></div><div>To obtain the challenge files, you can visit KoreLogic's 2022 contest page <a href="https://contest-2022.korelogic.com/downloads.html">[here]</a>.</div><div><br /></div><div>Side note: Mad props to KoreLogic for continuing to host past contest files. I really appreciate it!</div><div><br /></div><div>Also, I had written a short blog post about my experiences in the contest, which is available <a href="https://reusablesec.blogspot.com/2022/08/more-password-cracking-tips-defcon-2022.html">[here]</a>.</div><div><br /></div><div>I'll admit, it's always hard looking back on past work/documentation, but I'm a bit annoyed with my past self. 
I think that blog post has a lot of good information in it, but it really doesn't go into too much detail about the contest itself. On the plus side, that will make using this framework for this challenge more "realistic" since I can't just rely on my past documentation, and I have very hazy memories of what happened over a year ago.</div><h3 style="text-align: left;">Unpacking the Contest Files:</h3><div>The CMIYC2022 contest had three file "drops" over the course of the challenge. Aka not all the hashes were released at the start, so this gave teams something constantly new to bang their heads against.</div><div><br /></div><div>The contest files are PGP encrypted and as a player you need to decrypt them with the password that KoreLogic provided. Since the first thing I always do is Google "How do you decrypt PGP files" I'm going to put the command here for future me. Also this will hopefully disabuse you early on the false idea that I know what I'm doing ;p</div><div><ul style="background-color: white; color: #212121; font-family: Roboto, sans-serif; font-size: 15px;"><li>gpg -o <output_file> -d <input_file>.pgp</li></ul><div><span face="Roboto, sans-serif" style="color: #212121;"><span style="font-size: 15px;">Do that for all three files, saving them with .tgz extensions. Then unzip the files using the command:</span></span></div></div><div><ul style="text-align: left;"><li>tar -xvzf <input_file>.tgz</li></ul><div>This creates three different directories filled with different encrypted file types. 
These are:</div></div><div><ul style="text-align: left;"><li><b>cmiyc-2022_street_1/</b></li><ul><li>list18-Thursday17January2021.odt</li><li>list17-TOWMINTP.hashes.gpg</li><li>list19-paidanextra500000.zip</li><li>list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z</li><li>list24-ThisYearsWorst.pdf</li><li>list16-FL_kdIZUGpI.zip</li></ul><li><b>cmiyc-2022_street_2/</b></li><ul><li>DEFCON-Street.kdbx</li></ul><li><b>cmiyc-2022_street_3/</b></li><ul><li>rar.sdrawkcab</li><li>1991whattimeisit.tgz</li></ul></ul><h3>Cracking the Encrypted File Containers (Part 1: File Extraction):</h3></div><div>This competition is starting to come back to me. The main issue with this challenge was figuring out how to crack the various encrypted file types. Once you cracked a top-level file, it would present you with an internal hash list of fast-to-compute hashes that you need to crack for actual points. For the street teams, the top-level files are encrypted with fairly easy-to-guess passwords. For the pro teams ... not as much.</div><div><br /></div><div>In my previous writeup <a href="https://reusablesec.blogspot.com/2022/08/more-password-cracking-tips-defcon-2022.html">[here]</a> the first two "Tips" cover how to set up John the Ripper to crack these files. I'm going to largely skip those tips here, but assuming you followed them, the following commands can be used to extract the hashes for the above files and save them to a single file you can crack. I'm appending them all to an "encrypted_file_hashes.hash" file that I can load into JtR. If you want to use Hashcat to crack the hashes instead, you'll need to do some additional fixups to remove things like the username (or run hashcat with the <b style="background-color: white; color: #212121; font-family: Roboto, sans-serif; font-size: 15px;">--username</b> flag). 
Side note, I also highly recommend checking out <a href="https://miloserdov.org/?p=5191#47">[this]</a> external blog entry about using John the Ripper to crack different file formats. I heavily leveraged it since it highlights details such as needing libreoffice2john to crack .odt files.</div><div><ul style="text-align: left;"><li>list16-FL_kdIZUGpI.zip</li><ul><li>zip2john list16-FL_kdIZUGpI.zip >> hash_files/encrypted_file_hashes.hash</li></ul><li>list17-TOWMINTP.hashes.gpg</li><ul><li>gpg2john list17-TOWMINTP.hashes.gpg >> hash_files/encrypted_file_hashes.hash</li></ul><li>list18-Thursday17January2021.odt</li><ul><li>libreoffice2john.py list18-Thursday17January2021.odt >> hash_files/encrypted_file_hashes.hash</li></ul><li>list19-paidanextra500000.zip</li><ul><li>zip2john list19-paidanextra500000.zip >> hash_files/encrypted_file_hashes.hash</li></ul><li>list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z</li><ul><li>7z2john.pl list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z >> hash_files/encrypted_file_hashes.hash</li></ul><li>DEFCON-Street.kdbx</li><ul><li>keepass2john DEFCON-Street.kdbx >> hash_files/encrypted_file_hashes.hash</li></ul><li>rar.sdrawkcab</li><ul><li>rar2john rar.sdrawkcab >> hash_files/encrypted_file_hashes.hash</li></ul><li>1991whattimeisit.tgz</li><ul><li>Trick challenge here! You can unzip/untar this with normal commands. Aka:</li><ul><li>gunzip 1991whattimeisit.tgz</li><li>tar -xvf 1991whattimeisit.tar</li></ul><li>The real challenge is to decrypt a gocrypt message which isn't directly supported by either John the Ripper or Hashcat.</li><li>I'm going to skip this challenge for now. I have vague memories of downloading a gocrypt command line utility and writing a quick script to pass passwords into it from an external file. 
But since I'm focusing on the JupyterLab Framework, that yak shaving task is out of scope of this writeup.</li></ul><li>list24-ThisYearsWorst.pdf</li><ul><li>pdf2john.pl list24-ThisYearsWorst.pdf >> hash_files/encrypted_file_hashes.hash</li></ul></ul></div><div><h3>Cracking the Encrypted File Containers (Part 2: JtR Cracking):</h3></div><div>While I could certainly load these hashes into the JupyterLab Framework ... I don't see a lot of value doing so as there isn't a lot of "analysis" to do on them. Perhaps if/when I add the ability to keep track of targeted attacks, wordlists, and mangling rules run against a hash then it will make sense, but for now let's just crack these hashes so we can get access to the larger hash lists which we will be able to leverage the current JupyterLab Framework against. Below I'm going to list the JtR command I used (including mode) to crack the password as well as the plaintext. To help with spoilers, I'm setting the background of the plaintext to be black, but to read it you can just highlight the text and copy/paste it somewhere else.</div><div><ul><li>list16-FL_kdIZUGpI.zip</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=pkzip encrypted_file_hashes.hash</li><li>Note: As I said, the street file passwords are pretty easy to guess...</li><li><span style="background-color: black;">Hackers</span></li></ul><li>list17-TOWMINTP.hashes.gpg</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=gpg --wordlist=../wordlists/hacker_movies.txt --rules=Single encrypted_file_hashes.hash</li><li>Note: This one required a non-default attack, partially because GPG is such a slow format to make guesses against, and partially since the base word wasn't in JtR's default wordlist. 
Based on the other hashes I cracked, I figured it'd be a Vegas-themed password or something from a hacker movie, so I made a small wordlist based on that.</li><li><span style="background-color: black;">DEFCON</span></li></ul><li>list18-Thursday17January2021.odt</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=odf --wordlist=password.lst --rules=":c;:u" encrypted_file_hashes.hash</li><li>Note: This was a slow enough hash that I couldn't do all the rules in Single mode in a reasonable timeframe. Also the base word wasn't in my targeted dictionary. But given the other cracked passwords I ran two rules against the default JtR dictionary on the command line (Capitalize and Uppercase). I wrote more about how to specify rules on JtR's command line <a href="https://reusablesec.blogspot.com/2022/05/password-cracking-tips-crackthecon.html">[here]</a>.</li><li><span style="background-color: black;">Sunday</span></li></ul><li>list19-paidanextra500000.zip</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=pkzip encrypted_file_hashes.hash</li><li>Note: I cracked this one in the same session as list16, which is why it is nice to save all these hashes to the same hash file</li><li><span style="background-color: black;">Swordfish</span></li></ul><li>list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=7z --wordlist=password.lst --rules=":c;:u" encrypted_file_hashes.hash</li><li>Note: Same constraints and attack as I ran against list18.</li><li><span style="background-color: black;">Queen</span></li></ul><li>DEFCON-Street.kdbx</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=KeePass encrypted_file_hashes.hash</li><li>Note: This attack froze up my laptop when I tried to run it under WSL, so I copied it over to my server, where it ran a lot better, with an almost instant crack. 
The funny part was, ((Spoiler Alert)) the "username" attack in JtR's Single mode cracked the password and not a normal dictionary attack</li><li><span style="background-color: black;">Street</span></li></ul><li>rar.sdrawkcab</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=RAR5 encrypted_file_hashes.hash</li><li>Note: I'll admit I got thrown a bit by the format (trying --format=rar first) until I looked at the saved hash. Otherwise this was a simple crack with a default attack</li><li><span style="background-color: black;">drowssap</span></li></ul><li>1991whattimeisit.tgz</li><ul><li>Didn't try this one for this writeup</li></ul><li>list24-ThisYearsWorst.pdf</li><ul><li>john --pot=../pot_files/cmiyc2020_john.pot --format=pdf encrypted_file_hashes.hash</li><li>Note: ((Spoiler)) Surprisingly it wasn't the username attack that got this one. The base word was in password.lst, which is JtR's default wordlist.</li><li><span style="background-color: black;">Worst</span></li></ul></ul><div><h3>Extracting the "REAL" Hash Lists:</h3></div></div><div>The next step is to decrypt/unzip/open all the files and save the hashes from them. You may laugh, but I always forget the command line options to do this, so I'm listing how I did that below for future me. I'm also listing what types of hashes were in each file. To determine the hash type I mostly relied on looking at the scoreboard that KoreLogic published <a href="https://contest-2022.korelogic.com/stats.html">[link]</a> and matching it to the hash. 
If that wasn't provided though, there would be several ways to figure out the hash such as using mdxfind.<br /><ul style="text-align: left;"><li>list16-FL_kdIZUGpI.zip</li><ul><li>unzip list16-FL_kdIZUGpI.zip</li><li><b>Contains:</b> 2766 half-md5 hashes</li><li><b>JtR Mode:</b> Not supported</li><li><b>HC Mode:</b> 5100</li></ul><li>list17-TOWMINTP.hashes.gpg</li><ul><li>gpg -d list17-TOWMINTP.hashes.gpg > ../../hash_files/list17.txt</li><li>Note: I needed to pipe the results to a file to save them</li><li><b>Contains:</b> 2933 raw-md5 hashes</li><li><b>JtR Mode:</b> raw-md5</li><li><b>HC Mode: </b>0</li></ul><li>list18-Thursday17January2021.odt</li><ul><li>Open it on a Linux system using LibreOffice and paste the hashes into list18.hash</li><li><b>Contains:</b> 5456 raw-sha1 hashes</li><li><b>JtR Mode:</b> raw-sha1</li><li><b>HC Mode: </b>100</li></ul><li>list19-paidanextra500000.zip</li><ul><li>unzip list19-paidanextra500000.zip</li><li><b>Contains:</b> 4997 raw-sha256 hashes</li><li><b>JtR Mode:</b> raw-sha256</li><li><b>HC Mode: </b>1400</li></ul><li>list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z</li><ul><li>7z x list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.7z</li><li><b>Contains:</b> 10004 raw-sha384 hashes</li><li><b>JtR Mode:</b> Raw-SHA384</li><li><b>HC Mode: </b>10800</li></ul><li>DEFCON-Street.kdbx</li><ul><li>Open the file in keepassx and then paste contents in list21.hashes</li><li>Note: There is a command line version but that became too big of a pain to figure out how to export the hashes properly</li><li><b>Contains: </b>10812 mssql05 hashes</li><li><b>JtR Mode:</b> mssql05</li><li><b>HC Mode: </b>132</li></ul><li>rar.sdrawkcab</li><ul><li>unrar x rar.sdrawkcab</li><li><b>Contains: </b>4214 mysql CRAM hashes</li><li><b>JtR Mode:</b> mysqlna</li><li><b>HC Mode: </b>11200</li></ul><li>list24-ThisYearsWorst.pdf</li><ul><li>Open in a PDF viewer and copy/paste the hashes into list24.hashes</li><li><b>Contains:</b> 
2000 SSHA/nsldaps hashes</li><li><b>JtR Mode:</b> Salted-SHA1</li><li><b>HC Mode: </b>111</li></ul></ul><div><div><h3>Creating a Config For the JupyterLab Framework:</h3></div><div>Now that we have tens of thousands of password hashes to crack, it's time to use the JupyterLab password cracking framework. To do this, we'll first need to load the hashes into it, and to do that we'll need to configure the config file.</div></div></div><div><br /></div><div>The framework uses the YAML file format for its configs. That means spaces/whitespace is important, but it is also a pretty flexible file format. For our config, we'll define our JtR and hashcat potfile locations, how much the hashes are worth, and where and how to load the hashes. I'll be including <b># Comments</b> as well to help explain why I'm doing what I'm doing.</div><div><br /></div><div><div>---</div><div><div># This defines where the potfiles are for this contest. This lets you load cracked hashes from them</div><div># as well as keep your JtR and HC potfiles synced.</div></div><div><br /></div><div><div> jtr_config:</div><div> main_pot_file: "./challenge_files/CMIYC2022_Street/jtr_cmiyc2022.pot"</div><div> </div><div> hashcat_config:</div><div> main_pot_file: "./challenge_files/CMIYC2022_Street/hc_cmiyc2022.potfile"</div></div><div><br /></div><div># This is information on where to load the challenge files. If they have additional metadata you</div><div># may need to write a custom function to import them, but in this case they are pure raw hash lists</div><div># so we can use the "plain_hash" plugin to import them.</div><div>#</div><div># This plugin requires the hash type to be specified (if not it will default to "unknown"). Typically</div><div># I use the JtR naming format for the hash type. The "source" field is used to list a source for the</div><div># hashes in the framework. 
Basically it's a note for you later when looking at the cracked hashes.</div><div># </div><div> challenge_files:</div><div> list14:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list14-4214-BrunnersMentalPrisoner.hashes"</div><div> format: "plain_hash"</div><div> type: "mysqlna"</div><div> source: "list14-BrunnersMentalPrisoner"</div><div> list16:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list16-FL_kdIZUGpI.txt"</div><div> format: "plain_hash"</div><div> type: "half-md5"</div><div> source: "list16-FL_kdIZUGpI"</div><div> list17:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list17.txt"</div><div> format: "plain_hash"</div><div> type: "raw-md5"</div><div> source: "list17"</div><div> list18:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list18.hash"</div><div> format: "plain_hash"</div><div> type: "raw-sha1"</div><div> source: "list18"</div><div> list19:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list19-paidanextra500000.hashes"</div><div> format: "plain_hash"</div><div> type: "raw-sha256"</div><div> source: "list19-paidanextra500000"</div><div> list20:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list20-Authoritiesappeartohaveuncoveredavastnefariousconspiracy.hashes"</div><div> format: "plain_hash"</div><div> type: "raw-sha384"</div><div> source: "list20-Authoritiesappearto"</div><div> list21:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list21.hashes"</div><div> format: "plain_hash"</div><div> type: "mssql05"</div><div> source: "list21"</div><div> list24:</div><div> file: "./challenge_files/CMIYC2022_Street/sample_hashes/list24.hashes"</div><div> format: "plain_hash"</div><div> type: "ssha"</div><div> source: "list24"</div><div><br /></div><div><div># The score info is taken from the Korelogic scoreboard. 
This isn't necessary, but it is nice to have</div><div># a local count of what your score should be so you can compare it to the official score to validate that</div><div># you are submitting your cracks properly</div></div><div> score_info:</div><div> raw-sha384: 46</div><div> mysqlna: 17</div><div> raw-sha256: 13</div><div> mssql05: 9</div><div> raw-sha1: 5</div><div> ssha: 5</div><div> half-md5: 3</div><div> raw-md5: 1</div></div><div><div><h3>Loading the Challenge Files into JupyterLab Framework:</h3></div><div>This is the easy part for these hashes. The biggest challenge is setting up the config file. Once that's done you can just load it up in the current framework. At this point I should mention that I'll be eventually including the Notebook in this blog post in the example files in the framework github repo.</div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRMeBn4EhNoa3EJDCU1yMPmqbm0TO7UTuzFqhIGRLFlJ6fIBp_FT4QuO-IbZ2AX0pNy2hJkTRY7xlkcW7iJEJi5VHvlX6KAPi2riWJ5ezjn5nhWkjxM_Y-DtUkq1SAABr_hGM5TcYaAt9fu0Nnzoa3oCiFW5payOTV1J-pDc4SwoWsRBJ_WDCmMQ9eIKI/s1328/loading_hashes.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Loading the config/hashes in the Jupyter Framework" border="0" data-original-height="469" data-original-width="1328" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgRMeBn4EhNoa3EJDCU1yMPmqbm0TO7UTuzFqhIGRLFlJ6fIBp_FT4QuO-IbZ2AX0pNy2hJkTRY7xlkcW7iJEJi5VHvlX6KAPi2riWJ5ezjn5nhWkjxM_Y-DtUkq1SAABr_hGM5TcYaAt9fu0Nnzoa3oCiFW5payOTV1J-pDc4SwoWsRBJ_WDCmMQ9eIKI/s16000/loading_hashes.png" title="Easy Peesy Lemon Squeezy" /></a></div><br /><div>Once it is loaded you can run the built in tools to merge Hashcat and JtR potfiles, display cracked passwords, and calculate your expected score.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a 
href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUpoVadaV1CSvZwd76EfX7OiqkY3qHLWQOTNOPUM4BVrdYC4i0E2tG5Ui1mXrDsGkuLYWB-iMeq7LNejCu0zNVyJWtEkQH6VBgc611_Eo4W54b_G9U3mS3E6l-LUVCFrwKG-QxxoeuxIrL8vJHkCG3JN0Zu8s7KDoxyMUEHe4ZaIFkYfLtp0xSjsoCsm8/s1144/starting_status.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Picture of status after loading the initial hashes" border="0" data-original-height="1059" data-original-width="1144" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiUpoVadaV1CSvZwd76EfX7OiqkY3qHLWQOTNOPUM4BVrdYC4i0E2tG5Ui1mXrDsGkuLYWB-iMeq7LNejCu0zNVyJWtEkQH6VBgc611_Eo4W54b_G9U3mS3E6l-LUVCFrwKG-QxxoeuxIrL8vJHkCG3JN0Zu8s7KDoxyMUEHe4ZaIFkYfLtp0xSjsoCsm8/s16000/starting_status.png" title="Gotta start somewhere" /></a></div><br /><div><div><h3>Next Steps - Cracking Some Passwords:</h3></div><div>There's not a ton more analysis to do (besides look at the cracks). And don't worry, we'll get to that! But this is where I'm glad that I decided to go back through these old CMIYC challenges. The last contest in 2023 was very heavy in its use of metadata. So I of course focused on metadata analysis for this framework. But for this contest, there is very little metadata. The focus instead was on cracking encrypted containers and (cough, SPOILER ALERT, cough) building custom wordlists from online articles. This highlights other new capabilities that I really want to add into this framework. For example, going through logfiles, extracting the rule/dictionary that cracked a password, and then storing that data with the hash in this framework. That would be super cool! How about doing google searches on passwords for you? That would be cool too. That points to the stone-soup approach of this framework. Once we have these hashes/passwords/metadata stored in a searchable framework with a Python3 backend we can build upon this to add new capabilities.</div></div><div><br /></div><div>Enough talking! Let's look at some cracks. 
Looking above, I managed to crack over 50% of the ssha hashes in under a minute cracking run in JtR. I wonder what those plaintexts look like. I can use the <b>SessionMgr.print_all_plaintext(meta_fields=['source'])</b> to display them.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3EtsOchKlWPdWdyvHHezg4msr3rc-BMeHO-rKcgGYoBYEgWkHs23jvzpQWWSn7YsG8SM1LRSTeCWuaws8eGbtYHxpOpyAGYJhXyNVD5N8g-1TMsdGxCzM_knNCgolL4tO70sHtwI8dwf_ei0b8qlbFy5cRRwKIoOJv8kZ9_t5CYqBDcoRlr89t64dFrQ/s733/ssha_cracks.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A bunch of easy passwords followed by 2022" border="0" data-original-height="733" data-original-width="596" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3EtsOchKlWPdWdyvHHezg4msr3rc-BMeHO-rKcgGYoBYEgWkHs23jvzpQWWSn7YsG8SM1LRSTeCWuaws8eGbtYHxpOpyAGYJhXyNVD5N8g-1TMsdGxCzM_knNCgolL4tO70sHtwI8dwf_ei0b8qlbFy5cRRwKIoOJv8kZ9_t5CYqBDcoRlr89t64dFrQ/s16000/ssha_cracks.png" title="Oh that's why I cracked so many of them..." /></a></div><br /><div>You don't have to be a master password cracker to design an attack against these. That being said, going back to the score graph, they aren't worth a lot of points. Even if you cracked all 2k ssha hashes (which is totally doable) you would only get 10k points. That's a lot less than most of the other hash types. So this is a list to play around with when you don't have anything better to crack and need that serotonin hit to see cracks flash across your terminal.</div><div><br /></div><div>That leads us to the other hash lists. 
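<div>The scoring arithmetic above is easy to check for every list at once. What follows is a minimal sketch, not part of the framework: the hash counts are copied from the extraction notes earlier in this post, the point values from the score_info section of the config, and the names HASH_COUNTS, SCORE_INFO and max_points are made up for illustration.</div>

```python
# Hash counts per hash type, copied from the extraction notes in this post.
HASH_COUNTS = {
    "half-md5": 2766,
    "raw-md5": 2933,
    "raw-sha1": 5456,
    "raw-sha256": 4997,
    "raw-sha384": 10004,
    "mssql05": 10812,
    "mysqlna": 4214,
    "ssha": 2000,
}

# Points per cracked hash, copied from the score_info section of the config.
SCORE_INFO = {
    "raw-sha384": 46,
    "mysqlna": 17,
    "raw-sha256": 13,
    "mssql05": 9,
    "raw-sha1": 5,
    "ssha": 5,
    "half-md5": 3,
    "raw-md5": 1,
}


def max_points(counts, scores):
    """Return the points each hash type is worth if every hash were cracked."""
    return {hash_type: count * scores[hash_type] for hash_type, count in counts.items()}


if __name__ == "__main__":
    totals = max_points(HASH_COUNTS, SCORE_INFO)
    # Print the lists from most to least valuable.
    for hash_type, points in sorted(totals.items(), key=lambda item: item[1], reverse=True):
        print(f"{hash_type:<12}{points:>8}")
```

<div>With these numbers, a fully cracked raw-sha384 list is worth more than all of the other lists combined, which is a useful thing to know when deciding where to spend cracking time.</div>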
Let's check out the raw-sha1 hashes:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMFNsysDIILdGmh59od7cW7yJPWBAJ4yAFEYTfXErwLJAufBmFg9vzHU_-rQKGlMmFQ-qKNeK5lWmTkGe4wLGLEy32Uy3ssHIJvTBEQRmyZRZZqu7AzeUxNifp_cHK-tqO6T85OzvDbcpGene7UBtfblRSxslyiQs53lDBEhZWtGEbg2-0KkBgmBiUaKA/s756/raw-sha1_cracks.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A bunch of cracks but many of them have the same base words" border="0" data-original-height="756" data-original-width="558" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMFNsysDIILdGmh59od7cW7yJPWBAJ4yAFEYTfXErwLJAufBmFg9vzHU_-rQKGlMmFQ-qKNeK5lWmTkGe4wLGLEy32Uy3ssHIJvTBEQRmyZRZZqu7AzeUxNifp_cHK-tqO6T85OzvDbcpGene7UBtfblRSxslyiQs53lDBEhZWtGEbg2-0KkBgmBiUaKA/s16000/raw-sha1_cracks.png" title="This is where some custom rules can help" /></a></div><br /><div>At least for the limited cracks so far, it looks like these plaintexts were generated from some common words with various mangling rules applied to them. You can start to see why I want to automate pulling out dictionaries + rules from JtR and HC logfiles to make it easier to reverse engineer these rules vs. having to do it by hand. </div><div><br /></div><div>Actually, this might be a good time to end this blog entry and work on some of those new features.... All the code is uploaded to the GitHub repo and I'll upload some sample hashes as well so you can follow along. 
But for now I probably need to look into parsing some Hashcat log files.</div><div></div><p></p>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-10187272333385012902023-08-30T20:51:00.008-07:002023-08-30T21:19:19.859-07:00Hashcat Tips and Tricks for Hacking Competitions: A CMIYC Writeup Part 3<p> <a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKp9NUyr3T9laDtvScyRvVWKJK-8kOIeKlPcNpGmT_XERkZrF7rK5fU-EiSoTYBEBWuGJzkO4DAso0YogToayLxzP_6W8KeV_Yu3_tG1E_0_-bA-QG_F7gecMP0V8ovUAUPQLrF-fWG86JgIX-TBjNI1rL_WgLvPnGSftTsyBkMo_m3S8aSi_J9g5ZMVg/s938/cat_server_room.png" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img alt="AI generated image of a cat in front of a messy computer" border="0" data-original-height="536" data-original-width="938" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjKp9NUyr3T9laDtvScyRvVWKJK-8kOIeKlPcNpGmT_XERkZrF7rK5fU-EiSoTYBEBWuGJzkO4DAso0YogToayLxzP_6W8KeV_Yu3_tG1E_0_-bA-QG_F7gecMP0V8ovUAUPQLrF-fWG86JgIX-TBjNI1rL_WgLvPnGSftTsyBkMo_m3S8aSi_J9g5ZMVg/w640-h366/cat_server_room.png" title="I've learned the second you put the word "cat" in Midjourney, all it does is draw cats." 
width="640" /></a></p><p></p><blockquote><div style="text-align: left;"><div style="text-align: center;"><b><i>I want to know1</i></b></div><b><div style="text-align: center;"><b><i>and understand1</i></b></div></b><b><div style="text-align: center;"><b><i>But I will not1</i></b></div></b></div><p style="text-align: center;"><b>-- Hashes cracked from the KoreLogic CMIYC 2023 competition</b></p></blockquote><p></p><p>In the previous two posts on the CMIYC competition [<a href="https://reusablesec.blogspot.com/2023/08/using-jupyterlab-to-manage-password.html">Part 1</a>, <a href="https://reusablesec.blogspot.com/2023/08/using-jupyterlab-to-manage-password_22.html">Part 2</a>], I had focused on how to integrate data science tools into your password cracking workflow and showed how to crack passwords on limited hardware (E.g. my laptop without using a GPU). Of course it's better to have some firepower to crack hashes! One of the hurdles to overcome is I don't have a lot of firepower at my disposal. Despite being super interested (OK, obsessed) about password cracking, I've never invested in a dedicated cracking rig. Still, when I do get serious about cracking passwords I turn to Hashcat and GPU based attacks to do the heavy lifting even if I only have a single NVIDIA GeForce GTX 1070 GPU. That's still significantly faster than trying to run CPU only attacks.</p><p>To that end, let's talk about how to leverage Hashcat when competing in these competitions. Full disclaimer: I'm going to go full spoiler in how I'm approaching my cracking. At this point, I've been running cracking sessions way longer than the competition would have lasted if I had competed. Also, I've been on the various Discord and Twitter conversations about the contest this year and know how the hashes were generated. Heck, KoreLogic even posted themselves how they created the challenges [<a href="https://contest-2023.korelogic.com/password-info.html">Full Spoiler Link</a>]. 
So I'm not going to even pretend that this post represents how I would have done. Instead I want to focus on "given what we know, how can someone use Hashcat to crack those hashes".</p><h3 style="text-align: left;">Using Hashcat and John the Ripper Together</h3><div>One issue that pops up a lot for me when using both John the Ripper and Hashcat to crack hashes is that while their file formats are *mostly* the same, they are not directly compatible. This goes both for how these tools expect hashes to be formatted when loading them and for the .pot file formats they save their cracked passwords to.</div><div><br /></div><div>The hash format in particular has been a long-standing source of annoyance for me, and writing this blog post inspired me to finally submit a <a href="https://github.com/hashcat/hashcat/issues/3854">GitHub issue</a> about it to the hashcat repo. The long story short is that John the Ripper uses hash type identifiers that Hashcat doesn't recognize. For example, here is a raw-md5 hash (from the CMIYC2023 contest) that John the Ripper can load:</div><div><ul style="text-align: left;"><li>jithakur:$dynamic_0$38bb03886dd4fbda5a780f0617847e4c</li></ul><div>And here is the same hash in the format that Hashcat expects:</div></div><div><ul style="text-align: left;"><li>jithakur:38bb03886dd4fbda5a780f0617847e4c</li></ul></div><div>Side note, while you can have usernames in your hash lists, Hashcat won't load the hashes unless you include the "--username" flag on the command line telling Hashcat to strip/ignore those usernames. E.g.:</div><div><ul style="text-align: left;"><li>hashcat <b>--username</b> -a 0 -m 0 hashfile.txt dictionary.txt</li></ul>What this really means is that to support both John the Ripper and Hashcat, I now have two sets of hash lists and two sets of pot files. It would be nice to incorporate some scripts in my Jupyter Notebook to sync up both of the pot files so I'm not cracking the same hashed password twice. 
Given that's a rabbit hole which would totally side-track any hash cracking, I'm going to push that project off for another day. For now I'm just going to use Hashcat, and I modified my Notebook to support the Hashcat file formats (mostly by copying and pasting the JtR code into another cell and then making small modifications). Once again, this is one of the super-powers of using Jupyter notebooks. I can load up my JtR cracked hashes, then write and load up my Hashcat plaintexts, and perform analysis on both in a very short period of time. It's not pretty but it works.</div><div><h3>Running Basic Hashcat Attacks</h3></div><div>The commands to run Hashcat are very different from those to run John the Ripper. There are pros and cons to both methods. File autocomplete works much better with Hashcat's command line and Hashcat does directory inclusion (such as using all wordlists in a directory) better. But John the Ripper's command line is less position dependent, has a ton of super powerful features for different attack modes, and quite honestly I'm just more used to it. </div><div><br /></div><div>The basic command line for hashcat is:</div><div><ul style="text-align: left;"><li>hashcat -a ATTACK_TYPE -m HASH_TYPE HASH_LIST [ATTACK_OPTIONS]</li></ul><div>So for a standard wordlist + rules attack you can run:</div></div><div><ul style="text-align: left;"><li>hashcat -a 0 -m 0 uncracked_hashes.txt ../../wordlists/Alter-Hacker_Sorted-Cleaned.txt -r ../../repos/hashcat/rules/d3ad0ne.rule</li></ul><div>To break this down:</div></div><div><ul style="text-align: left;"><li><b>-a 0</b>: The attack mode. In this case, wordlist + rules. Also supports stdin input if a wordlist is not specified</li><li><b>-m 0:</b> The hash type to crack. In this case it is targeting Raw-MD5 hashes</li><li><b>uncracked_hashes.txt</b>: The list containing all the hashes I want to crack. 
Hashcat will load everything that looks like an MD5 hash from it.</li><li><b>../../wordlists/Alter-Hacker_Sorted-Cleaned.txt</b>: A common password cracking dictionary/wordlist. </li><ul><li>It's one of the bigger wordlists that is not based purely on cracked hashes and isn't 100% filled with junk. It used to be pretty easy to find online. But now that I'm looking for it again most of the links have dried up.</li><li>Side note: It used to be hosted on KoreLogic's dictionary list available <a href="https://contest-2010.korelogic.com/wordlists.html">here</a>. Also, I forgot they hosted a ton of wordlists. I need to check them out again to see if they are helpful in this competition. (Spoiler, these dictionaries were not that helpful).</li></ul><li><b>-r ../../repos/hashcat/rules/d3ad0ne.rule</b>: The mangling rules. d3ad0ne.rule is a pretty decent set to use if you can make a lot of guesses</li></ul><div>Running variations of the above attack using standard large dictionaries and a few other hashcat rules cracked a few more MD5 passwords but not many....</div></div><div><br /></div><div>One cool feature of Hashcat, though, is that you can specify a directory instead of a wordlist. So you can use the following command to run a quick set of mangling rules against all of your dictionaries:</div><div><ul style="text-align: left;"><li>hashcat -a 0 -m 0 uncracked_hashes.txt ../../wordlists/ -r ../../repos/hashcat/rules/best66.rule</li></ul><div>When running these attacks, the hashes.org-2020 wordlist did the best. It's a super effective wordlist to use in general and can be obtained from Hashmob [<a href="https://hashmob.net/resources/wordlists">link</a>]. Side note, I'm not using Hashmob's own cracked wordlists for this blog post since I'm pretty sure the contest hashes were uploaded to them.</div></div><div><br /></div><div>These attacks had limited success (a few raw-MD5 cracks aren't going to give a lot of points). 
There are really three paths that I can take.</div><div><ol style="text-align: left;"><li>I can analyze the cracks and try to construct custom attacks.</li><ul><li>THIS IS THE BEST OPTION.</li></ul><li>I can run my existing wordlists but have Hashcat auto-generate rules for me</li><li>I can start brute-forcing key-spaces with smart masks and Markov attacks.</li></ol><div>Side note: Options #2 and #3 are generally the ones picked on real dumps as the individual passwords are only loosely related to each other. Also password crackers (at least me) are lazy.</div><div><br /></div><div>Going with the lazy options first, let's dive into how to run them. To auto-generate rules you can use the --generate-rules=X option where X is the number of rules to generate. For example:</div><div><ul style="text-align: left;"><li>hashcat -a 0 -m 0 --debug-mode=5 --debug-file debug.txt --generate-rules=1000000 uncracked_hashes.txt ../../wordlists/hashes.org-2020.txt</li></ul><div>When you do this, and I can't stress this enough, enable <b>--debug-mode=5</b>. Also log that info to a file using the <b>--debug-file debug.txt </b>option. This will output both the rule that successfully cracks a password as well as the plain-text word. Don't get lazy, and do not skip this option. In fact, you probably should be running that for all your password cracking sessions.</div><div><br /></div><div>Now you may be asking yourself, why "<b>--debug-mode=5</b>"? It's because the debug info will append itself to the debug-file (vs. overwrite it) and you'll be running a lot of cracking sessions. Going back and remembering which dictionary created which cracked password is super helpful. You want all that info. 
Why throw that info away with a lower debugging option?</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhd64PXcTH4HleWc8m87j5KB_P_DgyTdvIhVCpHsukZSZ5jEpHMSAQPERw6mOCIs83CeKNJOf6xlwi_pPkAJTZesmB2Pn77udX_I6rSsahuvTD-EKVyVoP-8ZckvR9rk-ekXuQaNpwJNM1XlhqJno16lyq57ToZV674Q6DxkujMWrkPT3M63Q6epfrRD5E/s1124/hashcat_debug.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing a debug file from cracking sha-1 hashes in the CMIYC 2023 challenge" border="0" data-original-height="240" data-original-width="1124" height="85" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhd64PXcTH4HleWc8m87j5KB_P_DgyTdvIhVCpHsukZSZ5jEpHMSAQPERw6mOCIs83CeKNJOf6xlwi_pPkAJTZesmB2Pn77udX_I6rSsahuvTD-EKVyVoP-8ZckvR9rk-ekXuQaNpwJNM1XlhqJno16lyq57ToZV674Q6DxkujMWrkPT3M63Q6epfrRD5E/w400-h85/hashcat_debug.png" title="Not many hashes were cracked, but you can start to see how this is useful for developing better attacks." width="400" /></a></div><br /><div>Long story short, if you don't know what to do, a default option can be to generate rules for a dictionary you've had some success with, log the results, and then turn the successful rules into a contest specific ruleset to use with other dictionaries.</div></div></div><div><br /></div><div>But what if your input dictionaries are the problem? That's where brute-forcing small key lengths can be helpful using masks.</div><div><h3>Cracking Contest Hashes with Hashcat Masks</h3></div><div>I'll admit, I started to go into a long, long diversion about the mechanics behind Hashcat's Masks and Markov optimizations. I really hate calling what Hashcat does a Markov attack and there's a ton of optimizations that Hashcat developers can make to it. But that's totally besides the point if you are trying to crack passwords RIGHT NOW. So I'll save that side tangent for a different post and instead focus on cracking these contest hashes. 
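<div><br /></div><div>Since mask attacks come down to how much key-space you can afford to search, it helps to estimate the size of a mask before launching it. Here is a minimal Python sketch of that arithmetic. The helper name <b>mask_keyspace</b> is my own illustration and not part of Hashcat, and it assumes the built-in character set sizes listed in Hashcat's mask attack documentation (for example ?l = 26, ?d = 10, ?s = 33, ?a = 95):</div>

```python
# Sizes of Hashcat's built-in mask character sets (assumed from the
# mask attack documentation): ?l ?u ?d ?h ?H ?s ?a ?b
BUILTIN_SIZES = {"l": 26, "u": 26, "d": 10, "h": 16, "H": 16,
                 "s": 33, "a": 95, "b": 256}


def mask_keyspace(mask, custom=None):
    """Count the candidate passwords a mask would generate.

    custom maps a custom charset number ("1" to "4") to its size.
    Literal characters in the mask count as a single choice.
    This is a hypothetical helper, not a Hashcat API.
    """
    sizes = dict(BUILTIN_SIZES)
    sizes.update(custom or {})
    total = 1
    i = 0
    while i != len(mask):
        if mask[i] == "?" and i + 1 != len(mask):
            symbol = mask[i + 1]
            # "??" stands for a literal question mark
            total *= 1 if symbol == "?" else sizes[symbol]
            i += 2
        else:
            i += 1
    return total


# Seven positions of "all printable ASCII"
print(mask_keyspace("?a" * 7))
# A literal prefix plus a custom set of 52 letters and 3 specials
print(mask_keyspace("2023?1?l?l?2", {"1": 52, "2": 3}))
```

<div>Under those assumptions the candidate count is just the product of the character set sizes at each position, so every extra ?a position multiplies the work by 95. Seven ?a positions already come to 95^7, roughly 70 trillion guesses.</div>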
</div><div><br /></div><div>Masks are one area where having more computational power makes a huge difference. They let serious cracking rigs just chew through keyspace without requiring much skill or ability from their operators. Contest organizers know this and tend to create passwords that are resistant to un-optimized mask attacks. This means going through the entire key-space for 5/6/7/8 passwords is unlikely to be very successful.</div><div><ul style="text-align: left;"><li>(Not recommended): hashcat -a 3 -m 0 hash_list.txt ?a?a?a?a?a?a?a</li></ul></div><div><div>As an example of that, I left Hashcat running for a couple of hours brute forcing all ASCII passwords of length 1 through 7 for the raw-MD5 hashes. I didn't crack a single new hash that wasn't caught by earlier runs I had performed with John the Ripper. Going back to my Jupyter Notebook I decided to display password cracks by length, and then also the number of ASCII only (aka no Cyrillic) password cracked by length.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjapP_0XlW7HrscXuZv84AusdjlbajPMGAoLk97sxX1kOTVxkfvJlHrzivDo5NU8VU0vla5zrYDNsRVRx7YuU5hx4iMShxeZ3bDqdFAod_QYisS-zXnqV5JEgloR594o1wrFU-8w3qNykoDzHUWmrzT-7rd-sDpX0g9B5U1mLMIOOabEd1HHHsOLM6-_5Y/s791/cracks_by_length.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Graph showing most passwords are over 7 characters long" border="0" data-original-height="699" data-original-width="791" height="354" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjapP_0XlW7HrscXuZv84AusdjlbajPMGAoLk97sxX1kOTVxkfvJlHrzivDo5NU8VU0vla5zrYDNsRVRx7YuU5hx4iMShxeZ3bDqdFAod_QYisS-zXnqV5JEgloR594o1wrFU-8w3qNykoDzHUWmrzT-7rd-sDpX0g9B5U1mLMIOOabEd1HHHsOLM6-_5Y/w400-h354/cracks_by_length.png" title="If anything the real results are worse since this doesn't show all the passwords I haven't cracked" width="400" /></a></div>You probably don't have 
the GPU power to brute force 8-9 character passwords during the contest, and you certainly don't have that for the high value hashes that are worth a lot of points. Therefore, to be successful in a contest with Hashcat Masks you need to tailor them to find gaps in base-words or mangling-rules that you have already identified. I talked about this earlier with the attacks I ran using John the Ripper in <a href="https://reusablesec.blogspot.com/2023/08/using-jupyterlab-to-manage-password_22.html">Part 2</a> of these write-ups. For example, if you were looking to find more base-words for Sales passwords where many of them started with '2023' and ended with a special character, then you could try something like:</div><div><ul style="text-align: left;"><li><b>hashcat -a 3 -m 0 -1 ?l?u -2 cmiyc_sales_end.hcchr uncracked_hashes.txt 2023?1?l?l?l?l?l?l?2</b></li></ul><div>There's a lot going on in the above command. Let's break this command down by parts:</div><div><ul style="text-align: left;"><li><b>hashcat -a 3 -m 0</b></li><ul><li>The standard hashcat command targeting raw-md5 hashes (-m 0), and using mask mode (-a 3)</li></ul><li><b>-1 ?l?u </b></li><ul><li>I'm setting a custom mask character set here that includes two built-in character sets [?l = all lowercase letters, and ?u = ALL UPPERCASE LETTERS]</li><li>In the actual mask you can refer to this custom character set as <b>?1 </b>(that's the number 1)</li><li>You can specify up to 4 custom character sets for your mask mode [1 - 4]. This is a hard limit. I wish you could do more actually, but that's how Hashcat is programmed.</li></ul><li><b>-2 cmiyc_sales_end.hcchr</b></li><ul><li>Rather than type out the characters for the mask on the command line, you can also save them to a *.hcchr file and read them in.</li><li>This is super helpful when you are targeting special characters that just don't play well on the command line and you don't want to mess with escaping them. 
For example <b>'!,$</b>.</li><li>The format for .hcchr files is just all the characters you want to target on the first line. E.g.:</li><ul><li>!,$</li></ul></ul><li><b>uncracked_hashes.txt </b></li><ul><li>Once again, just the hash-list of the hashes you are targeting</li></ul><li><b>2023?1?l?l?l?l?l?l?2</b></li><ul><li>The actual mask to run. Breaking it down further:</li><ul><li><b>2023: </b>Simply starts every guess with the string "2023"</li><li><b>?1:</b> Use the first custom character set. I know, it's hard to see the difference between the number 1 and the letter l. The above uses the number one. In this case it tries all lower and uppercase letters.</li><li><b>?l?l?l?l?l?l:</b> Try 6 lower case characters</li><li><b>?2:</b> Try the second custom character set. This appends common special characters I found when cracking other sales passwords.</li></ul></ul></ul></div><div>That's great, but what if you want to try 5 lower case characters vs. 6? Running these attacks by hand is a pain so it's nice to queue up a bunch of mask attacks at once using a saved mask file (i.e. a .hcmask file). Unfortunately, the format is a bit different so let's look at how we can do that next. First, here is the hashcat command line to run a .hcmask file:</div><div><ul style="text-align: left;"><li>hashcat -a 3 -m 0 uncracked_hashes.txt sales.hcmask</li></ul><div>You'll notice that all the mask info has been removed from the command line and instead I'm calling an external sales.hcmask file. Let's take a look at what's in that file:</div></div><div><ul style="text-align: left;"><li>?l?u,!\,$,2023?1?l?l?l?2</li><li>?l?u,!\,$,2023?1?l?l?l?l?2</li><li>?l?u,!\,$,2023?1?l?l?l?l?l?2</li><li>?l?u,!\,$,2023?1?l?l?l?l?l?l?2</li><li>?l?u,!\,$,2022?1?l?l?l?2</li><li>?l?u,!\,$,2022?1?l?l?l?l?2</li><li>?l?u,!\,$,2022?1?l?l?l?l?l?2</li><li>?l?u,!\,$,2022?1?l?l?l?l?l?l?2</li></ul>Breaking this file format down:</div><div><ul style="text-align: left;"><li>Each line defines a single mask to run. 
Lines starting with '#' are comments.</li><li>Each line will be run in order. Generally it helps to put the quick masks first so if you decide to cancel the job you have a better idea of how much key-space you checked.</li><ul><li>I know, I didn't follow my own advice in this example...</li></ul><li>Each line must define any custom character sets, and unlike with the command line you can't define them in external files.</li><ul><li>Each custom character set (up to 4) is specified by putting a comma '<b>,</b>' after it.</li><li>In the above example this means the 2 custom character sets are:</li><ul><li>?l?u</li><li>!\,$</li></ul><li>For the second custom character set I wanted to include a comma, which is a problem because it's a delimiter. So I needed to escape it with a backslash. Aka: '<b>\,</b>'</li><li>You can read more about the hcmask file format <a href="https://hashcat.net/wiki/doku.php?id=mask_attack">here</a>.</li></ul></ul><div>With all of that, I managed to identify a couple more base words to use targeting sales passwords. This in turn allowed me to target higher value hashes more easily. The same can be done by targeting known words to find the mangling rules. E.g.:</div></div><div><ul style="text-align: left;"><li>?d?s,2022Sales?1?1?1</li></ul><div>Yes, you can also do that with a wordlist and mangling rules, but if you only have a couple of words you want to check it can sometimes be easier to do that with Masks instead. Now if you have a lot of words you want to try, then you can look into Hashcat's "<b>-a 6</b>" (Wordlist + Mask) and "<b>-a 7</b>" (Mask + Wordlist) attack modes. John the Ripper doesn't have this specifically because *cough cough* its rule preprocessor supports masks already in its normal mangling rules. 
But these attack modes can be very helpful if you are using Hashcat.</div></div><div><br /></div><div>One thing you'll notice though with the hybrid -a [6/7] attacks is that you can't mangle or apply masks to both sides of a guess at the same time. Also, unlike with standard wordlist modes (-a 0) you cannot pipe a wordlist into -a [6/7] modes via stdin. This is a problem. The whole reason you are using Masks is probably because you don't know what mangling rules have been applied to the base-word. </div><div><br /></div><div>The key then is to create custom word-lists that contain one side of the mangling rules. I'd recommend picking the "shorter" of the mangling rules to limit how much you write to disk. This is super annoying, but it works. So for example if you want to prepend 2022 and 2023 to a word and then append a mask, you could first create a word-list containing all the words with 2022 and 2023 prepended to them (this only doubles the size of the original input dictionary). In this case I'm accomplishing this by using Hashcat's rules and saving the results to disk. To do that, and then run the resulting full Mask attack, you can use the following commands:</div><div><br /></div><div><b>Rule file: append_year.rule </b>(Capitalize word and prepend 2022 and 2023).</div><div><ul style="text-align: left;"><li>c^2^2^0^2</li><li>c^3^2^0^2</li></ul><b>Generate wordlist command:</b></div><div><ul style="text-align: left;"><li>hashcat -a 0 --stdout ./sales_words.txt -r append_year.rule</li></ul><div><b>Now that we have a wordlist containing words like 2023Sales, run the mask hybrid attack:</b></div></div><div><ul style="text-align: left;"><li>hashcat -a 6 -m 0 -1 ?l?u uncracked_hashes.txt ./sales_words.txt '?1?1?1'</li></ul><div>Is all of this a pain? Absolutely! 
But it can be very effective so it's usually worth creating these temporary wordlists for your attacks and then combine them with masks.</div></div><div><h3>Hashcat Association Attacks (Getting Big Points with BCrypt)</h3></div><div>As mentioned earlier, the whole reason to try different "spray and pray" attacks against fast hashes is to crack enough to identify how the passwords were created and develop highly targeted attacks against expensive and high value hashes like BCrypt. The mangling rule that received the most post-contest conversation among all of the teams was that several users' passwords were their creation time (found in their metadata) converted to Unix epoch timestamps. </div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0jc2XnXTY1-rMjYigzET67XqCnrBOxEecae5cda2GD0OhY59yr2qVW_hgnG5eeKNruofPg9eoZc6hR6MHgMwsEN4Iy6R8HAzvzuGopkNpx_zEXy7UiuqVB8ZP__2-IyJrFl3ust2m7a0xc1A7f0Vp1VWRP5ryrlTqLEnSYyxIk1JoFzue_zqq2mvXMXQ/s2383/timestamp.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing cracked password with the timestamp being it's plaintext password" border="0" data-original-height="104" data-original-width="2383" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj0jc2XnXTY1-rMjYigzET67XqCnrBOxEecae5cda2GD0OhY59yr2qVW_hgnG5eeKNruofPg9eoZc6hR6MHgMwsEN4Iy6R8HAzvzuGopkNpx_zEXy7UiuqVB8ZP__2-IyJrFl3ust2m7a0xc1A7f0Vp1VWRP5ryrlTqLEnSYyxIk1JoFzue_zqq2mvXMXQ/s16000/timestamp.png" title="I'll be up front. I don't think I would have figured this out on my own." /></a></div><br /><div>Creating a wordlist of all the various timestamps is certainly one way to go, but what we really want to do is crack bcrypt hashes. This is a perfect opportunity to talk about association (-a 9) attacks in Hashcat. Association attacks take one word per hash and target that hash with it. The word in association attacks can be combined with rules as well. 
This is a huge improvement when targeting a large number of salted hashes where you may have some idea what the plaintext for each account might be.</div><div><br /></div><div>To perform an association attack you need to create a hashlist of the hashes you want to target, and then have a one-to-one mapping to a wordlist you want to target those hashes with. So for example you might have two files:</div><div><br /></div><div><b>HashList.txt:</b></div><div><ul style="text-align: left;"><li>user1:$2a$:&lt;rest of the hash here&gt;</li><li>user2:$2a$:&lt;rest of the hash here&gt;</li><li>user3:$2a$:&lt;rest of the hash here&gt;</li></ul><div><b>Wordlist.txt:</b></div></div><div><ul style="text-align: left;"><li>Word1</li><li>Word2</li><li>Word3</li></ul><div>For this particular challenge I created the wordlists + uncracked bcrypt hashlist using the following Python script in Jupyter Notebook:</div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijMTMAGZmiLnnmOCpP48ns4olFUomUubj_lMo2wxxo8TI1LZpPm7HPHBE8E5O4NoGCWnwMcsB8UwJ9TzE7jqgI9uK-yDsgSqhBwA5NY9aZEI-2a0_PhOpHvrvr8eBKVkYdXonlbiyQv6tghPYEY9D9VcdW1AHZDcuTK8-mOyZbiYQdo9GBL2uCa6ziw7g/s2016/timestamp_a9.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Code creating the timestamps in Unix Epoch time" border="0" data-original-height="750" data-original-width="2016" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEijMTMAGZmiLnnmOCpP48ns4olFUomUubj_lMo2wxxo8TI1LZpPm7HPHBE8E5O4NoGCWnwMcsB8UwJ9TzE7jqgI9uK-yDsgSqhBwA5NY9aZEI-2a0_PhOpHvrvr8eBKVkYdXonlbiyQv6tghPYEY9D9VcdW1AHZDcuTK8-mOyZbiYQdo9GBL2uCa6ziw7g/s16000/timestamp_a9.png" title="Python does a lot of things well. Timezones is not one of those things..." /></a></div><br /><div>Next, let's run some attacks. First, let's just do a quick naïve attack using (-a 0) and the timestamps as a normal wordlist. 
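<div><br /></div><div>As a quick aside on why the wordlist approach is the naïve one: bcrypt hashes are salted, so a guess has to be hashed separately for every target hash. The rough arithmetic can be sketched in Python as follows. The function name and the numbers are hypothetical placeholders (not the contest's real hash count), and the wordlist figure is an upper bound since cracked hashes drop out of the run:</div>

```python
def guess_counts(num_hashes, num_words, num_rules=1):
    """Return (wordlist_mode, association_mode) hash computations.

    Assumes every target hash has its own salt, as with bcrypt,
    so no work is shared between target hashes.
    """
    # -a 0: every candidate is hashed once per salted target hash
    wordlist_mode = num_hashes * num_words * num_rules
    # -a 9: each hash is only tried against its own paired word
    association_mode = num_hashes * num_rules
    return wordlist_mode, association_mode


# Hypothetical example: 1,000 hashes, one timestamp word per hash
straight, association = guess_counts(num_hashes=1000, num_words=1000)
print(straight, association)
print(straight // association)
```

<div>With one word per hash, the wordlist attack does as many times more work as there are hashes in the list, which is why pairing each hash with its own candidate matters so much for slow algorithms.</div>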
</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhY2U9hNhH0bNGWDJh6PZ5d_7RCuLTsS_WVLcwM6SyqpCVin_Jd3jQsU3DajaZnDemE6-RWPIofZorR_KQ-5dIgQYKJN0gyg6s52edVJOmRZ7I3Ku_K9IaVxgPLknYLhFR2-raFVl6x_78LXS2kiZl8W8gIKIxN3FmzL84hK7aZ35srQEU2C3ZOAr0xlB8/s1837/hashcat_timestamp_a0.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Using Hashcat's -a 0 attack against the bcrypt timestamp hashes" border="0" data-original-height="1705" data-original-width="1837" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhY2U9hNhH0bNGWDJh6PZ5d_7RCuLTsS_WVLcwM6SyqpCVin_Jd3jQsU3DajaZnDemE6-RWPIofZorR_KQ-5dIgQYKJN0gyg6s52edVJOmRZ7I3Ku_K9IaVxgPLknYLhFR2-raFVl6x_78LXS2kiZl8W8gIKIxN3FmzL84hK7aZ35srQEU2C3ZOAr0xlB8/s16000/hashcat_timestamp_a0.png" title="Don't judge my GPU. This is why I normally play MUDs ;p" /></a></div><br /><div>Running this attack for an hour and a half isn't the end of the world. But this is a contest. You are a busy hacker. You have hashes to crack and other wordlists to run. Let's try Hashcat's association attack. Here is the command I ran:</div><div><ul style="text-align: left;"><li> hashcat -o cmiyc2023_hc.potfile -a 9 -m 3200 bcrypt_datetime.txt unix_timestamps_bcrypt.txt</li></ul><div><b>ONE IMPORTANT THING TO KNOW: </b>By default '-a 9' mode will not save to your standard .potfile. So if you want to capture these hashes you MUST specify a potfile on the command line using the '-o FILENAME' option. I learned this fact the hard way when none of my cracks were showing up. I asked some Hashcat developers about this and they said there's still some "weirdness" with '-a 9' mode. For example, it will "recrack" hashes you have already cracked and post duplicates cracks/plaintexts to your potfile. So if you are running this attack it is probably good to run it on a new potfile vs. 
your global one, and then merge the new cracks back into your main potfile after the fact.</div><div><br /></div><div>And here's the results:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg83d85-eEy0d2FYMN1xg-NtFIA4cF78rBzrsORXqXa8ZdOKg4oX5NK-okitdCIVm0MpcFb258vTKZ-lq3rxPjnOx5q9tz7wPbJhn6CdnW-_XXn0kLqHek-qNWfFCOLAlJMdbbukuDpQ4U_Hi6BQL1TWYyiqCd34E-q35gjuR866TtRzpFDemdkV9p2cjU/s2026/hashcat_timestamp_a9.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Results of running Hashcat's -a 9 Association attack. Over a hundred bcrypt hashes were cracked in a couple of seconds" border="0" data-original-height="1734" data-original-width="2026" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg83d85-eEy0d2FYMN1xg-NtFIA4cF78rBzrsORXqXa8ZdOKg4oX5NK-okitdCIVm0MpcFb258vTKZ-lq3rxPjnOx5q9tz7wPbJhn6CdnW-_XXn0kLqHek-qNWfFCOLAlJMdbbukuDpQ4U_Hi6BQL1TWYyiqCd34E-q35gjuR866TtRzpFDemdkV9p2cjU/s16000/hashcat_timestamp_a9.png" title="Ok, this is officially cool." /></a></div><br /><div>Over 100 Bcrypt hashes cracked in a couple of seconds! That's super fun. As some backstory, association attacks are amazing if you have known passwords for users. Aka you obtained passwords from a different password dump and you are attacking the fact that users re-use password between multiple sites. Leveraging association attacks, you can run common mangling attacks against those known passwords to crack computationally expensive hashes for a subset of users.</div></div><div><h3>Cracking Multi-Words With Hashcat</h3></div><div>The next area to focus on is multi-words and phrases. Korelogic gave out a hint during the contest that several of the Engineering passwords were created from phrases taken from sci-fi books and movies, with the number '1' appended on the end [<a href="https://infosec.exchange/@CrackMeIfYouCan/110880144837117383">Link</a>]. 
This can be seen in some of the cracks I made earlier:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHQxT8AWxSgjJAPx3AiyUu6_HCCpzBxT0j0os0txND-3wyM5dQnJeBJ09Fip6Tty-Ia8q6uXnQ8d3Be2sBsulAAp27GuXj3U3_FMDRYkKYEvP9H8SIOSuUmuKbLlm3UCpBdwPzuwjip1YvGUbChZCVePz1TWZWo4keWVH8unqrXg5sUwrdJHv7aVgaRMQ/s2379/phrases_initial.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Cracked passwords of three word phrases ending with 1" border="0" data-original-height="105" data-original-width="2379" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHQxT8AWxSgjJAPx3AiyUu6_HCCpzBxT0j0os0txND-3wyM5dQnJeBJ09Fip6Tty-Ia8q6uXnQ8d3Be2sBsulAAp27GuXj3U3_FMDRYkKYEvP9H8SIOSuUmuKbLlm3UCpBdwPzuwjip1YvGUbChZCVePz1TWZWo4keWVH8unqrXg5sUwrdJHv7aVgaRMQ/s16000/phrases_initial.png" title="I didn't plan for these passwords to make a deep sounding quote but I really dig this..." /></a></div><br /><div>Going back to the hash breakdown by department, Engineering is also a huge department to target:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEATExv35cIg-2S7ev9LOZffiWyHVrDaKY9cynFF6w_RX5Ur6xLEhJaqJX2RHcUcSwGeYcVF-JO2-o4Ta4zhfyoCgMFQuY3LPnzgcQpTJCE18s6zF7cN5X3K5OcNJOjLIl9a7qmxqNxjK2f8rTIFWCUSo4QoH44p1_LtRkVFaJ03LoF3Y0USLcx9rs1kY/s1561/cmiyc2023_jupyter_department_info.png" style="margin-left: 1em; margin-right: 1em;"><img alt="SHowing the same graph from before of hash breakdown by department. 
Almost half of the hashes are from the engineering department" border="0" data-original-height="1561" data-original-width="1191" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEATExv35cIg-2S7ev9LOZffiWyHVrDaKY9cynFF6w_RX5Ur6xLEhJaqJX2RHcUcSwGeYcVF-JO2-o4Ta4zhfyoCgMFQuY3LPnzgcQpTJCE18s6zF7cN5X3K5OcNJOjLIl9a7qmxqNxjK2f8rTIFWCUSo4QoH44p1_LtRkVFaJ03LoF3Y0USLcx9rs1kY/w305-h400/cmiyc2023_jupyter_department_info.png" title="I was too lazy to take another screenshot so here's the code as well again" width="305" /></a></div><br /><div>The approach here then is to crack as many hashes as possible with fast hashing algorithms to try and figure out the source materials. Then we need to target high-value hashes in the engineering department using phrases from those source materials. Basically dumb, untargeted attacks first, then smart attacks later. Let's start with those dumb untargeted attacks!</div><div><br /></div><div>At a high level this looks like a <a href="https://xkcd.com/936/">Correct Horse Battery Staple problem</a>. To target that, let's try all the common English words in two and three word phrases and add the number '1' to the end. For a dictionary we can use the following corpus which contains various word-lists of 10k English words sorted in probability order [<a href="https://github.com/first20hours/google-10000-english">Link</a>]. The first really "just get it to work" option I selected was to write a quick Python program that loops through the word-list and outputs possible phrases while appending the number 1 to them. I then used the fact that if you do not specify a dictionary, Hashcat's '-a 0' mode will read in words from stdin. So I can run my attack using the following command:</div><div><ul style="text-align: left;"><li>(Editor note: This option is bad. 
Keep reading for a better one) <b>python3 word_combinator.py | hashcat -a 0 -m 0 uncracked_hashes.txt</b></li></ul><div>This wasn't pretty, but it did crack a number of hashes. Still, my guess generation was super slow as it runs a slow Python script and then pipes those guesses into hashcat (piping guesses is also slow). Raw-MD5 is fast to compute. Basically this option wastes a lot of time and limits the key-spaces I can search. How about we speed this up using Hashcat's combinator attack?</div><div><br /></div><div>Hashcat's combinator attack '-a 1' allows you to combine two dictionaries together to target multi-word passwords. For example, let's assume you have the following two word-lists:</div></div><div><br /></div><div>dict1.txt</div><div><ul style="text-align: left;"><li>fluffy</li><li>scary</li><li>cuddly</li></ul><div>dict2.txt</div></div><div><ul style="text-align: left;"><li>cat</li><li>bat</li><li>rat</li></ul><div>If you run the following command:</div></div><div><ul style="text-align: left;"><li>hashcat --stdout -a 1 dict1.txt dict2.txt</li></ul><div>You'll get the following output:</div></div><div><ul style="text-align: left;"><li>fluffycat</li><li>fluffybat</li><li>fluffyrat</li><li>scarycat</li><li>scarybat</li><li>scaryrat</li><li>cuddlycat</li><li>cuddlybat</li><li>cuddlyrat</li></ul><div>You can also apply one (AND ONLY ONE) rule to each dictionary if you want using the '-j' (applied to left word list) and '-k' (applied to right word list) options. So for example if you use the following command:</div></div><div><ul style="text-align: left;"><li>hashcat --stdout -a 1 -j '$ ' -k '$1' dict1.txt dict2.txt</li></ul><div>It'll create the following guesses:</div></div><div><ul style="text-align: left;"><li>fluffy cat1</li><li>fluffy bat1</li><li>fluffy rat1</li><li>&lt;you get the idea&gt;</li></ul><div>For reference, the '<b>$</b>' rule appends a character to the end of a guess. So '<b>$</b> ' appends a space, and '<b>$1</b>' appends a '<b>1</b>'. 
I think you might see where this is going....</div></div><div><br /></div><div>The problem is, this works great for two word phrases. But what about three and four word phrases? I wish I knew of a better solution, but the short answer is I hope your cracking system has some free hard-drive space! You can only use combinator with two input dictionaries, and you can't pipe guesses into hashcat if you are using '-a 1' mode. The fastest option then is to create a word-list of all two word phrases. If you don't want to write a custom program to do this, you can always use hashcat and redirect the guesses to a file. For example:</div><div><ul><li>hashcat --stdout -a 1 -j '$ ' english_words.txt english_words.txt &gt; two_words.txt</li></ul></div><div>Then to try three words you can run</div><div><ul style="text-align: left;"><li>hashcat -m 0 -a 1 -j '$ ' -k'$1 ' uncracked_hashes.txt two_words.txt english_words.txt</li></ul><div>To try four words you can simply run</div></div><div><ul style="text-align: left;"><li>hashcat -m 0 -a 1 -j '$ ' -k'$1 ' uncracked_hashes.txt two_words.txt two_words.txt</li></ul><div>Side note, I also had success capitalizing the first letter by changing the -j rule to:</div></div><div><ul style="text-align: left;"><li>-j'c$ '</li></ul></div><div>This attack yielded a ton of cracks. Looking through them I started trying to find "unique" and "odd" phrases to try and figure out where the source material came from. This is because while the above attack works great against fast hashes like raw-md5, it will not scale against slow hashes like Bcrypt. We need to further optimize our attacks. 
Given that, here is a subsection of my cracked passwords:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRtCDDnPnix6IlJ5-u5NWaLqVxIp4fHSdbHpyWKqD_dZcirsemdAYuLPiv8ZWLEUacN-sWQLPaG_nluU4Yoi4x8f4MiAaNJI9V2-eHRNPBf252Ej4YUo6WOqqllw5_9SSHivgRM2HwTs6g2v5HCluYmjgGrq_CMcUlriMCa2rWvzIet2qX8r-0pMlEu2M/s1231/cracked_phrases.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Cracked passphrases from sci-fi novels" border="0" data-original-height="1231" data-original-width="1052" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRtCDDnPnix6IlJ5-u5NWaLqVxIp4fHSdbHpyWKqD_dZcirsemdAYuLPiv8ZWLEUacN-sWQLPaG_nluU4Yoi4x8f4MiAaNJI9V2-eHRNPBf252Ej4YUo6WOqqllw5_9SSHivgRM2HwTs6g2v5HCluYmjgGrq_CMcUlriMCa2rWvzIet2qX8r-0pMlEu2M/s16000/cracked_phrases.png" title="This approach does not scale well to cracking four word passphrases" /></a></div><br /><div>Most of these phrases were spectacularly unhelpful. But some of them stood out such as 'watch your food'. Running a quick google search on that + the "scifi" highlighted Project Hail Mary [<a href="https://bookertalk.com/project-hail-mary-by-andy-weir-earth-in-peril/">link</a>]. That was a book I loved and hated in equal parts so it brought up a number of mixed feelings, but it certainly seems like a good candidate. The challenge is that the book isn't in the public commons. Still, let's try and create a dictionary of quotes copied from that article.</div><div><br /></div><div>Next step was to create a janky Python program that would output all 2, 3, and 4 word phrases from the book paragraphs I had found. I know janky Python programs are slow, but so is cracking Bcrypt hashes. In this case it is better to minimize the number of guesses I make vs. 
focusing on how fast those guesses are generated.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9Tqjg4F5yTCf0HU_HRKDwQj7ZThfRSB2oIVhVQEGnjaiQnEAYc9vv3JArIi-3ooA5GT59slzDgVrW6ePiUegO3RX8oofLQPjSkiCfnX_gquYOXcLw8fJOXJ7pfG1ies3NIdKiQJhrj0jh7YY_a_eoIjHsjA1U1ycWgt1iuelFW_GzpF3JNBQVHmt1_aM/s1437/book_extractor.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Python code to output phrases from a book" border="0" data-original-height="1437" data-original-width="1358" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9Tqjg4F5yTCf0HU_HRKDwQj7ZThfRSB2oIVhVQEGnjaiQnEAYc9vv3JArIi-3ooA5GT59slzDgVrW6ePiUegO3RX8oofLQPjSkiCfnX_gquYOXcLw8fJOXJ7pfG1ies3NIdKiQJhrj0jh7YY_a_eoIjHsjA1U1ycWgt1iuelFW_GzpF3JNBQVHmt1_aM/w378-h400/book_extractor.png" title="All python scripts start off the worst scripts you can imagine until you use them enough that you rewrite them the correct way" width="378" /></a></div><br /><div>Side note: I apologize for putting this as a screenshot. 
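<div><br /></div><div>For anyone who can't read the screenshot, the general idea can be sketched in a few lines of Python. This is not a transcription of the script pictured: the function name, the lowercasing, and the punctuation stripping are all assumptions made for illustration.</div>

```python
import re


def phrases(text, sizes=(2, 3, 4)):
    """Yield each run of consecutive words of the given sizes, once, space separated."""
    # Assumption: lowercase everything and keep only letters and apostrophes
    words = re.findall(r"[a-z']+", text.lower())
    seen = set()
    for size in sizes:
        for start in range(len(words) - size + 1):
            phrase = " ".join(words[start:start + size])
            if phrase not in seen:
                seen.add(phrase)
                yield phrase


if __name__ == "__main__":
    sample = "Watch your food. Watch your step."
    for phrase in phrases(sample):
        print(phrase)
```

<div>Deduplicating matters more than speed here, since every repeated guess costs a full Bcrypt computation per target hash.</div><div><br /></div>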
I really wish Google's blogger had a code insert option...</div><div><br /></div><div>Running this through hashcat again yielded a new cracked hash!</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnj8lOXlU4yHjdjkM540i6T8xWeOOcO5zJlZyjD7OyOEdAFIBiWPmsWLYeiCHynliQsB0EnYTbvBdkbOAWxIG2qbQHLXakQePZdQ3o8k9_KdpRmGfAOwAk_9C25q-yJUB_09imwxOw_r5q54Uo55Z-oRsPYZ0gBYhiml2tvBRhGPat42Ms5SkGXCHv04U/s1530/hail_mary_wordlist.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Cracked password using the Hail Mary wordlist" border="0" data-original-height="894" data-original-width="1530" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgnj8lOXlU4yHjdjkM540i6T8xWeOOcO5zJlZyjD7OyOEdAFIBiWPmsWLYeiCHynliQsB0EnYTbvBdkbOAWxIG2qbQHLXakQePZdQ3o8k9_KdpRmGfAOwAk_9C25q-yJUB_09imwxOw_r5q54Uo55Z-oRsPYZ0gBYhiml2tvBRhGPat42Ms5SkGXCHv04U/w400-h234/hail_mary_wordlist.png" title="“I feel kind of stupid. There’s a whole bunch of science I should be doing, right?” -- Project Hail Mary Quote" width="400" /></a></div><br /><div>That's also a pretty unusual phrase, so I have high confidence that Project Hail Mary is one of the sources for the plain-texts. Let's try this against the bcrypt hashes!!!!!</div><div><br /></div><div>Annnnd nothing cracked.......</div><div><br /></div><div>This was disappointing, but it's probably because I was only using two paragraphs from the book. I need to find a better source to grab quotes from.</div><div><br /></div><div>Let me take a step back and say, this workflow loop is one of the keys to this contest. If the cracked fast hashes (raw-md5, raw-sha1, etc) are any indication, around 1/3rd of the high value hashes are phrases taken from books and movies. 
</div><div><br /></div><div><b>Key workflow for CMIYC 2023:</b></div><div><ol style="text-align: left;"><li>Find the source material for passphrases by analyzing your cracks against fast hashes</li><li>Create input dictionaries by scraping webpages of book and movie quotes and screenplays</li><li>Run those input dictionaries against the slow high-value hashes. </li><li>Repeat</li></ol>The problem for me is that this workflow is manually intensive, time consuming, and quite frankly boring as hell. During a competition it can be fun to get that dopamine hit as you crack new bcrypt hashes. After the contest, I'm simply wasting time while running up my power bill. So the question is, can I automate this at all? My power bill will still be high, but at least then I can watch <a href="https://www.youtube.com/watch?v=J_1EXWNETiI">new episodes of Ahsoka</a> vs. staring at my computer screen! How about I train my PCFG guess generator on cracked passphrases and let it crunch away at generating guesses? I mean, it worked for the Hashcat team! [<a href="https://github.com/hashcat/team-hashcat/blob/main/CMIYC2023/CMIYC2023TeamHashcatWriteup.pdf">Link</a>]. </div><div><br /></div><div>There are various ways to create the training set, but given how KoreLogic generated these passwords, and the plain-text values I was seeing, I just threw everything that had a 'space' into a training file using the following command line:</div><div><ul style="text-align: left;"><li>cat cmiyc2023_.potfile | grep ' ' | awk -F ':' '{print $2}' &gt; passphrase_cracked.txt</li></ul><div>I know, I could have done the word-list generation much better as a short Python script in my Jupyter Notebook, but I've got places to be and Star Wars episodes to watch! 
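<div><br /></div><div>For reference, a short Python version might look something like the sketch below. It assumes potfile lines of the form hash:plaintext with no colon in the hash portion (true for raw hashes and bcrypt, not for every salted format), and it does not decode $HEX[...] plaintexts. The function name and the sample lines are made up. Unlike the awk one-liner, it splits on the first colon only, so a passphrase that itself contains a colon is not truncated.</div>

```python
def extract_passphrases(potfile_lines):
    """Yield plaintexts containing a space from 'hash:plaintext' lines."""
    for line in potfile_lines:
        line = line.rstrip("\r\n")
        # partition() splits on the first colon only, keeping colons in the plaintext
        _hash, separator, plaintext = line.partition(":")
        if separator and " " in plaintext:
            yield plaintext


if __name__ == "__main__":
    # Placeholder lines standing in for a real potfile; to read a file instead,
    # pass an open file handle, e.g. extract_passphrases(open(path, encoding="utf-8"))
    sample_lines = [
        "hash1:password1\n",
        "hash2:watch your food\n",
        "hash3:time: to go\n",
    ]
    for passphrase in extract_passphrases(sample_lines):
        print(passphrase)
```

<div><br /></div>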
Now that I had a good training set, I then trained a PCFG grammar on it using the following command:</div></div><div><ul style="text-align: left;"><li>python3 ../../repos/pcfg_cracker/trainer.py -c 1 -r CMICY23_Passphrase -t passphrase_cracked.txt</li></ul><div>I set coverage (-c) to be 1 so the PCFG guesser will not generate any brute force (OMEN) guesses. I then gave this attack a test run against raw-sha256 hashes using the following command:</div><ul style="text-align: left;"><li>python3 ../../repos/pcfg_cracker/pcfg_guesser.py -r CMICY23_Passphrase | hashcat -m 1400 -a 0 uncracked_hashes.txt</li></ul><div>And.... Yup this looks promising:</div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTfngJTL94624XfCw8Bo00QzUTS9usO9j5cEKdLOCdGo5fTjb2aiVcvGLOLEF1F_huhVOtCMnli9EsuPo0OCAIZnU5SDPwCcWIrbVgOGJXq4B2WfloxJEvbXL3wAV6eZECa4B5xxjahPVpqjw7wbSAVnOfOcwX7v__odTZ1HWEeAaurd1KeqFweYJLAPQ/s1703/pcfg_sha256_phrase.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing the start of a pcfg passphrase cracking session with a lot of hashes being instantly cracked" border="0" data-original-height="1703" data-original-width="1502" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhTfngJTL94624XfCw8Bo00QzUTS9usO9j5cEKdLOCdGo5fTjb2aiVcvGLOLEF1F_huhVOtCMnli9EsuPo0OCAIZnU5SDPwCcWIrbVgOGJXq4B2WfloxJEvbXL3wAV6eZECa4B5xxjahPVpqjw7wbSAVnOfOcwX7v__odTZ1HWEeAaurd1KeqFweYJLAPQ/w353-h400/pcfg_sha256_phrase.png" title="Woho! I might actually be able to watch some Jedi wrecking stuff tonight!" 
width="353" /></a></div><br /><div>Let's see how it does with Bcrypt using the following command:</div><div><ul style="text-align: left;"><li>python3 ../../repos/pcfg_cracker/pcfg_guesser.py -r CMICY23_Passphrase | hashcat -m 3200 -a 0 uncracked_hashes.txt</li></ul><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgz4cO7j3OyM-NnxO2B0lhWc61v0zVnJge8eCPB6Fk1L4hI_iCSzR-cM_5csBDmHyOJ5CvmZp_JbmkbrU8vrsGwYZaBJjbQg5vYyXVCJuLQxQGrXP2ka255xHR_74yQGHEsZgpHRW2Eur3mowQJaaUHZPjbghIJ8E8KLIC2hvBJEq7WEKfp_5XJTLVFcI/s1946/pcfg_bcrypt_phrase.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing 1 bcrypt two word passphrase being cracked in the first two minutes" border="0" data-original-height="781" data-original-width="1946" height="160" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjgz4cO7j3OyM-NnxO2B0lhWc61v0zVnJge8eCPB6Fk1L4hI_iCSzR-cM_5csBDmHyOJ5CvmZp_JbmkbrU8vrsGwYZaBJjbQg5vYyXVCJuLQxQGrXP2ka255xHR_74yQGHEsZgpHRW2Eur3mowQJaaUHZPjbghIJ8E8KLIC2hvBJEq7WEKfp_5XJTLVFcI/w400-h160/pcfg_bcrypt_phrase.png" title="Hey, this is better than my Project Hail Mary dictionary!" width="400" /></a></div><br /><div>Success! Limited Success!</div></div><div><br /></div><div>There is still a ton of optimization I could do. You'll notice I haven't re-added / merged my potfiles in from the previous cracking of the Unix Epoch timestamp hashes. I also am targeting all of the Bcrypt hashes vs. just the ones in the engineering department. By reducing the target hashes I could easily double the speed of plain-text guesses I am making against the target hash list. I also don't want to give the false impression that this is the best attack method for these hashes. It's not. You would be much more successful by trying to find the source material and creating custom word-lists from that. What this attack workflow has going for it though is it is one of the most automatable options. 
You can let this run while trying to figure out better methods. Or... you can go do something else besides crack passwords. Call your parents maybe? I'm sure they would appreciate it!</div><div><br /></div><div>I think this is a good spot to end this blog post. Looking back at it, I somehow managed to cover every attack mode in Hashcat. There are still more techniques to dig into, and there's a ton of uncracked hashes left in this contest. But I might leave that for a future post. If you have any tips, suggestions, or comments, feel free to leave them in the comments. Good luck, and I hope to see everyone at CMIYC 2024! Also thanks once again to the KoreLogic team for putting together such a great contest!</div><div><br /></div><div></div></div>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-68367068013579027582023-08-22T10:20:00.003-07:002023-08-22T13:26:36.617-07:00Using JupyterLab to Manage Password Cracking Sessions (A CMIYC 2023 Writeup) Part 2<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8RYDab9CRXr8mtDhZePTerZJNpxpvDCgSI6aRlkch975LV_7DFVws5QUwYqXX7kojlpVEcY5E1KkWQZgDx1bngHH8N-WLH2klzJzTrstnlO6rO7UQsMS1hJdR7pKw0fJzJRtga9gZkAr_O4si1wosVYRx2J1H2vZmJhDOZZPeM6H2oTqBLnpjfymco2A/s1456/cracking_computer.png" style="margin-left: 1em; margin-right: 1em;"><img alt="AI generated art of a computer with colors behind it" border="0" data-original-height="832" data-original-width="1456" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg8RYDab9CRXr8mtDhZePTerZJNpxpvDCgSI6aRlkch975LV_7DFVws5QUwYqXX7kojlpVEcY5E1KkWQZgDx1bngHH8N-WLH2klzJzTrstnlO6rO7UQsMS1hJdR7pKw0fJzJRtga9gZkAr_O4si1wosVYRx2J1H2vZmJhDOZZPeM6H2oTqBLnpjfymco2A/w640-h366/cracking_computer.png" title="Stylized AI generated art of my cracking setup" width="640" /></a></div><blockquote><p><i><b> “Tools?" 
scoffed Kalisti, "Tools are for people who have nothing better to do than think things through and make sensible plans.”</b></i></p><p><b>― Laini Taylor, Muse of Nightmares</b></p></blockquote><p>When we left off in <a href="https://reusablesec.blogspot.com/2023/08/using-jupyterlab-to-manage-password.html">Part 1 of my CMIYC2023 Writeup</a>, I had cracked a measly 437 passwords. Yes I had a Jupyter Notebook set up to perform analysis, but what I really needed was more cracked passwords to do analysis on. To that end, I started off doing some basic exploratory attacks very similar to what I detailed in previous competitions [<a href="https://reusablesec.blogspot.com/2022/05/password-cracking-tips-crackthecon.html">Crack the Con</a>, <a href="https://reusablesec.blogspot.com/2022/08/more-password-cracking-tips-defcon-2022.html">CMIYC2022</a>].</p><p>These included running JtR Single Mode with the RockYou, dic-0294, Alter_Hacker, and the sraveau-Wikipedia wordlists. Basically these attacks are about as dumb and untargeted as you can get. But they are also easy and quick to run against fast hash types. And they can be helpful! 
The Wikipedia wordlist in particular highlighted that Cyrillic passwords would likely play a role in this competition.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP3iHxEO1moYwnVXw11K0IGaxf9jlzxGp1bNRi49qv6WiFTsd5Nh0rvWur5NjN_J-mraNzj9uhNWyt1NXirvN0zHuxew52OmmQ4y-bbB0mH3u_PTXWstH_8__9UVroQHfShYMF10R9nUKroub6eR-c-K46xfNsIVh7Oz5Et4C0CPU5VicZ6ueJuISaCrI/s918/wiki_wordlist.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A running JtR cracking session showing several Cyrillic passwords being cracked" border="0" data-original-height="918" data-original-width="476" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiP3iHxEO1moYwnVXw11K0IGaxf9jlzxGp1bNRi49qv6WiFTsd5Nh0rvWur5NjN_J-mraNzj9uhNWyt1NXirvN0zHuxew52OmmQ4y-bbB0mH3u_PTXWstH_8__9UVroQHfShYMF10R9nUKroub6eR-c-K46xfNsIVh7Oz5Et4C0CPU5VicZ6ueJuISaCrI/w208-h400/wiki_wordlist.png" title="This still isn't very many passwords cracked. But it is interesting!" width="208" /></a></div><p>Running an attack using a Russian wordlist (<a href="https://github.com/hingston/russian">from here</a>) didn't yield many additional cracks though. Doing some Googling, it looks like some of these are Ukrainian words, not Russian words so I tried this dictionary as well [<a href="https://github.com/lang-uk/ukrainian-word-stress-dictionary/blob/master/stress.txt">link</a>]. One thing though is I was able to see all the usernames are also Cyrillic. This will probably be useful to target tougher hashes.</p><p>Around this time, I also created a second Notebook in JupyterLabs to automate some of the common tasks that I am always doing. For example, running loopback attacks against fast hashes using previously cracked passwords is a great source of new cracks. 
So that's a good task to automate since I'm constantly doing that in the background.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjA8NFmeSQFyW1CpeLOrUQW1vC97XROoIwv8GpoWv0DAqF7guFVeSCxcO6JqKNGoqtp2oOuR8zNOtZe-J1CRuEZyfyY8o7_Yi8QVUnbw4YL_dHMXQmCBclsmjBRHqUSvePLIxIqoW4yHpUuyzJFx2cIH34UH2HVpsmfOoCERmf66C_q8IPkp_YwbKFcNUk/s1182/jupyter_loopback.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Running JtR loopback attacks via Jupyter Notebook" border="0" data-original-height="797" data-original-width="1182" height="270" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjA8NFmeSQFyW1CpeLOrUQW1vC97XROoIwv8GpoWv0DAqF7guFVeSCxcO6JqKNGoqtp2oOuR8zNOtZe-J1CRuEZyfyY8o7_Yi8QVUnbw4YL_dHMXQmCBclsmjBRHqUSvePLIxIqoW4yHpUuyzJFx2cIH34UH2HVpsmfOoCERmf66C_q8IPkp_YwbKFcNUk/w400-h270/jupyter_loopback.png" title="Notice I made a &quot;--session=automated&quot; so this doesn't interfere with my manually created cracking attacks" width="400" /></a></div><p>I also created cells for similar activities that I'm constantly doing such as creating a single list of all the plaintext cracked passwords I can use for training. This in turn allows me to quickly create a PCFG training set on the full list for cracking fast and medium-speed hashes. 
Aka:</p><p></p><ul style="text-align: left;"><li>python3 trainer.py -r CMIYC2023 -t ../../research/cmiyc/2023/all_plains.txt</li><li>python3 ../../../github/pcfg_cracker/pcfg_guesser.py -r CMIYC2023 | ../../../github/JohnTheRipper/run/john --format=raw-sha256 -stdin raw_sha256_hashlist.txt</li></ul><div>This also netted me my first two bcrypt cracks:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjE0alPNDhspxruj-RwVaY-TnjVRx_3t6cFBsjKYhz9epRhG6PT2DW56XODdQbMcpUrA5-EMhPc2MHzxL2Z4lqcOr6S66TJRWffeAFQBQAT7z1ft4xvjeNx95SPAXUp65DKM7DKtm2Wb2pam1EUaZpSfUAVrsHQKKuEtdSoiKrG3uXbyI_TvAOm2_fClEM/s1808/first_two_bcrypt.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Two bcrypt hashes being cracked by a pcfg attack" border="0" data-original-height="865" data-original-width="1808" height="191" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjE0alPNDhspxruj-RwVaY-TnjVRx_3t6cFBsjKYhz9epRhG6PT2DW56XODdQbMcpUrA5-EMhPc2MHzxL2Z4lqcOr6S66TJRWffeAFQBQAT7z1ft4xvjeNx95SPAXUp65DKM7DKtm2Wb2pam1EUaZpSfUAVrsHQKKuEtdSoiKrG3uXbyI_TvAOm2_fClEM/w400-h191/first_two_bcrypt.png" title="This is with a single laptop running JtR so I can't make many Bcrypt guesses" width="400" /></a></div><br /><div>Since I'm doing this after the contest and can't submit cracks, 
I also created a quick "scoreboard" in Jupyter to estimate how I'd stack up to the other street teams:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghL46xZ7YQcbJuTeDiv0KmuGU4GrrgiMyTo-jK4hnW-RkOV7FycEsfooK-ggvi36LtZ9UycCvE2Qn8wq1xl5idEljJ4u9x2L_qQ-zcsWgYJFaNViOgPtfirxOtRIoJhkQS77HZCNMzSQZe8rY6KryzIRUFur8RgSTltmUd3u9xsa2tQwW6WdYhmdLo0Hs/s1083/current_score.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Currently ranked 21st place" border="0" data-original-height="663" data-original-width="1083" height="245" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghL46xZ7YQcbJuTeDiv0KmuGU4GrrgiMyTo-jK4hnW-RkOV7FycEsfooK-ggvi36LtZ9UycCvE2Qn8wq1xl5idEljJ4u9x2L_qQ-zcsWgYJFaNViOgPtfirxOtRIoJhkQS77HZCNMzSQZe8rY6KryzIRUFur8RgSTltmUd3u9xsa2tQwW6WdYhmdLo0Hs/w400-h245/current_score.png" title="Not great, but at least I'm on the board!" width="400" /></a></div><br /><div>I want to stress again. I'm cheating. These teams actually competed in the competition. I'm leisurely sitting down for short cracking sessions while writing this blog post. Which is another way of saying my numbers are even worse than they appear ;p</div><div><br /></div><div>Well, I need to do something about that! 
Next step is to look through my updated meta/cracked list and try to spot patterns:</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqxPFiZ9oH1SqOKdO1u3nyGK4LH5ylmNxemQK__vlU61uQDLRC0pew37xhfLmgwFawuBE-ajdXlc_UBTjqQL3jWbUMqEEa-VqnU9HP8gzZ23P2zUTyhZbXDwec1rUst2s9ebB9cNUW3Zyw0upYzLg57dQVQHrkZB1hxOAcLmJ2pVt67D8167BY-gYR8fc/s1934/meta_cracked1.png" style="margin-left: 1em; margin-right: 1em;"><img alt="List of cracked passwords and associated metadata" border="0" data-original-height="1718" data-original-width="1934" height="355" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqxPFiZ9oH1SqOKdO1u3nyGK4LH5ylmNxemQK__vlU61uQDLRC0pew37xhfLmgwFawuBE-ajdXlc_UBTjqQL3jWbUMqEEa-VqnU9HP8gzZ23P2zUTyhZbXDwec1rUst2s9ebB9cNUW3Zyw0upYzLg57dQVQHrkZB1hxOAcLmJ2pVt67D8167BY-gYR8fc/w400-h355/meta_cracked1.png" title="The list is slowly getting more respectable..." width="400" /></a></div><br />I now have cracks for every department, but it doesn't seem like the individual departments follow a set pattern. There's a couple of different ways to go from here. 
The first thing to do is create some custom rules to match patterns that I'm seeing in the plaintexts.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiinc1Sp3SYc72PEziAPcCrkmv7GP6L2IoQ8CwrpUODMv_tHJRoqhqawlaenx33b3oErqOrVadQ4GOVWJdasSlBtQYE7Ilzp41tV68rup2m2WP4ywlxqnazwvgiI6OcQmfdXDKBycn9XRqxEappEtVH4hTUKxoBd2ITAbfoW9RjUd0vB4F8N3_AU77MYAw/s813/partial_ruleset.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Partial ruleset I generated for CMIYC" border="0" data-original-height="813" data-original-width="394" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiinc1Sp3SYc72PEziAPcCrkmv7GP6L2IoQ8CwrpUODMv_tHJRoqhqawlaenx33b3oErqOrVadQ4GOVWJdasSlBtQYE7Ilzp41tV68rup2m2WP4ywlxqnazwvgiI6OcQmfdXDKBycn9XRqxEappEtVH4hTUKxoBd2ITAbfoW9RjUd0vB4F8N3_AU77MYAw/w194-h400/partial_ruleset.png" title="This is only some of them, and there's a ton of optimization I can make" width="194" /></a></div><br /><div>Running the rules above and several others yielded a few more cracks. 
The other thing I did was create some custom PRINCE wordlists from the cracked passwords using PRINCE-LING from the pcfg toolset.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHK5nUgN80hZfvse3roeAYk5Kn82ftwAaVb3p9EpBjtpklCVysLhlgK7EWAMjbkrhzrEUjEm7i6U6ITvs32QdDztoUADt7K2yCrVZuO3VTCeEM8WlDebATvABINxSr2oPuJDFkNFmjAhz83OzycaWyLR3W5UCLIMnrgMzSqOFNmoZpym27kzz517YxGWU/s1807/princeling.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Running PRINCE-LING to generate rulesets" border="0" data-original-height="897" data-original-width="1807" height="199" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjHK5nUgN80hZfvse3roeAYk5Kn82ftwAaVb3p9EpBjtpklCVysLhlgK7EWAMjbkrhzrEUjEm7i6U6ITvs32QdDztoUADt7K2yCrVZuO3VTCeEM8WlDebATvABINxSr2oPuJDFkNFmjAhz83OzycaWyLR3W5UCLIMnrgMzSqOFNmoZpym27kzz517YxGWU/w400-h199/princeling.png" title="This is one of the cooler tools I've written, and probably the one that most easily fits into most people's cracking workflows." width="400" /></a></div><br /><div>I created both a full wordlist (as seen above) and also wordlists containing only 500 values for targeting slower hash-types. Using Prince attacks then identified a few more rules that yielded additional cracks.</div><div><br /></div><div>Another area seemed to be to target non-ascii usernames with non-ascii guesses. 
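<div><br /></div><div>As a rough sketch of that split, assuming the usernames or candidate words are already loaded as Python strings (the helper names and sample values here are hypothetical):</div>

```python
def is_non_ascii(word):
    """Return True if the word contains at least one non-ASCII character."""
    # Encoding with errors="ignore" silently drops non-ASCII characters,
    # so a shorter byte string means something was dropped
    return len(word.encode("ascii", "ignore")) < len(word)


def split_by_ascii(words):
    """Split words into (ascii_only, contains_non_ascii), preserving order."""
    ascii_only, non_ascii = [], []
    for word in words:
        (non_ascii if is_non_ascii(word) else ascii_only).append(word)
    return ascii_only, non_ascii


if __name__ == "__main__":
    ascii_only, non_ascii = split_by_ascii(["sales2023", "пароль", "café"])
    print(ascii_only)
    print(non_ascii)
```

<div>On Python 3.7 and later, "not word.isascii()" is an equivalent check.</div><div><br /></div>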
</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZo2DPfdeefs66QWe_YZR_hcaSNdKH9hSEJiqPmMUBYoPkwremcyGrZBWD-rK1edihYH9j2JV3ybHTRyxiDSNiKt4gvjv_0_JUBgWTkE1eehpZPPw0nW6WE5cL1aSLXw99Zb7jGE7X7lJp_PkU2ZZjO0BE0QWZhpTe69-XzVsA7TCN-HDlzHp35qUbM5I/s1175/non-ascii.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Jupyter script to create user lists and dictionaries only containing non-ascii values" border="0" data-original-height="640" data-original-width="1175" height="347" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhZo2DPfdeefs66QWe_YZR_hcaSNdKH9hSEJiqPmMUBYoPkwremcyGrZBWD-rK1edihYH9j2JV3ybHTRyxiDSNiKt4gvjv_0_JUBgWTkE1eehpZPPw0nW6WE5cL1aSLXw99Zb7jGE7X7lJp_PkU2ZZjO0BE0QWZhpTe69-XzVsA7TCN-HDlzHp35qUbM5I/w640-h347/non-ascii.png" title="The input dictionary is created by PRINCE-LING so all the words are already stemmed" width="640" /></a></div><br /><div>Side note: If you ever have to identify non-ASCII characters using Python, the following check is highly effective as it strips non-ASCII characters and then sees if the word shrunk:</div><div><ul style="text-align: left;"><li>if len(key.encode("ascii", "ignore")) &lt; len(key):</li></ul><div>In addition, I later added in the GivenName and SurName fields from the metadata (not depicted above) which helped a lot too.</div><div><br /></div><div>As we're continuing to throw things against the wall, let's try and build more wordlists from all the metadata. Many of the company names seem to be two words concatenated together. Let's strip them out and break them up.</div></div><div><br /></div><div>And ... using dictionaries based on the company metadata was totally unhelpful. 
I did not get a single additional crack using those dictionaries.</div><div><br /></div><div>After all of that, let's check where I'm at with my score:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKuK2ZShVPj5pzRIL5QzBl5kRMnse4WRrNT1MlpXUc9PfOZY2ILylbSUk9P48eisi2OupNrB7s8WsUXbiaiLxKtLCEr87earjjm7Ao0y6iahPnrYZgJeg4GQckYHPEpNVdT8Bt_eyAE86zuagQeIoBcUl7zVm-iBvasRS-RtWyqd29LpQiPV2soQiYKXw/s412/rank17.png" style="margin-left: 1em; margin-right: 1em;"><img alt="My score shows I'd be rank 17 if I had competed ... which I hadn't" border="0" data-original-height="82" data-original-width="412" height="80" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhKuK2ZShVPj5pzRIL5QzBl5kRMnse4WRrNT1MlpXUc9PfOZY2ILylbSUk9P48eisi2OupNrB7s8WsUXbiaiLxKtLCEr87earjjm7Ao0y6iahPnrYZgJeg4GQckYHPEpNVdT8Bt_eyAE86zuagQeIoBcUl7zVm-iBvasRS-RtWyqd29LpQiPV2soQiYKXw/w400-h80/rank17.png" title="Not great Bob!" width="400" /></a></div><br /><div>I really need to switch things up because that's not even close to respectable. Looking through the list of cracked passwords no new patterns stood out, but then I decided to map out my progress targeting each department in my Jupyter Notebook.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_WKMchTsEMKndLyLAlCu3cxxsTqnx8n4kaKoSbqT64nuQ_3uTcvTmDz-__y6pIigWW4sIgABcUpHf9L1sROMANu0gvNgFvctEAjCqU2fjy5lF8T2noZsEnBm7W7z1CJZHcDQOHIxylD_xxMlyGsjA6hTuseIMfVR5_mjrMX80uub2CmCy3Pcf6oQmpyk/s1470/target_sales.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A list of cracked/uncracked sorted by departments. 
Sales is where most of my cracks came from" border="0" data-original-height="1193" data-original-width="1470" height="325" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj_WKMchTsEMKndLyLAlCu3cxxsTqnx8n4kaKoSbqT64nuQ_3uTcvTmDz-__y6pIigWW4sIgABcUpHf9L1sROMANu0gvNgFvctEAjCqU2fjy5lF8T2noZsEnBm7W7z1CJZHcDQOHIxylD_xxMlyGsjA6hTuseIMfVR5_mjrMX80uub2CmCy3Pcf6oQmpyk/w400-h325/target_sales.png" title="I could have done this as a pie chart but this is easier to read" width="400" /></a></div><br /><div>OMG. This hadn't been apparent as I was looking through the previous view of cracked hashes since I forgot how many more IT hashes there were than any other group, but what I had been doing is basically cracking Sales passwords. I had that pattern down pat, and my PCFG cracker was trained on mostly Sales passwords. How about I run a PCFG attack, trained on CMIYC2023 plains against Bcrypt Sales-only hashes? To create the hashlist using Jupyter was easy:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBRhgZiNVQS7dzw1dyFCe00hlwAZn-fHGrh0gXRXvgneUmjT44upNzHqYlB-cmuoY0atvaAVB4XLAqChNdtX1VY7uTZcZH3YElFyoaT9W5KIQYXUUrWgFzlgBJzHU7yaoizw5lAtwetcoCL_LWKwGEfHpSYB71d0Dq6nL_AONcPR8YCH8JEPO26Ck2tTw/s906/sales_dictionary.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Python code to create a sales only hashlist" border="0" data-original-height="269" data-original-width="906" height="119" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBRhgZiNVQS7dzw1dyFCe00hlwAZn-fHGrh0gXRXvgneUmjT44upNzHqYlB-cmuoY0atvaAVB4XLAqChNdtX1VY7uTZcZH3YElFyoaT9W5KIQYXUUrWgFzlgBJzHU7yaoizw5lAtwetcoCL_LWKwGEfHpSYB71d0Dq6nL_AONcPR8YCH8JEPO26Ck2tTw/w400-h119/sales_dictionary.png" title="I'm kicking myself for not doing this earlier" width="400" /></a></div>I then ran a pcfg attack against the sales-only bcrypt hashes using the following command:<p></p><p></p><ul 
style="text-align: left;"><li>python3 pcfg_guesser.py -r CMIYC2023 --skip_brute | john --stdin --format=bcrypt sales_hashes.txt</li></ul><div>That was super effective!</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTbcWBNWjvXxwThB05BmBiHmDLXtN94zpAy28PlU0XUAlb-X_KiCUmKbZbObxzpWOvQFq5z06pcjP4RvUlzu9PuAC7Lle4W3wUKMSICK8JVBEIlnUbpAYcE9b2fjJKmd28RCnm55f4l3nLXDyxSxpkziASZ6ywxNbI4Pc4Y2m4PhdpzrfqrYFIsITpyj0/s948/bcrypt_better.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Cracking lots of bcrypt on my laptop" border="0" data-original-height="928" data-original-width="948" height="391" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiTbcWBNWjvXxwThB05BmBiHmDLXtN94zpAy28PlU0XUAlb-X_KiCUmKbZbObxzpWOvQFq5z06pcjP4RvUlzu9PuAC7Lle4W3wUKMSICK8JVBEIlnUbpAYcE9b2fjJKmd28RCnm55f4l3nLXDyxSxpkziASZ6ywxNbI4Pc4Y2m4PhdpzrfqrYFIsITpyj0/w400-h391/bcrypt_better.png" title="This is all using a CPU cracker on a laptop against Bcrypt hashes!" width="400" /></a></div><br /><div>I want to stress, this was all using John the Ripper on a single laptop. And these are BCrypt hashes I'm targeting. They are super slow and annoying even with better cracking rigs! 
After about an hour of cracking time my score had significantly increased.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW_xGW0Gc6k6hLsB6LABwWng_l0Wg0cUoyIEBcAsCvlJt_wxBf0lHpSTx2XqcBGuvcbMiIXfO4T2UfgZHM3K_NcyBqZAdkiZIpMr9i2kMfXYInqjTlqZ-TWJuxLuVtqUqOfW5NI_22iyIN1H_It2MtJUAE2aBWzKwmQN07Xn8rVboj6pAnfdlJrX6JlZI/s323/better_score.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Score showing Rank 12" border="0" data-original-height="77" data-original-width="323" height="76" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW_xGW0Gc6k6hLsB6LABwWng_l0Wg0cUoyIEBcAsCvlJt_wxBf0lHpSTx2XqcBGuvcbMiIXfO4T2UfgZHM3K_NcyBqZAdkiZIpMr9i2kMfXYInqjTlqZ-TWJuxLuVtqUqOfW5NI_22iyIN1H_It2MtJUAE2aBWzKwmQN07Xn8rVboj6pAnfdlJrX6JlZI/w320-h76/better_score.png" title="Let's hear it for those people in Sales!" width="320" /></a></div><br /><div>I then started running the same attack against other hash types such as sha256crypt and had similar success.</div><div><br /></div><div>Next I realized I needed to build up my dictionary. To do this I took a common prefix (2023), added a '!' to the end, and used JtR's Mask mode to exhaust alpha characters for the remaining letters, capitalizing/lowercasing the first letter and having the rest lowercase</div><div><ul style="text-align: left;"><li>john --mask=2023[a-Z]?l?l?l?l?l?l! --format=raw-md5 raw_md5_hashlist.txt</li></ul>I did several variations of the above and found a few new base words but not many. I then re-ran the PCFG trainer to update my sales ruleset and then re-ran my attacks to net a few more hashes cracked. Still, there was certainly room for improvement and I was semi-happy with my wordlist so the next thing to do was look for new mangling rules. To that end I re-ran JtR's mask attacks with a known plaintext word "Sales". 
For example:</div><div><ul style="text-align: left;"><li>john --mask=?a[Ss]ales?a?a?a?a --format=raw-md5 raw_md5_hashlist.txt</li></ul>This didn't yield more cracks so I was very kerfluffled about that. There's obviously a good chunk of passwords I'm still missing in this group. Still, running my PCFG attack again (retrained with the newly found base words) against the tougher "Sales Only" hashes had a big impact on my score.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg1MqhUEGdhx0yj16HiIvDzT6pC3bIFrVrmKcK6J_KGkkx_kcGxEEg6F1WbUlN75hl-AJUc-v67UUDUYIK0Rucbwg8Ziu4yHMONwSBgQ2k6qJqx9gdIGlaE8S9hxebsciRow10vDDRlcAOf1Oo7xi1IQxNcviMQ9Y-37GkbZ6Q-0wQXlI7o2P6SGPvTTo/s367/final_score_post2.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A calculated score showing that I would have been in 10th place" border="0" data-original-height="82" data-original-width="367" height="71" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg1MqhUEGdhx0yj16HiIvDzT6pC3bIFrVrmKcK6J_KGkkx_kcGxEEg6F1WbUlN75hl-AJUc-v67UUDUYIK0Rucbwg8Ziu4yHMONwSBgQ2k6qJqx9gdIGlaE8S9hxebsciRow10vDDRlcAOf1Oo7xi1IQxNcviMQ9Y-37GkbZ6Q-0wQXlI7o2P6SGPvTTo/w320-h71/final_score_post2.png" title="Let me stress again, I didn't compete. This score is not representative of how I actually would have done. Everyone who competed did better than me." width="320" /></a></div><br /><div>I know these calculated scores have real "if the game lasted one more round I would have won" little brother energy. I have a ton of respect for everyone who did compete in CMIYC and want to stress that if I were doing this live, in Vegas, with a million other things going on, I would have done much, much worse. 
But it's helpful to know that without solving any of the "real" challenges in the contest there were still ways to make progress with limited cracking resources.</div><h3 style="text-align: left;">Conclusion:</h3><div>I think this is a good place to wrap up this blog post. All these attacks were run on my laptop with John the Ripper, and I think that shows a baseline anyone can reach. I also hope this highlighted the value of using Jupyter Notebooks, not just for password cracking, but in any data analysis task you might find yourself doing.</div><div><br /></div><div>I'd specifically like to thank the KoreLogic team for running yet another great contest! I had a ton of fun digging into it and even more fun talking to everyone at their booth at Defcon! These contests require a ton of work to set up and they help the community so much, so I really appreciate all the work their team puts into this.</div><div><br /></div><div>One thing I'd like you to walk away from these entries with is how useful a tool JupyterLab can be for all your data analysis tasks. It was a long time before I started using it myself, and I'm constantly surprised by how easily it integrates into my workflows and how much more productive it makes me. I highly recommend checking it out, even if you aren't cracking passwords.</div><div><br /></div><div>I have a lot of ideas of where to go next. I'm tempted to write a follow-up blog entry that goes full spoiler into this contest, looks at how the plains were generated and then uses Hashcat on a more powerful machine to show how to target them (such as using Hashcat's amazing -a 9 association mode). I also have some improvements I need to make to my PCFG toolset based on feedback from other people using it during this contest. Finally, I want to clean up the Jupyter notebook that I created for these posts and make it available on Github. When I finally get around to that I'll post a link on this blog entry. 
These were fun entries to write, and this was a great contest to (belatedly) participate in. Thanks again to KoreLogic and congratulations to all the teams that participated.</div><p></p><p></p>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-50487836945722823132023-08-19T21:20:00.009-07:002023-08-19T23:35:58.250-07:00Using JupyterLab to Manage Password Cracking Sessions (A CMIYC 2023 Writeup) Part 1<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihtL654wNiDQbEEkozTnOuR8F86CffSe0MnNbnBL02R3LF4JgoMcKig1wXTQ5euhqy5YNmkAjvR05bHrOnPMi67sCRFKMGjfbCqb0cpjak17wheTMD1VD_2eCdFsZuzsKwkgzdauchtoWj73VRKGg2AVnnADbKXO2krNG0BA4ifX0rJJAzM5eGM_NRgok/s1456/group_of_computer_security_experts_graphs_and_charts.png" style="margin-left: 1em; margin-right: 1em;"><img alt="MidJourney Imagining a Bunch of Data Scientists Cracking Passwords" border="0" data-original-height="832" data-original-width="1456" height="366" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEihtL654wNiDQbEEkozTnOuR8F86CffSe0MnNbnBL02R3LF4JgoMcKig1wXTQ5euhqy5YNmkAjvR05bHrOnPMi67sCRFKMGjfbCqb0cpjak17wheTMD1VD_2eCdFsZuzsKwkgzdauchtoWj73VRKGg2AVnnADbKXO2krNG0BA4ifX0rJJAzM5eGM_NRgok/w640-h366/group_of_computer_security_experts_graphs_and_charts.png" title="MidJourney Imagining a Bunch of Data Scientists Cracking Passwords" width="640" /></a></div><p></p><p></p><blockquote><p style="text-align: center;"><b>“We become what we behold. We shape our tools, and thereafter our tools shape us.”</b></p><p style="text-align: center;"><b>-- Marshall McLuhan</b></p></blockquote><p>This year I didn't compete in the Defcon Crack Me If You Can password cracking competition. It was my wife's first Defcon, so there was way too much stuff going on to sit around our hotel room slouched over a computer. 
But now that a week has passed and I'm back home, I figure the CMIYC Street Team Challenge would be a great use-case to talk about data science tools!</p><p><b>Big Disclaimer:</b> I've read spoilers from other teams and have participated in the post-contest Discord server. I'm totally cheating here. The focus is on how you can use JupyterLab to perform analysis while cracking passwords. Not my problem solving skills (or lack thereof).</p><h3 style="text-align: left;">Initial Exploration of the Challenge Files:</h3><div>The CMIYC challenge file for street teams is available <a href="https://contest-2023.korelogic.com/downloads.html">here</a>. It's a pgp encrypted file so the first thing to do is decrypt it with the password KoreLogic provided.</div><div><ul style="text-align: left;"><li>gpg -o cmiyc_2023_01_street.yaml -d cmiyc-2023_01_street.yaml.pgp</li></ul></div><div>Looking at the file in a text editor, it appears at first glance to be a YAML file.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiduWn1xFuh8h3bHcxn0acCmoWPdnVKntIW51nNqb4wO1UhCIGATdPwY3UOibcqeOzfDGeCiTWO41fH-ITJogG9gc3TIbNG3Bv3FWVSQRuOyuQtBXDPt7mvzTrVVQcVvq8I_gWLTARkCzqdmTartWCBpMq-UqYr2RjzleSVa0XvJJFlXZwPhcZb14FtglY/s1084/cmiyc2023_yaml.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Picture of the start of the yaml file for the contest" border="0" data-original-height="736" data-original-width="1084" height="271" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiduWn1xFuh8h3bHcxn0acCmoWPdnVKntIW51nNqb4wO1UhCIGATdPwY3UOibcqeOzfDGeCiTWO41fH-ITJogG9gc3TIbNG3Bv3FWVSQRuOyuQtBXDPt7mvzTrVVQcVvq8I_gWLTARkCzqdmTartWCBpMq-UqYr2RjzleSVa0XvJJFlXZwPhcZb14FtglY/w400-h271/cmiyc2023_yaml.png" title="I really don't like yaml files due to indentation mattering. But I do like Python... I can't explain it." 
width="400" /></a></div><br /><div>Of course, you shouldn't trust anything that the contest organizers throw your way! Next up is to validate the yaml format and see if there is anything obviously wrong with it. A quick way to do that is using yamllint. To install and run yamllint:</div><div><ul style="text-align: left;"><li>pip install yamllint</li><li>yamllint cmiyc_2023_01_street.yaml</li></ul><div>And the results are .... ok there's a lot of long lines....</div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQsAqDXv4TjJqr36YB3ihor0UTMDPKbEdUHQyvSsaoBMPW_j2R-pL7kexobmjxkMZTHTzu3q4zkSFc2TKENRI2Sx0HjoXjxjSytSdZ5VZcKPUajzqyAKgwJr8cvBCn4K2fG_SXHBxmCajCDAi9Tv2h8Ld2aEbojmCh_BkD0L9nxMZ0glLIvtsiMNpH9-8/s1139/cmiyc2023_yaml_error.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Lots of line too long errors" border="0" data-original-height="332" data-original-width="1139" height="116" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQsAqDXv4TjJqr36YB3ihor0UTMDPKbEdUHQyvSsaoBMPW_j2R-pL7kexobmjxkMZTHTzu3q4zkSFc2TKENRI2Sx0HjoXjxjSytSdZ5VZcKPUajzqyAKgwJr8cvBCn4K2fG_SXHBxmCajCDAi9Tv2h8Ld2aEbojmCh_BkD0L9nxMZ0glLIvtsiMNpH9-8/w400-h116/cmiyc2023_yaml_error.png" title="To be fair, most of my code has lines too long. Heck, this alt-text is way too long!" width="400" /></a></div><br /><div>Luckily you can easily tune any of the checks that yamllint performs. To hide these errors you can set the max line length to 130 and run yamllint again using the command:</div><div><ul style="text-align: left;"><li>yamllint -d "{extends: default, rules: {line-length: {max: 130}}}" cmiyc_2023_01_street.yaml</li></ul></div><div>This time, the file validated without any warnings. So it looks like the CMIYC challenge file is a valid YAML file. 
That doesn't mean that there isn't anything sneaky in it, but it makes data parsing a much easier task.</div><div><br /></div><div>Next, let's quickly glance at the yaml contents. Opening up the file again, I see that it has 260424 lines. But each user entry has a variable number of fields associated with it. To get a quick idea of how many hashes I'm dealing with I used grep on PasswordHash. I then did a quick grep to see how many users there were by leveraging the fact that each YAML secondary category will start with a " - ".</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjENGlmJbKK2FP8fexUhGBCV-SvpFtR4YOd9ehsfdlYRFZs1sh2TECAs6MvQaQ3S2qfyjd_8vMqGBsFvL3TzLy2bzwgm7FN0lt_6NboNrY3bHR0DLwpGrn-ICPxJSUeCAIzdOMCq0Vxu2gum1Z3zelfjAK0RInL1nDs1ScvPearcNF8v_mxPGkZBDAJNw/s1622/cmiyc2023_hash_count.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing both of my greps returned the same number of expected password hashes" border="0" data-original-height="170" data-original-width="1622" height="43" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjjENGlmJbKK2FP8fexUhGBCV-SvpFtR4YOd9ehsfdlYRFZs1sh2TECAs6MvQaQ3S2qfyjd_8vMqGBsFvL3TzLy2bzwgm7FN0lt_6NboNrY3bHR0DLwpGrn-ICPxJSUeCAIzdOMCq0Vxu2gum1Z3zelfjAK0RInL1nDs1ScvPearcNF8v_mxPGkZBDAJNw/w400-h43/cmiyc2023_hash_count.png" title="I still remember when KoreLogic hid hashes by just pasting them into random files." width="400" /></a></div><div><br /></div>Luckily the two numbers matched, so that means I'm looking to crack at least 29,847 password hashes. It also means that every user probably has one password hash associated with them.<div><br /></div><div>So now that we have the file and have looked around a bit, it seems like it's time to extract the hashes and crack some passwords! 
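<br /><br />As a rough Python equivalent of the two greps above, here is a minimal sketch that counts lines mentioning PasswordHash and lines that open a new list entry. The function name and the sample text are made up for illustration, and the real file's layout may differ from this toy sample.

```python
def count_hashes_and_users(lines):
    """Count PasswordHash fields and list entries in YAML-style text lines."""
    hash_count = 0
    user_count = 0
    for line in lines:
        # Same idea as grepping for PasswordHash.
        if "PasswordHash" in line:
            hash_count += 1
        # A new list entry starts with a dash after optional indentation.
        if line.lstrip().startswith("- "):
            user_count += 1
    return hash_count, user_count


# Tiny made-up sample; every field name except PasswordHash is hypothetical.
sample = [
    "users:",
    "  - Username: alice",
    "    PasswordHash: 'aaaa'",
    "  - Username: bob",
    "    PasswordHash: 'bbbb'",
]
print(count_hashes_and_users(sample))
```

Because the function only iterates over lines, an open file handle for the decrypted challenge file can be passed in directly instead of the sample list.<br /><br />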
My default "Quick and Dirty" approach is to write a short awk script such as the following:</div><div><ul style="text-align: left;"><li>cat cmiyc_2023_01_street.yaml | grep PasswordHash: | awk -F": " '{print "1:"substr($2,2, length($2)-2)}' > greped_password_hashes.txt</li></ul><div>The problem with this approach is that it dumps all the hashes into the same file, doesn't separate them by type, and I lose all the user information and metadata associated with them. The loss of metadata is a real problem since I suspect it will play a very important role in actually cracking hashes for this contest. I'd really like to have a better way to create hash-lists and manage cracking sessions! This leads us to the next section of this writeup!</div><div><h3>Creating a JupyterLab Notebook:</h3></div><div>JupyterLab notebooks are a way to organize and document all the random code you write while analyzing data. The name Jupyter stands for the programming/scripting languages it supports: [Julia, Python, R]. I think a better description of JupyterLab is that it's a stone soup. If you are on your own and doing a task only once, then it doesn't really add a whole lot. You're just drinking muddy water and it's a lot of extra pain to set it up and use it. The thing is, you are very rarely on your own, and almost no task is done only once. Heck, I've probably only written one hello world program from scratch in my life. Every other program I've worked on since then I've copied off previous efforts. The documentation JupyterLab provides makes it easier to remember what you've done and build upon it for future efforts.</div><div><br /></div><div>Long story short, I've never regretted starting a Jupyter Notebook. Somehow that soup is full of delicious ingredients at the end of the day!</div><div><br /></div><div>Installing Jupyter is super easy. 
I primarily use Python (I really need to start moving into R), so all I need to do is install it via pip:</div><div><ul style="text-align: left;"><li>pip install jupyterlab</li></ul><div>To run it locally you can start it up with the following command and then connect to it with a web-browser:</div></div><div><ul style="text-align: left;"><li>jupyter lab</li></ul><div>Wait! Web-browser?! Yup, it runs as a server-side application which makes it very easy for teams to collaborate using it. Enabling remote access requires a few more steps (such as configuring authentication) which I'll leave for you to Google yourself (the documentation to do this is not great). For this tutorial I'm going to stick with local access.</div></div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpM5rWnO6v7hzftHCd_HnIEi3GtzdreJTLFl6sxTgdLG1pIlV-qc1FrRU7lEUePrQbQH94UgVW5TlVrpjupp22CiicBWYIlOp9xEfmV_tId6-XQI5LViO9ql6HSwdcc2ACA_YxzjjPIivv6OczOQZb8UgaQYrOfmn74QKG4gf5KDM_gKQX3GxcYcGDuJU/s2096/cmiyc2023_jupy_start.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Picture of initial Jupyter lauch page" border="0" data-original-height="1392" data-original-width="2096" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpM5rWnO6v7hzftHCd_HnIEi3GtzdreJTLFl6sxTgdLG1pIlV-qc1FrRU7lEUePrQbQH94UgVW5TlVrpjupp22CiicBWYIlOp9xEfmV_tId6-XQI5LViO9ql6HSwdcc2ACA_YxzjjPIivv6OczOQZb8UgaQYrOfmn74QKG4gf5KDM_gKQX3GxcYcGDuJU/w400-h266/cmiyc2023_jupy_start.png" title="Pretty boring so far..." width="400" /></a></div><br /><div>Starting up Jupyter in the directory for this challenge, initially it's pretty boring. It's just a stone sitting in the bottom of an empty cauldron. 
I can see the YAML file and open it up, but even then, by default Jupyter doesn't have a lot of built-in functionality to start carving it up.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5XBrF6y1DLLVA-__WU3ydHzH_dVazRFL-RM0w5GgdxJn74Dsi7YYYswqBRUPJND8Ak-PKwlzAwQsZHCSFo_sN4U7_lgnWUZRIzQLLLOc7bQMQacohRzzYDGWvV0tUibE_dR_-65TZRwb1nCYAdEvLYwgXpIm52oNTL0dd5eV0yDXXOyRrMtr0Z__U2_g/s1697/cmiyc2023_jupy_loadyml.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Viewing the YAML file in Jupyter" border="0" data-original-height="1328" data-original-width="1697" height="313" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj5XBrF6y1DLLVA-__WU3ydHzH_dVazRFL-RM0w5GgdxJn74Dsi7YYYswqBRUPJND8Ak-PKwlzAwQsZHCSFo_sN4U7_lgnWUZRIzQLLLOc7bQMQacohRzzYDGWvV0tUibE_dR_-65TZRwb1nCYAdEvLYwgXpIm52oNTL0dd5eV0yDXXOyRrMtr0Z__U2_g/w400-h313/cmiyc2023_jupy_loadyml.png" title="Internal voice: "I could have done this better in VIM"" width="400" /></a></div><br /><div>Things start getting a *little* more interesting when you go ahead and create a Notebook. A Notebook lets you combine Markdown blocks with code blocks that you can execute. 
Basically it's a wiki where rather than posting code snippets you post programs that you can run, saving their results.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBsQW-f8MVQfweaLQq-Sn3EkaqHe_0IRQMFB1lz4vawjb0QAtXapANvJmo1YX_cp51oQN-kV04nWBVEVKo5zoJeqIZ7WLx4CIAYBoFj2EuTZ3wCFYFppYlgs2kuKbYWpaI8Kq3dL7oa7Gq8T1SH-cJociIOhs3WzXxeFat3QnEXb-QHKrxX4TQKs0CuQc/s1829/cmiyc2023_jupy_first_notebook.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A picture of a workbook with Markdown at the start that states this is for analyzing files for the CMIYC 2023 competition" border="0" data-original-height="722" data-original-width="1829" height="158" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjBsQW-f8MVQfweaLQq-Sn3EkaqHe_0IRQMFB1lz4vawjb0QAtXapANvJmo1YX_cp51oQN-kV04nWBVEVKo5zoJeqIZ7WLx4CIAYBoFj2EuTZ3wCFYFppYlgs2kuKbYWpaI8Kq3dL7oa7Gq8T1SH-cJociIOhs3WzXxeFat3QnEXb-QHKrxX4TQKs0CuQc/w400-h158/cmiyc2023_jupy_first_notebook.png" title="Just a fancy wiki. This is the "water" in our stone soup." width="400" /></a></div><br /><div>Ok, so that's great and all. It's a fancy wiki. But time is ticking and we still haven't extracted those password hashes and started cracking them yet! Let me get off my data-scientist high horse and say, by all means, take a moment to use a messy awk/grep script, create a hashlist, and start your systems running default rules against the faster hashes. But once those GPUs are cracking let's come back to this Jupyter Notebook. The first question that is useful to answer is what types of hashes do we need to crack? 
Now, for this competition the hash types are easy to figure out since KoreLogic posts them on their score-page:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSscc1Z9sPvd9Agnk3tnQ1V6Nkj8ZQQYG3_aYz2GWxl9e-mn_aU-eEmqRfqqwLqGvnRjAx6Eba_F13ZRS6FDbriZeyrQZS089LT56UIaLTswcyz1QlWnsODARhUkOu_FyT9LcHCqPab2uOAku1vbCA8_hkDMs-n3oB2HD1GVHq3CM8eXL2IRDG28gLPEE/s1063/cmiyc2023_hash_types.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Listing of hash types and score value that KoreLogic posted" border="0" data-original-height="903" data-original-width="1063" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiSscc1Z9sPvd9Agnk3tnQ1V6Nkj8ZQQYG3_aYz2GWxl9e-mn_aU-eEmqRfqqwLqGvnRjAx6Eba_F13ZRS6FDbriZeyrQZS089LT56UIaLTswcyz1QlWnsODARhUkOu_FyT9LcHCqPab2uOAku1vbCA8_hkDMs-n3oB2HD1GVHq3CM8eXL2IRDG28gLPEE/w400-h340/cmiyc2023_hash_types.png" title="Thank goodness these hashes look different vs. "raw-MD5" + "raw-MD4" + NTLM + ...." width="400" /></a></div><br /><div>The question is, how many hashes are of each type? One option is you can load them up in your cracker of choice and see which ones get processed. In fact, that's what I did initially, and it "works". But it's still nice to visually see the breakdown as well as understand how many points each category is worth. So let's write a quick Python script!</div><div><br /></div><div>The first thing to do is load in the yaml file. Jupyter Notebooks are based around the concept of cells. Each cell can contain Markdown or code and can be executed independently, and its results persist until it is re-run. I know this is confusing and I apologize, but let me try to explain this with an example. I'm going to put the script that loads the YAML file into Python in its own cell. This is because this operation takes a bit (it's a big challenge file). 
This also brings up one of the huge advantages of using Jupyter and what makes it more than just a "fancy wiki with code". It's that variables are saved once a cell is run. This means I only need to load that data file up once, and I can then access it from code snippets in other cells as I advance my analysis of this file.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVbDTJoUpuJyGCUtkjShJ3Ur6_eq5UrCLIK1bcMgOhKAcC-iB21jiLT3WnVJROUgCPOdiFUO5zG4w4xznZnwPvuk7UXCBKCROvZJlTs-ZQf3JA8AQ0sNWlngfgO4Uj8woah4jPlSmGsFx2rotFwJ2fCzwB735ym0MF9OUQ3jLdV2VCFyoWJYkGChTflbY/s1014/cmiyc2023_jupyter_first_code_cell.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Short code snippet loading the yaml file in python" border="0" data-original-height="413" data-original-width="1014" height="260" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVbDTJoUpuJyGCUtkjShJ3Ur6_eq5UrCLIK1bcMgOhKAcC-iB21jiLT3WnVJROUgCPOdiFUO5zG4w4xznZnwPvuk7UXCBKCROvZJlTs-ZQf3JA8AQ0sNWlngfgO4Uj8woah4jPlSmGsFx2rotFwJ2fCzwB735ym0MF9OUQ3jLdV2VCFyoWJYkGChTflbY/w640-h260/cmiyc2023_jupyter_first_code_cell.png" title="Not much, but saves a lot of time when scripting up other code to look at this data" width="640" /></a></div><br /><div>Cell layout, breaking up your code, running cells in the correct order. These are all issues you'll encounter as you use Jupyter Notebooks more often. But the key here is I don't want to write my entire analysis program at one time. I don't know what I'll encounter during this challenge. But Jupyter saves my execution environment so time intensive tasks like this only need to be run once (or until the underlying data changes). 
To demonstrate this, let's access that data and try to figure out the breakdown of hash types in a different cell.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaXWrX6Efr_534Brm42hPsoVXyOM8Gf5Z2W7YZpfxHBYZKhbZMsZjCTYa5oH8t4SB04Y32pjQx47PRIW0xmzEdhU3PWOJHGFCo45zZ1-r42WVATz8GwbbI30eL7mx4iguhawOG2l_htJWaNZ5Wb43Dp-IViTcWVXj6_sEx4kvHMZQtzDYB2g9wPKCPoDk/s1609/cmiyc2023_jupyter_hashlist.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Python code extracting each hash_type from the contest" border="0" data-original-height="1609" data-original-width="1287" height="640" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjaXWrX6Efr_534Brm42hPsoVXyOM8Gf5Z2W7YZpfxHBYZKhbZMsZjCTYa5oH8t4SB04Y32pjQx47PRIW0xmzEdhU3PWOJHGFCo45zZ1-r42WVATz8GwbbI30eL7mx4iguhawOG2l_htJWaNZ5Wb43Dp-IViTcWVXj6_sEx4kvHMZQtzDYB2g9wPKCPoDk/w512-h640/cmiyc2023_jupyter_hashlist.png" title="There's better ways to write this code, but you'll notice 90% of it is copy/paste so it was easy to write." width="512" /></a></div><br /><div>Now I have a count for each hash type, and I can see the hashes are fairly evenly distributed across the types. The total points per hash type are not. One bcrypt hash is worth roughly 16 million times more points than a raw_md5 hash. This highlights that the key to this contest is to find patterns by cracking fast hashes, and then to focus on cracking the slower high-value hashes. 
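<br /><br />The tally described above can be sketched roughly as follows. This is not the notebook's code (that lives in the screenshot); the prefix and length checks are common conventions for these formats rather than anything contest specific, and the point values are placeholders, not KoreLogic's real scoring.

```python
from collections import Counter


def guess_hash_type(hash_value):
    """Guess a hash type from its prefix or hex length (heuristic only)."""
    if hash_value.startswith("$2"):
        return "bcrypt"
    if hash_value.startswith("$5$"):
        return "sha256crypt"
    # Unsalted hex digests are told apart by length alone, so a 32 character
    # value could just as well be raw-MD4 or NTLM in another data set.
    by_length = {32: "raw_md5", 40: "raw_sha1", 64: "raw_sha256"}
    return by_length.get(len(hash_value), "unknown")


def tally(hashes, points_per_hash):
    """Return per-type counts and per-type total points."""
    counts = Counter(guess_hash_type(h) for h in hashes)
    totals = {t: n * points_per_hash.get(t, 0) for t, n in counts.items()}
    return counts, totals


# Placeholder hashes and point values, purely for illustration.
sample_hashes = ["a" * 32, "b" * 32, "c" * 40, "$2a$10$" + "d" * 53]
placeholder_points = {"raw_md5": 1, "raw_sha1": 2, "bcrypt": 1000}
counts, totals = tally(sample_hashes, placeholder_points)
print(counts)
print(totals)
```

Even with made-up numbers, the single bcrypt entry dominates the totals, which is the same shape as the pie chart in this post.<br /><br />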
Aka the fast hashes on their own are basically worthless from a point perspective, but cracking them can allow better targeting of high-value hashes.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK5wPKLhYKVsumzTVaRehcz_lAVv-pI0KZX5SqFq91eMjCWgd0ufjq6vKl-8ASDn8WIREj4nL0_ed7_hYyQWNQ4ZT9EUmTsWE9NsJpI7rIlxReHTmKS0OgfTMw1b82R2TUIXgAjxuYotAkTlw3nuyY7sLAkGSyDeFiyVBPhgWRVRm4rfkxaOVFTHDWic0/s1505/cmiyc2023_jupyter_point_value.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Graphing the total points possible for each hash type as a pie chart" border="0" data-original-height="1363" data-original-width="1505" height="363" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgK5wPKLhYKVsumzTVaRehcz_lAVv-pI0KZX5SqFq91eMjCWgd0ufjq6vKl-8ASDn8WIREj4nL0_ed7_hYyQWNQ4ZT9EUmTsWE9NsJpI7rIlxReHTmKS0OgfTMw1b82R2TUIXgAjxuYotAkTlw3nuyY7sLAkGSyDeFiyVBPhgWRVRm4rfkxaOVFTHDWic0/w400-h363/cmiyc2023_jupyter_point_value.png" title="Oh lawd, bcrypt commin" width="400" /></a></div><br /><div>Side note: The top street team (Hashmob Users) cracked 500 bcrypt hashes. Almost twice as much as the next nearest player/team. But only around 15% of the total possible bcrypts.</div><div><br /></div><div>The next step is to make better hash_lists so we can actually start cracking effectively. 
As I mentioned earlier, I already parsed out the data so all I need to do is to create a new cell that saves the contents to disk.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYHwTtikEnsbl6iUdj3mRc4kyO-eE3Tu36JY3TAGbV8Bt7fS9l-RjP3luzDxnvpFcx1L_n3e40IZBoklzis5bLdrTRdHwnVJmW0NRRyfe3H-S8cdpcVpx6ifdPyRMUX0bxWStkfk-nwFR0xJsLrR_XP2RKfGGy-hFpW_m4aj9b_ARFO9smzlWGMPNEVqk/s885/cmiyc2023_jupyter_save_hashes.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Saving hashes to disk for cracking" border="0" data-original-height="297" data-original-width="885" height="134" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjYHwTtikEnsbl6iUdj3mRc4kyO-eE3Tu36JY3TAGbV8Bt7fS9l-RjP3luzDxnvpFcx1L_n3e40IZBoklzis5bLdrTRdHwnVJmW0NRRyfe3H-S8cdpcVpx6ifdPyRMUX0bxWStkfk-nwFR0xJsLrR_XP2RKfGGy-hFpW_m4aj9b_ARFO9smzlWGMPNEVqk/w400-h134/cmiyc2023_jupyter_save_hashes.png" title="Easy Peesy Lemon Squeezy" width="400" /></a></div><div class="separator" style="clear: both; text-align: left;"><br /></div><div class="separator" style="clear: both; text-align: left;">At this point, I want to stress, it's really time to stop messing with this notebook and start focusing on cracking some hashes. 
Let's take a break and run some default cracking sessions against the raw hashes (md5, sha1, and sha256).</div><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-tObFaqRJYHMk7k8cu_tLwit_pWnyG018HSuQVG6faBqaHDXPYFlX60oFcmbYVihOJuRVD4K99ebba_PkzkCOp-nkGIftOX12xp2CTz41ozEiJIMcyfCbWmQ8rtsPtmA-VjuJMAHSfrUrHUGyZsgI5VVAyHUu_HS_9r1iSJnPbMJBBjXIzcVfir-Cz14/s1456/data_scientist_hacking2.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Midjourney generated image of data scientist hacking" border="0" data-original-height="832" data-original-width="1456" height="229" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj-tObFaqRJYHMk7k8cu_tLwit_pWnyG018HSuQVG6faBqaHDXPYFlX60oFcmbYVihOJuRVD4K99ebba_PkzkCOp-nkGIftOX12xp2CTz41ozEiJIMcyfCbWmQ8rtsPtmA-VjuJMAHSfrUrHUGyZsgI5VVAyHUu_HS_9r1iSJnPbMJBBjXIzcVfir-Cz14/w400-h229/data_scientist_hacking2.png" title="This AI generated image makes me laugh since the hacker isn't doing anything on their computer. Which is factually accurate!" width="400" /></a></div><br /><div>For the cracking sessions I simply ran the default John the Ripper attack (default dictionary, rules, and incremental mode) for a couple minutes each on my laptop. Aka:</div><div><ul style="text-align: left;"><li>john --format=raw-sha256 raw_sha256_hashlist.txt</li></ul><div>Unsurprisingly this was not very effective, cracking a total of 437 passwords across the three raw hash-types. This is where the CMIYC contest really starts. Next step usually is to start running more complicated attacks, look at the cracks to identify base words and mangling rules, and build upon that. And if I was really competing in this competition that's exactly what I'd do. 
But as those attacks are running let's go back to JupyterLab and see if we can optimize how we're analyzing those cracks.</div></div><div><br /></div><div>To analyze the cracks we need to see which passwords we cracked. John the Ripper has a great option called "--show" which allows you to give it a hash-list and it'll output all of the hashes it has cracked. Side note, if you run "--show=left" it instead outputs all the hashes it *hasn't* cracked.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkescjtS69Ui5KQ5PZ7e732IRinR_jpQ91s-sauJNlnLWFYZQPJdQbwt4JVA2yDTKAAzE4Q-Hq4swLFuzw1wp6C2hVRCFzk9l71kSOo11KyRucbV_HI8TKJ9LQzeQ8y5PlStSe3LsjSQI1Mbuei-mLZcxRc7vVvc4ykQImW7Y5L8bD5GycQGXJmsz9cMs/s1749/cmiyc2023_jtr_show_left_working.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing the cracked sha1 password hashes" border="0" data-original-height="476" data-original-width="1749" height="109" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhkescjtS69Ui5KQ5PZ7e732IRinR_jpQ91s-sauJNlnLWFYZQPJdQbwt4JVA2yDTKAAzE4Q-Hq4swLFuzw1wp6C2hVRCFzk9l71kSOo11KyRucbV_HI8TKJ9LQzeQ8y5PlStSe3LsjSQI1Mbuei-mLZcxRc7vVvc4ykQImW7Y5L8bD5GycQGXJmsz9cMs/w400-h109/cmiyc2023_jtr_show_left_working.png" title="JtR associates the plains with the usernames which is super helpful" width="400" /></a></div><br /><div>But wait, it looks like I forgot something when I created my hash-lists since JtR's show option does not work with my raw-MD5 and raw-SHA256 hashes....</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXI0Z7bSytw3we_d5sVPY1uidpP_8fFKKR1UDRG1nVST5jaaUnefp6CeTBvwu0qZJuVOKTLsvp-c00OtliF4ltqqlUCId023S0soKvVUSUAAWDGlFyUPiTsierXwE1D68U5Xzh5uSQKuo6wxYU6FJUMxXpgsY6mg3NeLQP3gwRYe2bS9QqBTfvhUEsEX4/s1782/cmiyc2023_jtr_show_left_broken.png" style="margin-left: 1em; margin-right: 
1em;"><img alt="No cracked passwords are being shown for the raw-sha256 hashes" border="0" data-original-height="67" data-original-width="1782" height="15" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhXI0Z7bSytw3we_d5sVPY1uidpP_8fFKKR1UDRG1nVST5jaaUnefp6CeTBvwu0qZJuVOKTLsvp-c00OtliF4ltqqlUCId023S0soKvVUSUAAWDGlFyUPiTsierXwE1D68U5Xzh5uSQKuo6wxYU6FJUMxXpgsY6mg3NeLQP3gwRYe2bS9QqBTfvhUEsEX4/w400-h15/cmiyc2023_jtr_show_left_broken.png" title="I know I cracked some of these....." width="400" /></a></div><br /><div>The problem is in how I created those hash-lists since JtR's show option isn't that smart. It needs to be told explicitly what the hash-type is and those hashes are ambiguous. Now I could specify the hash-type on the command line, but for future analysis I want to look at cracked hashes across multiple hash-types so it's easier to recreate the hash-lists with the proper hash designator, such as "$dynamic_0$" for raw-md5, included in them.</div><div><br /></div><div>This is actually good since it provides a learning example on how you can update your code in Jupyter (That's how I'm going to spin this oversight anyways ;p). 
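<br /><br />As a sketch of the designator idea just described, the helper below prefixes a bare hash with a format tag before it goes into a hash-list. The function names and the example username are hypothetical; "$dynamic_0$" for raw-md5 is the only tag taken from this post, so any other tag you pass in should be checked against your own John the Ripper build.

```python
def tag_hash(raw_hash, format_tag):
    """Prefix a bare hash with a format tag unless it is already tagged."""
    if raw_hash.startswith("$"):
        # Already carries a designator (or is a crypt-style hash), leave it.
        return raw_hash
    return format_tag + raw_hash


def to_john_line(username, raw_hash, format_tag):
    """Build a username:hash line for a hash-list."""
    return username + ":" + tag_hash(raw_hash, format_tag)


# Hypothetical user and a 32 character hex value standing in for a raw-md5 hash.
example = to_john_line("user1", "5f4dcc3b5aa765d61d8327deb882cf99", "$dynamic_0$")
print(example)
```

Keeping the tagging in one small function means the parsing cell can be re-run safely, since a hash that already has a designator is passed through unchanged.<br /><br />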
First let's modify the parsing code to add the required fields to the hashes.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJinpDOtQdroKeLs2lEWYc_w7ojkhsRlqseySVfMknPd6kC45o3H4CSduwZwgWTsz_uHTLuAKUqGY34OEzNNRIqNU0gPzTtI5hxDeAVXDTrAx130rCNmnOLW_Jh0vbYdOm8Y6H0jXmnsSwP_ERHTsefQUrfh7MExXLhbVmOTAb--st3APKHOkr5NOSGFs/s1461/cmiyc2023_jupyter_update_hash.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing me adding the hash type to the hash parsing code" border="0" data-original-height="408" data-original-width="1461" height="111" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhJinpDOtQdroKeLs2lEWYc_w7ojkhsRlqseySVfMknPd6kC45o3H4CSduwZwgWTsz_uHTLuAKUqGY34OEzNNRIqNU0gPzTtI5hxDeAVXDTrAx130rCNmnOLW_Jh0vbYdOm8Y6H0jXmnsSwP_ERHTsefQUrfh7MExXLhbVmOTAb--st3APKHOkr5NOSGFs/w400-h111/cmiyc2023_jupyter_update_hash.png" title="For the full code, look above to the second python code block I created for the jupyter notebook" width="400" /></a></div><br /><div>I then re-ran this cell, and after that, re-ran the cell that wrote the hashes to disk for cracking. Once I did this the john --show option worked great for the raw-md5 and raw-sha256 hash-types.</div><div><br /></div><div>One key thing I want to highlight is you can re-run each cell independently of the others, so if you want to make a quick modification you can without having to run the whole notebook again. What you need to keep in mind though is all the variables are global so the order you run your cells in is very important. Aka running one cell can change the variables that other cells use. So this is a very powerful feature of Jupyter notebooks allowing you to quickly tweak your code. But it's also a very dangerous feature so you need to be a bit careful when using it. 
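<br /><br />To make that danger concrete, here is a small simulation in plain Python. Each function stands in for one notebook cell and a shared dictionary plays the role of the notebook's global variables; all of the names are invented for this illustration and none of it is Jupyter API.

```python
# Shared dictionary standing in for the notebook's global state.
notebook_state = {"hashes": []}


def cell_parse():
    """Stand-in for the cell that parses the raw hashes."""
    notebook_state["hashes"] = ["aaa", "bbb"]


def cell_tag():
    """Stand-in for a cell that rewrites the shared list."""
    notebook_state["hashes"] = [
        "$dynamic_0$" + h for h in notebook_state["hashes"]
    ]


cell_parse()
cell_tag()
cell_tag()  # re-running only this cell tags every hash a second time
print(notebook_state["hashes"][0])

cell_parse()  # re-running from the top resets the shared state
cell_tag()
print(notebook_state["hashes"][0])
```

The first print shows a doubled designator because the tagging cell ran twice against the same state; the second shows the clean value after starting again from the parsing cell.<br /><br />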
If at any point things get wonky you can instead re-run all your cells in the notebook from scratch to reset the global state of things. Ok, enough harping about that, but spoiler alert, I'm going to be tweaking my code a lot as things progress. </div><div><br /></div><div>While I can perform analysis of the cracked passwords, what I'm really interested in is how the metadata associated with each account matches up to the cracks. How about we do a quick analysis of the variation of the company and department metadata?</div><div><br /></div><div>Looking at the company info ... it looks like the companies themselves are pretty random.</div><div><br /></div><div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW3UE7dJoWMbZnTS1IGCtnQAJ-TDE5lfj3vb8in15qtJ_AvHBLUikBhKOtWyX62wDTIqiiKnI_P334HZtmPxhk9hAHXN8XxdV93Dytq8L7mVCzJr25ge2qYkHlsGXucrS92lIs4PeU0RDWtzxQ0a3d2AWvfzF0rr7hYSxZiEYffmGZc-lbmeu1GiwDX8A/s1549/cmiyc2023_jupyter_company_info.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A pie graph of the companies. Most companies only have a few users except for two" border="0" data-original-height="1549" data-original-width="1055" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiW3UE7dJoWMbZnTS1IGCtnQAJ-TDE5lfj3vb8in15qtJ_AvHBLUikBhKOtWyX62wDTIqiiKnI_P334HZtmPxhk9hAHXN8XxdV93Dytq8L7mVCzJr25ge2qYkHlsGXucrS92lIs4PeU0RDWtzxQ0a3d2AWvfzF0rr7hYSxZiEYffmGZc-lbmeu1GiwDX8A/w273-h400/cmiyc2023_jupyter_company_info.png" title="Guess I'm going to be using company names as dictionaries and not categories..." width="273" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div>Two companies stand out though. 
Let's dig into this and look for companies that have over a hundred users:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAXMkGvmO5nkYhNuNySGbKww7PpnNtGrOK6Rc2aAsqGB_xZbNBYEojrRb566ZjRKiJpjjeAKsnN-KCLXphOoNIRD47Bdq_jI4fxOI_x9ytBE867XPt78jbxGwl8YdIATGDKQZlH9tQj4smKsn0fjXwBE2b2U6jQRyLVN1Wz3M-RPa8JODiyG4AuDvK6G0/s1153/cmiyc2023_jupyter_large_company_info.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A pie graph showing ghosting and Dandy both have around 600 users each" border="0" data-original-height="787" data-original-width="1153" height="272" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhAXMkGvmO5nkYhNuNySGbKww7PpnNtGrOK6Rc2aAsqGB_xZbNBYEojrRb566ZjRKiJpjjeAKsnN-KCLXphOoNIRD47Bdq_jI4fxOI_x9ytBE867XPt78jbxGwl8YdIATGDKQZlH9tQj4smKsn0fjXwBE2b2U6jQRyLVN1Wz3M-RPa8JODiyG4AuDvK6G0/w400-h272/cmiyc2023_jupyter_large_company_info.png" title="Oh hey, this might be useful info later..." width="400" /></a></div><div><br /></div>So of all the companies, GHosting and Dandy might be ones to dig into more later.<br /><div><br />Now let's do the same for departments:</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjak-_fKKz2ZWxLfY4QLt7FjqQQGpyPZr8JvKkjo_JgWXRIxLCOeaDCfZX5KlFmrRzBbxLuDpHLN-mHaQ1xCpGRyI5SnkeI9dGxrfYdtIzj0raXF3m0etdxquVJnYMm8ww726zNlPzOCuxS3AJXomLDxVwMxQ020VNHd9fLnOkoJRaiE_-w5D1MGd265U4/s1561/cmiyc2023_jupyter_department_info.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A pie chart showing the department breakdown. 
Engineering is the largest department" border="0" data-original-height="1561" data-original-width="1191" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjak-_fKKz2ZWxLfY4QLt7FjqQQGpyPZr8JvKkjo_JgWXRIxLCOeaDCfZX5KlFmrRzBbxLuDpHLN-mHaQ1xCpGRyI5SnkeI9dGxrfYdtIzj0raXF3m0etdxquVJnYMm8ww726zNlPzOCuxS3AJXomLDxVwMxQ020VNHd9fLnOkoJRaiE_-w5D1MGd265U4/w305-h400/cmiyc2023_jupyter_department_info.png" title="Oh hey, this might prove useful too..." width="305" /></a></div><br /><div><br /></div><div>Now this is something I can work with! Next step, let's see how our cracks break down by department. To do this, we need to import our cracks back into JupyterLab.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSFg7vF3a2Ot8QE7VX_QHwmZgrFcEc02aC1TgXBbrAA4rB7cewGOx4a66ahc5xZfr1PMf40LXuyNHVET7NSFv-wiGWNrbDPWKupgpqE7AH-rVEBOxBIHl2mmsI0J8z7yQGo8N2XuWi6IyxP9a3uc0gWV3j20dtuzTNxQ5T7Ocz_FSyfejStCZrODRvTic/s1458/cmiyc2023_jupyter_read_in_plains.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Reading in the plains by creating a file with jtr --show and then bringing them into Jupyter" border="0" data-original-height="805" data-original-width="1458" height="221" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgSFg7vF3a2Ot8QE7VX_QHwmZgrFcEc02aC1TgXBbrAA4rB7cewGOx4a66ahc5xZfr1PMf40LXuyNHVET7NSFv-wiGWNrbDPWKupgpqE7AH-rVEBOxBIHl2mmsI0J8z7yQGo8N2XuWi6IyxP9a3uc0gWV3j20dtuzTNxQ5T7Ocz_FSyfejStCZrODRvTic/w400-h221/cmiyc2023_jupyter_read_in_plains.png" title="Note: I'll need to expand out my files as I crack more hashes. I could have had one file with all the users/hashes as well... Lots of optimizations but this is quick. That's the key. Don't get hung up on the best way to do things, just get them done and iterate quickly." 
width="400" /></a></div><br /><div>You'll notice in the second cell I also realized I needed a quick lookup based on username so I added that in as well. Now that we have the plains we can start printing out cracks based on Department.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFicxpotGUU5iJcra-MbLAx5AD3MQ7BYAA1cH0kOcsak9ifkO_cqF4JsuCDYh4gugduW7JnSMzJ12poPKUu-tTPuH_FewzuuJFuZ0404o3w1AKMx3jYFKpaZpXm_whuYfyMLj0SwhW8tUaZskVh1nnN48SqlKmTzA8SO6upd_3-KkimPbjWH71RAm8aTs/s1230/cmiyc2023_jupyter_cracked_by_department.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing cracks by department to identify patterns" border="0" data-original-height="1230" data-original-width="1051" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhFicxpotGUU5iJcra-MbLAx5AD3MQ7BYAA1cH0kOcsak9ifkO_cqF4JsuCDYh4gugduW7JnSMzJ12poPKUu-tTPuH_FewzuuJFuZ0404o3w1AKMx3jYFKpaZpXm_whuYfyMLj0SwhW8tUaZskVh1nnN48SqlKmTzA8SO6upd_3-KkimPbjWH71RAm8aTs/w341-h400/cmiyc2023_jupyter_cracked_by_department.png" title="This is something we can dig into" width="341" /></a></div><br /><div>There's more cracked passwords of course that I'm not showing since the picture would be too large, but the following password makes me feel personally attacked...</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVuRYywDjLSXr6CMc-0iEmGvyeiZdTdFGUA3OkZQhT0XdyjERQb2mkEWPgW-8I74FWTqasj6nxv_8b_g40zTgA9b-7tGaj4cl0VkJX4tCsm4iG0WYXzOdQff6D6WpiZMqTHjo0mEQ_1_1wz1Q25-TiFH_nIm_akgxYrjSXuar_IjjaxS7jSWpNjNTxlpo/s591/cmiyc2023_laki_marketing.png" style="margin-left: 1em; margin-right: 1em;"><img alt="A user in the marketing department picked the password laki1234" border="0" data-original-height="399" data-original-width="591" height="216" 
src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjVuRYywDjLSXr6CMc-0iEmGvyeiZdTdFGUA3OkZQhT0XdyjERQb2mkEWPgW-8I74FWTqasj6nxv_8b_g40zTgA9b-7tGaj4cl0VkJX4tCsm4iG0WYXzOdQff6D6WpiZMqTHjo0mEQ_1_1wz1Q25-TiFH_nIm_akgxYrjSXuar_IjjaxS7jSWpNjNTxlpo/w320-h216/cmiyc2023_laki_marketing.png" title="At least make someone in HR have my username be their password! (FYI my wife is in HR)" width="320" /></a></div><br /><div>I kid as I know it wasn't intentional :) One thing that I should mention though is these lists can be easily updated as I crack more passwords. All I need to do is re-run these cells in the Notebook. Looking at the plains, you can start to see that certain mangling rules and dictionaries start to pop out as areas of future exploration. </div><div><br /></div><div>Backing up though, there probably is some other metadata that I could use to further refine attacks on these lists. In Team Hashcat's excellent writeup (<a href="https://github.com/hashcat/team-hashcat/tree/main/CMIYC2023">available here</a>) they talked about an enhancement to their team collaboration tools called "Metaf****r" that displayed all the metadata next to the plaintexts. Can we replicate this in JupyterLab? 
Absolutely!!</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPu9Giy-yQjkXqUEp1unCoHo2cQoBnjSfM2Y_cAAj4NA00VW5L-sgTwl0qabgbB9vlUVGIui6JwxHstHirNmmwyD_QrKZkQaagsvRfyLeStSv60YbLrA6aytPpHCvERW6HvrR_HqMZim2FMAG1knXmbUJZLaWic3fJcQuQXDqHCKsbEKAWYIDp4wXkCCU/s2021/cmiyc2023_jupyter_meta.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Showing all of the metadata next to cracked passwords" border="0" data-original-height="1669" data-original-width="2021" height="330" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiPu9Giy-yQjkXqUEp1unCoHo2cQoBnjSfM2Y_cAAj4NA00VW5L-sgTwl0qabgbB9vlUVGIui6JwxHstHirNmmwyD_QrKZkQaagsvRfyLeStSv60YbLrA6aytPpHCvERW6HvrR_HqMZim2FMAG1knXmbUJZLaWic3fJcQuQXDqHCKsbEKAWYIDp4wXkCCU/w400-h330/cmiyc2023_jupyter_meta.png" title="Alt title: How to code this up if you don't have a without having a Xanadrel on your team ;p" width="400" /></a></div><br /><div>I'm going to end this post here as I think this starts to show the value of JupyterLab Notebooks. I know, I didn't really crack that many hashes! For my next post I'll leverage this Notebook to actually create wordlists + mangling rules to target hashes and start to show the real value of data analysis when performing password cracking attacks.</div><p></p></div>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-8069772012802892082022-08-21T22:16:00.002-07:002022-08-21T22:16:43.060-07:00More Password Cracking Tips: A Defcon 2022 Crack Me If You Can Roundup<p><b></b></p><blockquote style="text-align: left;"><b> “We do not learn from experience... 
we learn from reflecting on experience.”</b> </blockquote><blockquote><p style="text-align: left;"><b>-- John Dewey</b></p></blockquote><p><b><u><span style="font-size: medium;">Introduction:</span></u></b></p><p>KoreLogic's <a href="https://contest.korelogic.com/">Crack Me if You Can (CMIYC)</a> is one of the oldest and most established password cracking competitions. Held every year at Defcon, it serves as a great way to pull together password enthusiasts from all over the world and provides a shared use-case that drives password cracking tool development throughout the rest of the year.</p><p>This year I competed as a street team and managed to finish in 12th place:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgSO7T9dTSrARn8enOjEprz8o3nqC6Keg4MgcZFPDbcEc6AODvFQch_VcmQfhcjd5kAslyV-wDlUaFybbiJgzLRKJu1RcJ8qyt3NvbxPCSdpdE6ohOi520WCsIgNIRXssCJvlZKPogjpblsCn2rPNTeoOzIRWxrIG_0UN_I1YwmgxB6Z5hxUYI5OPy/s914/cmiyc_2022_score.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="914" data-original-width="628" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhgSO7T9dTSrARn8enOjEprz8o3nqC6Keg4MgcZFPDbcEc6AODvFQch_VcmQfhcjd5kAslyV-wDlUaFybbiJgzLRKJu1RcJ8qyt3NvbxPCSdpdE6ohOi520WCsIgNIRXssCJvlZKPogjpblsCn2rPNTeoOzIRWxrIG_0UN_I1YwmgxB6Z5hxUYI5OPy/w275-h400/cmiyc_2022_score.png" width="275" /></a></div><br /><div class="separator" style="clear: both; text-align: center;"><br /></div>Now that I've had a week to look back on things, there certainly are areas where I could have done better. The first is with my cracking setup. I had two systems I used. My primary cracking system was still my laptop running an Ubuntu VM utilizing WSL on a Windows 11 install. 
My secondary system was the computer I described setting up in <a href="https://reusablesec.blogspot.com/2018/09/configuring-password-cracking-computer.html">this blog post</a>.<br /><div class="separator" style="clear: both; text-align: center;"><br /></div><b>Primary Laptop:</b><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div>CPU: i7-8640U CPU</div><div>RAM: 16 GB</div><div>Storage: 500GB SSD</div></blockquote><div> <div><b>Desktop Computer:</b></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><div><div style="text-align: left;">CPU: Intel i5-7600k, 1 processor; 4 cores</div></div><div><div style="text-align: left;">RAM: 16GB</div></div><div><div style="text-align: left;">Storage: 500GB SSD</div></div><div><div style="text-align: left;">GPU: GeForce GTX 1070</div></div></blockquote><div><p>I really didn't do a good job of splitting my work between both these systems and making sure that my limited GPU was always working. For example, I had a bad habit of running JtR sessions on my desktop computer. Long story short, one week later I have a lot of ideas for future projects to improve my cracking skills, and I'm super excited to start working on them, which is the real benefit of competing in contests like this. Rather than go through a blow for blow recount of the contest, I'll instead try to highlight a couple of tips and lessons I learned along the way.</p><p style="text-align: left;"><b><u><span style="font-size: medium;">Core Contest Techniques:</span></u></b></p><p>Before diving into this write-up, I HIGHLY recommend reading my previous write-up for the CrackTheCon contest which is <a href="https://reusablesec.blogspot.com/2022/05/password-cracking-tips-crackthecon.html">available here</a>.</p><p>I'm going to skip most of the techniques covered there, but I will say they all applied to the KoreLogic contest as well. 
It really surprised me how much I referred back to that article when I was competing in this contest.</p><p><b><u><span style="font-size: medium;">Contest Overview:</span></u></b></p><p>At a high level the contest consisted of cracking a variety of encrypted files, each of which contained individual hashes to crack. For the street teams, the passwords to crack the encrypted files were fairly simple, so the real challenge there was getting your tooling set up properly to handle those files. </p><p>Once the encrypted files were cracked, the unencrypted files could be opened up to reveal a set of very quick to compute hashes. As someone who doesn't have a lot of compute resources to throw at the problem, I really appreciated the fact that the hashes were so fast! Cracking these hashes was all about trying to figure out the base words used to construct them, as well as the mangling rules that were applied. One thing I will say is that the selection of mangling rules KoreLogic picked made "loopback" style attacks significantly less effective than in the CrackTheCon contest. Don't get me wrong, loopback attacks were still very powerful! But as a player I really needed to analyze the passwords and figure out the underlying mangling rules vs. using loopback as a crutch.</p><p>Long story short, I thought that KoreLogic outdid themselves when it came to creating a fun challenge. I thought the contest had a good difficulty scaling to make it approachable to a wide variety of players while still providing areas of growth and frustration to more experienced players.</p><p><span style="font-size: medium;"><b><u>Tip #1: Make use of John the Ripper *2John utilities to crack encrypted files</u></b></span></p><p>Password cracking programs don't need to use the entire encrypted file. Just think about it: would you really want to have your cracking program parse a 100 GiB file every time it makes a guess? What cracking programs really need is a "hash" to make a guess against. 
To extract that "hash", and to save it in a format that password cracking programs can utilize, John the Ripper comes with a large selection of helper programs in the /john/run/ directory which are identifiable by the '2john' suffix. You can see this below:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrx3pgPexw6OTPNoJE3PZB956nhte5d_auUDnCMYPZNVPi6qFhVKXpyTmozAzqSIAW3-EfD6WO13lFNp-Or9PU454gL2sTCwt0cp0vQRHFUkDxF5knZbpFRI-CxqHgI3P-f6BDJfHwpY6FYFrk167oIMSYM9X8tsgCw2YCWBjHBweVkYBsp1pXvchw/s993/2john_examples.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="843" data-original-width="993" height="340" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrx3pgPexw6OTPNoJE3PZB956nhte5d_auUDnCMYPZNVPi6qFhVKXpyTmozAzqSIAW3-EfD6WO13lFNp-Or9PU454gL2sTCwt0cp0vQRHFUkDxF5knZbpFRI-CxqHgI3P-f6BDJfHwpY6FYFrk167oIMSYM9X8tsgCw2YCWBjHBweVkYBsp1pXvchw/w400-h340/2john_examples.png" width="400" /></a></div><br /><p>The main challenge is to figure out which helper program you want to use. 
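</p><p>One way to keep track of which helper goes with which file type is a small lookup table. The sketch below is only an illustration: the HELPERS table and the helper_for function are hypothetical, and the names are given without a suffix because the actual scripts in JtR's run directory may end in .pl or .py (or nothing at all) depending on the version you built.</p>

```python
from pathlib import Path

# File extension -> 2john helper name (suffix such as .pl/.py left off on purpose).
HELPERS = {
    ".pdf": "pdf2john",
    ".zip": "zip2john",
    ".7z": "7z2john",
    ".rar": "rar2john",
}

def helper_for(filename):
    """Return the 2john helper name for a file, or None if it is not in the table."""
    return HELPERS.get(Path(filename).suffix.lower())

print(helper_for("list23-ThisYearsWorst.pdf"))
```

<p>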
For example, here is me running pdf2john to extract the password hash from the list23-ThisYearsWorst.pdf challenge:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrxuYeLy68AVeCHh-6dT6CLK9U58O9HDi-coP8ywL9jVoyE1vL65L0GzhWEWeSys0sqNds8aJ3j77t2olimmE86rS3m6mx_q5HQ2dXuyrHwDCZSyOR8OKI0Xb8KRrCkyg-qCL0G2-QL_6VATP6SqD9TTkaRbS6N7Un4Udrf37MMlGCNaYgXbyxJNU-/s1054/pdf2john_example.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="123" data-original-width="1054" height="74" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjrxuYeLy68AVeCHh-6dT6CLK9U58O9HDi-coP8ywL9jVoyE1vL65L0GzhWEWeSys0sqNds8aJ3j77t2olimmE86rS3m6mx_q5HQ2dXuyrHwDCZSyOR8OKI0Xb8KRrCkyg-qCL0G2-QL_6VATP6SqD9TTkaRbS6N7Un4Udrf37MMlGCNaYgXbyxJNU-/w640-h74/pdf2john_example.png" width="640" /></a></div><br /><p>Rather than having it print out to your screen, I'd recommend piping the output of this into a file which you would then load in as the target hash file for your cracking program. One important thing: If you are cracking multiple encrypted files at once, you can store all of these hashes in the same file, just like with any other John the Ripper hash format. Many of these hashes are also supported by Hashcat too, so once you extract them using the 2john helper utilities, don't feel like you have to stick to using John the Ripper to crack the hashes.</p><p><b style="font-size: large;"><u>Tip #2: Make sure you compile John the Ripper with all the optional libraries to enable cracking encrypted files.</u></b></p><p>One downside about the flexibility that John the Ripper provides by being able to compile and run it on just about anything, is that it will gladly compile without certain features and cracking modes being enabled if you don't have the correct libraries present when building it. 
This can be very hard to diagnose after the fact, since the only symptom tends to be an error along the lines of "For some reason JtR doesn't seem to recognize a particular hash type".</p><p>This happened to me in the previous CrackTheCon contest where I couldn't get John the Ripper to crack an encrypted Zip file. Luckily for this contest I realized what was going on and was able to fix it, but I really need to update my JtR install instructions <a href="https://reusablesec.blogspot.com/2018/09/configuring-password-cracking-computer.html">here</a> with the new information. </p><div>That being said, here are all the additional libraries I needed to have before running './configure' to build John the Ripper (with Ubuntu 18) to enable support for cracking encrypted files used during the CMIYC contest:</div><div><br /></div></div></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div><div><div><div style="text-align: left;"><b>sudo apt-get install libz-dev bzip2 yasm libgmp-dev libpcap-dev libnss3-dev libkrb5-dev pkg-config libbz2-dev zlib1g-dev libcompress-raw-lzma-perl</b></div></div></div></div></blockquote><div><div><br /></div></div><div>Most of these libraries were specifically for cracking the 7zip and zip files. The perl library was for being able to successfully run 7z2john.pl.</div><div><br /></div><div>What this process really highlighted for me is that I really should create an Ansible Playbook to configure a system to run John the Ripper. Going back through all my write-ups in the past to figure out the different dependencies is no fun, and causes a lot of problems when I accidentally miss one of them. 
Unless I get distracted, watch this space as I'll probably end up posting about that Ansible playbook here, and posting it to github.</div><div><br /></div><div><b style="font-size: large;"><u>Tip #3: Save your John the Ripper rules in an external file</u></b></div><div><br /></div><div>Let's face it, John the Ripper's default config file has grown way too large and unwieldy to effectively edit during a password cracking competition. Instead, I highly recommend including your custom rules in an external file to make it easier to quickly find the rules you want to edit or modify. Another advantage of this approach is if you upgrade your copy of John the Ripper, and the config file changes, your old rules will still be saved.</div><div><br /></div><div>The first step to do this is to include a link in your john.conf file to your custom .conf file by inserting the line:</div><div><br /></div><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><div style="text-align: left;"><b>.include <FILENAME_OF_YOUR_CONFIG_FILE></b></div></blockquote><p>Here is a snapshot of my john.conf file I used for this contest:</p><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN0GJIdd_ZGKp0RH_G3mF_839UMMj2yn-MIgTK0KXDKDSXJJpqMRyGHsepa9raZVgSEvxBFwmrTRh_rIZe25fc7-pnl0RK-8ZHpzvRlu3AbIgtM5sArZcON53kQnmrm59WMEtoWeaziiCKOWbWwMGghCBPvxPUrEZpTuAHFLKZ4QSwGOc_hHO5ya5z/s1108/cmiyc_john_conf.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="587" data-original-width="1108" height="170" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjN0GJIdd_ZGKp0RH_G3mF_839UMMj2yn-MIgTK0KXDKDSXJJpqMRyGHsepa9raZVgSEvxBFwmrTRh_rIZe25fc7-pnl0RK-8ZHpzvRlu3AbIgtM5sArZcON53kQnmrm59WMEtoWeaziiCKOWbWwMGghCBPvxPUrEZpTuAHFLKZ4QSwGOc_hHO5ya5z/s320/cmiyc_john_conf.png" width="320" /></a></div><br /> And here is a subset of the rules in my custom "cmiyc.conf" file 
for targeting challenge 20 hashes:<p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4xDaTMdXtm8RZrPag8NQsrQ0vjxB3IUjb5P_AqMJFIUd_xYVPZhB5UKJ1H6813n86UV2Qpz6sRcWNxwfSy__-ks9K-mf3BpNYcn0ailioTPOJ1jQyYbJBjeVHdGYRrqzcmWYxnBUPzwrzHH2IliUIgBJyCZmki2a4C4wzjnhQ7iTNJiiO51D8KXcZ/s1004/cmiyc_challenge20_conf.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1004" data-original-width="640" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi4xDaTMdXtm8RZrPag8NQsrQ0vjxB3IUjb5P_AqMJFIUd_xYVPZhB5UKJ1H6813n86UV2Qpz6sRcWNxwfSy__-ks9K-mf3BpNYcn0ailioTPOJ1jQyYbJBjeVHdGYRrqzcmWYxnBUPzwrzHH2IliUIgBJyCZmki2a4C4wzjnhQ7iTNJiiO51D8KXcZ/w255-h400/cmiyc_challenge20_conf.png" width="255" /></a></div><br /><p>You'll notice I still have individual rule sets in my custom configuration. This way I can perform quick cracking runs to figure out new rules (or pipe the output to other John sessions, see Tip #4), and then have longer runs to perform on new dictionary words that I later identify.</p><p><u style="font-size: large; font-weight: 700;">Tip #4: Use the --stdout and --pipe options to combine multiple cracking rules</u></p><p>In the screenshot above of my rules for targeting challenge 20, you'll see similar blocks of rules where the only difference is the first mangling rule (either nothing, 'c', or 'u'). 'c' stands for Capitalizing and 'u' stands for UPPERCASE. The proper way to handle this would be to leverage John the Ripper's rule preprocessor to try combinations of different rules. The rule preprocessor is one of those killer features that JtR has but Hashcat doesn't. For example you can try multiple rule types (such as capitalization and uppercasing) by including them between brackets []. 
For example:</p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p style="text-align: left;"><b>[cu]</b></p></blockquote><p> Here is a screenshot of that in action:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5AKjjeClZD8oqV06EJZFN0Ti4oA_4N62yrUhvcbYabeR9oKeqm5IsB7J4SEdimrx0EO5Bs9LUHg0MoY99r6F8gAMKuiyEDPYFeUThlQvdJ0RpHuaA-c6NZibu4kaPYtHhBu84pQfEgNYk0MOREhAcKgsnzkr2NQyT4YR6Wzk9RVK0s2MlavsiJqOQ/s2033/john_rule_preprocessor.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="379" data-original-width="2033" height="75" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi5AKjjeClZD8oqV06EJZFN0Ti4oA_4N62yrUhvcbYabeR9oKeqm5IsB7J4SEdimrx0EO5Bs9LUHg0MoY99r6F8gAMKuiyEDPYFeUThlQvdJ0RpHuaA-c6NZibu4kaPYtHhBu84pQfEgNYk0MOREhAcKgsnzkr2NQyT4YR6Wzk9RVK0s2MlavsiJqOQ/w400-h75/john_rule_preprocessor.png" width="400" /></a></div><br /><p>Still, there are times when you have a larger set of rules you quickly want to apply one or more additional mangling rules to. One of the easier ways to do this is to pipe one instance of JtR or Hashcat into another instance of your cracking program of choice.</p><p>The format for doing this with JtR and Hashcat is slightly different. With JtR, the base generating instance will have the '--stdout' flag in place of a hashfile. You can then pipe '|' the results into another JtR instance that has the '--pipe' flag instead of a wordlist. Note: You will want to use the '--pipe' option and not the '--stdin' option so that the rules of the second instance are applied to every word sent to it. 
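</p><p>Since I've been driving things from notebooks lately, here is a minimal Python sketch that assembles such a two-stage command line. The function name, the rule-set names "FirstPass" and "SecondPass", and the file names are placeholders; only the --wordlist, --rules, --stdout and --pipe options are actual JtR flags.</p>

```python
import shlex

def jtr_pipeline(wordlist, first_rules, second_rules, hash_file, john="john"):
    """Build a 'john --stdout | john --pipe' command line as a single string."""
    # Stage 1: generate candidate guesses only, so no hash file is given.
    generator = [john, f"--wordlist={wordlist}", f"--rules={first_rules}", "--stdout"]
    # Stage 2: read the candidates with --pipe so this stage's rules apply to each word.
    cracker = [john, "--pipe", f"--rules={second_rules}", hash_file]
    return shlex.join(generator) + " | " + shlex.join(cracker)

cmd = jtr_pipeline("words.txt", "FirstPass", "SecondPass", "hashes.txt")
print(cmd)
```

<p>The resulting string can be pasted into a terminal or handed to something like subprocess.run(cmd, shell=True) from a notebook cell.</p><p>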
For example:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6RKnjZ0-UehSLvzmXOjsCjpL_j3rLWGhDOv5QPlZZhJcpxX7omju-65k9wRsRWwxbI_yiuUIP23ZsM_njpIXg7AA3RNtI0FkYMBsjtAgW7vBkUu56vTOETAkBco6PMbMZ5uv0U_gkMaUYMGr5CRm9sB4xdyVKqnNIkOSAVy2uC2Y3jjSRDJTjN1TD/s3007/combining_rules.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="834" data-original-width="3007" height="178" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj6RKnjZ0-UehSLvzmXOjsCjpL_j3rLWGhDOv5QPlZZhJcpxX7omju-65k9wRsRWwxbI_yiuUIP23ZsM_njpIXg7AA3RNtI0FkYMBsjtAgW7vBkUu56vTOETAkBco6PMbMZ5uv0U_gkMaUYMGr5CRm9sB4xdyVKqnNIkOSAVy2uC2Y3jjSRDJTjN1TD/w640-h178/combining_rules.png" width="640" /></a></div><br /><p>You can also pipe guesses into Hashcat instead of John the Ripper. This is a very powerful technique because you can take advantage of John the Ripper's rule preprocessor, (or features such as its better Incremental Markov mode, or built-in Prince mode), but still have Hashcat take advantage of your GPUs when cracking hashes. All you need to do in Hashcat is not enter in a wordlist file and it will automatically accept guesses from stdin. This tends to work better if you also have a large number of mangling rules in Hashcat to help keep those GPUs of yours busy since you want to limit the amount of time transferring information from your CPU to the GPU. Aka if you can transfer a limited number of base "words" from the CPU and expand them via additional mangling rules in the GPU, you'll achieve a higher guess per second rate. Below is a screenshot of using this approach. Ignore the '--force' option as I took the screenshot on my laptop vs. 
my desktop which I normally run my Hashcat sessions from.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5KVWkhPWj9HHuBoxRD-frA2RKyADyaIwaGfjzWrF6w0sjAWg8YGXqXFpiESkxHUtoq6OFhgebCoYDg2Pp7MW3TgRvBF1GHtV4r0NeZjNU-rcd4_gLQPyiFFbQ3ZFY4HnsJWqtVr4XuXyWX3F4HDSeF6hav_JuU_5fNffYCOlIVLUQPsBg8GZVwglu/s3000/combining_hashcat_rules.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="242" data-original-width="3000" height="52" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh5KVWkhPWj9HHuBoxRD-frA2RKyADyaIwaGfjzWrF6w0sjAWg8YGXqXFpiESkxHUtoq6OFhgebCoYDg2Pp7MW3TgRvBF1GHtV4r0NeZjNU-rcd4_gLQPyiFFbQ3ZFY4HnsJWqtVr4XuXyWX3F4HDSeF6hav_JuU_5fNffYCOlIVLUQPsBg8GZVwglu/w640-h52/combining_hashcat_rules.png" width="640" /></a></div><br /><p><u style="font-size: large; font-weight: 700;">Tip #5: For password cracking competitions, perform web searches on "interesting" words</u></p><p>This was the piece of advice I wish I could build a time machine and send back to my past self. I really didn't do a good job of this during the contest. This is despite the fact that Saturday night I finally googled some of the words for challenge #20 and found that creating wordlists <a href="https://gizmodo.com/stop-the-steal-hacker-homecoming-queen-charged-as-ad-1846822348">from articles discussing a high schooler hacking the Homecoming queen prom vote</a> was extremely effective. 
In fact, I had the biggest jumps in my score thanks to finding those articles.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHM4baCq_t3SrFKwAEykVAStft1JlUjmwEhc4FEpRpftQ3DjWDotQd6CoFHzHizpwEEetH-ZaObwmEcnQPWzVi1A4Tc1EESVyBFFttz61S-LbYCru_QVscHFCK0S6-GiOi3qduD5yvvywHfUg_2THEIy2H_SzsIJD6z9CWqF-C33Yhj1EBz7rv9nfi/s2092/score_jump.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1476" data-original-width="2092" height="283" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhHM4baCq_t3SrFKwAEykVAStft1JlUjmwEhc4FEpRpftQ3DjWDotQd6CoFHzHizpwEEetH-ZaObwmEcnQPWzVi1A4Tc1EESVyBFFttz61S-LbYCru_QVscHFCK0S6-GiOi3qduD5yvvywHfUg_2THEIy2H_SzsIJD6z9CWqF-C33Yhj1EBz7rv9nfi/w400-h283/score_jump.png" width="400" /></a></div><br /><p>This is an area ripe for tool development. Admittedly it likely won't have many real-world applications. But for contests, having a tool or process to automate the identification of sources of wordlists would be super helpful. 
In my head, the tool would take the following approach:</p><p></p><ol style="text-align: left;"><li>Use the PCFG trainer to create an input wordlist of the base words in cracked passwords</li><li>Identify words that weren't in the "top 500 English words" or in John the Ripper's "password.lst" wordlist</li><li>Perform a google search and identify results that contained [all/most] of the words to identify possible sources of the wordlist</li><li>Scrape the sites and build a custom dictionary.</li></ol><div>Who knows, maybe I'll get motivated and have this done before next year's CMIYC?</div><p style="text-align: left;"><span style="font-size: medium;"><b><u>Tip #6: Use Linux's 'alias' command to make your commands shorter</u></b></span></p><div>I'll admit I don't always do this, (for example see all of the screenshots above), but rather than type the full path for John the Ripper or Hashcat, you can use Linux's 'alias' command to link to them. For example:</div><p></p><blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px; text-align: left;"><p></p><div><b>alias john=/mnt/c/github/JohnTheRipper/run/john</b></div><p></p><p></p><div><b>alias hashcat=/mnt/c/tools/hashcat/hashcat.bin</b></div><p></p></blockquote><p>With the above, now you can simply type 'john' or 'hashcat' to invoke them. Note: This works better than trying to add the John the Ripper or Hashcat directories to your command path as John the Ripper specifically gets weird when you do that. This probably won't help you crack more passwords, but it is a nice quality of life improvement, especially if you have different directories you are maintaining for contest hash lists and dictionaries.</p><p><span style="font-size: medium;"><b><u>Tip #7: Modify the PCFG's multiword detector to identify shorter words</u></b></span></p><p>Of course I need to make a new tip utilizing the PCFG toolset! The PCFG trainer is a really powerful tool to create input dictionaries from cracked passwords. 
During this contest, one thing I noticed from the passwords I was cracking was that KoreLogic added a large number of two/three letter prefixes/suffixes to the base word. For example, here are some of the mangling rules I started using.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbbnLNn7QmHTbFwMnGxDjUd1aj7fnm--KIHgLMsC6K717PfGwCVjIOnomlyV8G9PfBcxvI7fnZxp0wyCQCDzAuXS0U21F7McNnCU3zLADLQIMcX20LSFuFsYTxT3BQiv-I9p9A6saQhTiRLTB1Mm68ZzgSj0mnR8gT1cpLtUxxs6PFxnv_LYhUEieZ/s1139/cmiyc_multiword.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1139" data-original-width="734" height="400" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbbnLNn7QmHTbFwMnGxDjUd1aj7fnm--KIHgLMsC6K717PfGwCVjIOnomlyV8G9PfBcxvI7fnZxp0wyCQCDzAuXS0U21F7McNnCU3zLADLQIMcX20LSFuFsYTxT3BQiv-I9p9A6saQhTiRLTB1Mm68ZzgSj0mnR8gT1cpLtUxxs6PFxnv_LYhUEieZ/w258-h400/cmiyc_multiword.png" width="258" /></a></div><div class="separator" style="clear: both; text-align: center;"><br /></div><p>One problem I had utilizing the PCFG trainer on these passwords was that its multiword detector enforced a minimum length of five characters for detecting base words. This was to reduce false positives. Or to put it another way, if you are parsing 60 million passwords and you reduce the minimum base-word length to three characters, everything will look like a multiword!<br /></p><p>The difference during a competition is that your training list is not 60 million passwords long (unless you are doing really, really well!). Therefore it was helpful for me to modify my code to detect multiwords whose base words were only three characters in length. 
I eventually plan on releasing a patch to the PCFG toolset to make this a command line option, but until then you can make the changes yourself <a href="https://github.com/lakiw/pcfg_cracker/blob/0582eeff508db7d880bbd2e2214f51c0a868a8cc/lib_trainer/run_trainer.py#L80">here in the code:</a></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnjYYjcHJGwQWz9L8VASHldNo28PmncHvcnQWeaGu04ACK3OokxFTqDHh0t7TsUpEi-bLLHQOL3VwU70mZmrcV-tteSajxh4rkYdb1dPalnh64Y0mwK_RWjvWvJEATjMaQ4XXLdBhk7u3OLU-4b2I1JT8MeZc5inY_hjb8imwjmNmWNR-cQrIUvFR0/s963/pcfg_multiword.png" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="331" data-original-width="963" height="138" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjnjYYjcHJGwQWz9L8VASHldNo28PmncHvcnQWeaGu04ACK3OokxFTqDHh0t7TsUpEi-bLLHQOL3VwU70mZmrcV-tteSajxh4rkYdb1dPalnh64Y0mwK_RWjvWvJEATjMaQ4XXLdBhk7u3OLU-4b2I1JT8MeZc5inY_hjb8imwjmNmWNR-cQrIUvFR0/w400-h138/pcfg_multiword.png" width="400" /></a></div><br /><p><span style="font-size: medium;"><b><u>Conclusion:</u></b></span></p><p>As I continue to reflect on this contest, I'll probably keep adding to the list of tips above. Even as I write this conclusion other ideas are popping into my head (such as using the online version of Microsoft OneNote to pass documentation and commands between different computers). 
But I want to conclude by saying I hope these blog posts are helpful, and that I really want to thank the KoreLogic team once again for running an amazing contest.</p>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-76182552441878592902022-05-03T19:51:00.011-07:002022-05-09T10:39:00.974-07:00Password Cracking Tips: A CrackTheCon Roundup<blockquote style="border: none; margin: 0px 0px 0px 40px; padding: 0px;"><p style="text-align: left;"><b><i><span style="background-color: white; color: #181818; font-family: Merriweather, Georgia, serif; font-size: 14px;">“It is common sense to take a method and try it. If it fails, admit it frankly and try another. But above all, try something.”</span><span style="color: #181818; font-family: Merriweather, Georgia, serif;"><span style="font-size: 14px;">― Franklin D. Roosevelt</span></span></i></b></p></blockquote><p>CrackTheCon, a password cracking contest run by CynoSurePrime, just finished. I competed as a Street team and I was really impressed. This was a well-run contest, and I felt it was very friendly to new and experienced password crackers alike. At least from a player's perspective, the infrastructure was rock solid, there was a great variety of challenges, and the difficulty level had a good gradient. Thanks to everyone who helped put this contest together!</p><p>My computer setup for this challenge was limited. I performed all my cracking on one laptop with no GPU support. You read that right, I was rolling old school with a pure CPU cracking session. Because of that, my primary password cracking program was John the Ripper, which has a ton of features that I prefer when I can't just let Hashcat burn through some GPUs. While my operating system was Windows, I used Windows Subsystem for Linux to run John the Ripper and perform analysis on the cracked passwords. 
You can read about how to configure JtR and WSL <a href="https://reusablesec.blogspot.com/2019/08/installing-john-ripper-on-microsofts.html">here</a>.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtNvt5QSih39zWHhgc4D_GWsslKplMjNpmR0gIBwz1_1cJ3smS3pghd2mimyYYXhOSXjMv66HFewOyf_zX06t6dypuYKppqtalf9r_7kqUS_Dv8YCOlUi1McjgsOkR7dBvIS-h-1_hLK2Etbn6MZBz2kJHCWum-ogQxAaadbEv7spz6u0KfkEESk_l/s3335/laptop_cat.jpeg" style="margin-left: 1em; margin-right: 1em;"><img alt="Picture of my laptop and my cat" border="0" data-original-height="1858" data-original-width="3335" height="223" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtNvt5QSih39zWHhgc4D_GWsslKplMjNpmR0gIBwz1_1cJ3smS3pghd2mimyYYXhOSXjMv66HFewOyf_zX06t6dypuYKppqtalf9r_7kqUS_Dv8YCOlUi1McjgsOkR7dBvIS-h-1_hLK2Etbn6MZBz2kJHCWum-ogQxAaadbEv7spz6u0KfkEESk_l/w400-h223/laptop_cat.jpeg" title="My hash cat not being helpful" width="400" /></a></div><p>This led to a modest performance of 9th place:</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizDcV9wzdSjVRqoxYZpPDSpVB1Tf3GSuBxEXw0oPwy0a5EVRarQaaez47Ax-qJkS23Ess4SPqx0atrG45qKqVwmvrw_RG8wwuwZnVnbkyJTMzYCzLxAT1dq1VEKlHRikSFpBfv9HAoj3aydnOcrk58Z7f-dtgATIGfxZgeNijzAHaxL_cNIcgb8izv/s930/ctc_rankings.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Score Ranking of the Crack the Con Street Teams (9th Place)" border="0" data-original-height="572" data-original-width="930" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEizDcV9wzdSjVRqoxYZpPDSpVB1Tf3GSuBxEXw0oPwy0a5EVRarQaaez47Ax-qJkS23Ess4SPqx0atrG45qKqVwmvrw_RG8wwuwZnVnbkyJTMzYCzLxAT1dq1VEKlHRikSFpBfv9HAoj3aydnOcrk58Z7f-dtgATIGfxZgeNijzAHaxL_cNIcgb8izv/w400-h246/ctc_rankings.png" title="CrackTheCon Final Street Team Rankings" width="400" /></a></div><br /><p>GPUs are nice, and this certainly shows it! 
If you have some GPUs available, I highly recommend using them along with Hashcat. As some backstory, I still have my main password cracker set up to run medical security capture the flag events, and I was too lazy to get it reconfigured for this contest.</p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxsXxBQoKup9wqigsSgz1Ml4zwe55JancC7rs5FH-U7FlY9EOetuAZQnsNMG_UjSoR_zp5wW_70JosVpC78tsbyj4-D8NvvfzVb17Ry9JZzdhey36NiEqJbbcwf1LDx-5J0WNhUkpFhz8i2h54dhFrdoVuH8Pit6jr3UaD7A_LW5IiuGm_--cFmVth/s3580/netmux_server.jpeg" style="margin-left: 1em; margin-right: 1em;"><img alt="Computer surrounded by infusion pumps" border="0" data-original-height="2094" data-original-width="3580" height="234" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxsXxBQoKup9wqigsSgz1Ml4zwe55JancC7rs5FH-U7FlY9EOetuAZQnsNMG_UjSoR_zp5wW_70JosVpC78tsbyj4-D8NvvfzVb17Ry9JZzdhey36NiEqJbbcwf1LDx-5J0WNhUkpFhz8i2h54dhFrdoVuH8Pit6jr3UaD7A_LW5IiuGm_--cFmVth/w400-h234/netmux_server.jpeg" title="The attached medical equipment really pumps things up!" width="400" /></a></div><p>Therefore you should probably take everything I say with a healthy degree of skepticism. Based on the chat on Discord afterwards, though, I realized there are a few password cracking tips that might be helpful to share. One important point I want to stress is that anyone can make use of these tips. You don't need a fancy GPU hash-cracking monster to crack passwords. In fact, almost all of my attacks were "semi-automated" with very little manual analysis of the cracked passwords. 
So you can apply all of these techniques yourself regardless of your past level of experience.</p><h3 style="text-align: left;"><b><br /></b></h3><h3 style="text-align: left;"><b>Tip #1: Make sure your John the Ripper build is based off Bleeding-Jumbo, and update it regularly!</b></h3><div><br /></div><div>Even if you normally use Hashcat, JtR is a very powerful password cracking tool that has a lot of nice "research friendly" features. This makes it an extremely useful tool to have in your toolbox. As a general rule of thumb, if I'm cracking passwords with a GPU I use Hashcat. If I'm leveraging my CPU I use JtR. The key to JtR is you need to use the Bleeding-Jumbo version of it. The "main" branch prioritizes compatibility with different architectures, but the Bleeding-Jumbo branch goes all in on features. As an example, over the last couple of months they added "duplication detection" to the early portions of a password cracking session (to help with slow or salted hashes), and performed a complete rework of the included rulesets. What I do is use Git to clone JtR from its github repo at: <a href="https://github.com/openwall/john">https://github.com/openwall/john</a>, check out the "bleeding-jumbo" branch, and then periodically pull down updates and rebuild it, (roughly once a month). 
This makes a huge difference!</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9Tx4WYxKOqOY2e7pF92bjCh7nOo3iVP5ouIroNnGNGfur3JUHnMY3lS-k0wEavpEZC9z_OLWU2Ovoy7hDO0Uvlm3K_hp3Wwrs9BliLrS2oYRCygr-GhEkLl3MUMIOS1A_8oyQ5rhxLd9WBDvYZKpmExOE2DQA5veBYJcvpyAODk8hHUtXTb_1Zpti/s622/jtr_bleeding_jumbo.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Picture of John the Ripper's Github site" border="0" data-original-height="327" data-original-width="622" height="210" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg9Tx4WYxKOqOY2e7pF92bjCh7nOo3iVP5ouIroNnGNGfur3JUHnMY3lS-k0wEavpEZC9z_OLWU2Ovoy7hDO0Uvlm3K_hp3Wwrs9BliLrS2oYRCygr-GhEkLl3MUMIOS1A_8oyQ5rhxLd9WBDvYZKpmExOE2DQA5veBYJcvpyAODk8hHUtXTb_1Zpti/w400-h210/jtr_bleeding_jumbo.png" title="Bleeding Jumbo Branch of John the Ripper" width="400" /></a></div><br /><div>As to the deeper question of "Why would you ever crack passwords on a CPU and not GPU", that gets more complicated... At a high level, I do a lot of password cracking research from a researcher and hobbyist viewpoint, so a CPU based approach makes it easier to tailor attacks. The real reason though is I don't own a massive cracking setup. From a training perspective, this means even if you only have a Raspberry Pi, you can pretty much recreate all of the techniques described here. 
That being said, sometimes the features of John the Ripper still outweigh the speed that a GPU provides, and at the very least it's a good tool to run in parallel on a VM or research computer while running longer GPU sessions with Hashcat on your main cracking box.</div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #2: Use the '--loopback' option to leverage previously cracked passwords in your rules</h3><div><br /></div><div>The clickbait title was going to be: "This one simple trick is like a cheat code for password cracking competitions!" That's not much of an exaggeration. Full disclaimer, this technique probably resulted in around 50% of my successful password cracks in the CrackTheCon competition. I'd periodically wander over to my laptop and feel l33t by hitting the enter key to kick off a new loopback session. So if you only follow one of these tips, this is the one to pay attention to.</div><div><br /></div><div>As to the actual technique itself, John the Ripper's '--loopback' option tells JtR to use previously cracked passwords as a wordlist in a cracking session. Hashcat supports loopback attacks as well. There are a million different names for this approach, which by itself should tell you how powerful it can be. You can further optimize this attack by specifying a different .pot file from your main one, such as '--loopback=Challenge1.pot'. This can be helpful if you are keeping your .pot files separate for different challenges (I don't actually do this, but some people might). Once you are using --loopback to generate your base words, you can then apply mangling rules to them like a normal wordlist, for example by also adding: '--rules=hashcat'. </div><div><br /></div><div>What this meant was that my typical cracking session would start by running fairly basic attacks to generate an initial set of cracked passwords. For example, I'd run '--incremental' to brute force shorter passwords. 
I'd run a quick cracking session using the wordlist 'dic-0294' and hashcat + single rules to get slightly more complicated passwords. And I'd run a quick PCFG guessing session as well. After that initial set of passwords was cracked, loopback became one of my main attacks. And as you can see from the results, it was very effective.</div><div><br /></div><div>Now in the real world, loopback attacks, while still powerful, aren't nearly as game-breaking as they are in a password cracking competition. Real users don't exclusively pick their passwords from a list of fungi names. But even then, loopback can still be useful to help augment your other cracking sessions. </div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #3: John the Ripper supports dynamic hash formats on the command line. No need to modify a kernel or look through lots of documentation!</h3><div><br /></div><div>This being a CynoSurePrime cracking competition, there were bound to be weird hash types to crack. This problem also pops up from time to time in real-life cracking situations where some vendor decides to roll their own password hashing function. This can be a challenge since writing your own Hashcat kernel is not a lot of fun. One area where JtR really shines is its extensive "Dynamic" hash type support. You can see the main formats that JtR supports by specifying '--list=formats' on the command line. That only shows the "mainstream" formats though. 
If you really want to see all the various formats supported by "Dynamic" mode you can specify '--list=subformats' on the command line.</div><div><br /></div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRW_sjV70HZUNBYwJta-mHAORR32q586izFa8KNcUM6-xzmEqjYkfxCSJBpTYLzr16cpnLxZcqQNi6CgwaIWyqRtDBT4hCw9_zpgCuH2x_HEoTrfpyXb59X36-_mfKN9J5F9SpsJnTJZRZT2VXmIEKg2evKnNXvMTepLySu-ypRBhuI12NGz3ZBiqK/s1142/subformats.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Lots of dynamic format details" border="0" data-original-height="1142" data-original-width="1086" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjRW_sjV70HZUNBYwJta-mHAORR32q586izFa8KNcUM6-xzmEqjYkfxCSJBpTYLzr16cpnLxZcqQNi6CgwaIWyqRtDBT4hCw9_zpgCuH2x_HEoTrfpyXb59X36-_mfKN9J5F9SpsJnTJZRZT2VXmIEKg2evKnNXvMTepLySu-ypRBhuI12NGz3ZBiqK/w304-h320/subformats.png" width="304" /></a></div><br /><div><br /></div><div>There's a lot of them included, and sometimes even that is a pain to look through and remember. One feature of JtR most people don't know about though is you can specify the hash details directly on the command line. For example, Challenge2 of the CtC contest was five rounds of MD5. To crack this with John the Ripper I simply needed to specify the following command:</div><div><b><blockquote>./john '--format=dynamic=md5(md5(md5(md5(md5($p)))))'</blockquote></b></div><p>The single quote around format is important so that your shell command doesn't misinterpret the parenthesis (). Basically though, you can specify the hash type, and how the password ($p) is applied, along with any salt ($s) as well. 
Dynamic mode supports multiple types of hash primitives. For example, with Challenge4, which was a sha256 of an md5 hash, I was able to use the following command:</p><blockquote><p><b>./john </b><b>'--format=dynamic=sha256(md5($p))'</b></p></blockquote><p>Long story short, if you ever find yourself needing to crack a weird hash type, don't forget about John the Ripper's Dynamic formats. </p><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #4: Leverage MDXFind to identify unknown hash types</h3><div><br /></div><div>I'll be up-front. I did not follow this tip and I'm really kicking myself over it. To guess the hash types, I relied on trying the suggestions provided by John the Ripper, and when that failed, I manually tried different hashing functions using the command line dynamic mode (Tip #3). Don't be like me. If you are dealing with an unknown hash, the tool you want to use is MDXFind. You can obtain it here: <a href="https://www.techsolvency.com/pub/bin/mdxfind/">https://www.techsolvency.com/pub/bin/mdxfind/</a>. If I had followed this advice, I probably would have ranked higher, as I never figured out that Challenge #5 was:</div><div><blockquote> <b>--format=dynamic=sha256(sha1($p))</b></blockquote><p> To get MDXFind running on an Ubuntu image running on Windows Subsystem for Linux (WSL2):</p><p></p><ol style="text-align: left;"><li>Download mdxfind.1.116.bin</li><li>sudo apt-get install libjudy-dev</li><li>sudo apt-get install libmhash-dev</li><li>sudo apt-get install librhash-dev</li></ol><div>Here is an example leveraging MDXFind to identify the hash type for Challenge #5. 
The passwords look like SHA256, so the command I'd start with would be:</div><blockquote><div><b>./mdxfind.1.116.bin -h 'SHA256' -f Challenge5.txt wordlist.txt</b></div></blockquote><p></p><ul style="text-align: left;"><li><b>-h 'SHA256':</b> is the base hash type to use</li><li><b>-f Challenge5.txt:</b> is the hashlist</li><li><b>wordlist.txt: </b>is the wordlist </li></ul><div>And the results...</div><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4D-DxJ_fJM-Fo9sqG-NC9LBIQPkhaWBDmL2-AQE8Pu3r4ZEktCev9ao1w-YnOR53a2rQzbZ8NVY1zMn16vj3XJwOEcnhOaaVnbkouz-snLAJ3zeyjaP_ucG9pPls0PYlJMXrvUNK1VzrXlxzGeKifwlkW5myga7aZMp-Z4uFgJjtJiG6CowCqfdx5/s898/mdxfind.png" style="margin-left: 1em; margin-right: 1em;"><img alt="MDXFind Results" border="0" data-original-height="537" data-original-width="898" height="239" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4D-DxJ_fJM-Fo9sqG-NC9LBIQPkhaWBDmL2-AQE8Pu3r4ZEktCev9ao1w-YnOR53a2rQzbZ8NVY1zMn16vj3XJwOEcnhOaaVnbkouz-snLAJ3zeyjaP_ucG9pPls0PYlJMXrvUNK1VzrXlxzGeKifwlkW5myga7aZMp-Z4uFgJjtJiG6CowCqfdx5/w400-h239/mdxfind.png" width="400" /></a></div><br /><div>It quickly identified SHA256(SHA1($p)) in 4 seconds... Yeah that would have been nice to use.</div><p></p><p></p><b><i></i></b></div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #5: John the Ripper supports mangling rules on the command line</h3><div><br /></div><div>In password cracking competitions one of the keys is to try and identify mangling techniques and create rulesets to target them. Now, I'll admit that for this competition, I mostly relied on the included rulesets (Tip #6), and using the PCFG Toolset to autodetect and create rulesets (Tip #7). A hands on approach is more effective though, but it quickly becomes annoying to have to constantly open up your ruleset file to modify it. 
This may sound like a minor nitpick, but your analysis time is valuable. One hidden feature John the Ripper supports is creating rulesets right on the command line. This is a huge timesaver, and in my opinion one of the killer features of John the Ripper. For example, let's say you want to duplicate a word and then add two digits to the end of it. Your JtR command might look like:</div><blockquote><div><b>./john --wordlist=somelist.txt '--rules=:d$[0-9]$[0-9]' hashlist</b></div></blockquote><p>Key points:</p><p></p><ul style="text-align: left;"><li>You need to wrap the --rules option in single quotes. Aka '--rules...'</li><li>Your rule needs to start with ':', which is JtR's "no-op"</li><li>You can include multiple rules separated by a ';'. For example: <b>'--rules=:d$[0-9]$[0-9];:$[a-z]' </b></li></ul><div>This is also very useful to test the output of your rules. To do this you can feed in a single word via stdin, and then apply rules to it using JtR's --pipe option. So for example:</div><blockquote><div><b>echo test | ./john --stdout --pipe '--rules=:d$[0-9]$[0-9]'</b></div></blockquote><p></p><div class="separator" style="clear: both; text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQXcnLSzOaNZTT47dG_Pkc0fPt_YzBjEMVlk3ebC8rsqg8NFZvB2G8FbOI9T8TwZdH1jbMPT5Vvr91PRnR7wL5XBoIDEqHfpl811pCu6R2KMOMQD1L7gxQXSfVyPma7bJpA4BFbehsDVqqgvtvyeD786Wsec-z2neVybQBzHW6bJEqIKqyse39cKyM/s1149/jtr_rules_cmd_line.png" style="margin-left: 1em; margin-right: 1em;"><img alt="Testing JTR Rules Using Cmd Line Switches" border="0" data-original-height="287" data-original-width="1149" height="100" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQXcnLSzOaNZTT47dG_Pkc0fPt_YzBjEMVlk3ebC8rsqg8NFZvB2G8FbOI9T8TwZdH1jbMPT5Vvr91PRnR7wL5XBoIDEqHfpl811pCu6R2KMOMQD1L7gxQXSfVyPma7bJpA4BFbehsDVqqgvtvyeD786Wsec-z2neVybQBzHW6bJEqIKqyse39cKyM/w400-h100/jtr_rules_cmd_line.png" width="400" /></a></div><br /><div>It may seem weird, but this is one of 
those tricks that makes me smile every time I use it.</div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #6: Making use of John the Ripper's mangling rulesets</h3><div>John the Ripper includes a ton of wordlist mangling rules. Given this contest was run by CynoSurePrime, I figured there would be heavy hashcat users on the hash creation side, so I primarily used the 'hashcat' ruleset. Aka:</div><div><b><blockquote>./john --wordlist=somelist.txt --rules=hashcat</blockquote></b></div><div><div>This runs through some of the main rules included in Hashcat such as:</div><div></div><blockquote><div>[List.Rules:hashcat]</div><div>.include [List.Rules:best64]</div><div>.include [List.Rules:d3ad0ne]</div><div>.include [List.Rules:dive]</div><div>.include [List.Rules:InsidePro]</div><div>.include [List.Rules:T0XlC]</div><div>.include [List.Rules:rockyou-30000]</div><div>.include [List.Rules:specific]</div></blockquote><p>Other useful rulesets (though not as useful for this particular competition):</p><p></p><ul style="text-align: left;"><li>--rules=phrase: Great for attacking passphrases</li><li>--rules=l33t: Good for attacking l33tsp33k passwords</li><li>--rules=ShiftToggle: Good for attacking weird capitalization</li><li>--rules=by-score: A good set of rules to use for fast hashes</li><li>--rules=by-rate: A good set of rules to use for slower hashes</li></ul><p></p><div></div></div><h3 style="text-align: left;"><br /></h3><h3 style="text-align: left;">Tip #7: Using the Pretty Cool Fuzzy Guesser (PCFG) Toolset</h3><div><br /></div><div>Of course I was going to mention the PCFG toolset! I just recently released version 4.3 and it has a ton of expanded documentation, plus better support for cracking Russian passwords. You can get it here:
You can get it here:</div><div><a href="https://github.com/lakiw/pcfg_cracker">https://github.com/lakiw/pcfg_cracker</a></div><div><br /></div><div>The default PCFG ruleset is usually decent, but not great when it comes to password cracking competitions. This is because context passwords don't resemble RockYou passwords which the default ruleset was trained on. The real value is using the PCFG trainer to learn new rules and create new wordlists based on cracked passwords. The trainer does a lot of cool stuff in the backend such as multiword detection, keyboard walk identification, and other mangling rule generation. It can also be fairly effective even if you only have a couple of hundred cracked passwords.</div><div><br /></div><div>To train a PCFG ruleset:</div><div><ol style="text-align: left;"><li>You first need to create the training list. Adding support for JtR pot files has been on my todo list forever, but currently you need to strip the hash information off of your cracked passwords. For example, if I wanted to create a training list for Challenge #2 which was five rounds of md5 I ran: <b>"cat john.pot | grep 'md5(md5(md5(md5(' | awk -F':' '{print$2$3$4$5$6}' > plains_2.txt" </b>Yes this is a horribly inefficient way to do this, but it printed out all of the hashes, then only printed ones with the correct hashtype, then stripped off the hash, and then saved the results to plains_2.txt.</li><li>Run the PCFG trainer on the set. For example: <b>"python3 trainer.py -r Challenge2 -t plains_2.txt</b></li></ol><div>Once you have the training set you can do a couple of things:</div></div><div><ol style="text-align: left;"><li>You can run a PCFG attack against the challenge using the new ruleset. For this I recommend disabling Markov generation using the --skip_brute option. 
For example: <b>python3 pcfg_guesser.py --skip_brute -r Challenge2 | ../JohnTheRipper/run/john --stdin '--format=dynamic=md5(md5(md5(md5(md5($p)))))' Challenge2.txt</b></li><li>Another good option is to use prince_ling to generate a wordlist optimized for PRINCE attacks. You can use this as a normal wordlist as well. For example, this will create a 50k word dictionary: <b>python3 prince_ling.py -r Challenge2 --size 50000 -o new_wordlist.txt</b></li><li>You can manually go through the generated rules files to identify mangling rules. A good one to open up is: <b>Rules/<RULENAME>/Grammar/grammar.txt</b></li></ol><div>Summing all of this up, 99% of my cracking sessions for this contest were:</div></div><div><ul style="text-align: left;"><li>Identify the correct hashtype</li><li>Run a default attack against it using the dic-0294 wordlist and the hashcat rules</li><li>At the same time run a JtR bruteforce Incremental attack <b>"./john --incremental=All"</b>. That's the nice thing about CPU cracking. I have enough cores that I can run around three sessions at the same time on my laptop before things get really slow.</li><li>Run a couple of loopback attacks using the hashcat rules</li><li>Train a PCFG ruleset and run a PCFG cracking session until it gets to around 95% coverage. You can see the coverage by hitting enter while running it.</li><li>Run a couple more loopback attacks</li><li>Re-Train the PCFG ruleset</li><li>Create a wordlist using the PCFG prince_ling</li><li>Run a PRINCE cracking session using the wordlist and JtR</li><li>Run a normal cracking session using the prince wordlist and the hashcat ruleset</li><li>Re-Train the PCFG ruleset and run a PCFG cracking session</li><li>Repeat. 
Maybe run a longer incremental session, or try another input dictionary.</li></ul><div><br /></div><div>Following these steps, you too can get 9th place in a password cracking competition!</div></div>Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-51649212118358059952019-08-01T19:40:00.000-07:002019-08-02T07:43:11.930-07:00Installing John the Ripper on Microsoft's Windows Subsystem for Linux (WSL)<blockquote class="tr_bq">
<span style="background-color: white; color: #333333; font-family: "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;"><b><i>"I see my path, but I don't know where it leads. Not knowing where I'm going is what inspires me to travel it." --</i></b></span><b>Rosalía de Castro</b></blockquote>
<h3>
Introduction:</h3>
With great regret I finally decided to retire my 10-year-old MacBook Pro as my personal travel laptop. Part of that is I'll be attending Defcon this year to help out #<a href="https://www.iamthecavalry.org/" target="_blank">IAmTheCavalry</a> and the <a href="https://wehearthackers.org/" target="_blank">#WeHeartHackers</a> initiative by volunteering in the <a href="https://www.villageb.io/" target="_blank">Defcon Biohacking village</a>. Side note, if you are in Vegas, feel free to drop by and we can talk about cyber security in a clinical setting. Doctors and nurses hate passwords too!<br />
<br />
Getting back on track, I wanted something a bit more modern to participate in this year's <a href="https://contest-2019.korelogic.com/" target="_blank">Crack Me If You Can Competition</a>, as well as to play around in the various hacking villages, so I bought myself a Microsoft Surface Book. The challenge was that while Hashcat has a native Windows build, my experiences getting John the Ripper (JtR) running on Windows in the past have been ... troubled. That's part of why I loved my old MacBook. It just worked (sorry Linux), and JtR ran great on it. Now I could re-image my laptop with Linux or dual boot it, but having Excel and Notepad++ makes my life so much better. Plus, I'm really digging the tablet. So before I went ahead and installed VirtualBox and ran JtR in a VM, I figured I'd try to install JtR using the new Windows Subsystem for Linux (WSL). Long story short, it worked great and was straightforward to do, so I figured I'd share my experiences.<br />
<br />
<h3>
Other Options for Running John the Ripper on Windows</h3>
If you want to skip this guide and instead install a pre-built executable of JtR, you can obtain a relatively up-to-date version here: <a href="https://github.com/claudioandre-br/packages/releases/tag/jumbo-dev">https://github.com/claudioandre-br/packages/releases/tag/jumbo-dev</a><br />
<br />
Note: I've never run these, so I'm not very familiar with how they perform.<br />
<br />
Other options include installing JtR using Cygwin. A guide for doing so is available here: <a href="https://openwall.info/wiki/john/tutorials/win64-howto-build">https://openwall.info/wiki/john/tutorials/win64-howto-build</a><br />
<br />
Finally, a very common option that I referenced above is to simply install VirtualBox, and then run JtR in a VM.<br />
<br />
<h3>
Windows Subsystem for Linux:</h3>
<div>
If you are wondering what WSL is, you are not alone! At a high level, it lets you run Linux programs on Windows without having to recompile them or run them in Cygwin. To steal <a href="https://docs.microsoft.com/en-us/windows/wsl/about" target="_blank">Microsoft's own words</a>:</div>
<blockquote class="tr_bq">
<blockquote class="tr_bq">
The Windows Subsystem for Linux lets developers run a GNU/Linux environment -- including most command-line tools, utilities, and applications -- directly on Windows, unmodified, without the overhead of a virtual machine.</blockquote>
<blockquote class="tr_bq">
You can:</blockquote>
<blockquote class="tr_bq">
<ol>
<li>Choose your favorite GNU/Linux distributions from the Microsoft Store.</li>
<li>Run common command-line free software such as grep, sed, awk, or other ELF-64 binaries.</li>
<li>Run Bash shell scripts and GNU/Linux command-line applications including:</li>
</ol>
<ul><ul>
<li>Tools: vim, emacs, tmux</li>
<li>Languages: Javascript/node.js, Ruby, Python, C/C++, C# & F#, Rust, Go, etc.</li>
<li>Services: sshd, MySQL, Apache, lighttpd</li>
</ul>
</ul>
</blockquote>
<blockquote class="tr_bq">
<ol>
<li>Install additional software using your own GNU/Linux distribution package manager.</li>
<li>Invoke Windows applications using a Unix-like command-line shell.</li>
<li>Invoke GNU/Linux applications on Windows.</li>
</ol>
</blockquote>
</blockquote>
<div>
The mechanics are complicated, with significant differences between WSLv1 and WSLv2. This guide was written with WSLv1, though if I get adventurous before Defcon I may try to upgrade to WSLv2.<br />
<br /></div>
<h3>
Enabling WSLv1 and Installing a Linux Distro:</h3>
<div>
The first thing you need to do is enable WSLv1 as it is disabled by default. As a fair warning, this will require a reboot.</div>
<div>
<ul>
<li>There are several ways to enable WSLv1. I opted to use PowerShell. The first step then is to open an Administrative instance of PowerShell. </li>
<li>Run the following command (<a href="https://docs.microsoft.com/en-us/windows/wsl/install-win10" target="_blank">ref</a>):</li>
</ul>
<ol>
<ul>
<li>Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux</li>
</ul>
</ol>
<ol><ul>
</ul>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrMae97k0xZrtA6DQ2viR5xMUavDQx2rJvHlCYt3dCiJygj6nZgHR7xAxlK7cOSo41C9PnbNM3JIaUskQQhKZtYlfaqAESZsM9duJhY5D3UbGUEC7DqH5ydi5GlFOUcIKRT14tvkv4uxc/s1600/enable_linux.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="256" data-original-width="1595" height="62" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgrMae97k0xZrtA6DQ2viR5xMUavDQx2rJvHlCYt3dCiJygj6nZgHR7xAxlK7cOSo41C9PnbNM3JIaUskQQhKZtYlfaqAESZsM9duJhY5D3UbGUEC7DqH5ydi5GlFOUcIKRT14tvkv4uxc/s400/enable_linux.png" width="400" /></a></div>
</ol>
<ul>
<li>Reboot your system when prompted to.</li>
<li>Once your computer starts back up, the next step is to pick a Linux distro. Open the Microsoft Store and type Linux in the search menu.</li>
</ul>
</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9SnU5egO8TI0ytBDfLUHNfTkbTh1NnK6a7owQHDzZT_w5Rj9VpVtBf20ARKB_1tS6jBGpuKv2WRAnrUL1gJ5MP3GIvuTJ76eIuAbl7rCu47VVCTYl_QBIaTzQ5Gm_Dcu8ThkK9-F4S6I/s1600/linux_on_windows.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1050" data-original-width="1600" height="262" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj9SnU5egO8TI0ytBDfLUHNfTkbTh1NnK6a7owQHDzZT_w5Rj9VpVtBf20ARKB_1tS6jBGpuKv2WRAnrUL1gJ5MP3GIvuTJ76eIuAbl7rCu47VVCTYl_QBIaTzQ5Gm_Dcu8ThkK9-F4S6I/s400/linux_on_windows.png" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<ul>
<li style="text-align: left;">Side note: You'll be happy to know that Kali Linux is rated "E for Everyone"!</li>
</ul>
<div style="margin-left: 1em; margin-right: 1em; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgb4C1ZWOup5VV9atLvr7-XC6RA4tj8pv92o9Z6dnWwEPNk45gKcHcaNa-lHZuAsWNUMvk62l4xiK9U3zCTS26OuY6YOC90p60GGrJ-6htWNrcmfqra764Pg-nkbLz7FQPDmM2bIJZbv4M/s1600/kali.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="595" data-original-width="1600" height="147" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgb4C1ZWOup5VV9atLvr7-XC6RA4tj8pv92o9Z6dnWwEPNk45gKcHcaNa-lHZuAsWNUMvk62l4xiK9U3zCTS26OuY6YOC90p60GGrJ-6htWNrcmfqra764Pg-nkbLz7FQPDmM2bIJZbv4M/s400/kali.png" width="400" /></a></div>
<br />
<div class="" style="clear: both; text-align: left;">
</div>
<ul>
<li>Important Note: All the Linux distros I looked at in the Windows Store, (including Kali), are barebones and do not include graphical desktops, or many tools or installed libraries. It's not like installing a Kali live boot image.</li>
<li>Because Kali doesn't come with any tools preconfigured, I opted to go with a base Ubuntu build. That's also partially because Kali and Hashcat in the past haven't been an ideal match, so I tend to stay away from it on my desktop builds</li>
</ul>
<br />
<div class="separator" style="clear: both; text-align: left;">
</div>
<div style="text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMd_ABFRfukFJEhYmgdHflbe3gCX0b-pxG-uIVaGhZ900S98yIL4qn18wGsfdbY4QsdBzr7WwwRe3iXKt4uEtiKp2-JouYXHUxjRxNPlh_PmVwxyr1KhuZS2PmONw_nBFpBsDPBO7X3kA/s1600/ubuntu.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="665" data-original-width="1600" height="166" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMd_ABFRfukFJEhYmgdHflbe3gCX0b-pxG-uIVaGhZ900S98yIL4qn18wGsfdbY4QsdBzr7WwwRe3iXKt4uEtiKp2-JouYXHUxjRxNPlh_PmVwxyr1KhuZS2PmONw_nBFpBsDPBO7X3kA/s400/ubuntu.png" width="400" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="text-align: left;">
<ul>
<li>Once you install Ubuntu, you'll still need to initialize it. To do this open PowerShell again, though this time you can run it as a standard user. For Ubuntu, simply type 'ubuntu'</li>
</ul>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzwYnnO40niJJWltZtuHWzHkUZ9zEB2IPa4wXYS2qlS1fSQAW7qRviNRJuafqzKaAKwFgpetLXiybRIo6bilfzLizgAW8Fgnl1OYCnTBMW08SlzYK8EGuXPkcZ962cWNAgNHYCpq7Mtc8/s1600/initial_install.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: left;"><img border="0" data-original-height="375" data-original-width="1600" height="92" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhzwYnnO40niJJWltZtuHWzHkUZ9zEB2IPa4wXYS2qlS1fSQAW7qRviNRJuafqzKaAKwFgpetLXiybRIo6bilfzLizgAW8Fgnl1OYCnTBMW08SlzYK8EGuXPkcZ962cWNAgNHYCpq7Mtc8/s400/initial_install.png" width="400" /></a></div>
<div style="text-align: left;">
<br /></div>
</div>
<div>
<ul>
<li>You'll be prompted to create a user account. Go ahead and do so.</li>
<li>Congratulations, you are now running Linux on Windows!</li>
</ul>
</div>
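<div>
If you ever lose track of which WSL flavor a given shell is running under, the kernel release string is a quick hint. The following is a minimal sketch rather than an official check: the function name wsl_flavor is made up for this example, and it assumes that WSLv1 kernels report a release containing "Microsoft" while WSLv2 kernels report one containing "microsoft-standard".</div>

```shell
# Heuristic (assumption): classify a kernel release string as WSLv1, WSLv2,
# or plain Linux based on the "Microsoft" marker in the name.
wsl_flavor() {
    case "$1" in
        *microsoft-standard*) echo "WSL2" ;;
        *Microsoft*)          echo "WSL1" ;;
        *)                    echo "not WSL" ;;
    esac
}

# Classify the kernel this shell is actually running on.
wsl_flavor "$(uname -r)"
```

<div>
Run it inside the Ubuntu prompt. The check is case-sensitive on purpose, since that is the only thing separating the two patterns.</div>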
<h3>
Installing John the Ripper</h3>
<div>
<ul>
<li>This guide was written using the bleeding-jumbo version of John the Ripper, which is available here: <a href="https://github.com/magnumripper/JohnTheRipper">https://github.com/magnumripper/JohnTheRipper</a></li>
<li>How to install and use Git on Windows is beyond the scope of this guide, (I personally like <a href="https://www.gitkraken.com/" target="_blank">GitKraken</a>). While you can download the source-code as a zip file, I highly recommend downloading it using git to make keeping it up to date much easier. With WSLv1, it's recommended that you install the code somewhere besides your new Linux filesystem. I put it in c:\github\JohnTheRipper\. With WSLv2 that changes, but I'll cross that bridge when I try that out. You could also probably install git into Ubuntu and download it that way, but I didn't try that.</li>
<li>The next step is to install all the required libraries in WSLv1 Ubuntu. Run all the following commands in the PowerShell window above after starting Ubuntu. If you ever close your window, you can restart PowerShell and type "ubuntu" to restart Ubuntu.</li>
<li>Update your package libraries. If you don't do this, the following installs will not work, (as seen in all the errors above the command in the below screenshot)</li>
<ul>
<li>sudo apt update</li>
</ul>
</ul>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkciR2XhVTavdR0jfdMnVThVZxw5pBK0pcSN7grim1hWOL3zK0XBDR8DXAiCRIH8pi_kqrSeEn0wDhUpfKiwW7T9W-rwZyNdDp3886uGj-6g1ePIgL4q6YNZKMQLOVUkvtnjqHZUQIvY4/s1600/apt_update.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="732" data-original-width="1559" height="150" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkciR2XhVTavdR0jfdMnVThVZxw5pBK0pcSN7grim1hWOL3zK0XBDR8DXAiCRIH8pi_kqrSeEn0wDhUpfKiwW7T9W-rwZyNdDp3886uGj-6g1ePIgL4q6YNZKMQLOVUkvtnjqHZUQIvY4/s320/apt_update.png" width="320" /></a></div>
<div>
<br /></div>
<div>
<ul>
<li>Install GCC. Select 'Y'es when prompted. The install will take a while.</li>
<ul>
<li>sudo apt install gcc</li>
</ul>
</ul>
</div>
<div>
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3n8S84GoZ3lw7eaXAxEEswjLuWEZNt9Z3OSFoI7tk9hfuKAg0YfgP8PH03AJJ9rZ6vjIb6znFlZlK3kuXBhlTxRt5b3e8sFbkLRv5INX3MAK-MWOD5CviS5S0HG3i_luIYybtfc2NZy0/s1600/gcc_install.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1148" data-original-width="1555" height="236" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3n8S84GoZ3lw7eaXAxEEswjLuWEZNt9Z3OSFoI7tk9hfuKAg0YfgP8PH03AJJ9rZ6vjIb6znFlZlK3kuXBhlTxRt5b3e8sFbkLRv5INX3MAK-MWOD5CviS5S0HG3i_luIYybtfc2NZy0/s320/gcc_install.png" width="320" /></a></div>
<div>
<br /></div>
<ul>
<li>Install Make</li>
<ul>
<li>sudo apt install make</li>
</ul>
<li>Install various libraries required/recommended for JtR Bleeding-Jumbo</li>
<ul>
<li>sudo apt install libssl-dev</li>
<li>sudo apt install libgmp-dev</li>
<li>sudo apt install libkrb5-dev</li>
</ul>
<li>Navigate to the location on your Windows drive where you put the John the Ripper source-code, and change into its src directory. You can access your C:\ drive under the /mnt/c directory. Run the following command to build JtR:</li>
<ul>
<li>./configure && make</li>
</ul>
</ul>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQnnU5P9UfB_u1N2110TpPn1xD9NYdPH7mTyxh0dLBvS1DQmOZ9vzc_hoB0nLMt_EZwk7UdWcFaQyX9h6Kzv2zfza9Zs9uWFSWpoapJoHRhb71qL1pZY9VNLmfS9bvZlAVN2jmE3W88d4/s1600/building_jtr.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="205" data-original-width="1173" height="68" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjQnnU5P9UfB_u1N2110TpPn1xD9NYdPH7mTyxh0dLBvS1DQmOZ9vzc_hoB0nLMt_EZwk7UdWcFaQyX9h6Kzv2zfza9Zs9uWFSWpoapJoHRhb71qL1pZY9VNLmfS9bvZlAVN2jmE3W88d4/s400/building_jtr.png" width="400" /></a></div>
<div style="text-align: left;">
<br /></div>
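<div>
The mapping from a Windows path to its /mnt/ location is mechanical: lowercase the drive letter, drop the colon, and flip the backslashes. Here is a small sketch of that rule. The function name win_to_wsl_path is hypothetical and only exists for this example, and it assumes a simple absolute path that starts with a drive letter.</div>

```shell
# Convert a Windows path such as C:\github\JohnTheRipper into the
# /mnt/<drive>/... form used for Windows drives mounted inside WSL.
# Assumes the input starts with a drive letter followed by a colon.
win_to_wsl_path() {
    # First character is the drive letter; WSL mounts it in lowercase.
    drive=$(printf '%s' "$1" | cut -c1 | tr '[:upper:]' '[:lower:]')
    # Everything after "X:" is the path; swap backslashes for slashes.
    rest=$(printf '%s' "$1" | cut -c3- | tr '\\' '/')
    printf '/mnt/%s%s\n' "$drive" "$rest"
}

# The source location used in this guide, plus the src build directory.
win_to_wsl_path 'C:\github\JohnTheRipper\src'
# prints /mnt/c/github/JohnTheRipper/src
```

<div>
You can then cd into the printed directory before running the build. WSL also ships a wslpath utility that does this conversion for you, so treat the function above purely as an illustration of the rule.</div>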
<div style="text-align: left;">
<ul>
<li>The build process will likely take around 10-15 minutes. After it is done you should see the following. If there are any errors, something went wrong so you will likely need to perform additional troubleshooting.</li>
</ul>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHIb1_6Vb94c8XIJLbPcRY_ByOnNtTU4WCZOU0U5eZR6xmhwQVUIELLfEr-arQtVOA6_V9Cwa4chIlTml4pCmnZlEoWLmtn3RQhxFzPzr0ggsvhuGv3Pr9yUgAK_zaB3WHBsFEqbvNQ-Q/s1600/build_sucessful.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="774" data-original-width="1600" height="154" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiHIb1_6Vb94c8XIJLbPcRY_ByOnNtTU4WCZOU0U5eZR6xmhwQVUIELLfEr-arQtVOA6_V9Cwa4chIlTml4pCmnZlEoWLmtn3RQhxFzPzr0ggsvhuGv3Pr9yUgAK_zaB3WHBsFEqbvNQ-Q/s320/build_sucessful.png" width="320" /></a></div>
<div style="text-align: left;">
<br /></div>
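<div>
If the build does fail, a sensible first troubleshooting step is to confirm that the build tools installed earlier are actually on your PATH. This is only a sketch of that check. The helper name missing_tools is invented for this example, and it only looks for executables, so it will not notice a missing library such as libssl-dev.</div>

```shell
# Print every command name from the argument list that is NOT on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The compiler and build tool installed earlier in this guide.
missing=$(missing_tools gcc make)
if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "gcc and make are both installed"
fi
```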
<div style="text-align: left;">
<ul>
<li>Finally navigate to the run directory '../run/' and try to start John the Ripper:</li>
<ul>
<li>./john</li>
</ul>
</ul>
</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZVR8wyeCtOkdFk15F10Pll_fF0G6Flgulztzte2T88RsuIB37l0jpMt_QqZdVztAhp570Hg87dZhJ2EmQTJDV-iVSgqOq8v4uXgII79Z5mcG3RReGgjto0utxqq3VwA3NNlRDgE8YTCs/s1600/start_cracking.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" data-original-height="1503" data-original-width="1600" height="600" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZVR8wyeCtOkdFk15F10Pll_fF0G6Flgulztzte2T88RsuIB37l0jpMt_QqZdVztAhp570Hg87dZhJ2EmQTJDV-iVSgqOq8v4uXgII79Z5mcG3RReGgjto0utxqq3VwA3NNlRDgE8YTCs/s640/start_cracking.png" width="640" /></a></div>
<div style="text-align: left;">
<br /></div>
<ul>
<li>Congratulations! You are now running John the Ripper on Windows!</li>
</ul>
<h3>
Performance:</h3>
<div>
If you are curious, here is a short snippet of me benchmarking JtR on my PC. Note, formats without OpenMP support (such as Raw-MD5 below) are only running on a single core. I should have also included the --fork=8, which I'll admit I didn't realize worked with the --test option before writing this guide.</div>
<div>
<br /></div>
<div>
<b>Laptop Specs: </b></div>
<div>
<ul>
<li>Microsoft Surface Book 13 Inch,</li>
<li>Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz</li>
<li>16.0 GB Ram</li>
</ul>
<div>
<b>Test command: </b>./john --test</div>
</div>
<div>
<br /></div>
<div>
<div>
Benchmarking: bcrypt ("$2a$05", 32 iterations) [Blowfish 32/64 X3]... (8xOMP) DONE</div>
<div>
Speed for cost 1 (iteration count) of 32</div>
<div>
Raw: 6344 c/s real, 790 c/s virtual</div>
</div>
<div>
<br /></div>
<div>
<div>
Benchmarking: Raw-MD5 [MD5 256/256 AVX2 8x3]... DONE</div>
<div>
Raw: 61074K c/s real, 61074K c/s virtual</div>
</div>
<div>
<br /></div>
<div>
<div>
Benchmarking: scrypt (16384, 8, 1) [Salsa20/8 128/128 AVX]... (8xOMP) DONE</div>
<div>
Speed for cost 1 (N) of 16384, cost 2 (r) of 8, cost 3 (p) of 1</div>
<div>
Raw: 280 c/s real, 35.0 c/s virtual</div>
<div>
<br /></div>
<div>
Benchmarking: LM [DES 256/256 AVX2]... (8xOMP) DONE</div>
<div>
Raw: 121470K c/s real, 15241K c/s virtual</div>
</div>
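<div>
One way to read these numbers: "real" is candidates per second of wall-clock time, while "virtual" is per second of CPU time, so dividing real by virtual gives a rough count of how many threads were busy. The sketch below does that arithmetic on the figures above. The helper names to_cps and thread_ratio are made up for this example, and the handling of an "M" suffix is an assumption, since only the "K" suffix appears in my output.</div>

```shell
# Convert a JtR speed token such as "61074K" or "6344" into plain c/s.
# The "M" case is an assumption; only "K" shows up in the output above.
to_cps() {
    printf '%s\n' "$1" | awk '
        /K$/ { sub(/K$/, ""); printf "%.0f\n", $0 * 1000;    next }
        /M$/ { sub(/M$/, ""); printf "%.0f\n", $0 * 1000000; next }
             { print $0 + 0 }'
}

# Rough number of busy threads: real c/s divided by virtual c/s, rounded.
thread_ratio() {
    real=$(to_cps "$1")
    virt=$(to_cps "$2")
    awk -v r="$real" -v v="$virt" 'BEGIN { printf "%.0f\n", r / v }'
}

echo "bcrypt:  $(thread_ratio 6344 790)"
echo "Raw-MD5: $(thread_ratio 61074K 61074K)"
echo "scrypt:  $(thread_ratio 280 35.0)"
echo "LM:      $(thread_ratio 121470K 15241K)"
```

<div>
The three formats tagged (8xOMP) come out at 8 and Raw-MD5 comes out at 1, which matches the OpenMP markers in the benchmark lines.</div>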
<div>
<br /></div>
<div>
<br /></div>
<div>
<br /></div>
</div>
<div style="text-align: left;">
<br /></div>
</div>
Configuring a Password Cracking Computer<br />
Posted by Matt Weir on 2018-09-11 (updated 2018-09-16)<br />
<blockquote class="tr_bq">
<ul>
<li><b><i>“Be willing to be a beginner every single morning.” —Meister Eckhart</i></b></li>
</ul>
</blockquote>
<b>Disclaimer: </b>While the reason I'm writing this is because I was lucky enough to win a new cracking rig from <a href="https://twitter.com/netmux" target="_blank">Netmux's</a> <a href="https://www.netmux.com/blog/hash-crack-challenge" target="_blank">Hash Crack Challenge</a>, I want to state for the record that he never asked me to blog about it, and all of the good things I say are 100% of my own choosing and not contingent on me receiving any prize.<br />
<br />
<b>2nd Disclaimer: </b>I plan on this being a "living" blog entry as I continue to update and use my new computer. Since install procedures change over time, for the record I started to perform my install on September 7th 2018. I'll try to date my entries as I write them to help anyone trying to follow this so they can estimate how useful these instructions are.<br />
<br />
<b>ChangeLog:</b><br />
<ul>
<li>September 12, 2018, (rearranged sections, added MDXFind, updated installing OpenCL instructions)</li>
</ul>
<b><u><br /></u></b>
<b><u>September 7, 2018 (Computer Arrives):</u></b><br />
<br />
Wow, I suddenly and unexpectedly found myself in possession of a dedicated password cracking machine! For more background on how that happened, please refer to my post on Netmux's Hash Cracking Challenge <a href="https://reusablesec.blogspot.com/2018/09/netmuxs-hash-crack-challenge-writeup.html" target="_blank">here</a>. For the record, Netmux was amazing when it came to promptly shipping my portable cracking rig and keeping me in the loop. I'll admit I was a bit hesitant to hand out my home address to a professional pen-tester and password cracker I met on the internet, but I've made <a href="https://hikinghiker.com/2014/07/13/day-145-sketchy-trail-magic-zoos-and-pool-parties-oh-and-some-hiking-was-also-done/" target="_blank">a lot worse threat modeling decisions in the past</a>, (There is a story behind the first picture that gets everyone who knows and cares about me legit angry for the stupid trust I've put in absolute strangers before). Long story short, Netmux was professional in shipping the server, kept me in the loop, and when it showed up I was super excited! As some background, while I study password cracking, <a href="https://github.com/lakiw/pcfg_cracker" target="_blank">develop</a> and <a href="https://www.openwall.com/lists/john-users/2015/01/06/2" target="_blank">analyze</a> password cracking tools, and <a href="https://www.openwall.com/lists/john-users/2015/01/06/2" target="_blank">participate in password cracking challenges</a>, I've never been willing to personally invest in a dedicated password cracking rig. Mostly I've made do with a 2010 MacBook Pro, and a Windows machine with a GTX970 that I'll freely admit spends more time running Excel and playing World of Warcraft than cracking hashes. Which is another way of saying please take all my advice with a grain of salt, and the understanding that I'm planning on using this new server for research. I'm not optimizing it as a pure password cracking rig. 
But also this is a way of saying that I no longer have any excuses in how much I contribute in password cracking challenges in the future! This gift has inspired me to start a few new research projects so I want to give yet another huge thanks to Netmux!!! If you see me post additional blog content in the next few months or update my PCFG cracker, please give credit to him!<br />
<br />
<b>A Quick Aside on my New Password Cracking Rig:</b><br />
<br />
Let me first say that it arrived in perfect shape so of course the first thing I did was crack it open and look at the inside...<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL95h75jBpxIEPQlLWLGfvtJn5S3Pkjbxs4KIQjE5iToIWdJWiIlmZwYdaQUA7FTt91W0pKjVkkLmJTI0gzwDMXfA0Wwv4Pfisy2OcLOUs7TJjfRm8m9fyPml_6kSONBIm3v7b06uWJTY/s1600/CFE74004-69A8-476F-8075-E954D45087B7.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1200" data-original-width="1600" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjL95h75jBpxIEPQlLWLGfvtJn5S3Pkjbxs4KIQjE5iToIWdJWiIlmZwYdaQUA7FTt91W0pKjVkkLmJTI0gzwDMXfA0Wwv4Pfisy2OcLOUs7TJjfRm8m9fyPml_6kSONBIm3v7b06uWJTY/s320/CFE74004-69A8-476F-8075-E954D45087B7.jpeg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">My new rig from Netmux</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3BclB4N8biKzctN7ZE0D8Idg4bL2tP90bPY1pYRv3QB5l2vaDbmJR0ED7IN8jpFegbA5fqtjCFzy7ecXfZzzleXI_ZgWCSmXV18jEL6VwEavDp4uUiYp_faHQ3cv5I-sx2OFUu9mwFg4/s1600/A4D4CADA-D027-4CD4-AF8E-5C71BD970AF3.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="874" data-original-width="1164" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3BclB4N8biKzctN7ZE0D8Idg4bL2tP90bPY1pYRv3QB5l2vaDbmJR0ED7IN8jpFegbA5fqtjCFzy7ecXfZzzleXI_ZgWCSmXV18jEL6VwEavDp4uUiYp_faHQ3cv5I-sx2OFUu9mwFg4/s320/A4D4CADA-D027-4CD4-AF8E-5C71BD970AF3.jpeg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Super excited!!!</td></tr>
</tbody></table>
The wiring was very well done, the whole rig is water cooled, the case certainly adds hacker creds, and little things were taken care of such as having good filters over the air vents which is pretty much a make or break requirement for this cat owner. I'm *very* happy with it, and would recommend it to someone else.<br />
<br />
As far as the specs go:<br />
CPU: Intel i5-7600k, 1 processor; 4 cores<br />
RAM: 16GB<br />
Storage: 500GB SSD<br />
GPU: GeForce GTX 1070<br />
<h4>
<b><u>Installing the OS:</u></b></h4>
Netmux's cracking rig came pre-installed with Ubuntu, but I figured I might as well re-install everything from scratch. After consulting with several password cracking experts I'm lucky to know, my end decision was to re-install Ubuntu. The version I used was 18.04.1 LTS. I plan on using this server for research as well so I went with a full graphical desktop. If you are hardcore and want 100% of your machine devoted to cracking then by all means go with a server deployment, but this guide probably won't help you too much since I *love* GUIs. Spoiler alert, I recommend installing a GUI git client like GitKraken, so that's where this guide is taking you.<br />
<b><br /></b>
<b>Building the Boot USB (</b><b>September 7, 2018)</b><b>:</b><br />
Like anyone has a DVD anymore... The very first step I took was to create a bootable USB.<br />
<br />
Steps:<br />
<ol>
<li>You can download an Ubuntu ISO from <a href="https://www.ubuntu.com/#download-content" target="_blank">here</a></li>
<li>Since I already was running Ubuntu, I could use <em class="style-scope codelabs-page" style="background-color: white; font-family: Ubuntu, sans-serif; font-size: 14px;">Startup Disk Creator </em><span class="style-scope codelabs-page" style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"> to create a bootable USB drive. You can perform a search, (use the Windows key), for that application if you are running Ubuntu already.</span></li>
<li><span class="style-scope codelabs-page" style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;">Follow the options to create a bootable USB using the ISO that you previously downloaded</span></li>
</ol>
<span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>Installing Ubuntu from USB (</b></span><b>September 7, 2018)</b><b style="font-family: ubuntu, sans-serif; font-size: 14px;">:</b><br />
<ol>
<li>Use multiple swear words and reboot several times until you find the BIOS option to change your boot preference to start with your USB drive. In my case it was hitting F2.</li>
<li><span class="style-scope codelabs-page" style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;">Once you boot from the USB, follow the steps in the Ubuntu installer and configure it how you want.</span></li>
<li><span class="style-scope codelabs-page" style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;">If you are going to configure full hard drive encryption, (this will be a real portable rig that will potentially be unattended in your car when you make a restroom stop, or you are worried about legal issues), this is the time to configure full hard drive encryption. Just saying.</span></li>
</ol>
<div>
<h3>
<span style="font-family: "ubuntu" , sans-serif;"><span style="font-size: 14px;"><b><u>Core OS Drivers and Important Tools for Other Capabilities:</u></b></span></span></h3>
<span style="font-family: "ubuntu" , sans-serif;"><span style="font-size: 14px;"><b>Installing OpenCL drivers </b></span></span><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(Originally installed </b></span><b>September 7, 2018, updated September 12)</b><b style="font-family: ubuntu, sans-serif; font-size: 14px;">:</b></div>
<div>
<span style="font-family: "ubuntu" , sans-serif;"><span style="font-size: 14px;">Special thanks to <a href="https://twitter.com/winxp5421" target="_blank">WinXP5421</a>. The following section was written by him, though I tested it on my system and made minor edits based on my experiences and formatting it for this blog</span></span><br />
<ol>
<li>Download the appropriate OpenCL drivers for your system. We are specifically looking for “Intel® Xeon™ Processors OR Intel® Core™ Processors OpenCL runtime” drivers. </li>
<ul>
<li>Drivers at the current time of writing this are located here: https://software.intel.com/en-us/articles/opencl-drivers</li>
<li>The current version at the time of writing this can be found: http://registrationcenter-download.intel.com/akdlm/irc_nas/12556/opencl_runtime_16.1.2_x64_rh_6.4.0.37.tgz </li>
<li>Run: wget <a href="http://registrationcenter-download.intel.com/akdlm/irc_nas/12556/opencl_runtime_16.1.2_x64_rh_6.4.0.37.tgz">http://registrationcenter-download.intel.com/akdlm/irc_nas/12556/opencl_runtime_16.1.2_x64_rh_6.4.0.37.tgz</a></li>
</ul>
<li>Extract the archive:</li>
<ul>
<li>tar -xvzf opencl_runtime*.tgz</li>
</ul>
<li>The OpenCL runtime requires `lsb-core` to be installed on the Ubuntu machine:</li>
<ul>
<li>sudo apt install lsb-core</li>
</ul>
<li>Now install the drivers:</li>
<ul>
<li>Go to the intel directory that you extracted in step #2</li>
<li>sudo ./install.sh</li>
<li>Work your way through the installer, answering questions as needed. The install script will complain that your Ubuntu operating system is not supported. This is fine; continue with the installation anyway.</li>
</ul>
<li>Let’s verify we have a working OpenCL environment by installing and running `clinfo`</li>
<ul>
<li>Note: clinfo was already installed on my machine, but one of the other tools I installed later may have installed it -- Matt</li>
<li>sudo apt install clinfo</li>
<li>clinfo</li>
<li>The output of clinfo should display detailed information about each CPU core you have on your system. Simply put: “Lots of output = all good”. If OpenCL did not install properly, you will see short and specific errors after running clinfo. </li>
</ul>
</ol>
</div>
<div>
<ol><ul>
</ul>
</ol>
<span style="color: #222222; font-family: "arial" , sans-serif; font-size: 14px;"><b style="font-family: ubuntu, sans-serif;">Installing NVidia Drivers</b><b> </b></span><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(</b></span><b>September 7, 2018)</b><b style="color: #222222; font-family: arial, sans-serif; font-size: 14px;">:</b><br />
<ol>
<li><span style="color: #222222; font-family: "arial" , sans-serif;"><span style="font-size: 14px;">Run: ubuntu-drivers devices</span></span></li>
<li>Select the driver from the list you want to install. In my case it was: </li>
<ol>
<li>sudo apt-get install nvidia-driver-396</li>
</ol>
</ol>
<div>
<b>Install basic GIT </b><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(</b></span><b>September 7, 2018)</b><b>:</b></div>
<div>
I usually only use a command line git when something goes horribly wrong, but having it ready helps a lot when that happens.</div>
<div>
<ol>
<li>sudo apt-get install git</li>
</ol>
<div>
<b>Install a GUI GIT Client </b><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(</b></span><b>September 7, 2018)</b><b>:</b></div>
</div>
<div>
I've used a lot of git GUIs in the past. The following is purely personal preference, but I would highly recommend using a graphical git GUI if you are doing any development. Having the ability to easily view changes, manage merge requests, fork, etc, I've found to be invaluable in all my work.</div>
<div>
<br /></div>
<div>
My favorite git GUI of all time has been the official github client from several years ago. Unfortunately, they have since rebuilt everything around a web layout, which completely broke my workflow. I've tried to use Atlassian's SourceTree, but after a few horribly failed merges was told to never use it again by several co-workers. I currently use GitKraken, and am very happy with it. GitKraken is not free for commercial use. I've been told to use SmartGit by several people but don't have experience with it. If you are using this tutorial for commercial use and don't have funding to pay for GitKraken, please check it out. Otherwise, I've found GitKraken to be great for non-profit and personal use.</div>
<div>
<ol>
<li>Install GitKraken from https://www.gitkraken.com/</li>
<li>Run the following command or GitKraken will never actually start: sudo apt install libgnome-keyring0</li>
<li>Once GitKraken is installed, log in to your github account using it</li>
<li>Now add your computer's SSH key to your github account using: File->Preferences->Authentication->Github.com->Add_SSH_Public_Key</li>
</ol>
<div>
<b><u>Installing Password Cracking Programs:</u></b><br />
<b><br /></b>
<b>Install Hashcat </b><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(</b></span><b>September 7, 2018)</b><b>:</b></div>
</div>
<div>
Yes there are pre-built binaries for Hashcat, but I highly recommend using the github based source code to stay up to date with all the latest changes, fixes, and features.</div>
<div>
<ol>
<li>Install Hashcat using your git tool of choice. If you are using GitKraken, import the following repo: git@github.com:hashcat/hashcat.git</li>
<li>Full instructions for installing Hashcat can be found at: https://github.com/hashcat/hashcat/blob/master/BUILD.md</li>
<li>You'll need to update the OpenCL Headers submodule. In GitKraken, after importing Hashcat using the above link, find the "Submodules" section in the left-hand panel, right-click deps/OpenCL-Headers, and select "Create" or "Update". If you are not using GitKraken, follow the instructions listed in step #2.</li>
<li>In a terminal, run "make", and then "make install"</li>
<li>By building from source, you can periodically pull from the Hashcat repository and re-build it to add new features before an "official" release is published</li>
</ol>
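<div>
For anyone who would rather stay in a terminal, the steps above can be sketched as plain git and make commands. Treat this as an outline under a few assumptions and not as the official procedure: the run and build_hashcat helpers are invented for this example, the HTTPS clone URL is used in place of the SSH one, the sudo in front of "make install" assumes you are installing system-wide, and newer checkouts of Hashcat may not need the submodule step at all. By default the script only prints what it would do; set DRY_RUN=0 to actually execute the commands.</div>

```shell
# Print each command, and only execute it when DRY_RUN=0.
run() {
    printf '+ %s\n' "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# Clone Hashcat, fetch the OpenCL headers submodule, build, and install.
build_hashcat() {
    dest="${1:-hashcat}"
    run git clone https://github.com/hashcat/hashcat.git "$dest" || return 1
    run git -C "$dest" submodule update --init || return 1
    run make -C "$dest" || return 1
    run sudo make -C "$dest" install || return 1
}

# Dry run by default: shows the four commands without running them.
build_hashcat "$HOME/src/hashcat"
```

<div>
Printing first and executing second is a deliberate choice here, since it lets you check the target directory before anything touches your disk.</div>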
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB9B3WR7sYwSlGu1cY7Abu2rJVBXZO4PYyveBlaaliqWiSRwBtO4-ykkDRgretgURr8WsVXmTz-LMI94R5f2p2ejv0BCeaiji9YvIrEijRqJ280ELdOc23TduE9S9vuqZapDHsmbBWcA4/s1600/9CFD7BE3-5233-43F2-8EEE-83B08FEBD131.jpeg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1332" data-original-width="1600" height="266" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhB9B3WR7sYwSlGu1cY7Abu2rJVBXZO4PYyveBlaaliqWiSRwBtO4-ykkDRgretgURr8WsVXmTz-LMI94R5f2p2ejv0BCeaiji9YvIrEijRqJ280ELdOc23TduE9S9vuqZapDHsmbBWcA4/s320/9CFD7BE3-5233-43F2-8EEE-83B08FEBD131.jpeg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Benchmarking Hashcat With New Install, (and gratuitous plug for NetMux's Hashcracking Manual which is awesome)</td></tr>
</tbody></table>
<b>Install John the Ripper </b><span style="background-color: white; font-family: "ubuntu" , sans-serif; font-size: 14px;"><b>(</b></span><b>September 7, 2018)</b><b>:</b><br />
<div>
John the Ripper is my favorite password cracking program. If you are doing any sort of academic research or tool development, I can't suggest it enough. I'll admit though that if I'm only concerned with cracking standard hashes I generally use Hashcat instead. Regardless, I'd recommend installing John the Ripper on any password cracking rig you configure. Furthermore, you really need to install the magnumripper bleeding-jumbo version of John the Ripper since the base version hasn't been updated in years. New patches, fixes, and features are normally pushed weekly, so building it from source, and constantly re-building it is highly recommended.</div>
<div>
<ol>
<li>Clone the following repository of John the Ripper: https://github.com/magnumripper/JohnTheRipper</li>
<li>Install SSL libraries: sudo apt-get install libssl-dev</li>
<li>cd ./JohnTheRipper/src/</li>
<li>./configure</li>
<li>Note: The following does not have OpenCL support. I'll try to circle back to this later to figure out how to add it.</li>
<li>make -s clean && make -sj4</li>
<li>cd ../run/</li>
<li>./john --test</li>
</ol>
</div>
<div>
<div>
<b>Install MDXFind (September 12th 2018):</b></div>
<div>
I've been told I really need to start using MDXFind so since I'm starting a new cracking platform this is certainly the right time to install it. </div>
<div>
<br /></div>
<div>
A quick aside, most people might question why I need three different password cracking programs on the same computer. I'm sure it's a lot like how chefs view their kitchen knife collection. Yes they all cut, but the right one depends on what you are trying to do.</div>
<div>
<br /></div>
<div>
While certainly not set in stone, as a general rule of thumb I use John the Ripper for research, CPU cracking sessions, cracking file encryption "hashes", and a few other hash types that don't translate well to GPU like SCrypt/BCrypt. It also has the best support for <a href="https://hashcat.net/forum/thread-7763.html" target="_blank">non-English data-sets</a>.</div>
<div>
<br /></div>
<div>
I use Hashcat for most GPU cracking that I do. Yes, John the Ripper GPU support has been getting more robust, but I've had better luck with Hashcat. For example, if I'm cracking large lists of unsalted MD5, Hashcat is my go-to cracking program.</div>
<div>
<br /></div>
<div>
MDXFind seems tailored to cracking large "messy" data-sets. Think of a lot of the major password dumps that become public. It's fast and can handle data-sets going into the millions of password hashes. It also has support for cracking nested hashes which have a way of ending up in some of these dumps. Oh, and it seems to be the password cracking tool of choice for <a href="https://blog.cynosureprime.com/2018/08/crack-me-if-you-can-2018-write-up.html" target="_blank">CynoSurePrime</a> and they know a few things...</div>
<div>
<ol>
<li>Obtain the latest copy of MDXFind from <span id="docs-internal-guid-25b039ec-7fff-151e-ae21-a7aa82bc2b62"><span style="color: #1155cc; font-family: "arial"; font-size: 11pt; vertical-align: baseline; white-space: pre-wrap;"><a href="https://hashes.org/mdxfind.php" style="text-decoration-line: none;">https://hashes.org/mdxfind.php</a></span></span></li>
<ul>
<li>MDXFind is only provided as a pre-compiled binary so you don't need to build it. Grab the 64bit Linux variant.</li>
<li>Download and copy the file to the directory you want to install MDXFind into</li>
</ul>
<li>Make MDXFind executable</li>
<ul>
<li>chmod +x mdxfind</li>
</ul>
<li>Install required dependencies</li>
<ul>
<li>sudo apt install libjudydebian1 libmhash2 librhash0</li>
</ul>
<li>Test MDXFind</li>
<ul>
<li>./mdxfind </li>
</ul>
</ol>
</div>
<b><u>Other Quality of Life Installations:</u></b></div>
<div>
<b><br /></b>
<b>Install Text Editor:</b></div>
<div>
<ol>
<li>I like Kate. To install it: sudo apt-get install kate</li>
<li>You might also want to install Atom which has more features. I'm hesitant to recommend it with Microsoft buying GitHub, but it is free and has a ton of features: https://atom.io/</li>
</ol>
<div>
<b>Change Login Background </b><b>(September 7th 2018)</b><b>:</b></div>
</div>
<div>
Not really important, but I always do this because it helps my gumption level:</div>
<div>
<ol>
<li>Find a picture you want to see when typing your login password.</li>
<li>sudo cp Pictures/FILENAME_OF_PICTURE_YOU_WANT_TO_USE /usr/share/backgrounds/login.jpg</li>
<li>sudo vim /etc/alternatives/gdm3.css</li>
<li>Find: #lockDialogGroup { background: #2c001e url(resource:///org/gnome/shell/theme/noise-texture.png); background-repeat: repeat; }</li>
<li>Replace it with <pre style="background-color: whitesmoke; border-radius: 4px; border: 1px solid rgba(0, 0, 0, 0.15); color: #333333; font-family: Monaco, Menlo, Consolas, "Courier New", monospace; font-size: 13px; line-height: 20px; margin-bottom: 20px; overflow-wrap: break-word; padding: 9.5px; white-space: pre-wrap; word-break: break-all;">#lockDialogGroup { background: #2c001e url(<span style="color: red; overflow-wrap: break-word;">file:///usr/share/backgrounds/login.jpg</span>);
background-repeat: <span style="color: red; overflow-wrap: break-word;">no-repeat</span>;
<span style="color: red; overflow-wrap: break-word;">background-size: cover;</span>
<span style="color: red; overflow-wrap: break-word;">background-position: center;</span> }</pre>
</li>
</ol>
</div>
</div>
</div>
<div>
<br /></div>
Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-57883873372876611142018-09-03T21:03:00.004-07:002018-09-03T21:22:38.567-07:00Netmux's Hash Crack Challenge Writeup<blockquote class="tr_bq">
<span style="background-color: white; color: #333333; font-family: "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;"><b><i>"Good luck is when opportunity meets preparation, while bad luck is when lack of preparation meets reality"</i> -</b></span><span style="background-color: white; color: #333333; font-family: "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;"><b>Eliyahu Goldratt</b></span></blockquote>
This last week I participated in <a href="https://www.netmux.com/blog/hash-crack-challenge" target="_blank">Netmux's Hash Crack Challenge</a>, and this happened:<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
HASH CRACK CHALLENGE Hash #2 has been cracked by <a href="https://twitter.com/lakiw?ref_src=twsrc%5Etfw">@lakiw</a>. Congrats to our winner and thanks to everyone that participated! It was a bumpy ride but a lot of fun to create and host. Thanks again to everyone and look for the final write-up in the coming days!</div>
— Netmux (@netmux) <a href="https://twitter.com/netmux/status/1035690674123927552?ref_src=twsrc%5Etfw">September 1, 2018</a></blockquote>
<br />
So I figured the least I could do was make a blog posting about it along with my analysis of Netmux's <a href="https://www.netmux.com/blog/one-time-grid" target="_blank">One Time Grids</a>, which the challenge was based on.<br />
<br />
<b>TLDR/Bottom Line(s) Up Front (BLUF): </b><br />
I was lucky enough to be checking Twitter right when Netmux posted his final hint, and that was the only reason I won. As to the security of One Time Grids, they share a lot of similarities with other password books, which can be either good or bad depending on your threat model. Compared to other physically written down password books, the One Time Grid approach pushes users to stronger passwords at the expense of usability. It is *very* secure against your typical online hacker, but shares the weakness of other password books in that it may be weak against people in physical proximity to you (such as ex-boyfriends, nosy parents, nosy children, etc). I didn't find any weaknesses that could be exploited by an online attacker. Long story short, I wouldn't recommend it due to the usability issues, but if you have fun with it, feel free to use it.<br />
<br />
<b>What is a One Time Grid and how does that apply to the contest?</b><br />
Netmux does a better job explaining it in <a href="https://www.netmux.com/blog/one-time-grid" target="_blank">his blog here</a>, but it basically is a password creation book that you can buy from Amazon, <a href="https://www.amazon.com/dp/1984926861" target="_blank">available here</a>, that provides a bunch of One Time Grids for creating and storing passwords. The contest was an attempt to crack two different raw-SHA1 password hashes generated using a One Time Grid. They were:<br />
<blockquote class="tr_bq">
<b><span style="font-family: "courier new" , "courier" , monospace;">Hash1: fe0c9f335b35c45e92d5e7d07c5933b6c4c0a522</span></b></blockquote>
<blockquote class="tr_bq">
<b><span style="font-family: "courier new" , "courier" , monospace;">Hash2: 120c249bc0f301ef3cba7a0fcbff463aaaded486</span></b></blockquote>
As to the One Time Grids themselves, they are either a 7x7 grid with each cell randomly filled with one of the following 84 characters:<br />
<blockquote class="tr_bq">
<blockquote class="tr_bq">
<b><span style="font-family: "courier new" , "courier" , monospace;">ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890-!@#$%^&*=?[](),.;{}:+</span></b></blockquote>
</blockquote>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><img alt="hash crack challenge one time grid" height="268" src="https://res.cloudinary.com/hrscywv4p/image/upload/c_limit,fl_lossy,h_9000,w_1200,f_auto,q_auto/v1/148712/hashcrackchallenge_grid_s84vpm.png" style="margin-left: auto; margin-right: auto;" width="320" /></td></tr>
<tr><td class="tr-caption" style="text-align: center;">One Time Grid used in the contest</td></tr>
</tbody></table>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Or a 3x26 grid filled with random words:</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><img alt="One Time Grid Word Grid" height="320" src="https://res.cloudinary.com/hrscywv4p/image/upload/c_limit,fl_lossy,h_9000,w_1200,f_auto,q_auto/v1/148712/Screen_Shot_2018-02-17_at_7.37.27_PM_lg0x06.png" style="margin-left: auto; margin-right: auto;" width="214" /></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Example word based One Time Grid. Not used in the contest</td></tr>
</tbody></table>
<span style="font-family: inherit;">The One Time Grid used in the challenges was composed of random letters, so this blog post will focus on that. When it comes to the security of a One Time Grid though, most of the statements I'll make will apply to both unless otherwise specified.</span><br />
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;">Netmux also suggests three different ways to turn a One Time Grid into a password: a "basic" random grid, a "pattern" random grid, and a "scatter" </span>random<span style="font-family: inherit;"> grid. Only pattern and scatter were used in the contest, so I'll focus on them, but a "basic" grid is simply a "pattern" with no bends, aka all walks go in a straight line. Below are examples he gave for pattern and scatter on his site. Note, these examples do not use the contest One Time Grid.</span><br />
<span style="font-family: inherit;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji3zqd9RnTvLmzYNL5rBnTrlB4vGa1sLbiOMTgZJgIYOiDziusT2PGpEIyChNvUYqnKD7M7EdYSl_BqKNramBVzquoOHcENWlbclHhZo5BFQoiO5eP-Kd5uwRHPRF_QbIxsdCxVPQTjDg/s1600/one_time_grid_pattern.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="735" data-original-width="871" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEji3zqd9RnTvLmzYNL5rBnTrlB4vGa1sLbiOMTgZJgIYOiDziusT2PGpEIyChNvUYqnKD7M7EdYSl_BqKNramBVzquoOHcENWlbclHhZo5BFQoiO5eP-Kd5uwRHPRF_QbIxsdCxVPQTjDg/s1600/one_time_grid_pattern.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Example "Pattern" Password Creation rules</td></tr>
</tbody></table>
<br />
<span style="font-family: inherit;"><br /></span>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit2xk0upIYuQVDk2QEqe-DI2VKCqUj4O_zXcefos4-RW38O7zh3E1M3kVH1Jx89151ufv_RcZhDn3Wd-IEhkP_Qw-8_KEE8gwsCWEwDNaDzV6djg3Wc9LYAdrJ18gAI67hhmszj1Y1HH0/s1600/one_time_grid_scatter.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="679" data-original-width="610" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEit2xk0upIYuQVDk2QEqe-DI2VKCqUj4O_zXcefos4-RW38O7zh3E1M3kVH1Jx89151ufv_RcZhDn3Wd-IEhkP_Qw-8_KEE8gwsCWEwDNaDzV6djg3Wc9LYAdrJ18gAI67hhmszj1Y1HH0/s1600/one_time_grid_scatter.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Scatter One Time Grid password creation, taken from Netmux's site</td></tr>
</tbody></table>
<span style="font-family: inherit;"><br /></span>
<span style="font-family: inherit;"><b>Contest Start:</b></span><br />
The first thing that should be apparent is that without the One Time Grid that a password was based on, no attack can be run that has a hope of being successful against passwords longer than 9 characters. Even 8 characters would require significant horsepower. 84^8 = 2.4 <span style="background-color: white; white-space: pre-wrap;">quadrillion keyspace which is quite big, even for GPUs. This assumes that the One Time Grids are generated using a true random number generator, <a href="http://www.bettermgmt.com/uploads/4/5/9/1/4591515/seinfeld-yada-yada_orig.jpg" target="_blank">yada yada yada</a>, but for the purposes of this contest, no effective attacks could be started. Which is ok, because it gave me time to prep some tools and do some research.</span><br />
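As a rough illustration of why a blind attack was hopeless, the sketch below computes the brute-force keyspace for a few password lengths. It assumes every character is drawn independently from the 84 character alphabet listed earlier; the function name is my own and not part of any tool.<br />

```python
ALPHABET_SIZE = 84  # characters a One Time Grid cell can hold, per the list above


def brute_force_keyspace(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of candidate strings of exactly `length` characters."""
    return alphabet_size ** length


if __name__ == "__main__":
    # Each extra character multiplies the work by 84.
    for length in range(6, 11):
        print(f"length {length:2d}: {brute_force_keyspace(length):,} candidates")
```

Each additional character multiplies the number of candidates by 84, which is why the jump from 8 to 9 or 10 characters puts a blind brute force out of reach.<br />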
<span style="background-color: white; white-space: pre-wrap;"><br /></span>
<span style="background-color: white;"><span style="white-space: pre-wrap;">Side note, I'll give Netmux credit that doing a "search inside" check of his Amazon One Time Grid book didn't accidentally share any of the real grids. Not that I've abused that feature in other contexts before...</span></span><br />
<span style="background-color: white;"><span style="white-space: pre-wrap;"><br /></span></span>
<span style="background-color: white;"><span style="white-space: pre-wrap;"><b>First Clue: </b></span></span><b>"Pattern" & "Scatter"</b><br />
<span style="font-family: inherit;">Sometime around this point Netmux released his first clue: </span>"Pattern" & "Scatter". This pretty clearly indicated that the above two methods were used to generate the password, so I started to develop some scripts to generate walks of One Time Grids in anticipation of when the actual grid would be released. I originally started out investigating if I could use a custom keyboard layout with <a href="https://github.com/hashcat/kwprocessor" target="_blank">Hashcat's kwprocessor</a>, which generates keyboard walks, but quickly realized I would have to significantly modify it to target One Time Grids. That's because kwprocessor was set up to crack 4 row keyboards vs 7x7 grids, along with some other optimizations it made for keyboard quirkiness which is great for normal cracking, but would cause problems with what I wanted it to do. So I wrote my own script, which I posted on github and is <a href="https://github.com/lakiw/random_contest_code" target="_blank">available here</a>. It admittedly went through several rounds of improvement throughout the contest, but here is a general overview of how it works, and the constraints I added to reduce the key-space:<br />
<br />
<ul>
<li>one_time_grid_walker.py only targets the "Pattern" random grids. "Scatter" random grids need a lot more information to effectively target them. I'll dig into that more later</li>
<li>The first constraint I added to it was that all "walks" had to start and end on the edge of a grid. This was based on my reading of Netmux's examples and how I expected a typical user to interpret his suggestions. Examples of "valid" and "invalid" walks can be seen below.</li>
</ul>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjHbD70rH93x4prXhCFWOW99YhvSRVGevwhJMK75N973heJz2dL-rv2N4bYAwFIns8-c0GHQMuZ8v1TAahpZnV7lD2zm9OqoGSHmupkx1zwTyUVmiw0d7PLoOgm4VL1HynPUL2FtPg2xk/s1600/valid_walk.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="596" data-original-width="711" height="268" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgjHbD70rH93x4prXhCFWOW99YhvSRVGevwhJMK75N973heJz2dL-rv2N4bYAwFIns8-c0GHQMuZ8v1TAahpZnV7lD2zm9OqoGSHmupkx1zwTyUVmiw0d7PLoOgm4VL1HynPUL2FtPg2xk/s320/valid_walk.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Valid walk of contest grid</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNzqqMzfciaz9pTcujCrsBMMldftkfqIkO80A8kIi1_A2O8h2c_p_rTfCsh9sp8MJsWAa4sB0ODhrpprj91LO5yzVoQgsF7MkCpDfjlgbVVTxPcKxrj62vdy_-gwp5K0v0waShVNhv9hw/s1600/invalid_walk.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="596" data-original-width="711" height="268" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNzqqMzfciaz9pTcujCrsBMMldftkfqIkO80A8kIi1_A2O8h2c_p_rTfCsh9sp8MJsWAa4sB0ODhrpprj91LO5yzVoQgsF7MkCpDfjlgbVVTxPcKxrj62vdy_-gwp5K0v0waShVNhv9hw/s320/invalid_walk.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Invalid walk of contest grid</td></tr>
</tbody></table>
<ul>
<li>The second constraint I added was that a walk could not double back on itself or cross a part of itself. In the above example, a walk could not go "8oyIyo8". This admittedly was a naive assumption on my part, but I made it once again to reduce the keyspace and based it on my reading of the examples given.</li>
<li>The third constraint, which I struggled with but felt I needed when coding up my script, was to limit the maximum length of a walk. As the maximum length increased, so did the keyspace, which would cause problems later when running a combinator/Prince attack. Len8= 4081, Len9= 7268, Len10=12011, Len11=19131. On its own this would be trivial, but it becomes significant when you start combining multiple walks together. For example, 19131^2 = 366 million and 19131^3 = 7 trillion. This admittedly was where I probably made my biggest mistake, prematurely optimizing this.</li>
<li>Skipping ahead a bit, I later optimized my approach further to limit the number of "bends" that a walk could make. If I only allowed one "bend" (or change in direction), there were only 575 possible walks for the current grid. This made combining many different walks practical. I felt that for a typical user following the advice given, this represented what I would expect to see them do.</li>
</ul>
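To make the constraints above concrete, here is a minimal sketch of a constrained grid walker. It is not the code from one_time_grid_walker.py. The function names, the tiny 3x3 demo grid, and the choice to only allow up/down/left/right moves are all assumptions made for illustration (if diagonal moves were allowed, a walk could cross itself without revisiting a cell, which this sketch does not handle).<br />

```python
from typing import Iterator, List, Optional, Tuple

Cell = Tuple[int, int]

# Assumption: walks only move up, down, left, or right (no diagonals).
MOVES: List[Cell] = [(-1, 0), (1, 0), (0, -1), (0, 1)]

# Hypothetical 3x3 grid for illustration; the contest grid was 7x7.
DEMO_GRID = ["abc", "def", "ghi"]


def on_edge(cell: Cell, n_rows: int, n_cols: int) -> bool:
    """True if the cell sits on the outer border of the grid."""
    row, col = cell
    return row in (0, n_rows - 1) or col in (0, n_cols - 1)


def pattern_walks(grid: List[str], min_len: int, max_len: int,
                  max_bends: int) -> Iterator[str]:
    """Yield walks that start and end on an edge, never revisit a cell,
    stay within [min_len, max_len], and change direction at most max_bends times."""
    n_rows, n_cols = len(grid), len(grid[0])

    def extend(path: List[Cell], last_move: Optional[Cell],
               bends: int) -> Iterator[str]:
        # Emit the walk so far if it is long enough and ends on an edge.
        if len(path) >= min_len and on_edge(path[-1], n_rows, n_cols):
            yield "".join(grid[r][c] for r, c in path)
        if len(path) >= max_len:
            return
        row, col = path[-1]
        for move in MOVES:
            nxt = (row + move[0], col + move[1])
            if not (0 <= nxt[0] < n_rows and 0 <= nxt[1] < n_cols):
                continue  # fell off the grid
            if nxt in path:
                continue  # no doubling back or reusing a cell
            new_bends = bends + (1 if last_move is not None and move != last_move else 0)
            if new_bends > max_bends:
                continue  # too many changes of direction
            yield from extend(path + [nxt], move, new_bends)

    for row in range(n_rows):
        for col in range(n_cols):
            if on_edge((row, col), n_rows, n_cols):
                yield from extend([(row, col)], None, 0)


if __name__ == "__main__":
    for walk in pattern_walks(DEMO_GRID, min_len=3, max_len=3, max_bends=0):
        print(walk)
```

Raising max_bends or max_len in a sketch like this shows how quickly the number of walks grows, which is the trade-off described in the third and fourth constraints.<br />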
As far as weaponizing this goes, I was tempted to use the <a href="https://reusablesec.blogspot.com/2014/12/tool-deep-dive-prince.html" target="_blank">Prince attack</a>, but when talking with <a href="https://twitter.com/Chick3nman512" target="_blank">Chick3nman</a>, he gave the helpful advice that if you didn't need the optimizations that Prince uses, a straight combinator attack with Hashcat was much faster for easy hashes like raw-sha1.<br />
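For a feel of what a straight combinator attack does with a list of walks, here is a small sketch. The function names are hypothetical, and it assumes the same walk is allowed to appear more than once in a candidate. In practice a cracker like Hashcat would do this combining internally and much faster.<br />

```python
from itertools import product
from typing import Iterator, Sequence


def combine_walks(walks: Sequence[str], parts: int) -> Iterator[str]:
    """Yield every concatenation of `parts` walks, in order, with repeats allowed."""
    for combo in product(walks, repeat=parts):
        yield "".join(combo)


def combined_keyspace(walk_count: int, parts: int) -> int:
    """Number of candidates produced by combine_walks()."""
    return walk_count ** parts


if __name__ == "__main__":
    for parts in (1, 2, 3):
        print(f"575 walks, {parts} part(s): {combined_keyspace(575, parts):,} candidates")
```

With 575 one-bend walks, three walks glued together is still only about 190 million candidates, which is tiny for raw-sha1.<br />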
<div>
<br /></div>
<div>
And then I pretty much waited. Well, in reality I tried some attacks against the sample One Time Grids to bide my time, but I didn't expect to crack the first hash without the contest grid. I was a bit cocky though, and expected that I'd crack the first hash within minutes of the grid being released.</div>
<div>
<br /></div>
<div>
<span style="background-color: white;"><b>Second</b><b style="white-space: pre-wrap;"> Clue: </b></span><b>One-Time Grid attached below</b></div>
<div>
Yes! The target one time grid was finally released. I'll admit I said a few choice words that it was released as a picture though, which led to some squinting and me questioning if letters were lower or uppercase. Oh, and also one typo when entering it into my code that I nearly missed, but luckily <a href="https://twitter.com/hops_ch" target="_blank">Hops</a> pointed it out to me. In any future contests, it would be really nice if items like this could be released as text that allowed copying/pasting.</div>
<div>
<br /></div>
<div>
Another challenge I ran into was that I wasn't at my cracking computer, so couldn't run any effective attacks myself. Luckily Chick3nman agreed to run my script and try to crack the first hash for me. Unfortunately he wasn't successful. I want to stress that was my fault since he was running my scripts and attacks.</div>
<div>
<br /></div>
<div>
There was a lot of head scratching, and variations of walks plus the suggested PIN and random word, but long story short, even when I got back to my computer and ran attacks myself, I was completely ineffective at cracking that first hash. I'll admit it really annoyed me in a good way like any fun problem does. I want to give a huge shout out to <a href="https://twitter.com/BoursierEtienne" target="_blank">Boursier Etienne</a>, who actually managed to crack it first. I'd love to hear what Boursier did.</div>
<div>
<br /></div>
<div>
<b>Third Clue: Birthday Paradox</b><br />
I may have uttered a few more choice words over this clue. I'm well versed in the <a href="https://en.wikipedia.org/wiki/Birthday_problem" target="_blank">birthday problem</a>, but that doesn't seem to be applicable to One Time Grids. Yes some individual characters appear more often than others, but the heart of the "scatter" problem is a <b>"Choose X with no replacement"</b> problem. Aka, the first character has 49 different options. The second character has 48 different options. The third character has 47 different options. And so on. This is not related to generating collisions between multiple inputs as far as I can see.<br />
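The "choose with no replacement" count can be sketched in a couple of lines. This assumes a 7x7 grid (49 cells) where each cell is used at most once and order matters; the function name is my own.<br />

```python
import math

GRID_CELLS = 49  # 7x7 One Time Grid


def scatter_cell_orderings(length: int, cells: int = GRID_CELLS) -> int:
    """Ordered selections of `length` distinct cells: 49 * 48 * 47 * ..."""
    return math.perm(cells, length)


if __name__ == "__main__":
    for length in (6, 8, 10, 14):
        print(f"{length:2d} cells: {scatter_cell_orderings(length):,} orderings")
```

This counts cell orderings, not distinct passwords. Since some characters appear in more than one cell, the number of distinct passwords is somewhat smaller.<br />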
<br />
<b>Fourth Clue: Are all cell values equally probable?</b><br />
I see where Netmux was going with this. For a scatter password, if you were modeling it, cells 3/26, 6/25, and 7/23 all contained periods ".". If you selected any of them when generating a password guess, it didn't matter which order you picked them which can reduce the effective keyspace. The problem comes when trying to weaponize this info. I did some back of the napkin calculations and if your guess generator took into account the "choose and no replacement" aspects along with the "several characters show up several times", you could reduce the keyspace by roughly a factor of 10 for the password lengths I thought the password might be. This sounds great, but one problem I've run into <a href="https://github.com/lakiw/pcfg_cracker" target="_blank">many times before</a>, is that more effective guess generators take time to generate guesses. So while a script that I coded might reduce the keyspace by 10x, it would probably take 100x more time to generate a guess against a raw-sha1 hash than just using a custom mask. Therefore trying to optimize my solution would actually make it worse.<br />
<br />
Now admittedly someone could take the time to create a custom solution in Hashcat or John the Ripper that would be fast, but that wasn't going to happen in the time this contest ran. More importantly though, for a 10 character password generated by a "scatter" method, it didn't matter. The keyspace was so large that even a 10x speedup wouldn't be enough to make it practical.<br />
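To show how repeated characters shrink the effective keyspace, here is a toy sketch that counts both the ordered cell selections and the distinct strings they produce. The four cell "grid" is made up for the example and is far too small to say anything about the real 10x estimate; the function name is also my own.<br />

```python
from itertools import permutations
from typing import Tuple


def orderings_vs_strings(cells: str, length: int) -> Tuple[int, int]:
    """Return (ordered cell selections, distinct resulting strings)."""
    orderings = 0
    distinct = set()
    # Pick cell positions without replacement, then map them to characters.
    for picked in permutations(range(len(cells)), length):
        orderings += 1
        distinct.add("".join(cells[i] for i in picked))
    return orderings, len(distinct)


if __name__ == "__main__":
    # Hypothetical grid with four cells, two of which hold a period.
    print(orderings_vs_strings("a.b.", 2))
```

For this toy grid there are 12 ordered cell selections but only 7 distinct strings, because swapping the two period cells never changes the guess.<br />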
<br />
<b>Fifth Clue: str(PIN)[:-1]</b><br />
This hint was a good clue that the PIN, minus the last character of the PIN, was part of one or both of the passwords. Aka "71997" could be found in the password. This was good info to have when trying to crack the password, but I'll admit I was a little annoyed since guidance to apply mangling rules like this wasn't in the instructions for using One Time Grids. By that I mean, it's totally within the bounds of someone doing this in real life. In fact, I'd recommend it, as it explodes the keyspace of One Time Grids. But based on the instructions I wouldn't expect a typical user of One Time Grids to apply a mangling rule like "remove the last character of the PIN". Now, most of my password cracking techniques are based on targeting "typical users". If everyone was unique I'd be the worst password cracker out there. But people typically follow standard behavior patterns which makes password cracking possible. I'm biased, but I like to see that reflected in contests. Needless to say though, this wasn't enough information to crack either one of the two password hashes.<br />
<br />
<b>Sixth Clue: scatter_cells + str(PIN)[:-1]</b><br />
This clue said that the PIN-1 would be at the end of the scatter cells password, which was helpful without being useful. The keyspace for likely scatter cells passwords was so large that knowing any additional mangling didn't make a difference.<br />
<br />
<b>Seventh Clue: Use seven of the possible ten "repeats" to mask your way to the other half of the scatter_cells solution.</b><br />
This provided a lot of useful information without being actionable. It said the "scatter" portion of the password was 14 characters long, with 7 of those characters being a repeat item, and the other 7 being unique characters. This meant 7 characters had 10 possible values, and the other 7 had 29 possible values. What's more, the second set was a pure choose with no replacement, so the 7th character would technically only have 23 possible options. The problem once again was making use of this information. For example, I didn't know which positions would take from either set. So for a 14 character password, that increases the keyspace by a factor of 2^14 = 16,384, which is a problem because the current mask setups for JtR and Hashcat don't support that kind of selection. In retrospect, I realized I could have created a script to generate all 16k masks and feed them into Hashcat, but during the contest that didn't occur to me. Long story short, this was the point where if given six months it's possible someone could have cracked the second hash, but it was unrealistic to do it in a day or two.<br />
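Here is a rough sketch of what a mask generating script could look like. It assumes Hashcat's custom charset 1 (the -1 option) holds the ten "repeat" characters and custom charset 2 (the -2 option) holds the other 29, so ?1 and ?2 in the mask pick from those sets; the function name is hypothetical. Note that 2^14 counts every way of assigning each position to one of the two sets. If exactly seven positions come from each set, only C(14,7) = 3,432 of those assignments apply, which is what this sketch generates.<br />

```python
from itertools import combinations
from typing import Iterator


def scatter_masks(total_len: int = 14, repeat_positions: int = 7) -> Iterator[str]:
    """Yield one mask per way of placing the 'repeat' characters.

    ?1 marks a position drawn from the repeat set, ?2 a position drawn
    from the remaining characters (both charsets are assumptions).
    """
    for chosen in combinations(range(total_len), repeat_positions):
        chosen_set = set(chosen)
        yield "".join("?1" if i in chosen_set else "?2" for i in range(total_len))


if __name__ == "__main__":
    masks = list(scatter_masks())
    print(len(masks), "masks, for example:", masks[0])
```

A mask can't express the "no replacement" rule, so each mask would still test some guesses that reuse a cell. Even so, every one of those masks covers 10^7 * 29^7 candidates, which is why this clue still wasn't actionable.<br />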
<br />
<b>Eighth Clue: Hash #2 = print(len(scatter_cells + str(PIN)[:-1])) = 19</b><br />
While this made explicit that there were no other mangling rules or surprises for the second password hash, it didn't make the problem more crackable compared to the previous clue.<br />
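Since the clues were written as Python expressions, here is what they work out to. The last digit of the PIN below is a made up placeholder (only "71997" comes from the clues), and the scatter portion is filled with a placeholder character.<br />

```python
PIN = "719970"  # hypothetical: only the first five digits are known from the clues
TOTAL_LEN = 19  # from the eighth clue

pin_part = str(PIN)[:-1]                 # drop the last character of the PIN
scatter_len = TOTAL_LEN - len(pin_part)  # characters left for scatter_cells
scatter_cells = "?" * scatter_len        # placeholder for the unknown scatter walk
candidate = scatter_cells + pin_part

print(pin_part, scatter_len, len(candidate))
```

In other words, 19 total characters minus the 5 PIN digits leaves 14 scatter characters, matching the previous clue.<br />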
<br />
<b>Ninth Clue: No cell values have been reused in the composition of scatter_cells.</b><br />
<b>“q$*????????)wc” + str(PIN)[:-1]</b><br />
This is where I got really lucky. I managed to check Twitter at the exact right time and saw the following tweet by Netmux:<br />
<blockquote class="twitter-tweet" data-lang="en">
<div dir="ltr" lang="en">
T-minus 15 minutes until the release of the final Hash Crack Challenge clue!<a href="https://twitter.com/hashtag/hashcrack?src=hash&ref_src=twsrc%5Etfw">#hashcrack</a> <a href="https://twitter.com/hashtag/passwords?src=hash&ref_src=twsrc%5Etfw">#passwords</a> <a href="https://t.co/EBPfiPXc9R">pic.twitter.com/EBPfiPXc9R</a></div>
— Netmux (@netmux) <a href="https://twitter.com/netmux/status/1035682070499139584?ref_src=twsrc%5Etfw">September 1, 2018</a></blockquote>
<script async="" charset="utf-8" src="https://platform.twitter.com/widgets.js"></script>
<br />
Therefore I was at my computer and ready to go for the final hint. When he posted it, I quickly created the following mask attack using hashcat:<br />
<br />
<span style="background-color: white; color: #24292e; font-family: , "consolas" , "liberation mono" , "menlo" , "courier" , monospace; font-size: 12px; white-space: pre;">hashcat64.exe -m100 -O -a 3 ..\contests\netmux\netmux.hsh -1 IA9GV8oyILM.!03WKH+epP{TxJz3hbu\? q$*?1?1?1?1?1?1?1?1)wc71997</span><br />
<br />
By Netmux giving me 6 of the scatter characters used I only had to bruteforce an 8 character password, and there were only 32 possible characters per position, making this significantly easier than a Lanman password hash. All told, it took me around 5 minutes to crack the password hash, which admittedly was a heart pounding five minutes since I was sure other people were running the same attack as I was. I was sweating the whole time and my adrenaline was pumping. As proof of the timing to run the attack, here is me re-running the cracking attack on my system. It took 9 minutes to exhaust the whole keyspace, but I got my crack around five minutes in.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiU5r1aNxju6U5_lwf8unHxLIn5hJ7iyZWijnizurV-zgEHD_Kvcbr7e1tciPAobI5ZBC1VTDUoDjhWJyV8JCQo_m2a87TEVode8nwb1Z1QlCv60K3xUX9VgLVa420e54vtUGgotL0CbVg/s1600/cracked.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="515" data-original-width="1077" height="306" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiU5r1aNxju6U5_lwf8unHxLIn5hJ7iyZWijnizurV-zgEHD_Kvcbr7e1tciPAobI5ZBC1VTDUoDjhWJyV8JCQo_m2a87TEVode8nwb1Z1QlCv60K3xUX9VgLVa420e54vtUGgotL0CbVg/s640/cracked.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Cracking the 2nd Hash. Path information and the actual hash plaintext redacted.</td></tr>
</tbody></table>
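As a sanity check on those numbers, here is a quick back of the napkin calculation. It only uses the figures above (8 unknown positions, 32 candidates per position, roughly 9 minutes to exhaust); the variable names are my own, and the implied speed is an estimate rather than a benchmark.<br />

```python
CHARSET_SIZE = 32        # candidate characters per unknown position
UNKNOWN_POSITIONS = 8    # the ?1 positions in the mask
EXHAUST_SECONDS = 9 * 60 # approximate time to run the full mask

keyspace = CHARSET_SIZE ** UNKNOWN_POSITIONS
implied_rate = keyspace / EXHAUST_SECONDS

print(f"keyspace: {keyspace:,} candidates")
print(f"implied speed: {implied_rate / 1e9:.2f} billion guesses per second")
```

That works out to about 1.1 trillion candidates at roughly two billion guesses per second.<br />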
For comparison, I have a single NVIDIA GTX 970 in my computer. Not even a Ti. Really what it comes down to was that I was very lucky, to the point where I feel a little bit guilty about it. In the future I'd advise contest creators to publish set times when they will release hints so that way everyone is on an even playing field when it comes to making use of this information.<br />
<br />
<b>Conclusion:</b><br />
First of all, I'd like to give thanks to Netmux for putting on this competition. I had a lot of fun and I hope this blog post points that out. There's many "contests" out there but putting my time into this was way more enjoyable than dealing with the drama of <a href="https://techcrunch.com/2018/08/30/john-mcafees-unhackable-bitfi-wallet-got-hacked-again/" target="_blank">hacking Bitfi</a>. Also dealing with a new type of bounded problem like One Time Grids was very interesting.<br />
<br />
I'd also like to thank <a href="https://twitter.com/Chick3nman512" target="_blank">Chick3nman</a>, <a href="https://twitter.com/hops_ch" target="_blank">Hops</a>, and <a href="https://twitter.com/TychoTithonus" target="_blank">Royce Williams</a>, for lending cracking hardware, giving advice, and all the heckling ;p<br />
<br />
As to the security of One Time Grids, let me back up a bit.<br />
<br />
When doing any threat analysis or security review my first step is to categorize the adversary. A good rule of thumb brought up by James Mickens is the <a href="https://www.usenix.org/system/files/1401_08-12_mickens.pdf" target="_blank">"Mossad vs. not-Mossad"</a> categorization. I highly recommend following that link because the write-up is hilarious, but it boils down to if you are worried about the Mossad, well there's nothing you can do because you are going to f***ing die. But if your adversary is someone else, there are effective strategies you can take to protect yourself. Now admittedly there are variations of this, but basically if you are worried about nation level attackers, then don't use One Time Grids. If you are worried about typical hackers though, One Time Grids can be extremely effective. I'll freely admit that I'm not the best password cracker out there, but the fact remains that if Netmux hadn't given me the One Time Grid, along with 11 characters of a 19 character password, I'd never have cracked it. Also One Time Grids are such a niche technique that even after this contest I don't see myself incorporating the lessons learned into any of my normal cracking strategies.<br />
<br />
There are two major problems I see with One Time Grids though. The first is they don't produce memorable passwords. If you don't want to write the passwords down, you'll need to take your book with you, which is a pain. And if you do write your passwords down, I'd recommend using a traditional password manager instead, most of which have built-in random password generation tools that are just as effective as One Time Grids for creating strong passwords.<br />
<br />
The second problem is that One Time Grids share the same issue as many other password "books". They have the potential for horrible failure if your adversary is someone you know and/or love who has access to it directly. Ex-boyfriends/girlfriends/husbands/wives are the big ones, but nosy children or parents also pop up. I'm always very sensitive to this threat vector since while dealing with an abusive ex is bad, dealing with an abusive ex who has access to your e-mail and Facebook is way worse. Password management programs can help in this regard, but written-down books are problematic. Yes, someone could avoid writing down their "patterns" for One Time Grids, but that doesn't scale, as having unique passwords for sites is more important than strong passwords in my opinion. You have no idea how sites are storing their passwords, so the best way to minimize your risk of a site storing your password in plaintext is to use different passwords for different sites.<br />
<br />
I guess what I'm trying to say is I'm a big believer in <a href="http://joehikes.tumblr.com/post/107841653149/hike-your-own-hike-what-does-it-mean" target="_blank">hike your own hike</a>. If you enjoy using One Time Grids, I haven't seen anything to caution against it. You are probably way more secure than most people who don't do anything special. While I'm biased to suggest standard password management programs like 1password, I'll readily admit that programs like 1password have <a href="https://twitter.com/lakiw/status/1032104757220003840" target="_blank">usability problems too</a>. If you really want to have a physical password book, free options include <a href="http://world.std.com/~reinhold/diceware.html" target="_blank">diceware</a>, but if you like the idea of One Time Grids, quite simply, I'm not going to crack those passwords without a whole lot of help.<br />
<br />
<b>Bonus Snark</b><br />
<br />
While doing research on One Time Grids, I came across the following on Amazon and my first thought was, "I bet whoever owned that copy previously was *really* important!!!" /jk<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7lJjPDtHDfLSXMiL9OjyuJaGsBgjDYkQHoOxOheKMhK-2H5qn5HAj3Jj7gHZOmcSdooUkBQwh4IP_iaBeApXtDtLt3HYNradrp2Je5SaOeg97sWGq-DoWg2f2Bo73Cyka6gi_a-Ygpk4/s1600/one_time_grid_funny.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="608" data-original-width="883" height="440" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh7lJjPDtHDfLSXMiL9OjyuJaGsBgjDYkQHoOxOheKMhK-2H5qn5HAj3Jj7gHZOmcSdooUkBQwh4IP_iaBeApXtDtLt3HYNradrp2Je5SaOeg97sWGq-DoWg2f2Bo73Cyka6gi_a-Ygpk4/s640/one_time_grid_funny.png" width="640" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Only $4.67 for shipping though...</td></tr>
</tbody></table>
<br />
</div>
Posted by <a href="http://www.blogger.com/profile/16111343330590419341">Matt Weir</a><br />
<br />
<h2>Creating Long Term SSL Certificates</h2>
(Published 2018-03-11)<br />
<blockquote class="tr_bq">
<i><b><span style="background-color: white; color: #333333; font-family: "helvetica neue" , "helvetica" , "arial" , sans-serif; font-size: 14px;">"It's constantly fascinating for me that something that feels absolutely right one year, 12 months later feels like the wrong thing to do." --</span><span style="color: #333333; font-family: "helvetica neue" , "helvetica" , "arial" , sans-serif;"><span style="font-size: 14px;">Damian Lewis</span></span></b></i></blockquote>
Often I find myself having to create my own SSL certificates. Be it an internal web-server, or two scripts that need to communicate with each other, SSL is the easiest way to encrypt network traffic. Unfortunately it's also one of the most dangerous encryption methods. If you make a mistake setting it up it usually works ... at least for a little while.<br />
<br />
Ignoring the client SSL checks for now, (hint: if your script is using SSL and it works the first time, you probably are not checking SSL correctly), one area of danger is having your SSL certificates expire. As an example of that, recently <a href="https://arstechnica.com/gadgets/2018/03/oculus-rift-runtime-service-error/" target="_blank">every Oculus Rift broke</a> because a code signing certificate expired. Admittedly this was a different type of certificate, but the same thing tends to happen with internal SSL deployments. People do not remember to update them, and when they expire things tend to break, (at least if your clients are checking SSL properly). The problem is when you use the standard OpenSSL libraries to create your certificates, there are three places that you need to specify certificate lifetimes. If you forget to specify any of the three, the certificate will be valid only for the default which is set to be "365 days".<br />
<br />
These lifetime checks are:<br />
<ol>
<li>The Certificate Authority has an expiration date</li>
<li>The actual certificate you are using has an expiration date</li>
<li>The CA signature for the certificate has an expiration date</li>
</ol>
<div>
Since most Stack Overflow posts don't cover this, and Linux man pages are not helpful unless you already know what you are doing, I wanted to share my cheat sheet for creating long term, (valid for one thousand years), SSL Certificate Authorities and signing certs. This script was born from many previous failed efforts, and to be honest I'm still not sure I have it perfectly right. If you notice any improvements that could be made, please let me know! </div>
<div>
<br /></div>
<div>
Requirements/Comments:</div>
<div>
<ul>
<li>These instructions were written for CentOS. It should work for most other Linux flavors without any changes. If you are using Windows, good luck!</li>
<li>OpenSSL</li>
<li>Whenever you see 365000 in the command that's the expiration date. I'm using 365*1000 as shorthand for one thousand years. Yes I realize that isn't exactly accurate. Feel free to change this to the time period you want to use.</li>
</ul>
</div>
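<div>
To put a number on the "isn't exactly accurate" caveat above, here is a minimal Python sketch. The start date is only an assumption for illustration (the day this post went up), and nothing in it touches OpenSSL itself. It compares what 365000 days gives you against a true 1000 calendar years with leap days included.</div>

```python
from datetime import date, timedelta

# Assumed start date for illustration: the day the certificate is created.
issued = date(2018, 3, 11)

# What a lifetime of 365000 days actually gives you.
expires = issued + timedelta(days=365000)

# A true 1000 calendar years later, leap days included.
thousand_years = issued.replace(year=issued.year + 1000)

# How far short of a full millennium the shorthand falls.
shortfall_days = (thousand_years - expires).days
print(expires, shortfall_days)
```

<div>
For this start date the shorthand comes up 242 days (roughly eight months) short of a full millennium, which is unlikely to matter for what we are doing here.</div>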
<div>
<br /></div>
<div>
<u>Creating the Certificate Authority:</u> (If you already have a CA ignore this, but you might want to check the valid lifetime for that CA)</div>
<div>
<ul>
<li>Generate the key for the CA using 4096 RSA. Note the key will be cakey.pem so protect that!</li>
</ul>
<blockquote class="tr_bq">
<b>openssl req -new -newkey rsa:4096 -nodes -out ca.csr -keyout cakey.pem</b></blockquote>
<ul>
<li>Create the CA's public certificate which will be called "cacert.pem". Note the '-days' field:</li>
</ul>
<blockquote class="tr_bq">
<b>openssl x509 -trustout -signkey cakey.pem -days 365000 -req </b><b>-in ca.csr -out cacert.pem</b></blockquote>
<blockquote class="tr_bq">
<b><span style="color: red;">Important: </span></b>When you run the <b>openssl req</b> command above, you'll be asked a list of questions. Note, for many SSL deployments you <b>*must*</b> have the <b>Country, City, State, </b>and<b> Organization</b> match between your CA and the certificates you are signing. Does this make sense? Of course not! The domain can be pretty important as well depending on what you are doing.</blockquote>
<ul>
<li>Next you need to copy the CA info and create the required files into where OpenSSL expects them. Yes if you know what you are doing you can override the defaults, but if not here's what to do:</li>
</ul>
<ol><ol>
<li>If the <b>/etc/pki/CA</b> directory does not exist, create it</li>
<li><b>mv cakey.pem /etc/pki/CA/private/cakey.pem</b></li>
<li><b>touch /etc/pki/CA/index.txt</b></li>
<li>create or edit <b>/etc/pki/CA/serial</b> using the text editor of your choice</li>
<li>In this file put a list of all the serial numbers you want to assign certificates, separated by a newline. For example:</li>
<ul>
<li>01</li>
<li>02</li>
<li>03</li>
<li>04</li>
</ul>
<li>It is *highly* recommended that you set permissions on the /etc/pki/CA directory so only the user you want to sign certificates has access to it.</li>
</ol>
</ol>
<ul>
<li>Note, <b>cacert.pem</b> is not used for signing SSL certificates, but you'll need to push it to clients that are verifying the certificates</li>
</ul>
<br />
<u>Creating and Signing a SSL Certificate:</u><br />
<br />
<ul>
<li>Create the certificate private key using RSA 4096. It is named client.key in this example. Make sure you protect this!</li>
</ul>
<blockquote class="tr_bq">
<b> openssl genrsa -out client.key 4096</b></blockquote>
<ul>
<li>Create the certificate request. Note the "days" field.</li>
</ul>
<blockquote class="tr_bq">
<b>openssl req -new -key client.key -out client.csr -days </b><b>365000 </b> </blockquote>
<blockquote class="tr_bq">
<b><span style="color: red;">Important: </span></b>Remember for the questions it asks you, the <b>Country, City, State, </b>and<b> Organization</b> <b>*</b>must<b>*</b> match between your CA and the certificates you are signing. In addition, the domain can be pretty important depending on whether or not you are checking that with your client.</blockquote>
<ul>
<li> Create the actual client certificate. Once again, note the '-days' field</li>
</ul>
<blockquote class="tr_bq">
<b>openssl ca -in client.csr -out client.pem -days 365000 -notext</b></blockquote>
<br />
<u>Resulting Files: </u><br />
<div>
<ol>
<li>Public Client Certificate: client.pem</li>
<li>Client private key: client.key (Only deploy on the server that owns this key)</li>
<li>Public CA certificate: cacert.pem</li>
<li>Private CA key: cakey.pem (Protect this one!!)</li>
</ol>
</div>
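<div>
Since the whole point is not getting surprised by an expiration date, it is worth checking the results. Running <b>openssl x509 -enddate -noout -in client.pem</b> (and the same against cacert.pem) prints a single notAfter= line. Below is a minimal Python sketch for turning that line into "days left". The function name and the sample line are my own assumptions for illustration, not output from a real certificate.</div>

```python
from datetime import datetime

def days_until_expiry(enddate_line, now):
    """Parse a line such as 'notAfter=Jul 12 22:13:06 3017 GMT'."""
    label, _, stamp = enddate_line.strip().partition("=")
    if label != "notAfter":
        raise ValueError("expected a notAfter= line")
    # OpenSSL prints these timestamps in GMT; drop the suffix before parsing.
    expires = datetime.strptime(stamp.replace(" GMT", ""), "%b %d %H:%M:%S %Y")
    return (expires - now).days

# Hypothetical sample line, made up for this example.
sample_line = "notAfter=Jul 12 22:13:06 3017 GMT"
days_left = days_until_expiry(sample_line, datetime(2018, 3, 11, 22, 13, 6))
print(days_left)
```

<div>
If the number that comes back is in the hundreds instead of the hundreds of thousands, one of the three '-days' options was probably missed.</div>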
</div>
Posted by <a href="http://www.blogger.com/profile/16111343330590419341">Matt Weir</a><br />
<br />
<h2>Solving Problems with Unknown Constraints</h2>
(Published 2017-08-16, updated 2017-08-17)<br />
<blockquote class="tr_bq">
<i>"Software constraints are only confining if you use them for what they're intended to be used for"</i> </blockquote>
<blockquote class="tr_bq">
<i>-- David Byrne (Of the Talking Heads)</i></blockquote>
I recently had an ongoing conversation that spanned several days about the subject of solving mazes. A friend casually mentioned the "Same Wall Rule", (also known as the "Right Hand Rule"), for solving a maze. This is where if you want to find the exit of a maze you should pick a wall and follow it, with the assumption that you will eventually find the exit this way.<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjPlE35RKOx0Djnb__oGhsqfpGWzj7wh6Nv-PfglxI0suglEoa1MS-Or9cm_1SJt7ORkhu2HTGTG-ofL1v4QA0JtHVvOSGxEB6sgIqjQyfEkN1bDUgg5FEfMzCC-h72JJBWPqHx21kji0/s1600/right_hand.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="177" data-original-width="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjPlE35RKOx0Djnb__oGhsqfpGWzj7wh6Nv-PfglxI0suglEoa1MS-Or9cm_1SJt7ORkhu2HTGTG-ofL1v4QA0JtHVvOSGxEB6sgIqjQyfEkN1bDUgg5FEfMzCC-h72JJBWPqHx21kji0/s1600/right_hand.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Same Wall Rule for Solving a Maze</td></tr>
</tbody></table>
<br />
I pointed out that while this rule generally works, you can't count on it as it can fail spectacularly. For example, what if you start out next to a free-standing wall?<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeFVu8t8NNBqO_9W09qyMLtTrRzqrxTxuf5fiQesHOsmgmgHu2LYrF_9AXDhICg1FU_v6P7c-YaqM657UnC0RbT8klWLt4TXcVou9_e4Tu8lVEtX_cj-IHy2__-mWvJLij6oPsUSBarGU/s1600/right_hand_fail.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="177" data-original-width="177" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeFVu8t8NNBqO_9W09qyMLtTrRzqrxTxuf5fiQesHOsmgmgHu2LYrF_9AXDhICg1FU_v6P7c-YaqM657UnC0RbT8klWLt4TXcVou9_e4Tu8lVEtX_cj-IHy2__-mWvJLij6oPsUSBarGU/s1600/right_hand_fail.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Same Wall Rule Failing Horribly</td></tr>
</tbody></table>
After that our conversation turned to other things but the next day my friend came back and said <i>"I found the problem! The Same Wall Rule will work, but you have to start at the beginning of the maze! Then you can be guaranteed that you won't hit a free-standing wall".</i><br />
<br />
Which is true in most cases, but what if you are looking for an exit in a free-standing section of the maze? For example what if the treasure is in the middle or you are dealing with a 3-dimensional maze?<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVvXzrF3WhClzPD7O7_wrQMciCaOynlvd5hynt6SKx7hdHUiq696zXVgBKMpXq25owOF5RyDnhy0zq5uqS-P9yBESm9U4jKHD7rolXXvVXzBvoX20opAFEsWh2yaappl2PGAGoBwlhumM/s1600/right_hand_exit_in_middle2.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="171" data-original-width="167" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiVvXzrF3WhClzPD7O7_wrQMciCaOynlvd5hynt6SKx7hdHUiq696zXVgBKMpXq25owOF5RyDnhy0zq5uqS-P9yBESm9U4jKHD7rolXXvVXzBvoX20opAFEsWh2yaappl2PGAGoBwlhumM/s1600/right_hand_exit_in_middle2.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Same Wall Rule Failing to Find Treasure</td></tr>
</tbody></table>
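<br />
The three pictures above can be reproduced with a few lines of Python. This is only a minimal sketch under my own assumptions: the maze is a small grid fully surrounded by walls, the layout and names are made up for illustration, and the walker gives up as soon as it repeats a (cell, heading) state, since that means it is going in circles.<br />

```python
# Headings in clockwise order: north, east, south, west.
STEPS = [(-1, 0), (0, 1), (1, 0), (0, -1)]

# Hypothetical maze: '#' is a wall, 'E' the exit, 'T' the treasure.
# The single '#' in the middle row is a free-standing wall.
MAZE = [
    "#########",
    "#      E#",
    "#       #",
    "#   #T  #",
    "#       #",
    "#       #",
    "#########",
]

def follow_right_wall(maze, start, heading, goal):
    """Walk with the right hand on the wall until `goal` is reached.

    Returns the list of visited cells, or None if the walker repeats a
    (cell, heading) state or is walled in on all sides.
    """
    pos, path, seen = start, [start], set()
    while (pos, heading) not in seen:
        seen.add((pos, heading))
        if maze[pos[0]][pos[1]] == goal:
            return path
        # Preference order: turn right, go straight, turn left, turn back.
        for turn in (1, 0, 3, 2):
            d = (heading + turn) % 4
            nxt = (pos[0] + STEPS[d][0], pos[1] + STEPS[d][1])
            if maze[nxt[0]][nxt[1]] != "#":
                pos, heading = nxt, d
                path.append(pos)
                break
        else:
            return None
    return None

# Start at the corner with the outer wall on the right: finds the exit.
from_entrance = follow_right_wall(MAZE, (1, 1), 2, "E")
# Start next to the free-standing wall: circles it forever.
from_pillar = follow_right_wall(MAZE, (3, 3), 0, "E")
# Start at the corner but look for the treasure in the middle: never sees it.
treasure_hunt = follow_right_wall(MAZE, (1, 1), 2, "T")
print(from_entrance is not None, from_pillar, treasure_hunt)
```

Same algorithm, same maze, and the only things that changed between success and failure were the starting point and what counted as "the exit".<br />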
This reminded me of a paper that Cormac Herley recently wrote titled: <a href="http://cormac.herley.org/docs/justifyingSecurityMeasures.pdf" target="_blank">Justifying Security Measures</a>. I highly recommend reading it. It points out that in the security community we often say:<br />
<blockquote class="tr_bq">
Security(X) > Security(~X)</blockquote>
When we really mean:<br />
<blockquote class="tr_bq">
Outcome(X|ABCD) > Outcome(~X|ABCD).</blockquote>
Which is a fancy way of showing that when we say doing X is more secure than not doing X, there usually is a large number of assumptions, (ABCD....), that we're leaving out. Where this directly relates to the main topic of this blog, (password security), is that Herley specifically calls out the password field for the practice of ignoring constraints in our security advice. Or, to quote his paper:<br />
<blockquote class="tr_bq">
<i>"Passwords offers a target-rich environment for those seeking tautologies and<br />unfalsifiable claims."</i></blockquote>
Now back to the issue of maze solving, the same problem often arises. When we make a maze solving algorithm, we're making certain assumptions about the rules of the game. For example, the next iteration of a mapping algorithm might involve marking rooms that you have been in before to detect loops. Well there is a certain fairy-tale where that approach failed due to the marks being destroyed by a 3rd party actor:<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU7iUQ131nhKojj1dgh_Hfap1HuBUdKeRlWo0sgtUnBaHxru81mVnkUPfxDV0DoNFLJBUM5E4Db4ydMZFoxOl6hEuIiAcNCWXkM6AKb8nQM0Zb-v77o_eJ629snxeVNksAljsqBwVP9p4/s1600/hansel.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="349" data-original-width="400" height="279" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjU7iUQ131nhKojj1dgh_Hfap1HuBUdKeRlWo0sgtUnBaHxru81mVnkUPfxDV0DoNFLJBUM5E4Db4ydMZFoxOl6hEuIiAcNCWXkM6AKb8nQM0Zb-v77o_eJ629snxeVNksAljsqBwVP9p4/s320/hansel.jpg" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Hansel and Gretel showing that marks aren't always permanent</td></tr>
</tbody></table>
Even assuming you can safeguard your marks in the maze, that approach may still not be effective if the maze moves while you are traversing it.<br />
<div>
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwvbGPagDmod1fPcBQtlZu895wVBXWxsWuYTX70F_Vm4OLaCCqbL4R1IuQZ2841ttH6e8S9Iad0saA6yVd3_o3QCY4Jycl2Wt1_tXgzhbQt5NhyWl0DRKzd0AKR50OTf_PiKEsswoB6HQ/s1600/the-maze-runner-teaser-poster.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="1600" data-original-width="1083" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjwvbGPagDmod1fPcBQtlZu895wVBXWxsWuYTX70F_Vm4OLaCCqbL4R1IuQZ2841ttH6e8S9Iad0saA6yVd3_o3QCY4Jycl2Wt1_tXgzhbQt5NhyWl0DRKzd0AKR50OTf_PiKEsswoB6HQ/s320/the-maze-runner-teaser-poster.jpg" width="216" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">I've never seen such an amazing premise turned into such a boring book</td></tr>
</tbody></table>
Note, these assumptions go both ways. For example if you are designing a super hard maze, a snarky player can often do something completely unexpected.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF3gfPF_Sl-bXS6ckDf3k7xExcqleA7hmDd2t_jnl4UyCLNUiXgl3THb4BpxOxDCFqpSc_UXULQ5bxoVn1Cwdro8xWAvxZn2TdNUxASg0UVsIAhg0t8Nq8c1uArClAB8jEdQUbA8JWfis/s1600/maze_crushing_it.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" data-original-height="186" data-original-width="194" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhF3gfPF_Sl-bXS6ckDf3k7xExcqleA7hmDd2t_jnl4UyCLNUiXgl3THb4BpxOxDCFqpSc_UXULQ5bxoVn1Cwdro8xWAvxZn2TdNUxASg0UVsIAhg0t8Nq8c1uArClAB8jEdQUbA8JWfis/s1600/maze_crushing_it.png" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Seriously, why would you want to go through the maze?</td></tr>
</tbody></table>
I'd argue that coming up with a perfect maze solver that works for all mazes with no constraints is a near impossible problem. If you can design an algorithm, chances are someone else can come up with a situation where it will fail. On the plus side, the same goes for maze designers. If you come up with a maze with constraints, someone probably can solve it even if it's not how you expected the maze to be solved.<br />
<div>
<br /></div>
<div>
This is a point that I'm actually optimistic about. We deal with imperfect knowledge of the rules we're operating under every day. That's part of the human condition! Tying this back in with Herley's paper, I think there's some things to keep in mind.</div>
<div>
<ol>
<li>When giving advice to end users, I think it's fair to leave implied constraints out as long as the person giving the advice keeps them in mind. Aka telling your kids to follow the right hand wall to get through a corn maze is perfectly reasonable. Telling your kids this assumes there are no minotaurs or evil clowns waiting in the maze to eat them probably will not result in the end state you are aiming for.</li>
<li>Unfortunately following the above can lead to those constraints being forgotten over time and that advice being applied to situations where it is no longer helpful.</li>
<li>Therefore you need to be willing to question previously held beliefs and come up with new approaches when reality doesn't match your expected experiences.</li>
</ol>
<div>
The question then is, how do you discover/rediscover unknown constraints when you start experiencing issues?</div>
<div>
<br /></div>
<div>
One way to deal with this is through experimental design along with making hypotheses about what the results of those experiments will be before you run them. That's something I'm trying to get better at doing as seen in my <a href="http://reusablesec.blogspot.com/2016/08/evaluating-value-of-purge-rule.html" target="_blank">previous blog post</a>. </div>
<div>
<br /></div>
<div>
As an example: Herley raises the question "Are lower-case pass-phrases better or worse than passwords with a mix of characters". If I construct an experiment I have to specify a set of constraints that experiment will run under. Now do those constraints match up with the real world use-cases? Of course not! But the fact that there are constraints can help me and other people interpret how to use those results. Likewise before running an experiment it's important to have a theory and make a hypothesis about what the results will be. Once that's done, running the experiment can validate or falsify the hypothesis. I can then update the theory as needed and the process continues.</div>
<div>
<br /></div>
<div>
To put it another way, I think there are a lot of areas where the academic side of computer security can help improve the practical impact that computer security choices impose on the end user ;p</div>
</div>
Posted by <a href="http://www.blogger.com/profile/16111343330590419341">Matt Weir</a><br />
<br />
<h2>Evaluating the Value of the (@)Purge Rule</h2>
(Published 2016-08-14)<br />
<blockquote class="tr_bq">
<blockquote class="tr_bq">
<i>“Only sometimes when we pick and choose among the rules we discover later that we have set aside something precious in the process.” </i> </blockquote>
</blockquote>
<blockquote class="tr_bq">
<blockquote class="tr_bq">
<i>― Helen Simonson, Major Pettigrew's Last Stand</i></blockquote>
</blockquote>
<h3>
Background and Problem Statement:</h3>
I was recently asked the following question: <i>"Is there any value in supporting the character purge rule in Hashcat?"</i> The purge rule '@x' will remove all characters of a specific type from a password guess. So for example the rule '@s' would turn 'password' into 'paword'. The full thread can be found on the Hashcat forum <a href="https://hashcat.net/forum/thread-5661-post-30504.html" target="_blank">here.</a> The reason behind this inquiry was that while the old version of Hashcat implemented the character purge rule, GPU versions of Hashcat and Hashcat 3.0 dropped support for it. Since then, At0m <a href="https://github.com/hashcat/hashcat/commit/8acf5b38797560de0613d77f7eb6d6daee578bcb" target="_blank">added support</a> for the rule back in the newest build of Hashcat which makes this question much less pressing. That being said, similar questions pop up all the time and I felt it was worth looking into if only to talk about the process of investigating problems like this.<br />
<br />
Side note, as evidence that any change will <a href="https://xkcd.com/1172/" target="_blank">break someone's workflow</a>, when researching this topic I did find one user who stored passphrase dictionaries with spaces left intact. They would then use the purge rule to remove the spaces during a cracking session so they wouldn't have to save a second copy of their passphrase wordlist without spaces. For that reason alone I think there is some value in the purge rule.<br />
<br />
<h3>
The Purge Rule Explained:</h3>
<b>Hashcat Rule Syntax:</b> @X where (X) is the character you want to purge from the password guess<br />
<b>Example Rule: </b>@s<br />
<b>Example Input: </b>password<br />
<b>Example Output:</b> paword<br />
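<br />
In Python terms the rule boils down to a one-liner. This is just a minimal sketch of the behavior described above, not how Hashcat implements it, and the function name is my own.<br />

```python
def purge(word, char):
    """Mimic the '@X' rule: drop every occurrence of `char` from `word`."""
    return word.replace(char, "")

print(purge("password", "s"))
print(purge("correct horse battery", " "))
```

The second call is the passphrase use-case from the side note above: purging the space character turns a spaced passphrase into its no-space version.<br />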
<h3>
Hypothesis:</h3>
My gut feeling is that the purge rule will have limited impact on a cracking session. I base that on a rule of thumb that mangling rules work best if they mimic the thought process people use when creating passwords. For example, people often start with a base word and then append digits to it, replace letters with L33t replacements, etc. Therefore rules that mimic these behaviors tend to be more successful. I just don't see many people removing character classes from their password.<br />
<br />
Now if you are a Linux fan, you'll realize Linux developers *love* removing characters from commands. Do you want to change your password? Well "passwd" is the command for you! Maybe Linux developers use the same strategy for their passwords? So I certainly could be wrong. That being said, the whole idea of a hypothesis is to go out on a limb and make a prediction on how an existing model will react so here I go:<br />
<br />
My hypothesis is that the purge rule will crack less than 1 thousand passwords of a 1 million password dataset, (0.1%). Of those passwords cracked, a vast majority (95%), will be cracked due to weaknesses of the input dictionary vs. modeling how the user created the password. For example, 'paword' might be a new Pokemon type that didn't show up in the input dictionary vs being created by a user taking the word 'password' and then removing the S's.<br />
<br />
<h3>
Short Summary of Results:</h3>
The purge ruleset cracked 164 passwords (0.016% of the test set). This was slightly better than just using random rules, which in a test run cracked 23 passwords, but not by much. Supporting this rule is unlikely to help to any noticeable degree with your cracking sessions.<br />
<br />
<h3>
Experimental Setup:</h3>
<b>Test Dataset:</b> 1 million passwords from the newest MySpace leak. These were randomly selected from the full set using the '<b><i>gshuf -n 1000000</i></b>' command.<br />
<b><br /></b>
<b>Reason:</b> Truth be told, the main reason I used the MySpace passwords was I'm getting tired of using the RockYou dataset for everything. That being said, it's useful for this experiment that all of the passwords in that dataset have been converted to lowercase since I don't have to worry about combining case mangling rules with the purge rules.<br />
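<br />
If you don't have gshuf handy, the same kind of random selection can be sketched in Python. This is a minimal example under my own assumptions: the function name is made up, and it reads the whole list into memory, which is fine for a test set but may not be for a very large leak.<br />

```python
import random

def sample_lines(lines, count, seed=None):
    """Return `count` lines chosen uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(list(lines), count)

# Tiny stand-in for the full password list.
population = ["password%d" % i for i in range(10)]
subset = sample_lines(population, 3, seed=1)
print(subset)
```

Passing an open file handle as the first argument works the same way, since iterating a file yields one line at a time.<br />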
<br />
<b>Tools Used: </b>Hashcat for the cracking, and John the Ripper for the --status option<br />
<br />
<b>Rulesets Used:</b> Hashcat's D3ad0ne mangling rules. I broke it up into two different rulesets with one containing the purge rules, (along with a few append/prepend '@' rules that snuck in), and the other one containing all the other mangling rules.<br />
<br />
<b>Reason:</b> D3ad0ne's mangling rules contain about 34 thousand individual mangling rules. Due to its size and the fact that it is included with Hashcat it should make a good example of a ruleset that many Hashcat users are likely to incorporate in their cracking sessions. I initially split the base ruleset into two different subsets, with all rules including the '@' into one ruleset called d3ad0ne_purge, and all the other rules into another one called d3ad0ne_base. I then started manually going through d3ad0ne_purge and placing rules such as "append a @" into the d3ad0ne_base, but with over 1k rules in d3ad0ne_purge I quickly decided to remove the results of the append/prepend '@' after the fact instead of trying to fully isolate only purge rules in their own ruleset.<br />
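<br />
The split itself is simple enough to sketch in Python. The function name and the sample rules below are my own for illustration (they are not lines from the real d3ad0ne file), and the check is the same blunt "does the line contain an '@'" test described above.<br />

```python
def split_ruleset(rules):
    """Split rule lines into (with_at, without_at) based on a plain '@' check."""
    with_at, without_at = [], []
    for rule in rules:
        (with_at if "@" in rule else without_at).append(rule)
    return with_at, without_at

# Hypothetical sample: ':' no-op, 'l' lowercase, '@s' purge s,
# '$@' append @, '^@' prepend @, 'c $1' capitalize then append 1.
sample_rules = [":", "l", "@s", "$@", "^@", "c $1"]
purge_rules, base_rules = split_ruleset(sample_rules)
print(purge_rules, base_rules)
```

Notice that the append and prepend '@' rules land in the purge pile too, which is exactly how they snuck into d3ad0ne_purge.<br />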
<h4>
<br /></h4>
<h4>
Dictionary Used: <span style="font-weight: normal;">I used dic-0294 as my wordlist. Yes there are better input dictionaries out there, but this is a common one and strikes a good balance between size and coverage, plus it is public vs other dictionaries I have that are based on cracked passwords.</span></h4>
<div>
<br /></div>
<h3>
Experimental Results:</h3>
<b>Step 1) </b>Run a normal cracking session on the 1 million MySpace passwords using dic-0294 and D3ad0ne_base. This is important since the purge rule will likely crack many passwords that would be cracked normally with other rules. Running a normal cracking session first removes those passwords so we can focus on passwords that would only be cracked by the purge rules. The command I ran was below, (note, I'm editing some of the path information out of the commands for clarity's sake).<br />
<blockquote class="tr_bq">
<span class="s1"><i><b>./hashcat -D1 -m 100 -a 0 --remove myspace_rand_1m_hc.txt -r rules/d3ad0ne_base.rule dic-0294.txt</b></i></span></blockquote>
A couple of notes about the above command. I'm using a version of Hashcat that I updated on August 10th 2016. I ran it on a very old MacBook Pro so the <b><i>-D1</i> </b>is telling it to use CPU only, (since the GPU doesn't have enough memory). The <i><b>-m 100</b></i> is telling it to crack unsalted SHA-1 hashes. The <i><b>-a 0</b> </i> is to do a basic dictionary attack. <b><i>--remove</i> </b>was to remove any cracked hashes so they aren't counted twice in future cracking sessions. <i><b>myspace_rand_1m_hc.txt</b> </i>is my target set, <b><i>rules/d3ad0ne_base.rule</i></b> is my ruleset, and <i><b>dic-0294.txt</b></i> is my input dictionary. Below are the results of running this first attack.<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheTZQhnp3uIoo7vXJz8zXLhot62jOtG8AGYRpcWssbLriFdUQZspZ0-x1MMspNOTBwXhxENUwrxrc1P9jpplTpn-yUCiux9nq6F2-hXn72GwbbD1rNa0TsVwnk1v6xWD6qQBopvdg-Vg8/s1600/Screen+Shot+2016-08-13+at+10.49.36+PM.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEheTZQhnp3uIoo7vXJz8zXLhot62jOtG8AGYRpcWssbLriFdUQZspZ0-x1MMspNOTBwXhxENUwrxrc1P9jpplTpn-yUCiux9nq6F2-hXn72GwbbD1rNa0TsVwnk1v6xWD6qQBopvdg-Vg8/s1600/Screen+Shot+2016-08-13+at+10.49.36+PM.png" /></a><br />
<br />
With 36% of the passwords cracked by a very vanilla attack on a slow computer, that isn't bad. Next up is running the purge rules.<br />
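<br />
For anyone new to this, here is what Step 1 is doing conceptually, as a minimal Python sketch. This is a toy model under my own assumptions (made-up names, two hand-written rules, a three-hash target set), not how Hashcat works internally: take each dictionary word, apply each mangling rule, SHA-1 the guess, and drop any hash that matches from the remaining set, which is the same idea as --remove.<br />

```python
import hashlib

def sha1_hex(text):
    """Unsalted SHA-1 of a guess, as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()

def dictionary_attack(targets, wordlist, rules):
    """Try every rule on every word; return (cracked, remaining)."""
    remaining = set(targets)
    cracked = {}
    for word in wordlist:
        for rule in rules:
            guess = rule(word)
            digest = sha1_hex(guess)
            if digest in remaining:
                cracked[digest] = guess
                # Same idea as --remove: don't count this hash again later.
                remaining.discard(digest)
    return cracked, remaining

# Hypothetical target hashes and rules, made up for this example.
targets = {sha1_hex("password1"), sha1_hex("letmein"), sha1_hex("zzz")}
rules = [lambda w: w, lambda w: w + "1"]
cracked, remaining = dictionary_attack(targets, ["password", "letmein"], rules)
print(sorted(cracked.values()), len(remaining))
```

Whatever is left in the remaining set after this first pass is what the purge rules get a shot at in Step 2.<br />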
<br />
<b>Step 2) </b>Delete the previous hashcat.pot file. Run a cracking session on the remaining passwords using the purge ruleset. The command I ran was very similar to the one above:<br />
<blockquote class="tr_bq">
<i><b>./hashcat -D1 -m 100 -a 0 myspace_rand_1m_hc.txt -r rules/d3ad0ne_purge.rule dic-0294.txt</b></i></blockquote>
Note, I took off the --remove option since I didn't care about removing cracked hashes for this. I also deleted the previous .pot file of cracked passwords since I only wanted to store passwords associated with this test. Here is a screenshot I took partway through the cracking session:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghdOr1lzQje9dYemweK3MTiGD5cA-4i3JyUy7ao8qg24z7Vi7lajpvAT09LpHlGYJJHurtPiZQZwCiHX1DWoFOQilmU_SnjQH-RlNFMfS7aU-VWmS8QurknFshFHBvICPXiuEJrrWAayQ/s1600/Screen+Shot+2016-08-13+at+10.53.53+PM.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEghdOr1lzQje9dYemweK3MTiGD5cA-4i3JyUy7ao8qg24z7Vi7lajpvAT09LpHlGYJJHurtPiZQZwCiHX1DWoFOQilmU_SnjQH-RlNFMfS7aU-VWmS8QurknFshFHBvICPXiuEJrrWAayQ/s1600/Screen+Shot+2016-08-13+at+10.53.53+PM.png" /></a><br />
<br />
As you can see, many of the cracked passwords were due to "insert a @ symbol" rules rather than the purge rule. Here are the final results:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9WqdkVOmeFy4ldKVEDqQ9I8uw0K6ZeMmJwL3_97134tvmdXQ9ws7DclaMSDFWunXIzc758DvE6R-Bjga_FG5p1PhdJHY6we2wvGgaWBH9g9_afJi3epk_EIYN0nsP_M5oanMTPIs2WRI/s1600/Screen+Shot+2016-08-13+at+10.53.10+PM.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh9WqdkVOmeFy4ldKVEDqQ9I8uw0K6ZeMmJwL3_97134tvmdXQ9ws7DclaMSDFWunXIzc758DvE6R-Bjga_FG5p1PhdJHY6we2wvGgaWBH9g9_afJi3epk_EIYN0nsP_M5oanMTPIs2WRI/s1600/Screen+Shot+2016-08-13+at+10.53.10+PM.png" /></a><br />
<br />
The session managed to crack 405 unique hashes. I then went into the pot file and deleted any password containing the '@' character so what was left was due to the purge rule. This left me a list containing 128 unique passwords. A screenshot is shown below:<br />
<br />
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4l0Y-K9nh1fOnigvHLPw9d7wADIC6A7Wlfwb-84rFEdfkar7_Yr8f3h-YJQCMaqxh70f8imCFBpgAGuJV2P7wZtJ4GyPWqijk69JEWmcrszxeX4OZLlmyOXF4U6JiBB0udwMYshjaCdg/s1600/Screen+Shot+2016-08-13+at+11.00.54+PM.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4l0Y-K9nh1fOnigvHLPw9d7wADIC6A7Wlfwb-84rFEdfkar7_Yr8f3h-YJQCMaqxh70f8imCFBpgAGuJV2P7wZtJ4GyPWqijk69JEWmcrszxeX4OZLlmyOXF4U6JiBB0udwMYshjaCdg/s1600/Screen+Shot+2016-08-13+at+11.00.54+PM.png" /></a><br />
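The filtering step I did by hand on the pot file can be sketched in Python. This is only a minimal sketch: it assumes a Hashcat-style pot file with one hash:plaintext entry per line, the function name is made up for illustration, and the sample hashes are placeholders rather than real data.<br />

```python
def drop_at_sign_cracks(pot_lines):
    """Keep only pot entries whose plaintext has no '@' character."""
    kept = []
    for line in pot_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        # Assumption: unsalted hashes, so the first ':' separates hash from plaintext.
        _hash, _, plaintext = line.partition(":")
        if "@" not in plaintext:
            kept.append(line)
    return kept


# Placeholder entries, not real hashes.
sample = ["aaaa:p@ssword\n", "bbbb:imabear\n"]
print(drop_at_sign_cracks(sample))
```

Splitting on the first colon only keeps plaintexts that themselves contain a ':' intact.<br />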
<br />
Now it's hard to tell what people were thinking when they created these passwords, but glancing through the list, it certainly appeared that most of the cracked passwords were simply due to limitations in my input dictionary vs users purging characters from their passwords. I was actually surprised 'jayden' and 'fatguy' weren't in dic-0294 but after double checking it they were in fact missing from it.<br />
<br />
Now, input dictionaries are always going to be limited to a certain extent so these cracks absolutely count. They only represent unique cracked hashes though. For example, if 20 people used the password 'imabear' it would only be counted once. To figure out how many total accounts would have been cracked, I re-ran the above dictionary through John the Ripper against the myspace_1m_rand list. This was to get the files into John's cracked file (pot) format. For example here is 'imabear' in john.pot:<br />
<br />
<blockquote class="tr_bq">
<span class="s1"><b>{SHA}QiPoQuc4sqqs3J+OulWLt3H09kY=:imabear</b></span></blockquote>
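For anyone curious how an entry in that format could be built, here is a rough Python sketch. It assumes the {SHA} prefix means "base64 of the raw SHA-1 digest", which is how the entry above appears to be encoded; the function name is hypothetical, and I'm not claiming the output matches the john.pot line byte for byte.

```python
import base64
import hashlib


def to_sha_pot_entry(password):
    """Format a plaintext as '{SHA}' + base64(SHA-1 digest) + ':' + plaintext."""
    # Assumption: the plaintext is encoded as UTF-8 before hashing.
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return "{SHA}" + encoded + ":" + password


print(to_sha_pot_entry("imabear"))
```

A 20-byte SHA-1 digest always encodes to 28 base64 characters, which matches the length of the hash portion in the example entry.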
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">The reason I did this was because JtR has a really cool feature '<b><i>-show</i></b>' that will match up cracked passwords with the accounts in the target set. Running the command:</span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<blockquote class="tr_bq">
<span class="s1"><b>./john -format=raw-sha1 -show myspace_rand_1m_clean.txt</b></span></blockquote>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
resulted in the following output:</div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbPQQtTrSOImBKxuWfI3v1lTr8oV_akAJrcN97H2GVzjN2Gb0sNQ0XwD8SJ_2ydJ2jKkyykzMNTcuL2jL0KzUkCx8iLK0QlKPthyCmtGLMXxmBWUvBlbCKfjmXjJqhBJ3ILcqyenmWb0Y/s1600/Screen+Shot+2016-08-14+at+8.21.15+PM.png" imageanchor="1"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgbPQQtTrSOImBKxuWfI3v1lTr8oV_akAJrcN97H2GVzjN2Gb0sNQ0XwD8SJ_2ydJ2jKkyykzMNTcuL2jL0KzUkCx8iLK0QlKPthyCmtGLMXxmBWUvBlbCKfjmXjJqhBJ3ILcqyenmWb0Y/s1600/Screen+Shot+2016-08-14+at+8.21.15+PM.png" /></a></span></div>
<div class="p1">
<span class="s1"><br /></span></div>
<div class="p1">
<span class="s1">Therefore the purge rules cracked a total of 164 passwords from the test set, or 0.0164% of the total. That's a really small amount. Admittedly every password cracked is nice, but still I was curious if the purge rules were better than just running random mangling rules instead. Luckily, Hashcat supports a command to test that out:</span></div>
<blockquote class="tr_bq">
<b><span class="s1">./hashcat -D1 -m 100 -a 0 myspace_rand_1m_hc.txt -g 500 dic-0294.txt</span></b></blockquote>
The only difference between the above command and the previous Hashcat commands I ran was that instead of a rules file I specified '<b><i>-g 500</i></b>'. What that does is tell Hashcat to generate 500 random rules to run on the input dictionary. I chose that number since there were over a thousand rules in my D3ad0ne_purge dictionary and I guesstimated that about half of them were actual purge rules. When I ran the above I ended up cracking 23 more passwords. That's significantly less than the 164 the purge rules did but in the grand scheme of things it was about the same in effectiveness. Considering some of those rules were likely duplicates of rules in the D3ad0ne_base ruleset as well, I'd argue that running a purge rule is about equivalent to running a random mangling rule. In fact if you don't already have purge rules in your mangling set, I'd probably recommend not worrying about it and just running a brute force method like Markov mode to stretch your dictionary instead.<br />
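<br />
To put the two results side by side, here is the arithmetic as a small Python sketch. The crack counts come from the runs above; the one-million test set size is inferred from the file name and the 0.0164% figure, and the function name is just for illustration.<br />

```python
TEST_SET_SIZE = 1_000_000  # accounts in the random MySpace sample (inferred)
PURGE_CRACKS = 164         # passwords cracked via the purge rules
RANDOM_CRACKS = 23         # passwords cracked by the 500 random rules (-g 500)


def percent_of_test_set(cracks, total=TEST_SET_SIZE):
    """Express a crack count as a percentage of the test set."""
    return 100.0 * cracks / total


for label, cracks in (("purge", PURGE_CRACKS), ("random", RANDOM_CRACKS)):
    print(f"{label}: {percent_of_test_set(cracks):.4f}% of the test set")
```

Either way, both numbers are a rounding error next to the 36% that the base ruleset cracked.<br />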
<br />
<h3>
Conclusion:</h3>
<div>
For once my gut feeling was right and the value of Hashcat's purge rule '@' was limited in the tests that were run. That's not to say that it's not useful. It may help when targeting certain users or aid in keeping the size of your dictionary files on disk manageable. But at the same time, it's not a major feature that other password crackers should rush to mimic. I hope this blog post was informative in helping show different ways to evaluate the effectiveness of a mangling technique. If you have any questions, comments or suggestions please feel free to leave them in the comments section.</div>
Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com1tag:blogger.com,1999:blog-496451536493805371.post-74252621230436561512016-07-07T17:52:00.000-07:002016-07-07T17:52:23.852-07:00Cracking the MySpace List - First Impressions<blockquote class="tr_bq">
<em>Alt Title: An Embarrassment of Riches</em></blockquote>
<h3>
Backstory:</h3>
Sometime around 2008, a hacker or disgruntled employee managed to break into MySpace and steal all the usernames, e-mails, and passwords from the social networking site. This included information covering more than 360 million accounts. Who knows what else they stole or did, but for the purposes of this post I'll be focusing only on the account info. For excellent coverage of why the dataset appears to be from 2008 let me refer you to the always superb <a href="https://www.troyhunt.com/dating-the-ginormous-myspace-breach/" target="_blank">Troy Hunt's blog post on the subject</a>. Side note, most of my information about this leak also comes from Troy's coverage.<br />
<br />
This dataset has been floating around the underground crime markets since then, but didn't gain widespread notoriety until May 2016 when an advertisement offering it for sale was posted to the "Real Deal" dark market website. Then on July 1st, 2016, another researcher managed to obtain a copy and then posted a public torrent of the entire leak for anyone to download. That's where things stand at this moment.<br />
<br />
<h3>
Unpacking the Dataset:</h3>
The first thing that stands out about the dataset is how big it is. When uncompressed the full dump is 33 Gigs. Now, I've dealt with database dumps of similar size but they always included e-mails, forum posts, website code, etc. The biggest password dataset I previously had the chance to handle was the RockYou set, which weighed in at 33 million passwords and took up 275 MB of disk. Admittedly that didn't include user info and passwords were stored as plaintext (the plaintexts are generally shorter than the hex representation of hashes), but still that's a huge leap in data to process. Heck, even the full RockYou list is a bit of a pain to process.<br />
<br />
Let me put this another way. Here is a simple question, "How many accounts are in the MySpace list?" Normally that's quick and easy. Just run:<br />
<blockquote class="tr_bq">
<strong>wc -l</strong></blockquote>
And then you wait ... and wait ... and wait ... and then Google if there is a faster way to count lines ... and then wait. 16 minutes and 24 seconds later, I found out there were 360,213,049 lines in the file. Does that equal the number of total accounts or is there junk in that file? Well, I don't want to spend the 30+ minutes to run a more complicated parser so that sounds about right to me ¯\_(ツ)_/¯. Long story short, doing anything with this file takes time. Eventually I plan on moving over to a computer with an SSD and more hardware which should help but it's something to keep in mind.<br />
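<br />
If you'd rather count lines from Python, a chunked binary read is one way to do it. This is a minimal sketch with a hypothetical function name; I'm not claiming it beats wc -l, since on a spinning disk most of the time is spent on I/O either way.<br />

```python
def count_lines(path, chunk_size=1024 * 1024):
    """Count newline bytes by reading the file in fixed-size binary chunks."""
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += chunk.count(b"\n")
    return total
```

Calling count_lines("Myspace.com.txt") would count the records without ever holding more than one chunk in memory.<br />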
<br />
That being said, the next question is "What does the data look like?" Well here is a screenshot of the first couple of lines.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_IsqH9INwCPRAj89QL1hJ5jAhctQh5qPFxszsUU978yB91uNIHAgPwUML4hie72ivHK8Ux8k4eaJBHOqjCnA2z7COwqec2nS9HNnyl0JQqTgeg234DnIKe_5Y1Mf3q8pbvAJ3baLY290/s1600/start_of_list.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="145" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg_IsqH9INwCPRAj89QL1hJ5jAhctQh5qPFxszsUU978yB91uNIHAgPwUML4hie72ivHK8Ux8k4eaJBHOqjCnA2z7COwqec2nS9HNnyl0JQqTgeg234DnIKe_5Y1Mf3q8pbvAJ3baLY290/s640/start_of_list.png" width="640" /></a></div>
<br />
As you can see, it takes the form of a unique ID that increments, e-mail address, username, and then two hashes. All of the fields except the unique ID can be blank. To answer the next question, "Why two hashes?" well ... ¯\_(ツ)_/¯. That's something I plan on looking at but I haven't gotten around to it yet.<br />
<br />
<span style="color: red;">Update: 7/7/16: </span>Just as I was finalizing this post, I ran across CynoSure Prime's analysis where they managed to crack almost every single hash in this dataset. You can find their blog post <a href="http://cynosureprime.blogspot.com.au/2016/07/myspace-hashes-length-10-and-beyond.html" target="_blank">here</a>. It turns out the second hash is actually the original password, (full length with upper case characters) salted with the user_id. I'm going to leave most of this blog entry unmodified even though how to parse the list can certainly be optimized based on this new info. <span style="color: red;"></Update></span><br />
<br />
Other random tidbits: The final unique ID is 1005290998. That's significantly higher than the number of accounts in this dataset so there are large chunks of accounts that were deleted at some point in time. My guess is when a user deleted their MySpace account it really was deleted in which case, kudos to MySpace for doing that! That's just a guess though. As you would expect the first accounts were administrative accounts and system process accounts. I know I blocked out the user e-mails but I will admit I googled the first name. When I found his LinkedIn profile my first reaction was, "Wow, he needs to brag about his accomplishments more than just saying:"<br />
<blockquote class="tr_bq">
<span style="background-color: white; color: #333333; font-family: "arial" , sans-serif; font-size: 13px; line-height: 17px;"><i><b>Developed, and launched the initial Myspace community which currently has over 100 million members and was acquired by Fox Corp. for $580 million.</b></i></span></blockquote>
I mean if it was me I would post that database dump on my resume! Of course further googling led me to the book "<a href="https://books.google.com/books?id=c-lEzyA4TSQC&printsec=frontcover#v=onepage&q&f=false" target="_blank">Stealing MySpace</a>." I started reading about all the drama that went on, and suddenly there went my evening. Needless to say, the general layout of the dataset looks legit but one more interesting fact was all those gmail accounts. MySpace was created in 2003, Gmail opened for invitation access in 2004, and the lead engineer of MySpace left in 2003. So employees were able to update their accounts after they had left the company. Once again, kudos to MySpace but that was surprising.<br />
<br />
<h3>
Password Hash Format:</h3>
<div>
I initially learned from Troy Hunt's posts that the hashes were unsalted SHA1 with the plaintext lowercased and then truncated to 10 characters long. Therefore the password:</div>
<blockquote class="tr_bq">
123#ThisIsMyPassword</blockquote>
would be saved as:<br />
<blockquote class="tr_bq">
123#thisis</blockquote>
I've heard some people say that this means hackers can just brute force the entire key-space. If I was feeling nit-picky I could argue *technically* that's beyond the reach of commercial setups as 70^10 is still a really big number (27 characters + 10 digits, + 33 special characters). In reality though by intelligently searching the key-space, (who uses commas in their password?), a vast majority of unsalted password hashes can be cracked under that format. It's a bit of a moot point though since the real issue is using such a fast unsalted hash. Ah 2008, when it was still acceptable to claim ignorance for using a bad hashing set-up.<br />
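<br />
To give a feel for that number, here is the back-of-the-envelope math as a Python sketch. The charset size and length come from the paragraph above; the guessing rate is a made-up round number for illustration, not a benchmark of any real rig.<br />

```python
CHARSET_SIZE = 70   # figure used in the paragraph above
MAX_LENGTH = 10     # passwords were truncated to 10 characters
HASHES_PER_SECOND = 10_000_000_000  # hypothetical rate, not a measurement

# Number of candidates that are exactly MAX_LENGTH characters long.
keyspace = CHARSET_SIZE ** MAX_LENGTH
seconds = keyspace / HASHES_PER_SECOND
years = seconds / (365 * 24 * 3600)

print(f"{keyspace:,} candidates of exactly {MAX_LENGTH} characters")
print(f"about {years:.1f} years at the assumed rate")
```

Swap in your own hash rate to see how the picture changes; shorter lengths add comparatively little to the total.<br />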
<br />
Long story short, from my experiments so far I can confirm that it appears all the hashes had their plaintexts lowercased and truncated to 10 characters. Also, yes, serious attackers are very likely to crack almost every password in this list.<br />
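<br />
The storage scheme described above can be sketched in Python like so. The function names are mine, and encoding the plaintext as UTF-8 before hashing is an assumption; the lowercase-then-truncate behavior is the part taken from the post.<br />

```python
import hashlib


def myspace_normalize(password):
    """Lowercase the password and keep only its first 10 characters."""
    return password.lower()[:10]


def myspace_sha1(password):
    """Unsalted SHA-1 hex digest of the normalized password (UTF-8 is an assumption)."""
    return hashlib.sha1(myspace_normalize(password).encode("utf-8")).hexdigest()


print(myspace_normalize("123#ThisIsMyPassword"))  # 123#thisis
# Two different long passwords collapse to the same stored hash.
print(myspace_sha1("123#ThisIsMyPassword") == myspace_sha1("123#thisisDIFFERENT"))  # True
```

This is why a cracked plaintext from this hash only tells you the first 10 lowercased characters of what the user actually typed.<br />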
<br />
<h3>
Cracking MySpace Passwords With John the Ripper (Take 1):</h3>
After glancing around the dataset, the next thing I wanted to do was start cracking. To do this, I needed to extract and format the hashes. My first attempt to do this yielded the following script:<br />
<blockquote class="tr_bq">
<span style="background-color: white; color: #333333; font-family: "consolas" , "menlo" , "monaco" , "lucida console" , "liberation mono" , "dejavu sans mono" , "bitstream vera sans mono" , monospace , serif; font-size: 12px; line-height: 24px;"><b>cat Myspace.com.txt | awk -F':' '{if (length($2) > 3) {print "myspace_big_hash1:" substr($4,3); if (length($5) > 3) {print "myspace_big_hash2:" substr($5,3)}}}' > myspace_clean_big.hsh</b></span></blockquote>
To point out a couple of features, I was labeling my data-sets so they are correctly identified in my input file, (I maintain different input files for different data sets but still having that name there has saved me trouble in the past), and I was removing blank hashes. Also I was stripping the username and e-mail addresses since I really didn't want to see passwords associated with names. The problem was the resulting file was huge. I didn't save it, but it was bigger than the original list! I couldn't afford the full naming convention. Therefore I switched to the following script:<br />
<blockquote class="tr_bq">
<b style="color: #333333; font-family: Consolas, Menlo, Monaco, "Lucida Console", "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", monospace, serif; font-size: 12px; line-height: 24px;">cat Myspace.com.txt | awk -F':' '{if (length($2) > 3) {print substr($4,3); if (length($5) > 3) {print substr($5,3)}}}' > myspace_temp.hsh</b></blockquote>
And then to remove duplicates I ran:<br />
<blockquote class="tr_bq">
sort -u myspace_temp.hsh > myspace_big.hsh</blockquote>
The resulting file was a little under 8 gigs which was better. Problems occurred though when I tried to load the resulting hash file into JtR. More specifically after letting it run overnight, JtR still hadn't loaded up the password list and started making guesses. That kind of makes sense. That's way more passwords than normal to parse and my laptop only had 8 gigs of ram so even in an ideal case the whole list probably couldn't be stored in memory. That's not an ideal cracking situation. Being curious, I then decided to try and load it up in Hashcat.<br />
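<br />
For reference, the extraction and de-duplication steps from this section can also be sketched in Python. This is not a translation of my awk one-liner: it assumes records look like id:email:username:hash1:hash2, that the first two characters of each hash are a prefix to drop (as substr does above), and it filters on the hash fields themselves. The sample records are invented.<br />

```python
def extract_hashes(lines):
    """Yield hashes from 'id:email:username:hash1:hash2' records, minus a 2-character prefix."""
    for line in lines:
        fields = line.rstrip("\n").split(":")
        # Fields 4 and 5 (indexes 3 and 4) hold the two hashes; either may be blank.
        for value in fields[3:5]:
            if len(value) > 3:
                yield value[2:]  # drop the leading two characters, like substr($N,3)


# Invented records for illustration only.
sample = [
    "1:user@example.com:someuser:0xAAAA1111:0xBBBB2222\n",
    "2:::0xCCCC3333:\n",
]
unique = sorted(set(extract_hashes(sample)))
print(unique)  # ['AAAA1111', 'BBBB2222', 'CCCC3333']
```

Holding every hash in a Python set won't scale to the full dump on an 8 gig laptop, so for the real file streaming the output to disk and running sort -u is still the practical route.<br />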
<br />
<h3>
Cracking MySpace Passwords With Hashcat:</h3>
Loading up the dump in Hashcat was interesting since it gave me warnings about records in the dataset that weren't parsed correctly.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ1ZpO7gmACiwwVSPWsX0uRGZ506713c5Jyjszxx0i77T-LWERjCNlMQ_xqgPc6f8Rf9CNMJW3lIhpuragwT7-uUsyxs_Zp38ASYcPrXlYV-zznyEf8T4tJvHyxXYGDFLhr6FGAC-qrRw/s1600/line_length.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="142" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjJ1ZpO7gmACiwwVSPWsX0uRGZ506713c5Jyjszxx0i77T-LWERjCNlMQ_xqgPc6f8Rf9CNMJW3lIhpuragwT7-uUsyxs_Zp38ASYcPrXlYV-zznyEf8T4tJvHyxXYGDFLhr6FGAC-qrRw/s400/line_length.png" width="400" /></a></div>
<br />
Regardless, once all was said and done, I ended up with the following error:<br />
<blockquote class="tr_bq">
<b>ERROR: cuMemAlloc() 2</b></blockquote>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUoL9e-DlYx77Fj4UH0BzTuwjKbWrygyC6A0Xkabnvd0Ij7FZ02e2M4D45WZskprC-GDEXlA_UVZTmxUEy3W3EbHRCv3gJceZRPywSQBqeRC0qhk4C6D3LC0l2FP1qerE5ZPN2B7raAmU/s1600/hashcat_full_myspace.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="321" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUoL9e-DlYx77Fj4UH0BzTuwjKbWrygyC6A0Xkabnvd0Ij7FZ02e2M4D45WZskprC-GDEXlA_UVZTmxUEy3W3EbHRCv3gJceZRPywSQBqeRC0qhk4C6D3LC0l2FP1qerE5ZPN2B7raAmU/s640/hashcat_full_myspace.png" width="640" /></a></div>
<br />
<br />
Doing some quick Googling, I found out the cause was that the GPUs ran out of memory trying to load the hashes. Not surprising but it meant I had to take a different approach if I wanted to crack any hashes from this set.<br />
<br />
The easiest way to do this was to split the full list up into smaller chunks and then crack each section by itself. One way to do that is with the split command:<br />
<blockquote class="tr_bq">
split -l 5000000 myspace_big.hsh myspace_split_</blockquote>
This will break up the list into 5 million hash chunks that follow the line of myspace_split_aa, myspace_split_ab .... The downside is since you have to crack each file individually, the total cracking time has been increased by close to a factor of 40. I'd recommend playing with the file size to maximize the total number of hashes per file that your GPU supports. On the plus side, after all that I can now finally crack passwords!<br />
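<br />
Here is where the "factor of 40" comes from, as a quick Python sketch. The file size is rounded up from "a little under 8 gigs", and the 41 bytes per line (40 hex characters plus a newline) is my assumption about the file layout.<br />

```python
import math

FILE_BYTES = 8 * 10**9       # "a little under 8 gigs", rounded up
BYTES_PER_LINE = 41          # assumption: 40 hex characters plus a newline
LINES_PER_CHUNK = 5_000_000  # chunk size of 5 million hashes

total_hashes = FILE_BYTES // BYTES_PER_LINE
chunks = math.ceil(total_hashes / LINES_PER_CHUNK)
print(total_hashes, chunks)
```

Raising LINES_PER_CHUNK to whatever your GPU memory tolerates shrinks the chunk count, and with it the number of times you repeat each attack.<br />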
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjxU82tNDLu7v6pxBqxDQMm-iZtbTQvffIQidH7HYtZFyjWm2mTNw2vssmJquuDzaIlSSMIetnjjA1pB4y1fm6-K9zlADszLbVxUwg4Th8Gaio71FL4tDLbXkP8eFYuH4tlJSkqHCU1Cw/s1600/cracking.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="285" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhjxU82tNDLu7v6pxBqxDQMm-iZtbTQvffIQidH7HYtZFyjWm2mTNw2vssmJquuDzaIlSSMIetnjjA1pB4y1fm6-K9zlADszLbVxUwg4Th8Gaio71FL4tDLbXkP8eFYuH4tlJSkqHCU1Cw/s320/cracking.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Finally cracking passwords</td></tr>
</tbody></table>
<br />
One issue I had was that there were so many hashes cracking all the time that it was hard to see the status of my session. It's not that my attack was effective, but with a list that large it's hard not to crack something. I belatedly realized I could pause hashcat, print the status and then resume. Or, as Jeremi Gosney replied on Twitter, I could have used the following switch with Hashcat:<br />
<blockquote class="tr_bq">
-o /dev/null </blockquote>
<h3>
<b>Closing Thoughts:</b></h3>
<div>
I'll admit I'm writing this conclusion with CynoSure Prime's analysis fresh in my mind. While the MySpace list is great for giving me a real world challenge to knock my head against, I'm not sure how useful it'll be from a research perspective. The 66 million salted hashes that were created from the original plaintexts will be nice for new training and testing sets so researchers don't have to keep using RockYou for everything. That being said, MySpace is actually an older list than RockYou. Also I fully expect there to be a lot of overlap in the passwords between the two datasets. RockYou's entire business model was allowing apps to work across multiple social networking sites in the era before federated logins. RockYou was storing MySpace + LiveJournal + Facebook passwords in the clear so its app could cross-post across all of them. Statistically I expect MySpace and RockYou to be very similar. </div>
<div>
<br /></div>
<div>
What worries me though, and what makes the MySpace list special, is it has user information associated with all those 360 million accounts + password hashes. Just about everyone who did any social networking and is between the ages of 24 and 40 is in this dump. I realize this list has been in the hands of criminals for the last eight years and a lot of the damage has already been done. Still, now that this list is public it enables many more targeted attacks to be carried out by malicious actors from all over the internet. How long before we start seeing the top 100 celebrity passwords posted on sites like Gawker? What about exes using this information against former partners? Previous public password dumps have been much more limited or didn't contain e-mail addresses. I really don't know what will happen with this one. Hopefully I'm being overly paranoid but it's hard not to think about the downsides associated with this dump being widely distributed. On the plus side, hopefully this is the only mega-breach we'll see with weak password storage. Sites like Google and Facebook are now using very strong hashes which will limit a lot of damage if their user information is disclosed in the future.</div>
Matt Weirhttp://www.blogger.com/profile/16111343330590419341noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-16702899222118854432016-05-11T18:53:00.000-07:002016-05-11T19:01:52.039-07:00Getting Started With Quantum Computing<blockquote class="tr_bq">
<i>“More often than not, the only reason we need experiments is that we're not smart enough.” </i><i>― Scott Aaronson</i></blockquote>
IBM is currently offering free time on one of their quantum computers for interested researchers. Yup, you can program a real life quantum computer right now! In fact, I highly recommend signing up which you can do <a href="http://www.research.ibm.com/quantum/" target="_blank">here</a>. Go ahead and check it out. It took me about 24 hours to get my account approved so you can come back here afterwards to finish reading this post.<br />
<br />
What got me interested in this opportunity was that while I have tried to keep up on the field of quantum computing, it basically is magic to me. I've been building up some general rules in my head about quantum systems, but any sort of question about them that did more than scratch the surface left me shrugging my shoulders. Also it was hard to separate fact from fiction.<br />
<blockquote class="tr_bq">
<b>Quantum Laws (in Matt's head):</b><br />
<ol>
<li>Quantum is a system like everything else. </li>
<li>A quantum state is a configuration of the system.</li>
<li>A quantum state changes; it naturally wants to evolve, but it can always be undone.</li>
<li>Evolution of a closed system is a unitary transformation on its Hilbert space.</li>
<li>Only the Keeper can block quaffle shots thrown by the opposing team</li>
<li>Do not feed your qubits after midnight</li>
</ol>
</blockquote>
That's why IBM's offer interested me so much. Let's be honest, there's always going to be some magic when it comes to quantum systems, but the opportunity to actually get hands on time programming one would at least turn the whole experience into alchemy if not science for me.<br />
<br />
<h3>
<b>Participating in IBM's Quantum Experience:</b></h3>
After your account is approved you immediately have access to a research portal which IBM calls the "Quantum Experience". It's currently in Beta, but beyond a few bugs in the composer, (which I'll talk about in a bit), it's a very well polished site.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuE-8vln7ROgpwd76O7FYAkiQH13jdONhRT7zUXtk6dtU_ytcdIB1knice8iu4Hj2nJPvi6ZgnKgKYv4lKNRkk_9mVuAboJRFAN1yWsePvZgaaFvQ0SkErvrpEMX0NDpdN7vc_whtHwFk/s1600/welcome.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="123" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhuE-8vln7ROgpwd76O7FYAkiQH13jdONhRT7zUXtk6dtU_ytcdIB1knice8iu4Hj2nJPvi6ZgnKgKYv4lKNRkk_9mVuAboJRFAN1yWsePvZgaaFvQ0SkErvrpEMX0NDpdN7vc_whtHwFk/s320/welcome.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Are you ready to experience some quantums!?</td></tr>
</tbody></table>
The portal is divided into three tabs, "User Guide", "Composer", and "My Scores". The User Guide is fairly self explanatory but actually impressed me more than the quantum computer itself. I'm still making my way through it but the authors deserve a pat on the back since it's some of the best technical writing I've seen in a while. What's more, there are multiple links to the quantum simulator with examples for each section so you can read about a particular operation or theory and then run a simulation of it and check the results. You can then modify the example, re-run it, and in general play around with the concept before going back to where you were in the user's guide.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVoiQlNEEwR9eBLzxQOJQgCIIUjpMJpGmz87k2whPgHxFps8VvZvYfESmRMEVz6CW576Gmm9N6VEAd6rEWlpzi6q1sUTYyU5Gkyrk3hZzyllk9_XqOjQWPeYFU6uXF7pzFMaShjMdgAzM/s1600/user_guide.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="190" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVoiQlNEEwR9eBLzxQOJQgCIIUjpMJpGmz87k2whPgHxFps8VvZvYfESmRMEVz6CW576Gmm9N6VEAd6rEWlpzi6q1sUTYyU5Gkyrk3hZzyllk9_XqOjQWPeYFU6uXF7pzFMaShjMdgAzM/s320/user_guide.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Don't worry, it starts out with simpler concepts.</td></tr>
</tbody></table>
The Composer is the programming interface for the quantum computer. It is attached to a quantum simulator as well. In it, you write "Scores" which are basically circuits to run on the quantum computer. IBM calls them scores since with five qubits to work with it looks like sheet music. That's also how the composer got its name.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVDPoRKIAnxcr2K_0077U-fBjPFV3gwbYy_DWfK0gaEjGv-BcNSNO1dpo6IlgjK_za3JFGcYuLokDVYNV5sf7UsIALPy0sJITIjhZZhSYC1VyPJ2hAiDywURDnAD2pC5T_J-1jsmahPSY/s1600/composer.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="151" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgVDPoRKIAnxcr2K_0077U-fBjPFV3gwbYy_DWfK0gaEjGv-BcNSNO1dpo6IlgjK_za3JFGcYuLokDVYNV5sf7UsIALPy0sJITIjhZZhSYC1VyPJ2hAiDywURDnAD2pC5T_J-1jsmahPSY/s320/composer.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An example score in the composer. Yes, this is the default example for Grover's algorithm, but I renamed it since it's all about how you frame the problem.</td></tr>
</tbody></table>
You can simulate a given quantum score as much as you'd like. When doing so, (or creating a new score), you have the option of choosing an ideal or real layout. The difference is that there are physical limitations of the real quantum computer which directly impact how you design your score.<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifnrCLdkNpytMXpE1cbRBWwRPT9mE0kuZrQV5Ugz3yP0UTjtFn-bErnBKGA7rerCFPgkl5GN6IJgFEGmpcLNhaOA_e3rJJrI7yCJ4k_qFT_Q2uEt82m1jWcnrLc6GIaaQvd7bPtJL3VE8/s1600/ideal_or_real.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="176" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifnrCLdkNpytMXpE1cbRBWwRPT9mE0kuZrQV5Ugz3yP0UTjtFn-bErnBKGA7rerCFPgkl5GN6IJgFEGmpcLNhaOA_e3rJJrI7yCJ4k_qFT_Q2uEt82m1jWcnrLc6GIaaQvd7bPtJL3VE8/s320/ideal_or_real.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Red pill or blue pill?</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc9QAh8ZlOMdkA_Na6RNbqFYdo1IjLLEuX2ZpWibhoO1cdOggLU8FHtn19fuvFNQtLXu__36_TvnKvpEZto8fRrqiAUtvcqaOMdpaRNAoBn9KlfDRDarZH97A5lKzQSZ7lnDTSZdV-1s4/s1600/physical_setup.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="81" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhc9QAh8ZlOMdkA_Na6RNbqFYdo1IjLLEuX2ZpWibhoO1cdOggLU8FHtn19fuvFNQtLXu__36_TvnKvpEZto8fRrqiAUtvcqaOMdpaRNAoBn9KlfDRDarZH97A5lKzQSZ7lnDTSZdV-1s4/s320/physical_setup.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Qubit 2 is the gatekeeper</td></tr>
</tbody></table>
That's one of the neat things about using this service vs a standard quantum simulator. You can see some of the limitations that current implementations have to deal with. For example, Qubit 2 is the only qubit that can talk to other qubits, so if you want to perform operations like conditional NOTs, (CNOT), that has a huge impact.<br />
<br />
<h3>
<b>Running it For Real:</b></h3>
<div>
That's all fun but the real reason you are probably using IBM's service is to actually run programs on their quantum computer. I'll admit the "good old days" of punch card mainframes were before my time but the whole setup is somewhat similar. You are given "Units" which are then used up when you run a program vs simulate it. IBM currently is being very generous with giving them out and you can request more for free. The typical program uses around 3 units to run. The results are probabilistic, so each run can be made up of multiple "shots", and then in the end the average of the results is presented to you. Further display options, such as Bloch spheres where the results are plotted as a vector on a 3D sphere, take even more shots to generate.</div>
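<div>
To see why multiple shots matter, here is a toy Python sketch that fakes repeated measurements of a single qubit using a classical random number generator. It has nothing to do with IBM's actual API; the function name and the fixed seed are my own choices for illustration.</div>

```python
import random


def run_shots(probability_of_one, shots, seed=0):
    """Simulate measuring one qubit repeatedly; return the fraction of 1 outcomes."""
    rng = random.Random(seed)
    ones = sum(1 for _ in range(shots) if probability_of_one > rng.random())
    return ones / shots


# A qubit in an equal superposition should read 1 about half the time.
for shots in (1, 100, 10000):
    print(shots, run_shots(0.5, shots))
```

<div>
A single shot can only ever report 0 or 1, so it takes many of them before the average settles near the underlying probability.</div>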
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL9k8LiyLy9aEJWdsaQcfewiImbrl5EIGpP1ar3ShA8MbQ_xET6P3dNhKSrycaYbf62BgeoAWI5m71_u4w5V2oYMzsyFho_XvbraoB2rrfJ4makIjN2lmJtrfhxofB-6bA_woFoWceePs/s1600/running_for_real.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiL9k8LiyLy9aEJWdsaQcfewiImbrl5EIGpP1ar3ShA8MbQ_xET6P3dNhKSrycaYbf62BgeoAWI5m71_u4w5V2oYMzsyFho_XvbraoB2rrfJ4makIjN2lmJtrfhxofB-6bA_woFoWceePs/s320/running_for_real.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">I feel a bit guilty about not just running this once, but it's the same price!</td></tr>
</tbody></table>
<div>
To further help you save units, as well as get you the results sooner, if your quantum score has previously been run by someone else, IBM will give you the option to see the saved results vs re-running it yourself.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3LtXvg5YNWy3LC6IgJfWo0fVUknQ41BDOh3ZyIl52gwAuwE_PA77JbFTVJ60Q3EeTwqiQGj1-BgxlwtYyDB38DmDO1t7cVK3oSHLrs7eRGCvnseRou8RMzbBmLLoCJiiwF2p68RA4qOY/s1600/results_from_cache.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="184" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg3LtXvg5YNWy3LC6IgJfWo0fVUknQ41BDOh3ZyIl52gwAuwE_PA77JbFTVJ60Q3EeTwqiQGj1-BgxlwtYyDB38DmDO1t7cVK3oSHLrs7eRGCvnseRou8RMzbBmLLoCJiiwF2p68RA4qOY/s320/results_from_cache.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Well I guess I wasn't that original...</td></tr>
</tbody></table>
<div>
If you do choose to run your program you are added to the queue. So far, most of my results have been available within a couple of minutes.</div>
<div>
<br /></div>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-a2tgdncDhfFSjPpY9HyVkD_XGTDTeODwwysoMrLAGMy_Bwc6KZ6ViSb4P8E7GWRQOcJtRseO_nMEHy2DgM6PxtrmXvZwS872ZW_9fy90rwO_m39sJnTYRBv76Q9bRManL5NkUyhkk8U/s1600/queue.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="145" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh-a2tgdncDhfFSjPpY9HyVkD_XGTDTeODwwysoMrLAGMy_Bwc6KZ6ViSb4P8E7GWRQOcJtRseO_nMEHy2DgM6PxtrmXvZwS872ZW_9fy90rwO_m39sJnTYRBv76Q9bRManL5NkUyhkk8U/s320/queue.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Someone else is quantuming ahead of me</td></tr>
</tbody></table>
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUlxkwuXBB_dhdBzgJt4ReFqstU_e3aH-sJwg271M14-h75emEBkofg6RI9iQzBbDTN6Ppfd-1RcXhZkSto5fhW04qnz_mIyHgU2-_CP9Sp42dKc8yxxXKD9zdGithdSqvnxq2drYcO6k/s1600/results.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="227" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgUlxkwuXBB_dhdBzgJt4ReFqstU_e3aH-sJwg271M14-h75emEBkofg6RI9iQzBbDTN6Ppfd-1RcXhZkSto5fhW04qnz_mIyHgU2-_CP9Sp42dKc8yxxXKD9zdGithdSqvnxq2drYcO6k/s320/results.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Looks like the plain-text is '00'. As I said, it's all about framing the problem.</td></tr>
</tbody></table>
<div>
Remember earlier when I said the runs were probabilistic? You can really see that in the results above. The correct answer was '00', but around 24% of the time a different answer was chosen. </div>
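To make the "shots" idea concrete, here is a minimal sketch that simulates a run as repeated random sampling. It does not talk to IBM's service. The 76% figure for '00' comes from the roughly 24% error rate mentioned above; the even split across the other three outcomes, the function name run_shots, and the shot count are all assumptions made for illustration.<br />

```python
import random
from collections import Counter


def run_shots(probabilities, shots, seed=0):
    """Sample `shots` measurement outcomes and return the observed frequencies."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    outcomes = list(probabilities)
    weights = [probabilities[o] for o in outcomes]
    counts = Counter(rng.choices(outcomes, weights=weights, k=shots))
    return {o: counts[o] / shots for o in outcomes}


# Assumed distribution: '00' about 76% of the time, with the remaining 24%
# split evenly across the other outcomes (the even split is a guess).
assumed = {"00": 0.76, "01": 0.08, "10": 0.08, "11": 0.08}
freqs = run_shots(assumed, shots=1024)
print(freqs)
```

With only a handful of shots the observed frequencies can wander a long way from the underlying distribution, which is one reason a single run is made up of many shots.<br />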
<div>
<br /></div>
<div>
<h3>
<b>Issues With the Composer:</b></h3>
</div>
<div>
I need to submit bug reports, (the bug icon is prominently displayed in the lower right corner of the screen on the portal site), but I've been hesitant to since all the issues I've run into have been very minor. Sometimes the composer gets a bit wonky, (gates get stuck or aren't saved when you run your simulation), but the problem goes away when I refresh my screen. Also, it would be nice if the transitions between composer and user's guide were quicker or I could have them open side by side, (opening multiple browser windows does not work). All in all though, I haven't run into any major issues considering this program is currently a beta release.</div>
<div>
<br /></div>
<div>
<h3>
<b>Summary:</b></h3>
</div>
<div>
You are not going to be able to <a href="https://www.youtube.com/watch?v=8wXBe2jTdx4" target="_blank">hack any Gibsons</a> with IBM's quantum computer. It's very limited, but that is kind of the point. It shows where the field of quantum computing is right now. IBM is providing an amazing free learning opportunity with this service and if you are at all interested in the future of computing I highly recommend checking it out.</div>
<br />
<h2>
Challenges with Evaluating Password Cracking Algorithms</h2>
Matt Weir, 2015-08-19<br />
<blockquote class="tr_bq">
<i>"In theory, theory and practice are the same. In practice they are not"</i> -Quote from somebody on the internet. Also attributed to Albert Einstein but I've never been able to find the original source to back that up.</blockquote>
<h3>
Back-story:</h3>
Currently I'm writing a post looking into Hashcat's Markov mode but I found myself starting off by including several paragraphs worth of disclosures and caveats. Or to put it another way:<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
</div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6_a1OBMXVnnYLiRa-Vouq-g23HBBD0QC6fCASYzViPGsMhckQ_4zdgifFXMIclY1TERhEVuGey4aprl5hwqjV4sdafJeVf668BGp08IWB71m00EbOpHYfu0ZPpNF06dmxmSrwJrSFjY2z/s1600/twitter.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="190" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi6_a1OBMXVnnYLiRa-Vouq-g23HBBD0QC6fCASYzViPGsMhckQ_4zdgifFXMIclY1TERhEVuGey4aprl5hwqjV4sdafJeVf668BGp08IWB71m00EbOpHYfu0ZPpNF06dmxmSrwJrSFjY2z/s1600/twitter.png" width="400" /></a></div>
<br />
<br />
It was a valid point Jeremi brought up and it's something I'm trying to avoid. After thinking about it for a bit I figured this topic was worth its own post.<br />
<br />
<h3>
Precision vs Recall: </h3>
Part of the challenge I'm dealing with is I'm performing experiments vs writing tutorials. That's not to say I won't write tutorials in the future but designing and running tests to evaluate algorithms is fun and what I'm interested in right now. Why this can be a problem though is that I can get so deep into how an algorithm works that it's easy to lose sight of how it performs in a real life cracking session. I try to be aware of this, but an additional challenge is representing these investigations to everyone else in a way that isn't misleading.<br />
<br />
This gets into the larger issue of balancing <a href="http://en.wikipedia.org/wiki/Precision_and_recall">precision and recall</a>. In a password cracking context, precision is modeling how effective each guess is when it comes to cracking a password. The higher your precision, the fewer guesses on average you need to make to crack a password. As a rule of thumb if you see a graph with number of guesses on the X axis and percentage of passwords cracked on the Y axis, it's probably measuring precision.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2gjPvRJG-TRQu-qgIzUqPTwoLI9t3US6y1C3CLL2q4hSWpL9m5QEFKTZ5lHAwK_ciz7xMRp47dmcvGnXMzwBHvey9g7S2_cIf6I2DgpQGMRQ9AMKw2fV12JS5l9MiUL4uQcoBgFlC1FDs/s1600/prince_dictionary.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="215" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj2gjPvRJG-TRQu-qgIzUqPTwoLI9t3US6y1C3CLL2q4hSWpL9m5QEFKTZ5lHAwK_ciz7xMRp47dmcvGnXMzwBHvey9g7S2_cIf6I2DgpQGMRQ9AMKw2fV12JS5l9MiUL4uQcoBgFlC1FDs/s1600/prince_dictionary.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">An example of measuring the precision of different cracking techniques</td></tr>
</tbody></table>
Recall on the other hand is the percentage of passwords cracked during a cracking session in total, regardless of how many guesses are made. Usually this isn't represented in a graph format, and if it is, the X axis will be represented by "Time", and not number of guesses.<br />
<br />
<table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgX4jSlFcxIZlpWtoa7riu3b8STIc0qtcwkDEdMJArawpxEV3jkXklliBa8WKa8R7gLDpTfcb8e2xXRqWjGFw-Dl9_7XCQhKL1T0rGt6tCUZdFCqIpL03kJSyDoh9zXaKz5uU6BFW-5fQqR/s1600/Untitled.png" imageanchor="1" style="margin-left: auto; margin-right: auto;"><img border="0" height="192" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgX4jSlFcxIZlpWtoa7riu3b8STIc0qtcwkDEdMJArawpxEV3jkXklliBa8WKa8R7gLDpTfcb8e2xXRqWjGFw-Dl9_7XCQhKL1T0rGt6tCUZdFCqIpL03kJSyDoh9zXaKz5uU6BFW-5fQqR/s1600/Untitled.png" width="320" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Courtesy of Korelogic's Crack Me If You Can contest. This represents a Recall based graph</td></tr>
</tbody></table>
It's tempting to say that "Precision" is a theoretical measurement and "Recall" is the practical results. It's not quite so clear cut though since the "time" factor in password cracking generally boils down to "number of guesses". In an online guessing scenario an attacker may only be able to make 10-20 guesses. With a fast hash, offline attack, and a moderate GPU setup, billions of guesses a second are possible and an attack might run for several weeks. Therefore recall results tend to be highly dependent on the particular situation being modeled.<br />
<br />
Now it would be much easier to switch between "Precision" and "Recall" if there was a direct mapping between number of guesses and time. The problem is, not all guesses take the same amount of time. A good example of that is CPU vs GPU based guessing algorithms. Going back to John the Ripper's Incremental mode, I'm not aware of any GPU implementation of it so guesses have to be generated by the CPU and then sent to the GPU for hashing. Meanwhile Hashcat's Markov mode can run in the GPU itself, and in Atom's words "it has to create 16 billions candidates per 10 milliseconds on a single GPU. Yes, billions". This can lead to situations, such as when attacking a very fast hash, where certain attacks might have a higher precision but worse recall.<br />
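A toy sketch of that last point, under loudly stated assumptions: the target list, both guess lists, and the guess rates below are invented purely for illustration, and the function names are hypothetical. It counts how many targets fall within the first N guesses (the precision view) and then within a fixed time budget (the recall view).<br />

```python
def cracked_fraction(guesses, targets, max_guesses):
    """Fraction of target passwords matched by the first max_guesses guesses."""
    tried = set(guesses[:max_guesses])
    return sum(1 for pw in targets if pw in tried) / len(targets)


def recall_in_time(guesses, targets, guesses_per_second, seconds):
    """Recall for a fixed time budget: speed decides how many guesses get made."""
    budget = int(guesses_per_second * seconds)
    return cracked_fraction(guesses, targets, budget)


# Toy data, invented for illustration only.
targets = ["123456", "password", "abc", "cab", "bca"]
smart = ["123456", "password", "qwerty", "letmein"]    # well ordered, slow to generate
brute = ["aaa", "abc", "bca", "cab", "123456", "zzz"]  # poorly ordered, fast to generate

# Precision view: the same number of guesses for both attacks.
print(cracked_fraction(smart, targets, 2), cracked_fraction(brute, targets, 2))
# Recall view: the same wall-clock time, different (assumed) guess rates.
print(recall_in_time(smart, targets, 1, 2), recall_in_time(brute, targets, 3, 2))
```

In this toy setup the "smart" attack wins when both are limited to two guesses, but the "brute" attack cracks more within the same two seconds because it gets six guesses in.<br />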
<br />
<h3>
Amdahl's law and why I find precision interesting</h3>
When trying to increase recall an attacker generally has two different avenues to follow. They can increase the number of guesses they make or they can increase the precision of the guesses they make. These improvements aren't always exclusive; many times you can do both. Often though there is a balancing act as more advanced logic can take time and may be CPU bound. What this means is that you might increase precision only to find your recall has fallen since you are now making fewer guesses. That being said, if the increase in precision is high enough, then even an expensive guessing algorithm might do well enough to overcome the decrease in the total number of guesses it can make.<br />
<br />
Often in these optimization situations <a href="http://en.wikipedia.org/wiki/Amdahl%27s_law">Amdahl's law</a> pops into my head, though <a href="http://en.wikipedia.org/wiki/Gustafson%27s_law">Gustafson's law</a> might be more appropriate for password cracking due to the rate of increase in the number of guesses. Amdahl's law in a nutshell says the maximum speedup you can have is always limited by the part of the program you can't optimize. To put it another way, if you reduce the cost of an action by 99%, but that action only accounts for 1% of the total run-time, then your maximum total speedup no matter how cool your optimization is would be no more than 1%.<br />
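The usual statement of Amdahl's law is speedup = 1 / ((1 - p) + p / s), where p is the fraction of run-time being optimized and s is how much faster that part becomes. A small sketch working through the 99% / 1% example above (the function name is just a label picked for this example):<br />

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run-time is made `factor` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


# The example above: a 99% cost reduction (100x faster) on 1% of the run-time.
print(amdahl_speedup(0.01, 100))           # roughly 1.01, about a 1% gain
# Even an infinitely fast version of that 1% cannot beat 1 / 0.99.
print(amdahl_speedup(0.01, float("inf")))
```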
<br />
Where this applies to password cracking is the cost of a guess in an offline cracking attack can be roughly modeled as:<br />
<blockquote class="tr_bq">
<b>Cost of making the plain-text guess + cost of hashing + general overhead of the cracking tool</b></blockquote>
Right now the situation in many cases is that the cost of hashing is low thanks to fast unsalted hashing algorithms and GPU based crackers. Therefore it makes sense to focus on reducing the cost of making the plain-text guesses as much as possible since that will have a huge impact on the overall cost of making a guess. Aka, trading precision for speed in your guessing algorithm can have a significant impact on the total number of guesses you can make. If on the other hand a strong hash is used, (or you at least are trying to crack a large number of uniquely salted hashes), the dominant factor in the above equation becomes the hashing itself. Therefore a speedup in the plaintext generation will not have as much impact on the overall cost and therefore precision becomes more important.<br />
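The cost model above can be turned into a quick back-of-the-envelope calculation. This is only a sketch: the cost numbers are made up, in arbitrary time units, and the function names are hypothetical. It compares how much a 4x faster guess generator helps when the hash is cheap versus when the hash dominates.<br />

```python
def guesses_per_second(gen_cost, hash_cost, overhead):
    """Guess rate when each guess costs gen + hash + overhead time units."""
    return 1.0 / (gen_cost + hash_cost + overhead)


def gain_from_faster_generation(gen_cost, hash_cost, overhead, gen_speedup):
    """How many times more guesses you get if only guess generation gets faster."""
    before = guesses_per_second(gen_cost, hash_cost, overhead)
    after = guesses_per_second(gen_cost / gen_speedup, hash_cost, overhead)
    return after / before


# All costs are made-up values, chosen only to show the two regimes.
fast_hash = gain_from_faster_generation(gen_cost=8, hash_cost=1, overhead=1, gen_speedup=4)
slow_hash = gain_from_faster_generation(gen_cost=8, hash_cost=1000, overhead=1, gen_speedup=4)
print(fast_hash, slow_hash)
```

With the cheap hash the faster generator gives 2.5 times as many guesses; with the expensive hash the gain is well under 1%, which is the regime where spending extra effort on precision starts to pay off.<br />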
<br />
As a researcher, precision is very interesting for me. From a defensive standpoint a good starting place is "use a computationally expensive salted hash". If you aren't at least doing that then the chances are you aren't interested in doing anything more exotic. Also when it comes to contributing to the larger research community, well my coding skills are such that I'm not going to be making many improvements to the actual password cracking tools. Evaluating and improving the precision of different attacks is much more doable.<br />
<br />
<h3>
Carnegie Mellon's Password Guessability Service:</h3>
<div>
One cool resource for password security researchers is the new Password Guessability service being offered by the CUPS team over at Carnegie Mellon. I'm going to paraphrase their talk, but basically their team got tired of everyone comparing their password attacks to the same default rulesets of John the Ripper so they created a service for researchers to model more realistic password cracking sessions. If you are interested their USENIX paper describing their lab setup can be found <a href="https://www.usenix.org/conference/usenixsecurity15/technical-sessions/presentation/ur">here</a>. Likewise if you want to see a video of their Passwords15LV talk you can view it <a href="https://www.youtube.com/watch?v=IIWmkn9R9q0&index=13&list=PLJtZQMlIjpsFySMxA37NJQ-hiS_23CvHX">here</a>. More importantly, if you want to go to their actual site you can find it here:<br />
<br />
<a href="https://pgs.ece.cmu.edu/"><b>https://pgs.ece.cmu.edu/</b></a></div>
<br />
The service itself is free to ethical security researchers and is run by students so don't be a jerk. The actual attacks they run are bound to change with time, but as of right now they are offering to model several different default password cracking attacks consisting of around 10 trillion guesses each. These cracking attacks use the public TrustWave's JtR KoreLogic Rulelist, several different HashCat rulesets, an updated Probabilistic Context Free Grammar attack, and another custom attack designed by Korelogic specifically for this service. All in all, if you need to represent an "industry standard" cracking session it's hard to do better. In fact it probably represents a much more advanced attacker than many of the adversaries out there if you assume the target passwords were protected by a hashing algorithm of moderate strength.<br />
<br />
I could keep on talking about this service but you really should just read their paper first. I think it's a wonderful resource for the research community and I have lots of respect for them offering this. So the next question of course is what does that mean for this blog? I plan on using this service as it makes sense without hogging Carnegie Mellon's resources. I need to talk to them more about it but I expect that I'll have them run it against a subset of the RockYou list and then use, and reuse, those results to evaluate other cracking techniques as I investigate them. If I attack some other dataset though I may just run a subset of the attacks myself, unless that dataset and the related tests are interesting enough to make using CM's resources worth it.<br />
<br />
<h3>
Fun with Designing Experiments:</h3>
When designing experiments there's usually a couple of common threads I'm always struggling with:<br />
<ol>
<li><b>Poor datasets.</b> I know there's a ton of password dumps floating around but often due to the nature of their disclosure there are massive problems or shortcomings with most of them. For example most of the dumps on password cracking forums or pastebin have only unique hashes, so '123456' only shows up once, and there is no attribution. Gawker was a site most people didn't care about and the hashing algorithm cut off the plaintext after 8 characters and replaced non-ASCII text with a '?'. A majority of the passwords in the Stratfor dataset were machine generated. Myspace, well that was a result of a phishing attack so it has many instances of 'F*** You You F***ing Hacker'. Even with RockYou the dataset is complicated as it contained many passwords from the same users for different sites but since there were no usernames connected with the public version of it, it can be hard to sort out. Then there is the fact that most of these datasets were for fairly unimportant sites. I'm not aware of any confirmed public Active Directory dump, (though there are a large number of NT hashes floating about and this whole Ashley Madison hack may change things with the Avid Life Media NT hashes there). Likewise, while there are some banking password lists, the amount of drama surrounding them makes me hesitant to use them.</li>
<li><b>Short running time</b>. Personally I like keeping the time it takes to run a test to around an hour or so. While I can certainly run longer tests, realistically anything over a couple of days isn't going to happen since I like using my computers for other things and truth be told, it always seems like I end up finding out I need to run additional tests or I messed something up in my original setup and need to re-run it. Shorter tests are very much preferred. Add into that the fact that most of the time I'm modeling precision and running my tests on a CPU system means most of my tests will not be modeling GPU cracking vs fast hashes.</li>
<li><b>What hypothesis do I want to test, and can I design an experiment to test it? </b>I'll admit, sometimes I'll have no clue what the results of a test will be so I'll pull a YOLO, throw some stuff together and just run it to see what pops out. That's not ideal though as I usually like to try and predict the results. I'm often wrong, but that at least forces me to look deeper into what assumptions I held were wrong, and hey that's why I run tests in the first place.</li>
</ol>
<div>
Furthermore, for at least the next couple of tools I'm investigating I plan on using both Hashcat and John the Ripper as much as possible. While it might not always make sense to use both of them as often there isn't an apples to apples comparison, I do have some ulterior motives. Basically it helps me to use both of these tools in a public setting and I've already gotten a lot of positive feedback from my PRINCE post. It's pretty amazing when I can have a creator of a tool tell me how I can optimize my cracking techniques. My secondary reason for this is to make people more aware of both of these tools. When it comes to the different attack modes I've found there's a lot of misunderstandings of what each tool is capable of.</div>
<div>
<br /></div>
<div>
That being said, I explicitly don't want to get into "Tool A is better than Tool B" type debates. Which tool you use really depends on your situation. Heck, occasionally I'm glad I still have Cain and Abel installed. I'll admit, this is going to get tricky when I'm doing tests such as comparing Hashcat's Markov mode to JtR's Incremental mode, but please keep in mind that I want to make all the tools better.<br />
<br />
<h3>
Enough talk; Give us some code or graphs or GTFO:</h3>
</div>
<div>
Thanks for putting up with all of that text. In the spirit of showing all my research I'm sharing the tool that I wrote to evaluate password cracking sessions which I'll be using in this blog. The code is available here:</div>
<div>
<br /></div>
<div>
<a href="https://github.com/lakiw/Password_Research_Tools">https://github.com/lakiw/Password_Research_Tools</a></div>
<div>
<br /></div>
<div>
The specific tool I'm talking about, (in the hope that I release multiple tools in the future so it isn't obvious ;p), is called checkpass2.py. It's a significantly faster version of the old checkpass program I had used and released in the past. The options on how it works are detailed in the -h switch, but basically you can pipe whatever password guess generation tool you are using into it and it'll compare your guesses against a plaintext target list and tell you how effective your cracking session would have been. For example if you were using John the Ripper you could use the -stdout option to model a cracking session as follows:</div>
<blockquote class="tr_bq">
<i><b>./john -wordlist=passwords.lst -rules=single -stdout | python checkpass2.py -t target.pws -o results.txt</b></i></blockquote>
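For readers who want to see the general idea in code, here is a minimal sketch of replaying a stream of guesses against a plaintext target list. It is not the actual checkpass2.py code or its output format; the function name model_session and the sample data are assumptions made for this example.<br />

```python
from collections import Counter


def model_session(guesses, targets, max_guesses=None):
    """Replay guesses against plaintext targets.

    Returns (guesses_made, cracked_count, total_targets, history), where
    history is a list of (guess_number, cumulative_cracked) points.
    """
    remaining = Counter(targets)   # duplicate passwords count as separate accounts
    total = sum(remaining.values())
    cracked = 0
    made = 0
    history = []
    for guess in guesses:
        if max_guesses is not None and made >= max_guesses:
            break
        made += 1
        hits = remaining.pop(guess, 0)  # a password can only be cracked once
        if hits:
            cracked += hits
            history.append((made, cracked))
    return made, cracked, total, history


targets = ["123456", "123456", "password", "monkey1"]
guesses = ["password", "123456", "password", "letmein"]
made, cracked, total, history = model_session(guesses, targets)
print(f"{cracked}/{total} cracked in {made} guesses: {history}")
```

To drive something like this from a pipe, `guesses` could be a generator over `sys.stdin` with the trailing newlines stripped. The history points are what you would plot to get a guesses vs cracked graph. The real checkpass2.py does more than this sketch.<br />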
It also has some options like limiting the maximum number of guesses or starting a count at a specific number if you want to chain multiple cracking sessions together. There are certainly still a lot of improvements that need to be made to it, but if you like graphs I hope it might be useful to you. Please keep in mind that this isn't a password cracker. Aka, it does not do any hashing of password guesses. So if you want to model a password cracking session against a hashed list you'll need to run two attacks: one to crack the list using the tool of your choice, and a second session to use the checkpass2.py tool to model your cracking session against the cracked passwords. Since both John the Ripper and Hashcat have logging options you might want to consider using them instead to save time. Where checkpass2 is nice for me anyway is the fact that I can quickly edit the code depending on what I need so it's easier to do things like format the output for what I'm doing. Long story short, I hope it is helpful but I still strongly recommend looking into the logging options that both John the Ripper and Hashcat offer.<br />
<h2>
Tool Deep Dive: PRINCE</h2>
Matt Weir, 2014-12-22 (updated 2015-01-04)<br />
<b>Tool Name:</b> PRINCE (PRobability INfinite Chained Elements)<br />
<b>Version Reviewed:</b> 0.12<br />
<b>Author:</b> Jens Steube, (Atom from Hashcat)<br />
<b>OS Supported: </b>Linux, Mac, and Windows<br />
<b>Password Crackers Supported:</b> It is a command line tool so it will work with any cracker that accepts input from stdin<br />
<h3>
Blog Change History:</h3>
<span style="color: red;">1/4/2015: Fixed some terminology after talking to Atom</span><br />
<span style="color: red;">1/4/2015: Removed a part in the Algorithm Design section that talked about a bug that has since been fixed in version 0.13</span><br />
<span style="color: red;">1/4/2015: Added an additional test with PRINCE and JtR Incremental after a dictionary attack</span><br />
<span style="color: red;">1/4/2015: Added a section for using PRINCE with oclHashcat</span><br />
<h3>
<b>Brief Description:</b> </h3>
PRINCE is a password guess generator and can be thought of as an advanced <a href="https://hashcat.net/wiki/doku.php?id=combinator_attack">Combinator attack</a>. Rather than taking as input two different dictionaries and then outputting all the possible two word combinations though, PRINCE only has one input dictionary and builds "chains" of combined words. These chains can have 1 to N words from the input dictionary concatenated together. So for example if it is outputting guesses of length four, it could generate them using combinations from the input dictionary such as:<br />
<blockquote class="tr_bq">
4 letter word<br />
2 letter word + 2 letter word<br />
1 letter word + 3 letter word<br />
1 letter word + 1 letter word + 2 letter word<br />
1 letter word + 2 letter word + 1 letter word<br />
1 letter word + 1 letter word + 1 letter word + 1 letter word<br />
..... (You get the idea)</blockquote>
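The full set of length patterns for a given guess length is every ordered way of splitting that length into positive pieces. A short sketch that enumerates them for length four (this only lists the patterns; it says nothing about the order PRINCE actually uses, and the function name is just a label for this example):<br />

```python
def compositions(total):
    """Yield every ordered way to write `total` as a sum of positive integers."""
    if total == 0:
        yield ()
        return
    for first in range(1, total + 1):
        for rest in compositions(total - first):
            yield (first,) + rest


for chain in compositions(4):
    print(" + ".join(f"{n} letter word" for n in chain))
```

For length four this gives eight patterns, and in general a guess of length L has 2^(L-1) of them before any limit on the number of elements is applied.<br />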
<h3>
<b>Algorithm Design:</b></h3>
As of this time the source-code of PRINCE has not been released. Therefore this description is based solely on At0m's <a href="http://passwords14.item.ntnu.no/">Passwords14</a> presentation, talking to At0m himself on IRC as well as running experiments with various small dictionaries using the tool itself and manually looking at the output.<br />
<br />
As stated in the description, PRINCE combines words from the input dictionary to produce password guesses. The first step is processing the input dictionary. Feeding it an input dictionary of:<br />
<blockquote class="tr_bq">
a<br />
a</blockquote>
resulted in it generating the following guesses:<br />
<blockquote class="tr_bq">
a<br />
a<br />
aa<br />
aa<br />
aaa<br />
aaa<br />
aaaa<br />
aaaa<br />
...(output cut to save space) </blockquote>
Therefore, it's pretty obvious that the tool does not perform duplicate detection when loading a file.
<br />
<i><b>Finding #1</b>: Make sure you remove duplicate words from your input dictionary *before* you run PRINCE</i><br />
<i><br /></i>
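One way to do that clean-up, sketched in Python under the assumption that the dictionary is a plain text file with one word per line (the function name is hypothetical):<br />

```python
def dedupe_words(lines):
    """Return unique words in first-seen order, skipping blank lines."""
    words = (line.rstrip("\r\n") for line in lines)
    # dict.fromkeys keeps the first occurrence of each key, in insertion order
    return list(dict.fromkeys(w for w in words if w))


print(dedupe_words(["a\n", "a\n", "bb\n", "a\n", "\n", "BigInput\n"]))
```

Keeping the first-seen order matters if your dictionary is already sorted by how likely each word is.<br />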
After PRINCE reads in the input dictionary it stores each word, (element), in a table consisting of all the words of the same length. PRINCE then constructs chains consisting of 1 to N different elements. Right now it appears that N is equal to eight, (confirmed when using the --elem-cnt-min option). It does this by setting up structures of the different tables and then filling them out. For example with the input dictionary:<br />
<blockquote class="tr_bq">
a</blockquote>
It will generate the guesses:<br />
<blockquote class="tr_bq">
a<br />
aa<br />
aaa<br />
aaaa<br />
aaaaa<br />
aaaaaa<br />
aaaaaaa<br />
aaaaaaaa</blockquote>
<div>
This isn't to say that it won't generate longer guesses since elements can be longer than length 1. For example with the following input dictionary:<br />
<blockquote class="tr_bq">
a<br />
bb<br />
BigInput</blockquote>
</div>
<div>
It generates the following guesses:</div>
<blockquote class="tr_bq">
a<br />
aa<br />
bba<br />
aabb<br />
bbabb<br />
bbbbbb<br />
abbbbbb<br />
BigInput<br />
BigInputbb<br />
bbBigInputbb<br />
bb<br />
aaa<br />
...(output cut to save space)</blockquote>
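Since the PRINCE source was not available when this was written, the following is only a simplified model of the behavior described above, not Atom's implementation. It groups words into per-length tables, builds chains of 1 to max_elems elements, and fills each chain from the tables. The function names are hypothetical, and the output order will not match what PRINCE actually produces.<br />

```python
from collections import defaultdict
from itertools import product


def build_tables(words):
    """Group unique words by length: {length: [words of that length]}."""
    tables = defaultdict(list)
    for word in dict.fromkeys(words):
        tables[len(word)].append(word)
    return dict(tables)


def chains(tables, max_elems):
    """Yield every tuple of element lengths using 1..max_elems elements."""
    lengths = sorted(tables)
    for count in range(1, max_elems + 1):
        yield from product(lengths, repeat=count)


def guesses_for_chain(tables, chain):
    """Yield every guess a chain produces by filling each slot from its table."""
    for combo in product(*(tables[n] for n in chain)):
        yield "".join(combo)


tables = build_tables(["a", "bb", "BigInput"])
out = [g for chain in chains(tables, 2) for g in guesses_for_chain(tables, chain)]
print(out)
```

Running the same functions on a dictionary containing only 'a' with max_elems set to 8 gives 'a' through 'aaaaaaaa', which lines up with the eight-element limit observed above.<br />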
<div>
Next up, according to slide 35 of the Passwords14 talk it appears that PRINCE should be sorting these chains according to keyspace. This way it can output guesses from the chains with the smallest keyspace first. This can be useful so it will do things like append values on the end of dictionary words before it tries a full exhaustive brute force of all eight character passwords. While this appears to happen to a certain extent, something else is going on as well. For example with the input dictionary:</div>
<blockquote class="tr_bq">
a<br />
b<br />
cc</blockquote>
<div>
It would output the following results:</div>
<div>
<blockquote class="tr_bq">
a<br />
b<br />
cc<br />
cca<br />
cccc<br />
ccacc<br />
cccccc<br />
acccccc<br />
cccccccc<br />
cccccccccc<br />
cccccccccccc<br />
aa<br />
ccb<br />
acca<br />
ccbcc<br />
aacccc<br />
bcccccc<br />
aacccccc<br />
.....(Lots of results omitted).....<br />
aaaabbbb<br />
baaabbbb<br />
abaabbbb<br />
bbaabbbb<br />
aababbbb<br />
bababbbb<br />
abbabbbb<br />
bbbabbbb<br />
aaabbbbb<br />
baabbbbb</blockquote>
</div>
<div>
<div>
This is a bit of a mixed bag. While it certainly saved the highest keyspace chains for the end, it didn't output everything in true increasing keyspace order since elements of length 1, (E1), had two items, while elements of length 2, (E2), only had one item, but it outputted E1 first. I have some suspicions that the order it outputs its chains is independent of how many items actually are in each element for that particular run, (aka as long as there is at least one item in each element, it is independent of your input dictionary). I don't have anything hard to back up that suspicion though beyond a couple of sample runs like the one above. Is this a problem? Quite honestly, I'm not really sure, but it is something to keep in mind. When I talked to Atom about this he said that password length compared to the average length of items in the training set also influenced the order at which chains were selected so that may have something to do with it.</div>
</div>
<div>
<br /></div>
<div>
<i><b>Finding #2</b>: PRINCE is not guaranteed to output all chains in increasing keyspace order, though it appears to at least make an attempt to do so</i></div>
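For comparison, here is what a strict keyspace sort would look like for the a / b / cc dictionary above. This is a sketch of the idea only, not PRINCE's actual ordering logic: a chain's keyspace is taken to be the product of the sizes of the tables it draws from, and the handful of chains listed is an arbitrary sample.<br />

```python
from math import prod


def chain_keyspace(table_sizes, chain):
    """Number of guesses a chain produces: the product of its tables' sizes."""
    return prod(table_sizes[n] for n in chain)


# Dictionary from the example above: 'a' and 'b' (length 1), 'cc' (length 2).
table_sizes = {1: 2, 2: 1}
candidate_chains = [(1,), (2,), (2, 1), (2, 2), (1, 1)]
ordered = sorted(candidate_chains, key=lambda c: chain_keyspace(table_sizes, c))
for chain in ordered:
    print(chain, chain_keyspace(table_sizes, chain))
```

A strict sort puts the E2 chain (just 'cc', keyspace 1) ahead of the E1 chain ('a' and 'b', keyspace 2), which is the opposite of the output observed above.<br />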
<h3>
Additional Options:</h3>
<div>
<b>--elem-cnt-max=NUM: </b> This limits the number of elements that can be combined to NUM. Aka if you set NUM to 4, then it can combine up to 4 different elements. So if you had the input word 'a' it could generate 'aaaa' but not 'aaaaa'. This may be useful to limit some of the brute forcing it does.<br />
<br />
The rest of the options are pretty self explanatory. One request I would have is for PRINCE to save its position automatically, or at least print out the current guess number when it is halted, to make it easier to restart a session by using the "--skip=NUM" option.</div>
<h3>
<b>Performance:</b></h3>
PRINCE was written by Atom so of course it is fast. If you are using a CPU cracker it shouldn't have a significant impact on your cracking session even if you are attacking a fast hash. For comparison's sake, I ran it along with JtR's incremental mode on my MacBook Pro.<br />
<br />
Prince:<br />
run laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/passwords_top10k.txt | ./john --format=raw-sha1-linkedin -stdin one_hash.txt<br />
Loaded 1 password hash (Raw SHA-1 LinkedIn [128/128 SSE2 intrinsics 8x])<br />
guesses: 0 time: 0:00:02:00 c/s: 1895K trying: asdperkins6666 - bobperkins<br />
<br />
<div class="p1">
JtR Incremental Mode:</div>
<div class="p1">
run laki$ ./john -incremental=All -stdout | ./john --format=raw-sha1-linkedin -stdin one_hash.txt </div>
<div class="p1">
Loaded 1 password hash (Raw SHA-1 LinkedIn [128/128 SSE2 intrinsics 8x])</div>
<div class="p1">
guesses: 0 time: 0:00:00:14 c/s: 2647K trying: rbigmmi - rbigm65</div>
<h3>
Using PRINCE with OCLHashcat:</h3>
Below is a sample screen shot of me using PRINCE as input for OCLHashcat on my cracking box, (it has a single HD7970 GPU). Ignore the --force option as I had just installed an updated video card driver and was too lazy to revert back to my old one that OCLHashcat supports. I was also too lazy to boot into Linux since I was using Excel for this post and my cracking box also is my main computer...<br />
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkj_x-6-pIWRWmvVXTL__8PCZlXpD-v5n998hhyEJHepUqjA9ycgtAViUzx-6z3wZuSfNQEw5Fc8g5O_dwnyI3DsVmMDMoDMXsVtJPgFL8NzdB0o8DVJkXAOs4Ni_WoWNWY9AaCA7v3lX7/s1600/prince_gpu.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgkj_x-6-pIWRWmvVXTL__8PCZlXpD-v5n998hhyEJHepUqjA9ycgtAViUzx-6z3wZuSfNQEw5Fc8g5O_dwnyI3DsVmMDMoDMXsVtJPgFL8NzdB0o8DVJkXAOs4Ni_WoWNWY9AaCA7v3lX7/s1600/prince_gpu.png" height="640" width="459" /></a></div>
<br />
What I wanted to point out was that for a fast hash, (such as unsalted SHA1 in this case), since PRINCE is not integrated into OCLHashcat it can't push guesses fast enough to the GPU to take full advantage of the GPU's cracking potential. In this case, the GPU is only at around 50% utilization. That is a longer way of saying that while you can still totally make use of OCLHashcat when using PRINCE, it may be advantageous to also run dictionary based rules on the guesses PRINCE generates. Since those dictionary rules are applied on the GPU itself you can make a lot more guesses per second to take full advantage of your cracking hardware. This is also something Atom recommends and he helpfully included two different rulesets with the PRINCE tool itself.<br />
<br />
Side note: PRINCE plows through the LinkedIn list pretty effectively. To get the screenshot above I had to run the cracking session twice since otherwise the screen would have been filled with cracked passwords.<br />
<h3>
<b>Big Picture Analysis:</b></h3>
The main question of course is how does this tool fit into a cracking session? Atom talked about how he saw PRINCE as a way to automate password cracking. The closest analogy would be John the Ripper's default behavior where it will start with Single Crack Mode, (lots of rules applied to a very targeted wordlist), move on to Wordlist mode, (basic dictionary attack), and then try Incremental mode, (smart bruteforce). Likewise with PRINCE, depending on how you structure your input dictionary it can act as a standard dictionary attack, (appending/prepending digits to input words for example), combinator attack, (duh), and pure brute force attack, (trying all eight character combos). It can even do a limited passphrase attack though it gets into "Correct Horse Battery Staple" keyspace issues then. For example, with the input dictionary of:<br />
<blockquote class="tr_bq">
Correct<br />
Horse<br />
Battery<br />
Staple</blockquote>
It will generate all four word combinations such as:<br />
<blockquote class="tr_bq">
&lt;lots of guesses before this&gt;<br />HorseHorseCorrectBattery<br />HorseHorseBatteryBattery<br />HorseCorrectCorrectHorse<br />HorseBatteryCorrectHorse<br />HorseCorrectBatteryHorse<br />HorseBatteryBatteryHorse<br />CorrectCorrectHorseHorse<br />BatteryCorrectHorseHorse<br />CorrectBatteryHorseHorse<br />BatteryBatteryHorseHorse</blockquote>
<div>
<br /></div>
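<div>
The keyspace issue is easy to see with a little arithmetic. The sketch below enumerates every four-word chain for the tiny dictionary above and then computes how the count grows for larger dictionaries. The function names and the 10,000-word figure are illustrative assumptions, and the sketch does not reproduce the order in which PRINCE actually emits its guesses.</div>

```python
from itertools import product

WORDS = ["Correct", "Horse", "Battery", "Staple"]


def four_word_chains(words):
    """Return every ordered four-word concatenation, repeats allowed."""
    return ["".join(combo) for combo in product(words, repeat=4)]


def keyspace(dictionary_size, elements):
    """Number of chains when each of `elements` slots can hold any word."""
    return dictionary_size ** elements


chains = four_word_chains(WORDS)
print(len(chains))         # every four-word chain for a 4-word dictionary
print(keyspace(10000, 4))  # the same attack with a hypothetical 10,000-word dictionary
```

<div>
Four words give only 256 chains, but the count grows with the fourth power of the dictionary size, so a realistic wordlist quickly becomes a very large brute force problem.</div>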
<div>
When talking about passphrase attacks then, keep in mind it doesn't have any advanced logic so you are really doing a full keyspace attack of all the possible combinations of words.<br />
<br />
The big question then is how does it compare against other attack modes when cracking passwords? You know what this means? Experiments and graphs!<br />
<br />
I decided I would base my first couple of comparisons using the demos Atom had listed in his slides as a starting point. I figure no-one would know how to use PRINCE better than he would. Note: these are super short runs. While I could explain that away by saying this simulates targeting a slow hash like bcrypt, the reality is Atom made some noticeable changes in PRINCE while I was writing this post, (yay slow update schedule). I figured it would be good to make some quick runs with the newer version to get a general idea of how PRINCE performs and then post a more realistic length run at a later time. Also, this way I can get feedback on my experiment design so I don't waste time running a longer cracking session on a flawed approach.<br />
<br />
<b>Experiment 1) PRINCE, Hashcat Markov mode, and JtR Incremental mode targeting the MySpace list</b><br />
<b><br /></b>
<b>Experiment Setup</b>:<br />
The input dictionary for PRINCE was the top 100k most popular passwords from the RockYou list, as this is what Atom used. For Hashcat I generated a stats file on the full RockYou list and used a limit of 16. For JtR I ran the default Incremental mode using the "All" character set. The target list was the old MySpace list. The reason why I picked that vs the Stratfor dataset which Atom used was simply because there are a ton of computer generated passwords, (aka default passwords assigned to users), in the Stratfor dataset so it can be a bit misleading when used to test against.<br />
<br />
<b>Cracking Length: </b>1 billion guesses<br />
<br />
<b>Commands used:</b><br />
laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<span style="background-color: white; font-family: 'Courier New', Courier, monospace; font-size: 14px; white-space: pre-wrap;"><br /></span><br />
<br />
laki$ ../../../John/john-1.7.9-jumbo-7/run/john -incremental=All -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<span style="background-color: white; font-family: 'Courier New', Courier, monospace; font-size: 14px; white-space: pre-wrap;"><br /></span><br />
<br />
laki$ ../../../hashcat/statsprocessor-0.10/sp64.app --threshold=16 ../../../hashcat/statsprocessor-0.10/hashcat.hcstat | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<span style="background-color: white; font-family: 'Courier New', Courier, monospace; font-size: 14px; white-space: pre-wrap;"><br /></span><b><br /></b>
<b>Experiment Results:</b><br />
<br /></div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6-g-R1ZLm1vDd1-gowSWaNrqOBl8H98x3FYRnASkxOIw2yFmMGUG9vagwknHgKzVc5IfVZhaXnqgEbmCeeUwbSFYb-_1LV8lh8Psd0PM1UWbTBmjzEPYbgDmb7EhKMUr5qijwnct1Wqtn/s1600/prince_and_bruteforce.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh6-g-R1ZLm1vDd1-gowSWaNrqOBl8H98x3FYRnASkxOIw2yFmMGUG9vagwknHgKzVc5IfVZhaXnqgEbmCeeUwbSFYb-_1LV8lh8Psd0PM1UWbTBmjzEPYbgDmb7EhKMUr5qijwnct1Wqtn/s1600/prince_and_bruteforce.png" height="212" width="320" /></a></div>
<br />
<br />
Click on the graph for a zoomed in picture. As you can see, Prince did really well starting out but then quickly became less effective. This is because it used most, (if not all), of the most common words in the RockYou list first so it acted like a normal dictionary attack. At the same time, Incremental Mode was starting to catch up by the end of the run. While I could continue to run this test over a longer cracking session, this actually brings up the next two experiments....</div>
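<div>
The checkpass2.py script used in the commands above isn't shown in this post. As a rough idea of what a guess checker of this kind does, here is a minimal sketch. It is not the actual script: the function names, return values, and the handling of duplicate passwords are all assumptions, and only the idea of a target list and a maximum guess count comes from the commands above.</div>

```python
import sys
from collections import Counter


def check_guesses(guesses, targets, max_guesses=None, already_cracked=0):
    """Count how many target passwords a stream of guesses cracks.

    `targets` may contain duplicates (several users sharing a password);
    each duplicate counts as one cracked account.
    Returns (guesses_made, cracked_total, progress, remaining) where
    progress is a list of (guess_number, cracked_total) points for graphing.
    """
    remaining = Counter(targets)
    cracked = already_cracked
    progress = []
    made = 0
    for guess in guesses:
        if max_guesses is not None and made >= max_guesses:
            break
        made += 1
        hits = remaining.pop(guess, 0)
        if hits:
            cracked += hits
            progress.append((made, cracked))
        if not remaining:
            break  # nothing left to crack
    return made, cracked, progress, remaining


def run_from_stdin(target_path, max_guesses=None):
    """Hypothetical wiring: targets from a file, guesses piped in on stdin."""
    with open(target_path, encoding="utf-8", errors="replace") as handle:
        targets = [line.rstrip("\r\n") for line in handle]
    guesses = (line.rstrip("\r\n") for line in sys.stdin)
    return check_guesses(guesses, targets, max_guesses)
```

<div>
The progress list is what a guesses-versus-cracked graph like the one above would be drawn from, and the remaining counter is what you would write out if you wanted to feed the uncracked passwords into a later attack.</div>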
<br />
<b>Experiment 2) PRINCE and Dictionary Attacks targeting the MySpace list</b><br />
<b><br /></b>
<b>Experiment Setup</b>:<br />
This is the same as the previous test targeting the MySpace dataset, but this time using dictionary attacks. For JtR, I stuck with the default ruleset and the more advanced "Single" ruleset. I also ran a test using Hashcat and the ruleset Atom included along with PRINCE, (prince_generated.rule). For all the dictionary attacks, I used the RockYou top 100k dictionary to keep them comparable to the PRINCE attack.<br />
<br />
<b>Cracking Length: </b>I gave each session up to 1 billion guesses, but the two JtR attacks were so short that I only displayed the first 100 million guesses on the graph so they wouldn't blend in with the Y-axis. The hashcat attack used a little over 700 million guesses, so I annotated its final results on the graph. Side note, (and this merits another blog post): Hashcat performs its cracking sessions using word order, vs JtR's rule order. I suspect this is to make hashcat faster when cracking passwords using GPUs. You can read about the difference in those two modes in one of my <a href="http://reusablesec.blogspot.com/2008/10/password-cracking-geekiness.html">very first blog posts</a> back in the day. What this means is that Hashcat's cracking sessions tend to be much less front loaded unless you take the time to run multiple cracking sessions using smaller mangling rulesets.<br />
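<br />
For anyone who hasn't run into the word order vs rule order distinction, the difference is just which loop is on the outside. The sketch below uses three made-up mangling rules and a two-word dictionary; neither function is taken from JtR or Hashcat, they only illustrate the order in which guesses come out.<br />

```python
def rule_order(words, rules):
    """Apply the first rule to every word, then the second rule, and so on."""
    for rule in rules:
        for word in words:
            yield rule(word)


def word_order(words, rules):
    """Apply every rule to the first word, then every rule to the second word, and so on."""
    for word in words:
        for rule in rules:
            yield rule(word)


# Illustrative rules only, ordered from most to least likely to succeed.
RULES = [
    lambda w: w,               # try the word unchanged
    lambda w: w + "1",         # append a digit
    lambda w: w.capitalize(),  # uppercase the first letter
]
WORDS = ["password", "monkey"]  # most popular word first

print(list(rule_order(WORDS, RULES)))
print(list(word_order(WORDS, RULES)))
```

In rule order every word gets the most effective rule before any word gets a less effective one, which is why those sessions are front loaded. In word order the last word in the dictionary doesn't even get tried unmodified until every rule has been applied to all the words before it.<br />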
<br />
<b>Commands used:</b><br />
laki$ ../../../Tools/princeprocessor-0.12/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<br />
<br />
laki$ ../../../John/john-1.7.9-jumbo-7/run/john -wordlist=../../../dictionaries/Rockyou_top_100k.txt -rules=wordlist -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<br />
<br />
laki$ ../../../John/john-1.7.9-jumbo-7/run/john -wordlist=../../../dictionaries/Rockyou_top_100k.txt -rules=single -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000<br />
<div>
<br /></div>
<div>
<div>
laki$ ../../../hashcat/hashcat-0.48/hashcat-cli64.app --stdout -a 0 -r ../../../Tools/princeprocessor-0.12/prince_generated.rule ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/myspace.txt -m 1000000000</div>
</div>
<div>
<br /></div>
<div>
<b>Experiment Results:</b></div>
<div>
<b><br /></b>
<b><br /></b></div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_80EP7EUDMe-UOOUfsHa2FstV4OqsNkwa9SEmXaHHQySTLb8b214YByMbYCZc2COAMydm8-psEGipGgX38yp-E_mFvgx7Ek3JqoL_9xW3y7P4nQkDBEkKFr3w3fWzF-p3NG58bB2SMwxO/s1600/prince_dictionary.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi_80EP7EUDMe-UOOUfsHa2FstV4OqsNkwa9SEmXaHHQySTLb8b214YByMbYCZc2COAMydm8-psEGipGgX38yp-E_mFvgx7Ek3JqoL_9xW3y7P4nQkDBEkKFr3w3fWzF-p3NG58bB2SMwxO/s1600/prince_dictionary.png" height="215" width="320" /></a></div>
<b><br /></b></div>
<div>
As you can see, all of the dictionary attacks performed drastically better than PRINCE over the length of their cracking sessions. That's to be expected since their rulesets were crafted by hand while PRINCE generates its rules automatically on the fly. I'd also like to point out that once the normal dictionary attacks are done, PRINCE keeps on running. That's another way of saying that PRINCE still has a role to play in a password cracking session even if standard dictionary attacks initially outperform it. All this test points out is if you are going to be running a shorter cracking session you would be much better off running a normal dictionary based attack instead of PRINCE. This does lead to my next question and test though. After you run a normal dictionary attack, how does PRINCE do in comparison to a Markov brute force based attack?</div>
<div>
<br />
<br /></div>
<div>
<b>Experiment 3) PRINCE and JtR Wordlist + Incremental mode targeting the MySpace list</b><br />
<b><br /></b>
<b>Experiment Setup</b>:<br />
Based on feedback from Atom I decided to restructure this next test. First of all, Atom recommended using the full Rockyou list as an input dictionary for PRINCE. Since that is a larger input dictionary than just the first 100k most frequent passwords, I re-ran JtR's single mode ruleset against the MySpace list using the full Rockyou dictionary as well. I also used the most recent version of JtR, 1.8-jumbo1 based on the recommendation of SolarDesigner. This cracked a total of 23,865 passwords from the MySpace list, (slightly more than 64%). I then ran PRINCE, (the newer version 0.13) with the full RockYou dictionary, (ordered), and JtR Incremental=UTF8, (equivalent to "ALL" in the older version of JtR), against the remaining uncracked passwords. I also increased the cracking time to 10 billion guesses.<br />
<br />
Side note: I ran a third test with PRINCE using the RockYou top 100k input dictionary as well since the newer results were very surprising. I'll talk about that in a bit...<br />
<br /></div>
<b>Cracking Length: </b>10 billion guesses<br />
<br />
<b>Commands used:</b><br />
laki$ ../../../John/john-1.8.0-jumbo-1/run/john -wordlist=../../../dictionaries/Rockyou_full_ordered.txt -rules=single -stdout | python checkpass2.py -t ../../../Passwords/myspace.txt -u uncracked_myspace.txt<br />
<b><br /></b>
laki$ ../../../Tools/princeprocessor-0.13/pp64.app < ../../../dictionaries/Rockyou_full_ordered.txt | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865<br />
<br />
laki$ ../../../Tools/princeprocessor-0.13/pp64.app < ../../../dictionaries/Rockyou_top_100k.txt | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865<br />
<br />
laki$ ../../../John/john-1.8.0-jumbo-1/run/john -incremental=UTF8 -stdout | python checkpass2.py -t ../../../Passwords/uncracked_myspace.txt -m 10000000000 -c 23865<span style="background-color: white; font-family: 'Courier New', Courier, monospace; font-size: 14px; white-space: pre-wrap;"><br /></span><br />
<br />
<b>Experiment Results:</b><br />
<b><br /></b>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4dlo_fPIblusgd9pOGP0nWrLqVv6xZYom6VIpvy5_bvUdXSfTaNWtDmQ9fcWsmrZKg4KDLYEavtrDTOfV0lX-K5IP3eQudkOOvM0-d4aseqwaey9hdWq2WSBnv6bHn66Qhx_BgtIhtEn7/s1600/prince_and_brute_round2.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4dlo_fPIblusgd9pOGP0nWrLqVv6xZYom6VIpvy5_bvUdXSfTaNWtDmQ9fcWsmrZKg4KDLYEavtrDTOfV0lX-K5IP3eQudkOOvM0-d4aseqwaey9hdWq2WSBnv6bHn66Qhx_BgtIhtEn7/s1600/prince_and_brute_round2.png" height="215" width="320" /></a></div>
<br />I'll guiltily admit before running this test I hadn't been that impressed with PRINCE. That's because I had been running it with the top 100k RockYou dictionary. As you can see, with the smaller dictionary it performed horribly. When I ran the new test with the full RockYou dictionary though, PRINCE did significantly better than an Incremental brute force attack. Yes, cracking 1.5% more of the total set might not seem like much, but it will take Incremental mode a *long* time to catch up to that. Long story short though, PRINCE's effectiveness is extremely dependent on the input dictionary you use for it.<br />
<br />
Like most surprising test results, this opens up more questions than it answers. For example, what exactly is going on with PRINCE to make it so much more effective with the new dictionary? My current hypothesis is that it is emulating a longer dictionary attack, but I need to run some more tests to figure out if that's the case or not. Regardless, these results show that PRINCE appears to be a very useful tool to have in your toolbox if you use the right input dictionary for it.<br />
<br />
<h3>
Current Open Questions:</h3>
<div>
<ol>
<li>What is the optimal input dictionary to use for PRINCE? Yes the full RockYou input dictionary does well but my gut feeling is we can do better. That leads me to the next open question...</li>
<li>Can we make PRINCE smarter? Right now it transitions between dictionary attacks and brute force automatically, but beyond sorting the chains by keyspace it doesn't have much advanced logic in it. Perhaps if we can better understand what makes it effective we can make a better algorithm that is even more effective than PRINCE.</li>
</ol>
</div>
<br />
<b>Other References:</b><br />
<ul>
<li><a href="https://hashcat.net/forum/thread-3914.html">PRINCE Tutorial by Atom</a></li>
<li><a href="https://hashcat.net/tools/princeprocessor/">Grab the latest version of PRINCE</a></li>
</ul>
Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com5tag:blogger.com,1999:blog-496451536493805371.post-29669593954933533492014-12-09T08:29:00.000-08:002014-12-09T08:29:05.248-08:00Don't call it a comeback ... Ok, maybe it isHas it really been four years since my last post!? Well I guess it has! So a better question is: "What has changed for me to bring this blog back?"<br />
<br />
Well last February I got rid of my apartment, took leave of my job, and hiked all 2185.3 miles of the Appalachian Trail from Georgia to Maine. During that time I realized how much blogging meant to me. Long story short, I've been working with my employer to figure out how I could restart this blog now that I'm back.<br />
<br />
I don't want to spend too much time on this news post but I might as well end it on one more question: "What should you expect?" That's up in the air right now and I plan on remaining flexible, but I feel one thing I have to contribute to the research community is the fact that I really do enjoy constructing and running experiments. I may not be the best coder, l33t3st password cracker, or have a million dollar cracking setup, but I do have mad Excel skills and I love digging into algorithms. Right now I'm investigating the new PRINCE tool from At0m, creator of Hashcat, (<a href="https://hashcat.net/tools/princeprocessor/">You can get it here</a>) so hopefully I should have a post up about it in a couple of days.<br />
<br />
<br />
<table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"><tbody>
<tr><td style="text-align: center;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAazJ1bx3IWwATURyV3zz_h3wdOgqs4DwjP7CpYwKXeuPplgjeD2dn8_h1-R-gy8B8__GVsX49cYS2IXjB_T6SDeUgVqR3zyqZJe-ubxHgbaDXXb2wokTa5lGHoy77ZQ-iGqwArO1E-PHT/s1600/katahdin.JPG" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgAazJ1bx3IWwATURyV3zz_h3wdOgqs4DwjP7CpYwKXeuPplgjeD2dn8_h1-R-gy8B8__GVsX49cYS2IXjB_T6SDeUgVqR3zyqZJe-ubxHgbaDXXb2wokTa5lGHoy77ZQ-iGqwArO1E-PHT/s1600/katahdin.JPG" height="320" width="240" /></a></td></tr>
<tr><td class="tr-caption" style="text-align: center;">Balance, I have it</td></tr>
</tbody></table>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbBywGcOGR-t7_7GOSEt_XRCa3GWZMOhhEBmRFrO6smkxp89-10bxVKWW_ijcQhiONfKrNd-mh8pI_9D_dX5Kt1eL2ljh7aH6VusgBwttx-N9tmlvEt7VgLtZ_uyrLqxyk9Dar6TuU3Ccr/s1600/cliff.jpg" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhbBywGcOGR-t7_7GOSEt_XRCa3GWZMOhhEBmRFrO6smkxp89-10bxVKWW_ijcQhiONfKrNd-mh8pI_9D_dX5Kt1eL2ljh7aH6VusgBwttx-N9tmlvEt7VgLtZ_uyrLqxyk9Dar6TuU3Ccr/s1600/cliff.jpg" height="320" width="240" /></a></div>
Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com3tag:blogger.com,1999:blog-496451536493805371.post-14686277589709538002010-10-30T14:27:00.000-07:002010-11-02T02:13:27.106-07:00CCS Paper Part #2: Password Entropy<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR8c2S-2K0WK4569cZvJN3OOHl5OU-qBYeYxhHgw274Ve0aFHG4gtNYokOdyG_5zdeuRqobAbUndOz7Y-Ze8gME75NVhM1WzdnQAdF5hcT5xXUJit4Z3vOWMQCYh_fsk1Z1Jm5EcjLzXrI/s1600/entropy_nonpassword4.png"></a><div style="text-align: center;"><a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcVA6HumI_TZ08NU8Zoc45HdLCXsjEFcnWmZ9XRwG6aL5tV6rT1F9mdVBG-KjxYvo4Ww1JmrUSZEvnbPgAEneUtOUGDkOTDeRQNYRgbsAoqMYKJ-0_hTpRR7zO1RWOWTpeoZ-XIvuH3s91/s1600/square+peg+round+hole.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 355px; height: 236px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcVA6HumI_TZ08NU8Zoc45HdLCXsjEFcnWmZ9XRwG6aL5tV6rT1F9mdVBG-KjxYvo4Ww1JmrUSZEvnbPgAEneUtOUGDkOTDeRQNYRgbsAoqMYKJ-0_hTpRR7zO1RWOWTpeoZ-XIvuH3s91/s400/square+peg+round+hole.jpg" border="1" alt="Round Peg, Square Hole" id="BLOGGER_PHOTO_ID_5533961142447737154" /></a></div><div style="text-align: justify;">This is part #2 in a (mumble, cough, mumble) part series of posts discussing the results published in the <a href="http://goo.gl/YxRk">paper</a> I co-authored on the effectiveness of password security metrics. 
Part #1 can be <a href="http://reusablesec.blogspot.com/2010/10/new-paper-on-password-security-metrics.html">found here</a>.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I received a lot of insightful comments on the paper since my last post, (one of the benefits of having a slow update schedule), and one thing that stands out is people really like the idea of password entropy. Here’s a good example:</div><div style="text-align: justify;"></div><blockquote><div style="text-align: justify;">“As to entropy, I think it would actually be a good measure of password complexity, but unfortunately there's no way to compute it directly. We would need a password database comparable in size (or preferably much larger than) the entire password space in order to be able to do that. Since we can't possibly have that (there are not that many passwords in the world), we can't compute the entropy - we can only try to estimate it in various ways (likely poor)”</div></blockquote><div style="text-align: justify;">First of all I want to thank everyone for their input and support as I really appreciate it. This is one of the few cases though where I’m going to have to disagree with most of you. In fact, as conceited as it sounds, my main takeaway has been that I've done a poor job of making my argument, (or I’m wrong which is always a possibility). So the end result is another post on the exciting topic of password entropy ;)</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">When I first started writing this post, I began with a long description on the history of Shannon Entropy, how it’s used, and what it measures. I then proceeded to delete what I had written since it was really long, boring, and quite honestly not that helpful. 
All you need to know is:</div><div style="text-align: justify;"><ol><li>Claude Shannon was a <a href="http://en.wikipedia.org/wiki/Claude_Shannon">smart dude</a>.</li><li>No seriously, he was amazing; he literally wrote the <a href="http://netlab.cs.ucla.edu/wiki/files/shannon1949.pdf">first book</a> on modern code-breaking techniques.</li><li><a href="http://en.wikipedia.org/wiki/Shannon_entropy">Shannon entropy</a> is a very powerful tool used to measure information entropy/information leakage.</li><li>Another way of describing Shannon entropy is that it attempts to quantify how much information is unknown about a random variable.</li><li>It’s been effectively used for many different tasks; from proving one time pads secure, to estimating the limits of data compression.</li><li>Despite the similar sounding names, information entropy and guessing entropy are not the same thing.</li><li>Yes, I’m actually saying that knowing how random a variable is doesn’t tell you how likely it is for someone to guess it in N number of guesses, (with the exception of the boundary cases where the variable is always known – aka the coin is always heads – or when the variable has an even distribution – aka a perfectly fair coin flip).</li></ol></div><div style="text-align: justify;">Ok, I’ll add one more completely unnecessary side note about Shannon Entropy. Ask a crypto guy, (or gal), if the Shannon entropy of a message encrypted with a truly random and properly applied one time pad is equal to the size of the key. If they say “yes”, point and laugh at them. The entropy is equal to that of the original message, silly!</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Hey, do you know how hard it is to make an entropy related joke? 
I’m trying here… </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Anyways, to calculate the entropy of a variable you need to have a fairly accurate estimate of the underlying probabilities of each possible outcome. For example a trick coin may land heads 70% of the time, and tails the other 30%. The resulting Shannon entropy is just a summation of the probability of each event multiplied by the log2 of its probability, (and then multiplied by -1 to make it a positive value). Aka:</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><img src="http://upload.wikimedia.org/math/a/2/f/a2f05485301595188046d986c8cdd705.png" border="0" alt="" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 390px; height: 48px; " /></span></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">So the Shannon entropy of the above trick coin would be -(.7 x log2(.7) + .3 x log2(.3)) which is equal to 0.8812 bits. A completely fair coin flip’s entropy would be equal to 1.0. In addition, the total entropy of different <i> independent </i>variables is additive. 
This means the entropy of flipping the trick coin and then the fair coin would be .8812 + 1.0 = 1.8812 bits worth of entropy.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I probably should have put a disclaimer above to say that you can live a perfectly happy life without understanding how entropy is calculated.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">The problem is that while the Shannon entropy of a system is determined using the probability of the different outcomes, the final entropy measurement does not tell you about the underlying probability distribution. People try to pretend it does though, which is where they get into trouble. Here is a picture, (and a gratuitous <a href="http://en.wikipedia.org/wiki/Gnomes_(South_Park)">South Park</a> reference), that I used in my CCS presentation to describe NIST’s approach to using Shannon entropy in the SP800-63 document:</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjEgj7LSXmvPuZ3hKfdfcApDfo-mE-1oAlptvmKUIIkcFhUSOYaRH7fnNgZRFukALmSL2p2EIK8HC1h_MmOe4Q4l42ihjlPRaGHV9nefS8k7OdYwExCorg-1_I9od3TJsb7JMugz42mSpL7/s320/underpants.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5534013793703012914" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 320px; height: 220px; " /></span></div><div><span class="Apple-style-span"><br /></span></div><div style="text-align: justify;"> Basically they take a Shannon entropy value, assume the underlying probability distribution is even, and go from there. 
Why this is an issue is that when it comes to human generated passwords, the underlying probability distribution is most assuredly not evenly distributed. People really like picking “password1”, but there is always that one joker out there that picks a password like “WN%)vA0pnwe**”. That’s what I’m trying to say when I show this graph:</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeZq2A6NJvdPqIeghUlUae7jseGpAtku5oIx1_aZs7i2briNXSWeg5mAWrs04jvtjMyVVAcW4PEivn4-NldlHEKIVF9tilFoUvZbt_vlc9O4kX8pPGmwWqBp40NQfcGFs9pORBil857fpP/s400/NIST_model_issues.png"><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeZq2A6NJvdPqIeghUlUae7jseGpAtku5oIx1_aZs7i2briNXSWeg5mAWrs04jvtjMyVVAcW4PEivn4-NldlHEKIVF9tilFoUvZbt_vlc9O4kX8pPGmwWqBp40NQfcGFs9pORBil857fpP/s400/NIST_model_issues.png" border="0" alt="" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 227px; " /></a></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">The problem is not that the Shannon value is wrong. It’s that an even probability distribution is assumed. To put it another way, unless you can figure out a method to model the success of a realistic password cracking session using just a straight line, you’re in trouble.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Let me make this point in another way. A lot of people get hung up on the fact that calculating the underlying probability distribution of a password set is a hard problem. 
So I want to take a step back and show you this holds true even if that is not the case.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">For an experiment, I went ahead and designed a variable that has 100 possible values that occur at various probabilities, (thanks Excel). This means I know exactly what the underlying probability distribution is. This also means I’m able to calculate the exact Shannon entropy as well. The below graph shows the expected guessing success rate against one such variable compared to the expected guessing success generated by assuming the underlying Shannon entropy had an even distribution.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgi_XwhMJj94ZYlEJvM-CDAd5zmOcwW-p1vuoi427qLfRPVFwHfAX443QghYCv_Zi0I_SSApV1WGPB0cKfVKZZ8KWit5IBj4vblE9uwjB6FmqMX72QzigO3wpORE-c4TdULGVUqqmglZEFC/s400/entropy_nonpassword1.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5534342710967658642" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 241px; " /></span></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Now tell me again what useful information the Shannon entropy value tells the defender about the resistance of this variable to a guessing attack? 
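<br /><br />
The comparison in that graph can be sketched in a few lines of Python. This is not the Excel data behind the graph: the Zipf-like distribution and all of the function names below are assumptions chosen purely for illustration. The sketch computes the Shannon entropy of a known distribution, (including the trick coin from earlier), and then compares the real chance of guessing the value in N tries against what you would predict by treating that entropy as if it came from an even distribution.<br />

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p)) over all outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def guess_success(probs, guesses):
    """Chance of success within `guesses` tries when guessing the most likely values first."""
    return sum(sorted(probs, reverse=True)[:guesses])


def uniform_model_success(entropy_bits, guesses):
    """Predicted success if the entropy came from 2**entropy_bits equally likely values."""
    return min(1.0, guesses / 2 ** entropy_bits)


def skewed_distribution(size=100):
    """Hypothetical Zipf-like distribution: the value at rank r has weight 1/r."""
    weights = [1 / rank for rank in range(1, size + 1)]
    total = sum(weights)
    return [weight / total for weight in weights]


# The trick coin and the fair coin; entropy of independent variables adds.
trick_coin = shannon_entropy([0.7, 0.3])
fair_coin = shannon_entropy([0.5, 0.5])
print(trick_coin, fair_coin, trick_coin + fair_coin)

# A 100-value variable with a fully known, skewed distribution.
probs = skewed_distribution(100)
bits = shannon_entropy(probs)
for n in (1, 5, 10, 25, 50):
    print(n, guess_success(probs, n), uniform_model_success(bits, n))
```

With a skewed distribution like this one, the first few guesses succeed far more often than the straight line from the even-distribution model predicts, even though the entropy value itself was calculated exactly.<br /><br />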
What’s worse is the graph below that shows 3 different probability distributions that have approximately the same entropy, (I didn’t feel like playing around with Excel for a couple of extra hours to generate the EXACT same entropy; this is a blog and not a research paper after all).</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhwblpsiwGowBqpFe6VF8_nPJIcTPm6g1SiO8H-7wCIwTRLcb9Pod7FCXH7R02F4Dy6cYdytysZp1XqIowS3-35v_LSRKf9M46PJTYnN4ScIjOqolOq94RPkiJK8R3_jZAHU3vVDraSDpUY/s400/entropy_nonpassword3.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5534344437274116130" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 241px; " /></span></div><div><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><br /></span></div><div style="text-align: justify;">These three variables have very different resistance to cracking attacks, even though their entropy values are essentially the same. 
If I want to get really fancy, I can even design the variables in such a way that the variable with a higher Shannon entropy value is actually MORE vulnerable to a shorter cracking session.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR8c2S-2K0WK4569cZvJN3OOHl5OU-qBYeYxhHgw274Ve0aFHG4gtNYokOdyG_5zdeuRqobAbUndOz7Y-Ze8gME75NVhM1WzdnQAdF5hcT5xXUJit4Z3vOWMQCYh_fsk1Z1Jm5EcjLzXrI/s1600/entropy_nonpassword4.png"><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjR8c2S-2K0WK4569cZvJN3OOHl5OU-qBYeYxhHgw274Ve0aFHG4gtNYokOdyG_5zdeuRqobAbUndOz7Y-Ze8gME75NVhM1WzdnQAdF5hcT5xXUJit4Z3vOWMQCYh_fsk1Z1Jm5EcjLzXrI/s400/entropy_nonpassword4.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5534351408839065682" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 241px; " /></a></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">This all comes back to my original point that the Shannon entropy doesn’t provide “actionable” information to a defender when it comes to selecting a password policy. Even if you were able to perfectly calculate the Shannon entropy of a password set, the resulting value still wouldn’t tell you how secure you were against a password cracking session. What you really want to know as a defender is the underlying probability distribution of those passwords instead. 
That's something I've been working on, but I’ll leave my group’s attempts to calculate that for another post, (hint, most password cracking rule-sets attempt to model the underlying probability distribution because they want to crack passwords as quickly as possible).</div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com7tag:blogger.com,1999:blog-496451536493805371.post-33674517479169158442010-10-07T07:23:00.000-07:002010-10-07T21:09:24.488-07:00New Paper on Password Security Metrics<div style="text-align: justify;">I'm in Chicago at the <a href="http://www.sigsac.org/ccs/CCS2010/">ACM CCS conference</a>, and the paper I presented there: "Testing Metrics for Password Creation Policies by Attacking Large Sets of Revealed Passwords", is now available online.</div><div><ul><li style="text-align: justify;"><a href="http://goo.gl/wqcX">Direct Download of PDF</a></li><li style="text-align: justify;"><a href="http://goo.gl/YxRk">View Online</a></li></ul></div><div style="text-align: justify;">Since I had the paper and presentation approved through my company's public release office I was given permission to blog about this subject while the larger issue of my blog is still going through the proper channels. Because of that I'm going to limit my next couple of posts to this subject rather than talking about the CCS conference as a whole, but let me quickly point you to the amazing paper "<a href="http://www.cs.unc.edu/~yinqian/password.html">The Security of Modern Password Expiration: An Algorithmic Framework and Empirical Analysis</a>", written by Yinqian Zhang, Fabian Monrose and Michael Reiter. 
In short, they managed to obtain a great dataset, their techniques were innovative and sound, and there's some really good analysis on how effective password expiration policies really are, (spoiler: forcing users to change their password every six months isn't very useful).</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I'd like to start by acknowledging the other authors who contributed to the "Testing Password Creation Metrics..." paper.</div><div style="text-align: justify;"><ul><li><b>Dr. Sudhir Aggarwal </b>- Florida State University: My major professor, who spent I don't know how many hours helping walk me through the subtle intricacies of information entropy.</li><li><b>Michael Collins</b> - Redjack LLC: Another data driven researcher, and much cooler than me since he uses GNUPlot instead of Excel ;)</li><li><b>Henry Stern</b> - Cisco IronPort: He was the driving force behind getting this paper written. It was over lunch at the Microsoft Digital Crime Consortium, (it's a conference to combat cybercrime, and not a group of people from Microsoft looking to commit digital crime like the name implies...), that the framework for this paper was laid out.</li></ul>As for the contents of the paper, I'm planning on breaking the discussion about it down into several different posts, with this post here being more of an overview. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">When writing this paper, we really had two main goals:</div><div style="text-align: justify;"><ol><li>How does the NIST model of password entropy as defined in SP800-63 hold up when exposed to real password datasets and realistic attacks?</li><li>How much security is actually provided by typical password creation policies, (aka minimum length, character requirements, blacklists)?</li></ol><div>Based on our results, we then looked at the direction we would like password creation policies to move in the future. 
This ended up with us suggesting how to turn our probabilistic password cracker around, and instead use it as part of a password creation strategy that allows people to create passwords however they like, as long as the probability of the resulting password remains low.</div><div><br /></div><div>Of all that, I feel our analysis of the NIST password entropy model is actually the most important part of the paper. I know it sounds like an esoteric <a href="http://en.wikipedia.org/wiki/Inside_Baseball">inside baseball</a> subject, but the use of NIST's password entropy model has a widespread impact on all of us. This is because it provides the theoretical underpinning for most password creation policies out there. Don't take my word for how widespread the use of it is. Check out the <a href="http://en.wikipedia.org/wiki/Password_strength">Wikipedia article</a> on password strength, (or better yet, read the <a href="http://en.wikipedia.org/wiki/Talk:Password_strength">discussion page</a>) for yourself.</div><div><br /></div><div>Our findings were that the NIST model of password entropy does not match up with real world password usage or password cracking attacks. If that wasn't controversial enough, we then made the even more substantial claim that the current use of <a href="http://en.wikipedia.org/wiki/Entropy_(information_theory)">Shannon Entropy</a> to model the security provided by human generated passwords at best provides no actionable information to the defender. At worst, it leads to a defender having an overly optimistic view of the security provided by their password creation policies while at the same time resulting in overly burdensome requirements for the end users.</div><div><br /></div><div>Getting in front of a room full of crypto experts and telling them that Shannon Entropy wasn't useful to evaluate the security of password creation policies and "We REALLY need to STOP using it", was a bit of a gut clenching moment. 
That's because the idea of information entropy is fairly central to the evaluation of most cryptographic algorithms. I would have never done it except for the fact that we have a lot of data backing this assertion up. The reason we are making the broader point is because it's tempting to dismiss the flaws in the NIST model by saying that NIST just estimated the entropy of human generated passwords wrong. For example, if you juggle the constants around or perhaps look at word entropy vs character entropy, things will work out. Our point though is not that you can't come up with a fairly accurate Shannon entropy model of human generated passwords. You most assuredly can. It's just that it's not apparent how such a model can provide "actionable information". In addition, the way we currently use Shannon Entropy in evaluating password security policies is fundamentally flawed.</div><div><br /></div><div>This subject really does require another blog post, but before I head back to Boston I wanted to leave you with one of the graphs from our paper that demonstrates what I'm talking about:</div><div><br /></div><div><span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgeZq2A6NJvdPqIeghUlUae7jseGpAtku5oIx1_aZs7i2briNXSWeg5mAWrs04jvtjMyVVAcW4PEivn4-NldlHEKIVF9tilFoUvZbt_vlc9O4kX8pPGmwWqBp40NQfcGFs9pORBil857fpP/s400/NIST_model_issues.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5525517243002256002" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 227px; " /></span></div><div><br /></div><div>The above graph shows cracking sessions run against passwords that met different minimum length password creation requirements, (aka must be at least seven characters long). 
The NIST estimated cracking speed is based on the calculated NIST entropy of passwords created under a seven character minimum password creation policy. You may notice that it overestimates the security of the creation policy over shorter cracking sessions, but at the same time doesn't model longer cracking sessions either. This is why I keep on saying that it doesn't provide "actionable intelligence", (third time and counting). When we say "password entropy" what we really want to know is the Guessing Entropy of a policy. Unfortunately, as a community, we keep using Shannon entropy instead. Guessing entropy and Shannon entropy are two very different concepts, but unfortunately there doesn't exist a very good way of calculating the guessing entropy, while calculating the Shannon entropy of a set of text is well documented. This is part of the reason why people keep trying to use Shannon entropy instead.</div><div><br /></div><div>So I guess I should end this post by saying, if any of this sounds interesting please read the paper ;)</div></div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com3tag:blogger.com,1999:blog-496451536493805371.post-33874842368351620902010-09-10T17:56:00.000-07:002010-09-10T18:13:24.637-07:00Quick Status Update<div style="text-align: justify;">This is just a quick post to let you know that I for once have a valid excuse for not updating this blog in a timely manner. I actually found a job! Thanks to everyone who offered help, recommendations and encouragements. The only catch is that right now it's being decided if I have to run my posts through our public release office or not. Don't worry, this blog is not going away regardless of the decision. It might just gain a few unwilling readers ;)</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">As to my new company, I'm going to keep that a bit of an open secret. This blog reflects my personal views. 
I certainly don't speak for them, and I plan on avoiding any topics that have to do with my day job, (Don't worry, I'm not doing any password cracking there).</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Once again thanks, and I'll resume posting once I get the OK and can update this blog while complying with company policies. I just want to make sure I handle this situation the right way.</div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com1tag:blogger.com,1999:blog-496451536493805371.post-59832808699621935732010-07-28T10:39:00.000-07:002010-07-28T10:56:18.780-07:00Defcon Crack Me if You Can Competition<div style="text-align: justify;">I'd be remiss if I didn't spend a little time talking about the "Crack Me if you Can" competition at Defcon. It's really been amazing the amount of interest that this contest is drumming up. People are excited; it seems like everyone is refining their mangling rules, putting together new wordlists, and finishing up various password cracking tools. The impact that this is having on the password cracking community as a whole is hard to overstate. Needless to say, I'm a fan of that, and I have a ton of respect for Minga and the folks at KoreLogic for putting this together.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I'll be participating, though I certainly don't plan on winning. What I'm really looking forward to though is the chance to meet with everyone else and learn what other people are doing. I'm hoping this turns into an event like the lockpicking village with the contest being almost besides the point. Of course I might be saying that because I'm going to get creamed as well...</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Since I've had a few people ask me about the competition itself, here's my two cents. 
My biggest concern is that the passwords we will be cracking aren't real. This isn't a criticism. There's no way you could run this competition with real corporate passwords, (well, legally that is...). It's just something to keep in mind. What will be interesting though is applying the techniques learned from the winner, (part of the rules are that you have to disclose your cracking techniques), to other datasets as they become available. That's why I have this blog. I might not be the best password cracker out there, but I can certainly run other people's attacks and plot the results on Excel ;)</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">If I had to hazard a guess, here are some predictions of mine about the contest:</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">1) Most passwords will be based on relatively common dictionary words. Way more so than you would find normally.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">2) Most of the cracking will center around applying the correct mangling rules. Yes there will be the 'Dictionary123' words, but I expect most 'high score' passwords will have less common rules such as 'xD1ct1onaryx'.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">3) There will probably be some LANMAN passwords, so bring your rainbow tables.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">4) I expect there to be so many NTLM passwords that rainbow tables for them won't be cost effective.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">5) I'll be interested to see if they have any 'exotic' password hashes. 
WinRAR, TrueCrypt, etc.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">6) It'll be a ton of fun ;)</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I'll see you guys there.</div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com2tag:blogger.com,1999:blog-496451536493805371.post-66516463920279935732010-07-03T18:08:00.000-07:002010-07-03T22:20:16.894-07:00Protecting Physical Documents<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLRSW09mXebVJ3lHF-rFjfpREZbxsx06HJmH5eXgg0n-BDAgv4U4qxUjLyPoE5BKAR3ERBmu1b32r4a65_OAMkfDZ1RgvTG55VN0w9PiJhgW2myQoGA9TRV9xQo_sKGsAv8_mffStRSQ5N/s1600/Broken_car.JPG"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgLRSW09mXebVJ3lHF-rFjfpREZbxsx06HJmH5eXgg0n-BDAgv4U4qxUjLyPoE5BKAR3ERBmu1b32r4a65_OAMkfDZ1RgvTG55VN0w9PiJhgW2myQoGA9TRV9xQo_sKGsAv8_mffStRSQ5N/s400/Broken_car.JPG" border="0" alt="" id="BLOGGER_PHOTO_ID_5489854572855244386" /></a><div style="text-align: justify;"><br /></div><div style="text-align: justify;">The above picture is of my Subaru Baja. Sometime last night, someone broke my back window, and stole almost everything from my car, (they apparently did not like my music; btw Punk is not dead). Normally this would be a costly annoyance, but in this case I'm in the process of moving and finding a new job. While most of my stuff is sitting safely in a storage locker, I still had several boxes of various items stored in my back seat, including unfortunately my "to-go" bag.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">My to-go bag contained all of my important possessions that I planned on grabbing if my house was in the process of burning down. 
For example, it contained two thumb drives with backups of all of my work plus assorted other documents. I'm not worried about them though since I used TrueCrypt. Good luck cracking those, which, BTW, is the reason I love TrueCrypt. What I am concerned about though is that my social security card, my passport, my birth certificate, my extra banking checks, and a whole lot of other important paper documents were also taken. Visualize everything you don't want to be stolen and you have a pretty good idea of what was in that bag.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Of course the next question is "Why the Hell did you leave such important documents in your car?" My response is: laziness and a poor threat model. I needed several of those documents when getting a job. A driver's license by itself doesn't cut it, as you need at least two other forms of ID. Rather than grab those two documents out of my bag and leave the rest in the storage locker I grabbed the whole thing in case there was anything else I needed. Normally I also wouldn't leave it in my truck if I was planning on going somewhere sketchy, but I was parking in a well-lit hotel parking lot, and quite simply I had my hands full hauling my bag of clothes, my suit and my computer bag into my room. Darned if I wanted to make a second trip back down to my car. What's worse though is I rationalized it away by saying that at least my car was locked, unlike my hotel room where any of the housekeeping staff could access it during my stay; plus my car's never been broken into before... The reason that's worse is because I didn't simply forget my bag; I realized it might be a problem and then actively convinced myself that, "No, everything really is ok".</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I have to confess that this is a difficult post for me to write. 
I was counseled by several friends never to mention this incident to anyone, (besides the cops and other officials of course). Their comments were along the lines of "You're looking for a security job, and you just had your life stolen because you left it in the back of a car. Even I wouldn't hire you now!" Of course that was said in jest, but the concern is real. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">I'm willing to take that risk though for several reasons. First of all I'm really angry, both at the person who did this and myself. Actually mostly at myself which is seriously messed up. By talking about this incident hopefully I can gain a bit more control over the situation. Second, I'm a big fan of disclosure. If the above description doesn't sound like a typical computer attack, let me rewrite it for you:</div><div><blockquote style="text-align: justify;"><i>The attacker performed an SQL injection attack against subaru_bajas_rule.com. After gaining access to the database they downloaded the user's social security number, banking information, and other personally identifiable information. Afterwards the attacker performed a 'drop table users' destroying the local copy of the data. When the site administrator was asked about this, he responded that he knew SQL injection attacks were common, but he never expected to be targeted by one. As for the reason why the user data was accessible, the administrator admitted the site was in the process of transitioning to a new forum software, and that if the attack happened a week later when the new forum software was in place, this wouldn't have been a problem.</i></blockquote></div><div style="text-align: justify;">I've always felt that security incidents can happen to anyone, and what's important is to focus on the remediation, and use them as a learning tool to make sure the same attack doesn't happen again. 
That's one nice thing about having a blog, I'm on record as saying much the same thing <a href="http://reusablesec.blogspot.com/2009/08/defcon-17-roundup.html">when talking about the ZF0 attack</a>, so at least this isn't a new found belief I came to after finding myself completely 0wned ;)</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">So in the spirit of full disclosure I wanted to talk about this attack in a public forum where hopefully it will benefit other people, and if someone doesn't want to hire me because I'm not perfect, well at least they found out now. So on to a more detailed analysis of the attack:</div><div style="text-align: justify;"><ol><li>Only items in the back seat were stolen. Since there was so much stuff it looks like the attacker grabbed what they could, and then left without doing a full search of the car. I just was really unlucky that everything I cared about was in the back seat.</li><li>When this happened, my car was parked at the far end of the parking lot, since I'd rather walk than squeeze into a small spot. This was a serious mistake considering all of the stuff I had in my car.</li><li>While there was a night watchman, he did not notice the attack. Likewise there were no sensors collecting forensically useful data, (aka cameras, and the thief did not leave any useable fingerprints).</li><li>I'm much too focused on digital issues. The fact that I bothered to encrypt my electronic documents and then store them with paper documents that are way more valuable to an attacker shows a serious lack of priorities and/or threat modeling. 
I'm not saying don't encrypt your files, but simply that I should have taken the same care with my other documents and locked them up in my hotel room safe.</li><li>The mindset, "Just because something bad hasn't happened to me in the past means it won't happen to me in the future" is very hard to avoid.</li><li>I really hope all that is happening right now is some 16 year old kid is trying to use my passport to buy booze. That being said, I need to plan for much more serious scenarios, hence closing my old bank account rather than just canceling my checks, signing up to a credit check service, etc.</li><li>Dealing with issues like this on a holiday weekend is extremely difficult. Canceling my old bank account and moving it to a new one was particularly stressful since I had a couple of hours to do it before the bank closed for the long weekend. Likewise, I will have to wait till Tuesday to have my car window replaced. Of course, cyberattacks never happen <a href="http://www.huffingtonpost.com/2009/07/07/massive-cyber-attack-knoc_n_227483.html">during a holiday</a>... </li><li>Security policies are important, but what's more important is enforcement of those policies. You really do need some force from on high telling people, "Yes, it is a pain to take a second trip down to your car, but you are going to do it anyway".</li><li>Storing all of your valuables in one place has advantages and disadvantages. I still don't know if the idea of a to-go bag was a fundamentally bad idea, but I certainly should have "checked out" my two forms of id from it rather than taking the whole thing.</li><li>What makes a fiasco is a cascade of multiple smaller mistakes/failures occurring together. 
Whether we are talking about the BP oil spill, or <a href="http://www.thisamericanlife.org/radio-archives/episode/61/Fiasco%21">a horribly hilarious Peter Pan play</a>, serious problems often are not the result of just one thing going wrong, but several poor decisions.</li><li>Combined with my comments in the previous paragraphs, it's really easy to analyze all of the stuff that I did wrong after the fact. The problem is that all of my decisions seemed so reasonable <a href="http://www.epicfail.com/wp-content/uploads/2009/06/bike-security-fail1.jpg">at the time</a>. On the other hand, you can't live <a href="http://www.offthemarkcartoons.com/cartoons/2000-03-21.gif">a perfectly secure life</a>. What I'm wrestling with right now is how to re-evaluate my personal threat models and learn from this incident without letting it ruin my life.</li></ol><div>Well, that about does it for now. Hopefully I can get back to the focus of this blog, the academic study of password cracking techniques, soon. This whole real life security thing can be pretty annoying...</div></div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com6tag:blogger.com,1999:blog-496451536493805371.post-51205561326668541632010-05-31T01:27:00.000-07:002010-05-31T11:51:48.359-07:00Carders.cc - General Observations and Updates - Part 3<div style="text-align: justify;"><div style="text-align: justify; ">Digging into this data is like watching an <a href="http://www.youtube.com/watch?v=j1PAB6Sgdp8">episode of Lost</a>. 
Whenever it seems like one question gets answered, about ten other questions pop up.</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">Before I get into details, I want to start with a comment <a href="http://securitynirvana.blogspot.com/">Per Thorsheim</a> sent me as to what other password cracking programs support salted sha1 hashes:</div><div><p class="MsoPlainText"></p><blockquote><p class="MsoPlainText" style="text-align: justify; "><i>The sha1(lowercase_username.password_guess) is at least supported by these:</i></p><p class="MsoPlainText" style="text-align: justify; "><i>Extreme GPU Bruteforcer (</i><a href="http://www.insidepro.com/"><i>www.insidepro.com</i></a><i>) hashcat and oclhashcat (cpu/gpu respectively) </i><a href="http://www.hashcat.net/"><i>www.hashcat.net</i></a></p></blockquote><p class="MsoPlainText"><a href="http://www.hashcat.net/"></a></p></div><div style="text-align: justify; ">I'm kicking myself for not thinking about hashcat, since it's an extremely powerful password cracker; plus it's free. Unfortunately the GPU version doesn't support the salted sha1 hash type, but even the non-gpu version is quite nice.</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">As for InsidePro, it also is very good, though it does cost some money. I've had a license-free version of questionable origin offered to me before, but I turned that down. Legality aside, installing pirated software given to you by shady people at a hacker conference would just be stupid...</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">Also, I've been talking to <a href="http://twitter.com/iagox86">Ron Bowes</a> over at the excellent <a href="http://www.skullsecurity.org/blog/">skullsecurity blog</a>, and he and some other people are hard at work cracking the passwords. 
It sounds like they have some serious hardware behind the effort, so expect to see something posted on his site about that in the near future.</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; "><b>OK, now onto the analysis:</b></div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">First of all, I'm downgrading my opinion of the skill shown by the hackers in their password cracking attack. What I didn't realize before was the extent to which the users of carders.cc had been compromised previously. Just about all of the non-trivial passwords that were cracked appear in publicly available input dictionaries which are based on passwords cracked from user submitted hashes - aka hashkiller, insidepro, etc. Please note, I'm not saying they were script kiddies. The attackers were able to target the salted sha1 hash, and they knew where to get some good input dictionaries. It's just that they are not some uber-l33t password crackers, and anyone else using those input dictionaries could crack the same number of passwords in a couple of hours.</div><div><br /></div><div style="text-align: justify; ">What this also means is we might be able to figure out which input dictionaries the attackers were using by looking at the hashes they cracked, vs. what the input dictionary would crack. 
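</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">The comparison described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash format is the sha1(lowercase_username.password_guess) form quoted earlier in this post, with the "." read as plain string concatenation, and the usernames, passwords, wordlist, and function names are all made up for the example. A real run would use a proper cracker such as hashcat or John the Ripper rather than a Python loop.</div>

```python
import hashlib


def smf_sha1(username, password):
    """sha1(lowercase_username.password), reading '.' as concatenation (an assumption)."""
    data = (username.lower() + password).encode("utf-8")
    return hashlib.sha1(data).hexdigest()


def cracked_by(wordlist, targets):
    """Return the set of (username, password) pairs that a wordlist recovers.

    targets maps each username to its hex digest.
    """
    found = set()
    for user, digest in targets.items():
        for word in wordlist:
            if smf_sha1(user, word) == digest:
                found.add((user, word))
                break
    return found


# Made-up accounts, purely for illustration.
targets = {
    "Alice": smf_sha1("Alice", "letmein"),
    "bob": smf_sha1("bob", "dragon42"),
    "carol": smf_sha1("carol", "Tr0ub4dor"),
}

# Hypothetical stand-ins for the attackers' published cracks and for one
# candidate input dictionary we want to test.
attackers_cracked = {("Alice", "letmein"), ("bob", "dragon42")}
candidate_dictionary = ["password", "letmein", "dragon42"]

ours = cracked_by(candidate_dictionary, targets)
overlap = ours & attackers_cracked
print(len(overlap), "of", len(attackers_cracked), "attacker cracks reproduced")
print("cracked by the dictionary but not by the attackers:", ours - attackers_cracked)
```

<div style="text-align: justify; ">A dictionary that reproduces most of the attackers' cracks while cracking little that they missed is a plausible candidate for one they actually used.</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">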
To demonstrate this, below is a Venn diagram of what an input dictionary the attackers used would look like:</div><div><b><br /></b></div><div><b><img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhigUeHAuPvOZ2UaEScHdq50wmyLIPyascD9NOGIndz4yWXQaOFnHz1jb_lrxiRZuhuhA4QYF_pwImCvKVQzWNrd1fwC3cnYya-8gjooSD4n64ZyMUTk6uCHFf-wGOVKwhiafku5Cs53y1m/s400/venn-dictionary.png" border="1" alt="" id="BLOGGER_PHOTO_ID_5477368519471341234" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 316px; height: 294px; " /></b></div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">For example, there is a good chance the attackers used the <a href="http://depositfiles.com/files/pig7qzee1">InsidePro Big dictionary</a>, since it cracks the same 598 passwords from the carders.cc list that the attackers cracked. Add in a couple of the other publicly available input dictionaries, and you get real close to the 920 total passwords they managed to crack. To put this in perspective, I've so far managed to crack 62%, (compared to the 53% that the hackers cracked), of the salted Sha1 passwords on my laptop using basic dictionaries with almost no mangling rules. I fully expect other people to blow past that mark.</div><div style="text-align: justify; "><br /></div><div style="text-align: justify; "><br /></div><div style="text-align: justify; "><b>Next up, initial thoughts about the carders.cc database:</b></div><div style="text-align: justify; "><br /></div><div style="text-align: justify; ">What I really need to do is load all of the tables into my own database so I can do quick SQL queries, vs. manually going through the data by hand. 
That being said there are a few things that stick out:</div><div style="text-align: justify; "><ol><li>There's a lot of userids/password hashes in the <i>carders_smf_members</i> table that did not appear in the write-up.</li><li>The MD5 hash is salted, but at least the salt is also available in the <i>carders_smf_members </i>table. The hash itself is a vBulletin3 hash type, MD5(MD5(Password).Salt). Both John the Ripper and Hashcat support this hash type.</li><li>I'm not sure if the IP addresses stored in the table are accurate or not, since it looks like the site admins tried to obscure it in the webserver logs, but if they are, the database stores the last two ip addresses used.</li><li>Other interesting fields include date joined, number of posts, karma level, last login date, etc.</li><li>The above doesn't even begin to get into all of the data contained in the actual posts themselves...</li></ol><div>That's about it for this update. Let me leave you with one last fact that Ron found out:</div><div><p class="MsoPlainText"></p><blockquote><p class="MsoPlainText"><i>Another interesting factoid:</i></p> <p class="MsoPlainText"><i>Last MD5 password: 2010-01-10 21:54:01</i></p> <p class="MsoPlainText"><i>First SHA1 password: 2010-01-10 22:40:16</i></p> <p class="MsoPlainText"><i>So it was on January 10, 2010, later in the evening (in CDT) that they upgraded from vBulletin 3 to SMF.</i></p><p class="MsoPlainText"><i>*shrug* the more you know! 
</i></p></blockquote><p class="MsoPlainText"></p></div></div></div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com0tag:blogger.com,1999:blog-496451536493805371.post-9587187267256650612010-05-26T18:34:00.000-07:002010-05-31T01:27:37.583-07:00Carders.cc - Analysis of Password Cracking Techniques - Part 2<div style="text-align: justify;"><blockquote></blockquote>So I figure I probably should get around to looking at the passwords in this list, since password cracking techniques are the focus of this blog...</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">First though, a real quick definition. I needed to decide what to call the various parties involved in this whole shenanigans. For example, when I'm talking about the 'hackers', am I referring to the people collecting stolen credit card data who belonged to the board, or the people who hacked carders.cc? Likewise, if I use the term criminals, that could refer to both groups as well. Therefore, in my blog posts I'm going to use the following terms to refer to the two groups:</div><div><ul><li style="text-align: justify;"><b>Carders/Users</b>: The people who belonged to the board. 
Normally I would also use the term 'victims', but I don't want to honor them with that title.</li><li style="text-align: justify;"><b>Hackers/Attackers</b>: The people who broke into the forum and posted the data online.</li></ul><div style="text-align: justify;">Ok, now that we have that out of the way, the rest of this post is going to be broken up into four parts:</div><div><ol><li style="text-align: justify;">Executive Summary</li><li style="text-align: justify;">Brief Description of the Password List</li><li style="text-align: justify;">General Observations</li><li style="text-align: justify;">Setting Up a Password Cracker to Target Salted Sha1 Hashes</li></ol><div style="text-align: justify;">More in-depth analysis will have to wait till later since I need more time to go through the data. Also, I'm moving up to the D.C. area in a week to look for a job/start interviewing, so large scale password cracking sessions are not really feasible for me to perform right now. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><i>-BTW, if you have a job opening please feel free to contact me.</i> </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Finally, I apologize but this post is going to be a bit more haphazard than the above outline implies, simply because it's hard to talk about one aspect of this dataset without discussing other aspects as well.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>Executive Summary:</b></div><div style="text-align: justify;"><b><br /></b></div><div style="text-align: justify;">Carders.cc, an online forum for the buying and selling of stolen bank and credit card data, was broken into sometime before May 5th by vigilante hackers. 
On May 18th, the hackers then posted data about the users to various file sharing sites, (pastebin, rapidshare), which included IP addresses, usernames, e-mail addresses, and all of the saved forum posts. More significant to this report, the attackers also posted the users' password hashes along with the plaintext passwords that the attackers had managed to crack. These passwords and password hashes provide significant data for researchers. From this data, we now know that cyber-criminals often select weak passwords. Furthermore, by analyzing these passwords we can develop better password cracking techniques that target the specific type of users, (criminals), whom law enforcement is the most interested in. This is an improvement from some of the other disclosed password datasets which were gathered from the general public. In addition, by looking at the passwords the attackers managed to crack, along with the passwords they failed to crack, we can gain a better understanding of the capabilities and techniques employed by hacker groups in real life attacks. 
This hacker group in particular appears to be quite proficient in password cracking techniques, possessing both the tools to attack salted SHA1 password hashes, and the ability to crack close to 53% of the passwords in the dataset.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>Brief Description of the Password List(s):</b></div><div style="text-align: justify;"><b><br /></b></div><div style="text-align: justify;"><b><span class="Apple-tab-span" style="white-space:pre"> </span><span class="Apple-style-span" style="font-weight: normal;">The first thing that stands out is there are actually two lists of passwords present in the attacker's write-up, though all of the password hashes are mixed together.</span></b></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>List 1)<span class="Apple-style-span" style="font-weight: normal;"> </span></b></div><div style="text-align: justify;"><b><span class="Apple-style-span" style="font-weight: normal;"></span></b></div><blockquote><div style="text-align: justify;"><b><span class="Apple-style-span" style="font-weight: normal;">Contains 1737 salted Sha1 password hashes using the following format: </span></b></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b><span class="Apple-style-span" style="font-weight: normal;"><b>Sha1(username.password)</b></span></b></div><div style="text-align: justify;"><b><br /></b></div><div style="text-align: justify;">This is the list that the attackers talked about and the list they attempted to crack passwords from. 
Of all the salted Sha1 password hashes, the attackers cracked 920 of them, (around 53%), during the course of an approximately two week period.</div></blockquote><div style="text-align: justify;"></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>List 2)</b></div><div style="text-align: justify;"><blockquote>Contains 2036 "one hundred and twenty eight bit" password hashes of unknown origin. Due to their length, the base password hash is most likely MD4 or MD5. It is unknown if the attackers knew of the existence of these separate hashes since they did not mention the second hashing function in their writeup or crack any of the passwords from this list. I've run some quick tests and these hashes do not appear to be a single round of MD4 or MD5, but probably instead are salted in some manner, and/or use multiple rounds of MD4/5. If a salt that was not the username was used, then it probably will be infeasible to crack very many of the passwords due to the salt not being included in the writeup. A few of the passwords might still be vulnerable though since the salt may be short enough to be brute-forced along with the guesses.</blockquote></div><div style="text-align: justify;">So the next question is, why were there two separate types of password hashes? The most likely explanation is that at some point in the past, the site administrators switched over to a new forum software, or upgraded their security settings. The old users were left with the original password hash, and users who joined up afterwards were assigned the newer password hash. The exact same thing happened when <a href="http://reusablesec.blogspot.com/2009/04/ok-some-actual-results.html">phpbb.com was hacked</a>, leaving users who hadn't logged into the site recently, exposed with only a simple MD5 hash protecting their password vs. 
the tougher phpbb3 hash.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">The interesting thing is that in this case, a user's hash probably was not changed to the new format when they logged into the site. This can be seen by comparing the user's hash type to the IP addresses I <a href="http://reusablesec.blogspot.com/2010/05/carderscc-hacked-initial-analysis-of-ip.html">talked about earlier</a>. Of the 960 username/IP address combinations, 47 of the users had their password stored as the unknown hash, and 726 had their passwords stored as a salted Sha1 hash. You might notice that these two numbers did not add up to 960. There were 187 usernames that appeared in the IP logs but did not appear in either list of password hashes. I need to look into this more, but many of them might represent incorrect login attempts. A second option is the attackers may not have released all of the user password hashes, instead choosing to keep several for themselves to aid in future hacking attacks. I should probably analyze the forum posts to confirm if the hash type correlates to when the users became active, and if any of the users with missing password hashes actually posted previously on the forum.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Why yes, this is my "brief" description... 
</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>General Observations:</b></div><div style="text-align: justify;"><b><br /></b></div><div style="text-align: justify;"><b>Observation 1)</b> The attackers are very proficient in password cracking techniques</div><div style="text-align: justify;"><blockquote></blockquote><blockquote></blockquote></div><blockquote><div style="text-align: justify;"><b>Level of Confidence:</b> High</div><div style="text-align: justify;"><b>Discussion:</b> The fact that the attackers could even target the salted Sha1 password hashes speaks to their competence. Most major password crackers, (Cain & Abel, John the Ripper, L0phtcrack, etc), do not natively support that salt/hash combination. While John the Ripper has extensive support for a generic MD5 hashing function, where it is very easy to modify it to support all sorts of weird hash/salt combinations, that functionality has not yet been ported over to support Sha1 as well. So either they developed their own password cracker/workaround, (which isn't as hard as you might think, as I'll discuss later), or they had access to a less well known password cracker that supported that hash.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>Question:</b> Does anyone know of any password cracking program that does support Sha1(username.password)?</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">In addition, since the passwords were salted with a unique salt per hash, this negates many common attacks like rainbow tables. Also, I don't know of any online password cracking site that supports this hash format, since they can't use hash lookup tables either. 
This means the attackers had to crack all of the password hashes themselves instead of relying on community resources.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Next, the fact they cracked 53% of the password hashes in a two week period speaks highly of the attackers' skills. To put this in perspective, in the phpbb.com attack, the attacker had submitted unsalted MD5 password hashes to an online cracker and only managed to crack 24% of the passwords. In that case, while the hacker might have been very skilled in webpage attacks, they were pretty much at the script kiddie level when it came to cracking the passwords. While 53% might not sound that much more impressive, there is the fact that in the carders.cc case, the hashes were salted with a unique salt. The 'unique' part is important since it means an attacker has to hash each guess independently against each target hash they are trying to crack. </div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">For example, imagine a typical cracking session taking one hour to complete. If an attacker was attempting to crack three thousand unsalted password hashes, then it would take them one hour to run all of their guesses against the entire list. On the other hand, if those password hashes used a unique salt, the same attack would then take three thousand hours, or roughly four months. That's a big difference. Assuming the attackers spent two weeks attempting to crack all 1737 salted Sha1 password hashes, that would mean the guesses they could make in their attack would be equivalent to an 11 minute cracking session against unsalted Sha1 passwords. Even assuming they cracked 50% of the passwords almost immediately, their cracking session would still be only equivalent to around 20 minutes of attacking unsalted hashes. 
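<div style="text-align: justify;"><br /></div><div style="text-align: justify;">The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the figures (one hour, three thousand hashes, two weeks, 1737 hashes) come from the paragraph above, the function name is made up for illustration, and it assumes the cracking time is split evenly across every uniquely salted hash.</div>

```python
# Back-of-the-envelope check of the salting cost described above.
# All figures come from the post; the helper name is hypothetical.

MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24


def per_hash_budget_minutes(total_minutes: float, unique_salts: int) -> float:
    """With a unique salt per hash, every guess has to be hashed once per
    target, so the available time is effectively split across the targets."""
    return total_minutes / unique_salts


two_weeks = 14 * HOURS_PER_DAY * MINUTES_PER_HOUR

# Two weeks spread over all 1737 salted hashes.
all_hashes = per_hash_budget_minutes(two_weeks, 1737)

# Same two weeks if half of the hashes fell almost immediately.
half_left = per_hash_budget_minutes(two_weeks, 1737 // 2)

# A one-hour session repeated for 3000 uniquely salted hashes, in days.
salted_days = 3000 * 1 / HOURS_PER_DAY

print(f"{all_hashes:.1f} minutes per hash with all 1737 targets")
print(f"{half_left:.1f} minutes per hash with half of the targets left")
print(f"{salted_days:.0f} days for the 3000-hash example")
```

<div style="text-align: justify;">This works out to roughly 11.6 and 23.2 minutes per hash, and 125 days for the three thousand hash example, which lines up with the rounded figures quoted above.</div><div style="text-align: justify;"><br /></div>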
In conclusion: cracking 53% of close to two thousand salted password hashes in a two week period is fairly impressive.</div></blockquote><div style="text-align: justify;"></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;"><b>Observation 2)<span class="Apple-style-span" style="font-weight: normal; "> The Attackers did not use John the Ripper's Incremental mode to brute force guesses.</span></b></div><div style="text-align: justify;"><div style="text-align: justify; "><b></b></div><blockquote><div style="text-align: justify; "><b>Level of Confidence: </b>High</div><div style="text-align: justify; "><b>Discussion:</b> When I ran JtR's incremental mode, using the default charset 'All', against the list, I almost immediately cracked a new password. Over a longer running session, I managed to crack even more.</div></blockquote><div style="text-align: justify; "></div><div style="text-align: justify; "><div style="text-align: justify; "><b>Observation 3)<span class="Apple-style-span" style="font-weight: normal; "> The Attackers brute forced digits</span></b></div><div style="text-align: justify; "><div style="text-align: justify; "><b></b></div><blockquote><div style="text-align: justify; "><b>Level of Confidence: </b><s>Low/Medium</s> Very Low</div><div style="text-align: justify; "><b>Edit March 28th 2001:</b> I've since managed to crack the password '37721010'. It appears that most of the longer digit based passwords that the attackers had cracked were publicly available in lists of previously cracked passwords. I'm leaving my original discussion post up for posterity.</div><div style="text-align: justify; "><b>Discussion:</b> I have not yet been able to crack any all-digit passwords that the attackers did not crack. 
Likewise, the attackers cracked several passwords such as '19930720', where unless the attacker has outside information, or had run across that password earlier, it is unlikely to be cracked by any form of attack besides brute force.</div></blockquote></div></div><div style="text-align: justify; "><div style="text-align: justify; "><b>Observation 4)<span class="Apple-style-span" style="font-weight: normal; "> The Attackers did not brute force alpha passwords longer than five characters</span></b></div><div style="text-align: justify; "><div style="text-align: justify; "><b></b></div><blockquote><div style="text-align: justify; "><b>Level of Confidence: </b>High</div><div style="text-align: justify; "><b>Discussion:</b> I have since managed to crack several passwords that were four lowercase characters long followed by two digits. Likewise I managed to crack several passwords that were six characters long all lowercase.</div></blockquote><div style="text-align: justify; "><b>Observation 5) </b>The attackers may have used knowledge from previous attacks to aid their password cracking sessions</div><div style="text-align: justify; "></div><blockquote><div style="text-align: justify; "><b>Level of Confidence: </b>Medium</div><div style="text-align: justify; "><b>Discussion:</b> Let's be honest, this isn't their first time at <a href="http://www.youtube.com/watch?v=S3-qx08ZJ6c&feature=fvw">the rodeo</a>. It's doubtful that carders.cc is the first blackhat site these attackers have compromised. As we've seen from the <a href="http://news.softpedia.com/news/PerlMonks-ZF0-Hack-Has-Wider-Implications-118225.shtml">ZF0 attacks</a>, just because a website supports a tech-savvy audience, doesn't mean it stores their passwords securely. And as we all know, people re-use passwords ... a lot. It's quite possible that the attackers used passwords cracked from previous sites to in turn crack some of the passwords in the carders.cc set. 
For example, one of the cracked passwords was 'Nadia2312'. An unsalted MD5 hash of that same password was cracked via the <a href="http://hashkiller.com/">Hashkiller site's project Opencrack</a>, on February 1st, 2010. While it could be a coincidence, I'd also like to point out that Hashkiller is a German based website. Other plaintext passwords in the list such as '123456r12', and '01724776692' also show up as being cracked by HashKiller and <a href="http://www.blogger.com/forum.insidepro.com">InsidePro</a> in 2009. Note, this may just imply that the attackers used custom input dictionaries from these sites, and did not actually submit the original password hashes to them.</div></blockquote><div style="text-align: justify; "></div><div style="text-align: justify; "><br /></div></div></div></div><div style="text-align: justify;"><b>Setting Up a Password Cracker to Target Salted Sha1 Hashes</b></div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">As I mentioned previously: I'm sure there's already a password cracking program out there that can target salted Sha1 hashes, but I don't know of it. Not to be deterred, I quickly modified an <a href="http://sites.google.com/site/reusablesec/Home/password-cracking-tools">old password cracking program</a> I had written a while ago to support the hash, (note I could have modified John the Ripper instead, but I'll get to that in a bit). The hardest part was I didn't realize that when you added the username as the salt, you had to lowercase the entire thing. That's a couple of hours of my life I'd like back...</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">So testing this on my Mac laptop I expected to be able to make around 1000 guesses a second against the entire set of 1737 password hashes. I based this guess on benchmarking unsalted Sha1 hashes using John the Ripper, which gave me around 7,000,000 guesses a second. 
This would give me a maximum of 4029 guesses a second against the salted hashes. Throw in the overhead required to apply the salt plus my general programming sloppiness, and 1000 guesses a second sounded like a good number. So you can imagine my surprise when I only ended up making around 80 guesses a second. Something's not right here...</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Well it turned out, the problem was I had used the OpenSSL implementation of the Sha1 hashing algorithm which is really slow compared to the version in John the Ripper. Rather than trying to jury rig it into my old password cracker, I then decided to work with John the Ripper instead, which is what I should have done in the first place.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Since I'm a bit <s>lazy</s> goal oriented, I didn't bother to modify John the Ripper's source code though. I was able to get away with this since the hashing algorithm only uses one round of Sha1. Let's look at the hashing algorithm again:</div><div style="text-align: justify;"><blockquote><b>Sha1(lowercase_username.password_guess)</b></blockquote></div><div style="text-align: justify;">The simplest way to do this would be to create a rule in JtR that would insert the username first. Such a rule would look like:</div><div style="text-align: justify;"><blockquote><b>A0"alice"</b></blockquote></div><div style="text-align: justify;">Which would insert 'alice' in front of all of your password guesses. Since we are talking about close to two thousand usernames though, having to create a rule for each one could get annoying. 
This calls for the use of one of my favorite programs, <a href="http://www.vectorsite.net/tsawk_1.html#m1">Awk</a>.</div><div style="text-align: justify;"><br /></div><div style="text-align: justify;">Awk, (and its sister Sed), is one of the best programs out there for quickly parsing and modifying files. It's also very useful for creating word mangling rules on the fly for use in a password cracking session in conjunction with JtR's stdin option. In that way, I sometimes use it much like <a href="http://reusablesec.blogspot.com/2009/10/cracking-passwords-with-middle-child.html">MiddleChild</a>, and in retrospect, I could easily have written MiddleChild in awk instead. For example I can create guesses using John the Ripper, pipe them into an awk script, and then pipe them back into John the Ripper where the guess will actually be hashed and compared against the target hashes. The above JtR rule could then be replaced with:</div><div style="text-align: justify;"><blockquote><b>./john -wordlist=passwords.lst -session=cracking1 -rules -stdout | awk '{print "alice" $0}' | ./john -stdin -format=raw-sha1 &lt;hashfile&gt;</b></blockquote></div><div style="text-align: justify;">The advantage of this approach is that 'alice' will be automatically inserted in front of all the guesses/rules that the first instance of John the Ripper is generating. But wait, don't we still have close to two thousand usernames we have to do this for? That would be one nasty command line. Luckily an Awk script can be run from a file with the -f option. And how do we create that file with all 1737 usernames? Well with Awk of course! 
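<div style="text-align: justify;"><br /></div><div style="text-align: justify;">For readers who want to see the target format spelled out, here is a minimal Python sketch of what each guess ends up being checked against once the lowercased username is stuck on the front. It is not the tooling used in this post: the function names, the example user 'Alice', and the example password are all made up, and it assumes the username and password are simply concatenated as UTF-8 bytes.</div>

```python
import hashlib


def carders_sha1(username: str, guess: str) -> str:
    """Sha1(lowercase_username.password_guess) as a hex digest.

    Assumption: plain concatenation of the two strings, encoded as UTF-8.
    """
    salted = username.lower() + guess
    return hashlib.sha1(salted.encode("utf-8")).hexdigest()


def first_match(username: str, target_hash: str, guesses):
    """Return the first guess that produces target_hash for this user, or None."""
    wanted = target_hash.lower()
    for guess in guesses:
        if carders_sha1(username, guess) == wanted:
            return guess
    return None


# Hypothetical user and password, not taken from the dataset.
target = carders_sha1("Alice", "letmein")
found = first_match("alice", target, ["password", "123456", "letmein"])
print(found)
```

<div style="text-align: justify;">Note that the username is lowercased before hashing, so 'Alice' and 'alice' give the same hash. A pure Python loop like this is far slower than John the Ripper, so treat it as a way to sanity check a handful of hashes rather than as a cracker.</div><div style="text-align: justify;"><br /></div>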
The original format that the hashes were saved to in the carders.cc writeup was:</div><div style="text-align: justify;"><blockquote><b>username:hash:plaintext:e-mail</b></blockquote></div><div style="text-align: justify;">Simply catting that file into Awk using the following command will create your Awk script for you.</div><div style="text-align: justify;"><blockquote><b>cat carders_hashes.txt | awk -F : '{print"{print \"" tolower($1) "\"$0}"}' > awk_add_usernames.txt</b></blockquote></div><div style="text-align: justify;">The -F tells it to use the ':' as a separator and the tolower() function lowercases the usernames. To run this whole thing just type:</div><div style="text-align: justify;"><div style="text-align: justify; "><blockquote><b>./john -wordlist=passwords.lst -session=cracking1 -rules -stdout | awk -f awk_add_usernames.txt | ./john -stdin -format=raw-sha1 &lt;hashfile&gt;</b></blockquote><div>Using this setup, now I'm getting around 2,300 guesses a second which is much more respectable. There certainly is room for improvement though. First of all, John compares each hashed guess against all of the target hashes, instead of only the hash which the salt belongs to. Also, you have to clean up the cracked hashes file afterwards to remove the username from the plaintext password. But did I mention it works? As Miles said in the Lost finale, "I don't believe in a lot of things, but I do believe in duct tape!"</div><div><br /></div><div>Well, that about does it for now. 
Next up, analysis of the cracked passwords.</div><div><br /></div></div></div><div style="text-align: justify;"><br /></div></div></div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com1tag:blogger.com,1999:blog-496451536493805371.post-89868632258876355412010-05-25T15:00:00.000-07:002010-05-25T15:12:42.234-07:00Carders.cc - Analysis of E-mail AddressesI just wanted to point everyone over to Cedric Pernet's blog where he did an amazing job analyzing the e-mail addresses that the carders had used. You can view his work at the following link:<div><br /></div><div><a href="http://bl0g.cedricpernet.net/post/2010/05/20/Fraudsters-e-mail-addresses">http://bl0g.cedricpernet.net/post/2010/05/20/Fraudsters-e-mail-addresses</a></div><div><br /></div><div>It shouldn't come as a surprise, but just because someone is a cybercriminal doesn't mean they are smart.</div><div><br /></div><div>Also, if you or anyone you know is doing research into this, feel free to forward me the links. I only found Cedric's blog via a reference in another post on page 8 of a Google search I did, (aka I stumbled on it by pure luck). Thanks!</div><div><br /></div>Matt Weirhttp://www.blogger.com/profile/16008062842047893999noreply@blogger.com0