How to write a brute-force password cracker
If you have access to a password's hash and salt, if applicable, a brute force attempt to crack it is a guaranteed method if you have the CPU cycles available to compute it in a reasonable amount of time. Unfortunately, guessing every single permutation of characters takes exponentially longer with the length of the password used so often times you are facing months if not years of processing time on a single PC. While there are methods to distribute the work across multiple PCs, I am only going to go over a simple brute force algorithm and will hold off on the distributed processing model for a later article.
Before you can do anything, you must know what algorithm was used to hash the password. For many products, this is know and you can find it either by searching the web or by looking through the code the product uses if it's open-source. In my example, the algorithm is md5(concat(md5(password), salt)). What this means is that a hexadecimal md5 digest was created for the password, then a random salt was concatenated onto it to form a new string, which was then md5 digested again and the resulting hex digest and salt are stored to be checked against later. This is not necessarily the most secure method in the world, but for 10+ character passwords it takes months or years of CPU time on a decent PC to guess.
Test your hashing implementation before running
Within the product, put in a test password then extract the hash and salt. Run your code with your password in and make sure it generates a matching hash and salt. This will be your evaluation method and it is crucial that you have no bugs in it, as you will skip right over the correct password and potentially might end up at a false positive.
Write your password generation method and tie it all together
Obviously a dictionary-based approach would be faster to execute, but we're not talking about that today. You will want to generate every permutation of the test character set, generate a hash of each one and test it against the target hash, then print out a positive result. I used a recursive method that generates all permutations for a given width of password. I then call that from a loop that starts at a width of 1 and works up to a maximum of 13. Again, my PC would never be able to finish all 13 character perms in time but that's a different issue.
The code
Here is my code written in python, which should be easily portable to any other language.
import md5
import sys
easyRange = [32,33,36,42,43] + range(48, 58) + range(65, 91) + range(97, 123)
allRange = range(32,127)
hash = "<16 hex pairs here>"
salt = "<salt>"
def checkPassword(password):
m = md5.new(password)
m = md5.new(m.hexdigest() + salt)
if (m.hexdigest() == hash):
print "match [" + password + "]"
sys.exit()
def recurse(width, position, baseString):
#current position
for char in easyRange:
if (position < width - 1):
recurse(width, position + 1, baseString + "%c" % char)
checkPassword(baseString + "%c" % char)
print "Target Hash [" + hash + "] Salt [" + salt + "]"
maxChars = 13
for baseWidth in range(1, maxChars + 1):
print "checking passwords width [" + `baseWidth` + "]"
recurse(baseWidth, 0, "")
import sys
easyRange = [32,33,36,42,43] + range(48, 58) + range(65, 91) + range(97, 123)
allRange = range(32,127)
hash = "<16 hex pairs here>"
salt = "<salt>"
def checkPassword(password):
m = md5.new(password)
m = md5.new(m.hexdigest() + salt)
if (m.hexdigest() == hash):
print "match [" + password + "]"
sys.exit()
def recurse(width, position, baseString):
#current position
for char in easyRange:
if (position < width - 1):
recurse(width, position + 1, baseString + "%c" % char)
checkPassword(baseString + "%c" % char)
print "Target Hash [" + hash + "] Salt [" + salt + "]"
maxChars = 13
for baseWidth in range(1, maxChars + 1):
print "checking passwords width [" + `baseWidth` + "]"
recurse(baseWidth, 0, "")
I have 2 ranges of characters defined. easyRange, which is most common alphanumerics with a few special characters, then allRange, which is every acceptable character. If you want a more thorough set of characters, use allRange in the recurse method.
Changes to the algorithm need to be done in checkPassword. Just call that with the target hash and salt and the known password to confirm that your algorithm works correctly.
Multi-threading
This example is single-threaded so it can't take advantage of hyper threading, multiple CPUs or multiple CPU cores. My recommended threading strategy would be to divide the work into roughly 5 minute chunks (maybe 100m possible numbers) which would be achievable using low memory by starting the threading only at a width of 5 by assigning the lower 4 combinations to a worker thread and then using an even number of threads-to-cores to maximize CPU usage. Each time a thread joins, get the next available block of work and begin processing again. This simple approach would only require a manager which watches the placeholder of work, which is sequential. For example, if the current password guessing iteration were 7-wide, it would hand ABCxxxx off to a thread1, then ABDxxxx to thread2, then ABExxxx to thread3, etc.
Distributed Computing
While the threading approach is simple, a distributed computing approach would require much more architecture. I recommend writing a simple distribution server which hands units of work off to clients, keeps track of what's a work-in-progress and what's complete. A client could interface via a simple SOAP or other XML-based interface (the overhead here is negligible of the units of work are large and # of requests are low). The client would simply request a unit of work, process it then return a found or not-found result, which, depending on the algorithm, may or may not end the whole search if the algorithm can have collisions (md5 does.) A simple implementation of the Leader/Follower architectural pattern would make it a snap to scale up handling the remote requests from a number of clients.
Academically, you could have a lot of fun with this just testing out distributed computing approaches and more efficient algorithms. Practically speaking, though, if the algorithm used is even SHA-1 or anything more complex than MD5 and you don't have thousands of CPUs to process the possibilities, you simply won't be able to crack the password in a reasonable amount of time.
Disclaimer
This is purely educational. This concept is common-knowledge and provided here for a reference only. I am not liable for any misuse, even if it does take years of processing for a single password.
Related Articles
How to start writing software: web applications
How I became a professional software developer



Thanks for this. I spent an entire day trying to figure out this algorithm. I have limited experience with recursion, and soon found out the limitation of nested-loops.
For anyone interested I created a PHP version.