When you sign up for an account on a web service you will be asked for a password. What usually happens is that this password is run through something called a “cryptographic hashing algorithm”. This is a one way process that converts your password into a fixed length “key” called a message digest or, more usually, a hash.
Here’s an example of the hash generated by the SHA-1 hashing algorithm for the password “snake”:
148627088915c721ccebb4c611b859031037e6ad
It is this, the hash, that is stored in the database. This means that your actual password is not stored anywhere, which is good. It is for this reason that most online services cannot tell you what your password is: the hashing process means that it is safe from prying eyes.
The problem is that people are generally outstandingly stupid when it comes to password selection. A disappointing percentage of people pick really, really shit passwords like “monkey”, “12345”, “password” or “password1234”. I did a couple of searches for the SHA-1 hashes of some common passwords and Uncle Google was typically turning out tens of thousands of results for each.
So if you were a hacker and you stole a database of SHA-1 password hashes you could hack a good 1 in 5 of them by simply searching against a list of pre-calculated hashes of the most common passwords. This is called a rainbow attack and there are rainbow databases consisting of hundreds of millions of hashes out there. Of course it also verifies quickly what sort of hash you’re dealing with.
The easiest way to defeat rainbow attacks is to salt the password before hashing it. This means that you prefix a magic string to the front of the user’s password after they’ve typed it in and before you hash it. Let’s say we had a salt of “CUTE_BOTTOMS”. The hash of our snake password now becomes:
It is completely different. It also means that if someone’s dictionary contains the word “snake” as a potential password, it won’t show up. Thus, salting a password is a fundamental part of basic security 101, i.e., it is standard practice.
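To make the salting step concrete, here is a minimal sketch in Python using the standard `hashlib` module. It uses the example salt and password from the text; the helper name `sha1_hex` is just an illustrative choice, not anything from a real service.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as 40 hexadecimal characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


SALT = "CUTE_BOTTOMS"  # the example salt from the text
password = "snake"

unsalted = sha1_hex(password)
salted = sha1_hex(SALT + password)  # the salt is prefixed before hashing

print("unsalted:", unsalted)
print("salted:  ", salted)
```

The two digests have nothing visibly in common, so a lookup table built from plain dictionary words will not contain the salted one.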
(Update: It is worth noting, because I didn’t make it clear in the original text, that salting everyone’s password with the same salt is only a little bit helpful. Once the salt is cracked through a brute force attack, a rainbow table can be generated specifically for it, and that table will very quickly crack the large majority of users with woefully poor passwords like these ones which were the top 25 used by Gawker users. Used correctly, every user’s salt will be different. With just one database-wide salt it’s like climbing a wall with a pile of ladders: the right tool for the job, but used laughably wrong.)
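A rough sketch of what “every user’s salt will be different” might look like in Python. The function names `hash_password` and `verify_password` are hypothetical, and SHA-256 is chosen here only because the salted SHA-2 family is what this post describes as standard practice; treat it as an illustration rather than a production recipe.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def hash_password(password: str) -> Tuple[str, str]:
    """Generate a fresh random salt and return (salt, digest) for storage."""
    salt = secrets.token_hex(16)  # 16 random bytes, hex encoded, unique per user
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return salt, digest


def verify_password(password: str, salt: str, stored_digest: str) -> bool:
    """Re-hash the attempt with the stored salt and compare the digests."""
    candidate = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


# Two users who both picked "snake" end up with different stored digests.
salt_a, digest_a = hash_password("snake")
salt_b, digest_b = hash_password("snake")
print(digest_a != digest_b)
print(verify_password("snake", salt_a, digest_a))
```

Because the salt is stored next to each digest, an attacker has to build a separate table for every single user rather than one table for the whole database.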
An hour’s work on the train
Then there is the issue of computing power, or, rather, the wonderful fact that it has increased enormously. When algorithms like SHA-1 and another antique hashing system called MD5 were created, computers were slow and the algorithms had not been mathematically analysed down to the fine details for possible weaknesses. In the time since their creation, first MD5 and then SHA-1 were revealed to have a variety of weaknesses that made them unsuitable for security purposes. MD5’s faults have been understood for nearly a decade and SHA-1 has been comprehensively broken for a few years. Furthermore, they’re damn quick to calculate. This opens the door to another way of cracking passwords: simply try every possible combination. Standard practice these days is to use salted SHA-2 hash varieties such as SHA-256, SHA-384 or SHA-512 (the numbers represent the digest size in binary digits; dividing by 4 will tell you how many hexadecimal characters the digest is represented with).
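The “divide by 4” rule is easy to check for yourself. This small Python sketch asks `hashlib` for the digest size of each algorithm mentioned above and compares it with the length of the hexadecimal string; the helper name `digest_info` is made up for the example.

```python
import hashlib
from typing import Tuple


def digest_info(name: str) -> Tuple[int, int]:
    """Return (digest size in bits, number of hex characters) for an algorithm."""
    h = hashlib.new(name, b"snake")
    bits = h.digest_size * 8           # digest_size is reported in bytes
    hex_chars = len(h.hexdigest())     # each hex character encodes 4 bits
    return bits, hex_chars


for name in ("md5", "sha1", "sha256", "sha384", "sha512"):
    bits, hex_chars = digest_info(name)
    print(f"{name:7} {bits:4} bits -> {hex_chars:3} hex characters")
```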
So whilst rattling through the countryside yesterday on a smooth, virtually silent, luxurious state-of-the-art British train[1] I knocked out a quick program to exhaustively try every possible combination of password, hash it, and then compare it to a known hash and keep on trucking until it got a match. In a nutshell, my ancient three and a half year old MacBook Pro can crack pretty much any un-salted SHA-1 hashed six character password in the time it takes to get from Kings Cross to Hitchin, assuming none of the usual things go wrong with the journey. This is less than half an hour. It is worth noting that this software took less than an hour to write: it is utterly unoptimised. If I optimised it, I could easily double its performance. If I parallel processed it onto all available cores, then my dual core would double it again: instead of trying 1,400,000 combinations per second, that would mean I would be able to try around 5.6 million combinations per second. On a high-performance, quad-core modern computer with decent L1 caches we’re looking at an order of magnitude at least on top of that. The bottom line is that if your password is 8 or fewer characters long, I can probably crack it in a few hours given the hash. So is it? Is there something you need to be doing right now?
$ ./CrackMyHash --hash-type sha1 --character-set lcalpha --crack-hash 148627088915c721ccebb4c611b859031037e6ad
CrackMyHash 1.0 by Toby Simpson
(C) Copyright 2012 Toby Simpson, All Rights Reserved
For 'support', contact cobrascobras.com
Searching for [148627088915c721ccebb4c611b859031037e6ad], hashed using sha1
Using lower-case alpha character set (27 characters).
~1,318,965 attempts per second. 76,500,000 attempts so far, length is 5 characters
Cracked!
Hash: [148627088915c721ccebb4c611b859031037e6ad]
Hash type: sha1
Security grade: Pretty damn poor
Attempts: 76,508,018
Password: snake
Time taken: 58 seconds
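For the curious, the exhaustive approach boils down to something like the Python sketch below. This is not the CrackMyHash source, just an illustration of the same idea: the function name `crack_sha1`, its parameters and the 26-letter lower-case character set are all assumptions made for the example, and pure Python will be far slower than a compiled program.

```python
import hashlib
import itertools
import string


def crack_sha1(target_hex, charset=string.ascii_lowercase, max_length=5):
    """Try every combination up to max_length; return (password, attempts)."""
    attempts = 0
    for length in range(1, max_length + 1):
        # itertools.product yields every string of this length over the charset
        for combo in itertools.product(charset, repeat=length):
            attempts += 1
            candidate = "".join(combo)
            digest = hashlib.sha1(candidate.encode("ascii")).hexdigest()
            if digest == target_hex:
                return candidate, attempts
    return None, attempts


# A deliberately short password so the demonstration finishes instantly.
target = hashlib.sha1(b"cab").hexdigest()
found, attempts = crack_sha1(target, max_length=3)
print(found, attempts)
```

Swapping in a bigger character set or a longer `max_length` is a one-line change, which is exactly why short passwords are such a liability.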
A bad news day
SHA-1 and MD5 are not recommended for use on any system and have not been for some time. It came as some surprise, therefore, to discover Linked In had used SHA-1. The real kicker, though, was that the silly noodles had not salted them. This is easy to verify: you simply search the stolen database for known SHA-1 hashes for words like “password”, “linked in”, “password1234”, “linkedin1234”, “qwertyuiop” and see if you get matches. If so, you’re in business. Rainbow the database to catch the common ones and then either employ the use of 10,000 computers you already own through malware or your mates to crack the others exhaustively. Remember: I wrote my little program in an hour, it’s crap, but it can still try over 1.3 million combinations per second. The software here by Benjamin Vernoux can try 200 million MD5 hashes per second by exploiting your graphics processor alongside the CPU with some nifty optimised code.
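Checking whether a dump is unsalted really is that simple. Here is a hedged Python sketch: the “leaked” list below is invented purely for the demonstration (one entry is salted with the example salt from earlier so that it does not match), and `find_unsalted_matches` is a hypothetical helper name.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as hexadecimal."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def find_unsalted_matches(leaked_hashes, candidates):
    """Return {word: hash} for each candidate whose plain SHA-1 is in the dump."""
    leaked = {h.lower() for h in leaked_hashes}
    return {w: sha1_hex(w) for w in candidates if sha1_hex(w) in leaked}


# Made-up stand-in for a stolen table of hashes.
leaked_hashes = [
    sha1_hex("password"),
    sha1_hex("linkedin1234"),
    sha1_hex("CUTE_BOTTOMS" + "qwertyuiop"),  # salted, so it will not match
]

common = ["password", "linked in", "password1234", "linkedin1234", "qwertyuiop"]
matches = find_unsalted_matches(leaked_hashes, common)
print(sorted(matches))
```

Any hit at all tells you two things at once: the hash type is SHA-1 and nobody bothered with a salt.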
Let’s summarise what Linked In did wrong:
- Their security was lax enough to allow them to be broken into in the first place. Naughty, naughty. However, in their defence, the quantity and complexity of the software needed to run a service of Linked In’s magnitude is such that guaranteed security is simply impossible. Regardless of risk mitigation, occasionally, this will happen. However…
- They used an outdated algorithm for hashing their passwords. SHA-1 is vulnerable to various attacks and can be calculated fast enough to make brute force a valid attack technique
- They did not salt the passwords. Given that salting defeats a lot of user stupidity in password choice as well as knocking out rainbow attacks, this is a weird omission on their part
Others are not immune. Several other organisations managed to slide their bad news out at the same time to sneak under the clusterfuck that was Linked In’s piss-poor security practices. eHarmony, for example, and Last.FM (who, in a staggering celebration of the past, used MD5 hashes which are only marginally more secure than an open door) both announced that their user databases had been “borrowed” by hackers. The chances are reasonably high that whatever vulnerability was used to blow the doors down to Linked In also applies to other web sites using the same technology.
Then there are the O2s of this world. They store your password in plain text. And, if you forget it, they’ll simply email it to you. Email is so staggeringly insecure that it beggars belief. If you knew how many open machines your emails passed through on the way from A to B you’d probably be a lot more careful about what you said. That kind of fuckwittery is unforgivable, so expect to see their users trashed at some point in the near future.
Generally, people are lazy with passwords. They pick one or two short passwords and use them everywhere. By short, I mean anything shorter than 10 characters. Once the hackers have your password, they probably have the keys to your email, your Facebook account, your Twitter account and goodness knows what else. It is trivial to scam your friends or rob your PayPal or bank account after that. Because Linked In (and many others) use your e-mail address as your username, hackers don’t even need to get within telescope distance of Columbo to pillage their way across many people’s data and any assets that are exposed online.
What can you do to avoid this being a disaster for you in future? Should you cancel all your accounts with everyone, shut off the Internet and live in a tent in the middle of a field somewhere? Maybe, if that’s your thing, but it’s better to simply have realistic expectations of how good companies are at securing your personal data and follow some simple tips:
- Use different passwords for all your major services
- Your email password should never be used for anything else. Ever
- Passwords should be either: at least ten characters long and include numbers and preferably symbols or a long passphrase that is meaningless to anyone but you. My password program can work a lot faster if I just say “hack alpha-numerical passwords” rather than “try every single character”
- Make any security question answers utter bollocks. It is trivial to find your mother’s maiden name, your birthday, your first pet, etc.
- Help yourself: don’t pick a password like r4%_gKU745@&, it’s just silly. Pick a phrase that means something to you and decorate it with numbers. “8 colonies of 400 e-colis on my desk” is a good example (as the caption to this XKCD cartoon says: “through 20 years of effort, we’ve successfully trained everyone to use passwords that are hard for humans to remember but easy for computers to guess”)
- Use the HTTPS (encrypted) gateway to your favourite services. Many sites support HTTPS but don’t enable it as default. Never log into Facebook anywhere other than https://www.facebook.com. Same for Twitter, https://www.twitter.com. HTTPS connections are encrypted
The bottom line is simply: “Don’t be a low hanging fruit”. If your password on Linked In was “Fisher folk are men who wear chunky jumpers” then a brute force hack would take more time than there is left in the universe for current hardware: hacking such a passphrase would require exploitation of faults in the SHA-1 algorithm. It’s also worth noting that this example doesn’t contain numbers, punctuation or anything other than simple alphabetic characters and spaces. It’s still probably more secure than your password, eh? Remember: 26 letters in the alphabet x 2 = 52. +1 for the space = 53. 53 to the power of 43 (for the length of phrase) is a number much bigger than your shitty Casio scientific calculator can display.
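If your calculator gives up, Python won’t. The sketch below counts the characters in the passphrase directly and works out the size of the search space, then estimates how long an exhaustive search would take. The rate of 200 million attempts per second is borrowed from the MD5 figure quoted earlier as a rough stand-in, which is an assumption; real SHA-1 rates will differ.

```python
phrase = "Fisher folk are men who wear chunky jumpers"

alphabet_size = 26 * 2 + 1          # upper case + lower case + the space
length = len(phrase)                # counted rather than assumed
keyspace = alphabet_size ** length  # Python integers never overflow

rate_per_second = 200_000_000       # assumed rate, taken from the MD5 figure
seconds_per_year = 60 * 60 * 24 * 365
years = keyspace // (rate_per_second * seconds_per_year)

print("length:  ", length)
print("keyspace:", f"{keyspace:.3e}")
print("years:   ", f"{years:.3e}")
```

Even if the assumed rate is wrong by a factor of a million, the answer is still an absurd number of years.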
Performance analysis
Out of curiosity, I ran my program on three computers: my ancient MacBook Pro, my friend’s nearly new MacBook Pro and his Raspberry Pi. Here are the rates at which we were able to attack a hash on all three platforms by hash type:
We couldn’t figure out why the Pi version was so slow. It was probably a poorly configured compiler with no optimisation. It should really be faster than that, even with the lower specced hardware.
It’s worth a brief note about hacking rates. I touched on this above, but each and every character and character class you introduce into your password makes life a lot, lot harder for hackers. The below table shows how many attempts are required to crack 6, 8 or 10 character passwords depending on the variety of content making them up:
| Mixture | 6 character | 8 character | 10 character |
|---|---|---|---|
| Lower-case alphabetical (26) | 308,915,776 | 208,827,064,576 | 141,167,095,653,376 |
| Mixed-case alphabetical (52) | 19,770,609,664 | 53,459,728,531,456 | 144,555,105,949,057,024 |
| Mixed-case with numbers (62) | 56,800,235,584 | 218,340,105,584,896 | 839,299,365,868,340,224 |
| Alphanumeric with punctuation (> 96) | 782,757,789,696 | 7,213,895,789,838,336 | 66,483,263,599,150,104,576 |
Make sure you take a long, long, hard look at the table above. The difference between brute forcing a 6 character and an 8 character lower-case only password is the difference between 308 million and 208 billion attempts. Either way, they are small numbers in computing terms. However, that ten character alphanumeric with punctuation? That’s over sixty-six quintillion combinations. Make that twenty characters long and… well, you get the idea.
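The numbers in the table are nothing more than character set size raised to the power of the password length. A short Python sketch reproduces them; Python integers are arbitrary precision, so the large values come out exact. The labels and the helper name `combinations` are illustrative, and 96 is used as the size of the largest character set.

```python
def combinations(charset_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return charset_size ** length


charsets = {
    "Lower-case alphabetical": 26,
    "Mixed-case alphabetical": 52,
    "Mixed-case with numbers": 62,
    "Alphanumeric with punctuation": 96,
}

for label, size in charsets.items():
    cells = [f"{combinations(size, n):,}" for n in (6, 8, 10)]
    print(f"{label} ({size}): " + " | ".join(cells))
```

Divide any of those figures by your attack rate in attempts per second and you have the worst-case cracking time in seconds.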
I made my program in three versions: Windows command line, OSX command line and Raspberry Pi Debian. If you fancy a copy due to morbid curiosity or whatever, drop me a line or leave a comment on this post along with your e-mail address and a good reason why you’re prepared to trust a perfect stranger’s code running on your machine and I’ll drop you an executable. The Pi one is a version behind (I’ve grooved up the Windows and OSX ones but my Pi hasn’t turned up yet[2]).
Obviously, if I send you my software, you use it at your own risk. To massage a quote from Frasier that is eminently suitable here, “at Cornell University they have an incredible piece of scientific equipment known as the Tunnelling Electron Microscope. Now, this microscope is so powerful that by firing electrons you can actually see images of the atom, the infinitesimally minute building blocks of our universe. If I were using that microscope right now, I still wouldn’t be able to locate the warranty that comes with this software.” If it works for you, that’s great! If it doesn’t, I’ll give you a full refund. I will, though, probably fix bugs that involve the software rather than the e-coli, salmonella and helicobacter pylori that seem to be infecting the Raspberry Pi.
Yey for germs!
[1] And they say the art of sarcasm is lost
[2] And when Mr Pi does turn up, I shall begin the adventure game assisted C++ tutorial. The first, code-free, design and discussion post is coming up soon!
PS: if you do get this code and try this and get a much better crack rate than me on SHA-1s, I’d be curious to know the results — please let me know!