Salting

Before you can understand salting, you need to understand hashing. You can read my explanation of hashing here.

Salting is a very simple way to prevent what are called Dictionary Attacks.

The “dictionaries” in dictionary attacks are just lists of character strings and their corresponding hashes. For example, here is a dictionary of (terrible) passwords along with their (real) SHA-256 hashes:

Password0     fa669f95dc83ccd9400fc939a68666720033d5859860f76edcd892e95afb9cc7
Password1     19513fdc9da4fb72a4a05eb66917548d3c90ff94d5419e1f2363eea89dfee1dd
Password2     1be0222750aaf3889ab95b5d593ba12e4ff1046474702d6b4779f4b527305b23
Password3     2538f153f36161c45c3c90afaa3f9ccc5b0fa5554c7c582efe67193abb2d5202
Password4     db514f5b3285acaa1ad28290f5fefc38f2761a1f297b1d24f8129dd64638825d
Password5     8180d5783fea9f86479e748f6d2d1196c4a8e143643119398c16367d2c3d50f2
...           ...

With this dictionary, if somebody ever hacks into, for example, Gmail’s password database, and all of the passwords in the database are stored as plain hashes, the hackers can quickly and easily find all of the Gmail accounts that use Password0 as their password, Password1, Password2, and so on. They wouldn’t need to do any guessing and hashing, which would take a really long time; all they’d have to do is look in the dictionary and see if an account’s password hash matches a hash listed there. If there’s a match, they use the corresponding password from the dictionary to log in to the account.
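Here is a rough sketch of what that lookup could look like in Python. The leaked table, the usernames, and the helper names (sha256_hex, crack) are all made up for illustration; only hashlib.sha256 is a real library call.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker builds this once, ahead of time: hash -> password.
candidates = [f"Password{i}" for i in range(6)]
dictionary = {sha256_hex(p): p for p in candidates}

# A hypothetical leaked database of unsalted hashes.
leaked = {
    "user1": sha256_hex("Password1"),
    "user2": sha256_hex("correct horse battery staple"),
}


def crack(leaked_hashes: dict, dictionary: dict) -> dict:
    """Look every leaked hash up in the dictionary; no guessing involved."""
    return {
        user: dictionary[h]
        for user, h in leaked_hashes.items()
        if h in dictionary
    }


cracked = crack(leaked, dictionary)
print(cracked)  # {'user1': 'Password1'}
```

Notice that the expensive part (hashing every candidate) happens once when the dictionary is built. After that, checking a stolen hash is a single lookup, and only passwords that are actually on the list get found.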

This kind of attack makes database breaches deadly. Guessing a password from nothing but its hash means hashing candidate after candidate until one matches, and for a strong password that can take months, if not years, even on very fast hardware. But with a Dictionary Attack, the guessing and hashing for every password on the list has already been done ahead of time; the hackers just have to look in the dictionary.

Salting, however, makes these precomputed dictionaries useless. Salting is just this: adding a long and random string of characters (a different one for each user) to the password before hashing it, and storing that string of characters (the salt) along with the hash in the database. That’s it.
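A minimal sketch of that idea, assuming the salt is simply appended to the password and the result is hashed with SHA-256 (used here only to match the hashes in this post). The function names and the record layout are my own invention, not any particular library’s API.

```python
import hashlib
import secrets


def hash_with_salt(password: str, salt: str) -> str:
    """Hash the password with the salt appended (password first, then salt)."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def make_record(password: str) -> dict:
    """Create what would be stored for one user: the salt and the salted hash."""
    # 16 random bytes from the OS, encoded as URL-safe text.
    salt = secrets.token_urlsafe(16)
    return {"salt": salt, "hash": hash_with_salt(password, salt)}


# Two users with the same password end up with different salts and hashes.
record_a = make_record("Password1")
record_b = make_record("Password1")
print(record_a["salt"] != record_b["salt"])
print(record_a["hash"] != record_b["hash"])
```

The salt is not a secret. It sits right next to the hash in the database; its only job is to make each stored hash different from anything worked out in advance.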

So, for example, imagine a password database with three users: user1, user2, and user3. They all picked really simple passwords too: user1 uses Password1, user2 uses Password2, and user3 uses Password3. A password database that stores only hashes would look like this:

User         Password Hash
user1        19513fdc9da4fb72a4a05eb66917548d3c90ff94d5419e1f2363eea89dfee1dd
user2        1be0222750aaf3889ab95b5d593ba12e4ff1046474702d6b4779f4b527305b23
user3        2538f153f36161c45c3c90afaa3f9ccc5b0fa5554c7c582efe67193abb2d5202

If this database ever gets breached, all three of these users’ password hashes can easily be found in pretty much every dictionary there is, because their passwords are so simple and common.

If, however, the password database had salted the passwords first, it would look like this:

User   Salt               Password Hash
user1  fd6/45t6+ERg4eSH   b64c31e34a8ba0017dd05c6c6f5c0b275c0ec5f4f3b3bae5320beba711e5bb78
user2  R’htr54}ktfg94_d   daf121cedd603166018b97c5246e13835121a072312ea889f1acc3eaea23191d
user3  45gfdjd.hHF4h#fg   100eae238385f23902afab56b1d5824b162ad2e08c178f9d812312bf0ad06e61

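Same passwords, completely different hashes! Dictionary Attacks are useless here because even though the dictionaries certainly have the hash for Password1, they almost certainly don’t have the hash for Password1fd6/45t6+ERg4eSH. To get anywhere, the hackers would have to build a brand new dictionary for every single salt, which throws away the whole advantage of doing the work ahead of time.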
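To see this in action, here is a small sketch that reuses user1’s salt from the table above and assumes the same password-then-salt ordering. The sha256_hex helper and the six-entry dictionary are illustrative only.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a string as 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# A precomputed dictionary of unsalted hashes: hash -> password.
dictionary = {sha256_hex(f"Password{i}"): f"Password{i}" for i in range(6)}

salt = "fd6/45t6+ERg4eSH"  # user1's salt, copied from the table above

unsalted_hash = sha256_hex("Password1")
salted_hash = sha256_hex("Password1" + salt)

print(unsalted_hash in dictionary)  # True: the plain hash is in the dictionary
print(salted_hash in dictionary)    # False: the salted hash is not
```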

ʕ·ᴥ·ʔ: Wow, that easy, huh? But, uh, how does user1 log in then? His password ain’t Password1fd6/45t6+ERg4eSH, right? It’s still Password1, isn’t it?

Oh, yes, it’s still Password1. He logs in the exact same way that he normally does: typing user1 into the username field and Password1 into the password field. The only difference now is that the server that logs him in has to look up the salt stored for user1 in the password database, append it to whatever he typed into the password field, hash the result, and check whether the calculated hash matches the stored hash. Super simple, but super effective!
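The login check described above might look something like this. It is only a sketch: the users dict is a stand-in for a real password database, and register and login are hypothetical names. The comparison uses hmac.compare_digest from the standard library, which compares the two hashes in a way that avoids leaking timing information.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory stand-in for the password database.
users = {}


def hash_with_salt(password: str, salt: str) -> str:
    """Hash the password with the salt appended (password first, then salt)."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    """Store a fresh random salt and the salted hash, never the password."""
    salt = secrets.token_urlsafe(16)
    users[username] = {"salt": salt, "hash": hash_with_salt(password, salt)}


def login(username: str, password: str) -> bool:
    """Re-hash the typed password with the stored salt and compare."""
    record = users.get(username)
    if record is None:
        return False
    candidate = hash_with_salt(password, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])


register("user1", "Password1")
print(login("user1", "Password1"))  # True
print(login("user1", "Password2"))  # False
```

The user never sees or types the salt. The server does all of the salting behind the scenes, both when the account is created and every time somebody tries to log in.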