Before you can understand what salting is, you need to understand what hashing is. You can read my explanation of what hashing is here.
Salting is a very simple way to prevent what are called Dictionary Attacks.
The “dictionaries” in dictionary attacks are just lists of character strings and their corresponding hashes. For example, here is a dictionary of (terrible) passwords along with their (real) SHA-256 hashes:
| Password | SHA-256 hash |
| --- | --- |
| Password0 | fa669f95dc83ccd9400fc939a68666720033d5859860f76edcd892e95afb9cc7 |
| Password1 | 19513fdc9da4fb72a4a05eb66917548d3c90ff94d5419e1f2363eea89dfee1dd |
| Password2 | 1be0222750aaf3889ab95b5d593ba12e4ff1046474702d6b4779f4b527305b23 |
| Password3 | 2538f153f36161c45c3c90afaa3f9ccc5b0fa5554c7c582efe67193abb2d5202 |
| Password4 | db514f5b3285acaa1ad28290f5fefc38f2761a1f297b1d24f8129dd64638825d |
| Password5 | 8180d5783fea9f86479e748f6d2d1196c4a8e143643119398c16367d2c3d50f2 |
| ... | ... |
With this dictionary, if somebody ever hacks into, for example, Gmail’s password database, and all of the passwords in the database are hashed, the hackers can quickly and easily find all of the Gmail accounts that use Password0 as their password, Password1 as their password, Password2, and so on. They wouldn’t need to do any guessing and hashing, which would take a really long time; all they’d have to do is look in the dictionary and see whether an account’s password hash matches a hash there. If there’s a match, they use the corresponding password from the dictionary to log in to the account.
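The lookup described above can be sketched in a few lines of Python. This is only an illustration: the account names (alice, bob, carol) and the helper sha256_hex are made up for the example, and the "breached" table is built on the spot rather than taken from any real database.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker precomputes hash -> password once, long before any breach.
candidate_passwords = [f"Password{i}" for i in range(6)]
dictionary = {sha256_hex(p): p for p in candidate_passwords}

# Hypothetical breached table of unsalted hashes (username -> hash).
breached = {
    "alice": sha256_hex("Password1"),
    "bob": sha256_hex("Password2"),
    "carol": sha256_hex("a much stronger passphrase"),
}

# Cracking is now one dictionary lookup per account, with no guessing.
cracked = {user: dictionary[h] for user, h in breached.items() if h in dictionary}
print(cracked)  # {'alice': 'Password1', 'bob': 'Password2'}
```

Only carol survives here, and only because her password is not in the attacker's list.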
This kind of attack makes database breaches deadly. Guessing a strong password from nothing but its hash can take even a supercomputer months, if not years. But with a Dictionary Attack, the hackers have already done the months and years of guessing; they just have to look the hash up in the dictionary.
Salting, however, makes Dictionary Attacks useless. Salting is just this: adding a long and random string of characters to the password before hashing it, and storing the string of characters (the salt) along with the hash in the database. That’s it.
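Here is a minimal sketch of that idea in Python, assuming the salt is appended to the password and the result is hashed with SHA-256, as in the rest of this article. The function names hash_with_salt and make_record, and the 16-byte salt length, are choices made for this example only.

```python
import hashlib
import secrets


def hash_with_salt(password: str, salt: str) -> str:
    """Hash the password with the salt appended, as the article describes."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


def make_record(password: str) -> dict:
    """Build what gets stored in the database: the salt and the salted hash."""
    salt = secrets.token_urlsafe(16)  # text encoding of 16 random bytes
    return {"salt": salt, "hash": hash_with_salt(password, salt)}


# Two users with the same password end up with different salts and hashes.
record_a = make_record("Password1")
record_b = make_record("Password1")
print(record_a)
print(record_b)
```

Because every record gets its own random salt, the printed values differ on every run, even though the password never changes.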
So, for example, imagine a password database with three users: user1, user2, and user3. They all picked really simple passwords too: user1 uses Password1, user2 uses Password2, and user3 uses Password3. A password database that stores only hashes would look like this:
| User | Password Hash |
| --- | --- |
| user1 | 19513fdc9da4fb72a4a05eb66917548d3c90ff94d5419e1f2363eea89dfee1dd |
| user2 | 1be0222750aaf3889ab95b5d593ba12e4ff1046474702d6b4779f4b527305b23 |
| user3 | 2538f153f36161c45c3c90afaa3f9ccc5b0fa5554c7c582efe67193abb2d5202 |
If this database ever gets breached, all three of these users’ password hashes can easily be found in just about every dictionary there is, because their passwords are so simple and common.
If, however, the password database had salted the passwords first, it would look like this:
| User | Salt | Password Hash |
| --- | --- | --- |
| user1 | fd6/45t6+ERg4eSH | b64c31e34a8ba0017dd05c6c6f5c0b275c0ec5f4f3b3bae5320beba711e5bb78 |
| user2 | R’htr54}ktfg94_d | daf121cedd603166018b97c5246e13835121a072312ea889f1acc3eaea23191d |
| user3 | 45gfdjd.hHF4h#fg | 100eae238385f23902afab56b1d5824b162ad2e08c178f9d812312bf0ad06e61 |
Same passwords, completely different hashes! Dictionary Attacks are useless here because even though the hackers’ dictionaries certainly have the hash for Password1, they certainly don’t have the hash for Password1fd6/45t6+ERg4eSH.
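To see why the lookup fails, here is a small Python sketch that checks both versions of user1's hash against a precomputed dictionary. It assumes the salt is simply appended to the password before hashing with SHA-256; the helper name sha256_hex is made up for the example.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of text as a hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker's precomputed dictionary: hash -> password.
dictionary = {sha256_hex(f"Password{i}"): f"Password{i}" for i in range(6)}

salt = "fd6/45t6+ERg4eSH"  # user1's salt from the table above
unsalted_hash = sha256_hex("Password1")
salted_hash = sha256_hex("Password1" + salt)

print(unsalted_hash in dictionary)  # True: the plain hash is in the dictionary
print(salted_hash in dictionary)    # False: nobody precomputed this one
```

To get anywhere, the attacker would have to rebuild the whole dictionary for this one salt, and then again for every other user's salt.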
ʕ·ᴥ·ʔ: Wow, that easy, huh? But, uh, how does user1 log in then? His password ain’t Password1fd6/45t6+ERg4eSH, right? It’s still Password1, isn’t it?
Oh, yes, it’s still Password1. He logs in the exact same way that he normally does: typing user1 into the username field and Password1 into the password field. The only difference now is that the server that logs him in has to look at the salt stored in the password database, add it to his input from the password field, and hash it to see if the calculated hash matches the stored hash in the database. Super simple, but super effective!
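The login check can be sketched like this in Python. Everything here is hypothetical scaffolding for the example: password_db is an in-memory dict standing in for the real database, and register, login, and hash_with_salt are made-up names. It keeps the article's SHA-256 and appends the salt to the password.

```python
import hashlib
import hmac
import secrets


def hash_with_salt(password: str, salt: str) -> str:
    """Hash the password with the salt appended."""
    return hashlib.sha256((password + salt).encode("utf-8")).hexdigest()


# Stand-in for the password database: username -> {"salt", "hash"}.
password_db = {}


def register(username: str, password: str) -> None:
    salt = secrets.token_urlsafe(16)  # fresh random salt for each user
    password_db[username] = {"salt": salt, "hash": hash_with_salt(password, salt)}


def login(username: str, password: str) -> bool:
    record = password_db.get(username)
    if record is None:
        return False
    # Re-create the hash from the typed password and the stored salt.
    candidate = hash_with_salt(password, record["salt"])
    # compare_digest compares in constant time, which avoids timing leaks.
    return hmac.compare_digest(candidate, record["hash"])


register("user1", "Password1")
print(login("user1", "Password1"))  # True
print(login("user1", "Password2"))  # False
```

Notice that user1 never sees or types the salt; only the server handles it.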