Secure Credential Storage
Pop security quiz: What’s the most secure way to store a secret?
a) Encrypt it with a strong symmetric cryptographic algorithm such as AES, using a 256-bit key.
b) Encrypt it with a strong asymmetric cryptographic algorithm such as RSA, using a 4096-bit key.
c) Encrypt it using a cryptographic system built into your platform, like the Data Protection API (DPAPI) for Windows.
Have you made your choice? The correct answer is actually:
d) Don’t store the secret at all!
Ok, it was a trick question. But the answer is valid: thieves can’t steal what you don’t store. Let’s apply this principle to the action of authentication – that is, logging into a web site. If a site never stores its users’ passwords, then even if the site is breached, those passwords can’t be stolen. But how can a site authenticate users without storing their passwords? The answer is for the site to store (and subsequently compare) cryptographic hashes of the passwords instead of the plaintext passwords themselves. (If you’re unfamiliar with the concept of hashes, we recommend reading http://msdn.microsoft.com/en-us/library/92f9ye3s.aspx#hash_values before continuing.) By comparing hashes rather than plaintext, the site can still validate that the user does indeed know his or her password – otherwise, the hashes wouldn’t match – but it has no need to ever actually store that password. It’s an elegant solution, but there are a few design considerations you’ll need to implement to ensure you don’t inadvertently weaken the strength of the system.
The first design issue is that simply hashing the passwords alone isn’t enough protection: you also need to add a random salt to each password before you compute its hash value. Remember that for a given hash function, an input value will always hash to the same output value. With enough time, an attacker could compute a table of plaintext strings and their corresponding hash values. In fact, many of these tables (known as “rainbow tables”) already exist and are freely downloadable on the Internet. Armed with a rainbow table, if an attacker could manage to gain access to the list of password hashes on the web site by any means, he could use that table to easily determine the original plaintext passwords. When you salt hashes, you take this weapon out of the attackers’ hands. It’s also important to generate (and store) a unique salt for every user – don’t just use the same salt for everyone. If you did always use the same salt, an attacker could build a new rainbow table using that single salt value, and eventually extract out the passwords.
The next important design issue to take is to be sure to use a strong cryptographic hash algorithm. MD5 may be a popular choice, but cryptographers have demonstrated weaknesses in it and it’s been considered an unsafe, “broken” algorithm for years. SHA-1 is stronger, but is beginning to show cracks and now cryptographers recommend avoiding SHA-1 as well. The SHA-2 family of hash algorithms is currently considered the strongest, and is the only family of hash algorithms approved for use in Microsoft products per the Microsoft Security Development Lifecycle (SDL) cryptographic standards policy.
Instead of hardcoding your application to use SHA-2, an even better approach would be to implement a “cryptographic agility” that would allow you to change the hash algorithm even after the application has been deployed into production. After all, cryptographic algorithms go stale over time; cryptographers find weaknesses and computing power increases to the point where brute force approaches become feasible. Someday SHA-2 may be considered just as weak as MD5, so planning for this eventuality early may save you a lot of trouble down the road. An in-depth look at hashing agility is beyond the scope of this post, but you can read more about a proposed solution in the MSDN Magazine article Cryptographic Agility. And just as the SDL mandates the use of strong cryptographic algorithms in Microsoft products, it also encourages product teams to use crypto agility where feasible so that teams can more nimbly migrate to new algorithms in the event that a current strong algorithm is broken.
So far, we’ve talked about what to hash (the password and a random unique salt value) and how to hash (using a cryptographically strong hash algorithm in the SHA-2 family, and preferably configurable to allow for future change), but we haven’t talked about where to hash. You might think that performing the hashing on the client tier would be a significant improvement in security, since you’d only need to send the hash over the wire to the server and never the plaintext password itself. However, this doesn’t buy you as much benefit as you’d think. If an attacker has a means of sniffing network traffic, he could still intercept the call and pass the hash to the server himself, thus spoofing the user and taking over his session. At this point, the hash essentially becomes the plaintext password. The only real benefit to this approach is that if the victim is using the same password on multiple web sites, the attacker won’t be able to compromise the victim’s account on those other sites as well, since knowing the hash of a password tells you nothing about the password itself. A better way of defending against this attack is just to perform the hashing on the server side, but to ensure that the password and all credential tokens such as session cookies are always transmitted over SSL/TLS. We’ll explore the topic of secure credential transmission (and other aspects of password management such as password complexity and expiration) in future blog posts.
By following a few simple guidelines, you can help to ensure that your application’s users’ credentials remain secure, even if your database is compromised:
- Always store and compare hashes of passwords, never the plaintext passwords themselves.
- Apply a random, unique salt value to each password before hashing.
- Use a cryptographically strong hash algorithm such as one from the SHA-2 family.
- Allow for potential future algorithm changes by implementing a cryptographically agile design.
- Hash on the server tier and be sure to transmit all passwords and credential tokens over HTTPS.