What I thought would be a 10 minute job to migrate a few lines of code of PHP to Go turned into a 8 hour marathon of thought and discovery. I needed to move our password hash checking code.

This is really a story about how PHP's crypt() works (or doesn't) and why it's never a good idea to try and generate your own cryptographically secure salts.

You may find this useful if you also need to deal with hashing passwords (granted, there are a lot of other better ways to do this) or you just feel like a few "What the!? Ah ha!" moments.

If the code comment is doing it's job I shouldn't need to explain any further...

import "golang.org/x/crypto/bcrypt"

// CheckPassword is used for Basic authentication to check the secret. The logic
// has been copied from the existing implementation in:
// protected/models/traits/HasPassword.php
// The PHP source code will be copied verbatim here, just in case something
// changes it can be compared:
// private static function digestPassword($password, $salt = null)
// {
// static $cryptPrefix = '$2y$10$';
// if ($salt === null) {
// $salt = str_replace(
// '+',
// '.',
// substr(base64_encode(md5(uniqid('', true), true)), 0, 21)
// );
// }
// return substr(crypt($password, $cryptPrefix . $salt . '$'), 7);
// }
// Turns out this is not a trivial problem. In fact it took an entire day to
// find out how and why salt was being handled incorrectly to produce the wrong
// hashes.
// Well, the hashes aren't wrong, per se, but if you fed the same password and
// salt into the equivalent Go built-in (bcrypt) package it would not work. Let
// me explain.
// The crypt() function in PHP, like many functions in PHP is a wrapper for the
// C libraries underneath. That can make it subject to underlying
// implementations. This is important to highlight because while PHP "wrapper"
// does have known and documented bugs with its crypt() function (specifically
// around Blowfish, which we are using). None of those are actually occurring
// (as far as I could tell, I will explain more later) with the difference in
// implementation (since the bcrypt package is pure Go).
// The crypt() function works by taking a password, encryption options
// (including the type of encryption) and an optional salt. The salt is optional
// because one will be generated for you if not provided. However, as you can
// see from the Kounta PHP implementation above we generate our own salt. This
// is what has caused the issue. Or at least, triggered it.
// It is a little confusing at first, but the reason why the salt is generated
// automatically for you but not returned or stored independently is because the
// salt is actually baked into (excuse the pun) the output hash. The result of
// using the crypt() function with Blowfish looks like this:
// crypt('foobar', '$2a$10$1234567890123456789012')
// => $2a$10$123456789012345678901udiU8wKV.famJggjCUFdUGtiTNmkIXYW
// ^-----^--------------------^-------------------------------^
// 1 2 3
// 1. Information about the encryption options used. The "2a" (mainly the "2")
// describes that we are using Blowfish. The "10" is the cost, which can range
// from 04 to 31. This in a nutshell is about how much CPU time you want to
// spend to generate the hash. Presumably higher is better, but 10 seems to be
// default in existing implementations I've seen.
// 2. This looks familiar right? Well, it is. It is most of the salt. "Most of"
// is the extremely confusing part that is the cause of this very long
// explanation. I'll save the in-depth for below.
// 3. This is the actual calculated Blowfish hash. It is not used by itself, but
// in conjunction with the salt part to verify the password again.
// So how is this hash used to actually test the password in the future,
// exactly? The oversimplified answer is we extract the encryption options and
// salt from the existing known hash. Running the password we wish to test with
// these back through the crypt() function should generate the same hash. When
// the entire hashes (all three parts) are equal the password must be correct.
// This is absolutely true, but it's also true that when the hashes are not
// equal under just the right cases the password is still valid.
// How is this possible if hashes are supposed to be deterministic? The same
// input(s) should always reveal the same outputs, right? It comes down the the
// salt. Remember how I said that the "most" of the salt makes it into the hash?
// The salt (for Blowfish) must be exactly 22 characters and encoded with base
// 64. However, it turns out that it doesn't use all of the salt. In fact it
// only uses 128 bits of it. Here is an example:
// // Notice only the last digit of the sale is different.
// crypt('foobar', '$2a$10$1234567890123456789012')
// crypt('foobar', '$2a$10$1234567890123456789013')
// Both of these calls produce the exact same hash, despite the different salt:
// $2a$10$123456789012345678901udiU8wKV.famJggjCUFdUGtiTNmkIXYW
// Great, so what's the last digit for if it's not needed? Well here's another
// example:
// // Once again only the last digit is different, but this time by a slightly
// // larger amount.
// crypt('foobar', '$2a$10$1234567890123456789012')
// crypt('foobar', '$2a$10$123456789012345678901A')
// This however generated two very different hashes:
// $2a$10$123456789012345678901udiU8wKV.famJggjCUFdUGtiTNmkIXYW
// $2a$10$123456789012345678901./ft8TTkhYi0llXsqn8E23vNOOhf0eM.
// What's going on and how does this relate to the PHP version? Look at the PHP
// version again and you will see that it is generating a key that is 21
// characters (not the required 22) but then appending a '$' below. The '$'
// could be misconstrued as a terminator since it's the same character we wrap
// the encryption options in. It's not.
// It actually being used as part of the salt. The not so illustrious 22nd
// required character. There one obvious problem with that. A '$' is not an
// acceptable character in base 64. So we cannot pass the same character through
// to the equivalent Go library without incurring a very reasonable error.
// Those familiar with base 64 may know that encoded data must always be a
// multiple of 4 bytes. If the data length is not a modulo of 4, then it's
// padded with '=' to the correct length. The '=' itself is not part of the base
// 64 alphabet so cannot appear anywhere else in the stream. However, on Unix
// '$' is considered to be part of the alphabet AND a padding character. Either
// by coincidence or not the '$' in this case is being treated as a padding
// character, like a '='.
// Let's go back to an example. Here is a real password and hash:
// $password = 'WB5i4u8TNQvJIbYfW2nQC3o6W1dLR0R08iuo3e6b';
// $hash = '$2a$10$37OBUteY4OKy.T9lB0cvJ.7s0Desj76NjmmcShDPF77vGgd2zWUA.'
// When testing this same combination in Go the CompareHashAndPassword() returns
// false. That is because internally it generates it's own hash that is expects
// to be equal:
// $2a$10$37OBUteY4OKy.T9lB0cvJ.UD/NmqeYcboloNeVVcvExyrCjFMzC5u
// The hash is completely different, but the password is still correct. The
// amount of significant bits from the salt is different.
// My only option was to brute force generate all combinations with the last two
// characters of the salt so see if I could generate the same Blowfish hash with
// a very similar salt because I knew similar salts produce the same hash.
// Fortunately I found a match, actually several. The salt could be modified to
// adjust for the right amount of significant bits by resetting both of the last
// characters to 0. In base 64 this is "..".
// When applying this "hack" to more examples (10 randomly generated passwords
// and hashes) it worked all cases. So it looks like this solves for the general
// case. These have been put into unit tests.
// The moral of this story is you should not generate your own salt.
// [1] https://stackoverflow.com/a/2237009/1470961
func CheckPassword(password, hash string) bool {
realExpectedHash := []byte("$2a$10$" + hash[:20] + ".." + hash[22:])

return bcrypt.CompareHashAndPassword(realExpectedHash, []byte(password)) == nil