Blowfishis a symmetric-key block cipher, designed in 1993 by Bruce Schneier and included in many cipher suites and encryption products. Blowfish provides a good encryption rate in software, and no effective cryptanalysis of it has been found to date for smaller files. It is recommended Blowfish should not be used to encrypt files larger than 4GB in size, Twofish should be used instead.[3]
Schneier designed Blowfish as a general-purpose algorithm, intended as an alternative to the aging DES and free of the problems and constraints associated with other algorithms. At the time Blowfish was released, many other designs were proprietary, encumbered by patents, or were commercial or government secrets. Schneier has stated that "Blowfish is unpatented, and will remain so in all countries. The algorithm is hereby placed in the public domain, and can be freely used by anyone."[5]
Blowfish has a 64-bit block size and a variable key length from 32 bits up to 448 bits.[5] It is a 16-round Feistel cipher and uses large key-dependent S-boxes. In structure it resembles CAST-128, which uses fixed S-boxes.
The F-function splits the 32-bit input into four 8-bit quarters and uses the quarters as input to the S-boxes. The S-boxes accept 8-bit input and produce 32-bit output. The outputs are added modulo 232 and XORed to produce the final 32-bit output (see image in the upper right corner).[6]
Decryption is exactly the same as encryption, except that P1, P2, ..., P18 are used in the reverse order. This is not so obvious because xor is commutative and associative. A common misconception is to use inverse order of encryption as decryption algorithm (i.e. first XORing P17 and P18 to the ciphertext block, then using the P-entries in reverse order).
Because the P-array is 576 bits long, and the key bytes are XORed through all these 576 bits during the initialization, many implementations support key sizes up to 576 bits. The reason for that is a discrepancy between the original Blowfish description, which uses 448-bit keys, and its reference implementation, which uses 576-bit keys. The test vectors for verifying third-party implementations were also produced with 576-bit keys. When asked which Blowfish version is the correct one, Bruce Schneier answered: "The test vectors should be used to determine the one true Blowfish".
Another opinion is that the 448 bits limit is present to ensure that every bit of every subkey depends on every bit of the key,[5] as the last four values of the P-array don't affect every bit of the ciphertext. This point should be taken in consideration for implementations with a different number of rounds, as even though it increases security against an exhaustive attack, it weakens the security guaranteed by the algorithm. And given the slow initialization of the cipher with each change of key, it is granted a natural protection against brute-force attacks, which doesn't really justify key sizes longer than 448 bits.
Blowfish is a fast block cipher, except when changing keys. Each new key requires the pre-processing equivalent of encrypting about 4 kilobytes of text, which is very slow compared to other block ciphers. This prevents its use in certain applications, but is not a problem in others.
In one application Blowfish's slow key changing is actually a benefit: the password-hashing method (crypt $2, i.e. bcrypt) used in OpenBSD uses an algorithm derived from Blowfish that makes use of the slow key schedule; the idea is that the extra computational effort required gives protection against dictionary attacks. See key stretching.
Blowfish has a memory footprint of just over 4 kilobytes of RAM. This constraint is not a problem even for older desktop and laptop computers, though it does prevent use in the smallest embedded systems such as early smartcards.
Blowfish was one of the first secure block ciphers not subject to any patents and therefore freely available for anyone to use. This benefit has contributed to its popularity in cryptographic software.
bcrypt is a password hashing function which, combined with a variable number of iterations (work "cost"), exploits the expensive key setup phase of Blowfish to increase the workload and duration of hash calculations, further reducing threats from brute force attacks.
Blowfish's use of a 64-bit block size (as opposed to e.g. AES's 128-bit block size) makes it vulnerable to birthday attacks, particularly in contexts like HTTPS. In 2016, the SWEET32 attack demonstrated how to leverage birthday attacks to perform plaintext recovery (i.e. decrypting ciphertext) against ciphers with a 64-bit block size.[11] The GnuPG project recommends that Blowfish not be used to encrypt files larger than 4 GB[3] due to its small block size.[4]
A reduced-round variant of Blowfish is known to be susceptible to known-plaintext attacks on reflectively weak keys. Blowfish implementations use 16 rounds of encryption, and are not susceptible to this attack.[12][13]
I want to clarify that I understand the difference between hashing and encryption. What prompted me to ask the question this way is this article, where the author refers to bcrypt as "adaptive hashing"
Since bcrypt is based on Blowfish, I was led to think that Blowfish is a hashing algorithm. If it's encryption as answers have pointed out, then seems to me like it shouldn't have a place in this article. What's worse is that he's concluding that bcrypt is the best. What's also confusing me now is that the phpass class (used for password hashing I believe) uses bcrypt (i.e. blowfish, i.e. encryption). Based on this new info you guys are telling me (blowfish is encryption), this class sounds wrong. Am I missing something?
It should suffice to say whether bcrypt or SHA-512 (in the context of an appropriate algorithm like PBKDF2) is good enough. And the answer is yes, either algorithm is secure enough that a breach will occur through an implementation flaw, not cryptanalysis.
If you insist on knowing which is "better", SHA-512 has had in-depth reviews by NIST and others. It's good, but flaws have been recognized that, while not exploitable now, have led to the the SHA-3 competition for new hash algorithms. Also, keep in mind that the study of hash algorithms is "newer" than that of ciphers, and cryptographers are still learning about them.
Note: bcrypt is an algorithm that uses Blowfish internally. It is not an encryption algorithm itself. It is used to irreversibly obscure passwords, just as hash functions are used to do a "one-way hash".
Cryptographic hash algorithms are designed to be impossible to reverse. In other words, given only the output of a hash function, it should take "forever" to find a message that will produce the same hash output. In fact, it should be computationally infeasible to find any two messages that produce the same hash value. Unlike a cipher, hash functions aren't parameterized with a key; the same input will always produce the same output.
If someone provides a password that hashes to the value stored in the password table, they are authenticated. In particular, because of the irreversibility of the hash function, it's assumed that the user isn't an attacker that got hold of the hash and reversed it to find a working password.
Now consider bcrypt. It uses Blowfish to encrypt a magic string, using a key "derived" from the password. Later, when a user enters a password, the key is derived again, and if the ciphertext produced by encrypting with that key matches the stored ciphertext, the user is authenticated. The ciphertext is stored in the "password" table, but the derived key is never stored.
In order to break the cryptography here, an attacker would have to recover the key from the ciphertext. This is called a "known-plaintext" attack, since the attack knows the magic string that has been encrypted, but not the key used. Blowfish has been studied extensively, and no attacks are yet known that would allow an attacker to find the key with a single known plaintext.
So, just like irreversible algorithms based cryptographic digests, bcrypt produces an irreversible output, from a password, salt, and cost factor. Its strength lies in Blowfish's resistance to known plaintext attacks, which is analogous to a "first pre-image attack" on a digest algorithm. Since it can be used in place of a hash algorithm to protect passwords, bcrypt is confusingly referred to as a "hash" algorithm itself.
Assuming that rainbow tables have been thwarted by proper use of salt, any truly irreversible function reduces the attacker to trial-and-error. And the rate that the attacker can make trials is determined by the speed of that irreversible "hash" algorithm. If a single iteration of a hash function is used, an attacker can make millions of trials per second using equipment that costs on the order of $1000, testing all passwords up to 8 characters long in a few months.
If however, the digest output is "fed back" thousands of times, it will take hundreds of years to test the same set of passwords on that hardware. Bcrypt achieves the same "key strengthening" effect by iterating inside its key derivation routine, and a proper hash-based method like PBKDF2 does the same thing; in this respect, the two methods are similar.
So, my recommendation of bcrypt stems from the assumptions 1) that a Blowfish has had a similar level of scrutiny as the SHA-2 family of hash functions, and 2) that cryptanalytic methods for ciphers are better developed than those for hash functions.
You can of course build a secure password hashing algorithm around SHA-512 by iterating it thousands of times, just like the way PHK's MD5 algorithm works. Ulrich Drepper did exactly this, for glibc's crypt(). There's no particular reason to do this, though, if you already have a tested bcrypt implementation available.
Since Blowfish is old, well-audited, and has no published attacks, why are we using AES instead? I know that Bruce Schneier said that Blowfish is insecure and told people to transition to Twofish, but why? AES has many vulnerabilities, such as padding oracle attacks and power-consumption analysis, but Blowfish doesn't have any well-known issues. And Blowfish can take a key up to 448 bits, which is larger than AES's 256 bits. Is there something I am not understanding? It's been under public scrutiny for the longest with the least issues.
3a8082e126