> So it looks like boolean function implementation has 32x advantage
> EXCEPT slow + operator. I doubt that having cheap + gives 32+x advantage.
Well, it turns out that the "boolean function" approach is viable,
although not as good as I had hoped.
Full SHA-256 (64 rounds plus block expansion) requires 121438 operations
(DAG nodes: AND, XOR, OR, NOT).
But when 480 of the 512 input bits are fixed to constants (32 left
free), only about 86000 operations remain after optimization.
This is a lot, but on a 64-bit CPU I can try 64 combinations (of the
lower 6 bits) at once, since it does AND, XOR, OR, NOT on 64 bits in
parallel. SSE can do 128 bits at once, or even 256 bits with AVX2.
Now it works out to the equivalent of 1343 operations per combination
(86000 / 64).
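The bit-parallel trick works like this: one 64-bit word holds a single
input bit across all 64 candidate messages, so one AND/XOR/OR/NOT
evaluates that gate for all 64 candidates at once. A minimal sketch on a
toy gate network (the gates and variable names here are hypothetical,
not actual SHA-256 nodes):

```python
MASK = (1 << 64) - 1

# Slice constants for the 6 varying low bits: bit j of SLICE[i]
# is bit i of candidate number j (j = 0..63).
SLICE = [sum(((j >> i) & 1) << j for j in range(64)) for i in range(6)]

def eval_gates(x):
    """Evaluate a toy 3-gate DAG bit-parallel over 64-bit words."""
    a = x[0] & x[1]           # AND gate
    b = (x[2] ^ x[3]) & MASK  # XOR gate
    return (~(a | b)) & MASK  # NOR of the two

# Fixed input bits would be all-zeros or all-ones words; the varying
# bits use the SLICE constants.
out = eval_gates([SLICE[0], SLICE[1], SLICE[2], SLICE[3]])

# Cross-check candidate 37 against a plain scalar evaluation.
j = 37
bits = [(j >> i) & 1 for i in range(6)]
assert (out >> j) & 1 == 1 - ((bits[0] & bits[1]) | (bits[2] ^ bits[3]))
```

The same code vectorizes directly to 128/256-bit registers: only the
word width and the slice constants change.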
On the other hand, traditional SHA-256 requires:
19 and/or/xor/+ operations per round
+ 6 right-rotates per round, each a single ror instruction on x86,
thus 6 more operations
= 25*64 = 1600 operations
+ 21 operations per expanded word in the message schedule (taking
ror32 as 3 operations where there is no native rotate, AFAIK), thus
(64-16)*21 = 1008 operations
So, together 2608 operations. (Or 166912 operations per 64 combinations.)
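For reference, these counts come straight from the standard FIPS 180-4
SHA-256 structure: Ch, Maj and the two big Sigmas plus adds per round,
and the two small sigmas plus three adds per expanded word. A compact
Python transcription, checked against hashlib:

```python
from math import isqrt
import hashlib

M32 = 0xFFFFFFFF

def rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & M32

def _primes(n):
    ps, c = [], 2
    while len(ps) < n:
        if all(c % p for p in ps):
            ps.append(c)
        c += 1
    return ps

def _icbrt(n):
    # Exact integer cube root (Newton's method), to derive K precisely.
    x = 1 << -(-n.bit_length() // 3)
    while True:
        y = (2 * x + n // (x * x)) // 3
        if y >= x:
            return x
        x = y

P = _primes(64)
K = [_icbrt(p << 96) & M32 for p in P]       # frac(cbrt(prime)) * 2^32
H0 = [isqrt(p << 64) & M32 for p in P[:8]]   # frac(sqrt(prime)) * 2^32

def sha256(msg):
    h = list(H0)
    bits = len(msg) * 8
    msg = msg + b"\x80" + b"\x00" * ((55 - len(msg)) % 64) \
        + bits.to_bytes(8, "big")
    for off in range(0, len(msg), 64):
        w = [int.from_bytes(msg[off + 4*i: off + 4*i + 4], "big")
             for i in range(16)]
        for t in range(16, 64):          # expansion: 2 sigmas + 3 adds
            s0 = rotr(w[t-15], 7) ^ rotr(w[t-15], 18) ^ (w[t-15] >> 3)
            s1 = rotr(w[t-2], 17) ^ rotr(w[t-2], 19) ^ (w[t-2] >> 10)
            w.append((w[t-16] + s0 + w[t-7] + s1) & M32)
        a, b, c, d, e, f, g, hh = h
        for t in range(64):              # round: Ch, Maj, 2 Sigmas, adds
            S1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g & M32)
            t1 = (hh + S1 + ch + K[t] + w[t]) & M32
            S0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            hh, g, f, e, d, c, b, a = \
                g, f, e, (d + t1) & M32, c, b, a, (t1 + S0 + maj) & M32
        h = [(x + y) & M32 for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    return b"".join(x.to_bytes(4, "big") for x in h)

assert sha256(b"abc") == hashlib.sha256(b"abc").digest()
```

Counting the rotates, shifts, boolean ops and adds in the two inner
loops reproduces the per-round and per-word figures above.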
Of course, a modern superscalar CPU can execute more than one operation
per cycle, and SIMD (SSE, AVX) can do several 32-bit operations at
once. But the same optimizations apply to the boolean function
implementation, except that it might require more loads and stores.
Thus:
* when no variables are fixed, we have 121k operations against 167k,
but the 167k require almost no loads/stores, so the traditional code
might still be faster
* fixing a number of variables cuts the boolean function case down to
86k operations, so it might be viable (roughly half the operations),
but then it requires dynamic code generation (JIT compilation). It
isn't rocket science, but it requires some work.
* boolean function based implementation can be further optimized
through:
* * pre-computation of some nodes at the 'fringe', close to the input
block material -- ones which depend on a limited number of variables.
Such nodes can be replaced with lookup tables, BDDs, disjunctive normal
forms or whatever, and thus their dependencies can be shaved off.
* * conditionally removing portions of the graph which do not affect
the result. E.g. if we have a node AND(x1, x2) and x1 is FALSE, we do
not need to compute x2. Some nodes at the fringe are likely to be zero
or one, so it makes sense to compute their dependencies conditionally.
* * tuning number of fixed variables
* * straight boolean function optimization, and use of combined ops
where the CPU provides them, like AND-NOT (pandn/andn) etc.
* * early rejection -- there is no need to compute everything before we
can rule out combination(s), and I think it is possible to find the
most informative nodes through a combination of statistics and boolean
function transformations
* I have tried only the pre-computation approach so far, and it appears
that at least 5k operations out of the 86k can be optimized away that
way. Likely more with deeper pre-computation.
So this approach is a potentially viable optimization if you want to
run a SHA-256 pre-image attack on a CPU, although the gain probably
isn't huge. I don't know whether it can be used on a GPU (likely there
will be some obstacle like a code size limit) or in hardware (FPGA).
It isn't interesting for bitcoin mining, because mining computes
SHA256(SHA256(block_header))
and the second SHA-256 invocation cannot be optimized through
pre-computation. So even if the first SHA-256 becomes twice as fast,
the second stays the same or gets slower, so in the best case it is
only 25% faster. Hardly worth the effort when people mine on GPUs.
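The 25% figure is just this arithmetic, normalizing one SHA-256 pass to
one time unit:

```python
# Best case: first SHA-256 pass becomes 2x faster, second is unchanged.
t_plain = 1.0 + 1.0   # two full SHA-256 passes per double-hash
t_opt = 0.5 + 1.0     # halved first pass + unchanged second
saving = (t_plain - t_opt) / t_plain
assert saving == 0.25  # at best 25% less time per double-hash
```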