algorithm brute-forcing

Dennis Yurichev

unread,

Apr 7, 2007, 8:43:39 AM4/7/07

to

Hi.
What we have: we have a black box with some unknown crypto algorithm
inside.
We can test it as many times as we want, placing input data and key at
input and see output.
We know that black box is just one something like LFSR implementation
(Linear feedback shift register), modified somehow, but we are not
sure. We're only sure that this algorithm is very compact.
And our task is to see this algorithm in details.
Are we able to do a bruteforce in such way, where each possible
variation of algorithm will be tested and checked?
Can we choice or build some Turing-complete language for this purprose?

Richard Heathfield

unread,

Apr 7, 2007, 9:14:45 AM4/7/07

to

Dennis Yurichev said:

> Hi.
> What we have: we have a black box with some unknown crypto algorithm
> inside.

Invest in a screwdriver.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

Scott McPhillips [MVP]

unread,

Apr 7, 2007, 10:19:49 AM4/7/07

to

High performance programmable parallel hardware helps a lot:
http://www.annapmicro.com/news.html

--
Scott McPhillips [VC++ MVP]

Dennis Yurichev

unread,

Apr 7, 2007, 11:40:22 AM4/7/07

to

On 7 апр, 17:19, "Scott McPhillips [MVP]" <org-dot-mvps-at-scottmcp>
wrote:

I know, but I'm looking for any known methods or ideas.

Logan Shaw

unread,

Apr 7, 2007, 2:22:17 PM4/7/07

to

Dennis Yurichev wrote:
> What we have: we have a black box with some unknown crypto algorithm
> inside.
> We can test it as many times as we want, placing input data and key at
> input and see output.
> We know that black box is just one something like LFSR implementation
> (Linear feedback shift register), modified somehow, but we are not
> sure. We're only sure that this algorithm is very compact.
> And our task is to see this algorithm in details.
> Are we able to do a bruteforce in such way, where each possible
> variation of algorithm will be tested and checked?

This sounds somewhat related to Kolmogorov complexity. You may have to
account for the Halting Problem, but there are probably steps you can
take to deal with that.

I guess my first question is how much you know about the algorithm
inside the black box. For example, you know that it is "very compact",
but do you have concrete information about how compact? And do you
know what the algorithm is implemented in terms of? That is, do you
know that it is "very compact" when implemented using x86 instructions?

The reason I ask this is that the compactness of actual code depends
on the instruction set. For example, suppose you have one instruction
set that just has the regular integer operations. Now compare that to
another instruction set that has single instruction for "compute the
SHA1 sum of a string". You can put an arbitrary algorithm into an
instruction, and when you do that, you affect the compactness of the
code written using that instruction set.

So, let's suppose that your black box contains not an algorithm, but
a compact *implementation of an algorithm* in a known instruction set.
Perhaps it's x86, although it doesn't really matter which one it is.
Let's also suppose you have an actual specific upper bound on the
size of the implementation. Say, for example, you know it's 10
instructions long.

If both these things are true, then it will be possible to find the
algorithm. All you have to do is create all possible programs up
to the maximum length, then run those programs and see if they get
the same output as the black box does. As you find that algorithms
get different output, eliminate them.

If you find that you have eliminated all but one algorithm, this
remaining algorithm must be the same as the one in the black box.
This is because there must be at least one algorithm (after all,
the black box contains it, so it does exist), and since the others
had different output, they aren't it. Therefore, the one that has
not been eliminated must be it. This is essentially just the
process of elimination.

The question then becomes whether you can eliminate all but one
algorithm. This is a bit tougher. Algorithms that aren't equivalent
must eventually differ. And they must differ on a finite input.
Since you can go through all possible inputs (starting with the short
ones and moving towards the longer ones), eventually you must reach
a point where algorithms will have different outputs if they are not
in fact equivalent algorithms. Unfortunately, there is no way to
know how long this will take.

Naturally, if you eliminate all algorithms but one, then you are
done, because you know there is at least one algorithm that is
correct, so you would be finished in that case. The problem comes
in where you might have more than one sequence of code which are
equivalent implementations of the same algorithm! Unless you have
a general way to examine a set of sequences of code and determine
whether they are equivalent, I don't see a way around this.

I think this means that, in the general case, this problem cannot
be solved.

- Logan

Dennis Yurichev

unread,

Apr 7, 2007, 3:02:25 PM4/7/07

to

On 7 апр, 21:22, Logan Shaw <lshaw-use...@austin.rr.com> wrote:

> I guess my first question is how much you know about the algorithm
> inside the black box. For example, you know that it is "very compact",
> but do you have concrete information about how compact? And do you
> know what the algorithm is implemented in terms of? That is, do you
> know that it is "very compact" when implemented using x86 instructions?

I think, roughly, less than 50 x86 instructions.
The interesting point is that some parts of algorithms may be
eliminated.
For example, two instructions together like "a=a+1; a=a-1;" are
nonsence and, can be such things eliminated before algorithm is
checked for inputs and outputs?
Just interesting, is there any known papers which already have what
I'm looking for..

Dennis Yurichev

unread,

Apr 7, 2007, 3:12:11 PM4/7/07

to

Also, can we build such program that can answer to question: "here is
I as input data and K as key and here is O as output data, how many
x86 instructions can be used to transform input+key to output?"
Also, at some point we can have some algotirhm than is working OK for
10 known input+key+output triads but doesn't work for 11th. Are we
able to say, what part of algorithm is worth to rebuild and which one
is, most likely, don't?

Dennis Yurichev

unread,

Apr 7, 2007, 3:23:33 PM4/7/07

to

Also, another subquesion: is there any known method that can map any
algorithm to natural number is such way that two equivalent algorithms
cannot have different numbers?

Daniel Rudy

unread,

Apr 10, 2007, 4:19:14 AM4/10/07

to

At about the time of 4/7/2007 5:43 AM, Dennis Yurichev stated the following:

There are two very compact algorithms out there that it *could* be, and
they are quite common: AES and RC4. Considering that you said that it
looks like a LFSR, and that the code size is about 50 x86 instructions,
I would guess RC4 or some variation of it.

--
Daniel Rudy

Email address has been base64 encoded to reduce spam
Decode email address using b64decode or uudecode -m

Why geeks like computers: look chat date touch grep make unzip
strip view finger mount fcsk more fcsk yes spray umount sleep

Dennis Yurichev

unread,

Apr 10, 2007, 5:15:12 AM4/10/07

to

On 10 апр, 11:19, Daniel Rudy <spamt...@spamthis.net> wrote:
> At about the time of 4/7/2007 5:43 AM, Dennis Yurichev stated the following:
>
> > Hi.
> > What we have: we have a black box with some unknown crypto algorithm
> > inside.
> > We can test it as many times as we want, placing input data and key at
> > input and see output.
> > We know that black box is just one something like LFSR implementation
> > (Linear feedback shift register), modified somehow, but we are not
> > sure. We're only sure that this algorithm is very compact.
> > And our task is to see this algorithm in details.
> > Are we able to do a bruteforce in such way, where each possible
> > variation of algorithm will be tested and checked?
> > Can we choice or build some Turing-complete language for this purprose?
>
> There are two very compact algorithms out there that it *could* be, and
> they are quite common: AES and RC4. Considering that you said that it
> looks like a LFSR, and that the code size is about 50 x86 instructions,
> I would guess RC4 or some variation of it.

Oh no, LFSR is just as an example. Algorithm can be not related to
crypto field. By interest is in consideration of all possible
algorithms using reasonable resources and time.

Chris Uppal

unread,

Apr 10, 2007, 7:44:56 AM4/10/07

to

Dennis Yurichev wrote:

> By interest is in consideration of all possible
> algorithms using reasonable resources and time.

If you mean that you want to consider all possible algorithms, but only
take reasonable time/space to do it, then give up now.

There is a /much/ more than astronomical number of possible
computations from, say. 100 bytes of input to 100 bytes of output.
There is a more than astronomical number of possible programs which can
be expressed in, say, 150 bytes of i86 binary code. The mapping from
one to the other is not continuous (small changes to the program can
cause large changes to the output, and small changes to the desired
output may require large changes to the program). The mapping is not
only non-continuous, it is also non-computable.

That said, if you are happy with very crude approximations, and are not
trying to recover a target transformation which has been designed to be
difficult/impossible to recover (e.g. any crypto algorithm when
considered together with its key(s)), then you may be able to use
techniques of machine learning. Read about "genetic programming" for
instance:
http://www.genetic-programming.org/
http://en.wikipedia.org/wiki/Genetic_programming
(Warning: both the above links include what stikes me as excessive
levels of hype.)

-- chris

SucMucPaProlij

unread,

Apr 10, 2007, 1:59:30 PM4/10/07

to

"Dennis Yurichev" <Dennis....@gmail.com> wrote in message
news:1175949819....@n76g2000hsh.googlegroups.com...

you can learn a lot about this algorithm if you test it with right input data of
small length (one byte, word or dbl. word). If this algorithm is linear (in any
way) then you can expect that result of binary input 010001 is (some sort of)
linear combination or results from inputs 010000 and 000001. This is just one
idea you can use and I don't think you will find some universal solution for
your problem.

Daniel Rudy

unread,

Apr 10, 2007, 5:39:59 PM4/10/07

to

At about the time of 4/10/2007 2:15 AM, Dennis Yurichev stated the

But you just said it was crypto. You asked us about how to find out
what it is, so we told you. I gave you a couple of good guesses to try
out on your own.

What is it that you want? You either want to find what the algorithm
is, or not. It's your call.