> > The steps it takes 1 walker (W) to find the edge of the square is about
> > N^2. The steps it takes one walker, starting at the middle, to find a
> > specific square N distance away is 8^N.
>
> That estimate is almost definitely wrong, BTW. See my reply to
> Nautilus upthread. However, you are (almost certainly) correct that
> the expected number of steps for one walker is likely exponential in
> the distance.
I don't think my calculations are off to any significant degree. If
every space has equal odds of being hit by random walk, which is most
certainly true (even the start space in a 2-D lattice), then the chance
that a specific square will be hit are most certainly right around X^N.
It may be slightly off, but I think it is fairly close. Dembski, who is
a mathematician, uses these same odds in his own calculations of random
walks to a specific target:
"Given these provisos, the stopping time for the random walk to
reach the target T will have expected value whose order of magnitude is
1U(T ) (for the relevant mathematics, see Spitzer 2001). The stopping
time here is simply the number of steps required for the random walk to
reach the target. Thus, for instance, the average number of steps for a
random walk to locate a particular protein 100 amino acids in length
will have order of magnitude 1 U(T ) = 1 1/20^100 = 20^100 ˜ 10^130.
Thus, we find that random walks, as with uniform random sampling, are
no better than exhaustive search in decreasing the number of points
that, on average, need to be examined before the search succeeds."
http://www.designinference.com/documents/2005.03.Searching_Large_Spaces.pdf
So you see, random walks and random sampling have the same odds of
success. There just is no advantage to a random walker over a random
sampler in finding a specific target.
> > The average number of steps it takes for
> > 2 walkers to find a specific square N distance
> > away, both starting from the same starting point,
> > is greater than 8^N/2.
> >
> > The formula is not N^2 with N walkers.
> >
> > The right answer is > 8^N/W.
> >
> > To illustrate, lets use our 5 x 5 grid again. The N distance from the
> > middle square to a specific square on the edge would be 2 steps. This
> > specific square would be found be a single random walker in 8^2 (8^N)
> > or 64 steps. I think we both agree so far. You are suggesting that
> > two random walkers, both starting from the center square would find the
> > specific square on the edge of our 5 x 5 grid, on average, in 2^2 steps
> > (N^2) or just 4 steps. Does this sound right to you?
>
> Of course not. There are 8N+4 squares along the perimeter of the grid
> in this model (e.g. 20 for N=2, 28 in a 7x7 grid where N=3, etc.)
Not true. Count the squares along the edge of a 7 x 7 grid and you
will see that there are only 24 squares.
> Therefore, I'll need 8N+4 walkers before I can expect that one of them
> will hit the target square in O(N^2) steps. Notice that the O(N)
> notation hides constants and constant factors. N and 8N+4 are both
> O(N).
>
> http://en.wikipedia.org/wiki/Big_O_notation
>
> In your example of a 5x5 grid, I'd need 20 walkers, walking in
> lockstep, before I can expect that one of them will hit the target
> point after 4 steps have been made. Again, see my reply to Nautilus
> for a more detailed explanation.
Your calculations are wrong and you are confusing poor Nautilus who
obviously doesn't know any better (see explanation for the correct
formula below).
> > The odds of one walker going one step in the right direction are 1 in 8
> > steps.
>
> More, since there is more than one way to reach every point, but OK.
Not in this scenario there's not. There's only 1 way to reach a
neighboring square.
> > The odds of both walkers going one step in the right direction are 1 in
> > 64 steps.
> >
> > The odds of one walker going one step in the wrong direction are 7 in 8
> > steps.
> >
> > The odds of both walkers going one step in the wrong direction are 49
> > in 64 steps.
> >
> > The odds of at least one walker going 1 step in the right direction are
> > 15 in 64 steps (1 in 4.26666.... steps)
>
> Likewise, by your reckoning, if you
> have k walkers, the probability that
> at least one of them
> heads in the right direction is
> 1-Prob(all headed in wrong direction) =
> 1 - (7/8)^k.
The odds that at least one of the walkers will step in the right
direction is not 1 - (7/8)^k.
For 2 walkers this formula would given you: 0.125^2 = 0.015625 x 64 =
1 in 64 steps.
That's not the odds of at least 1 of 2 walkers headed in the right
direction (out of 8 equal options). The correct odds are 15/64 or 1 in
4.266666....
You don't seem very good at making up formulas to represent real
statistical problems.
What you do is take 1 - Prob(all headed in the wrong direction)*(X^N) /
X^N
= 1 - (((1-X/X)^k)*(X^N)) / (X^N)
= 1 - ((7/8)^2)*(8^2) / (64)
= 1 - ((0.765626)^2)*(64) / (64)
= 0.234375 * 64 / 64
= 15/64
= 4.2666 steps on average for two walkers to find a target 1 step away
(the right answer)
That's the right formula.
> In other words, the probability approaches 1
> exponentially quickly as k increases.
> See where you're going wrong yet?
The correct formula is 1 - (((1-X/X)^k)*(X^N)) / (X^N)
Which is pretty darn close to my original formula of X^N/W
X = the number of potential directions that can be taken from any one
location
N = the minimum number of steps to a particular target
W (or k) = the number of walkers
Your formula of N^2 is wrong. Each increase in the number of walkers
does not decrease the search time exponentially as you say. It divides
the search time fairly equally (though a little less than equal) like I
originally said.
> > > > If all the walkers start from the same point and the same time, this
> > > > doesn't work.
> > >
> > > *Sigh*. After I gave you the example of a random with restarts, I was
> > > hoping you'd catch on an arrive at the correct answer for the multiple
> > > walkers scenario. Alas, you dashed my hopes yet again.
> >
> > Sorry . . .
>
> Come on, Sean, think about this for a minute. The N simulataneous
> walkers is analagous to the scenario of a single walker with mutliple
> restarts.
Nope. It is analogous to multiple walkers *without* restarts. If the
*average* for one walker without a restart is X^N, and it is, then it
is quite clear that the average for multiple walkers without restarts
is X^N/W.
Look at Dembski's paper. The odds of random walk are the same as
random sampling. More random sampler's would just divide up the time
(though a little less because the samplers can sample previously
sampled choices). It would not reduce the time to success
exponentially like you say. The same thing happens with increased
numbers of random walkers.
> Each additional walker simply represents an addiitonal
> restart instance. If my calculation for a single walker are correct,
> than my calculations for multiple walkers are trivially correct as
> well.
You're thinking about it all wrong. To prove it to yourself, notice
that your formula doesn't work. Mine does.
> > > > An infinite number of walkers can be used and the
> > > > average time to reach a particular target never gets close to zero.
> > >
> > > That is correct. In our scenario, the expected time to target will
> > > asymptotically approach N from above as the number of walkers
> > > increases. The decline, however is exponential in the number of
> > > walkers.
> >
> > Not true.
> >
> > > > Of course, it is reduced from the time it takes a single walker to find
> > > > the target.
> > >
> > > Exponentially so. Isn't it nice to be able to make that claim, and
> > > have the math to show for it?
> >
> > If your math were only correct it would be quite nice . . .
>
> Well I urge you to read the above explanation, as well as my response
> to Natillus, and reconsider your position. But if you think my
> evolutionist beliefs have blinded me to the ability to compute basic
> probabilities, why don't you run this problem by some of those
> mathematicians who you claim are so impressed by your arguments? Or,
> if you'd like, we can make this interesting: each of us places a
> respectable wager on our position (say, $1000 - a king's ransom for a
> lowly grad student like myself), and we ask a neutral party to code up
> a simulation to see which one of us is right. Interested? ;)
Sure. Let's do it.
> Cheers,
> Leonid.
Sean Pitman
www.DetectingDesign.com
They are just off in relation to the reality.
Just to remind you of the "minor" problem for your theory that still
stands as it always did: what we actually *see* in life are *not*
functionalities widely-dispersed across sequence and structural spaces.
We see clustering, with all sequences converging to very *few* basic
folds.
So, yes - it is concievable that there are great functionalities out
there in the sequence space (we are, actually, developing various tools
to explore all of that). But life works with few functionalities that
are close enough to jump to from very simple starting points.
Every single protein that we do have the structure of clearly fits into
the clustered distrubution mentioned above; including the few flagellar
proteins that have been solved so far. Or did you manage to find
examples of *actual* proteins that do not?
M.
How are my calculations significantly off in relation to this specific
random walk problem . . . ? This particular post is not about "real
life" at the moment. It is specifically about a particular random walk
problem. We can talk about the real life applications elsewhere.
Sean Pitman
www.DetectingDesign.com
poor nautilus :(
I AM NOT A SUBMARINE
> > Well I urge you to read the above explanation, as well as my response
> > to Natillus, and reconsider your position.
I AM NOT A SUBMARINE
<snip snip>
So you guys are really going to do this challenge?
LOL - sorry ; )
Sean Pitman
www.DetectingDesign.com
:)
Leonid, the challenge has been accepted! I shall put fourth 10 dollars
to each camp to show my support! (Hey, I'm on a college budget too).
Furthermore, whoever loses shall put the $1,000 in a suitcase, and buy
enough alcohol for the winner and N other friends. We shall find a
football field, and we shall settle this once and for all!
> :)
>
> Leonid, the challenge has been accepted! I shall put fourth 10 dollars
> to each camp to show my support! (Hey, I'm on a college budget too).
>
> Furthermore, whoever loses shall put the $1,000 in a suitcase, and buy
> enough alcohol for the winner and N other friends. We shall find a
> football field, and we shall settle this once and for all!
I loose ; (
Merry Christmas! ; )
Sean Pitman
www.DetectingDesign.com
_____________________________
> RobinGoodfellow wrote:
> > Seanpit wrote:
> >
> > I don't think my calculations are off to any significant degree. If
> > every space has equal odds of being hit by random walk, which is most
> > certainly true (even the start space in a 2-D lattice), then the chance
> > that a specific square will be hit are most certainly right around X^N.
> >
> > It may be slightly off, but I think it is fairly close. Dembski, who is
> > a mathematician, uses these same odds in his own calculations of random
> > walks to a specific target:
> >
> > "Given these provisos, the stopping time for the random walk to
> > reach the target T will have expected value whose order of magnitude is
> > 1U(T ) (for the relevant mathematics, see Spitzer 2001). The stopping
> > time here is simply the number of steps required for the random walk to
> > reach the target. Thus, for instance, the average number of steps for a
> > random walk to locate a particular protein 100 amino acids in length
> > will have order of magnitude 1 U(T ) = 1 1/20^100 = 20^100 ˜ 10^130.
> > Thus, we find that random walks, as with uniform random sampling, are
> > no better than exhaustive search in decreasing the number of points
> > that, on average, need to be examined before the search succeeds."
> >
> > http://www.designinference.com/documents/2005.03.Searching_Large_Spaces.pdf
>
> Figures you wouldn't understand that the scenario that Dembski is
> describing is completely different from the one you and I are
> discussing. He is talking about an unaided random walk finding a
> specific point in a bounded L-dimensional hypercube of volume 20^L.
> There are exactly 20^L points inside such a hypercube, and, since the
> unaided random walk will eventually approach a stationary uniform
> distribution, his approximation of about 20^L steps is indeed an
> adequate one, since every point has 1/20^N probability of being hit at
> every step (after the walk has gone on for enough steps to "mix" - i.e.
> approach a uniform distribution).
That's right . . .
> You and I, however, are discussing
> the walk on an unbounded 2-Dimensional lattice. Since the number of
> points on such a lattice is infinite, the uniform probability of a
> hitting a specific point via random sampling is exactly 0. (i.e. If
> you throw darts purely at random at an infinite 2-D lattice, you will
> never hit a specific point on that lattice.) On the other hand,
> Polya's theorem establishes that a random walk on such a lattice will
> eventually hit any given point on the lattice with probability 1. That
> is, pick any specific point on the lattice, and you are certain to
> visit that point via random walk after a certain amount of steps, even
> though that number of steps can be arbitrarily large. Therefore, in
> this example, random sampling approximation to a random walk just
> doesn't work.
Perhaps . . .
> > So you see, random walks and random sampling have the same odds of
> > success. There just is no advantage to a random walker over a random
> > sampler in finding a specific target.
>
> Again, I strongly urge you to learn your mathematics from a source
> other than Dembski. (Or are all other authors of statistics textbooks
> blinded by their evolutionist beliefs?) Or at least try to understand
> what Dembski is doing in that case, and how his scenario differs from
> the one you and I are discussing.
Right . . .
> > > Of course not. There are 8N+4 squares along the perimeter of the grid
> > > in this model (e.g. 20 for N=2, 28 in a 7x7 grid where N=3, etc.)
> >
> > Not true. Count the squares along the edge of a 7 x 7 grid and you
> > will see that there are only 24 squares.
>
> Ack. Right you are. 8N instead of 8N+4. There are only 16 squares
> along the edges of a 5x5 grid instead of 20.
; )
< snip >
> > Your formula of N^2 is wrong. Each increase in the number of walkers
> > does not decrease the search time exponentially as you say. It divides
> > the search time fairly equally (though a little less than equal) like I
> > originally said.
>
> Even your own math, wrong as it is, does not bear this out. Your
> latest calculation shows that they expected times are independent of N
> entirely - which, of course, is also wrong.
It was late when I was working on my responded to your last post and I
was very tired from a very long week. And so, for some reason, I didn't
catch that your 1 - (7/8)^k is actually the same as my 1 -
(((X-1/X)^k)*(X^N)) / (X^N) formula for N = 1.
_________
X = the number of potential directions that can be taken from any one
Location
N = the minimum number of steps to a particular target
W (or k) = the number of walkers
_________
> > > Come on, Sean, think about this for a minute. The N simulataneous
> > > walkers is analagous to the scenario of a single walker with mutliple
> > > restarts.
> >
> > Nope. It is analogous to multiple walkers *without* restarts. If the
> > *average* for one walker without a restart is X^N, and it is, then it
> > is quite clear that the average for multiple walkers without restarts
> > is X^N/W.
>
> Think about it just a little longer, Sean. If I restart a walk after
> N^2 steps each time, and do so 2*PI*N times, it is as if I've taken N^2
> steps along 2*PI*N mutually independent random walk paths. This is
> exaclty equivalent to the scenario of 2*PI*N independent walkers
> walking for N^2 steps. There is absolutely no difference.
I agree, except for a minor question about Pi (see below)
What you are talking about is the average number of steps for the
population as a whole - not the total average for the individual steps
taken by each of the walkers to reach the target. In other words, I
was talking about the average of the total number of steps of each
walker while it seems you were talking about the average number of
steps for the population as a whole - which is more relevant to the
problem actually. In that case, your formula seems pretty much correct
and mine is not. The search time for the population is not reduced by
dividing X^N by the number of walkers (W), but by your formula of about
N^2 given 2*Pi*N walkers.
So, given the population as a whole, you're right. The average "time"
to success for at least one individual in a population as a whole
decreases dramatically with increasing numbers of random walkers.
Where would you like your $1000 sent? ; )
Sean Pitman
www.DetectingDesign.com
P.S. The Pi thing:
I've thought a bit more about this and come to the conclusion that even
if you walk along a lattice in 1 of 4 directions, the number of points
that can be hit along the edge of the grid is not 2*Pi*N.
For example, say that the straight-line distance to the edge of the 2-D
grid equals 3 steps - all steps being exactly the same length. How
many points along the edge of the grid would be exactly 3 steps away?
According to you there should be 2 x 3 x 3 = 18. However, if you
actually draw it out, there are only 12 points on the grid that are
just 3 equally distant steps away from the center (i.e., in keeping
with my original formula of X*N). Your formula of 2*Pi*R =
circumference of a circle, but doesn't say how many points along the
edge of the circle a walker can actually touch the edge.
But, this is an aside having nothing to do with the main point.
Merry Christmas, Sean!
I should have you know that, after completing a terrible day of finals,
my 2 best friends and I got shitfaced beyond belief, and I actually
convinced them to come with me to the University football field for an
experiment, dragging along with us 3 very disappointed, very bored and
sober girlfriends. We brought a pair of keys and a baseball bat. We
threw them into the field, and then had our sober girlfriends make us
spin around the baseball bat (you know, where you put your head on the
bat and spin until you fall down) before running aimlessly into the
field to try and find the keys. I threw up as soon as I got up from the
spinning part.
We found the keys 10 minutes later, only to discover that one of my
buddy's wallet was missing. It had fallen out during our "random
search". After 20 minutes of looking, my buddy insisted that we cut the
crap and have our girlfriends help us find the wallet (thus, entering
an intelligent agent into the field, 3, in fact) at which point we were
able to find the wallet. Before I stopped remembering the night, I
distinctly remember being furious that we couldn't find the keys via
random walk.
Thus, keys are findable via random walk. Wallets require intelligent
agents. Thankfully, none of our girlfriends are blonde, so the search
took no time at all.
--
John S. Wilkins, Postdoctoral Research Fellow, Biohumanities Project
University of Queensland - Blog: evolvethought.blogspot.com
Nihil tam absurdum quod non quidam Philosophi dixerit - adapted from Cicero
I just coded up a simulation. Took 10 minutes to write, and a couple
seconds to find the "keys", simulating thousands of walkers on
1000x1000 grid. But I have to admit: your way sounds a lot more fun!
But did you remember to make yourselves incorporeal?
> I just coded up a simulation. Took 10 minutes to write, and a couple
> seconds to find the "keys", simulating thousands of walkers on
> 1000x1000 grid. But I have to admit: your way sounds a lot more fun!
> But did you remember to make yourselves incorporeal?
I've been thinking a bit more about this and I am not so sure you're
right any more. Now, I don't think I was right either, but I'm not
sure you're on the money.
Consider the notion of a patterned walk instead of a random walk -
inside a limited lattice (like a football field). If a walker started
out in the middle and walked in an ever-expanding spiral circle mowing
the field with a lawn mower, how many steps would it take to mow over
the wallet if the wallet was N Euclidian steps away? About 2N^2 -1
steps - right (the total number of squares on the grid - 1)? Now, what
if you had two of these non-random walkers? How many steps would it
take for this "population" of two? 2N^2/2 - Right? I mean, you'd just
be mowing the lawn twice as fast and so you'd mow over the wallet twice
as fast. The same holds for 3 of these non-random walkers = 2N^2/3 . .
. etc.
It seems then that the random walk increases the average number of
steps beyond the total number of squares on the grid because of
retesting previously occupied squares. Therefore, all that additional
*random* walkers do is reduce the amount of resampling quite
dramatically.
If this is true, I'm not quite sure how random -walk- on a bounded 2D
lattice would be any more or less successful than random -sampling- on
a bounded 2D lattice - or even an unbounded 2D lattice. The greater
the number of random walkers, the less the resampling before success
and the more in line with the formula 2N^2/Walkers you get. The same
thing seems true of random sampling. The greater the number of random
samplers, the less resampling is done before success and the more in
line with the formula 2N^2/Samplers you get.
So, perhaps we were both wrong? I mean, your formulas did not work with
small distances and mine didn't work with large distances to the
target. This new formula seems to work for all target distances. Try
it in your simulation and check it out - if you don't mind.
Sean Pitman
www.DetectingDesign.com
True. Those non-random walkers of yours do a sort of counting.
Counting is almast always faster than random methods. The reason
counting is often abandoned, is that it is often impractical or
impossible, especially in search spaces of many dimensions.
>If this is true, I'm not quite sure how random -walk- on a bounded 2D
>lattice would be any more or less successful than random -sampling- on
>a bounded 2D lattice - or even an unbounded 2D lattice. The greater
>the number of random walkers, the less the resampling before success
>and the more in line with the formula 2N^2/Walkers you get. The same
>thing seems true of random sampling. The greater the number of random
>samplers, the less resampling is done before success and the more in
>line with the formula 2N^2/Samplers you get.
Consider a high dimensional search space, like the parameters for a
diesel engine, like piston length, cylinder diameter, compression
rate, etc.
If there are 10 dimensions or more, that space is very different from
that 2 dimensional lattice you have been discussing. F.ex: A random
walker will seldom return to its starting point at all, while in 2
dimensions it will usually return to its starting point, and in 1
dimension it will always return to its starting point infinitely many
times.
Now suppose that different regions in that space is somehow better
than other regions, like diesel engines with piston length shorter
than the cylinder length, in other words engines that are possible to
make, and engines that can actually run.
There is a method which is used to find random samples of this good
region faster than pure random sampling. The problem with random
sampling is that the good region can be very small compared to the
entire space, thus the probability of getting a sample inside the
region is small. So what one does instead is using a random walker
with a slight modification: if the walker walks into a less good
region, it may abort that step.
It has been shown that this method, random samling with markov chains,
can get random good samples faster than simple random sampling.
Here is a java applet that generates a random tiling of coloured
rectangles using a mathematically correct version of this principle:
http://www.math.wisc.edu/~propp/tiling/www/applets/bigsquare.html
Without this method, almost all the random samples would have been
impossible arrangements of the tiles, with tiles overlapping and with
holes.
Kim0
We were within walking distance from our house or we wouldn't have
bothered. Finding the house, an island of very large distance from our
starting point, also required the help of our sober girlfriends. Sadly,
I do not remember the night well enough to recall exactly how much time
it took us to find the island with the help of the intelligent agents.
RobinGoodfellow wrote:
>I just coded up a simulation. Took 10 minutes to write, and a couple
>seconds to find the "keys", simulating thousands of walkers on
>1000x1000 grid. But I have to admit: your way sounds a lot more fun!
>But did you remember to make yourselves incorporeal?
incorporeal? maybe?