Optimal block size

157 views
Skip to first unread message

nichol...@gmail.com

unread,
Jun 4, 2022, 2:45:00 PM6/4/22
to Busy Beaver Discuss
Recently Shawn and I were discussing finding the right block size a given program, and that got me to thinking about block size in general. Here are a few questions. The last of these is a RESEARCH PROJECT that I pose to inaugurate Terry's retirement.

1. Given a program P and a block size B, is it possible to say definitively whether or not B is the "optimal" block size for P? In other words, is it always clear that B is optimal for P if indeed it really is, and always clear that B isn't optimal if indeed it really isn't? (If there are multiple "optimal" sizes, we'll take the least such to be "the" optimal one.) My feeling is that the answer is yes, and the function `optimal : Program -> Nat -> Bool` is computable.

2. Given a program P, is there a computable method for determining its optimal block size? That is, can we always find a number B such that B is optimal for P? My feeling is that this is not possible, because if it were it could be used to solve the halting problem. It feels awfully close to dynamic analysis on arbitrary programs, and that generally can't be done. That's not a proof, just a hunch.

3. Among programs with N states and K colors, what is the greatest optimal block size?

Now, if the answers to questions 1 and 2 really are yes and no, then question 3 should give rise to a Busy Beaver-like sequence. Call it the Blocking Beaver problem, or BLOB. What can be said about BLOB? What are its early values? How do its values for N-state/2-color compare to its values for 2-color/N-state?

One way to understand blocks is that they are a program's way of implementing a more expressive color palette. A program can manage cells in blocks of size B in order to bootstrap its way from K colors to K^B colors, and then implement its logic in that higher order language. From this perspective, BLOB asks: what is the most expressive higher order language that can be implemented in some program class?

Tristan Stérin

unread,
Jun 4, 2022, 5:33:59 PM6/4/22
to Busy Beaver Discuss
Would be another to phrase the question the following?

Given a n-state k-symbol Turing machine M, what is the m-state p-symbol machine P with lowest m such that:
1. m < n (meaning that P is good abstraction of M)
2. p > k (bigger block size)
3. p < k*n (we don't want to cheat with an alphabet of size k*n and m = 1)
4. P "simulates" M with a tight definition of simulation (for example, Definition 8 in https://arxiv.org/pdf/2107.12475.pdf)

In our work https://arxiv.org/pdf/2107.12475.pdf, we did something similar in the other direction where we found a conceptually simple 5-state 4-symbol machines that we simulated with a 15-state 2-symbol machines. Now, the issue that I have is that in that constructive approach already, we used encoding tricks where sometimes the 2-symbol machine uses a block size of 2 but sometimes switches to a block size of 1 in order to reduce the number of states it is using (without these tricks the 2-symbol machine would have had ~20 states).

Hence, when looking at computation in the wild I would be 100% willing to bet that a lot of machines use clever tricks where the idea of "optimal" block size is not well defined, switching among various block sizes. You might argue that its enough to take the biggest block size that they use, but I would counter-argue that if you do this you are left with a behavior that is hard to analyze since you don't have the correct "goggles" to see what the machine is actually doing.

The above is a bit speculative but I believe that already some simple 5-state machines illustrate my point. For instance if you look at details iteration of the Bilateral Bouncer https://bbchallenge.org/5708316 with "Explore Mode" (attached screenshot), the "correct" block size to consider when the machine is going to the right is 2 while it is 1 when going to the left. Again that example is quite simple and perhaps remediable with your idea of "optimal" block size, but I could try at finding harder examples.

In a nutshell, I think would need to see a formal definition of your optimality criterion.
Screenshot 2022-06-04 at 22.31.14.png

Tristan Stérin

unread,
Jun 4, 2022, 5:52:12 PM6/4/22
to Busy Beaver Discuss
My 5-state example was not that good as block size 2 seems to give a satisfying analysis in all cases, but here are some other machines where the "optimal" block size is not obvious to me (more than in my first example):

Turn on "Explore Mode":

Also, the side by which you enter a block matters,

(And I stop spamming this thread now!)

Shawn Ligocki

unread,
Jun 4, 2022, 11:57:39 PM6/4/22
to Busy Beaver Discuss
This is an interesting question, Nick. I'm not sure it is possible to precisely define for all machines, but there is an obvious block size for some machines, so maybe we can at least come up with a partial definition!

Just to make sure we're all on the same page: When I talk about "block size" I'm talking about a fixed size block of symbols which is treated a single symbol by the "Macro Machine" simulating it (See Heiner's explanation of Multi-Symbol Macro Machine). So, Tristan, it might not be as general as your Definition 8 ... but I'm not quite sure I understand the implications. In practical terms, our "macro simulator" takes a --block-size argument and we have some heuristic code to guess the best block size. This guess is pretty good, but sometimes it doesn't find the block size that works best with the simulator.

Now there are sort of 2-3 different ways good block sizes help: (1) effective tape compression, (2) efficient chain step simulation and (3) enabling automated proofs to work.

Our automated block-finder system attempts to optimize the first 2 of these using heuristics:
  1. For tape compress: We run the machine some number of steps (maybe 1k), find the step at which the tape was least compressed with block-size 1, and find the block size that would compress the tape the most. As long as we run for enough steps, this seems to work well.
  2. For chain step optimization: We try running using different multiples of the block size identified in phase (1) to see which multiple performs the most chain steps. This also tends to work pretty well, although you sometimes need to run it for a while to notice things.
We don't do anything to optimize for (3). And (3) is actually one of the least objective, it's sort of deeply related to the details of our automated proof system ... which is ... a bit idiosyncratic in some ways. I won't go into the details here (and I don't even really understand the details completely). But suffice to say, that even once you've optimized (1) and (2); sometimes it turns out that you do not get successful proofs until you've run the machine in some specific multiple of the size found above.

So when we talk about defining precisely what it means to find the "optimal" block-size, I think you have some hopes with (1) and (2), but I think (3) will be quite a bit more wishy-washy and perhaps implementation specific ... :/

Another thing to keep in mind is that there are some machines where no single block size is "correct". For example, consider this analysis I made on 22 March: https://groups.google.com/g/busy-beaver-discuss/c/KofE0K7_AbQ/m/I_1GH3tPBAAJ which considers the general configuration:
  • F(a, b, c) = 0^inf 110^a 1^b 1010^c A> 0^inf
What is the optimal block size for this machine? It's sort of 4 and 3 in different parts of the tape. You could use 12 to capture both ... but it's really inefficient at that point. Instead, my code picks 4 and fails to effectively compress the left side. But the left only changes a small enough number of times that it works ... ok.

Pavel: I noticed you mentioned something about dynamic tape compression before. Have you found a way to apply different block sizes to different sections of the tape? I have thought about this, but it is non-trivial, opening up a whole new can of worms of when to compress at different block sizes and when to re-compress, etc.

And then (of course) there are things that don't compress well with any block size (like counters or chaotic machines).

So this is definitely a challenging question. But I look forward to seeing some public discussion on it rather than just having all us implementors making our own heuristics quietly :)

-Shawn

--
You received this message because you are subscribed to the Google Groups "Busy Beaver Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to busy-beaver-dis...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/busy-beaver-discuss/fba7e936-7906-4d71-a1a8-8a86bf07766dn%40googlegroups.com.

Shawn Ligocki

unread,
Jun 5, 2022, 12:15:05 AM6/5/22
to Busy Beaver Discuss
3. Among programs with N states and K colors, what is the greatest optimal block size?

This is definitely a twiddly question! In their paper, Marxen and Buntrock found machines with optimal block size as high as 80. In fact, that TM enters Lin Recurrence (Translated Cycling) with offset 80. So what we're detecting as an "optimal" block size for simulation is that at some point it performs a "chain step" over the (0...0)^inf block thus proving that the TM is infinite. But is this what we really mean when we talk about optimal "block size"? If so, the largest optimal block size might be the same as the largest LR offset.

Tristan Stérin

unread,
Jun 5, 2022, 10:27:17 AM6/5/22
to Busy Beaver Discuss
Another awkward thing is that, even when an optimal block size is defined, let say 2, machines can employ parity tricks and shift whether you should consider your blocks to start on even or on odd positions at different points in time, making your blocks overlap. An example is https://bbchallenge.org/9892415 (toggle "Explore Mode") and an annotation of my point is attached to this email:
  • Blue ovals are the "optimal" size-2 blocks to consider when the machine is going right and they start on even tape positions
  • Red ovals are the "optimal" size-2 blocks to consider when the machine is going left but they are just shifted by 1 and not by 2 compared to blue ovals, making them overlap and changing the relevant block partition of the tape (temporarily).
T
Screenshot 2022-06-05 at 15.24.09.png

Tristan Stérin

unread,
Jun 5, 2022, 10:42:34 AM6/5/22
to Busy Beaver Discuss
I think there is room to define a new TM model that manipulates fixed-size blocks but that also has sub-block granularity access to handle situations like the annotated machine in my last email (https://bbchallenge.org/9892415). Concretely this would amount to defining more fancy Left/Right/Stay in place moving direction.

I believe that it is only within that model that you can, with satisfaction, reduce the behavior of the machine of my last email to its simplest essence: "Go right while you read x, extend the tape, then go left while you read y, extend the tape, and restart".

uni...@fu-solution.com

unread,
Jun 6, 2022, 12:56:45 AM6/6/22
to busy-beav...@googlegroups.com
On 2022-06-05 05:57, Shawn Ligocki wrote:
> Pavel: I noticed you mentioned something about dynamic tape
> compression before. Have you found a way to apply different block
> sizes to different sections of the tape? I have thought about this,
> but it is non-trivial, opening up a whole new can of worms of when to
> compress at different block sizes and when to re-compress, etc.

compression is easy - every time you add symbol to the "stack" you
just... look at examples:
* `10{10}^n...` -> `{10}^(n+1)...`
* `1010...` -> `{10}^2...`
* repeat while you can compress something

the hard part is that macro machines handle trivial tape
transitions/proofs for you and then you can get far with a simple
proving system on top of them.
once you can't use macro machines, you need nested proofs (rules) from
the beginning.

nichol...@gmail.com

unread,
Oct 24, 2022, 9:59:53 AM10/24/22
to Busy Beaver Discuss
A while ago Boyd Johnson found a surprisingly complex 4-state 2-color program:

  1RB_0RC__1LB_1LD__0RA_0LD__1LA_1RC

This one enters into Lin recurrence after 158,491 steps with a period of 17,620 steps. Its behavior is impossible to tell just from looking at it.

Well, I did some searching and apparently its optimal block size is 708. With that number, a simple prover can show that it will loop forever. As far as I can tell, this is the least number that "makes sense" out of Boyd's program.

Can anyone verify this? Or come up with a better (i.e. lower) block size?

uni...@fu-solution.com

unread,
Oct 24, 2022, 10:19:53 AM10/24/22
to busy-beav...@googlegroups.com
hi, join us on bbchallenge discord - this machine is easy to prove with
"closed position set"
https://discord.com/channels/960643023006490684/1028747034066427904
> --
> You received this message because you are subscribed to the Google
> Groups "Busy Beaver Discuss" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to busy-beaver-dis...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/busy-beaver-discuss/b215a8dc-f1c4-4a21-b64a-cf8ff28a1694n%40googlegroups.com
> [1].
>
>
> Links:
> ------
> [1]
> https://groups.google.com/d/msgid/busy-beaver-discuss/b215a8dc-f1c4-4a21-b64a-cf8ff28a1694n%40googlegroups.com?utm_medium=email&utm_source=footer

Shawn Ligocki

unread,
Oct 24, 2022, 11:30:25 AM10/24/22
to busy-beav...@googlegroups.com
Hi Nick, FWIW, this TM has LR offset 118 meaning that each cycle it shifts 118 positions to the right. Running my accelerated simulator with block size 118 does allow proving this using macro machines. 708 = 118 * 6, so it makes sense that would also work. In general, if you run a Lin Recurrent TM with the offset as block size it tends to be able to prove it because each cycle tends to become a single step in the macro machine. In fact, this is true of Heiner Marxen's TM #3 from his paper which required block size 80 (it is Lin Recurrent with offset 80).

In this specific case you can see below the output from my accelerated simulator (Quick_Sim.py) showing that this TM is proven infinite via INF_CHAIN_STEP (aka Spinning Out) when considered as a 118-block macro machine.

Pavel, this machine has no halt transitions, so many filters/deciders will prove it infinite :) In this case, I assume Nick is interested in this TM because of its behavior outside of simply proving halting / non-halting. That said, if you are interested Nick, you should join us on Discord, it is a very very active community of Busy Beaver enthusiasts right now!

$ python Code/Lin_Recur_Detect.py <(echo 1RB0RC_1LB1LD_0RA0LD_1LA1RC)

parameters {

  find_min_start_step: true

}

result {

  success: true

  start_step: 158491

  period: 17620

  offset: 118

  elapsed_time_us: 4248145

}


halt_status {

  is_decided: true

  inf_reason: INF_LIN_RECUR

}


$ python Code/Quick_Sim.py <(echo 1RB0RC_1LB1LD_0RA0LD_1LA1RC) -n118

...

         Steps:                     Times Applied:

Total:   10^5.2                                 20

Macro:   10^5.2                                 19

Chain:                                           0

Rule:                                            0

Rules proven: 0

Failed proofs: 0

Prover num past configs: 20

Tape copies: 0

Elapsed time: 0.14205193519592285

0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000^inf 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001011101101011010110101101011^1 0101101011011011011010100101010110101101101101101101101010110110100101101101011011011011011011010010110110101101101010^1 1101010101101010101101010010100101010110101101011011011011011011010010101011010110110101001011011011011011011011011010^1 1101101101101011011010010110101101101010101101010010110110110110110110100101010110101101101101101101101101101011011011^1 0101001010101101101101011010110101101101010110110100101001011011010110110110110101001011011010100101101011011011011010^1 1101010010110110101010110101101101011011011010010101011010101011010110110110100101010110101010110101101101101010110101^1 1011011011011010110110101101101101101101101101101011010100101001010101101011011010110110110101011010100101010110110110^1 (1101011011010010110110110101011010100101101101011010101011010110110101101011011010110110110100101001010101101101101000) A> 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000^inf

Num Nonzeros: 10^2.7  500


Turing Machine proven Infinite

Reason: INF_CHAIN_STEP

Quasihalt:

is_decided: true


--
You received this message because you are subscribed to the Google Groups "Busy Beaver Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to busy-beaver-dis...@googlegroups.com.

nichol...@gmail.com

unread,
Oct 24, 2022, 1:50:42 PM10/24/22
to Busy Beaver Discuss
Yes, 118 works for me. 708 runs in fewer simulator loops, and I had the limit set too low, so 118 didn't show up in my search.

How about for the really long LR that you (Shawn) found?

  1RB_0LC__1RD_1LC__0LA_1LB__1LC_0RD

Enters into a recurrence period of 7,129,704 steps after 309,086,174 steps. Your LR detector shows an offset of 512.

Maybe it comes down to some implementation detail, but I can only get it to run by taking the base program, then 5-cell macro on top of that, then 2-cell macro on top of that. That gives a total block size of (2^5)^2 = 1,024.

Regarding Discord, does anyone have a fresh invite? Pavel's link didn't work for me and neither does the one on the BB challenge site.

Tristan Stérin

unread,
Oct 24, 2022, 1:53:03 PM10/24/22
to busy-beav...@googlegroups.com
Would this invite work? https://discord.gg/GaXqWcmk

--
You received this message because you are subscribed to a topic in the Google Groups "Busy Beaver Discuss" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/busy-beaver-discuss/al-Ima3rfX8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to busy-beaver-dis...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/busy-beaver-discuss/b1ed595c-bd48-44e8-8618-21bb492fceb8n%40googlegroups.com.

Tristan Stérin

unread,
Oct 24, 2022, 2:01:05 PM10/24/22
to Busy Beaver Discuss
I updated discord's link on the webpage of bbchallenge to this new invite link that should not expire: https://discord.gg/3uqtPJA9Uv
Sorry about that!

Shawn Ligocki

unread,
Oct 24, 2022, 2:10:19 PM10/24/22
to busy-beav...@googlegroups.com
Maybe it comes down to some implementation detail, but I can only get it to run by taking the base program, then 5-cell macro on top of that, then 2-cell macro on top of that. That gives a total block size of (2^5)^2 = 1,024.

Hm, I'm not sure I understand what you're talking about here. I was talking about using a macro machine with block size 512 (in this case). By block size I mean the length of each block. It looks like you are talking about using a macro machine with block size 5 and then layering another macro machine with block size 2 on top of that (which is basically equivalent to a single macro machine with block size 10, just potentially with some speedup). You say that leads to "block size" of (2^5)^2, but that is not what I am calling block size, that appears to be the total number of macro symbols possible. I don't think that quantity has any relation to the offset, but maybe you can elucidate if it does :)

--
You received this message because you are subscribed to the Google Groups "Busy Beaver Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to busy-beaver-dis...@googlegroups.com.

nichol...@gmail.com

unread,
Oct 24, 2022, 3:28:59 PM10/24/22
to Busy Beaver Discuss
You're right, there's some kind of type error in what I'm trying to say. On the other hand, the numbers work out to be suspiciously close, so it seems pretty close to something correct.

Here are the facts: I can run that long LR program with a 5-block macro stacked with a 2-block macro (that is: base -> 5-block -> 2-block). The [5, 2] sequence works for me, but not similar sequences; not [2, 5], not [10], not [32], not [10, 2], not [5, 4]. [5, 2, 2] does work -- base -> 5-cell -> 2-cell -> 2-cell.

Clearly powers of 2 are swirling around. What's going on here?
Reply all
Reply to author
Forward
0 new messages