Hi Richard,
Due to the given (proprietary) VM architecture, many opcodes come in three styles, with suffix B, W or WP.
This is independent of the architecture's bitness (32 or 64).
B, W or WP denote the capacity of the argument (byte, word or, I guess, word pointer, i.e. the address of an object); this ensures that the value of the argument fits within the instruction.
This is a basic design pattern for performance, well known from the *near* and *far* pointer architectures, to give an example.
In Smalltalk, this pattern is traditionally applied to offsets (to instructions in jumps, or to slots, either literal ones, static in the method's bytecodes, or dynamic ones, relative to the stack frame) and to direct arguments such as characters, numbers and object references.
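To illustrate the B/W/WP idea, here is a minimal sketch in Python. The opcode numbers, the name `encode_push` and the operand widths (1, 2 and 4 bytes) are my assumptions for illustration only; they are not the real VM's encoding.

```python
import struct

# Hypothetical opcode numbers for one instruction in its three styles.
PUSH_B, PUSH_W, PUSH_WP = 0x10, 0x11, 0x12

# Assumed operand layouts: 1 byte, 2 bytes, 4 bytes (big-endian).
OPERAND_FORMAT = {PUSH_B: ">B", PUSH_W: ">H", PUSH_WP: ">I"}

def encode_push(value):
    """Pick the smallest style whose operand can hold `value` and encode it."""
    if value <= 0xFF:
        opcode = PUSH_B
    elif value <= 0xFFFF:
        opcode = PUSH_W
    else:
        opcode = PUSH_WP
    # struct.pack raises struct.error if the value does not fit at all.
    return bytes([opcode]) + struct.pack(OPERAND_FORMAT[opcode], value)
```

A small argument then costs two bytes in total, while only the rare large argument pays for the wide form.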
A breakpoint opcode (temporarily) replaces the original opcode so that the breakpoint can become active.
To achieve this replacement, three different versions of the break opcode, B, W and WP, have to be reserved, though they always do the same thing, a break.
In this situation, the effective argument of the breakpoint itself (the breakpoint number) cannot recursively follow this style:
you cannot seriously demand that a breakpoint named 10000 cannot replace an opcode with a byte argument (i.e. the argument capacity of the breakpoint opcode must be able to replace any target).
And remember, in the current design, break point numbers are created automatically.
So a breakpoint opcode must be generic in this sense: the breakpoint number is created on demand, and the breakpoint opcode must accept any number, independent of which target opcode is to be replaced.
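The constraint can be made concrete with a small sketch. It assumes, purely for illustration, that the breakpoint number is stored in the operand of a same-size break instruction; the opcode values, the widths and the name `set_breakpoint` are hypothetical and say nothing about how the real VM stores breakpoint numbers.

```python
# Assumed operand widths in bytes for the three styles.
OPERAND_BYTES = {"B": 1, "W": 2, "WP": 4}
# Hypothetical target opcodes and their styles.
TARGET_STYLE = {0x10: "B", 0x11: "W", 0x12: "WP"}
# Hypothetical break opcodes, one per style, all doing the same thing.
BREAK_OPCODE = {"B": 0xF0, "W": 0xF1, "WP": 0xF2}

def set_breakpoint(code, offset, number):
    """Overwrite the instruction at `offset` with a same-size break instruction.

    `code` must be a bytearray. Returns the original bytes so that the
    breakpoint can be removed again later.
    """
    style = TARGET_STYLE[code[offset]]
    width = OPERAND_BYTES[style]
    if number >= 1 << (8 * width):
        raise ValueError(f"breakpoint {number} does not fit a {style} operand")
    end = offset + 1 + width
    saved = bytes(code[offset:end])
    code[offset:end] = bytes([BREAK_OPCODE[style]]) + number.to_bytes(width, "big")
    return saved
```

With this naive scheme, breakpoint 10000 can replace a W instruction but is rejected for a B instruction, which is exactly the demand that cannot seriously be made.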
I also knew another architecture that, as a precaution, reserved NOPs at all potential breakpoint locations. In that environment a single format of breakpoint opcode replaces a NOP to become active, so the breakpoint always has to be the size of a NOP.
However, this approach follows the traditional compilation workflow (providing code with or without debug information).
Not to mention that such code gets larger (an additional NOP per statement) and slower (even a NOP costs execution time) just to be prepared for debugging.
There were good reasons not to follow this approach in an interactive environment.
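A minimal sketch of the NOP reservation scheme, to show where the extra size comes from. The one-byte `NOP` and `BREAK` opcode values and both function names are made up for this example.

```python
NOP, BREAK = 0x00, 0xFF  # hypothetical one-byte opcodes

def compile_with_nops(statements):
    """Emit one NOP in front of every statement.

    Returns the code as a bytearray plus the offsets of the reserved NOPs.
    """
    code, slots = bytearray(), []
    for statement in statements:
        slots.append(len(code))
        code.append(NOP)
        code += statement
    return code, slots

def arm(code, slot):
    """Activate a breakpoint by overwriting a reserved NOP."""
    if code[slot] != NOP:
        raise ValueError("no reserved NOP at this offset")
    code[slot] = BREAK
```

Two statements of 2 and 3 bytes grow from 5 to 7 bytes here, whether or not anybody ever sets a breakpoint.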
Summary:
A (fictitious) breakpoint numbered 266372627 must be able to replace the tiniest possible target opcode (which was deliberately designed to be that small for performance), in particular in the discussed scenario, where breakpoints are generated automatically at all possible locations in order to measure coverage.
So this historically required the choice of a limit, which obviously could not be extreme in either direction (e.g. a maximum of 255 breakpoints with a byte argument or, as in my extreme example, a WP-sized number). This resulted in the existing compromise.
Even an indirection (a reference to a place actually holding the identity/number/name of a breakpoint) would not solve the problem: one would then have to care about the space for the indirect numbers/names, and as the number of references can grow large, it would end up in the same dilemma.
Instead of adapting opcodes, another idea could be followed: what if a bit is spent for every instruction supported by the VM, not necessarily in the passive generated code but only at execution time, after being loaded and before being run?
Any execution will turn on this bit. A way to examine it has to be designed and provided anew.
If somebody considers this to be science fiction, think of the existing read-only bit of all objects, also nonstandard but present and very useful. It might be adopted for this purpose.
The result would be a new feature as an add-on, which would not endanger existing material, replacing the existing development VM as a choice, when needed.
It could potentially be arranged in the existing ENVY/Stats, EsbSampler.
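The per-instruction bit idea could be sketched roughly as follows. Everything here is hypothetical: the opcode table, the class `LoadedMethod` and the toy dispatch loop only demonstrate the bookkeeping, and for simplicity one byte per offset is used where a real design would pack bits.

```python
# Hypothetical opcodes mapped to their operand size in bytes.
OPERAND_BYTES = {0x10: 1, 0x11: 2}

class LoadedMethod:
    """Bytecodes plus 'executed' flags that exist only after loading."""

    def __init__(self, code):
        self.code = bytes(code)                 # passive code stays untouched
        self.executed = bytearray(len(code))    # one flag per byte offset

def run(method, start=0):
    """Toy dispatch loop: does no real work, only marks what it visits."""
    pc = start
    while pc < len(method.code):
        method.executed[pc] = 1
        pc += 1 + OPERAND_BYTES[method.code[pc]]

def coverage(method):
    """Return (executed, total) instruction counts for the method."""
    executed = total = 0
    pc = 0
    while pc < len(method.code):
        total += 1
        executed += method.executed[pc]
        pc += 1 + OPERAND_BYTES[method.code[pc]]
    return executed, total
```

No opcode is replaced and no breakpoint number is needed; coverage is simply read back from the flags after the run.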
-
M
PS to all insiders: as the VM is proprietary and I am an outsider, I hope I did not infringe on internals, and please excuse my inaccuracies.