On Thursday, March 10, 2016 at 8:20:18 AM UTC-5, Rick C. Hodgin wrote:
> My thinking was that pointer selectors would bypass segment selectors,
> containing within them their own information, which extends beyond the
> 4GB limit for machine addressing, while still being limited to the 4GB
> limit for physical addressing per pointer.
>
> lptr esi,myPointerSelector1
> ; Right now, [esi+x] has a range from [esi+0] to [esi+(2^32)-1]
>
> I can simultaneously do this:
>
> lptr edi,myPointerSelector2
> ; Right now, [edi+x] has a range from [edi+0] to [edi+(2^32)-1]
>
> The two pointer selectors could map into overlapping, or completely
> separate and isolated memory, allowing two windows into any area of
> memory. To move to the next 4GB chunk, load myPointerSelector3, and
> so on.
To explain even further, consider that today's selectors have a particular
format. For my LibSF 386-x40 CPU (which I now call Arxoda), I have revised
that format to something a little cleaner:
https://github.com/RickCHodgin/libsf/blob/master/li386/li386-documentation/images/descriptor.png
Here we see a 40-bit base, with the limit defined by only 28 bits. The limit
is expressed in byte-block sizes, allowing anything from 0 up to the max
range of physically addressable memory.
To create a pointer selector, it would be similar, but it might employ these
fields:
[ExtraMemAddressBits][Limit][Base][+ the standard info]
Those "extra memory address bits" are tacked on to the physical address as
its highest bits: the 32-bit computed pointer offset is concatenated with
the extra bits, allowing the [esi] pointer selector to access a much greater
range of memory than the 32 bits of esi can reach by themselves.
In all other cases, 32-bit computations work as they do today. A pointer
selector can be unloaded by loading any other value into ESI or EDI without
using the lptr instruction:
lptr esi,myPointerSelector
; Now, esi is in pointer-selector mode
And when finished, back to normal mode:
xor esi,esi
; Now, esi is back in standard addressing mode
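
As a rough sketch of the address formation and mode switching described
above, here is a toy Python model. It is not the Arxoda implementation: the
8-bit width for the extra bits (32 + 8 = 40), the PointerRegister class, and
the choice to have lptr reset the offset to 0 are all assumptions made for
illustration, and the Base and Limit fields are left out entirely.

```python
OFFSET_BITS = 32   # width of ESI/EDI
EXTRA_BITS = 8     # assumption: 40-bit physical space leaves 8 extra bits
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class PointerRegister:
    """Toy model of ESI/EDI with an optional pointer selector attached."""

    def __init__(self):
        self.value = 0
        self.extra = None  # None means standard addressing mode

    def lptr(self, extra_bits):
        """Model of `lptr reg,selector`: enter pointer-selector mode."""
        if not 0 <= extra_bits < (1 << EXTRA_BITS):
            raise ValueError("extra address bits out of range")
        self.extra = extra_bits
        self.value = 0  # assumption: the window starts at offset 0

    def load(self, value):
        """Any ordinary write (mov, xor, ...) returns to standard mode."""
        self.value = value & OFFSET_MASK
        self.extra = None

    def address(self, displacement=0):
        """Physical address of [reg+displacement]."""
        offset = (self.value + displacement) & OFFSET_MASK
        if self.extra is None:
            return offset
        # Concatenate: extra bits on top, 32-bit computed offset below.
        return (self.extra << OFFSET_BITS) | offset


esi = PointerRegister()
edi = PointerRegister()
esi.lptr(3)   # lptr esi,myPointerSelector1 (4GB window number 3)
edi.lptr(4)   # lptr edi,myPointerSelector2 (4GB window number 4)
print(hex(esi.address(0x10)))  # address inside window 3
print(hex(edi.address(0x10)))  # address inside window 4
esi.load(0)   # xor esi,esi
print(hex(esi.address(0x10)))  # plain 32-bit address again
```

With 8 extra bits there are 256 such 4GB windows, which together cover the
full 40-bit (1 Terabyte) range.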
-----
I think this 32-bit solution is much better than going to a full 64-bit
machine, though I think the 40-bit movement is the better solution all the
way around, because it meets us at the range of feasible workloads for the
foreseeable future. Each core is capable of addressing up to 1 Terabyte of
RAM on its physical pins. Selectors can be created which allow access into
either local or global memory, with additional bits added to communicate
with a pool of memory shared among multiple cores. Each core then uses its
own memory controller and its own dedicated address space for applications,
with crossover only where it's necessary, but still within a full Terabyte
of crossover address space.
These kinds of pointer selectors would allow a 32-bit computer, or a 40-bit
computer, to compute workloads which are only possible today on 64-bit
computers, and to do so with more typical coding requirements (as most
things don't need 64 bits, they just need more than 32 bits).
I honestly believe the 40-bit CPU is a great stopover. I don't know why
it wasn't targeted by AMD with their AMD64 design (as the AMD40). I also
think the 40-bit single float and the 80-bit double float (both with
implicit-1 mantissas) would've made nice extensions to IEEE-754. The few
more mantissa bits, keeping the exponent ranges the same as their 32-bit
and 64-bit counterparts, yield ~2 extra base-10 significant digits in
40-bit form and ~4 extra base-10 significant digits in 80-bit form, even
exceeding the count yielded by 80-bit IEEE today (which spends more of its
bits on a larger exponent).
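
The digit counts above can be checked with a little arithmetic. This is only
a sketch: the field widths for the proposed 40-bit and 80-bit formats are my
reading of "same exponent range, more mantissa bits" (1 sign bit, 8 or 11
exponent bits, the rest mantissa), and the helper names are made up for the
example.

```python
import math


def significand_bits(total_bits, exponent_bits, implicit_one=True):
    """Precision in bits for a sign + exponent + stored-mantissa layout."""
    stored = total_bits - 1 - exponent_bits
    return stored + (1 if implicit_one else 0)


def decimal_digits(precision_bits):
    """Approximate base-10 significant digits for a binary precision."""
    return precision_bits * math.log10(2)


formats = {
    "IEEE 32-bit single": significand_bits(32, 8),
    "proposed 40-bit single": significand_bits(40, 8),
    "IEEE 64-bit double": significand_bits(64, 11),
    "proposed 80-bit double": significand_bits(80, 11),
    # The x87 80-bit format stores its leading 1 explicitly.
    "x87 80-bit extended": significand_bits(80, 15, implicit_one=False),
}

for name, bits in formats.items():
    print(f"{name:24s} {bits:3d} bits  ~{decimal_digits(bits):.1f} digits")
```

Under those assumptions the 40-bit form gains 8 mantissa bits (about 2.4
decimal digits) and the 80-bit form gains 16 (about 4.8 decimal digits),
ending up with 69 bits of precision against 64 for the x87 extended format.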
We'll see, though. It is my present goal regardless, and I think having the
four ISAs will be desirable as well:
(1) 80386 compatibility mode (flags to extend out to 40-bits, w/WEX)
(2) ARM compatibility mode (flags to extend out to 40-bits, w/WEX)
(3) Arxoda's new ISA (a many-predicate 80386/ARM hybrid, w/WEX)
(4) A programmable ISA which maps any of (1), (2), and (3) to
    custom opcodes per task, including aggregated macros which
    run many opcodes in succession with a single command.
WEX:
https://github.com/RickCHodgin/libsf/blob/master/li386/li386-documentation/images/wex_register_mapping.png
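
To illustrate the "aggregated macro" idea in (4), here is a minimal sketch
of a per-task table that expands one custom opcode into several ordinary
ones. Everything in it is hypothetical: the opcode names, the table, and the
expand() helper are invented for the example and say nothing about how the
programmable ISA would really be encoded.

```python
def expand(program, macro_table):
    """Replace each custom opcode with the sequence it stands for."""
    expanded = []
    for op in program:
        # Opcodes not in the table pass through unchanged.
        expanded.extend(macro_table.get(op, [op]))
    return expanded


# Hypothetical per-task mapping: one custom opcode, three base opcodes.
task_macros = {
    "CUSTOM_0": ["load", "add", "store"],
}

print(expand(["CUSTOM_0", "nop", "CUSTOM_0"], task_macros))
```

A different task would load a different table, so the same custom opcode
could stand for a different sequence there.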
-----
The planned progression for Arxoda grows in developmental stages to Oppie-8,
which is targeted as the final form involving four Oppie-6 cores:
https://github.com/RickCHodgin/libsf/blob/master/li386/oppie/oppie-6.png
https://github.com/RickCHodgin/libsf/blob/master/li386/oppie/oppie-8.png
Oppie-8 allows for 4 "Love Threading" cores. Love Threading is a concept
where a core can request help from its companion core, and the companion
core will leave its own workload temporarily to help the other core, then
come back to its own workload afterward. It is called love threading
because this is what love does: It sacrifices of itself to help another.
Clever naming, yes? :-)
I have lots of plans. And lots of things railing against me in moving
forward. Nonetheless, I proceed forward undaunted. And, if God allows
it to be completed, it will be completed. And, of course, it would go
faster if any of you would like to help me. Your help would be much
appreciated, and would be a great service to God, and to mankind, to
use your skills to offer unto Him full fruits of your labors for these
purposes (to give people an alternative to money-based advances in
technology, and instead to look to Him and say, "How should this be
designed? What should it include were money not a factor?" and then
to look to Him (look to truth) for the answer).
Your help would be welcomed.