On Thursday, December 29, 2022 at 8:06:37 PM UTC-6, Tim Rentsch wrote:
> luserdroog <mij...@yahoo.com> writes:
>
> > On Sunday, December 25, 2022 at 8:58:33 PM UTC-6, Tim Rentsch wrote:
> [...]
> >> I'm still hoping to see a summary description (concise but complete)
> >> of all the different parts of the environment, [...]
> >
> > [.. long and detailed description ..]
>
> A blizzard of information... more than is needed in some areas,
> and (I suspect) less than is needed in others.
>
> Part of the reason for asking for a _concise_ but complete
> summary description is to get you to organize the information so
> it can be so presented. Going through the effort of organizing
> the information in this way should go a long way towards helping
> you implement the state-saving functionality.
>
> For example, any state that is held in integer data types can all
> be lumped together, because saving integers is well-understood
> and pretty easy.
>
> Conversely, function pointers need special care, because they
> cannot just be stored directly.
>
> The long description given mentions "objects" but as far as I can
> tell what an "object" (Xpost_Object?) is is never defined.
>
> Are references to objects done with pointers or by means of an
> object table? If there were an object table that would greatly
> simplify (probably) the state-saving operations.
>
> You mention "operations" but don't say what an operation is.
>
> Also, there is some amount of state for the garbage collector.
> Probably that state does not need to be (directly) saved for
> a state-saving operation. What information is important to save,
> and what information is incidental and can be ignored?
>
> Do these comments help you see what I'm getting at?

Yes. Let me try again, having cleaned and straightened up my bifocals.

From the top level, and abstracting away all the fiddlybits, the whole
interpreter is just a collection of execution contexts. Since it's
just a simple round robin scheduling algorithm, it doesn't even really
need to remember the current context to resume execution.

    Interpreter
        collection of Contexts

Next, an execution context has a global memory and a local memory,
where there is a rule that global memory ought not to contain any
references to things in local memory. So, the global memory can be
considered self-contained with the local memory forming a shell around
it.

    Context
        Global Memory
        Local Memory
        a collection of integers (flags, offsets into local memory)
        a collection of Objects (current object, window device,
            window device event handler)

Skipping ahead to the Objects themselves, these are designed to be 64
bits long. There are Simple Objects which contain their value entirely
within the 64 bit representation. And there are Composite Objects,
such as arrays, strings, and dictionaries, which have their values in
either Global or Local Memory. There are File objects which have a
pointer to a C structure in memory, indexed by an entity number.
There are also Name objects which have associated strings in one of
the memories. Operator objects contain an integer code which indexes
the operator table which is in Global memory (although this is not a
requirement, perhaps it would make more sense to have the operator
table exist outside the memory arena). There is a Glob object which
is not directly accessible to the user but exists during the execution
of the `filenameforall` looping operator. There is also a Magic
object which is intended to exist only in the value part of a
key/value pair in a dictionary, in order to implement the Magic
Dictionaries from Sun's NeWS (where something like `canvas /mapped
true put` would instantly make the window visible).

    Object = Integer    int-val
           | Boolean    bool-val
           | Composite  Global? Entity Size Offset
           | File       Global? Entity
           | Name       Global? name-index
           | Operator   opcode
           | Glob       pointer-to-POSIX-glob_t
           | Magic      pointer-to-struct{(*get)();(*set)();}

A composite object always has a bit specifying whether it's in Global
or Local memory, then an Entity number which is an index into the
Memory Table for that memory, Size (in bytes for a String, objects for
an Array, key/value pairs for a Dictionary), and an Offset which will
be added to the address looked up from the Memory Table.

[Aside: I think both kinds of pointer need to be removed from the
Object representation and replaced with indexes into global tables.
With 64-bit pointers, these violate the "64-bit design" and force the
objects to be larger than intended.]

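As a rough sketch of what the Aside is asking for, here is one way the variants could fit in 64 bits once Glob and Magic carry table indexes instead of pointers. The field widths, the tag values, and every name below (`Object`, `TAG_*`, `FLAG_GLOBAL`, `make_*`) are assumptions for illustration, not the interpreter's actual layout.

```c
#include <stdint.h>

/* Hypothetical tag values; the real interpreter's tags may differ. */
enum { TAG_INTEGER, TAG_BOOLEAN, TAG_COMPOSITE, TAG_FILE,
       TAG_NAME, TAG_OPERATOR, TAG_GLOB, TAG_MAGIC };

#define TAG_MASK    0x00FFu   /* low byte of the tag word: object type */
#define FLAG_GLOBAL 0x0100u   /* set when the value lives in Global Memory */

/* Every variant starts with a 16-bit tag word and is 8 bytes in total. */
typedef union {
    uint16_t tag;
    struct { uint16_t tag, pad;  int32_t  val;   } int_;   /* Integer, Boolean */
    struct { uint16_t tag, size, ent, off;       } comp_;  /* Composite, File */
    struct { uint16_t tag, pad;  uint32_t index; } idx_;   /* Name, Operator, Glob, Magic */
} Object;

_Static_assert(sizeof(Object) == 8, "Object must stay 64 bits");

Object make_integer(int32_t v)
{
    Object o = { .int_ = { TAG_INTEGER, 0, v } };
    return o;
}

Object make_composite(int global, uint16_t ent, uint16_t size, uint16_t off)
{
    uint16_t tag = (uint16_t)(TAG_COMPOSITE | (global ? FLAG_GLOBAL : 0u));
    Object o = { .comp_ = { tag, size, ent, off } };
    return o;
}

/* For Glob and Magic the index would select a slot in a table kept
   outside the object, so no raw pointer is ever stored in the image. */
Object make_indexed(unsigned tag, uint32_t index)
{
    Object o = { .idx_ = { (uint16_t)tag, 0, index } };
    return o;
}

unsigned obj_type(Object o)      { return o.tag & TAG_MASK; }
int      obj_is_global(Object o) { return (o.tag & FLAG_GLOBAL) != 0; }
```

With a layout like this the static assertion fails at compile time if a pointer-sized field ever creeps back in.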
Each Memory has an associated Memory Table, indexed by the Entity
number, and a flat area of raw memory with size in use and total size
available.

    Memory
        Memory arena (big block of raw data, size in use, size available)
        Memory Table

The Memory Table has a size in use and total size available, and an
array of allocation records, each containing an address (an offset
into the arena data), size in use, size available, a GC mark, and a
tag.

    Memory Table
        size in use, size available
        Array of (address (==offset), size used, size available, mark, tag)

For better or worse, all the other features of the PostScript Virtual
Memory are grafted on top of this basic structure of Objects which
index the Memory Table to get the offset into the raw data arena.
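The lookup path from an Object to its bytes can be sketched as below. The struct and field names (`MemEntry`, `Memory`, `mem_resolve`) are made up for the example; only the shape (entity number, then table record, then offset into the arena) comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/* One allocation record; field names are assumptions. */
typedef struct {
    uint32_t adr;   /* offset of the allocation within the arena */
    uint32_t used;  /* bytes in use */
    uint32_t max;   /* bytes available */
    uint32_t mark;  /* GC mark */
    uint32_t tag;
} MemEntry;

typedef struct {
    unsigned char *base;      /* start of the raw arena */
    size_t used, max;         /* arena bytes in use / available */
    MemEntry *tab;            /* the Memory Table */
    size_t tab_used, tab_max; /* table slots in use / available */
} Memory;

/* Return a pointer to `len` bytes at `offset` inside entity `ent`,
   or NULL if any part of the request falls outside the allocation. */
void *mem_resolve(const Memory *m, size_t ent, size_t offset, size_t len)
{
    if (ent >= m->tab_used)
        return NULL;
    const MemEntry *e = &m->tab[ent];
    if (offset > e->used || len > e->used - offset)
        return NULL;
    if ((size_t)e->adr + e->used > m->used)
        return NULL;
    return m->base + e->adr + offset;
}
```

Because only offsets are stored, nothing in the table or in the objects depends on where the arena happens to be mapped, which is what makes reloading the raw data plausible in the first place.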

sidebar: How some other features are grafted on:

    The first few slots of the Memory Table hold Special Entities:
        [0]: Free List (32bit word at address is the index of next free slot)
        [1]: Save Stack (address locates head of stack of stacks of Save Records)
        [2]: Context List (array of ids of all contexts sharing this memory)
        [3]: Name Stack (address locates stack of string objects)
        [4]: Name Tree (address locates head of Ternary Search Tree)
        [5]: Bogus Name (special internal string returned by a failed name lookup)
        ( [6]: Operator Table if this is a Global Memory )
        [...]: Live Allocations of Entities

The Operator Table is organized as an array of records:

    Operator Table
        Array of (name stack index of operator's name,
                  number of operator Signatures,
                  address of array of Signatures)

    Signature
        pointer to function which implements the operator's action
        number of argument objects
        address of array of tag patterns
        pointer to stack checking function (or NULL)
        number of output objects

The window device is a Dictionary whose contents are in Local Memory.
The window device event handler is an Operator object which indexes
into the Operator table like any other operator to locate its function
pointer. One complication is that a window object has a block of
internal data that it stores in a PostScript String object. For an xcb
device this block of data contains an xcb_connection_t * and an
xcb_screen_t *, neither of which would still be valid after a resume.
Some crucial information, like the dimensions, would still be stored
in the dictionary, so it shouldn't be too difficult to create a new
window with the old specs.

What needs to be done for the interpreter to resume a stored memory
image:

The original design was to have all the various pieces naturally
live inside the memory arena, so then the Saving/Resuming behavior
would just happen automatically by saving and loading the raw data.
The memory arena is allocated using mmap() so the saving part is
already done. Without any extra effort, exiting the interpreter
leaves its final gmemXXXXXX and lmemXXXXXX files sitting right
there on the disk.
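For readers who haven't used a file-backed mapping this way, a minimal POSIX sketch follows. The function names `arena_map` and `arena_unmap` and the use of a fixed path are assumptions for the example; the interpreter's own setup code (and its XXXXXX file naming) is not shown here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map `size` bytes of the file at `path`, creating or resizing it as
   needed. Returns NULL on failure. */
void *arena_map(const char *path, size_t size)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd == -1)
        return NULL;
    if (ftruncate(fd, (off_t)size) == -1) {
        close(fd);
        return NULL;
    }
    /* MAP_SHARED is what makes stores to the arena reach the file. */
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Flush dirty pages to the file, then drop the mapping. */
int arena_unmap(void *p, size_t size)
{
    if (msync(p, size, MS_SYNC) == -1)
        return -1;
    return munmap(p, size);
}
```

Mapping the same file again in a later run gives back the same bytes, possibly at a different address, which is why everything inside the arena has to be an offset or an index rather than a pointer.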
I put the Operator table in the arena so the memory image would
naturally correspond to the operator definitions it would work
with... but that doesn't really solve anything, it seems, since the
function pointers in it are only good for the process that wrote
them. The Memory Table was originally implemented as a linked list of
fixed-size tables inside the arena, but it was pulled out of the arena
for better performance.
So, in broad strokes, the Memory Table needs to go back into the arena
and the Operator Table needs to come out. Perhaps some kind of CRC or
hash could be computed on the operator names to establish the
correspondence between the codes in memory and the functions they
reference. And the Magic pointers need to be replaced with an
index into a table.
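One way the correspondence could be re-established, sketched under assumptions: the saved image keeps only operator names, indexed by opcode, and on resume each name is looked up among the functions compiled into the running binary. All names here (`OpFunc`, `Builtin`, `rebind_operators`, `names_fingerprint`, the `op_add`/`op_sub` stand-ins) are hypothetical, and the hash is plain 32-bit FNV-1a chosen for brevity, not something the interpreter is known to use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical operator signature; real operators take a context. */
typedef int (*OpFunc)(void);

typedef struct { const char *name; OpFunc fn; } Builtin;

/* Stand-in operator bodies so the example is self-contained. */
int op_add(void) { return 1; }
int op_sub(void) { return 2; }

/* For each saved opcode i, store in fns[i] the built-in whose name is
   saved_names[i], or NULL if this binary has no such operator.
   Returns the number of names that could not be resolved. */
size_t rebind_operators(const char *const *saved_names, size_t n,
                        const Builtin *builtins, size_t nbuiltins,
                        OpFunc *fns)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        fns[i] = NULL;
        for (size_t j = 0; j < nbuiltins; j++) {
            if (strcmp(saved_names[i], builtins[j].name) == 0) {
                fns[i] = builtins[j].fn;
                break;
            }
        }
        if (fns[i] == NULL)
            missing++;
    }
    return missing;
}

/* FNV-1a over all names in opcode order, each terminating NUL included
   as a separator. Equal fingerprints suggest the opcode numbering is
   unchanged; they do not prove it. */
uint32_t names_fingerprint(const char *const *names, size_t n)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < n; i++) {
        const unsigned char *s = (const unsigned char *)names[i];
        do {
            h ^= *s;
            h *= 16777619u;
        } while (*s++ != '\0');
    }
    return h;
}
```

The fingerprint would be stored with the image as a cheap first check, with the name-by-name rebinding as the fallback when it doesn't match.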
Upon resuming, a sweep needs to be done to invalidate all
Glob pointers. And something needs to be done about FILE *s.
Stdio files like stdin, stdout, stderr should be possible
to reconnect -- if they are stored in a recognizable form.
And it seems possible -- with much additional work -- to
remember the position and filename of a repositionable file
and to fopen() and fseek() to the same place. But maybe FILE *s
should just be invalidated except for stdio files which do
seem essential.
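A sketch of what "stored in a recognizable form" might look like: a small record per File entity that replaces the FILE * in the saved image. The record layout and the names `FileKind`, `FileRecord` and `file_reconnect` are assumptions, and the sketch only restores disk files that were open for reading; anything else comes back invalid.

```c
#include <stdio.h>

typedef enum { FK_INVALID, FK_STDIN, FK_STDOUT, FK_STDERR, FK_DISK } FileKind;

/* What would be saved in place of a FILE *. */
typedef struct {
    FileKind kind;
    char path[256];  /* only meaningful for FK_DISK */
    long pos;        /* value of ftell() at save time */
} FileRecord;

/* Rebuild a FILE * from a saved record, or return NULL if the file
   has to be treated as invalidated. */
FILE *file_reconnect(const FileRecord *r)
{
    switch (r->kind) {
    case FK_STDIN:  return stdin;
    case FK_STDOUT: return stdout;
    case FK_STDERR: return stderr;
    case FK_DISK: {
        /* Assumption: read-only reopen. Reopening with "w" would
           truncate the file, so writable files need more thought. */
        FILE *f = fopen(r->path, "rb");
        if (f == NULL)
            return NULL;
        if (fseek(f, r->pos, SEEK_SET) != 0) {
            fclose(f);
            return NULL;
        }
        return f;
    }
    default:
        return NULL;
    }
}
```

Whether the extra bookkeeping for FK_DISK is worth it is exactly the open question above; dropping that case leaves just the stdio reconnection.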
And finally, all the top level structures need to be packaged
up into a record and stashed into the Local Memory arena.
Well, wait a second. I think it does actually need one special
top-level file to list all the contexts. Each context is associated
with exactly one Local memory so all the per-context info can be
stored there. But two contexts may be sharing that same Local memory.

Files to store on Disk:

    stateXXX/
        interpreter.config
        gmem001
        lmem001
        lmem002
        ...

Where interpreter.config would need to show which memories are used
by a context and where to find the context info in the (local) memory.
Something like:

    cid001: gmem001 lmem001 info:<address>
    cid002: gmem001 lmem002 info:<address>
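
Reading such a line back could be as small as one sscanf() call. This is only a sketch: the format is still a proposal, `ContextEntry` and `parse_context_line` are made-up names, and it assumes <address> is written as a decimal offset into the local memory.

```c
#include <stdio.h>

typedef struct {
    unsigned cid;        /* context id */
    unsigned gmem;       /* number of the global memory file */
    unsigned lmem;       /* number of the local memory file */
    unsigned long info;  /* assumed: decimal offset of the context record */
} ContextEntry;

/* Parse one line of the proposed interpreter.config format.
   Returns 1 on success, 0 if the line is malformed or incomplete. */
int parse_context_line(const char *line, ContextEntry *out)
{
    int n = sscanf(line, "cid%u: gmem%u lmem%u info:%lu",
                   &out->cid, &out->gmem, &out->lmem, &out->info);
    return n == 4;
}
```

Two contexts sharing a Local memory would simply show up as two lines naming the same lmem file with different info offsets.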