Meta: a usenet server just for sci.math

Ross A. Finlayson

Dec 1, 2016, 11:24:47 PM

I have an idea here to build a usenet server
only for sci.math and sci.logic. The idea is
to find archives of sci.math and sci.logic and
to populate a store of the articles in a more
or less enduring form (say, "on the cloud"),
then to offer some usual news server access
then to, say, 1 month 3 month 6 month retention,
and then some cumulative retention (with a goal
of unlimited retention of sci.math and sci.logic
articles). The idea would be to have basically
various names of servers then reflect those
retentions for various uses for a read-only
archival server and a read-only daily server
and a read-and-write posting server. I'm willing
to invest time and effort to write the necessary
software and gather existing archives and integrate
with existing usenet providers to put together these
things.

Then, where basically it's in part an exercise
in vanity, I've been cultivating some various
notions of how to generate some summaries or
reports of various posts, articles, threads, and
authors, toward the specialization of the cultivation
of summary for reporting and research purposes.

So, I wonder about others' ideas on such a thing and
how they might see it as a reasonably fruitful
thing, basically for the enjoyment and for the
most direct purposes of the authors of the posts.

I invite comment, as I have begun to carry this out.

Ross A. Finlayson

Dec 2, 2016, 2:19:23 PM

So far I've read through the NNTP specs and looked
a bit at the INND code. Then, the general idea is
to define a filesystem layout convention, that then
would be used for articles, then for having those
on virtual disks (eg, "EBS volumes") or cloud storage
(eg, "S3") in essentially a Write-Once-Read-Many
configuration, where the goal is to implement data
structures that have a forward state machine so that
they remain consistent with unreliable computing
resources (eg, "runtimes on EC2 hosts"), and that
are readily cacheable (and horizontally scaleable).

Then, the runtimes are of the collection and maintenance
of posts ("infeeds" and "outfeeds", backfills), about
summary generation (overview, metadata, key extraction,
information content, working up auto-correlation), then
reader servers, then some maintenance and admin. As a
usual software design principle there is a goal of the
both "stack-on-a-box" and also "abstraction of resources"
and a usual separation of domain, library, routine, and
runtime logic.

So basically it looks like:
1) gather mbox files of sci.math and sci.logic
2) copy those to archive inputs
3) break those out into a filesystem layout for each article
(there are various filesystems that support this many files
these days)
4) generate partition and overview summaries
5) generate various revisioning schemes (the "article numbers"
of the various servers)
6) figure out the incremental addition and periodic truncation
7) establish a low-cost but high-availability endpoint runtime
8) make elastic/auto-scaling service routine behind that
9) have opportunistic / low cost periodic maintenance
10) emit that as a configuration that anybody can run
as "stack-on-a-box" or with usual "free tier" cloud accounts
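
As a rough sketch of step 3, here is one way the per-article
layout might be computed in Java. Nothing in it is settled above:
the class name ArticleLayout, the choice of a SHA-1 hash of the
Message-ID, and the two levels of directory fan-out are all
assumptions for illustration (Java 8 or later).

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical layout (an assumption, not a fixed convention):
//   <root>/<group>/<h0h1>/<h2h3>/<full-hash>/
// where h is the hex SHA-1 of the Message-ID. The article's parts
// (for example header, xref, body) would be files in that directory.
final class ArticleLayout {
    private ArticleLayout() {}

    /** Hex SHA-1 of the Message-ID, used only to spread articles over directories. */
    static String hashOf(String messageId) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(messageId.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the digests every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    /** Directory that would hold one article's files. */
    static Path articleDir(Path root, String group, String messageId) {
        String h = hashOf(messageId);
        return root.resolve(group)
                   .resolve(h.substring(0, 2))
                   .resolve(h.substring(2, 4))
                   .resolve(h);
    }
}
```

The path depends only on the group and the Message-ID, so a
backfill that sees the same article twice lands on the same
directory, which suits a write-once store.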

Ross A. Finlayson

Dec 14, 2016, 11:31:59 PM

Tapping away at this idea of a usenet server system,
I've written much of the read routine that is the
non-blocking I/O with the buffer passing and for the
externally coded data and any different coded data
like the unencrypted or uncompressed. I've quite
settled on 4KiB (2^12B) as the usual buffer page,
and it looks that the NFS offering can be so tuned
that its wsize (write size) is 4096 and with an
async NFS write option that that page size will
have that writes are incorruptible (though for
whatever reason they may be lost), and that 4096B
or 64 entries of 64B (2^6B) for a message-id or oversize-
message-id entry will spool off the message-id's of
the group's articles at an offset in the file that
is article-id * (1 << 6). The MTU of Ethernet packets
is often 1500 so having a wsize of 1KiB is not
nonsensible, as many of the writes are of this
granularity. Or the MTU might be 9001 ("jumbo" frames), which
would carry two 4KiB NFS writes in one Ethernet frame.
Having the NFS rsize (read size) say 32KiB seems not
unreasonable, with that the reads will be pages of the
article-id's, or, the article contents themselves (split
to headers, xrefs, body) from the filesystem that are
mostly some few key and mostly quite altogether > 32 KiB,
which is quite a lot considering that's less than a JPEG
the size of "this". (99+% of Internet traffic was JPEG
and these days is audio/video traffic, often courtesy JPEG.)
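
To make the offset arithmetic concrete, here is a minimal sketch
of such a message-id table in Java: fixed 64-byte entries, with
the entry for an article at offset article-id * (1 << 6). The
class name MessageIdTable, the zero padding, and the refusal of
oversize message-ids (rather than some overflow entry) are
assumptions of the sketch, not the design itself.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch: one file per group, one zero-padded 64-byte entry per article number.
final class MessageIdTable implements AutoCloseable {
    static final int ENTRY_SHIFT = 6;               // 2^6 = 64 bytes per entry
    static final int ENTRY_SIZE = 1 << ENTRY_SHIFT; // 64 entries fill one 4 KiB page

    private final FileChannel channel;

    MessageIdTable(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE);
    }

    /** Byte offset of an article's entry: article-id * (1 << 6). */
    static long offsetOf(long articleNumber) {
        return articleNumber << ENTRY_SHIFT;
    }

    void put(long articleNumber, String messageId) throws IOException {
        byte[] bytes = messageId.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length >= ENTRY_SIZE) {
            // Assumption: oversize ids would need a separate overflow scheme.
            throw new IllegalArgumentException("oversize message-id");
        }
        ByteBuffer entry = ByteBuffer.allocate(ENTRY_SIZE); // zero-filled
        entry.put(bytes);
        entry.rewind(); // position back to 0, limit stays at 64
        long pos = offsetOf(articleNumber);
        while (entry.hasRemaining()) {
            pos += channel.write(entry, pos); // positional write, no seek state
        }
    }

    /** Returns the message-id, or null if the entry is empty or past end of file. */
    String get(long articleNumber) throws IOException {
        ByteBuffer entry = ByteBuffer.allocate(ENTRY_SIZE);
        long pos = offsetOf(articleNumber);
        while (entry.hasRemaining()) {
            if (channel.read(entry, pos + entry.position()) < 0) {
                break; // end of file
            }
        }
        int len = 0;
        while (len < entry.position() && entry.get(len) != 0) {
            len++;
        }
        return len == 0 ? null : new String(entry.array(), 0, len, StandardCharsets.US_ASCII);
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Since every entry has the same size, a lookup by article number
is a single positional read with no index, and an entry never
straddles a 4 KiB page.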

Writing the read routine is amusing me with training the
buffers and it amuses me to write code with quite the
few +1 and -1 in the offsets. Usually having +-1 in
the offset computations is a good or a bad thing, rarely
good, with that often it's a sign that the method signature
just isn't being used quite right in terms of the locals,
if not quite as bad as "build a fence a mile then move it
a foot". When +-1 offsets is a good thing, here the operations
on the content of the buffers are rather agnostic the bounds
and amount of the buffers, thus that I/O should be quite
expedient in the routine.

(Written in Java, it should run quite the same on any
runtime with Java 1.4+.)

That said then next I'm looking to implement the Executor pool.

Acceptor -> Reader -> Scanner -> Executor -> Printer -> Writer

The idea of the Executor pool is that there are many connections
or sessions (the protocol is stateful), then that for one session,
its command's results are returned in order, but, that doesn't say
that the commands are executed in order, just that their results
are returned in order. (For some commands, which affect the state
of the session like current group or current article, that being
pretty much it, those also have to be executed sequentially for
consistency's sake.) So, I'm looking to have the commands be
executed in any possible order, for the usual idea of saturating
the bandwidth of the horizontally scalable backend. (Yeah, I
know NFS has limits, but it's unbounded and durable, and there's
overall a consistent, non-blocking toward lock-free view.)
Anyways, basically the Session has a data structure of its
outstanding commands, as they're enqueued to the task executor,
then whether it can go into the out-of-order pool or must stay
in the serial pool. Then, as the commands complete, or for
example timeout after retries on some network burp, those are
queued back up as the FIFO of the Results and as those arrive
the Writer is re-registered with the SocketChannel's Selector
for I/O notifications and proceeds to fill the socket's output
buffer and retire the Command and Result. One aspect of this
is that the Printer/Writer doesn't necessarily get the data on
the heap, the output for example an article is composed from
the FileChannels of the message-id's header, xref, body. Now,
these days, the system doesn't have much of a limit in open
file handles, but as mentioned above there are limits on NFS
file handles. Basically then the data is retrieved as from the
object store (or here an octet store but the entire contents of
the files are written to the output with filesystem transfer
direct to memory or the I/O channel). Then, releasing the
NFS file handles expeditiously basically is to be figured out
with caching the contents, for any retransmission or simply
serving copies of the current articles to any number of
connections. As all these are, read-only, it looks like the
filesystems' built-in I/O caching with, for example, a read-only
client view and no timeout, basically turns the box into a file
cache, because that is what it is.
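
Here is a small sketch of the "results in order, execution in any
order" idea, using the standard java.util.concurrent classes. The
class name OrderedSession and the plain String results are
illustrative assumptions, and the sketch does not enforce a
barrier between a state-changing command and the out-of-order
commands that follow it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Sketch: one session's outstanding commands, retired in FIFO order.
final class OrderedSession {
    private final ExecutorService outOfOrderPool; // any number of threads
    private final ExecutorService serialPool;     // assumed single-threaded
    private final Queue<Future<String>> pending = new ArrayDeque<>();

    OrderedSession(ExecutorService outOfOrderPool, ExecutorService serialPool) {
        this.outOfOrderPool = outOfOrderPool;
        this.serialPool = serialPool;
    }

    /** Enqueue a command; commands that change session state go to the serial pool. */
    void submit(Callable<String> command, boolean changesSessionState) {
        ExecutorService pool = changesSessionState ? serialPool : outOfOrderPool;
        pending.add(pool.submit(command));
    }

    /** Retire results strictly in submission order, whatever order they finished in. */
    List<String> drain() {
        List<String> results = new ArrayList<>();
        Future<String> next;
        while ((next = pending.poll()) != null) {
            try {
                results.add(next.get());
            } catch (ExecutionException e) {
                results.add("error: " + e.getCause()); // placeholder error result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

In the non-blocking design the Writer would not call a blocking
drain(); it would be re-registered with the Selector when the
head of the FIFO completes. The blocking get() only keeps the
sketch short.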

Then, it looks like there is a case for separate reader and
writer implementations altogether of the NFS or octet store
(that here is an object store for the articles and their
sections, and an octet store for the pages of the tables).
This is with the goal of minimizing network access while
maintaining the correct view. But, an NFS export can't
be mounted twice from the same client (one for reads and
one for writes), and, while ingesting the message can be
done separately from the client, intake has to occur from the
client, then what with a usual distributed cloud queue
implementation having size and content limits, it seems
like it'll be OK.

Ross A. Finlayson

Dec 17, 2016, 5:58:16 PM

On Tuesday, December 13, 2016 at 12:05:13 AM UTC-8, Ross A. Finlayson wrote:
>
>
> That is basically the design issue then, I'm tapping away on this.

The next thing I'm looking at is how to describe the "range",
as a data structure or in algorithms.

Here a "range" class in the runtime library is usually a
"bounds" class. I'm talking about a range, basically a
1-D range, about basically a subset of the integers,
then that the range is iterating over the subset in order,
about how to maintain that in the most maintainable and
accessible terms (in computational complexity's space and time
terms).

So, I'm looking to define a reasonable algebra of individuals,
subsets, segments, and rays (and their complements) that
naturally compose to objects with linear maintenance and linear
iteration and constant access of linear partitions of time-
series data, dense or sparse, with patterns and scale.

This then is to define data structures as so compose that
given a series of items and a predicate, establish the
subset of items as a "range", that then so compose as
above (and also that it has translations and otherwise
is a fungible iterator).

I don't have one of those already in the runtime library.

punch-out <- punches have shapes, patterns? eg 1010
knock-out <- knocks have area
pin-out <- just one
drop-out <-
fall-out <- range is out

Then basically there's a coalescence of all these,
that they have iterators or mark bounds, of the
iterator of the natural range or sequence, for then
these being applied in order

push-up <- basically a prioritization
fill-in <- for a "sparse" range, like the complement upside-down
pin-in
punch-in
knock-in

Then all these have the basic expectation that a range
is the combination of each of these that are expressions
then that they are expressions only of the value of the
iterator, of a natural range.

Then, for the natural range being time, then there is about
the granularity or fine-ness of the time, then that there is
a natural range either over or under the time range.

Then, for the natural range having some natural indices,
the current and effective indices are basically one and
zero based, that all the features of the range are shiftable
or expressed in terms of these offsets.

0 - history

a - z

-m,n

Whether there are pin-outs or knock-outs rather varies on
whether removals are one-off or half-off.

Then, pin-outs might build a punch-out,
While knock-outs might build a scaled punch-out

Here the idea of scale then is to apply the notions
of stride (stripe, stribe, striqe) to the range, about
where the range is for example 0, 1, .., 4, 5 .., 8, 9
that it is like 1, 3, 5, 7 scaled out.

Then, "Range" becomes quite a first-class data structure,
in terms of linear ranges, to implement usual iterators
like forward ranges (iterators).

Then, for time-forward searches, or to compose results in
ranges from time-forward searches, without altogether loading
into memory the individuals and then sorting them and then
detecting their ranges, there is to be defined how ranges
compose. So, the Range includes a reference to its space
and the Bounds of the Space (in integers then extended
precision integers).

"Constructed via range, slices, ..." (gslices), ....

Then, basically I want that the time series is a range,
that expressions matching elements are dispatched to
partitions in the range, that the returned or referenced
composable elements are ranges, that the ranges compose
basically pair-wise in constant time, thus linearly over
the time series, then that iteration over the elements
is linear in the elements in the range, not in the time
series. Then, it's still linear in the time series,
but sub-linear in the time series, also in space terms.

Here, sparse or dense ranges should have the same small-
linear space terms, with there being maintenance on the
ranges, about there being hysteresis or "worst-case 50/50"
(then basically some inertia for where a range is "dense"
or "sparse" when it has gt or lt .5 elements, then about
where it's just organized that way because there is a re-
organization).

So, besides composing, then the elements should have very
natural complements, basically complementing the range by
taking the complement of the range's parts, that each
sub-structure has a natural complement.

Then, pattern and scale are rather related, about figuring
that out some more, and leaving the general purpose, while
identifying the true primitives of these.

Then eventually there is attachment or reference to values
under the range, and general-purpose expressions to return
an iteration or build a range, about the collectors that
establish where range conditions are met and then collapse
after the iteration is done, as possible.

So, there is the function of the range, to iterate, then
there is the building of the range, by iterating. The
default of the range and the space is its bounds (or, in
the extended, that there are none). Then, segments are
identified by beginning and end (and perhaps a scale, about
rigid translations and about then that the space is
unsigned, though unbounded both left and right see
some use). These are dense ranges, then for whether the
range is "naturally" or initially dense or sparse. (The
usual notion is "dense/full" but perhaps that's as
"complement of sparse/empty".) Then, as elements are
added or removed in the space, if they are added range-wise
then that goes to a stack of ranges that any forward
iterator checks before it iterates, about whether the
natural space's next is in or out, or, whether there is
a skip or jump, or a flip then to look for the next item
that is in instead of out.

This is where, the usual enough organization of the data
as collected in time series will be bucketed or partitioned
or sharded into some segment of the space of the range,
that building range or reading range has the affinity to
the relevant bucket, partition, or shard. (This is all
1-D time series data, no need to make things complicated.)

Then, the interface basically "builds" or "reads" ranges,
building given an expression and reading as a read-out
(or forward iteration), about that then the implementation
is to compose the ranges of these various elements of a
topological sort about the bounds/segments and scale/patterns
and individuals.

https://en.wikipedia.org/wiki/Allen%27s_interval_algebra

This is interesting, for an algebra of intervals, or
segments, but here so far I'd been having that the
segments of contiguous individuals are eventually
just segments themselves, but composing those would
see the description as of this algebra. Clearly the
goal is the algebra of the contents of sets of integers
in the integer spaces.

An algebra of sets and segments of integers in integer spaces

An integer space defines elements of a type that are ordered.

An individual integer is an element of this space.

A set of integers is a set of integers, a segment of integers
is a set containing a least and greatest element and all elements
between. A ray of integers is a set containing a least element
and all greater elements or containing a greatest element and
all lesser elements.

A complement of an individual is all the other individuals,
a complement of a set is the intersection of all other sets,
a complement of a segment is all the elements of the ray less
than and the ray greater than all individuals of the segment.
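
One possible concrete reading of these definitions, as a sketch
in Java: a set of integers in a bounded space is kept as a sorted
list of disjoint closed segments, so that union is a linear merge
and the complement is the list of gaps. The class name Segments,
the bounded space [min, max], and the use of long are assumptions
here; rays would be segments that touch min or max.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a subset of the integer space [min, max] as sorted, disjoint,
// non-adjacent closed segments {lo, hi}. Both operands of union/intersect
// are assumed to share the same space; bounds are assumed to stay well
// away from Long.MIN_VALUE / Long.MAX_VALUE (no overflow checks).
final class Segments {
    final long min, max;
    final List<long[]> parts;

    private Segments(long min, long max, List<long[]> parts) {
        this.min = min;
        this.max = max;
        this.parts = parts;
    }

    static Segments empty(long min, long max) {
        return new Segments(min, max, new ArrayList<>());
    }

    /** A single segment [lo, hi]; an individual is the case lo == hi. */
    static Segments of(long min, long max, long lo, long hi) {
        if (min > lo || lo > hi || hi > max) {
            throw new IllegalArgumentException("segment outside space");
        }
        List<long[]> one = new ArrayList<>();
        one.add(new long[] {lo, hi});
        return new Segments(min, max, one);
    }

    /** Linear merge of two sorted lists, coalescing overlap and adjacency. */
    Segments union(Segments other) {
        List<long[]> merged = new ArrayList<>();
        int i = 0, j = 0;
        while (i < parts.size() || j < other.parts.size()) {
            long[] next;
            if (j == other.parts.size()
                    || (i < parts.size() && parts.get(i)[0] <= other.parts.get(j)[0])) {
                next = parts.get(i++);
            } else {
                next = other.parts.get(j++);
            }
            long[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && next[0] <= last[1] + 1) {
                last[1] = Math.max(last[1], next[1]);
            } else {
                merged.add(new long[] {next[0], next[1]}); // copy; inputs stay untouched
            }
        }
        return new Segments(min, max, merged);
    }

    /** The gaps between segments, including the rays below and above. */
    Segments complement() {
        List<long[]> gaps = new ArrayList<>();
        long cursor = min;
        for (long[] p : parts) {
            if (p[0] > cursor) {
                gaps.add(new long[] {cursor, p[0] - 1});
            }
            cursor = p[1] + 1;
        }
        if (cursor <= max) {
            gaps.add(new long[] {cursor, max});
        }
        return new Segments(min, max, gaps);
    }

    /** Intersection by De Morgan: the complement of the union of complements. */
    Segments intersect(Segments other) {
        return complement().union(other.complement()).complement();
    }

    /** Membership by binary search over the segments. */
    boolean contains(long x) {
        int lo = 0, hi = parts.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            long[] p = parts.get(mid);
            if (x < p[0]) {
                hi = mid - 1;
            } else if (x > p[1]) {
                lo = mid + 1;
            } else {
                return true;
            }
        }
        return false;
    }
}
```

In this representation a dense set and its sparse complement
cost about the same space, since each is described by the same
boundaries.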

What are the usual algebras of the compositions of individuals,
sets, segments, and rays?

https://en.wikipedia.org/wiki/Region_connection_calculus

Then basically all kinds of things that are about subsets
of things in a topological or ordered space should basically
have a first-class representation as (various kinds of)
elements in the range algebra.

So, I'm wondering what there is already for
"range algebra" and "range calculus".

Ross A. Finlayson

Dec 18, 2016, 8:48:15 PM

Some of the features of these subsets of a
range of integers are available as a usual
bit vector, eg with ffs ("find-first-set")
memory scan instructions,
and as well usual notions of compressed bitmap
indices, with some notion of random access to
the value of a bit by its index and variously
iterating over the elements. Various schemes
to compress the bitmaps down to uncompressed
regions with representing words' worths of bits
may suit parts of the implementation, but I'm
looking for a "pyramidal" or "multi-resolution"
organization of efficient bits, and also flags,
about associating various channels of bits with
the items or messages.
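
For illustration only, here is a two-level bit vector in Java,
the smallest case of a "pyramidal" organization: a summary bit
per 64-bit block says whether the block has any bit set, so a
forward scan can skip empty blocks. The class name TwoLevelBits
is made up, and clearing bits (which would need the summary to be
re-checked) is left out. java.util.BitSet and
Long.numberOfTrailingZeros are the standard library's counterparts
of the bit vector and of ffs.

```java
import java.util.BitSet;

// Sketch: bits at level 0, one "block is non-empty" bit per 64 bits at level 1.
final class TwoLevelBits {
    private static final int BLOCK = 64;
    private final BitSet bits = new BitSet();
    private final BitSet summary = new BitSet();

    void set(int index) {
        bits.set(index);
        summary.set(index / BLOCK);
    }

    boolean get(int index) {
        return bits.get(index);
    }

    /** Next set index at or after 'from', or -1 if there is none. */
    int nextSet(int from) {
        int block = summary.nextSetBit(from / BLOCK);
        while (block >= 0) {
            int start = Math.max(from, block * BLOCK);
            int hit = bits.nextSetBit(start);
            if (hit >= 0 && hit < (block + 1) * BLOCK) {
                return hit;
            }
            block = summary.nextSetBit(block + 1); // skip to the next non-empty block
        }
        return -1;
    }

    /** "Find first set" on one 64-bit word: index of the lowest set bit, or -1. */
    static int firstSet(long word) {
        return word == 0 ? -1 : Long.numberOfTrailingZeros(word);
    }
}
```

Further levels (a summary of the summary) would follow the same
pattern, and separate BitSet instances could carry the separate
channels of flags.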

https://en.wikipedia.org/wiki/Bitmap_index

Then, with having narrowed down the design for
what syntax to cover, and, mostly selected data
structures for the innards, then I've been looking
to the data throughput, then some idea of support
of client features.

Throughput is basically about how to keep the
commands moving through. For this, there's a
single thread that reads off the network interface's
I/O buffers, it was also driving the scanner, but
adding encryption and compression layers, then there's
also adding a separate thread to drive the scanner
thus that the network interface is serviced on demand.
Designing a concurrent data structure basically has
a novel selector (as of the non-blocking I/O) to
then pick off a thread from the pool to run the
scanner. Then, on the "printer" side and writing
off to the network interface, it is similar, with
having the session or connection's resources run
the compression and encryption, then for the I/O
thread as servicing the network interface. Basically
this is having put a collator/relay thread between
the I/O threads and the scanner/printer threads
(where the commands are run by the executor pool).

Then, a second notion has been the support of TLS.
It looks I would simply sign a certificate and expect
users to check and install it themselves in their
trust-store for SSL/TLS. That said, it isn't really
a great solution, because, if someone compromises any
of the CA's, certificate authorities, in the trust
store (any of them), then a man-in-the-middle could
sign a cert, and it would be on the server to check
that the content hash reflected the server cert from
the handshake. What might be better would be to have
that each client, signs their own certificate, for the
server to present. This way, the client and server
each sign a cert, and those are exchanged. When the
server gets the client cert, it restarts the negotiation
now with using the client-signed cert as the server
cert. This way, there's only a trust anchor of depth
1 and the trust anchors are never exchanged and can
not be cross-signed nor otherwise would ever share
a trust root. Similarly the server gets the server-
signed cert back from the client then that TLS could
proceed with a session ticket and that otherwise there
would be a stronger protection from compromised CA
certs. Then, this could be pretty automatic with
a simple enough browser interface or link to set up TLS.
Then the server and client would only trust themselves
and each other (and keep their secrets private).

Then, for browsing, a reading of IMAP, the Internet
Message Access Protocol, shows a strong affinity with
the organization of Usenet messages, with newsgroups
as mailboxes. As well, implementing an IMAP server
that is backed by the NNTP server has then that the
search artifacts and etcetera (and this was largely
a reason why I need this improved "range" pattern)
would build for otherwise making deterministic date-
oriented searches over the messages in the NNTP server.
IMAP has a strong affinity with NNTP, and is a very
similar protocol and is implemented much the same
way. Then it would be convenient for users with
an IMAP client to simply point to "usenet.science"
or what and get usenet through their email browser.

Ross A. Finlayson

Dec 24, 2016, 1:21:16 AM

About implementing usenet with reasonably
modern runtimes and an eye toward
unlimited retention, basically looking
into "microtasks" for the routine or
workflow instances, as are driven with
non-blocking I/O throughout, basically
looking to memoize the steps as through
a finite state machine, for restarts as
of a thread, then to go from "service
oriented" to "message oriented".

This involves writing a bit of an
HTTP client for rather usual web
service calls, but with high speed
non-blocking I/O (less threads, more
connections). Also this involves a
sufficient abstraction.

Ross A. Finlayson

Jan 6, 2017, 4:57:00 PM

This writing some software for usenet service
is coming along with the idea of how to implement
the fundamentally asynchronous non-blocking routine.
This is crystallizing in pattern as a: re-routine,
in reference to computing's usual: co-routine.

The idea of the re-routine is that there are only
so many workers, threads, of the runtime. The usual
runtimes (and this one, Java, say) support preemptive
multithreading as a means of implementing cooperative
multithreading, with the maintenance of separate stacks
(of, the stack machine of usual C-like procedural runtimes)
and some thread-per-connection model. This is somewhat
reasonable for the composition of blocking APIs, but
not so much for the composition of non-blocking APIs
and about how to not have many thread-per-connection
resources with essentially zero duty cycle that instead
could maintain for themselves the state machine of their
routine (with simplified forward states and a general
exception and error routine), for cooperative multi-threading.

The idea of this re-routine then is to connect functions,
there's a scope for variables in the scope, there is
execution of the functions (or here the routines, as
the "re-routines") then the instance of the re-routine
is re-entrant in the sense that as partial results are
accumulated the trace of the routine is marked out, with
leaving in the scope the current or partial or intermediate
results. Then, the asynchronous workers that fulfill each
routine (eg, with a lookup, a system call, or a network
call) are separate worker units dedicated to their domain
(of the routine, not the re-routine, and they can be blocking,
polling for their fleet, or callback with the ticket).

Then, this is basically a network machine and protocol,
here about NNTP and IMAP, and its resources are often
then of network machines and protocols (eg networked
file systems, web services). Then, these "machines"
of the "re-routine" being built (basically for the
streaming model instead of the batch model if you
know what I'm talking about) defining the logical
outcomes of the composition of the inputs and the
resulting outputs in terms of scopes as a model of
the cooperative multithreading, these re-routines
then are seeing for the pattern then that the
source template is about implicitly establishing
the scope and the passing and calling convention
(without a bunch of boilerplate or "callback confusion",
"async hell"). This is where the re-routine, when
a routine worker fills in a partial result and resubmits
the re-routine (with the responsibility/ownership of
the re-routine) that it is re-evaluated from the beginning,
because it is constant linear in reading forward for the
item the state of its overall routine, thusly implicit
without having to build a state machine, as it is
declaratively the routine.

So, I am looking at this as my solution as to how to
establish a very efficient (in resource and performance
terms) formally correct protocol implementation (and
with very simple declarative semantics of usual forward,
linear routines).

This "re-routine" pattern then as a model of cooperative
multithreading sees the complexity and work into the
catalog of blocking, polling, and callback support,
then for usual resource injection of those as all
supported with references to usual sequential processes
(composition of routine).

Ross A. Finlayson

Jan 21, 2017, 5:33:23 PM

I've about sorted out how to implement the re-routine.

Basically a re-routine is a suspendable composite
operation, with normal declarative flow-of-control
syntax, that memo-izes its partial results, and
re-executes the same block of statements then to
arrive at its pause, completion, or exit.

Then, the command and executor are passed to the
implementation that has its own (or maybe the
same) execution resources, eg a thread or connection
pool. This resolves the value of the asynchronous
operation, and then re-submits the re-routine to
its originating executor. The re-routine re-runs
(it runs through the branching or flow-of-control
each time, but that's small in the linear and all
the intermediate products are already computed,
and the syntax is usual and in the language).
The re-routine then either re-suspends (as it
launches the next task) or completes or exits (errors).
Whether it suspends, completes or exits, the
re-routine just returns, and the executor then
is specialized and just checks the re-routine
whether it's suspended (and just drops it, the
new responsible launched will re-submit it),
or whether it's completed or errored (to call
back to the originating commander the result of
the command).
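
A minimal sketch of how such a re-routine might look in Java 8,
under several assumptions that are not in the description above:
the names ReRoutine, Async and await are invented, partial results
are memoized in a map under string keys, and suspension is
signalled by a private exception that run() catches, so that the
routine body reads as plain sequential code. Error exits and
thread-safety beyond one outstanding step at a time are left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Function;

// Sketch: a suspendable composite operation that memoizes partial results
// and re-executes its body from the beginning after each one arrives.
final class ReRoutine<R> implements Runnable {
    /** Unwinds the body when a value is not yet available (an assumption of this sketch). */
    private static final class Suspend extends RuntimeException {
        Suspend() {
            super(null, null, false, false); // no stack trace: cheap control flow
        }
    }

    /** An asynchronous step: start the work, call back once with the value. */
    interface Async<T> {
        void start(Consumer<T> callback);
    }

    private final Map<String, Object> memo = new HashMap<>();
    private final Function<ReRoutine<R>, R> body;
    private final Executor executor;
    private final Consumer<R> onComplete;
    int runs; // how many times the body has been (re-)executed

    ReRoutine(Function<ReRoutine<R>, R> body, Executor executor, Consumer<R> onComplete) {
        this.body = body;
        this.executor = executor;
        this.onComplete = onComplete;
    }

    /** Returns the memoized value, or launches the step and suspends. */
    @SuppressWarnings("unchecked")
    <T> T await(String key, Async<T> step) {
        if (memo.containsKey(key)) {
            return (T) memo.get(key);
        }
        step.start(value -> {
            memo.put(key, value);   // fill in the partial result
            executor.execute(this); // re-submit to the originating executor
        });
        throw new Suspend();
    }

    @Override
    public void run() {
        runs++;
        R result;
        try {
            result = body.apply(this); // runs through the flow of control again
        } catch (Suspend s) {
            return; // dropped here; the launched step will re-submit it
        }
        onComplete.accept(result);
    }
}
```

A body then looks like ordinary code, for example
String g = self.await("group", lookupGroup); followed by
Integer n = self.await("count", countArticles); where the two
Async steps are hypothetical. Whether those steps block, poll, or
call back from another thread does not change the body.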

In this manner, it seems like a neat way to basically
establish the continuation, for this "non-blocking
asynchronous operation", while at the same time
the branching and flow of control is all in the
language, with the usual unsurprising syntax and
semantics, for cooperative multi-threading. The
cost is in wrapping the functional callers of the
routine and setting up their factories and otherwise
as via injection (and they can block the calling
thread, or have their own threads and block, or
be asynchronous, without changing the definition
of the routine).

So, having sorted this mostly out, then the usual
work as of implementing the routines for the protocol
can so proceed then with a usual notion of a framework
of support for both the simple declaration of routine
and the high performance (and low resource usage) of
the delegation of routine, and support for injection
for test and environment, and all in the language
with minimal clutter, no byte-code modification,
and a ready wrapper for libraries of arbitrary
run-time characteristic.

This solves some problems.

j4n bur53

Feb 7, 2017, 3:16:14 AM

Not _too_ much progress, has basically seen the adaptation
of this re-routine pattern to the command implementations,
with basically usual linear procedural logic then the
automatic and agnostic composition of the asynchronous
tasks in the usual declarative syntax that then the
pooled (and to be metered) threads are possibly by
design entirely non-blocking and asynchronous, and
possibly by design blocking or otherwise agnostic of
implementation, with then the design of the state
machine of the routine as "eventually consistent"
or forward and making efficient use of the computational
and synchronization resources.

The next part has been about implementing a client "machine"
as complement to the server "machine", where a machine here
is an assembly as it were of threads and executors about the
"reactive" (or functional, event-driven) handling of the
abstract system resources (small pojos, file name, and
linked lists of 4K buffers). The server basically starts
up listening on a port then accepts and starts a session
for any connection and then a reader fills and moves buffers
to each of the sessions of the connections, and signals the
relay then for the scanning of the inputs and then composing
the commands and executing those as these re-routines, that
as they complete, then the results of the commands are then
printed out to buffers (eg, encoded, compressed, encrypted)
then the writer sends that back on the wire. The client
machine then is basically a model of asynchronous and
probably serial computation or a "web service call", these
days often and probably on pooled HTTP connections. This
then is pretty simple with the callbacks and the addressing/
routing of the response back to the re-routine's executor
to then re-submit the re-routine to completion.

I've been looking at other examples of continuations, the
"reactive" programming or these days' "streaming model"
(where the challenge is much in the aggregations), that
otherwise non-blocking or asynchronous programming is
often rather ... recursively ... rolled out where this
re-routine gains even though the flow-of-control is
re-executed over the memoized contents of the re-routines
as they are so composed declaratively, that this makes
what would be "linear" at worst "n squared", but that is
only on how many commands there are in the procedure,
not combined over their execution because all the
intermediate results are memoized (as needed, because
if the implementation is local or a mock instead, the
re-routine is agnostic of asynchronicity and just runs
through linearly, but the relevant point is that the
number of composable units is a small constant thus
that its square is a small constant, particularly
as otherwise being a free model of cooperative multi-
threading, here toward a lock-free design). All the
live objects remain on the heap, but just the objects
and not for example the stack as a serialized continuation.
(This could work out to singleton literals or "coding"
but basically it will have to auto-throttle off heap-max.)
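
To put a number on the "n squared" remark, assuming a routine
with k asynchronous steps that each suspend once: the body runs
k + 1 times, and the j-th run passes over j - 1 memoized steps
before it launches the next step or completes. The total count of
revisited steps is then:

```latex
\sum_{j=1}^{k+1} (j - 1) \;=\; \frac{k(k+1)}{2}
```

For k = 2 that is 3 revisits over 3 runs, and each revisit is
only a lookup of a stored result, not a repeat of the work.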

So, shuffling and juggling the identifiers and organizations
around and sifting and sorting what elements of the standard
concurrency and functional libraries (of, the "Java" language)
to settle on for usual neat and concise (and re-usable and
temporally agnostic) declarative flow-of-control (i.e., with
"Future"'s everywhere and as about reasonable or least-surprising
semantics, if any, with usual and plain code also being "in
the convention"), then it is settling on a style.

Well, thanks for reading, it's a rather stream-of-consciousness
narrative, here about the design of pretty re-usable software.

Julio Di Egidio

Feb 7, 2017, 4:05:47 AM

On Tuesday, February 7, 2017 at 9:16:14 AM UTC+1, Ross A. Finlayson wrote:

> Not _too_ much progress, has basically seen the adaptation
> of this re-routine pattern to the command implementations,

I do not understand what you are trying to achieve here. As long as Usenet the
protocol is fine per se, the technical problem at least is already solved, i.e.
there is plenty of Usenet server software available... OTOH, the "problem with
Usenet" such that one would want to build an entirely new network seems to me
is more of a socio-cybernetic kind, so I'd rather find interesting discussing,
say, the merits but also the limitations of moderation as an approach, and maybe
even what better could be done. But, again, the technical problem is not really
a problem, in fact that is the easy part....

(Also, I do not see why discuss this in sci.math. Maybe comp.ai.philosophy, as
for collective intelligence?)

Julio

Ross A. Finlayson

Feb 7, 2017, 2:18:07 PM

Sure, I'll limit this.

There is plenty of usenet server software, but it is mostly
INND or BNews/CNews, or a few commercial cousins. The design
of those systems is tied to various economies that don't so much
apply these days. (The use-case, of durable distributed message-
passing, is still quite relevant, and there are many ecosystems
and regimes small and large as about it.) In the days of managed
commodity network and compute resources or "cloud computing", here
as above about requirements, then a modernization is relevant, and
for some developers with the skills, not so distant.

Another point is that the eventual goal is archival, my goal isn't
to start an offshoot, instead to build the system as a working
model of an archive, basically from the author's view as a working
store for extracting material, and from the developer's view as
an example in design with low or no required maintenance and
"scalable" operation for a long time.

You mention comp.ai.philosophy, these days there's a lot more
automated reasoning (or, mockingbird generators), as computing
and development affords more and different forms of automated
reasoning, here again the point is for an archival setting to
give them something to read.

Thanks, then, I'll limit this.

Julio Di Egidio

Feb 9, 2017, 2:00:32 AM

On Tuesday, February 7, 2017 at 8:18:07 PM UTC+1, Ross A. Finlayson wrote:

> There is plenty of usenet server software, but it is mostly
> INND or BNews/CNews, or a few commercial cousins.

There is plenty of free and open news server software:
<https://www.dmoz.org/Computers/Software/Internet/Servers/Usenet>
<https://en.wikipedia.org/wiki/News_server>

> Another point is that the eventual goal is archival, my goal isn't
> to start an offshoot, instead to build the system as a working
> model of an archive, basically from the author's view as a working
> store for extracting material,

I'd have qualms as to what the degree-zero is, namely, I'd think more of hyper-
texts hence a Wiki (or, in the larger, the web itself) as the basic structure.
OTOH, Usenet is a conversational model, for discussions, not even forums.

Regardless, even at that most basic level, you already face the fundamental
problem of the "quality" of the content (for some to-be-properly-defined notion
of quality). For one thing, consider that garbage is garbage even under the
best microscope...

> You mention comp.ai.philosophy, these days there's a lot more

I mentioned comp.ai.philosophy partly because I do not have a better reference,
partly because, for how basic you want to keep it (and I am all for building
incrementally), I would think it is only considerations at that level that can
provide the fundamental requirements.

Julio

Ross A. Finlayson

Mar 21, 2017, 7:10:21 PM

I continued tapping away at this.

The re-routines now sit beyond a module or domain definition.
This basically defines the modules' value types like session,
message, article, group, content, wildmat. Then, it also
defines a service layer, as about the relations of the elements
of the domain, so that then the otherwise simple value types
have natural methods as relate them, all implemented behind
a service layer, that implemented with these re-routines is
agnostic of synchronous or asynchronous convention, and
is non-blocking throughout with cooperative multithreading.
This has a factory of factories or industry pattern that provides
the object graph wiring and dynamic proxying to the routine
implementations, that are then defined as traits, that the re-
routine composes the routines as mixins (of the domain's
services).

(This is all "in the language" in Java, with no external dependencies.)

The transport mechanism is basically having abstracted the
attachment for a usual non-blocking I/O framework for the
transport types as of the scattering/gathering or vector I/O
as about then the interface between transport and protocol
(here NNTP, but, generally). Basically in a land of 4K byte buffers,
then those are fed from the Reader/Writer that is the endpoint to
a Feeder/Scanner that is implemented for the protocol and usual
features like encryption and compression, then making Commands
and Results out of those (and modelling transactions or command
sequences as state machines which are otherwise absent), those
systolically carrying out as primitive or transport types to a Printer/
Hopper, that also writes the response (or rather, consumes the buffers
in a highly concurrent highly efficient event and selection hammering).
The selector is another bounded resource, so it's configurable the
SelectorAssignment and there might be a thread for each group of
selectors about FD_SETSIZE, but that's not really at issue as select
went to epoll, but provides an option for that eventuality.

The transport and protocol routines are pretty well decoupled this
way, and then the protocol domain, modules, and routines are as
well so decoupled (and fall together pretty naturally), much using
quite usual software design patterns (if not necessarily so formally,
quite directly).

The protocol then (here NNTP) then is basically in a few files detailing
the semantics of the commands to the scanner as overriding methods
of a Command class, and implementing the action in the domain from
extending the TraitedReRoutine then for a single definition in the NNTP
domain that is implemented in various modules or as collections of services.

Ross A. Finlayson

Apr 9, 2017, 11:20:50 PM

I'm still tapping away at this if rather more slowly (or, more sporadically).

The "re-routine" async completion pattern is more than less
figured out (toward high concurrency as a model of cooperative
multi-threading, behind also a pattern of a domain layer, with mix-in
nyms that is also some factory logic), a simple non-blocking I/O socket
service routine is more than less figured out (the server not the client,
toward again high concurrency and flexible and efficient use of machine
or virtualized resources as they are), the commands and their bodies are
pretty much typed up, then I've been trying to figure out some data
structures basically in I/O (Input/Output), or here mostly throughput
as it is about the streams.

I/O datum FIFOs and holders:

buffer queue
handles queue
buffer+handles queue
buffer/buffer[] or buffer[]/buffer in loops
byte[]/byte[] in steps
Input/Output in Streams

Basically any of the filters or adapters is specialized to these input/output
data holders. Then, there are logically enough queues or FIFOs as there are
really implicitly between any communicating sequential processes that are
rate-limited or otherwise non-systolic ("real-time"), here for some ideas about
data structures, as either implement or adapt unbounded single producer/
single consumer (SPSC) queues.

One idea is the making the linked container with then sentinel nodes
and otherwise making it thread-safe (for a single producer and single
consumer). This is where the queue (or, "monohydra" or "slique") is
rather generally a container, and that here iterations are usually
consuming the queue, but sometimes there are aggregates collected
then to go over the queue. The idea then is that the producer and
consumer have separate views of the queue that the producer does
atomic swap on the tail of the queue and that a consumer's iterator
of elements (as iterable and not just a queue, for using the queue as
a holder and not just a FIFO) returns a marker to the end of the iteration,
for example in computing bounds over the buffers then re-iterating and
flipping the buffers then given the bounds moving the buffers' references
to an output array thus consuming the FIFO.
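
A minimal sketch of the linked container described above, with a sentinel node and an iteration that returns a marker to its end. The names Node, Slique, peek_all and consume_to are made up for illustration, and plain assignment stands in for the atomic swap on the tail, so this only models the structure and is not a thread-safe implementation.

```python
class Node:
    """One link in the queue; the first node is a sentinel holding no item."""
    __slots__ = ("item", "next")

    def __init__(self, item=None):
        self.item = item
        self.next = None


class Slique:
    """Linked FIFO with a sentinel head, readable as a holder before consuming."""

    def __init__(self):
        self.head = Node()      # sentinel, owned by the consumer
        self.tail = self.head   # owned by the producer

    def append(self, item):
        # Producer side: link the new node, then advance the tail.
        # (The text describes an atomic swap here; this is plain assignment.)
        node = Node(item)
        self.tail.next = node
        self.tail = node

    def peek_all(self):
        """Iterate without consuming; return the items seen and an end marker."""
        items, node = [], self.head
        while node.next is not None:
            node = node.next
            items.append(node.item)
        return items, node      # node marks where this iteration ended

    def consume_to(self, marker):
        """Drop everything up to and including marker; it becomes the sentinel."""
        marker.item = None
        self.head = marker
```

The consumer can first go over the buffers to compute bounds, then call consume_to with the marker, while anything the producer appended in between stays queued.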

This then combines with the tasks that the tasks driving the I/O (as events
drive the tasks) are basically constant tasks or runnables (constant to the
session or attachment) that just have incremented a count of times to run
thus that there's always a service of the FIFO after the atomic append.

Another idea is this hybrid or serial mix-and-match (SPSC FIFO), of buffers
and handles. This is where the buffer is the data in-line, the handle is a
reference to the data. This is about passing through the handles where
the channels support their transfer, and converting them to inline data
where they don't. That's then about all the combined cases as the above
I/O datum FIFOs and holders, with adapting them so the filter chain blasts
(eg specialized operation), loops (transferring in and out of buffers), steps
(statefully filling and levelling data), or moves (copying the references, the
data in or out or on or off, then to perform the I/O operations) over them.

It seems rather simpler to just adapt the data types to the boundary I/O data
types which are byte buffers (here size-4K pooled memory buffers) and for
that the domain shouldn't know concrete types so much as interfaces, but
the buffers and handles (file handles) and arrays as they are are pretty much
fungible to the serialization of the elements of the domain, that can then
specialize how they build logical inputs and outputs of the commands.

burs...@gmail.com

Apr 10, 2017, 8:18:09 AM

You could use camel.

Camel is a rule-based routing and mediation engine that provides an
object-based implementation of the Enterprise Integration Patterns
using an application programming interface (or declarative domain-
specific language) to configure routing and mediation rules.

Its name is derived from the camel humps, since the packets
might take flippy-floppy routes. It also provides automatic
integration of the Gamma Functions, so that Archie's posts could
be automatically verified as to whether he computes the factorial correctly.

Ross A. Finlayson

Mar 1, 2022, 4:09:32 PM

Take a hike, troll.

Duane Hume

Mar 1, 2022, 4:18:55 PM

Mar 5, 2022, 5:26:57 PM

In more primitive cultures it's usually matters of 5's and 20's.

Feeling more rarified these days?

Ross Finlayson

Mar 8, 2023, 11:51:58 PM

After implementing a store, and the protocol for getting messages, then what seems relevant here in the
context of the SEARCH command, is a fungible file-format, that is derived from the body of the message
in a normal form, that is a data structure that represents an index and catalog and dictionary and summary
of the message, a form of a data structure of a "search index".

These types of files should naturally compose, and, according to some normal forms of search and
summary algorithms, result in a data structure that makes for efficient search of sections of the
corpus for information retrieval, here that "information retrieval is the science
of search algorithms".

Now, for what and how people search, or what is the specification of a search, is in terms of queries, say,
here for some brief forms of queries that advise what's definitely included in the search, what's excluded,
then perhaps what's maybe included, or yes/no/maybe, which makes for a predicate that can be built,
that can be applied to results that compose and build for the terms of a filter with yes/no/maybe or
sure/no/yes, with predicates in values.
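
As a rough sketch of such a yes/no/maybe predicate: the function name make_filter and the choice to return a count of matched "maybe" terms as a ranking score are assumptions for illustration, not something settled above.

```python
def make_filter(yes=(), no=(), maybe=()):
    """Build a predicate over a collection of terms.

    The predicate returns None when the item is rejected, otherwise a
    score counting how many of the 'maybe' terms were present.
    """
    yes, no, maybe = set(yes), set(no), set(maybe)

    def predicate(terms):
        terms = set(terms)
        if not yes <= terms:       # every 'yes' term must be present
            return None
        if no & terms:             # any 'no' term rejects the item
            return None
        return len(maybe & terms)  # 'maybe' terms only rank the hit

    return predicate
```

For example, make_filter(yes=["usenet"], no=["spam"], maybe=["nntp"]) accepts a message whose terms include "usenet" and not "spam", and scores it higher when "nntp" is also present.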

Here there is basically "free text search" and "matching summaries", where text is the text and summary is
a data structure, with attributes as paths, the leaves of the tree of which match.

Then, the message has text, its body, and headers, key-value pairs or collections thereof, where as well
there are default summaries like "a histogram of words by occurrence" or for example default text like "the
MIME body of this message has a default text representation".

So, the idea developing here is to define what are "normal" forms of data structures that have some "normal"
forms of encoding that result that these "normalizing" after "normative" data structures define well-behaved
algorithms upon them, which provide well-defined bounds in resources that return some quantification of results,
like any/each/every/all, "hits".

This is where usually enough search engines' or collected search algorithms ("find") usually enough have these
de-facto forms, "under the hood", as it were, to make it first-class that for a given message and body that
there is a normal form of a "catalog summary index" which can be compiled to a constant when the message
is ingested, that then basically any filestore of these messages has alongside it the filestore of the "catsums"
or as on-demand, then that any algorithm has at least well-defined behavior under partitions or collections
or selections of these messages, or items, for various standard algorithms that separate "to find" from
"to serve to find".

So, ..., what I'm wondering are what would be sufficient normal forms in brief that result that there are
defined for a given corpus of messages, basically at the granularity of messages, how is defined how
there is a normal form for each message its "catsum", that catsums have a natural algebra that a
concatenation of catsums is a catsum and that some standard algorithms naturally have well-defined
results on their predicates and quantifiers of matching, in serial and parallel, and that the results
combine in serial and parallel.
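
One way to picture the algebra, assuming for illustration that a catsum is nothing more than a histogram of terms (the real form would carry more): combining two catsums gives another catsum, and the grouping of the combination does not matter, which is what lets results combine in serial or in parallel. The names catsum and concat are hypothetical.

```python
from collections import Counter


def catsum(text):
    """Per-message summary: a histogram of lowercased, whitespace-split terms."""
    return Counter(text.lower().split())


def concat(*catsums):
    """Combine catsums; the result is again a catsum."""
    total = Counter()
    for c in catsums:
        total.update(c)   # adds counts term by term
    return total
```

Under this assumption the catsum of two messages taken together equals the concatenation of their separate catsums, so per-message files can be compiled once at ingestion and merged later.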

The results should be applicable to any kind of data but here it's more or less about usenet groups.

Ross Finlayson

Mar 9, 2023, 1:23:04 AM

So I start browsing the Information Retrieval section in Wikipedia and more or less get to reading
Luhn's 1958 "automatic coding of document summaries" or "The Automatic Creation of Literature
Abstracts". Then, what I figure, is that the histogram, is an associative array of keys to counts,
and what I figure is to compute both the common terms, and, the rare terms, so that there's both
"common-weight" and "rare-weight" computed, off of the count of the terms, and the count of
distinct terms, where it is working up that besides catums, or catsums, it would result a relational
algebra of terms in, ..., terms, of counts and densities and these type things. This is where, first I
would figure the catsum would be deterministic before it's at all probabilistic, because the goal is
match-find not match-guess, while still it's to support the less deterministic but more opportunistic
at the same time.
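
A small deterministic sketch of that, under an assumed definition: a term is "common" when its count is above the mean count (count of terms divided by count of distinct terms) and "rare" when below it. That threshold is only a placeholder for whatever weighting gets settled on, and the function name weights is made up.

```python
from collections import Counter


def weights(text):
    """Return the histogram plus the common and rare terms of a text."""
    hist = Counter(text.lower().split())
    total = sum(hist.values())        # count of the terms
    distinct = len(hist)              # count of distinct terms
    mean = total / distinct if distinct else 0.0
    common = sorted(t for t, n in hist.items() if n > mean)
    rare = sorted(t for t, n in hist.items() if n < mean)
    return hist, common, rare
```

The same input always yields the same lists, which keeps it on the match-find side rather than match-guess.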

Then, the "index" is basically like a usual book's index, for each term that's not a common term in
the language but is a common term in the book, what page it's on, here that that is a read-out of
a histogram of the terms to pages. Then, compound terms, basically get into grammar, and in terms
of terms, I don't so much care to parse glossolalia as what result mostly well-defined compound terms
in usual natural languages, for the utility of a dictionary and technical dictionaries. Here "pages" are
both according to common message threads, and also the surround of messages in the same time
period, where a group is a common message thread and a usenet is a common message thread.
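
The book-index idea can be sketched as an inverted index from terms to "pages", here taking thread identifiers as the pages. The function name build_index and the stop_words parameter (standing in for "common terms in the language") are assumptions for illustration.

```python
from collections import defaultdict


def build_index(messages, stop_words=frozenset()):
    """Map each term to the sorted list of pages (thread ids) it occurs on.

    messages: iterable of (thread_id, text) pairs.
    """
    index = defaultdict(set)
    for thread_id, text in messages:
        for term in text.lower().split():
            if term not in stop_words:
                index[term].add(thread_id)
    return {term: sorted(pages) for term, pages in index.items()}
```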

(I've had a copy of "the information retrieval book" before, also borrowed one "data logic".)

"Spelling mistakes considered adversarial."

https://en.wikipedia.org/wiki/Subject_indexing#Indexing_theory

Then, there's lots to be said for "summary" and "summary in statistic".

A first usual data structure for efficiency is the binary tree or bounding tree. Then, there's
also what makes for divide-and-conquer or linear speedup.

About the same time as Luhn's monograph or 1956, there was published a little book
called "Logic and Language", Huppe and Kaminsky. It details how according to linguistics
there are certain usual regular patterns of words after phonemes and morphology what
result then for stems and etymology that then for vocabulary that grammar or natural
language results above. Then there are also gentle introductions to logic. It's very readable
and quite brief.

V

Mar 17, 2023, 10:10:28 AM

Let's get to know each other. Me: http://kohtumispaik.000webhostapp.com/Infovahetusteks/dpic/1679061026.gif

Have a nice day......

Ross Finlayson

Apr 29, 2023, 5:54:26 PM

I haven't much been tapping away at this,
but it's pretty simple to stand up a usenet peer,
and pretty simple to slurp a copy,
of the "Big 8" usenet text groups, for example,
or particularly just for a few.

Ross Finlayson

Jan 24, 2024, 7:51:10 AM

How do I know, well his header contains:

Injection-Info: google-groups.googlegroups.com;
posting-host=97.126.97.251; posting-account=WH2DoQoAAADZe3cdQWvJ9HKImeLRniYW

Means he is even using google groups right now.
From his cabin in the woods:

$ whois 97.126.97.251

OrgName: CenturyLink Communications, LLC
OrgId: CCL-534
Address: 100 CENTURYLINK DR
City: Monroe
StateProv: LA
PostalCode: 71201

Mild Shock wrote:

Alan Mackenzie

Jan 24, 2024, 2:41:22 PM

Ross Finlayson <ross.a.f...@gmail.com> wrote:

[ .... ]

> Basically thinking about a "backing file format convention".

> The message ID's are universally unique. File-systems support various
> counts and depths of sub-directories. The message ID's aren't
> necessarily opaque structurally as file-names. So, the first thing is
> a function that given a message-ID, results a message-ID-file-name.

> Then, as it's figured that groups, are, separable, is about how, to,
> either have all the messages in one store, or, split it out by groups.
> Either way the idea is to convert the message-ID-file-name, to a given
> depth of directories, also legal in file names, so it results that the
> messages get uniformly distributed in sub-directories of approximately
> equal count and depth.

> A....D...G <- message-ID

> ABCDEFG <- message-ID-file-name

> /A/B/C/D/E/F/ABCDEFG <- message-ID-directory-path

> So, the idea is that the backing file format convention, basically
> results uniform lookup of a file's existence, then about ingestion and
> constructing a message, then, moving that directory as a link in the
> filesystem, so it results atomicity in the file system that supports
> that the existence of a message-ID-directory-path is a function of
> message-ID, and usual filesystem guarantees.
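
The quoted convention can be sketched roughly as follows. Using a SHA-256 hex digest as the message-ID-file-name is an assumption made here so that the name is legal in file names and spreads uniformly; the quoted text does not say which function is used, and the function names and the depth parameter are likewise hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def message_id_file_name(message_id: str) -> str:
    """Map a Message-ID to a file-system-safe name (assumption: SHA-256 hex)."""
    return hashlib.sha256(message_id.encode("utf-8")).hexdigest()


def message_id_directory_path(message_id: str, depth: int = 3) -> PurePosixPath:
    """Use the first `depth` characters of the name as nested directories."""
    name = message_id_file_name(message_id)
    return PurePosixPath(*name[:depth], name)
```

Since the path is a pure function of the Message-ID, checking whether a message exists is a single lookup of that path.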

Ross Finlayson

Feb 18, 2024, 10:00:22 PM

It seems like Gert Webelhuth has a good book called
"Principles and Parameters of Syntactic Saturation",
discusses linguistics pretty thoroughly.

global.oup.com/academic/product/principles-and-parameters-of-syntactic-saturation-9780195070415?cc=us&lang=en&
books.google.com/books?id=nXboTBXbhwAC

Reading about this notion of "saturation", on the one
hand it seems to indicate lack of information, on the
other hand it seems to be capricious selective ignorance.

www.tandfonline.com/doi/full/10.1080/23311886.2020.1838706
doi.org/10.1080/23311886.2020.1838706
Saturation controversy in qualitative research: Complexities and
underlying assumptions. A literature review
Favourate Y. Sebele-Mpofu

Here it's called "censoring samples", which is often enough
with respect to "outliers". Here it's also called "retro-finitist".
The author details it's a big subjective mess and from a
statistical design sort of view it's, not saying much.

Here this is starting a bit simpler with for example a sort of
goal to understand annotated and threaded plain text
conversations, in the usual sort of way of establishing
sequence, about the idea for relational algebra, to be
relating posts and conversations in threads, in groups
in time, as with regards to simple fungible BFF's, as
with regards to simple fungible SFF's, what result highly
repurposable presentation, via storage-neutral means.

It results sort of bulky to start making the in-place
summary file formats, with regards to, for example,
the resulting size of larger summaries, yet at the same
time, the extraction and segmentation, after characterization,
and elision:

extraction: headers and body
characterization: content encoding
extraction: text extraction
segmentation: words are atoms, letters are atoms, segments are atoms
elision: hyphen-ization, 1/*comment*/2

then has for natural sorts bracketing and grouping,
here for example as with paragraphs and itemizations,
for the plainest sort of text having default characterization.

In this context it's particularly attribution which is a content
convention, the "quoting depth" character, for example,
in a world of spaces and tabs, with regards to enumerating
branches, what result relations what are to summarize
together, and apart. I.e. there's a notion with the document,
that often enough the posts bring their own context,
for being self-contained, in the threaded organization,
how to best guess attribution, given good faith attribution,
in the most usual sorts of contexts, of plain text extraction.

Then, SEARCH here is basically that "search finds hits",
or what matches, according to WILDMAT and IMAP SEARCH
and variously Yes/No/Maybe as a sort of WILDMAT search,
then for _where_ it finds hits, here in the groups', the threads',
the authors', and the dates', for browsing into those variously.

That speaks to a usual form of relation for navigation,

group -> threads
thread -> authors
author -> threads
date -> threads

and these kinds of things, about the many relations that
in summary are all derivable from the above described BFF
files, which are plain messages files with dates linked in from
the side, threading indicated in the message files, and authors
linked out from the messages.
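
For illustration, these navigation relations can be derived in one pass over per-post records. The record layout (a dict with group, thread, author and date keys) and the name build_relations are assumptions, not the BFF layout itself.

```python
from collections import defaultdict


def build_relations(posts):
    """Derive the navigation relations from an iterable of post records.

    Each post is a dict with 'group', 'thread', 'author' and 'date' keys.
    """
    names = ("group_threads", "thread_authors", "author_threads", "date_threads")
    rel = {name: defaultdict(set) for name in names}
    for p in posts:
        rel["group_threads"][p["group"]].add(p["thread"])
        rel["thread_authors"][p["thread"]].add(p["author"])
        rel["author_threads"][p["author"]].add(p["thread"])
        rel["date_threads"][p["date"]].add(p["thread"])
    return rel
```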

I.e., here the idea then for content, is that, specific mentions
of technical words, basically relate to "tag cloud", about
finding related messages, authors, threads, groups,
among the things.

Ross Finlayson

Feb 20, 2024, 10:47:14 PM

About a "dedicated little OS" to run a "dedicated little service".

"Critix"

1) some boot code
power on self test, EFI/UEFI, certificates and boot, boot

2) a virt model / a machine model
maybe running in a virt
maybe running on metal

3) a process/scheduler model
it's processes, a process model
goal is, "some of POSIX"

Resources

Drivers

RAM
Bus
USB, ... serial/parallel, device connections, ....
DMA
framebuffer
audio dac/adc

Disk

hard
memory
network

Login

identity
resources

Networking

TCP/IP stack
UDP, ...
SCTP, ...
raw, ...

naming

Windowing

"video memory and what follows SVGA"
"Java, a plain windowing VM"

PCI <-> PCIe

USB 1/2 USB 3/4

MMU <-> DMA

Serial ATA

NIC / IEEE 802

"EFI system partition"

virtualization model
emulator

clock-accurate / bit-accurate
clock-inaccurate / voltage

mainboard / motherboard
circuit summary

emulator environment

CPU
main memory
host adapters

PU's
bus

I^2C

clock model / timing model
interconnect model / flow model
insertion model / removal model
instruction model

Ross Finlayson

Feb 20, 2024, 11:38:51 PM

Alright then, about the SFF, "summary" file-format,
"sorted" file-format, "search" file-format, the idea
here is to figure out normal forms of summary,
that go with the posts, with the idea that "a post's
directory is on the order of contained size of the
size of the post", while, "a post's directory is on
a constant order of entries", here is for sort of
summarizing what a post's directory looks like
in "well-formed BFF", then as with regards to
things like Intermediate file-formats as mentioned
above here with the goal of "very-weakly-encrypted
at rest as constant contents", then here for
"SFF files, either in the post's-directory or
on the side, and about how links to them get
collected to directories in a filesystem structure
for the conventions of the concatenation of files".

So, here the idea so far is that BFF has a normative
form for each post, which has a particular opaque
globally-universal unique identifier, the Message-ID,
then that the directory looks like MessageId/ then its
contents were as these files.

id hd bd yd td rd ad dd ud xd
id, header, body, year-to-date, thread, referenced, authored, dead,
undead, expired

or just files named

i h b y t r a d u x

which according to the presence of the files and
their contents, indicate that the presence of the
MessageId/ directory indicates the presence of
a well-formed message, contingent on not being expired.
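
A sketch of that check, using the single-letter file names. Which files are strictly required is not pinned down above, so REQUIRED (id, header, body) and the reading that the presence of 'x' means expired are assumptions, as are the function names.

```python
import os

REQUIRED = frozenset({"i", "h", "b"})  # assumption: id, header, body
EXPIRED = "x"                          # assumption: 'x' present means expired


def is_well_formed(names):
    """Decide well-formedness from the file names found in MessageId/."""
    names = set(names)
    return REQUIRED <= names and EXPIRED not in names


def check_directory(path):
    """Apply the same test to an on-disk MessageId/ directory."""
    return os.path.isdir(path) and is_well_formed(os.listdir(path))
```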

... Where hd bd are the message split into its parts,
with regards to the composition of messages by
concatenating those back together with the computed
message numbers and this kind of thing, with regards to
the site, and the idea that they're stored at-rest pre-compressed,
then knowledge of the compression algorithm makes for
concatenating them in message-composition as compressed.

Then, there are variously already relations of the
posts, according to groups, then here as above that
there's a perceived requirement for date, and author.
I.e. these are files on the order of the counts of posts,
or span in time, or count of authors.

(About threading and relating posts, is the idea of
matching subjects not-so-much but employing the
References header, then as with regards to IMAP and
parity as for IMAP's THREAD extension, ...,
www.rfc-editor.org/rfc/rfc5256.html , cf SORT and THREAD.
There's a usual sort of notion that sorted, threaded
enumeration is either in date order or thread-tree
traversal order, usually more sensibly date order,
with regards to breaking out sub-threads, variously.
"It's all one thread." IMAP: "there is an implicit sort
criterion of sequence number".)
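
A small sketch of threading off the References header rather than subjects: the References header lists ancestors oldest first, so its first entry is taken as the thread root, and a post without References roots its own thread. The tuple layout and function names are hypothetical, and posts are enumerated in date order within each thread.

```python
from collections import defaultdict


def thread_root(message_id, references=""):
    """Return the root Message-ID: the first (oldest) entry of References."""
    ids = references.split()
    return ids[0] if ids else message_id


def group_by_thread(posts):
    """posts: iterable of (date, message_id, references) tuples."""
    threads = defaultdict(list)
    for date, message_id, references in sorted(posts):
        threads[thread_root(message_id, references)].append(message_id)
    return dict(threads)
```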

Then, similarly is for defining models for the sort, summary,
search, SFF, that it sort of (ha) rather begins with sort,
about the idea that it's sort of expected that there will
be a date order partition either as symlinks or as an index file,
or as with regards to that the message's date is also stored in
the yd file, then as with regards to "no file-times can be
assumed or reliable", with regards to "there's exactly one
file named YYYY-MM-DD-HH-MM-SS in MessageId/", these
kinds of things. There's a real goal that it works easy
with shell built-ins and text-utils, or "command line",
to work with the files.

So, sort pretty well goes with filtering.
If you're familiar with the context, of, "data tables",
with a filter-predicate and a sort-predicate,
they're different things but then go together.
It's figured that they get front-ended according
to the quite most usual "column model" of the
"table model" then "yes/no/maybe" row filtering
and "multi-sort" row sorting. (In relational algebra, ...,
or as rather with 'relational algebra with rows and nulls',
this most usual sort of 'composable filtering' and 'multi-sort').
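
As a sketch of composable filtering with multi-sort over rows: the row layout (dicts keyed by column) and the names multi_sort and select are assumptions, and nulls are left out. It leans on the sort being stable, so sorting by the least significant key first gives the multi-column order.

```python
def multi_sort(rows, keys):
    """Sort rows by several columns.

    keys: list of (column, descending) pairs, most significant first.
    """
    rows = list(rows)
    # list.sort is stable, so apply the least significant key first
    # and the most significant key last.
    for column, descending in reversed(keys):
        rows.sort(key=lambda r: r[column], reverse=descending)
    return rows


def select(rows, predicates, keys):
    """Keep rows passing every predicate, then multi-sort them."""
    kept = [r for r in rows if all(p(r) for p in predicates)]
    return multi_sort(kept, keys)
```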

Then in IMAP, the THREAD command is "a variant of
SEARCH with threading semantics for the results".
This is where both posts and emails work off the
References header, but it looks like in the wild there
is something like "a vendor does poor-man's subject
threading for you and stuffs in an X-References",
this kind of thing, here with regards to that
instead of concatenation, is that intermediate
results get sorted and threaded together,
then those, get interleaved and stably sorted
together, that being sort of the idea, with regards
to search results in or among threads.
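
The interleaving step might look like this, assuming each intermediate result is already a list of (date, message_id) hits sorted by date; the name interleave is made up. heapq.merge keeps hits with equal dates in the order of the inputs they came from, so the merge is stable.

```python
import heapq


def interleave(*partials):
    """Merge partial results, each sorted by date, into one date-ordered list."""
    return list(heapq.merge(*partials, key=lambda hit: hit[0]))
```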

(Cf www.jwz.org/doc/threading.html as
via www.rfc-editor.org/rfc/rfc5256.html ,
with regards to In-Reply-To and References.
There are some interesting articles there
about "mailbox summarization".)

About the summary of posts, one way to start
as for example an interesting article about mailbox
summarization gets into, is, all the necessary text-encodings
to result UTF-8, of Unicode, after UCS-2 or UCS-4 or ASCII,
or CP-1252, in the base of BE or LE BOMs, or anything to
do with summarizing the character data, of any of the
headers, or the body of the text, figuring of course
that everything's delivered as it arrives, as with regards
to the opacity usually of everything vis-a-vis its inspection.

This could be a normative sort of file that goes in the messageId/
folder.

cd: character-data, a summary of whatever form of character
encoding or requirements of unfolding or unquoting or in
the headers or the body or anywhere involved indicating
a stamp indicating each of the encodings or character sets.

Then, the idea is that it's a pretty deep inspection to
figure out how the various attributes, what are their
encodings, and the body, and the contents, with regards
to a sort of, "a normalized string indicating the necessary
character encodings necessary to extract attributes and
given attributes and the body and given sections", for such
matters of indicating the needful for things like sort,
and collation, in internationalization and localization,
aka i18n and l10n. (Given that the messages are stored
as they arrived and undisturbed.)

The idea is that "the cd file doesn't exist for messages
in plain ASCII7, but for anything anywhere else, breaks
out what results how to get it out". This is where text
is often in a sort of format like this.

Ascii
  it's keyboard characters
ISO8859-1/ISO8859-15/CP-1252
  it's Latin1 often though with the Windows guys
Sideout
  it's Ascii with 0-127 gigglies or upper glyphs
Wideout
  it's 0-256 with any 256 wide characters in upper Unicode planes
Unicode
  it's Unicode

Then there are all sorts of encodings, this is according to
the rules of Messages with regards to header and body
and content and transfer-encoding and all these sorts of
things, it's Unicode.
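
A very coarse first pass at deciding whether a stored message needs a cd file at all, looking only at raw bytes. It ignores transfer-encodings and encoded words in headers, so it is a sketch of the "plain ASCII7 needs nothing" rule and not of the deep inspection; the function names and the returned labels are made up.

```python
def needs_cd(raw: bytes) -> bool:
    """True when the stored message has any byte outside 7-bit ASCII."""
    return not raw.isascii()


def classify(raw: bytes) -> str:
    """Give a rough label for the raw bytes of a message."""
    if raw.isascii():
        return "ascii"
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some 8-bit set such as ISO8859-1 or CP-1252; not determined here.
        return "8-bit"
```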

Then, another thing to get figured out is lengths,
the size of contents or counts or lengths, figuring
that it's a great boon to message-composition to
allocate exactly what it needs for when, as a sum
of invariant lengths.

Then the MessageId/ files still have un-used 'l' and 's',
then though that 'l' looks too close to '1', here it's
sort of unambiguous.

ld: lengthed, the coded and uncoded lengths of attributes and parts

The idea here is to make it easiest for something like
"consult the lengths and allocate it raw, concatenate
the message into it, consult the lengths and allocate
it uncoded, uncode the message into it".
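
A minimal sketch of "consult the lengths and allocate it raw, concatenate the message into it", assuming the recorded lengths are simply a list of byte counts, one per part; the function name compose and the mismatch check are illustrative.

```python
def compose(parts, lengths):
    """Concatenate parts into a buffer allocated once from known lengths."""
    buf = bytearray(sum(lengths))   # one allocation, exactly the sum
    offset = 0
    for part, n in zip(parts, lengths):
        if len(part) != n:
            raise ValueError("recorded length does not match part")
        buf[offset:offset + n] = part
        offset += n
    return bytes(buf)
```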

So, getting into the SFF, is that basically
"BFF indicates well-formed messages or their expiry",
"SFF is derived via a common algorithm for all messages",
and "some SFF lives next to BFF and is also write-once-read-many",
vis-a-vis that "generally SFF is discardable because it's derivable".
