On 5/8/2023 1:47 PM, Scott Lurndal wrote:
> BGB <cr8...@gmail.com> writes:
>> On 5/8/2023 11:22 AM, Scott Lurndal wrote:
>>> "luke.l...@gmail.com" <luke.l...@gmail.com> writes:
>>>> On Monday, May 8, 2023 at 3:36:24 AM UTC+1, robf...@gmail.com wrote:
>>>>
>>>>> loop1:
>>>>>   LOADG g16,[r1+r2*]
>>>>>   STOREG g16,[r3+r2++*]
>>>>>   BLTU r2,1000,.loop1
>>>>>
>>>>> I must look at adding string instructions back into the instruction set.
>>>>
>>>> yeah can i suggest really don't do that. what happens if you want
>>>> to support UCS-2 (strncpyW)? then UCS-4? more than that: the
>>>> concepts needed to efficiently support strings, well you have to
>>>> add them anyway so why not make them first-order concepts
>>>> at the ISA level?
>>>
>>> UTF8 should be good enough for everything; best to deprecate
>>> UCS-2 et al.
>>>
>>
>> For many use-cases (transmission and storage), UTF-8 is a sane default,
>> but there are cases where UTF-8 is not ideal, such as inside console
>> displays or text editors.
>
> I disagree with that. Linux-based systems, for example, have no problem using UTF-8
> exclusively for editors, x-terms and any other i18n'd application.
>
As noted, UTF-8 makes sense for "transmission", say, sending text to or
from the console, or in the files loaded or saved from a text editor, etc...

Trying to process, edit, and redraw text *directly* in UTF-8 form
internally would be a massive PITA, and would be computationally
expensive, hence why a "character cell" approach is useful. But, as
noted, 32 or 64 bit cells usually make more sense here, since for things
like "syntax highlighting" it makes sense to mark out the text colors in
the editor buffers (rather than during the redraw process).
Things like variable-size text rendering add some complexity, but this
is mostly a matter of keeping track of the width of each character cell
and the maximum cell height in the row.
Then when one saves out the text, or copies it to the OS clipboard, etc,
it is converted back to UTF-8 (or UTF-16).
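
Just as a rough sketch of what I mean by a character cell (the field
names and layout here are purely illustrative, not from any particular
codebase):

  #include <stdint.h>

  /* Hypothetical 64-bit editor cell: decoded codepoint plus per-cell
     attributes, so syntax-highlight colors and cell widths live in the
     edit buffer rather than being recomputed during redraw. */
  typedef struct {
      uint32_t cp;     /* Unicode codepoint (decoded from UTF-8 on load) */
      uint8_t  fg;     /* foreground color index (syntax highlighting)   */
      uint8_t  bg;     /* background color index                         */
      uint8_t  width;  /* cell width in pixels, for variable-size text   */
      uint8_t  flags;  /* bold/italic/underline/etc.                     */
  } EdCell;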
As for fonts, there are various strategies:
  8x8x1, 8x16x1, or 16x16x1 bitmap:
    Works, but fairly limited; does not deal with resizable text.
  Small pixel bitmap (say, 16x16 or 32x32, 2..8 bpp):
    Can deal with things like emojis, but not really resizable.
  Signed Distance Fields:
    Resizable, but less ideal for full-color images (1).
  Small vector images for each glyph:
    Traditional form of TrueType fonts;
    Needlessly expensive to draw glyphs this way.
Scaling bitmap fonts with either nearest-neighbor or bilinear
interpolation does not give good-looking text (nearest-neighbor gives
inconsistent jagged edges, bilinear gives blurry text).

So, "Signed Distance Fields" are a good workaround, but mostly make the
most sense for representing monochrome images.
Effectively, "good" 8-color results require a 6 component image, with 2
components per color bit. For a monochrome image and SDF would need a 2
component image.
A 16-color image would need 8 components to represent effectively with
an SDF.
An SDF can be done using 1 component per channel, but the edge quality
isn't as good (one component forms encoding a combined XY distance from
an edge, and 2 component separately encoding the X and Y distances).
The usual algorithm is to interpolate the texels using bilinear
interpolation or similar, and then threshold the results per color bit
(one can then feed this through a small color palette). Traditionally,
this process is done in a fragment shader or similar.
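
Roughly, per output pixel, something like the following (a simplified C
sketch rather than a shader; the 8-bit encoding with 128 as the "on the
edge" value is an assumption):

  #include <stdint.h>

  /* Bilinearly sample one channel of an 8bpp SDF texture ('nch' channels
     interleaved, w*h texels) at texel coordinates (u,v), then threshold.
     Returns 1 if the pixel is inside the glyph for that color bit. */
  static int sdf_sample_bit(const uint8_t *tex, int w, int h, int nch,
                            float u, float v, int chan)
  {
      int   x0 = (int)u, y0 = (int)v;
      float fx = u - x0, fy = v - y0;
      int   x1 = (x0 + 1 < w) ? x0 + 1 : x0;
      int   y1 = (y0 + 1 < h) ? y0 + 1 : y0;

      float a = tex[(y0*w + x0)*nch + chan];
      float b = tex[(y0*w + x1)*nch + chan];
      float c = tex[(y1*w + x0)*nch + chan];
      float d = tex[(y1*w + x1)*nch + chan];

      float t = (a*(1-fx) + b*fx)*(1-fy) + (c*(1-fx) + d*fx)*fy;
      return t >= 128.0f;   /* thresholded bits then index a small palette */
  }

For an 8-color font, this would be evaluated per color bit, with the
resulting 3 bits used as the palette index.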
I guess traditionally, one would use a 256x256 texture for every 256
glyphs, with 16x16 texels per glyph.

Here, the full Unicode BMP would need 256 textures, or roughly 8MB if
each SDF is encoded using DXT1. Though, one trick is to store the glyphs
as a 16x16x1 bitmap font, and then dynamically convert blocks of glyphs
into SDF form (this is how some of my past 3D engines had worked IIRC).
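
That conversion can be done with a brute-force distance search per
texel, which is cheap enough for 16x16 glyphs. A rough sketch of the
one-component variant (the 8bpp encoding and the scale factor are
arbitrary choices here):

  #include <stdint.h>
  #include <math.h>

  /* Build a 1-component 8bpp SDF from a 16x16 1bpp glyph (bits[y] has
     bit x set if the pixel is "inside"). For each texel, find the
     distance to the nearest pixel of the opposite state; 128 = on the
     edge, with the scale chosen so ~8 texels covers the 0..255 range. */
  static void glyph_to_sdf(const uint16_t bits[16], uint8_t sdf[16*16])
  {
      for (int y = 0; y < 16; y++) {
          for (int x = 0; x < 16; x++) {
              int inside = (bits[y] >> x) & 1;
              float best = 1e9f;
              for (int j = 0; j < 16; j++) {
                  for (int i = 0; i < 16; i++) {
                      if (((bits[j] >> i) & 1) != inside) {
                          float dx = i - x, dy = j - y;
                          float d = sqrtf(dx*dx + dy*dy);
                          if (d < best) best = d;
                      }
                  }
              }
              float v = 128.0f + (inside ? best : -best) * 16.0f;
              if (v < 0)   v = 0;
              if (v > 255) v = 255;
              sdf[y*16 + x] = (uint8_t)v;
          }
      }
  }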
Though, currently, I haven't really gotten to this stage yet with
TestKern, still just sorta using 8x8x1 pixel bitmap fonts for now.
And, at the moment, I am experimenting with 640x400 and 800x600
256-color modes, and have started working on adding mouse support
(somewhat needed if I add any sort of GUI to this).
In this case, the 640x400 8-bpp mode has the advantage that it needs
less memory bandwidth, so the screen is slightly less of a broken
jittery mess (and also, the 800x600 mode currently uses a non-standard
36Hz refresh).
I guess one possibility could be to give the display hardware an
interface to talk directly with the DDR controller (and effectively
bypass the L2 cache), mostly as the properties the L2 cache adds are
"not particularly optimal" for the access patterns of screen refresh.
An "L2 bypass path" could potentially be able to sustain high enough
bandwidth to avoid the screen looking like a broken mess when trying to
operate at "slightly higher" resolutions.
There are pros/cons between 256-color and color-cell:
  Color cell gives better color fidelity, but more graphical artifacts;
  256-color has fewer obvious artifacts, but the color fidelity kinda
  sucks (going the RGB555 -> Indexed route, with a "generic" palette).

Drawing the screen image using ordered dither sorta helps, but still
doesn't look particularly good.
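
For reference, the conversion step looks something like this (the 4x4
Bayer matrix is the standard one; the 32K-entry palette-lookup table is
an assumption about how the nearest-index step might be done, not the
actual OS palette logic):

  #include <stdint.h>

  /* 4x4 ordered-dither (Bayer) thresholds, 0..15. */
  static const uint8_t bayer4[4][4] = {
      {  0,  8,  2, 10 },
      { 12,  4, 14,  6 },
      {  3, 11,  1,  9 },
      { 15,  7, 13,  5 }
  };

  /* Convert one RGB555 pixel at screen position (x,y) to a palette index.
     'pal_lookup' is assumed to be a 32K-entry RGB555 -> nearest-index
     table built offline for whatever fixed palette is in use. */
  static uint8_t rgb555_to_index(const uint8_t *pal_lookup,
                                 uint16_t px, int x, int y)
  {
      int r = (px >> 10) & 31, g = (px >> 5) & 31, b = px & 31;
      int d = bayer4[y & 3][x & 3] - 8;   /* -8..+7 dither offset */

      /* Nudge each 5-bit channel by a fraction of a quantization step. */
      r += d / 4; g += d / 4; b += d / 4;
      if (r < 0) r = 0; if (r > 31) r = 31;
      if (g < 0) g = 0; if (g > 31) g = 31;
      if (b < 0) b = 0; if (b > 31) b = 31;

      return pal_lookup[(r << 10) | (g << 5) | b];
  }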
Apparently Half-Life had used this approach (rendering internally using
RGB555 but then reducing the final image back down to 256 color in the
software renderer), but IIRC it looked a lot better than what I am
currently getting.
These images sort of show the issues I am dealing with:
https://twitter.com/cr88192/status/1654288824669708290
One shows the issue that plagues the 640x400 hi-color mode (and also the
800x600 modes), and the other shows the "kinda meh" color rendition with
a fixed 256-color "OS palette" (of the options tested, this being the
palette layout that got the lowest RMSE on my collection of test
images).
Well, along with the 256-color image showing a bug that I have since
fixed (a bug in the partial-update path when copying the internal
framebuffer to VRAM).
Note that the screen framebuffer is still internally drawn in RGB555,
and then converted to 256-color when being copied into VRAM (as opposed
to feeding it through a color-cell encoder).

So, internally this is a 512K screen framebuffer in 640x400 mode, or 1MB
for 800x600. The window also has its own backing buffer (which Doom
draws into, triggering the window stack to be redrawn into the screen
buffer, and then uploaded to VRAM).
>>
>> Still makes sense to keep support for UTF-16 around for the cases
>> where it is useful.
>
> It's a painful and non-universal mechanism. Certainly not worth adding
> support in the processor for it.
>
The CPU shouldn't really need to know or care.
For all it needs to know, it is just dealing with 16 or 32 bit WORD or
DWORD values, or packed 16 or 32 bit integer vectors.
The C compiler maybe needs to know/care, along with some parts of the C
library which cross paths with this.
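
E.g., converting UTF-8 to UTF-16 is just ordinary shifts, masks, and
16-bit stores, nothing that needs ISA support (a rough sketch, with no
validation or error handling):

  #include <stdint.h>
  #include <stddef.h>

  /* Decode UTF-8 into UTF-16; assumes well-formed input.
     Returns the number of 16-bit units written. */
  static size_t utf8_to_utf16(const uint8_t *src, size_t len, uint16_t *dst)
  {
      size_t i = 0, n = 0;
      while (i < len) {
          uint32_t c = src[i];
          if (c < 0x80) {                       /* 1 byte: ASCII           */
              i += 1;
          } else if ((c & 0xE0) == 0xC0) {      /* 2 bytes                 */
              c = ((c & 0x1F) << 6) | (src[i+1] & 0x3F);
              i += 2;
          } else if ((c & 0xF0) == 0xE0) {      /* 3 bytes: rest of BMP    */
              c = ((c & 0x0F) << 12) | ((src[i+1] & 0x3F) << 6)
                                     | (src[i+2] & 0x3F);
              i += 3;
          } else {                              /* 4 bytes: above the BMP  */
              c = ((c & 0x07) << 18) | ((src[i+1] & 0x3F) << 12)
                | ((src[i+2] & 0x3F) << 6) | (src[i+3] & 0x3F);
              i += 4;
              c -= 0x10000;                     /* emit a surrogate pair   */
              dst[n++] = (uint16_t)(0xD800 | (c >> 10));
              c = 0xDC00 | (c & 0x3FF);
          }
          dst[n++] = (uint16_t)c;               /* plain 16-bit store      */
      }
      return n;
  }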