Is there a need for a byte integer data type?

Pascal Jasmin

Oct 13, 2025, 10:43:41 AM
to Forum
It seems like the only way to store a byte list is

 a. {~ 12 0 _22

convert to numeric by

 a. i. a. {~ 12 0 _22

12 0 234

I understand that CPU math can be faster on bytes than on larger integers.

Is there internal use of bytes that makes this not a concern? Should I just use the above approach for storage?
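
The storage approach above can be sketched as a round trip. This is only an illustration: `bytes` and `signed` are hypothetical names, and the signed view assumes the usual two's-complement reading of bytes 128..255.

```j
bytes =: a. {~ 12 0 _22      NB. _22 indexes a. from the end, so byte 234 is stored
datatype bytes               NB. literal (1 byte per element)
a. i. bytes                  NB. 12 0 234  (unsigned view)
signed =: - 256 * 127&<      NB. hypothetical helper: map 128..255 to _128.._1
signed a. i. bytes           NB. 12 0 _22  (signed view)
```

The character list is the stored form; any arithmetic still happens on the integers that `a. i.` produces.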

Henry Rich

Oct 13, 2025, 12:57:12 PM
to forum
There are some special combinations using a. &i. .

We may implement 1-byte integers someday. 

Henry Rich

Devon McCormick

Oct 14, 2025, 2:06:05 PM
to fo...@jsoftware.com
My own initial reaction to this is that it's not such a great idea.  For there to be a significant performance improvement, one would have to be dealing with large arrays of small integers.  This might be the case when dealing with pictures represented as three-dimensional RGB arrays but it seems that, even in this case, most arithmetic would be in danger of promoting the single-byte integers to floating point or full integer values; this would probably negate any performance enhancement.

Another concern is the general one that the size of an interpreter is proportional to the number of datatypes it handles, since any new data type has to be considered in relation to all the existing data types.  If we were going to push for such an increase, it would seem that giving J true Booleans, with one bit per Boolean value, might be more generally useful.
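
For a sense of what one bit per Boolean value would mean, here is a rough emulation with existing primitives. It is a sketch under assumptions: it handles only a flat list whose length is a multiple of 8, the names are illustrative, and it says nothing about how a native bit-Boolean type would be implemented.

```j
bits =: 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1   NB. 16 Boolean values
packed =: a. {~ #. _8 ]\ bits              NB. one character per group of 8 bits
a. i. packed                               NB. 129 255
bits -: , (8#2) #: a. i. packed            NB. 1  (unpacking recovers the list)
```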

--

Devon McCormick

Flâneur


Marshall Lochbaum

Oct 14, 2025, 3:48:01 PM
to fo...@jsoftware.com
I have some notes on the topic based on Dyalog and BQN experience here:

https://mlochbaum.github.io/BQN/implementation/primitive/types.html

I think you're underestimating how much work bit booleans are. With 2D
or higher-rank arrays, even simple operations can be very tricky to
implement well. On the other hand, much of J's code for 1-byte characters
could be directly reused for integers, as structural manipulations and
searching don't depend on arithmetic or ordering.

Promotion to a higher type can be a practical problem, which is one
reason I emphasize 2-byte integers on that page. There's also the issue
of getting primitives to output a small type when applicable, which is
not that hard in a new interpreter but may be more of an annoyance when
changing a mature one.

Decreasing element size by a factor of 2 halves the per-element cost of
arithmetic, so it will be significant as long as the arrays aren't so
small that the program spends most of its time on constant per-array
costs. This holds at about 1000 elements in CBQN, which I wouldn't say
is too large. The following benchmark for <./\ seems reasonably
representative:

https://mlochbaum.github.io/bencharray/output/plot/scan-min.svg

(main page is https://mlochbaum.github.io/bencharray/pages/summary.html)

Marshall

Ak

Oct 14, 2025, 8:27:23 PM
to fo...@jsoftware.com

Can you share the typical rank and size of the data that you are storing?

Also, what is the elementwise range of an individual value in your overall case?


Ak

Ak

Oct 14, 2025, 10:56:08 PM
to fo...@jsoftware.com
...most arithmetic would be in danger of promoting the single-byte integers to floating point or full integer values...

More generally, it would be necessary to have the ability either to belay a promotion or to specify the numeric type to hold for the operation. (There was a tangential forum discussion about the bug in x A. y usage that breaks its sound use when moving between A. y <-> x A. y.)

A concrete example: once we have the ability in J to target GPUs, APUs, NPUs, etc. natively, consider the penalties caused by promotion.

A certain Processor Unit (PU) is able to process a tensor operation on an Integer Array (IA) of shape 128 x 128 x 128 in an Efficient Time (ET) that is some fixed constant number of cycles.

At any point where IA is treated as float, some form of the following takes place. IA will immediately be factored into subsets of 64 x 64 x 64 or, worse, 32 x 32 x 32 Factor Bricks (FB), depending on the PU. Once formed, each FB is processed individually, imposing a Penalty Factor (PF). After the FB group is processed, the solution must be recomposed into IA.

In the naive case (associative and commutative properties), the imposed PF is the Factor Size (FS) raised to the rank, all squared, times ET:

PF = (((FS y) ^ ($$y)) ^ 2) * ET

(2^3)^2 * ET for FB x64 and  (4^3)^2 * ET for FB x32. 
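
The two figures above can be checked quickly in J. This sketch assumes FS is the number of bricks per axis (128 % 64 is 2, 128 % 32 is 4) and that the rank is 3; the names are illustrative only, and the result is the multiplier applied to ET.

```j
side  =: 128
brick =: 64 32               NB. the two brick sizes mentioned above
FS    =: side % brick        NB. 2 4
r     =: 3                   NB. rank of IA
*: FS ^ r                    NB. 64 4096
```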

It can worsen dramatically if the FBs are not correctly aligned for processing. In many cases incorrect solutions will pervade (not considering common value discrepancies caused by moving between integer and floating point).

It is easy to imagine a trivial 7-step series of piped tensor operations where a promotion could be triggered at any individual step. Each trigger imposes a fresh PF, so any pipe would be subject to a multiplicative PF (given by the order of promotion triggers).
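
To see how quickly this compounds, assume as a simplification that every trigger costs the same PF and that the penalties simply multiply. With the 64 * ET case, one, two, and three triggers in a pipe would give:

```j
PF =: 64                     NB. assumed per-trigger penalty, the (2^3)^2 case
PF ^ 1 2 3                   NB. 64 4096 262144
```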

Yikes!


Ak

Henry Rich

3:23 AM
to forum
With 2- & 4-byte integers we do not promote to larger size, reasoning that you chose the short type wisely. Presumably we would do the same with bytes. Saturating addition would be needed. 
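
A minimal sketch of what saturating addition would mean for unsigned bytes, emulated here with ordinary J integers. The name `satadd` is hypothetical; it is not a J primitive and does not describe how a byte type would actually be implemented.

```j
satadd =: 255 <. 0 >. +      NB. clamp x + y to the range 0..255
200 100 5 satadd 100 100 5   NB. 255 200 10
250 + 10                     NB. 260  (ordinary integers do not saturate)
```

A signed variant under the same assumptions would clamp to _128..127 instead: `127 <. _128 >. +`.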

Henry Rich

Ak

5:44 AM
to fo...@jsoftware.com
I do recall that is true for the general case. 

Corner case:
      datatype A. 0 1 2 3
extended


The critical point remains. We need the ability to specify that an operator treats its arguments as a particular type. 


Ak

Pascal Jasmin

8:02 AM
to fo...@jsoftware.com
It's about neural network/LLM math where memory is scarce. The primary research is about bit sizes smaller than 8, but Henry's advice/reminder that a.&i. has special code is useful. The research aims to emulate FP in the -1 to 1 range at small bit widths with "scaled integer multiplication".
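
One common reading of "scaled integer multiplication" is fixed-point arithmetic, sketched below for 8-bit values. This is an assumption about the technique, not a description of the research: the scale of 127 and the names `quant`, `dequant`, and `qmul` are all hypothetical.

```j
quant   =: [: <. 0.5 + 127 * ]       NB. real in _1..1 -> integer in _127..127
dequant =: %&127                     NB. integer back to an approximate real
qmul    =: [: <. 0.5 + 127 %~ *      NB. multiply two quantized values, rescale
quant 0.5 _0.25                      NB. 64 _32
(quant 0.5) qmul quant _0.25         NB. _16
dequant _16                          NB. close to the exact product _0.125
```

Only the integers 64, _32, and _16 would need to be stored, each fitting in one byte; the rescaling by 127 keeps the product in the same range as its arguments.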

Pascal Jasmin

10:08 AM
to fo...@jsoftware.com
Saturation would work for both RGB cases and "network matrix weights". If the user/programmer needs overflow, then they "should" use the general number format.