Ring FastPro - Audit and Corrections

Bert Mariani

Mar 29, 2026, 11:15:20 AM

to The Ring Programming Language

Hello Mahmoud et ALL

Attached update: FastPro.c

Updated after an audit and corrections to the cases with Matrix functions.

Tested with all the scripts in C:\ring\samples\UsingFastPro; results were good.

===================================================

Example
FastPro - 2026-03-29 After Code improvements

Ring AllSum: Sum: 165
FastPro AllSum: AllSum: Sum: 165

Ring AllSum: Sum: 4089140.90 900x900 Time 1309 millisecs
FastPro AllSum: Sum: 4089140.90 900x900 Time 3 millisecs <=== 3x faster than the 2025-05-19 version (9 millisecs)

FINISHED

=====================================================
FastPro - Posted 2025-05-19 in Ring Group

Ring AllSum: Sum: 165
FastPro AllSum: AllSum: Sum: 165

Ring AllSum: Sum: 4084699.60 900x900 Time 1483 millisecs
FastPro AllSum: Sum: 4084699.60 900x900 Time 9 millisecs <===

FINISHED
=====================================================

Full audit and Corrections
- Fixed = vs == dimension-check bug in cases 206/306 (critical)
- Fixed outer-loop bounds (column count used as row count) in 206/306
- Fixed wrong list pointer (pList vs pListB) in 3506/3606 HorStack/VerStack
- Fixed out-of-bounds index (nRow used for pListB) in 4206 Append
- Fixed wrong output size (nEnd vs nEndB) in 1606 DotProduct
- Fixed dead read (discarded getdouble result) in 1706 Fill
- Fixed wrong initial max value (0 vs -DBL_MAX) in 1806 Maximum / 4006 ArgMax
- Fixed srand() called every invocation in 2006 Random
- Removed per-cell ring_list_isdouble() checks from all hot loops
- Hoisted row-pointer fetches out of inner loops throughout
- Cached ring_list_getsize() results before every loop
- Changed pow(x,2) to x*x in 2306 Square
- Moved dimension-mismatch checks before output allocation in 3506/3606
- Removed all unused nRowC/pRowC/nEndC variables
- Merged two-pass zero+diagonal fill into one pass in 1906 Identity
- Fixed pRowB fetched from pListC instead of pListB in 4106 DeRepeat
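
The 1806 Maximum / 4006 ArgMax item in the list above is easy to reproduce in isolation. The following is only a minimal sketch, assuming a plain `double` array instead of a Ring list; the names `max_buggy` and `max_fixed` are hypothetical and are not taken from fastpro.c. It shows why seeding the running maximum with 0 is wrong when every element is negative, and why `-DBL_MAX` is a safe seed.

```c
#include <assert.h>
#include <float.h>   /* DBL_MAX */
#include <stddef.h>

/* Buggy variant: a running maximum seeded with 0 can never drop below 0,
   so an all-negative input reports 0 instead of its largest element. */
double max_buggy(const double *v, size_t n)
{
    double best = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > best) best = v[i];
    }
    return best;
}

/* Fixed variant: -DBL_MAX is the most negative finite double, so every
   larger finite element replaces it. */
double max_fixed(const double *v, size_t n)
{
    assert(n > 0);               /* an empty input has no maximum */
    double best = -DBL_MAX;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > best) best = v[i];
    }
    return best;
}
```

For the input `{ -5.0, -2.0, -9.0 }` the buggy variant returns 0 while the fixed variant returns -2.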

==============================

Short Summary

The new fastpro.c is 2,513 lines — about 1,280 lines shorter than the original (3,793) because all the dead variables and redundant multi-pass patterns are gone. Every original function and all 90 switch cases are present.

Two classes of problem appear throughout the file:
- Bugs: code that is wrong and will produce incorrect results or crashes
- Performance: redundant work that costs speed every call

The same four anti-patterns recur in nearly every matrix case, so they are
defined once up front and then referenced by name in the per-case tables.

- MAT_ROWS(L) / MAT_COLS(L) macros read dimensions from row 1, not the last row
- Row pointers (pSubList, pSubListC, pSubListB) hoisted out of inner loops: one lookup per row instead of one per cell
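
The row-pointer hoisting described above can be illustrated without the Ring API. This is a hedged sketch, not the code in fastpro.c: the `Matrix` struct and `get_row` are hypothetical stand-ins for a Ring list of rows and a row lookup such as `ring_list_getlist()`, and the `lookups` counter exists only to make the saved work visible.

```c
#include <stddef.h>

/* Hypothetical stand-in for a matrix stored as a list of rows. */
typedef struct {
    double **rows;
    size_t nrows, ncols;
    unsigned long lookups;   /* counts row-pointer fetches */
} Matrix;

/* Stand-in for a row lookup; 1-based like Ring lists. */
double *get_row(Matrix *m, size_t r)
{
    m->lookups++;
    return m->rows[r - 1];
}

/* One lookup per cell: the row is fetched again for every column. */
double sum_per_cell(Matrix *m)
{
    double sum = 0.0;
    for (size_t r = 1; r <= m->nrows; r++)
        for (size_t c = 1; c <= m->ncols; c++)
            sum += get_row(m, r)[c - 1];
    return sum;
}

/* One lookup per row: the row pointer is hoisted out of the inner loop. */
double sum_per_row(Matrix *m)
{
    double sum = 0.0;
    for (size_t r = 1; r <= m->nrows; r++) {
        const double *row = get_row(m, r);
        for (size_t c = 1; c <= m->ncols; c++)
            sum += row[c - 1];
    }
    return sum;
}
```

On a 2x3 matrix the first version performs 6 lookups and the second performs 2; on a 900x900 matrix that is 810,000 lookups against 900.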

=========================================

Regards

Bert Mariani

fastpro.c

Bert Mariani

Mar 29, 2026, 11:34:46 AM

to The Ring Programming Language

Hello Mahmoud

Oh, oh ... there is a problem with the new FastPro.c that I just posted.

I forgot to test with

C:/ring/applications/imagepixel/ImagePixel.ring

It is not working properly: the colors are not changing!

Please hold off on this version of FastPro.c

The FastPro.c with the Transform3D is working with ImagePixel.ring

Regards

Bert Mariani

The Future of Programming

Mar 29, 2026, 11:41:35 AM

to The Ring Programming Language

Hello Bert

No problem, take your time

And thank you very much for the continuous updates and improvements

Greetings,

Mahmoud

To view this discussion visit https://groups.google.com/d/msgid/ring-lang/440dc1dd-70e4-4b24-9966-911692412d11n%40googlegroups.com.

Bert Mariani

Mar 30, 2026, 4:56:43 PM

to The Ring Programming Language

Hello Mahmoud

Attached updated

fastpro.c

Tested with

C:\ring\applications\imagepixel\ImagePixel.ring // good

C:\ring\samples\UsingFastPro\MandelbrotAnimate\MandelbrotAnimate.ring // good

C:\ring\samples\UsingFastPro\ ... Scripts // good

=========================

Fixes

RING_FUNC(ring_list2bytes) -- the third check tested item 2 twice instead of item 3

Old:
if ( ring_list_isdouble(pPointList,1) && ring_list_isdouble(pPointList,2) &&
ring_list_isdouble(pPointList,2) ) {

New:
if ( ring_list_isdouble(pPointList,1) && ring_list_isdouble(pPointList,2) &&
ring_list_isdouble(pPointList,3) ) {

======================
Syntax: "=" assigns | "==" compares

Old:
if ((nEnd = nEndB) && (nRow = nRowB)){

New:
if ((nEnd == nEndB) && (nRow == nRowB)){
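
A small standalone sketch of why the `=` version is critical. The function names are hypothetical, and pointers are used only so the side effect can be observed from outside. With `=`, the sizes of B overwrite the sizes of A and the condition is true whenever both B sizes are non-zero, so mismatched matrices pass the check.

```c
/* Buggy check: "=" assigns. The A sizes are overwritten with the B sizes
   and the result is true whenever nEndB and nRowB are both non-zero. */
int dims_match_buggy(int *nEnd, int nEndB, int *nRow, int nRowB)
{
    return ((*nEnd = nEndB) && (*nRow = nRowB)) ? 1 : 0;
}

/* Fixed check: "==" compares and leaves the sizes untouched. */
int dims_match_fixed(int nEnd, int nEndB, int nRow, int nRowB)
{
    return (nEnd == nEndB) && (nRow == nRowB);
}
```

With A sized 3 columns by 2 rows and B sized 5 by 4, the fixed check reports a mismatch, while the buggy check reports a match and leaves A's recorded size as 5 by 4.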

=====================
Fixed Loops -- double loops: list size read once before the loops instead of on every inner-loop test

Old:
for ( nCol = nStart ; nCol <= nEnd ; nCol++ ) {
    for ( x = 1 ; x <= ring_list_getsize(pList) ; x++ ) {

New:
int nListSize1004 = ring_list_getsize(pList) ;
for ( nCol = nStart ; nCol <= nEnd ; nCol++ ) {
    for ( x = 1 ; x <= nListSize1004 ; x++ ) {
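
To make the cost of the old loop shape concrete, here is a minimal sketch that counts size queries. The `List` struct and `list_getsize` are hypothetical stand-ins for a Ring list and `ring_list_getsize()`; the real function does not count its calls.

```c
#include <stddef.h>

/* Hypothetical stand-in for a list whose size query we want to count. */
typedef struct {
    size_t size;
    unsigned long size_calls;
} List;

size_t list_getsize(List *l)
{
    l->size_calls++;
    return l->size;
}

/* Size queried in the loop condition: called on every inner-loop test. */
unsigned long visit_uncached(List *l, int nStart, int nEnd)
{
    unsigned long visits = 0;
    for (int nCol = nStart; nCol <= nEnd; nCol++)
        for (size_t x = 1; x <= list_getsize(l); x++)
            visits++;
    return visits;
}

/* Size cached before the loops: called exactly once. */
unsigned long visit_cached(List *l, int nStart, int nEnd)
{
    unsigned long visits = 0;
    size_t nListSize = list_getsize(l);
    for (int nCol = nStart; nCol <= nEnd; nCol++)
        for (size_t x = 1; x <= nListSize; x++)
            visits++;
    return visits;
}
```

For a list of 4 items and 3 columns both versions visit 12 cells, but the uncached loop makes 15 size queries (5 per column, including the final failing test) and the cached loop makes 1.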

======================

SPEED Test - Old :: New
(in each pair below, the first block is the old fastpro.c and the second block is the new one)

Random Scripts:

RING ...SigmoidPrime: 900x900 Speed Test: 1544 millisecs
FASTPRO SigmoidPrime: 900x900 Speed Test: 358 millisecs

RING ...SigmoidPrime: 900x900 Speed Test: 1739 millisecs
FASTPRO SigmoidPrime: 900x900 Speed Test: 135 millisecs

---------------------

RING ...Mtanh: 900x900 Speed Test: 1425 millisecs
FASTPRO Tanh: 900x900 Speed Test: 300 millisecs

RING ...Mtanh: 900x900 Speed Test: 1704 millisecs
FASTPRO Tanh: 900x900 Speed Test: 126 millisecs

---------------------

RING LeakyReLu 900x900 Time 2666 millisecs
FastPro LeakyReLu 900x900 Time 258 millisecs

RING LeakyReLu 900x900 Time 3283 millisecs
FastPro LeakyReLu 900x900 Time 79 millisecs

----------------------

RING LeakyReLuPrime 900x900 Time 2615 millisecs
FastPro LeakyReLuPrime 900x900 Time 262 millisecs

RING LeakyReLuPrime 900x900 Time 2344 millisecs
FastPro LeakyReLuPrime 900x900 Time 83 millisecs

----------------------

RING ReLu 900x900 Time 2691 millisecs
FastPro ReLu 900x900 Time 272 millisecs

RING ReLu 900x900 Time 3325 millisecs
FastPro ReLu 900x900 Time 84 millisecs

--------------------------

RING MExp 900x900 Time 1385 millisecs
FastPro Exp900x900 Time 270 millisecs

RING MExp 900x900 Time 1674 millisecs
FastPro Exp900x900 Time 104 millisecs

--------------------------

FastPro SoftMax 500x500 Time 301 millisecs 122x Faster
RING Softmax 500x500 Time 36914 millisecs

FastPro SoftMax 500x500 Time 153 millisecs
RING Softmax 500x500 Time 58327 millisecs

------------------------

fastpro.c

Mahmoud Fayed

Mar 30, 2026, 7:43:02 PM

to The Ring Programming Language

Hello Bert

Thank you very much :D

Applied in this commit: Update extensions/ringfastpro/fastpro.c - Better Performance - By Ber… · ring-lang/ring@2ad4213

Keep up the GREAT WORK :D

Greetings,

Mahmoud

Bert Mariani

Apr 1, 2026, 10:15:25 AM

to The Ring Programming Language

Hello Mahmoud

Further improvements, optimization

Attached updated

fastpro.c

MatrixAddSubMul.ring // fix ==> C:\ring\samples\UsingFastPro

Even ImagePixel.ring runs faster. Changed timing to msecs to see new speed.

==============================

SPEED IMPROVEMENTS

================================

ImagePixel.ring

Image W-H: 1800-1200 Size: 2160000
Size (bytes): 6480000
Width : 1800
Height: 1200
Channels: 3
GetPixelColors.....: Total Time: 0.050 seconds
Change-ColorValue..: Total Time: 0.050 seconds <=== Old
DrawRBGAImagePixels: Total Time: 0.060 seconds

Image W-H: 1800-1200 Size: 2160000
Size (bytes): 6480000
Width : 1800
Height: 1200
Channels: 3
GetPixelColors.....: Total Time: 28 msecs
Change-ColorValue..: Total Time: 18 msecs <=== New
DrawRBGAImagePixels: Total Time: 43 msecs

================================

RING Append Axis: 1 500x500 Time 350 millisecs
FastPro Append: Axis: 1 500x500 Time 42 millisecs

RING Append Axis: 0 500x500 Time 389 millisecs
FastPro Append: Axis: 0 500x500 Time 10 millisecs

========================
RING AtLeast2D 900x900 Time 1214 millisecs
FastPro AtLeast2D 900x900 Time 348 millisecs

RING AtLeast2D 900x900 Time 1469 millisecs
FastPro AtLeast2D 900x900 Time 171 millisecs

===========================
FastPro Ravel 900x900 Time 228 millisecs
RING Ravel 900x900 Time 551 millisecs

FastPro Ravel 900x900 Time 18 millisecs
RING Ravel 900x900 Time 562 millisecs

=========================

FastPro SoftMax 500x500 Time 301 millisecs

RING Softmax 500x500 Time 36914 millisecs

FastPro SoftMax 500x500 Time 124 millisecs
RING Softmax 500x500 Time 42708 millisecs

=========================
MultDot Speed Test:1846 millisecs
FastPro Speed Test:71 millisecs

MultDot Speed Test:2374 millisecs
FastPro Speed Test:34 millisecs

==================================
RING Transpose Time: 131
FastPro Transpose Matrix Time: 97

RING Transpose Time: 148
FastPro Transpose Matrix Time: 33

=========================

Ring AllSum: Sum: 4084699.60 900x900 Time 1483 millisecs
FastPro AllSum: Sum: 4084699.60 900x900 Time 9 millisecs

Ring AllSum: Sum: 4085597.00 900x900 Time 1663 millisecs
FastPro AllSum: Sum: 4085597.00 900x900 Time 3 millisecs

==========================

/* DETAILS
** OPTIMIZATIONS APPLIED (2026):
**
** ROUND 1 — First pass:
** 1. ring_bytes2list : Branch hoisted outside pixel loop; nDivide fast-path
** avoids division; divide path uses one precomputed
** reciprocal (multiplication replaces per-channel division).
** 2. ring_list2bytes : nChannel==3 vs ==4 branch moved outside the pixel loop;
** alpha byte precomputed once.
** 3. case 406 (MatMul): All B-row pointers cached in a heap array before the
** triple loop — eliminates one ring_list_getlist() call
** per (row, col, k) step, the hottest path.
** 4. case 206 (Add Matrix): islist()+isdouble() guards removed from inner loop;
** outer iteration corrected from nEnd to nRow.
** 5. Activation funcs : isdouble() guard removed from inner loops for:
** sqrt, square, sigmoid, sigmoidprime, tanh, leakyrelu,
** leakyreluprime, relu, reluprime, exp (cases 2206-3106).
** 6. case 2106 (Mean) : isdouble() guard removed from inner loop.
** 7. case 4306 (AllSum): isdouble() guard removed from inner loop.
** 8. case 4506 (EMul) : isdouble() guard removed from inner loop.
**
** ROUND 2 — Second pass:
** 9. case 306 (Sub Matrix): same islist()/isdouble() removal + outer bound fix.
** 10. case 1406 (Transpose): pSubList (A-row) re-fetch eliminated from inner loop.
** 11. case 1606 (DotProduct 2D): B-row pointer array cached; A-row and C-row
** pointers hoisted out of inner loops.
** 12. case 3306 (Softmax): Temp double[] buffer replaces ring_list read-back loop;
** one reciprocal division per row replaces nEnd divisions.
** ROUND 3 — Third pass:
** 14. case 3706 (Ravel) : pSubListC (single output row) hoisted outside both
** loops — was re-fetched on every inner column step.
** Intermediate k variable eliminated.
** 15. case 3906 (AtLeast2D): pSubListC hoisted outside loop — same pattern.
** Intermediate valueA variable eliminated.
** 16. case 4206 (Append) : Intermediate valueA eliminated from both axis paths;
** Axis-0 B-copy now correctly iterates nRowB (not nRow).
** Axis-1 B-copy now correctly iterates nEndB (not nEnd).
** 17. ring_mandelbrot : TWO PASSES FUSED INTO ONE.
** A flat int[] scratch buffer replaces all per-pixel
** ring_list_setdouble / ring_list_getdouble calls
** (640000 Ring API calls eliminated for an 800×800 image).
** Color table made 'static const' (ROM, not stack).
** 2.0 literal used instead of integer 2 in zI formula.
*/
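
Item 1 in the comment above (branch hoisted outside the pixel loop, plus one precomputed reciprocal) can be sketched on plain arrays. This is an illustration under assumptions, not the code in ring_bytes2list: the name `bytes_to_doubles` is hypothetical, and treating a divisor of 1 as the fast path is a guess at what "nDivide fast-path" means.

```c
#include <stddef.h>

/* Hypothetical helper: convert n bytes to doubles, scaled by 1/divide. */
void bytes_to_doubles(const unsigned char *src, double *dst,
                      size_t n, double divide)
{
    if (divide == 1.0) {
        /* Fast path (assumed): no scaling, and the branch is decided
           once here instead of once per pixel. */
        for (size_t i = 0; i < n; i++)
            dst[i] = (double)src[i];
    } else {
        /* One division in total; the loop only multiplies. */
        const double inv = 1.0 / divide;
        for (size_t i = 0; i < n; i++)
            dst[i] = (double)src[i] * inv;
    }
}
```

One design note: multiplying by a precomputed reciprocal can differ from a true division in the last bit for divisors that are not powers of two (255, for example), which is normally acceptable for pixel data but worth knowing when comparing results against the pure Ring version.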
=================

Regards

Bert Mariani

fastpro.c

MatrixAddSubMul.ring

Mahmoud Fayed

Apr 1, 2026, 12:23:06 PM

to The Ring Programming Language

Hello Bert

Thank you very much for the updates :D

Applied in this commit: Update RingFastPro - extensions/ringfastpro/fastpro.c - Better Perfor… · ring-lang/ring@bd177b2