More of my philosophy about race detection and concurrency and more..

World-News2100

Dec 1, 2021, 6:00:40 PM
Hello,



More of my philosophy about race detection and concurrency and more..

I am a white Arab from Morocco, and I think I am smart, since I have
also invented many scalable algorithms and other algorithms.


I have just looked quickly at different race detectors, and I think that
Intel Thread Checker, from the US company Intel, is also very good. It
needs to instrument either the C++ source code or the compiled binary so
that every memory reference and every standard Win32 synchronization
primitive becomes observable. Instrumenting from the source code is very
good, since it lets me port my scalable algorithms, for example by
emulating some native Windows synchronization APIs, and it is also
business friendly. You can read about different race detectors and about
Intel Thread Checker here:

https://docs.microsoft.com/en-us/archive/msdn-magazine/2008/june/tools-and-techniques-to-identify-concurrency-issues

So I think that race detectors for other programming languages should
also provide this source-level instrumentation.

More of my philosophy about programming languages, lock-based
systems and more..

I think we have to be optimistic about lock-based systems, since
race-condition detection can be done in polynomial time, as you can see
by reading the following paper:

https://arxiv.org/pdf/1901.08857.pdf

Or by reading the following book:

https://books.google.ca/books?id=f5BXl6nRgAkC&pg=PA421&lpg=PA421&dq=race+condition+detection+and+polynomial+complexity&source=bl&ots=IvxkORGkQ9&sig=ACfU3U2x0fDnNLHP1Cjk5bD_fdJkmjZQsQ&hl=en&sa=X&ved=2ahUKEwjKoNvg0MP0AhWioXIEHRQsDJc4ChDoAXoECAwQAw#v=onepage&q=race%20condition%20detection%20and%20polynomial%20complexity&f=false

So I think we can continue to program with lock-based systems. As for
the composability of lock-based systems, read my previous thoughts about
it below:

More of my philosophy about composability and the Haskell functional
language and more..

I have just quickly read the following article about composability,
and I invite you to read it carefully:

https://bartoszmilewski.com/2014/06/09/the-functional-revolution-in-c/

I do not agree with the above article. Its author programs in the
Haskell functional language, and for him functional programming, as in
Haskell, is the way that allows composability in the presence of
concurrency, whereas lock-based systems do not allow it. I don't agree
with him, and I will give you the logical proof: read what is said in
an ACM article written by Bryan M. Cantrill and Jeff Bonwick of Sun
Microsystems:

You can read about Bryan M. Cantrill here:

https://en.wikipedia.org/wiki/Bryan_Cantrill

And you can read about Jeff Bonwick here:

https://en.wikipedia.org/wiki/Jeff_Bonwick

And here is what the article says about the composability of lock-based
systems in the presence of concurrency:

"Design your systems to be composable. Among the more galling claims of
the detractors of lock-based systems is the notion that they are somehow
uncomposable:

“Locks and condition variables do not support modular programming,”
reads one typically brazen claim, “building large programs by gluing
together smaller programs[:] locks make this impossible.” [9] The claim, of
course, is incorrect. For evidence one need only point at the
composition of lock-based systems such as databases and operating
systems into larger systems that remain entirely unaware of lower-level
locking.

There are two ways to make lock-based systems completely composable, and
each has its own place. First (and most obviously), one can make locking
entirely internal to the subsystem. For example, in concurrent operating
systems, control never returns to user level with in-kernel locks held;
the locks used to implement the system itself are entirely behind the
system call interface that constitutes the interface to the system. More
generally, this model can work whenever a crisp interface exists between
software components: as long as control flow is never returned to the
caller with locks held, the subsystem will remain composable.

Second (and perhaps counterintuitively), one can achieve concurrency and
composability by having no locks whatsoever. In this case, there must be
no global subsystem state—subsystem state must be captured in
per-instance state, and it must be up to consumers of the subsystem to
assure that they do not access their instance in parallel. By leaving
locking up to the client of the subsystem, the subsystem itself can be
used concurrently by different subsystems and in different contexts. A
concrete example of this is the AVL tree implementation used extensively
in the Solaris kernel. As with any balanced binary tree, the
implementation is sufficiently complex to merit componentization, but by
not having any global state, the implementation may be used concurrently
by disjoint subsystems—the only constraint is that manipulation of a
single AVL tree instance must be serialized."
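
The first approach quoted above (locking kept entirely inside the
subsystem) can be sketched as follows. This is only a minimal sketch,
assuming Free Pascal or Delphi with the SyncObjs unit; the TCounter class
is hypothetical and does not come from the article:

```pascal
program composable_locking;

{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  SyncObjs;

type
  { Hypothetical subsystem: the lock is private, so callers never see it,
    and control never returns to a caller while the lock is held. }
  TCounter = class
  private
    FLock: TCriticalSection;
    FValue: Integer;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Increment;
    function Value: Integer;
  end;

constructor TCounter.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FValue := 0;
end;

destructor TCounter.Destroy;
begin
  FLock.Free;
  inherited Destroy;
end;

procedure TCounter.Increment;
begin
  FLock.Acquire;
  try
    Inc(FValue);
  finally
    FLock.Release;   { always released before returning to the caller }
  end;
end;

function TCounter.Value: Integer;
begin
  FLock.Acquire;
  try
    Result := FValue;
  finally
    FLock.Release;
  end;
end;

var
  C: TCounter;
begin
  C := TCounter.Create;
  try
    C.Increment;
    C.Increment;
    WriteLn(C.Value);   { prints 2 }
  finally
    C.Free;
  end;
end.
```

Because no method returns with FLock held, code that uses TCounter can be
composed with other lock-based code without knowing that a lock exists.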

Read more here:

https://queue.acm.org/detail.cfm?id=1454462

More about channels, concurrency, lightweight tasks and more..

I think I am smart and I am like a software architect, and I am an
inventor of many scalable algorithms and other algorithms, so I will
continue to explain more:

From a software architecture point of view, I think that the Go
programming language from Google has made some big architectural
errors. For example, if you look at how channels are designed in the Go
source code, you will notice that, as far as I can tell, the mutex of
the channels spins for about 1 ms or 2 ms. That is not good, since it is
not efficient: by analogy, it is like the case where you have to
prevent too much cache-coherence traffic by using queue locks. Spinning
for 1 ms or 2 ms makes things slow, since thread switching is slow, and
the mutex is also not convoy resistant. It is not the only
disadvantage; I will talk about the other disadvantages of Go and more
in my next book about parallel programming and concurrency, which I will
sell. So I think that the Go and Rust languages are not so efficient;
read all my following thoughts to understand more:

I have just looked quickly at the following video about concurrency,
channels and lightweight tasks, and I invite you to watch it:

Andy Wingo - Channels, Concurrency, and Cores: A new Concurrent ML
implementation

https://www.youtube.com/watch?v=7IcI6sl5oBc

I think I am smart, and I think that the solution in the above video
has the same problem as the Go language from Google, with its channels
and lightweight goroutines. I have looked at Go, and its implementation
of channels uses mutexes that are not so efficient. By logical analogy,
the problem is like a "monopolization" that is not so business
friendly: others can still invent queues, locks or mutexes that are
much more efficient, but how can they sell them "advantageously" to
people around the world if Google or other companies are monopolizing
with their Go language and the like? So I think that this type of
concurrency, with lightweight tasks and channels built directly into a
language, has the disadvantage of not being business friendly.

And I think that Go and Rust have another disadvantage, and here it is:

More of my philosophy about the excessive purism of Rust and
more..

I think I am smart, and I think that the Rust compiler and language are
too "purist", like the excessive purism of the Haskell functional
programming language. Rust and Go don't provide us with OOP
inheritance, and that is too restrictive; it is a deficiency of Rust,
since inheritance has advantages as well as disadvantages. We have to
balance well and also provide inheritance in order to be efficient, so
I think that C++ and C# are better than Rust in this regard. Here are
the advantages and disadvantages of OOP inheritance:

https://www.ianswer4u.com/2017/09/oops-inheritance-advantages.html

As a software developer you have to become more efficient and
productive. So you need to make sure the code you write is easily
reusable and maintainable. And, among other things, this is what
inheritance gives you - the ability to reuse without reinventing the
wheel, as well as the ability to easily maintain your base object
without having to perform maintenance on all similar objects.

More of my philosophy about SuperPascal and about CSP (Communicating
Sequential Processes) and more..

I think I am smart, and I also program in the Object Pascal of Delphi
and Free Pascal. I think I am also a smart "Wirthian" programmer of the
Wirthian family of ALGOL-like languages, since I have programmed in
Pascal and in SuperPascal (you can read about it here:
https://en.wikipedia.org/wiki/SuperPascal). SuperPascal was an
interesting enhancement of the Pascal language: it brought a "Forall"
statement that is like a parallel for loop, and it brought "channels"
that look like Go channels and that permit writing parallel programs.
So the SuperPascal channels allowed us to program as in CSP
(Communicating Sequential Processes), a formal language for describing
patterns of interaction in concurrent systems. CSP is a member of the
family of mathematical theories of concurrency known as process
algebras, or process calculi, based on message passing via channels. So
SuperPascal channels allowed us to avoid parallel bugs such as race
conditions. But I think that channels can also be used in a simpler
way, as in the following article, which I think is a much better
enhancement; read it to learn the simpler way of using Go channels or
SuperPascal channels to avoid race conditions:

https://fodor.org/blog/go-avoiding-race-conditions/
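
The idea of avoiding races by letting a single owner update the shared
state, while the other threads only send messages, can be sketched in
Object Pascal as well. This is a rough sketch under assumptions:
TIntChannel and TProducer are hypothetical names, the channel is a plain
locked queue plus an event rather than a real Go or SuperPascal channel,
and the value 0 is used as an assumed "producer finished" marker:

```pascal
program channel_sketch;

{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  Classes, SyncObjs;

type
  { Hypothetical unbounded channel of integers (it never compacts). }
  TIntChannel = class
  private
    FLock: TCriticalSection;
    FReady: TEvent;
    FItems: array of Integer;
    FHead: Integer;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Send(Value: Integer);
    function Receive: Integer;
  end;

  { Producers never touch the total; they only send messages. }
  TProducer = class(TThread)
  private
    FChan: TIntChannel;
    FCount: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(AChan: TIntChannel; ACount: Integer);
  end;

constructor TIntChannel.Create;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  FReady := TEvent.Create(nil, False, False, '');   { auto-reset event }
end;

destructor TIntChannel.Destroy;
begin
  FReady.Free;
  FLock.Free;
  inherited Destroy;
end;

procedure TIntChannel.Send(Value: Integer);
begin
  FLock.Acquire;
  try
    SetLength(FItems, Length(FItems) + 1);
    FItems[High(FItems)] := Value;
  finally
    FLock.Release;
  end;
  FReady.SetEvent;   { wake the receiver if it is waiting }
end;

function TIntChannel.Receive: Integer;
begin
  Result := 0;
  while True do
  begin
    FLock.Acquire;
    try
      if FHead < Length(FItems) then
      begin
        Result := FItems[FHead];
        Inc(FHead);
        Exit;
      end;
    finally
      FLock.Release;
    end;
    FReady.WaitFor(High(Cardinal));   { block until something is sent }
  end;
end;

constructor TProducer.Create(AChan: TIntChannel; ACount: Integer);
begin
  FChan := AChan;
  FCount := ACount;
  inherited Create(False);
end;

procedure TProducer.Execute;
var
  I: Integer;
begin
  for I := 1 to FCount do
    FChan.Send(1);
  FChan.Send(0);   { assumed marker: this producer is done }
end;

var
  Chan: TIntChannel;
  P1, P2: TProducer;
  Total, Done, V: Integer;
begin
  Chan := TIntChannel.Create;
  P1 := TProducer.Create(Chan, 1000);
  P2 := TProducer.Create(Chan, 1000);
  Total := 0;
  Done := 0;
  while Done < 2 do   { only the main thread ever updates Total }
  begin
    V := Chan.Receive;
    if V = 0 then Inc(Done) else Inc(Total, V);
  end;
  P1.WaitFor; P2.WaitFor;
  P1.Free; P2.Free;
  Chan.Free;
  WriteLn(Total);
end.
```

Total needs no lock of its own here, because it is read and written by
one thread only; the locking is confined to the channel.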

To get an idea of SuperPascal, you can look at its source code, in
Free Pascal, here on GitHub:

https://github.com/octonion/superpascal

So, as you can see, the SuperPascal programming language, which was
invented in 1993, preceded the Go programming language in providing
channels and the like, which permit parallel programming while avoiding
race conditions and similar parallel programming bugs.

But you have to know that I am smart, and I have also enhanced the
Object Pascal of Free Pascal and Delphi by inventing the following
threadpool, which scales well and supports a parallel for loop. You can
read about it carefully here on my website:

https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

And I have also enhanced the Object Pascal of Free Pascal and Delphi by
inventing a scalable reference counting with efficient support for weak
references. You can take a careful look at it here on my website:

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

So, as you can see, I am also an inventor of many scalable algorithms
and other algorithms.

More of my philosophy about stack memory allocation and about
preemptive and non-preemptive timesharing..

I think I am smart, and as you will notice in my thoughts below, I am
abstracting smartly to make you understand preemptive and non-preemptive
timesharing. Beyond that, I will also give you an interesting stack
memory allocation algorithm in Delphi and Free Pascal, to be used with
my sophisticated stackful coroutines library below; I will extend that
library to support it. Here it is:

--

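const
  { Illustrative values (assumptions); adjust them to your program. }
  limit = 16;    // number of size classes (one free list per class)
  min   = 1;     // first usable index of the memory array
  max   = 1024;  // last usable index of the memory array
  empty = 0;     // marks an empty free list; must lie outside min..max

var
  pool: array [1..limit] of integer;    // head of the free list per class
  memory: array [min..max] of integer;  // the memory being managed
  top: integer;                         // last cell handed out so far

procedure initialize;
var
  index: integer;
begin
  for index := 1 to limit do
    pool[index] := empty;
  top := min - 1
end;

{ All blocks of one class (index) must have the same length. }
procedure allocate(index, length: integer; var address: integer);
begin
  address := pool[index];
  if address <> empty then
    pool[index] := memory[address]   // reuse a released block
  else
  begin
    // Check before moving top, so a failed request leaves the state
    // intact. Exception comes from the SysUtils unit.
    if top + length > max then
      raise Exception.Create('Stack overflow..');
    address := top + 1;
    top := top + length
  end
end;

procedure release(index, address: integer);
begin
  memory[address] := pool[index];    // the freed block stores the next link
  pool[index] := address
end;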

--


More of my philosophy about the paper and about preemptive and
non-preemptive timesharing and more..

I forgot to say who wrote the following paper about cooperative and
preemptive tasking:

https://users.ece.cmu.edu/~koopman/pubs/koopman90_HeavyweightTasking.pdf

It was written by Professor Phil Koopman of the Department of
Electrical and Computer Engineering at Carnegie Mellon University:

https://users.ece.cmu.edu/~koopman/personal.html

And note that, in my thoughts below, I call cooperative and preemptive
tasking "non-preemptive and preemptive timesharing".

More of my philosophy about the Intel 8051 controller and about
preemptive and non-preemptive timesharing and more..

I have just quickly read the following interesting paper. It says that
judicious use of cooperative tasking techniques can often meet an
embedded system's multitasking requirements, while giving better
performance and a simpler software environment than a preemptive
multitasker, so read it carefully here:

https://users.ece.cmu.edu/~koopman/pubs/koopman90_HeavyweightTasking.pdf

And notice that the above paper also says that, to meet the
requirements with cooperative multitasking, you have to move the
time-critical code to interrupt service routines. Let us look, for
example, at the Intel 8051 controller here:

https://www.electronicwings.com/8051/introduction-to-8051-controller

As you can see, it has many hardware interrupts that you can use to
make cooperative tasking efficient, and I think it also comes with two
timer interrupts that you can use to implement preemptive multitasking
if you want. You also have to know about interrupt latency when
programming embedded systems with hardware controllers: the hardware
interrupts have to be serviced fast enough and often enough, so you
shouldn't disable interrupts for too long a period of time. Just to
give you an idea, a nonbuffered UART (Universal Asynchronous Receiver
Transmitter) operating at 38,400 bits per second will interrupt every
208 microseconds. This is (1/38,400)*8, because it interrupts for every
byte (8 bits). A processor or controller running at 25 MHz executes
most of its instructions in 2 or 3 system-clock periods. At 3 periods
that is 120 nanoseconds per instruction ((1/25,000,000)*3). In theory,
this means you could execute roughly 1,730 instructions in the
interrupt interval. That was only in theory; now you have to do the
reality check. You must take into consideration that there are more
interrupts than just that communication channel. The timer interrupt
will be firing every so often. The communication interrupt itself will
have interrupts disabled for a good period of time, and there is also
the task switch, which can be expensive, so you have to think about it
efficiently.
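
The arithmetic above can be checked with a few lines of Pascal. This is
only a sketch of the calculation; the constant names are illustrative,
and it takes the upper end (3) of the "2 or 3" clock periods per
instruction:

```pascal
program uart_budget;

{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}
{$APPTYPE CONSOLE}

const
  BaudRate       = 38400;      { bits per second, from the text }
  BitsPerByte    = 8;          { the text counts 8 bits per interrupt }
  ClockHz        = 25000000;   { 25 MHz }
  ClocksPerInstr = 3;          { upper end of "2 or 3" clock periods }

var
  IntervalUs, InstrNs: Double;
begin
  IntervalUs := BitsPerByte / BaudRate * 1.0e6;    { time between interrupts }
  InstrNs    := ClocksPerInstr / ClockHz * 1.0e9;  { time per instruction }
  WriteLn('Interrupt interval: ', IntervalUs:0:1, ' us');
  WriteLn('Instruction time:   ', InstrNs:0:0, ' ns');
  WriteLn('Instruction budget: ', Trunc(IntervalUs * 1000.0 / InstrNs));
end.
```

The unrounded budget comes out at 1736 instructions; starting from the
rounded 208 microseconds gives about 1,733, so the figure of roughly
1,730 is in the right range. Note that a common 8N1 serial frame also
carries a start bit and a stop bit, so with 10 bit times per byte the
interval would be about 260 microseconds instead.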

So I invite you to read my thoughts below about preemptive and
non-preemptive timesharing, to understand much more:

More of my philosophy about preemptive and non-preemptive timesharing
and more..

I have just taken a smart look at the Modula-2 language (Modula-2 is a
structured, procedural programming language developed between 1977 and
1985 by Niklaus Wirth at ETH Zurich, who also developed the Pascal
language; read about Niklaus Wirth here:
https://en.wikipedia.org/wiki/Niklaus_Wirth). I think Modula-2 was among
the first languages to provide preemptive and non-preemptive
timesharing with coroutines. The preemptive timesharing in Modula-2
uses interrupt handling through IOTRANSFER, but that is best reserved
for programs that will run without operating system support. Installing
an interrupt handler on a multiuser system is not feasible, because
doing so would affect other users. (For this reason, IOTRANSFER is not
a mandatory feature of Modula-2.) Even on single-user systems,
IOTRANSFER can be difficult to use, because installing an interrupt
handler causes the old interrupt handler (which most likely belongs to
the operating system) to be lost. This is why I think that the best way
on modern operating systems is to use non-preemptive timesharing with
coroutines, and why I am providing you with my sophisticated
implementation of stackful coroutines; read about it in my thoughts
below.
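
To make the idea of non-preemptive timesharing concrete, here is a
deliberately simplified sketch. It does not use coroutines and it is not
the stackful coroutines library; instead each hypothetical TTask does one
small slice of work in Step and then returns, which plays the role of
the voluntary yield:

```pascal
program coop_sketch;

{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Hypothetical task: Step performs one slice of work and returns
    False once the task has finished. }
  TTask = class
  private
    FName: string;
    FRemaining: Integer;
  public
    constructor Create(const AName: string; ASteps: Integer);
    function Step: Boolean;
  end;

constructor TTask.Create(const AName: string; ASteps: Integer);
begin
  inherited Create;
  FName := AName;
  FRemaining := ASteps;
end;

function TTask.Step: Boolean;
begin
  Dec(FRemaining);
  WriteLn(FName, ' runs, ', FRemaining, ' step(s) left');
  Result := FRemaining > 0;
end;

var
  Tasks: array [0..1] of TTask;
  Active, I: Integer;
begin
  Tasks[0] := TTask.Create('A', 2);
  Tasks[1] := TTask.Create('B', 3);
  Active := Length(Tasks);
  { Round-robin scheduler: no timer and no interrupt handler. A task
    keeps the CPU only until its Step returns. }
  while Active > 0 do
    for I := Low(Tasks) to High(Tasks) do
      if Tasks[I] <> nil then
        if not Tasks[I].Step then
        begin
          Tasks[I].Free;
          Tasks[I] := nil;
          Dec(Active);
        end;
end.
```

A stackful coroutine gives the same scheduling behaviour while letting
each task keep its local variables and call depth across a yield, which
these step objects have to store by hand in their fields.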

More of my philosophy about timesharing that is a Solution to Computer
Bottlenecks..

I invite you to look at the following very interesting video about
timesharing that is a Solution to Computer Bottlenecks:

https://www.youtube.com/watch?v=Q07PhW5sCEk

I think I am smart, and you have to understand one important thing:
what is the difference between a software architect and a software
engineer? I think there is an important difference, and it is
abstracted in the following question:

"How is it made?"

I think that software engineering works at a higher level than software
architecture. This is why you will notice that I am quickly
implementing a sophisticated stackful coroutines library, and quickly
implementing setjmp() and longjmp() in x64 assembler or machine code;
read my thoughts about them below. My sophisticated stackful coroutines
library does a kind of timesharing, as in the above video, but I think
that there are two kinds of timesharing: the preemptive one and the
non-preemptive one. The difference is that the preemptive one
interrupts the coroutines with a timer, from an external scheduler that
takes the form of a function. Notice below that I am implementing
non-preemptive timesharing in my sophisticated coroutines library. You
have to be smart and notice that my way of doing it is like the
software architect's way, since I am implementing it from the lowest
level, with x64 assembler routines that are part of the non-preemptive
scheduler. But not only that: look also at how I am implementing a
sophisticated and much richer interface in my stackful coroutines
library, so it is both software architecture and software engineering.
Here are all my thoughts showing how I am implementing it quickly. Read
them carefully, since you also have to know what the problem is with the
stack frames when using setjmp() and longjmp() to implement coroutines:

More of my philosophy and a precision about the link to the article and
more..

Notice that the article linked below, which shows the problem of
implementing coroutines with just setjmp() and longjmp(), comes from
the course called "CS4411 Operating Systems" at Michigan Technological
University, in the last semester of the second year. I think I am smart
and those courses are easy for me. I invite you to read about this
course, which requires both "CS3331 Concurrent Computing" and "CS3421
Computer Organization"; here it is:

http://www.csl.mtu.edu/cs4411.ck/www/Home.html

More of my philosophy about coroutines and about setjmp() and longjmp()..

I think I am smart, and I will say that with setjmp() and longjmp() you
can implement a generator or the like, but you cannot implement
coroutines with just setjmp() and longjmp(). To understand why, I
invite you to read the following article. It shows that when you yield
from a first function to the main body of a program with longjmp(), and
the main body then calls another function, the stack frame of the first
function can be overwritten, so that jumping back into it with
longjmp() no longer works correctly. When you understand this, you will
not use setjmp() and longjmp() alone to implement coroutines. So read
the following article to understand the problem with the stack frames;
I understand it easily:

https://www.csl.mtu.edu/cs4411.ck/www/NOTES/non-local-goto/coroutine.html

This is why I have also implemented my sophisticated stackful
coroutines library, which solves this problem. You can read about it
and download it from here:

https://sites.google.com/site/scalable68/object-oriented-stackful-coroutines-library-for-delphi-and-freepascal

More of my philosophy about setjmp() and longjmp() and generators and
coroutines..

I have just quickly implemented setjmp() and longjmp() in x64
assembler, and after that I quickly implemented a good example of a
generator with my setjmp() and longjmp(); look at it below. In computer
science, a generator is a routine that can be used to control the
iteration behaviour of a loop. All generators are also iterators. A
generator is very similar to a function that returns an array, in that
a generator has parameters, can be called, and generates a sequence of
values. However, instead of building an array containing all the values
and returning them all at once, a generator yields the values one at a
time, which requires less memory and allows the caller to start
processing the first few values immediately. In short, a generator
looks like a function but behaves like an iterator. Here are my
implementations in Free Pascal and Delphi, and they are working:

Here is my first unit, which implements longjmp() and setjmp(). Notice
how I am saving the non-volatile registers and how I am coding it in
x64 assembler:

======


{ Volatile registers: The calling program assumes registers
  RAX, RCX, RDX, and R8 through R11 are volatile.
  The contents of registers RBX, RSI, RDI, RBP, RSP, and
  R12 through R15 are considered non-volatile. Functions return
  values in RAX.

  This is the Microsoft x64 calling convention (first argument in RCX,
  second in RDX), so the unit targets Win64. Under that convention
  XMM6 through XMM15 are also non-volatile; they are not saved here. }



unit JmpLib64;

{$IFDEF FPC}
  {$ASMMODE intel}
{$ENDIF}

interface

type
  jmp_buf = record
    RBX,
    RSI,
    RDI,
    RSP,
    RBP,
    RIP,
    R12,
    R13,
    R14,
    R15: UInt64;
  end;

{ setjmp captures the complete task state which can later be used to
  perform a non-local goto using longjmp. setjmp returns 0 when it is
  initially called, and a non-zero value when it is returning from a call
  to longjmp. setjmp must be called before longjmp. }

function setjmp(out jmpb: jmp_buf): UInt64;

{ longjmp restores the task state captured by setjmp (and passed in
  jmpb). It then returns in such a way that setjmp appears to have
  returned with the value retval. setjmp must be called before longjmp. }

procedure longjmp(const jmpb: jmp_buf; retval: UInt64);

implementation

function setjmp(out jmpb: jmp_buf): UInt64; assembler;
  {$IFDEF FPC} nostackframe; {$ENDIF} register;
asm
  { ->  RCX  jmpb   }
  { <-  RAX  Result }
  MOV  RDX, [RSP]              // Fetch return address (RIP)
  // Save task state
  MOV  [RCX+jmp_buf.&RBX], RBX
  MOV  [RCX+jmp_buf.&RSI], RSI
  MOV  [RCX+jmp_buf.&RDI], RDI
  MOV  [RCX+jmp_buf.&RSP], RSP
  MOV  [RCX+jmp_buf.&RBP], RBP
  MOV  [RCX+jmp_buf.&RIP], RDX
  MOV  [RCX+jmp_buf.&R12], R12
  MOV  [RCX+jmp_buf.&R13], R13
  MOV  [RCX+jmp_buf.&R14], R14
  MOV  [RCX+jmp_buf.&R15], R15

  SUB  RAX, RAX                // Direct call returns 0
end;

procedure longjmp(const jmpb: jmp_buf; retval: UInt64); assembler;
  {$IFDEF FPC} nostackframe; {$ENDIF} register;
asm
  { ->  RCX  jmpb   }
  {     RDX  retval }
  { <-  RAX  value that setjmp appears to return }
  XCHG RDX, RCX                // RDX = jmpb, RCX = retval
  MOV  RAX, RCX
  MOV  RCX, [RDX+jmp_buf.&RIP]
  // Restore task state
  MOV  RBX, [RDX+jmp_buf.&RBX]
  MOV  RSI, [RDX+jmp_buf.&RSI]
  MOV  RDI, [RDX+jmp_buf.&RDI]
  MOV  RSP, [RDX+jmp_buf.&RSP]
  MOV  RBP, [RDX+jmp_buf.&RBP]
  MOV  R12, [RDX+jmp_buf.&R12]
  MOV  R13, [RDX+jmp_buf.&R13]
  MOV  R14, [RDX+jmp_buf.&R14]
  MOV  R15, [RDX+jmp_buf.&R15]
  MOV  [RSP], RCX              // Restore return address (RIP)

  TEST RAX, RAX                // Ensure retval is <> 0
  JNZ  @@1
  MOV  RAX, 1
@@1:
end;

end.

================

And here is my example of a generator with my longjmp() and setjmp():


{ In computer science, a generator is a routine that can be used to
control the iteration behaviour of a loop. All generators are also
iterators. A generator is very similar to a function that returns an
array, in that a generator has parameters, can be called, and generates
a sequence of values. However, instead of building an array containing
all the values and returning them all at once, a generator yields the
values one at a time, which requires less memory and allows the caller
to get started processing the first few values immediately. In short, a
generator looks like a function but behaves like an iterator. }

program test_generator;

{$APPTYPE CONSOLE}

uses
JmpLib64;

type PtrInt = ^Integer;

var
childtask,maintask: jmp_buf;
myarr1: array of integer;
i,a:integer;
Ptr1:PtrInt;

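function generator(var myarr:array of integer):integer;

var
  i1:integer;
  val:integer;
  ptr:PtrInt;
begin

  // setjmp returns 0 on the first pass, and the value given to
  // longjmp(childtask, ...) on every later resume.
  val:= setjmp(childtask);

  if val=length(myarr1)
  then
    begin
      writeln('Exiting child..');
      exit;
    end;

  // On a resume, val is the previously yielded value plus 1, which is
  // the index of the next element; on the first pass it is 0.
  // Note: the array is read through the global myarr1; the myarr
  // parameter is unused.
  i1:=val;
  new(ptr);
  ptr^:=myarr1[i1];
  // Caution: this frame is not preserved once control is back in the
  // main body, which is the stack frame problem described earlier.
  longjmp(maintask,uint64(ptr));
end;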

begin

setlength(myarr1,10);

for i:=0 to 9
do myarr1[i]:=i;

uint64(ptr1):=setjmp(maintask);

if ptr1=nil then generator(myarr1);

a:=ptr1^;
dispose(ptr1);

if (a<=length(myarr1))
then
begin
if a=length(myarr1)
then longjmp(childtask,a+1)
else
begin
writeln('Value returned by generator is: ',a);
longjmp(childtask,a+1);
end;
end;

setlength(myarr1,0);

end.

====
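
For comparison, the same one-value-at-a-time behaviour can be sketched
without setjmp() and longjmp(), by keeping the iteration state in an
object and using the for-in loop of Free Pascal and Delphi. This is a
different technique from the one above, shown only as a sketch; the
TSquares and TSquaresEnumerator classes are hypothetical:

```pascal
program enumerator_generator;

{$IFDEF FPC}{$MODE OBJFPC}{$ENDIF}
{$APPTYPE CONSOLE}

type
  { Produces 1, 4, 9, ... one value at a time; each value is computed
    only when the loop asks for it. }
  TSquaresEnumerator = class
  private
    FIndex, FCount: Integer;
    function GetCurrent: Integer;
  public
    constructor Create(ACount: Integer);
    function MoveNext: Boolean;
    property Current: Integer read GetCurrent;
  end;

  TSquares = class
  private
    FCount: Integer;
  public
    constructor Create(ACount: Integer);
    function GetEnumerator: TSquaresEnumerator;
  end;

constructor TSquaresEnumerator.Create(ACount: Integer);
begin
  inherited Create;
  FIndex := 0;
  FCount := ACount;
end;

function TSquaresEnumerator.GetCurrent: Integer;
begin
  Result := FIndex * FIndex;
end;

function TSquaresEnumerator.MoveNext: Boolean;
begin
  Inc(FIndex);                  { advance to the next value }
  Result := FIndex <= FCount;   { False ends the for-in loop }
end;

constructor TSquares.Create(ACount: Integer);
begin
  inherited Create;
  FCount := ACount;
end;

function TSquares.GetEnumerator: TSquaresEnumerator;
begin
  Result := TSquaresEnumerator.Create(FCount);
end;

var
  Squares: TSquares;
  V: Integer;
begin
  Squares := TSquares.Create(5);
  try
    for V in Squares do   { the loop frees the enumerator when it ends }
      WriteLn(V);
  finally
    Squares.Free;
  end;
end.
```

The state that the setjmp() version keeps in saved registers and a stack
frame lives here in the FIndex field, so no stack frame has to survive
between two values.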


Thank you,
Amine Moulay Ramdane.

