About Java and Delphi and Freepascal..

Wisdom90

Jan 27, 2020, 12:15:57 PM

Hello,

About Java and Delphi and Freepascal..

I have just read the following webpage:

Java is not a safe language

https://lemire.me/blog/2019/03/28/java-is-not-a-safe-language/

But as you have noticed the webpage says:

- Java does not trap overflows

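But Delphi and Freepascal do trap overflows when overflow checking is
enabled (the {$Q+} directive); it is a compiler option and not always
on by default.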
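
To make the difference concrete, here is a minimal Python sketch that emulates 32-bit signed addition in both styles. It is only an illustration: the function names are made up for this example, and Python integers do not overflow on their own, so the wrapping is simulated by hand.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def wrapping_add_i32(a, b):
    """Add like Java's int '+': keep the low 32 bits and wrap silently."""
    return (a + b - INT32_MIN) % 2**32 + INT32_MIN

def checked_add_i32(a, b):
    """Add like a build with overflow checking: raise instead of wrapping."""
    result = a + b
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"32-bit overflow in {a} + {b}")
    return result

# Untrapped: the largest int plus one silently becomes the smallest int.
print(wrapping_add_i32(INT32_MAX, 1))

# Trapped: the same addition is reported as an error.
try:
    checked_add_i32(INT32_MAX, 1)
except OverflowError as exc:
    print("trapped:", exc)
```

The first style is what Java's plain `+` does on int; the second corresponds to what Delphi and Freepascal do under {$Q+}, where an EIntOverflow exception is raised. Java offers the checked behaviour too, but only through explicit calls such as Math.addExact.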

And the webpage says:

- Java lacks null safety

But Delphi and FreePascal can have null safety through a library, as I
have just posted:

Here is MyNullable library for Delphi and FreePascal that brings null
safety..

Java lacks null safety. When a function receives an object, this object
might be null. That is, if you see ‘String s’ in your code, you often
have no way of knowing whether ‘s’ contains an actual String unless
you check at runtime. Can you guess whether programmers always check?
They do not, of course. In practice, mission-critical software does
crash without warning due to null values. We have two decades of
examples. In Swift or Kotlin, you have safe calls or optionals as part
of the language.

Here is MyNullable library for Delphi and FreePascal that brings null
safety, you can read the html file inside the zip to know how it works,
and you can download it from my website here:

https://sites.google.com/site/scalable68/null-safety-library-for-delphi-and-freepascal

And the webpage says:

- Java allows data races

But for Delphi and Freepascal i have just written about how to prevent
data races by saying the following:

Yet more precision about the invariants of a system..

I was just thinking about Petri nets, and I have studied them more.
They are useful for parallel programming, and what I have noticed is
that there are two methods to prove that there is no deadlock in the
system: structural analysis with place invariants that you have to
find mathematically, or the reachability tree. Structural analysis
teaches you more about the net, because the place invariants it uses in
the proof are calculated from the following system for the given Petri
net:

Transpose(vector) * Incidence matrix = 0

So you apply Gaussian elimination or the Farkas algorithm to the
incidence matrix to find the place invariants. As you will notice, these
place-invariant calculations look like Markov chains in mathematics,
with their vector of probabilities and their transition matrix of
probabilities. Using Markov chains you can calculate mathematically
where the vector of probabilities will "stabilize", which is very
important information, by solving the following system:

Unknown vector1 of probabilities * transition matrix of probabilities =
Unknown vector1 of probabilities.

Solving this system of equations is very important in economics and
other fields, and it is like calculating the invariants, because the
invariant in the system above is the vector1 of probabilities that is
obtained. Like the invariants of the structural analysis of Petri nets,
this invariant gives you very important information about the system,
such as where market shares will stabilize, which is how it is
calculated in economics.
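
Both systems above ask for a row vector x with x * M = 0 (for the Markov chain, M is the transition matrix minus the identity), so one small solver covers both. The following Python sketch uses exact Gaussian elimination. The Petri net (two processes sharing one mutex) and the 2x2 transition matrix are made-up examples, and all names are hypothetical. Note that Gaussian elimination returns a basis that may contain negative entries, whereas the Farkas algorithm would return non-negative invariants directly.

```python
from fractions import Fraction

def left_null_space(matrix):
    """Basis of row vectors x with x * matrix = 0 (exact Gauss-Jordan)."""
    n_unknowns, n_equations = len(matrix), len(matrix[0])
    # x * matrix = 0 is the same system as transpose(matrix) * x = 0.
    a = [[Fraction(matrix[i][j]) for i in range(n_unknowns)]
         for j in range(n_equations)]
    pivot_cols, row = [], 0
    for col in range(n_unknowns):
        if row == n_equations:
            break
        pivot = next((r for r in range(row, n_equations) if a[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot here: this unknown is a free variable
        a[row], a[pivot] = a[pivot], a[row]
        lead = a[row][col]
        a[row] = [v / lead for v in a[row]]
        for r in range(n_equations):
            if r != row and a[r][col] != 0:
                factor = a[r][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[row])]
        pivot_cols.append(col)
        row += 1
    basis = []
    for free in range(n_unknowns):
        if free in pivot_cols:
            continue
        x = [Fraction(0)] * n_unknowns
        x[free] = Fraction(1)
        for r, p in enumerate(pivot_cols):
            x[p] = -a[r][free]
        basis.append(x)
    return basis

# Hypothetical net. Rows are places, columns are the transitions
# enter1, leave1, enter2, leave2.
PLACES = ["idle1", "crit1", "idle2", "crit2", "mutex"]
INCIDENCE = [
    [-1,  1,  0,  0],   # idle1
    [ 1, -1,  0,  0],   # crit1
    [ 0,  0, -1,  1],   # idle2
    [ 0,  0,  1, -1],   # crit2
    [-1,  1, -1,  1],   # mutex
]
place_invariants = left_null_space(INCIDENCE)

# Hypothetical two-state Markov chain: solve pi * (P - I) = 0, then normalize.
P = [[Fraction(9, 10), Fraction(1, 10)],
     [Fraction(1, 2), Fraction(1, 2)]]
P_MINUS_I = [[P[i][j] - (1 if i == j else 0) for j in range(2)] for i in range(2)]
(direction,) = left_null_space(P_MINUS_I)
stationary = [v / sum(direction) for v in direction]

print(place_invariants)
print(stationary)
```

Adding the three basis vectors of this example gives crit1 + crit2 + mutex = constant. With one token initially in mutex, at most one process can be in its critical section, which is the kind of property a place invariant proves.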

About reachability analysis of a Petri net..

As you have noticed in my Petri nets tutorial example (read below),
I am analysing the liveness of the Petri net, because there is a rule
that says:

If a Petri net is live, then it is deadlock-free.

Reachability analysis of a Petri net with Tina gives you the necessary
information about the boundedness and liveness of the net. So if it
reports that the Petri net is "live", there is no deadlock in it.
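
As an illustration of what reachability analysis does, here is a minimal Python sketch that explores every reachable marking of a small bounded net and collects the markings in which no transition is enabled. It is not how Tina works internally (there is no partial-order reduction, only a plain breadth-first search), and it checks only for deadlock, which is a weaker property than liveness. The net, two threads taking locks A and B, and all names are assumptions made for this example.

```python
from collections import deque

def two_lock_net(order1, order2):
    """Two threads that each take two locks in the given order, then release both."""
    transitions = {}
    for name, (first, second) in (("t1", order1), ("t2", order2)):
        # Each transition is a pair (tokens consumed, tokens produced).
        transitions[name + "_take_first"] = (
            {name + "_start": 1, first: 1}, {name + "_has_first": 1})
        transitions[name + "_take_second"] = (
            {name + "_has_first": 1, second: 1}, {name + "_has_both": 1})
        transitions[name + "_release"] = (
            {name + "_has_both": 1}, {name + "_start": 1, first: 1, second: 1})
    initial = {"A": 1, "B": 1}
    for name in ("t1", "t2"):
        initial.update({name + "_start": 1, name + "_has_first": 0,
                        name + "_has_both": 0})
    return initial, transitions

def dead_markings(initial, transitions, limit=10_000):
    """Breadth-first search of the reachability graph; returns deadlocked markings."""
    seen = {tuple(sorted(initial.items()))}
    queue = deque([initial])
    dead = []
    while queue:
        marking = queue.popleft()
        fired_any = False
        for pre, post in transitions.values():
            if all(marking[p] >= n for p, n in pre.items()):
                fired_any = True
                nxt = dict(marking)
                for p, n in pre.items():
                    nxt[p] -= n
                for p, n in post.items():
                    nxt[p] += n
                key = tuple(sorted(nxt.items()))
                if key not in seen:
                    if len(seen) >= limit:
                        raise RuntimeError("too many markings (unbounded net?)")
                    seen.add(key)
                    queue.append(nxt)
        if not fired_any:
            dead.append(marking)
    return dead

# Opposite lock order: a deadlocked marking is reachable.
deadlocks_opposite = dead_markings(*two_lock_net(("A", "B"), ("B", "A")))
# Same lock order in both threads: no deadlocked marking exists.
deadlocks_same = dead_markings(*two_lock_net(("A", "B"), ("A", "B")))
print(deadlocks_opposite)
print(deadlocks_same)
```

With opposite lock orders the search finds the marking in which each thread holds its first lock and waits for the other one; with a common lock order it finds none.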

Tina and Partial order reduction techniques..

With the advancement of computer technology, highly concurrent systems
are being developed. The verification of such systems is a challenging
task, as their state space grows exponentially with the number of
processes. Partial order reduction is an effective technique to address
this problem. It relies on the observation that the effect of executing
transitions concurrently is often independent of their ordering.

Tina uses “partial-order” reduction techniques aimed at preventing
combinatorial explosion. Read more here:

http://projects.laas.fr/tina/papers/qest06.pdf

About modeling and detection of race conditions and deadlocks
in parallel programming..

I have just taken a further look at the following project in Delphi
called DelphiConcurrent by an engineer called Moualek Adlene from France:

https://github.com/moualek-adlene/DelphiConcurrent/blob/master/DelphiConcurrent.pas

And i have just taken a look at the following webpage of Dr Dobb's journal:

Detecting Deadlocks in C++ Using a Locks Monitor

https://www.drdobbs.com/detecting-deadlocks-in-c-using-a-locks-m/184416644

And I think that both of them are using techniques that are not as good
as analysing deadlocks with Petri nets in parallel applications. For
example, the above two methods only address locks, mutexes and
reader-writer locks; they do not address semaphores, event objects and
other such synchronization objects, so they are limited. This is why I
have written a tutorial that shows my methodology of analysing and
detecting deadlocks in parallel applications with Petri nets. My
methodology is more general, because it models a broader range of
synchronization objects with Petri nets, and I will soon add other
synchronization objects to my tutorial. You have to look at it, here it
is:

https://sites.google.com/site/scalable68/how-to-analyse-parallel-applications-with-petri-nets

You have to get the powerful Tina software to run my Petri Net examples
inside my tutorial, here is the powerful Tina software:

http://projects.laas.fr/tina/

Also to detect race conditions in parallel programming you have to take
a look at the following new tutorial that uses the powerful Spin tool:

https://mirrors.edge.kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html

This is how you will get much more professional at detecting deadlocks
and race conditions in parallel programming.

And about memory safety of Delphi and Freepascal, here is what i said:

I have just read the following webpage about memory safety:

Microsoft: 70 percent of all security bugs are memory safety issues

https://www.zdnet.com/article/microsoft-70-percent-of-all-security-bugs-are-memory-safety-issues/

And it says:

"Users who often read vulnerability reports come across terms over and
over again. Terms like buffer overflow, race condition, page fault, null
pointer, stack exhaustion, heap exhaustion/corruption, use after free,
or double free --all describe memory safety vulnerabilities."

So, as you will notice below, the following memory safety problems
have been solved in Delphi:

And I have just read the following webpage about "Fearless Security:
Memory safety":

https://hacks.mozilla.org/2019/01/fearless-security-memory-safety/

Here are the memory safety problems:

1- Misusing Free (use-after-free, double free)

I have solved this in Delphi and Freepascal by inventing a "Scalable"
reference counting with efficient support for weak references. Read
below about it.

2- Uninitialized variables

The compilers of Delphi and Freepascal can warn about this in many
cases.

3- Dereferencing Null pointers

I have solved this in Delphi and Freepascal by inventing a "Scalable"
reference counting with efficient support for weak references. Read
below about it.

4- Buffer overflow and underflow

This can be detected at run time in Delphi by using the debug memory
manager of madExcept, read here about it:

http://help.madshi.net/DebugMm.htm

You can buy it from here:

http://www.madshi.net/

There remains also the stack exhaustion memory safety problem,
and here is how to detect it in Delphi:

Call the function "DoStackOverflow" below once from your code and you'll
get the EStackOverflow error raised by Delphi with the message "stack
overflow", and you can print the line of the source code where
EStackOverflow is raised with JCLDebug and such:

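----
function DoStackOverflow: Integer;
begin
  // Unbounded recursion: every call adds a stack frame until the
  // stack is exhausted and EStackOverflow is raised.
  Result := 1 + DoStackOverflow;
end;
----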

About my scalable algorithms inventions..

I am a white arab, and i am a gentleman type of person,
and i think that you know me too by my poetry that i wrote
in front of you and that i posted here, but i am
also a more serious computer developer, and i am also
an inventor who has invented many scalable algorithms, read about
them on my writing below:

Here is my last scalable algorithm invention, read
what i have just responded in comp.programming.threads:

About my LRU scalable algorithm..

On 10/16/2019 7:48 AM, Bonita Montero on comp.programming.threads wrote:
> Amine, a quest for you:
> Database-servers and operating-system-kernels mostly use LRU as
> the scheme to evict old buffers from their cache. One issue with
> LRU is, that an LRU-structure can't be updated by multiple threads
> simultaneously. So you have to have a global lock.
> I managed to write a LRU-caching-class that can update the links
> in the LRU-list to put the most recent fetched block to the head
> of the list without any lock in almost any access. Only when
> flushing an entry or inserting a new I have to lock the structure
> completely; but in contrast to cache-hits this has usually a
> magnitude lower frequency because of the slowness of disk-/ssd-access,
> so this doesn't really hurt.
> The algorithm is partially locked, partially lock-free. Even
> the case when putting cache hits to the head has to be processed
> in locked mode in very rare cases. And as I said inserting and
> flushing is conventional locked access.
> So the quest is for you: Can you guess what I did?

And here is what i have just responded:

I think i am also smart, so i have just quickly found a solution that is
scalable and that is not your solution, so it needs my hashtable that is
scalable and it needs my fully scalable FIFO queue that i have invented.
And i think i will not patent it. But my solution is not Lockfree, it
uses locks like in a Lock striping manner and it is scalable.

And read about my other scalable algorithms inventions on my writing below:

About the buffer overflow problem..

I wrote yesterday about buffer overflow in Delphi and Freepascal..

I think there is a "higher" abstraction in Delphi and Freepascal that
does the job of avoiding buffer overflow very well, and it is the
TMemoryStream class. It wraps a memory block and does the reallocation
and freeing of that block for you, so it behaves like a pointer but at
a higher level of abstraction. Look at my following example in Delphi
and Freepascal: you will notice that, contrary to raw pointers, the
memory stream grows by itself in WriteBuffer() without the need to
reserve the memory first, and this is why it avoids the buffer overflow
problem on the destination side. Read the following example to notice
how I am using it with a PAnsiChar type:

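========================================

program test;

// With FreePascal use: uses Classes, SysUtils;
uses System.Classes, System.SysUtils;

var
  P: PAnsiChar;
  mem: TMemoryStream;

begin
  P := 'Amine';
  mem := TMemoryStream.Create;
  try
    // The stream grows as needed; 6 = 5 characters plus the final #0.
    mem.WriteBuffer(Pointer(P)^, 6);
    // mem.Memory points to the start of the stream's internal buffer.
    Writeln(PAnsiChar(mem.Memory));
  finally
    mem.Free;
  end;
end.

===================================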

Also, since Delphi and Freepascal detect buffer overflow on dynamic
arrays when range checking ({$R+}) is enabled, I think that Delphi and
Freepascal are powerful tools.

And about race conditions and deadlocks problems and more, read my
following thoughts to understand:

I will reformulate more precisely what I said about race condition
detection in Rust, so read it carefully:

You can think of the borrow checker of Rust as a validator for a locking
system: immutable references are shared read locks and mutable
references are exclusive write locks. Under this mental model, accessing
data via two independent write locks is not a safe thing to do, and
modifying data via a write lock while there are readers alive is not
safe either.

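So, as you can see, the references in Rust follow the read-write lock
pattern: for any one piece of data there are either many readers or
exactly one writer. I think this is restrictive, because by itself it
does not express the more fine-grained parallelism that permits us to
run writes to the same structure in "parallel" and gain more
performance; in Rust you have to get that by splitting the data into
disjoint parts or by using atomics or locks.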

Read more about Rust and Delphi and my inventions..

I think the spirit of Rust is like the spirit of Ada: both are designed
for very high standards of safety. "But" I don't think we have to fear
the race conditions that Rust solves, because I think that race
conditions are not so difficult to avoid when you are a decent,
knowledgeable programmer in parallel programming. Now we have to talk
about the rest of the safety guarantees of Rust. There remains the
problem of deadlocks, and I think that Rust is not solving this problem,
but I have provided you with an enhanced DelphiConcurrent library for
Delphi and Freepascal that detects deadlocks. There are also the memory
safety guarantees of Rust, here they are:

1- No Null Pointer Dereferences
2- No Dangling Pointers
3- No Buffer Overruns

But notice that I have solved the number 1 and number 2 by inventing my
scalable reference counting with efficient support for weak references
for Delphi and Freepascal, read below to notice it, and for number 3
read my following thoughts to understand:

More about research and software development..

I have just looked at the following new video:

Why is coding so hard...

https://www.youtube.com/watch?v=TAAXwrgd1U8

I understand this video, but I have to explain my work:

I am not like the techlead in the video above, because I am also an
"inventor" who has invented many scalable algorithms and their
implementations. I am also inventing effective abstractions; I give you
an example:

Read the following of the senior research scientist that is called Dave
Dice:

Preemption tolerant MCS locks

https://blogs.oracle.com/dave/preemption-tolerant-mcs-locks

As you can see, he is trying to invent a new lock that is preemption
tolerant, but his lock lacks some important characteristics. This is why
I have just invented a new Fast Mutex that is adaptive and that is much
better; I think mine is the "best", and I think you will not find it
anywhere. My new Fast Mutex has the following characteristics:

1- Starvation-free
2- Good fairness
3- It keeps the cache-coherence traffic very low
4- Very good fast path performance (it has the same performance as the
scalable MCS lock when there is contention.)
5- And it has a decent preemption tolerance.

This is how I am an "inventor". I have also invented other scalable
algorithms, such as a scalable reference counting with efficient support
for weak references, a fully scalable Threadpool and a fully scalable
FIFO queue, along with other scalable algorithms and their
implementations, and I think I will sell some of them to Microsoft or to
Google or Embarcadero or such software companies.

Read my following writing to know me more:

More about computing and parallel computing..

The important guarantees of Memory Safety in Rust are:

1- No Null Pointer Dereferences
2- No Dangling Pointers
3- No Buffer Overruns

I think i have solved Null Pointer Dereferences and also solved Dangling
Pointers and also solved memory leaks for Delphi and Freepascal by
inventing my "scalable" reference counting with efficient support for
weak references and i have implemented it in Delphi and Freepascal (Read
about it below), and reference counting in Rust and C++ is "not" scalable.

About the (3) above that is Buffer Overruns, read here about Delphi and
Freepascal:

What's a buffer overflow and how to avoid it in Delphi?

Read my thoughts above about it.

About Deadlock and Race conditions in Delphi and Freepascal:

I have ported DelphiConcurrent to Freepascal, and i have
also extended them with the support of my scalable RWLocks for Windows
and Linux and with the support of my scalable lock called MLock for
Windows and Linux and i have also added the support for a Mutex for
Windows and Linux, please look inside the DelphiConcurrent.pas and
FreepascalConcurrent.pas files inside the zip file to understand more.

You can download DelphiConcurrent and FreepascalConcurrent for Delphi
and Freepascal from:

https://sites.google.com/site/scalable68/delphiconcurrent-and-freepascalconcurrent

DelphiConcurrent and FreepascalConcurrent by Moualek Adlene is a new way
to build Delphi applications which involve parallel executed code based
on threads like application servers. DelphiConcurrent provides to the
programmers the internal mechanisms to write safer multi-thread code
while taking a special care of performance and genericity.

In concurrent applications a DEADLOCK may occur when two or more
threads try to lock two or more consecutive shared resources in a
different order. With DelphiConcurrent and FreepascalConcurrent, a
DEADLOCK is detected and automatically skipped - before it occurs - and
the programmer gets an explicit exception describing the multi-thread
problem instead of a blocking DEADLOCK which freezes the application
with no output log (and perhaps also the linked client sessions if we
talk about an application server).

Amine Moulay Ramdane has extended them with the support of his scalable
RWLocks for Windows and Linux and with the support of his scalable lock
called MLock for Windows and Linux and he has also added the support for
a Mutex for Windows and Linux, please look inside the
DelphiConcurrent.pas and FreepascalConcurrent.pas files to
understand more.

And please read the html file inside to learn more how to use it.

About race conditions now:

My scalable Adder is here..

As you have noticed i have just posted previously my modified versions
of DelphiConcurrent and FreepascalConcurrent to deal with deadlocks in
parallel programs.

But i have just read the following about how to avoid race conditions in
Parallel programming in most cases..

Here it is:

https://vitaliburkov.wordpress.com/2011/10/28/parallel-programming-with-delphi-part-ii-resolving-race-conditions/

This is why i have invented my following powerful scalable Adder to help
you do the same as the above, please take a look at its source code to
understand more, here it is:

https://sites.google.com/site/scalable68/scalable-adder-for-delphi-and-freepascal

Other than that, about composability of lock-based systems now:

Design your systems to be composable. Among the more galling claims of
the detractors of lock-based systems is the notion that they are somehow
uncomposable:

“Locks and condition variables do not support modular programming,”
reads one typically brazen claim, “building large programs by gluing
together smaller programs[:] locks make this impossible.” The claim, of
course, is incorrect. For evidence one need only point at the
composition of lock-based systems such as databases and operating
systems into larger systems that remain entirely unaware of lower-level
locking.

There are two ways to make lock-based systems completely composable, and
each has its own place. First (and most obviously), one can make locking
entirely internal to the subsystem. For example, in concurrent operating
systems, control never returns to user level with in-kernel locks held;
the locks used to implement the system itself are entirely behind the
system call interface that constitutes the interface to the system. More
generally, this model can work whenever a crisp interface exists between
software components: as long as control flow is never returned to the
caller with locks held, the subsystem will remain composable.

Second (and perhaps counterintuitively), one can achieve concurrency and
composability by having no locks whatsoever. In this case, there must be
no global subsystem state—subsystem state must be captured in
per-instance state, and it must be up to consumers of the subsystem to
assure that they do not access their instance in parallel. By leaving
locking up to the client of the subsystem, the subsystem itself can be
used concurrently by different subsystems and in different contexts. A
concrete example of this is the AVL tree implementation used extensively
in the Solaris kernel. As with any balanced binary tree, the
implementation is sufficiently complex to merit componentization, but by
not having any global state, the implementation may be used concurrently
by disjoint subsystems—the only constraint is that manipulation of a
single AVL tree instance must be serialized.

Read more here:

https://queue.acm.org/detail.cfm?id=1454462

And about Message Passing Process Communication Model and Shared Memory
Process Communication Model:

An advantage of shared memory model is that memory communication is
faster as compared to the message passing model on the same machine.

Read the following to notice it:

Why did Windows NT move away from the microkernel?

"The main reason that Windows NT became a hybrid kernel is speed. A
microkernel-based system puts only the bare minimum system components in
the kernel and runs the rest of them as user mode processes, known as
servers. A form of inter-process communication (IPC), usually message
passing, is used for communication between servers and the kernel.

Microkernel-based systems are more stable than others; if a server
crashes, it can be restarted without affecting the entire system, which
couldn't be done if every system component was part of the kernel.
However, because of the overhead incurred by IPC and context-switching,
microkernels are slower than traditional kernels. Due to the performance
costs of a microkernel, Microsoft decided to keep the structure of a
microkernel, but run the system components in kernel space. Starting in
Windows Vista, some drivers are also run in user mode."

More about message passing..

An advantage of shared memory model is that memory communication is
faster as compared to the message passing model on the same machine.

Read the following to notice it:

"One problem that plagues microkernel implementations is relatively poor
performance. The message-passing layer that connects
different operating system components introduces an extra layer of
machine instructions. The machine instruction overhead introduced
by the message-passing subsystem manifests itself as additional
execution time. In a monolithic system, if a kernel component needs
to talk to another component, it can make direct function calls
instead of going through a third party."

However, shared memory model may create problems such as synchronization
and memory protection that need to be addressed.

Message passing’s major flaw is the inversion of control–it is a moral
equivalent of gotos in un-structured programming (it’s about time
somebody said that message passing is considered harmful).

Also some research shows that the total effort to write an MPI
application is significantly higher than that required to write a
shared-memory version of it.

And more about my scalable reference counting with efficient support for
weak references:

My invention that is my scalable reference counting with efficient
support for weak references version 1.37 is here..

Here i am again, i have just updated my scalable reference counting with
efficient support for weak references to version 1.37, I have just added
a TAMInterfacedPersistent that is a scalable reference counted version,
and now i think i have just made it complete and powerful.

Because I have just read the following web page:

https://www.codeproject.com/Articles/1252175/Fixing-Delphis-Interface-Limitations

But I don't agree with what the author of the above web page wrote,
because I think you have to understand the "spirit" of Delphi. Here is why:

A component is supposed to be owned and destroyed by something else,
"typically" a form (and "typically" means in english: in "most" cases,
and this is the most important thing to understand). In that scenario,
reference count is not used.

If you pass a component as an interface reference, it would be very
unfortunate if it was destroyed when the method returns.

Therefore, reference counting in TComponent has been removed.

Also because i have just added TAMInterfacedPersistent to my invention.

To use scalable reference counting with Delphi and FreePascal, just
replace TInterfacedObject with my TAMInterfacedObject, and replace
TInterfacedPersistent with my TAMInterfacedPersistent; these are the
scalable reference counted versions, and you will find both inside the
AMInterfacedObject.pas file. To know how to use weak references, please
take a look at the demo that I have included called example.dpr and at
the tutorial about weak references inside my zip file. To know how to
use delegation, take a look at the demo that I have included called
test_delegation.pas and at the tutorial about delegation inside my zip
file.

I think my Scalable reference counting with efficient support for
weak references is stable and fast, and it works on both Windows and
Linux, and my scalable reference counting scales on multicore and NUMA
systems, and you will not find it in C++ or Rust, and i don't think you
will find it anywhere, and you have to know that this invention of mine
solves the problem of dangling pointers and it solves the problem of
memory leaks and my scalable reference counting is "scalable".

And please read the readme file inside the zip file that i have just
extended to make you understand more.

You can download my new scalable reference counting with efficient
support for weak references version 1.37 from:

https://sites.google.com/site/scalable68/scalable-reference-counting-with-efficient-support-for-weak-references

And now i will talk about data dependency and parallel loops..

For a loop to be parallelized, every iteration must be independent of
the others. One quick check is to execute the loop in the direction of
the incremented index and then in the direction of the decremented
index and verify that the results are the same; if they differ there is
a dependency, but identical results do not prove independence. A data
dependency happens if memory is modified: a loop has a data dependency
if an iteration writes a variable that is read or written in another
iteration of the loop. There is no data dependency if only one iteration
reads or writes a variable or if many iterations read the same variable
without modifying it. So these are the "general" rules.
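
The forward and backward check can be sketched in a few lines of Python. This is only an illustration with made-up loop bodies and names. The third body is there to show the limit of the check: an integer sum gives the same result in both directions even though every iteration reads and writes the same variable.

```python
def run_loop(body, n, make_state, reverse=False):
    """Run body(state, i) for i in 0..n-1, upwards or downwards."""
    state = make_state()
    indices = range(n - 1, -1, -1) if reverse else range(n)
    for i in indices:
        body(state, i)
    return state

def same_in_both_directions(body, n, make_state):
    """True if the loop produces the same state in both index directions."""
    forward = run_loop(body, n, make_state)
    backward = run_loop(body, n, make_state, reverse=True)
    return forward == backward

def make_state():
    return {"a": [0] * 5, "b": [1, 2, 3, 4, 5], "total": 0}

def scale(state, i):
    # Independent: iteration i writes only a[i] and reads only b[i].
    state["a"][i] = 2 * state["b"][i]

def prefix_sum(state, i):
    # Dependent: iteration i reads a[i-1], written by iteration i-1.
    if i > 0:
        state["a"][i] = state["a"][i - 1] + state["b"][i]

def accumulate(state, i):
    # Dependent (a reduction), but the check cannot see it.
    state["total"] += state["b"][i]

print(same_in_both_directions(scale, 5, make_state))
print(same_in_both_directions(prefix_sum, 5, make_state))
print(same_in_both_directions(accumulate, 5, make_state))
```

So a difference between the two directions proves a dependency, while equal results still have to be confirmed by looking at which variables each iteration reads and writes.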

Now it remains to know how to construct the parallel for loop if there
is an induction variable or a reduction operation. I will give an
example of each:

If we have the following (the code looks like Algol or modern Object
Pascal):

IND:=0;

For I:=1 to N
Do
Begin
IND := IND + I;
A[I]:=B[IND];
End;

So, as you can see, IND is an induction variable: its value in
iteration I is 1+2+...+I = I*(I+1)/2. To parallelize the loop you
compute it directly from I, with IND private to each iteration, so that
no iteration depends on the previous one:

For I:=1 to N
Do
Begin
IND:=(I*(I+1)) div 2;
A[I]:=B[IND];
End;
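
The substitution can be checked with a short Python sketch. It assumes that the sequential loop accumulates IND := IND + I, which is the recurrence that the closed form I*(I+1)/2 solves; the arrays and function names are made up for the example.

```python
def sequential(b, n):
    """Original form: ind is carried from one iteration to the next."""
    a = [0] * (n + 1)          # index 0 unused, to mirror A[1..N]
    ind = 0
    for i in range(1, n + 1):
        ind = ind + i
        a[i] = b[ind]
    return a

def closed_form(b, n, indices=None):
    """Rewritten form: ind depends only on i, so any iteration order works."""
    a = [0] * (n + 1)
    for i in (indices if indices is not None else range(1, n + 1)):
        ind = i * (i + 1) // 2  # integer division, like 'div' in Pascal
        a[i] = b[ind]
    return a

N = 10
B = [3 * k + 1 for k in range(N * (N + 1) // 2 + 1)]  # arbitrary test data

print(sequential(B, N) == closed_form(B, N))
print(sequential(B, N) == closed_form(B, N, indices=range(N, 0, -1)))
```

Because the rewritten body reads nothing written by another iteration, running the indices in reverse (or in parallel chunks) gives the same array as the sequential loop.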

Now for the reduction operation example: my invention that is my
Threadpool with priorities that scales very well (read about it below)
supports a Parallel For that scales very well and that supports
"grainsize". The grainsize can be used in the ParallelFor() with a
reduction operation, and my following powerful scalable Adder is also
used in this scenario, here it is:

https://sites.google.com/site/scalable68/scalable-adder-for-delphi-and-freepascal

So here is the example with a reduction operation in modern Object Pascal:

TOTAL:=0.0;
For I := 1 to N
Do
Begin
TOTAL:=TOTAL+A[I];
End;

So with my powerful scalable Adder and with my powerful invention that
is my ParallelFor() that scales very well, you will parallelize the
above like this:

procedure test1(j:integer;ptr:pointer);
begin

t.add(A[J]); // "t" is my scalable Adder object

end;

// Let's suppose that N is 100000
// In the following, 10000 is the grainsize

obj.ParallelFor(1,N,test1,10000,pointer(0));

TOTAL:=T.get();
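
For readers without the library at hand, the shape of a chunked reduction can be sketched in Python as follows. This is not the ParallelFor() or the scalable Adder described here: it swaps in the standard ThreadPoolExecutor and per-chunk partial sums, and the helper names are assumptions. In CPython the threads will not make this pure-Python sum faster; the sketch only shows how the grainsize splits the index range and how the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_ranges(n_min, n_max, grainsize):
    """Split [n_min, n_max] into inclusive ranges of at most grainsize items."""
    return [(lo, min(lo + grainsize - 1, n_max))
            for lo in range(n_min, n_max + 1, grainsize)]

def parallel_sum(values, n_min, n_max, grainsize, workers=4):
    """Sum values[n_min..n_max]: one partial sum per chunk, then combine."""
    def partial_sum(bounds):
        lo, hi = bounds
        return sum(values[lo:hi + 1])   # private to the chunk, no shared write
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunk_ranges(n_min, n_max, grainsize)))

N = 100_000
A = [0] + list(range(1, N + 1))         # A[1..N], index 0 unused

print(len(chunk_ranges(1, N, 10_000)))  # number of chunks handed to the pool
print(parallel_sum(A, 1, N, 10_000))
```

Each chunk adds into its own partial result, so the only shared step is the final combination. That is the general idea behind any reduction helper: the workers must not all write to one TOTAL variable.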

And read the following to understand how to use grainsize of my Parallel
for that scales well:

About my ParallelFor() that scales very well that uses my efficient
Threadpool that scales very well:

With ParallelFor() you have to:

1- Ensure Sufficient Work

Each iteration of a loop involves a certain amount of work,
so you have to ensure a sufficient amount of the work,
read below about "grainsize" that i have implemented.

2- In OpenMP we have that:

Static and Dynamic Scheduling

One basic characteristic of a loop schedule is whether it is static or
dynamic:

• In a static schedule, the choice of which thread performs a particular
iteration is purely a function of the iteration number and number of
threads. Each thread performs only the iterations assigned to it at the
beginning of the loop.

• In a dynamic schedule, the assignment of iterations to threads can
vary at runtime from one execution to another. Not all iterations are
assigned to threads at the start of the loop. Instead, each thread
requests more iterations after it has completed the work already
assigned to it.

But my ParallelFor() that scales very well uses my efficient Threadpool
that scales very well, so it uses round-robin scheduling together with
work stealing, and I think that this is sufficient.

Read the rest:

My Threadpool engine with priorities that scales very well is really
powerful because it scales very well on multicore and NUMA systems, also
it comes with a ParallelFor() that scales very well on multicores and
NUMA systems.

You can download it from:

https://sites.google.com/site/scalable68/an-efficient-threadpool-engine-with-priorities-that-scales-very-well

Here is the explanation of my ParallelFor() that scales very well:

I have also implemented a ParallelFor() that scales very well, here is
the method:

procedure ParallelFor(nMin, nMax: integer; aProc: TParallelProc;
  GrainSize: integer = 1; Ptr: pointer = nil;
  pmode: TParallelMode = pmBlocking;
  Priority: TPriorities = NORMAL_PRIORITY);

nMin and nMax parameters of the ParallelFor() are the minimum and
maximum integer values of the variable of the ParallelFor() loop, aProc
parameter of ParallelFor() is the procedure to call, and GrainSize
integer parameter of ParallelFor() is the following:

The grainsize sets a minimum threshold for parallelization.

A rule of thumb is that grainsize iterations should take at least
100,000 clock cycles to execute.

For example, if a single iteration takes 100 clocks, then the grainsize
needs to be at least 1000 iterations. When in doubt, do the following
experiment:

1- Set the grainsize parameter higher than necessary. The grainsize is
specified in units of loop iterations.

If you have no idea of how many clock cycles an iteration might take,
start with grainsize=100,000.

The rationale is that each iteration normally requires at least one
clock per iteration. In most cases, step 3 will guide you to a much
smaller value.

2- Run your algorithm.

3- Iteratively halve the grainsize parameter and see how much the
algorithm slows down or speeds up as the value decreases.

A drawback of setting a grainsize too high is that it can reduce
parallelism. For example, if the grainsize is 1000 and the loop has 2000
iterations, the ParallelFor() method distributes the loop across only
two processors, even if more are available.
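
The arithmetic in these rules can be written down as a small Python sketch. The helper names are hypothetical; the 100,000-cycle target is the rule of thumb quoted above.

```python
def min_grainsize(clocks_per_iteration, target_clocks=100_000):
    """Smallest grainsize whose chunk takes at least target_clocks cycles."""
    return max(1, -(-target_clocks // clocks_per_iteration))  # ceiling division

def max_parallel_chunks(iterations, grainsize):
    """Upper bound on how many chunks (and so processors) the loop can use."""
    return -(-iterations // grainsize)

def halving_candidates(start, steps):
    """Grainsizes to try in step 3: start, start/2, start/4, ..."""
    candidates = []
    value = start
    for _ in range(steps):
        candidates.append(value)
        value = max(1, value // 2)
    return candidates

print(min_grainsize(100))               # 100 clocks per iteration
print(max_parallel_chunks(2000, 1000))  # the 2000-iteration example
print(halving_candidates(100_000, 5))
```

The first two results reproduce the examples in the text: 1000 iterations per chunk for a 100-clock iteration, and only two chunks when a 2000-iteration loop is run with a grainsize of 1000.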

You can pass a parameter to ParallelFor() as a pointer in Ptr, and you
can set the pmode parameter to pmBlocking so that ParallelFor() is
blocking or to pmNonBlocking so that ParallelFor() is non-blocking. The
Priority parameter is the priority of ParallelFor(). Look inside the
test.pas example to see how to use it.

Thank you,
Amine Moulay Ramdane.

amin...@gmail.com

Jan 27, 2020, 12:16:56 PM

Bonita Montero

Jan 27, 2020, 12:49:20 PM

> About my LRU scalable algorithm..

Your LRU-algorithm? Where is it?

>> Amine, a quest for you:
>> Database-servers and operating-system-kernels mostly use LRU as
>> the scheme to evict old buffers from their cache. One issue with
>> LRU is, that an LRU-structure can't be updated by multiple threads
>> simultaneously. So you have to have a global lock.
>> I managed to write a LRU-caching-class that can update the links
>> in the LRU-list to put the most recent fetched block to the head
>> of the list without any lock in almost any access. Only when
>> flushing an entry or inserting a new I have to lock the structure
>> completely; but in contrast to cache-hits this has usually a
>> magnitude lower frequency because of the slowness of disk-/ssd-access,
>> so this doesn't really hurt.
>> The algorithm is partially locked, partially lock-free. Even
>> the case when putting cache hits to the head has to be processed
>> in locked mode in very rare cases. And as I said inserting and
>> flushing is conventional locked access.
>> So the quest is for you: Can you guess what I did?

> And here is what i have just responded:

> I think i am also smart, ...

Then show me your LRU-code.
