More of my philosophy about 3D stacking in CPUs and about Moore’s Law and more..

World-News2100

Feb 13, 2022, 6:12:38 PM
Hello,


More of my philosophy about 3D stacking in CPUs and about Moore’s Law
and more..

I am a white Arab from Morocco, and I think I am smart, since I have also
invented many scalable algorithms and other algorithms.

3D stacking offers an extension of Moore’s Law, but heat removal is the
big problem in 3D stacking. This is why current technologies, such as
Intel's 3D stacking, are limited to stacking just two or a few layers.

More of my philosophy about Moore’s Law and EUV (Extreme ultraviolet
lithography)..

Researchers have proposed successors to EUV, including e-beam and
nanoimprint lithography, but have not found any of them to be reliable
enough to justify substantial investment.

And I think that by also using EUV (Extreme ultraviolet lithography) to
create CPUs we will extend Moore's Law by around 15 years, which
corresponds to around 100x scalability in performance, and I think that
this is the same 100x performance gain as the following invention from
graphene:

About graphene and about unlocking Moore’s Law..

I think that graphene can now be mass-produced. You can read about it here:

We May Finally Have a Way of Mass Producing Graphene

It's as simple as one, two, three.

Read more here:

https://futurism.com/we-may-finally-have-a-way-of-mass-producing-graphene

So the following invention will be possible:

Physicists Create Microchip 100 Times Faster Than Conventional Ones

Read more here:

https://interestingengineering.com/graphene-microchip-100-times-fast?fbclid=IwAR3wG09QxtQciuku4KUGBVRQPNRSbhnodPcnDySLWeXN9RCnvb0GqRAyM-4

More philosophy about the Microchips that are 100 Times or 1000 times
Faster Than Conventional Ones..

I think that the following invention of microchips that are 100 times
or 1000 times faster than conventional ones has a weakness:
cache-coherence traffic between cores takes time. So I think that
they are talking about 100 times or 1000 times more speed in
single-core performance. Parallelism is therefore still necessary,
and you need scalable algorithms in order to scale much further on
multicore CPUs.

Physicists Create Microchip 100 Times Faster Than Conventional Ones

Read more here:

https://interestingengineering.com/graphene-microchip-100-times-fast?fbclid=IwAR3wG09QxtQciuku4KUGBVRQPNRSbhnodPcnDySLWeXN9RCnvb0GqRAyM-4


And read the following news:


AMD Demonstrates Stacked 3D V-Cache Technology: 192 MB at 2 TB/sec which
would technically be faster than the L1 cache on the die (but with
higher latency)..

"The AMD team surprised us here. What seemed like a very
par-for-the-course Computex keynote turned into an incredible
demonstration of what AMD is testing in the lab with TSMC’s new 3D
Fabric technologies. We’ve covered 3D Fabric before, but AMD is putting
it to good use by stacking up its processors with additional cache,
enabling super-fast bandwidth, and better gaming performance."

Read more here:

https://www.anandtech.com/show/16725/amd-demonstrates-stacked-vcache-technology-2-tbsec-for-15-gaming


More of my philosophy about the knee of an M/M/n queue and more..

Here is the mathematical equation of the knee of an M/M/n queue in
queuing theory in operational research:

1/(n+1)^(1/n)

n is the number of servers.
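
As a quick check of the formula above, here is a minimal Python sketch.
The function name knee is my own choice for illustration and does not
come from any library; it only evaluates 1/(n+1)^(1/n) for a few values
of n.

```python
def knee(n: int) -> float:
    """Knee utilization of an M/M/n queue, using the formula 1/(n+1)^(1/n).

    n is the number of servers. The name `knee` is hypothetical and is
    used here only for illustration.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (n + 1) ** (1.0 / n)


if __name__ == "__main__":
    # Print the knee for a few server counts.
    for n in (1, 2, 4, 8):
        print(f"M/M/{n}: knee at about {knee(n):.3f} utilization")
```

With this formula the knee moves to a higher utilization as the number
of servers n grows.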

So an M/M/1 queue has its knee at 50% utilization, and an M/M/2 queue
has its knee at about 0.577, so I correct below:

More of my philosophy about the network topology in multicore CPUs..

I invite you to look at the following video:

Ring or Mesh, or other? AMD's Future on CPU Connectivity

https://www.youtube.com/watch?v=8teWvMXK99I&t=904s

And I invite you to read the following article:

Does an AMD Chiplet Have a Core Count Limit?

Read more here:

https://www.anandtech.com/show/16930/does-an-amd-chiplet-have-a-core-count-limit

I think I am smart, and I say that the above video and the above article
are not so smart, so I will talk about a very important thing. Read the
following:

Performance Scalability of a Multi-core Web Server

https://www.researchgate.net/publication/221046211_Performance_scalability_of_a_multi-core_web_server

So notice carefully that it is saying the following:

"..we determined that performance scaling was limited by the capacity of
the address bus, which became saturated on all eight cores. If this key
obstacle is addressed, commercial web server and systems software are
well-positioned to scale to a large number of cores."

So as you can see, they were using an Intel Xeon with 8 cores, and the
application was scalable to 8x but the hardware was not: it scaled only
to 4.8x. This was caused by saturation of the address bus. The address
bus carries requests and responses for data, called snoops, and more
caches mean more sources and more destinations for snoops, which causes
the poor scaling. So a bus or ring bus topology was not sufficient to
scale to 8x on an Intel Xeon with 8 cores.

So I think that new architectures like the Epyc CPU and the Threadripper
CPU can use a faster bus and/or a different network topology that
ensures full scalability both locally in the same node and globally
between the nodes. A sophisticated mesh network topology not only
reduces the number of hops inside the CPU for good latency, it is also
good for reliability because of its redundancy, and it is faster than
previous topologies like the ring bus or the bus, since for example the
search on the address bus becomes parallelized. It looks like the
Internet, which uses a mesh topology with routers, so it parallelizes.

I also think that using a more sophisticated topology like a mesh
network topology is related to queuing theory. In operational research
the mathematics says that we can make a queue like M/M/1 more efficient
by making the server more powerful, but the knee of an M/M/1 queue is
at around 50% utilization. By parallelizing more in a mesh topology,
whether on the Internet or inside a CPU, you can both raise the knee of
the queue and increase the speed of executing the transactions. It is
like using many servers in queuing theory, and it permits better
scaling inside a CPU or on the Internet.
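
To illustrate the point about using many servers, here is a minimal
Python sketch. It assumes the standard M/M/n model with the Erlang C
formula, where all n servers have the same service rate mu; this is a
textbook simplification and not a model of any real CPU interconnect.
The function names erlang_c and mean_response_time are hypothetical and
chosen only for this example. The sketch compares the mean response
time at the same per-server utilization for different numbers of
servers.

```python
from math import factorial


def erlang_c(n: int, a: float) -> float:
    """Probability that an arriving job must wait in an M/M/n queue.

    n is the number of servers and a = lambda/mu is the offered load.
    """
    rho = a / n  # per-server utilization
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1)")
    top = a ** n / factorial(n) / (1.0 - rho)
    bottom = sum(a ** k / factorial(k) for k in range(n)) + top
    return top / bottom


def mean_response_time(n: int, utilization: float, mu: float = 1.0) -> float:
    """Mean time in system (waiting + service) for an M/M/n queue.

    utilization is the per-server utilization; mu is the service rate
    of one server (assumed equal for all servers).
    """
    lam = utilization * n * mu  # total arrival rate
    wait = erlang_c(n, lam / mu) / (n * mu - lam)  # mean time in queue
    return wait + 1.0 / mu


if __name__ == "__main__":
    # Same per-server utilization, different numbers of servers.
    for n in (1, 2, 4, 8):
        t = mean_response_time(n, utilization=0.8)
        print(f"M/M/{n} at 80% utilization: mean response time {t:.3f}")
```

Under these assumptions, the mean response time at a given per-server
utilization gets smaller as servers are added, which is consistent with
the knee moving to a higher utilization.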


Thank you,
Amine Moulay Ramdane.