Once you lock any M that is not waiting on a syscall to the CPU assigned to it, and unlock it while it waits, P is not needed anymore.
Basically you have NumCPU active Ms plus a number of Ms which are pure waiters. Since there can be only one active OS thread per CPU at any given time, this should suffice. And once your M is idle, just steal some runnable Gs from foreign Ms. No P needed.
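As a rough sketch of what I mean (all names here are invented for illustration, not the real runtime's):

    // Sketch of the M-only scheme: every running M is locked to a CPU
    // and keeps a local run queue; an idle M steals from other Ms.
    package sched

    import (
        "math/rand"
        "sync"
    )

    type G struct{} // a goroutine; details elided

    type M struct {
        mu   sync.Mutex
        runq []*G // runnable Gs local to this M
    }

    // findRunnable returns the next G for m, stealing half of a random
    // victim's queue when the local one is empty.
    func (m *M) findRunnable(allm []*M) *G {
        m.mu.Lock()
        if n := len(m.runq); n > 0 {
            g := m.runq[n-1]
            m.runq = m.runq[:n-1]
            m.mu.Unlock()
            return g
        }
        m.mu.Unlock()

        victim := allm[rand.Intn(len(allm))]
        if victim == m {
            return nil
        }
        victim.mu.Lock()
        half := len(victim.runq) / 2
        if half == 0 {
            victim.mu.Unlock()
            return nil // nothing to steal; caller retries or parks
        }
        stolen := append([]*G(nil), victim.runq[:half]...)
        victim.runq = victim.runq[half:]
        victim.mu.Unlock()

        m.mu.Lock()
        m.runq = append(m.runq, stolen[1:]...)
        m.mu.Unlock()
        return stolen[0]
    }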
I'm glad you are thinking about this.

Dmitry Vyukov <dvy...@google.com> writes:
> Here are some thoughts on subj:
> https://docs.google.com/document/d/1TTj4T2JO42uD5ID9e89oa0sLKhJYD0Y_kqxDv3I3XMw/edit
> I may start working on it in the near future.
> Feedback is welcome.
It sounds like you are saying that an M corresponds to a system thread,
as is the case today. A P loosely represents a processor, and we
require that every M attach to a P while running.
In the "Scheduling" section you say that a P picks a new G to run. But
presumably the P has to pick an M to run it. And elsewhere we see that
an M has to pick a P to run. So I guess I don't understand this. When
does a P pick a new G? How does it pick the M?
In the "Parking and Unparking" section you say that there are most
GOMAXPROCS spinning M's. But we could have a lot of M's simultaneously
come out of a syscall, with an associated G but without an associated P.
Won't they all be spinning waiting for a P?
Your initial point 4 says that syscalls lead to lots of thread blocking
and unblocking. Does your plan address that?
You speak of the P's as executing code. But as I think about this I'm
finding it easier to think of the M's as executing code. The P's become
a shared resource to manage memory, so that rather than the memory cache
being per-M, it is shared in such a way that each M has exclusive access
to a memory cache, and we only have as many memory caches as we have
simultaneously executing M's.
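To make that concrete, here is roughly how I picture it (illustration only; none of these names are meant to be the actual ones):

    // Illustration: P as a shared memory-cache token. An M must hold a
    // P while it runs, and there are only GOMAXPROCS of them, so at
    // most that many memory caches are ever in use at once.
    package sched

    import "runtime"

    type MCache struct{} // per-P allocation cache; details elided

    type P struct {
        mcache *MCache // used exclusively by whichever M holds this P
    }

    // idlePs is pre-filled with GOMAXPROCS Ps at startup.
    var idlePs = make(chan *P, runtime.GOMAXPROCS(0))

    func init() {
        for i := 0; i < cap(idlePs); i++ {
            idlePs <- &P{mcache: new(MCache)}
        }
    }

    // acquirep blocks until some P (and hence a memory cache) is free.
    func acquirep() *P { return <-idlePs }

    // releasep hands the P, with its cache, back for another M to use.
    func releasep(p *P) { idlePs <- p }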
You have arranged for an affinity for G's to P's. But what matters more
for performance purposes is an affinity for G's to CPU's. And while a P
loosely represents a CPU, the kernel doesn't know about that. It seems
that we also need an affinity for M's to P's, so that G's tend to wind
up executing on the same set of M's.
As you note in point 2, currently
G's often move between M's unnecessarily. Can we avoid that?
Is there some reasonable way that we could permit a G entering a syscall
to stay on the same M and P, and only have the P start running a new M
and G if the syscall does not return immediately?
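Something like this sketch is what I have in mind (the status values and the retake function are invented; the compare-and-swap is just to show who wins the race between the returning M and the monitor):

    // Sketch: the G keeps its M and P across a syscall; a monitor
    // thread retakes the P only if the syscall runs too long.
    package sched

    import "sync/atomic"

    const (
        pRunning int32 = iota
        pSyscall
        pIdle
    )

    type P struct{ status int32 }

    type M struct{ p *P }

    func entersyscall(m *M) {
        // Keep the P attached; just mark it as sitting in a syscall.
        atomic.StoreInt32(&m.p.status, pSyscall)
    }

    func exitsyscall(m *M) {
        // Fast path: the syscall was short and nobody retook the P.
        if atomic.CompareAndSwapInt32(&m.p.status, pSyscall, pRunning) {
            return
        }
        m.p = acquirep() // slow path: the monitor handed our P away
    }

    // retake runs periodically on a monitor thread. A real version
    // would also check how long the P has been sitting in pSyscall.
    func retake(m *M) {
        if atomic.CompareAndSwapInt32(&m.p.status, pSyscall, pIdle) {
            handoffp(m.p) // give the P to another M, or park it idle
        }
    }

    func acquirep() *P  { return &P{status: pRunning} } // stub
    func handoffp(p *P) {}                              // stub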
Minor note: "Some variables form M are": s/form/from/.
Yes, spinning is going to address this. For example, consider the
following scenario:

1. M1 returns from a syscall, finds that all P's are busy, and spins
   waiting for a P.
2. M2 enters a syscall, returns its P to the list of idle P's, and
   checks that there is already a spinning M (M1), so it does not need
   to wake anybody.
3. M1 discovers the idle P, picks it up, and continues executing Go
   code.

This does not involve any thread blocking/unblocking.
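In pseudo-Go, the handshake is roughly this (names invented for the sketch, and I am glossing over the memory-ordering details):

    // Sketch of the spinning handshake. nmspinning counts Ms that are
    // actively looking for a P; while it is nonzero, an M releasing a
    // P can skip the thread wakeup, because a spinner will find it.
    package sched

    import "sync/atomic"

    type P struct{}

    var nmspinning int32

    // Step 2: M2 enters a syscall and gives up its P.
    func releasepOnSyscall(p *P) {
        pushIdleP(p)
        if atomic.LoadInt32(&nmspinning) > 0 {
            return // M1 is spinning and will pick this P up
        }
        wakeM() // no spinners: unblock a parked M to take the P
    }

    // Steps 1 and 3: M1 polls for an idle P without blocking.
    func spinForP() *P {
        atomic.AddInt32(&nmspinning, 1)
        defer atomic.AddInt32(&nmspinning, -1)
        for i := 0; i < 1000; i++ { // bounded spin before parking
            if p := popIdleP(); p != nil {
                return p // found one; go run Go code on it
            }
        }
        return nil // give up; the caller blocks the thread instead
    }

    // Unsynchronized stubs, just so the sketch stands alone.
    var idle []*P

    func pushIdleP(p *P) { idle = append(idle, p) }

    func popIdleP() *P {
        if len(idle) == 0 {
            return nil
        }
        p := idle[len(idle)-1]
        idle = idle[:len(idle)-1]
        return p
    }

    func wakeM() {}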