To save you some time, here's another idea that I think *won't* work:
You might think you could have each builder work on a separate copy of
the source tree, and then redistribute the built archives among
the builders to let them begin work on the shared libraries. But I
think this won't work: Ninja decides whether an output is up to date
from file mtimes and its build log, so it won't recognize the
copied-in archives as valid outputs and will attempt to rebuild them.
In general it's hard to make distributed systems work with shared
filesystems.  Even with a single-process builder you have to be
careful about writing outputs atomically -- for example, if the
compiler writes half a .o and then the ninja process and compiler both
crash, you need some other mechanism to know that the file needs to be
rebuilt. (I think Ninja handles this case by writing a log line out
after the build succeeds, which means that if the log line is missing
Ninja knows the file is bad. But that is also the reason the
distribution scheme I described above won't work. Another scheme,
used in tools like redo, is to always write outputs into a temporary
path and then rename it into place upon completion, but as far as I
know that only works if each command produces a single output.)
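The temp-file-and-rename scheme can be sketched in a couple of lines
of shell (the file names here are illustrative, and the printf stands
in for a real compiler invocation):

```shell
set -e
# The "compile" writes to a temporary path first; a crash at this
# point leaves only foo.o.tmp behind, never a half-written foo.o.
printf 'fake object contents' > foo.o.tmp   # stand-in for: cc -c foo.c -o foo.o.tmp
# rename(2) is atomic on POSIX filesystems, so foo.o either doesn't
# exist or is the complete output -- no in-between state.
mv foo.o.tmp foo.o
```

The key property is that the rename only happens after the command
succeeds, so the output path itself doubles as the "this file is
good" marker.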
Because of this, the best practice I know of is to have a single
process be in charge of all scheduling and bookkeeping. Tools like
distcc work even without a shared filesystem because they ship all the
relevant files around when needed. (Note that distcc's "pump" mode
helps keep the network traffic down.)
Having written all of that, here's one final idea. Perhaps you could
break your build down into multiple separate build processes -- one
that builds the archives as before, but then a second one that is
unaware of how the archives are built and only knows to assemble them
together. (Effectively, the second step would treat the built
archives as inputs "from the system" and not things that can be built,
like source code.) You also would need to make these separate steps
use separate build directories (that is, you execute Ninja from
different directories or use the -C flag to specify a directory; you
could name these e.g. "stage1/" and "stage2/"). There would then be
no way for these steps to stomp on each other.
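To make the second stage concrete, here is what a fragment of its
build file might look like (a sketch -- the stage names, library
names, and link command are all made up for illustration):

```ninja
# stage2/build.ninja
# The archives produced by stage1 appear here only as plain inputs.
# stage2 has no rule that produces them, so it can never try to
# rebuild or clobber them -- it treats them like system files.
rule link
  command = g++ -shared -o $out $in

build libfoo.so: link ../stage1/libfoo.a ../stage1/libbar.a
```

You'd run the stages in sequence, e.g. "ninja -C stage1 && ninja -C
stage2", so each Ninja process keeps its bookkeeping in its own
directory.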
Unfortunately, that only solves part of your parallelization problem.
You still wouldn't be able to build multiple archives on multiple
machines in parallel.
PS: if you're on Linux you should look into "thin archives" -- they're
effectively just lists of .o files which means you don't spend build
time packing a bunch of .o files into a .a just to repack them again
into a .so. It also makes the "ar" step quick, which pairs well with
distcc.
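With GNU binutils ar, thin archives are just the "T" modifier; here's
a minimal demonstration (the library name is arbitrary, and the
placeholder files stand in for real .o files, since ar doesn't care
about their contents):

```shell
set -e
# Stand-ins for compiled object files.
printf 'obj1' > a.o
printf 'obj2' > b.o
# T makes a thin archive: it records the paths of a.o and b.o rather
# than copying their contents, so the step is nearly instant and the
# archive stays tiny no matter how big the objects are.
ar rcT libfoo.a a.o b.o
```

A thin archive starts with the magic string "!<thin>" instead of the
usual "!<arch>", which is an easy way to check what you got.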