Because it is.
Or, more precisely, perhaps GC *in itself* is not inefficient, but the
paradigm it exists to serve has turned out to be.
You see, garbage collection arose primarily from a practical problem in
object-oriented programming, and that problem is: When you dynamically
allocate tons and tons of individual objects, and they refer to each
other in a complicated mesh of references, how do you keep track of all
of this so that each object is properly destroyed once nothing refers to
it, and do it in a manner that's as easy for the programmer as possible,
as efficient as possible, and allows for circular dependencies without
causing leaks?
Thus, an incredible amount of academic and practical research work was
put into garbage collection algorithms and implementations that were as
efficient as possible.
Problem is, it's an efficient solution for a fundamentally inefficient
programming paradigm. It's a solution to something you actually shouldn't
be doing in the first place, in a modern computer system. What is this
thing you shouldn't be doing, you might ask?
Dynamically allocating tons and tons of individual objects. That's what.
In the 1980s and much of the 1990s, when OOP was the absolute king,
dynamically allocating tons of individual objects wasn't really such a
huge problem. CPUs didn't care how you were accessing memory, or how
the execution flow of the program jumped around. Each memory access was
equally slow regardless of which memory address you were using, and
every conditional jump was equally slow regardless of which way it went.
No longer.
CPUs started introducing memory caches, long instruction pipelines,
branch prediction, and a bunch of other optimizations, and suddenly
it mattered a great deal how you accessed memory and how you did
your conditional jumps.
It has turned out that one of the greatest programming paradigms that has
ever existed, object-oriented programming, is also a performance killer
on modern CPUs. This is because OOP, at least in its traditional form,
is extremely cache-unfriendly, branch-predictor-unfriendly, and
pipeline-unfriendly, and thus tends to produce inefficient
executables.
For this reason, for quite some time now, many of the major projects out
there that require extreme performance, such as game engines, have
been moving away from an OOP design toward a more data-oriented design,
which suits modern CPU architectures much better than OOP does.
And the thing about data-oriented design is that it neither benefits
from nor relies on a garbage collector nearly as much, because you
are trying to avoid scattered dynamic memory allocations in the first
place (and prefer to keep all data neatly in large arrays).
So yes, perhaps garbage collection *in itself* is not inefficient, but it's
fundamentally tied to a programming paradigm that is, and moving away from
that paradigm also lessens the need for GC.