I think the details aren't too onerous. Suppose f(x,y,z)=K defines z=g(x,y) implicitly.
For the graph z=g(x,y), the tangent plane at (a,b,c) (where c=g(a,b)) is:
z=c+g_x(a,b)(x-a)+g_y(a,b)(y-b),
or
-g_x(a,b)(x-a)-g_y(a,b)(y-b)+(z-c)=0.
Now sub in g_x=-f_x/f_z and g_y=-f_y/f_z (based on results from 12.5, which need f_z(a,b,c) to be nonzero) and multiply through by f_z to get
f_x(a,b,c)(x-a)+f_y(a,b,c)(y-b)+f_z(a,b,c)(z-c)=0.
From the section on planes, we can immediately conclude that the gradient of f at (a,b,c) is a normal vector to the tangent plane.
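Here is a quick numerical sanity check of that derivation, as a sketch only. The sphere f(x,y,z)=x^2+y^2+z^2 with K=14, the point (1,2,3), and the helper names (grad, plane_graph_form, plane_gradient_form) are my own choices for illustration, not anything from the text. It takes a point on the graph-form plane and confirms that it also satisfies the gradient-form equation.

```python
def f(x, y, z):
    # Hypothetical example surface: the sphere x^2 + y^2 + z^2 = 14
    return x * x + y * y + z * z

def grad(func, p, h=1e-6):
    """Central-difference estimate of the gradient of func at the point p."""
    out = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        out.append((func(*hi) - func(*lo)) / (2 * h))
    return out

a, b, c = 1.0, 2.0, 3.0            # a point on f = 14
fx, fy, fz = grad(f, (a, b, c))    # approximately (2, 4, 6)

# Slopes of the graph z = g(x, y), from implicit differentiation (needs fz != 0)
gx, gy = -fx / fz, -fy / fz

def plane_graph_form(x, y):
    # z = c + g_x (x - a) + g_y (y - b)
    return c + gx * (x - a) + gy * (y - b)

def plane_gradient_form(x, y, z):
    # Left-hand side of f_x (x - a) + f_y (y - b) + f_z (z - c) = 0
    return fx * (x - a) + fy * (y - b) + fz * (z - c)

# Pick any (x, y), find z on the graph-form plane, and plug into the gradient form
x, y = 1.3, 1.6
z = plane_graph_form(x, y)
residual = plane_gradient_form(x, y, z)
print("residual:", residual)       # should be zero up to rounding
```

Any other (x, y) should work equally well, since the two equations differ only by the factor f_z.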
I think this could almost be done in an aside, except that the plane equation wouldn't fit.
The part you leave out (although I think it's the cool part):
If grad f(a,b,c)=<0,0,0>, we call (a,b,c) a critical point. (That much is covered.)
The value f(a,b,c)=K is then called a critical value. [You might recall that I once objected to using "critical value" as a synonym for "critical point" in one variable.]
If K is not a critical value, we call K a regular value.
A fundamental theorem in differential topology is that if f is smooth and K is a regular value, the level set f(x,y,z)=K is a manifold.
In this context, that just means that a tangent plane is well defined at every point on the surface.
Why? If K is a regular value, then by definition, there are no critical points on the surface f(x,y,z)=K.
That means that at every point, at least one partial derivative is nonzero, so by the implicit function theorem the corresponding variable is defined implicitly (at least near that point) as a function of the others.
So whether you write z=g(x,y), or y=h(x,z), or x=k(y,z), implicit differentiation will lead to the same tangent plane equation.
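To see this concretely, here is a small sketch. The function f(x,y,z)=xy+yz^2+x^3 and the point (1,2,1) are made up for illustration, chosen so that all three partials are nonzero. Each way of solving for one variable gives its own normal vector, and the check is that all three are parallel to grad f (cross product zero), so they describe the same plane.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def grad_f(x, y, z):
    # Hypothetical example: f(x, y, z) = x*y + y*z**2 + x**3
    return (y + 3 * x**2, x + z**2, 2 * y * z)

p = (1.0, 2.0, 1.0)
fx, fy, fz = grad_f(*p)            # (5, 2, 4): all nonzero at this point

# Normal vector read off from each graph-form tangent plane:
#   z = g(x,y): (-g_x, -g_y, 1) with g_x = -f_x/f_z, g_y = -f_y/f_z
#   y = h(x,z): (-h_x, 1, -h_z) with h_x = -f_x/f_y, h_z = -f_z/f_y
#   x = k(y,z): (1, -k_y, -k_z) with k_y = -f_y/f_x, k_z = -f_z/f_x
normals = {
    "z=g(x,y)": (fx / fz, fy / fz, 1.0),
    "y=h(x,z)": (fx / fy, 1.0, fz / fy),
    "x=k(y,z)": (1.0, fy / fx, fz / fx),
}

# Two vectors are parallel exactly when their cross product vanishes
worst = 0.0
for name, n in normals.items():
    c = cross(n, (fx, fy, fz))
    size = max(abs(t) for t in c)
    worst = max(worst, size)
    print(name, "cross with grad f:", c)
```

Each normal is just grad f divided by one of its components, which is why the choice of variable doesn't matter as long as that component is nonzero.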
I assign this as an exploration every time I teach this material, because I think it's one of the more satisfying results.
Then we play around with some examples: if K is a critical value, what do the level sets look like?
(The family x^2+y^2-z^2=K is a good example: K=0 is the only critical value, and its level set is the cone, whose vertex at the origin is the only critical point.)
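A few lines of code make the example easy to poke at. This is only a sketch: the sample points and the helper is_critical_point are my own choices. It checks the gradient at one point on each of the level sets K=1, K=0 and K=-1, plus a second point on the cone away from the vertex.

```python
def f(x, y, z):
    return x * x + y * y - z * z

def grad_f(x, y, z):
    return (2 * x, 2 * y, -2 * z)

def is_critical_point(p, tol=1e-12):
    """True when every partial derivative vanishes at p."""
    return all(abs(g) <= tol for g in grad_f(*p))

# One sample point on each level set f = K (points chosen by hand)
samples = {
    1.0: (1.0, 0.0, 0.0),    # K > 0: hyperboloid of one sheet
    0.0: (0.0, 0.0, 0.0),    # K = 0: the vertex of the cone
    -1.0: (0.0, 0.0, 1.0),   # K < 0: hyperboloid of two sheets
}

results = {}
for K, p in samples.items():
    assert abs(f(*p) - K) < 1e-12   # the point really lies on f = K
    results[K] = is_critical_point(p)
    print("K =", K, "point", p, "critical:", results[K])

# Away from the vertex the cone is fine: (3, 4, 5) lies on it
# and the gradient there is nonzero, so a tangent plane exists.
other_cone_point = (3.0, 4.0, 5.0)
cone_normal = grad_f(*other_cone_point)
print("on cone:", f(*other_cone_point) == 0.0, "normal:", cone_normal)
```

Since grad f=<2x,2y,-2z> vanishes only at the origin, the vertex is the one place where the tangent plane equation degenerates to 0=0.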