This is complicated. Your formula is a little weird, so let's rewrite it a bit:
Ci = Kd * diffuse(N) + Ks * microfacet("beckmann", N, ...);
(experts: let's sweep all the details I know you're thinking of under the rug for a moment)
OK, so in a language like RSL, diffuse() and microfacet() are functions that return concrete color values, which you then scale, add, and assign to Ci, which is also a color. The renderer gets back Ci, numeric RGB values representing the outgoing radiance in a particular direction. It's a number like (0.25, 0.89, 0.21). If you want to know the outgoing radiance in a different direction, or from a different set of light positions, you have to run the shader again and get another Ci value, such as (0.37, 0.42, 0.74).
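To make that contract concrete, here's a minimal C++ sketch (run_shader, Color3, Vec3, and everything else here are made-up illustration names, not any real renderer's API):

    // The shader is a black box that returns one finished RGB value per run.
    struct Color3 { float r, g, b; };
    struct Vec3   { float x, y, z; };

    // Stand-in for executing the entire shader network for one direction:
    Color3 run_shader(const Vec3& view_dir /*, lights, shading point, ... */);

    // Color3 Ci  = run_shader(dir1);  // one run, one number: (0.25, 0.89, 0.21)
    // Color3 Ci2 = run_shader(dir2);  // new direction? rerun it all: (0.37, 0.42, 0.74)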
So, as you suspect, there is nothing, NOTHING preventing the careless shader writer (or dialer of shader parameters) from setting Kd and Ks to values that sum to more than 1.0.
But that's not OSL!
In OSL, closures are not colors, and they are NOT collections of numbers. A closure is a weighted sum of function pointers, together with the numeric arguments for those functions. So maybe a pictorial representation of OSL's Ci might be:
weight        function     arg1 (N)         arg2 (roughness)
-------------------------------------------------------------
.4,.4,.4      "diffuse"    (0.6,0.7,0.9)    -
.52,.49,.1    "beckmann"   (0.6,0.7,0.9)    0.01
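In renderer terms, Ci is now a little data structure rather than a number. A minimal C++ sketch of that idea, reusing Color3 and Vec3 from the sketch above (again, hypothetical names, not OSL's actual internals):

    #include <string>
    #include <vector>

    // One weighted term of the closure: which BSDF primitive to use, plus
    // the arguments the shader supplied for it (normal, roughness, ...).
    struct ClosureComponent {
        Color3      weight;     // e.g. (.4,.4,.4)
        std::string function;   // e.g. "diffuse" or "beckmann"
        Vec3        N;          // arg1: the shading normal
        float       roughness;  // arg2: only meaningful for some BSDFs
    };

    // Ci itself: a list of weighted components, NOT a final color.
    using Closure = std::vector<ClosureComponent>;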
The underlying implementations of diffuse and beckmann have additional parameters, the incoming and outgoing directions (you can think of them like I and L), that are not supplied by the shader. The renderer will supply those for each eventual call of the underlying diffuse and beckmann, as it chooses samples and directions. So at some point, an I and L will be chosen, and there will be a call to
outgoing_radiance += weight * function.eval (I, L, arg1, arg2, ...)
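Spelled out as a sketch (continuing the hypothetical types above, with eval_bsdf standing in for the renderer's dispatch to the actual diffuse/beckmann code):

    // Hypothetical: evaluate one BSDF primitive for a chosen I and L.
    float eval_bsdf(const ClosureComponent& c, const Vec3& I, const Vec3& L);

    Color3 integrate(const Closure& Ci, const Vec3& I, const Vec3& L) {
        Color3 out{0, 0, 0};
        for (const ClosureComponent& c : Ci) {
            float f = eval_bsdf(c, I, L);   // dispatch on c.function
            out.r += c.weight.r * f;
            out.g += c.weight.g * f;
            out.b += c.weight.b * f;
        }
        return out;
    }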
But that's long after the shader has completed. Also, there's more the renderer can do with the function than just evaluate it. How about this:
light_direction = function.sample (I, random1, random2, arg1, arg2, ...)
It can *sample* the functions (supplied with uniform random values, the "sample" method is expected to pick the right probability distribution function to make good samples in directions that are "important"). This is really, really tricky in a language like RSL because there is a chicken-and-egg problem: the BSDF evaluation functions need the light directions already chosen, but the renderer doesn't know which directions are important to sample without knowing the BSDF. Uh oh. The closure approach fixes all of this.
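For instance, the sample method for the diffuse lobe might be the standard cosine-weighted hemisphere sample (to_world, which builds a frame around N, is elided; as before, these names are just for illustration):

    #include <cmath>

    Vec3 to_world(const Vec3& local, const Vec3& N);  // hypothetical frame transform

    // Given two uniform randoms in [0,1), pick a direction whose pdf is
    // proportional to cos(theta): exactly where the diffuse lobe matters most.
    Vec3 sample_diffuse(const Vec3& N, float u1, float u2) {
        const float r   = std::sqrt(u1);
        const float phi = 6.28318530718f * u2;   // 2*pi
        Vec3 local{ r * std::cos(phi), r * std::sin(phi),
                    std::sqrt(1.0f - u1) };      // z = cos(theta)
        return to_world(local, N);
    }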
Anyway, in this scenario, it is trivial for the renderer to "fix" any situation where the shader has grossly over-weighted the BSDF contributions. Each BSDF primitive function will integrate to 1.0. So the renderer just sums the weights of the components, and if they exceed 1.0, it then divides all the weights by their sum, keeping them in the same balance but ensuring that they never sum to more than 1.0.
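A sketch of that fix, per color channel (same hypothetical types as above):

    // If any channel's weights total more than 1.0, scale that channel's
    // weights down by the total: same balance, but never summing past 1.0.
    void normalize_weights(Closure& Ci) {
        Color3 sum{0, 0, 0};
        for (const ClosureComponent& c : Ci) {
            sum.r += c.weight.r;
            sum.g += c.weight.g;
            sum.b += c.weight.b;
        }
        for (ClosureComponent& c : Ci) {
            if (sum.r > 1.0f) c.weight.r /= sum.r;
            if (sum.g > 1.0f) c.weight.g /= sum.g;
            if (sum.b > 1.0f) c.weight.b /= sum.b;
        }
    }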
Now, that's not perfect. It ensures that, integrated over ALL angles, it's energy conserving. But for a particular angle, various combinations of BSDFs (even if the sum of their weights is <= 1.0) could be non-conserving. This is an ongoing problem (though less severe than Kd+Ks > 1), and it's the reason that we have *pairs* of BSDFs that are meant to sum properly and mutually conserve energy. For example, a renderer might have a dielectric_reflection and a dielectric_refraction that, if given the same roughness and IOR, are guaranteed to match in a way that conserves energy. Even that is tricky, so the foolproof way is to make a single dielectric_everything that handles it all for you and leaves no room for the shader writer or lookdev TD to dial things to evade the rules of energy conservation. A lot of people are starting to gravitate to this approach, where a typical shader group uses just one closure that does it all (perhaps choosing among several, but never combining, except for sets that are purposely designed to produce the right results when coupled).
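One common way such a pair can guarantee the match (my illustration, not necessarily how any particular renderer does it) is to let the Fresnel term dictate the split, so the two weights can't help but sum to 1:

    // Hypothetical: Fresnel reflectance for this view angle and IOR.
    float fresnel(float cos_theta, float ior);

    void split_dielectric(float cos_theta, float ior,
                          float& w_reflection, float& w_refraction) {
        w_reflection = fresnel(cos_theta, ior);   // in [0,1]
        w_refraction = 1.0f - w_reflection;       // the rest, by construction
    }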
I hope this gives a somewhat more concrete idea of what's going on. It's a big conceptual shift, and while at first it seems like it's taking functionality away from the shader, what it's really doing is moving the division of labor between the shader and the renderer to a more sane place, with a clean line between describing the material and describing the rendering algorithm (including the sampling and integration). That turns out to have HUGE benefits.