There are multiple reasons why W'bal may not land close to 0 kJ at exhaustion. One is, of course, estimation error in CP and W'. In general, when the effort lasts longer than about 60-90 s, errors in W'bal at exhaustion have more to do with the CP estimate. When the effort is short, error in W' makes the bigger difference. This follows from the math: a CP error accumulates with every second of the effort, whereas a W' error is a fixed offset.
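To make that arithmetic concrete, here is a minimal sketch. It assumes a single constant-power effort taken to exhaustion and the standard 2-parameter model (P = CP + W'/t). Under those assumptions the W'bal the model reports at true exhaustion works out to (W'_est - W'_true) + (CP_est - CP_true) * t. The function name and the error sizes (10 W on CP, 750 J on W') are hypothetical and picked only for illustration.

```python
def wbal_error_at_exhaustion(cp_error_w, w_prime_error_j, duration_s):
    """Split the W'bal (J) reported at true exhaustion into its two sources.

    Assumes one constant-power effort to exhaustion under the 2-parameter
    CP model. Errors are "estimate minus true value".
    """
    cp_part = cp_error_w * duration_s   # accumulates with every second
    w_prime_part = w_prime_error_j      # fixed offset, independent of duration
    return cp_part, w_prime_part, cp_part + w_prime_part


# Hypothetical error sizes, for illustration only.
for t in (30, 75, 300, 1200):
    cp_part, w_part, total = wbal_error_at_exhaustion(10, 750, t)
    print(f"{t:>5} s: CP part {cp_part:>6} J, W' part {w_part} J, total {total} J")
```

With these made-up numbers the two parts are equal at 75 s. Where that crossover sits for you depends entirely on how big your own CP and W' errors are.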
If CP is underestimated, W'bal tends to drop negative during HIIT because you're telling the model that you are less fit than you really are, i.e. you can actually produce more power and expend more energy than the model "thinks". The converse applies when CP is overestimated: W'bal will still be above 0 at exhaustion. So, for example, it might be that your CP is slightly overestimated in this case.
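A rough simulation shows the effect. This is only a sketch under stated assumptions: it uses a simple differential-style balance (spend (P - CP) per second above CP, recover in proportion to the remaining deficit below CP), and it pretends the athlete's real physiology follows that same model with the "true" parameters. The athlete, the interval session and the function name are all hypothetical.

```python
def wbal_trace(power_w, cp, w_prime, dt=1.0):
    """Return the W'bal trace (J) for a power series sampled every dt seconds.

    Deliberately not clamped at zero, so a negative balance stays visible.
    """
    bal = w_prime
    trace = []
    for p in power_w:
        if p > cp:
            bal -= (p - cp) * dt                              # spending above CP
        else:
            bal += (cp - p) * (w_prime - bal) / w_prime * dt  # recovering below CP
        trace.append(bal)
    return trace


# Hypothetical athlete and session: 60 s at 450 W / 60 s at 150 W, repeated.
TRUE_CP, TRUE_W_PRIME = 300.0, 20000.0
session = ([450.0] * 60 + [150.0] * 60) * 10

true_trace = wbal_trace(session, TRUE_CP, TRUE_W_PRIME)
# First sample where the "true" balance is used up, i.e. exhaustion.
exhaustion = next(i for i, bal in enumerate(true_trace) if bal <= 0)

# What the model reports at that moment if CP is entered 15 W too low or too high.
under = wbal_trace(session, TRUE_CP - 15, TRUE_W_PRIME)[exhaustion]
over = wbal_trace(session, TRUE_CP + 15, TRUE_W_PRIME)[exhaustion]

print(f"exhaustion after {exhaustion + 1} s")
print(f"CP underestimated by 15 W: W'bal = {under / 1000:.1f} kJ")
print(f"CP overestimated by 15 W:  W'bal = {over / 1000:.1f} kJ")
```

The underestimated-CP trace ends below zero and the overestimated one ends with balance to spare, which is the pattern described above.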
Another factor is that performance is always variable anyway. So maybe you're a little bit more fatigued on the day you do a maximal-effort intermittent task than you were on the day or days when you did the efforts used to model CP.
Lastly, a problem common to all scenarios is that neither the integral model nor the differential model is perfect. At present the models are a bit too simplistic, so they're not robust across a broad range of situations, from short, sharp crit-race-style efforts at one end of the spectrum to longer sustained efforts with long recovery in between at the other (e.g. a road race circuit with one big hill, where everyone recovers on the flat section between climbs and then pushes hard on the hill each lap, which creates the selection).