save_in can only happen to an Auxiliary variable of the exact same type.
Let me see if I can explain this better:
Let's say you have one element with 4 nodes and you are solving one equation for "u" using Linear Lagrange shape functions (ie one "linear Quad").
You will have _4_ entries in your residual vector. One for each Lagrange shape function you are testing your residual against. For a refresher on what goes in each entry, check out the last equation in the "Numerical Integration" section (the first section) here:
http://mooseframework.org/wiki/MooseTraining/FEM/NumericalImplementation/. Note that that equation is for R_i where "i" is the test function.
So, for each test function (and there is one associated with each node with Linear Lagrange) you need to do an integration over the element and fill in R_i. So while you are "integrating" something over the element all of the contributions related to one test function all flow into _one_ entry in the residual. It's not spreading it out, or averaging it. It's integrating the current residual against that test function on that element and putting that value into R_i.
Because of the way Lagrange shape functions are constructed (such that only one shape function has a value at each node... and that value is 1.0) you can think of each of each of those test functions as being associated with one node. So, think about the residual as a vector for a moment. It has four entries in it... and we have 4 nodes. So if we want to do a vector L2 norm of the residual we have two choices:
1) We can loop through the residual directly... squaring each entry and summing up the result and ultimately taking the square root. This is what MOOSE (and PETSc) do internally during the solve when we're printing out the norm of the residual.
2) We can save a copy of the residual over in the Auxiliary system in a Linear Lagrange variable (which will give us 4 values just like the residual vector has and in the same order). Then we can loop through the nodes and lookup to see what entry in the residual vector (really the auxiliary solution vector that is holding a copy of the residual vector) goes with that node... then we can square that value at each node, sum all of those and then take a square root at the end. That's what the NodalL2Norm Postprocessor does.
Both things achieve the same goal. They are just two different ways of thinking about the residual vector.
NOW... like I said, that only works for Lagrange. If you are using more exotic shape functions like 3rd order Hermites then they actually have multiple residual entries associated with each node... so looping through the nodes and getting one value (like NodalL2Norm does), squaring them, summing them then square rooting them will NOT give you the same value as looping through the residual vector directly.
I apologize if that last point threw you off before. I just wanted you to be aware that using NodalL2Norm with other shape functions other than Lagrange will NOT yield the expected result...
Derek