Ursela,
If you want the solution vector to be sparse, you need to add a term that penalizes its l1 norm. I recommend adding a new residual block whose residual is simply the entire parameter vector, and attaching an L1 loss to that block. Adding an L1 or Huber loss to the data term only robustifies it; it does not make the solution sparse.
The basic nonlinear least squares problem is
\sum_i f^2_i(x_i,theta)
where x_i is your data and theta is the parameter you are trying to fit.
What you are doing right now is
\sum_i L(f^2_i(x_i,theta))
where L is some loss function.
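To make the difference between these two objectives concrete, here is a small Python sketch. The residual values and the names huber, plain_cost and robust_cost are made up for illustration, and the Huber formula is the usual one written as a function of the squared residual, so treat it as an assumption rather than the exact library definition.

```python
import math

def huber(s, a=1.0):
    """Huber loss on a squared residual s with scale a (assumed form)."""
    if s <= a * a:
        return s                              # quadratic region: unchanged
    return 2.0 * a * math.sqrt(s) - a * a     # linear growth for large residuals

def plain_cost(residuals):
    """sum_i f_i^2"""
    return sum(r * r for r in residuals)

def robust_cost(residuals, loss=huber):
    """sum_i L(f_i^2)"""
    return sum(loss(r * r) for r in residuals)

# Hypothetical residuals f_i(x_i, theta); the last one plays the outlier.
residuals = [0.1, -0.2, 10.0]
plain = plain_cost(residuals)
robust = robust_cost(residuals)
print(plain, robust)
```

The loss shrinks the contribution of the large residual, but it only acts on the data term, so nothing in it pushes the components of theta toward zero.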
What you want to solve is
\sum_i f^2_i(x_i,theta) + \lambda * |theta|_1
where |theta|_1 denotes the 1-norm of the parameter vector.
So you want a residual block corresponding to the term \lambda * |theta|_1.
Since we do not have an l1 norm loss, the next best thing is to use a smooth approximation to the l1 norm, SoftL1Loss, and solve
\sum_i f^2_i(x_i,theta) + \lambda * L(|theta|^2)
where L is the SoftL1Loss.
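Here is a rough Python sketch of that regularized objective. I am assuming the soft L1 loss has the form L(s) = 2 * (sqrt(1 + s) - 1), and the names soft_l1, penalty and objective are hypothetical. The sketch applies the loss to each component of theta separately, which is my reading of |theta|^2; applying it to the squared norm of the whole vector would instead approximate the 2-norm of theta.

```python
import math

def soft_l1(s):
    """Assumed soft L1 loss on a squared value s: 2 * (sqrt(1 + s) - 1)."""
    return 2.0 * (math.sqrt(1.0 + s) - 1.0)

def penalty(theta, lam):
    """lam * sum_j L(theta_j^2), a smooth stand-in for lam * |theta|_1."""
    return lam * sum(soft_l1(t * t) for t in theta)

def objective(residuals, theta, lam):
    """sum_i f_i^2 + lam * sum_j L(theta_j^2)"""
    data_term = sum(r * r for r in residuals)
    return data_term + penalty(theta, lam)

# Two hypothetical parameter vectors with the same 2-norm.
theta_sparse = [2.0, 0.0, 0.0, 0.0]
theta_dense = [1.0, 1.0, 1.0, 1.0]
p_sparse = penalty(theta_sparse, lam=1.0)
p_dense = penalty(theta_dense, lam=1.0)
print(p_sparse, p_dense)
```

With this form each term behaves like theta_j^2 near zero and like 2 * |theta_j| for large values, so the sparse vector gets the smaller penalty even though both vectors have the same 2-norm. Because the approximation is smooth at zero, expect small components to shrink rather than become exactly zero.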
HTH,
Sameer