Another enhancement to nonlinear regression algorithms is Golub and Pereyra's technique of projecting over any conditionally linear parameters. Statisticians could characterize it as profiling the residual sum of squares so that it becomes a function of the nonlinear parameters only.
It does help to stabilize the estimation, but most importantly it reduces the number of parameters for which initial estimates are required.
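To make the projection idea concrete, here is a minimal sketch in plain Python (standard library only). It is an illustration only, not the R code or any Julia implementation. It assumes a toy model y = a * exp(-b * x), in which a is conditionally linear and b is the only nonlinear parameter. The function names `profiled_rss` and `golden_section`, the synthetic data, and the choice of a golden-section search for the one-dimensional minimization are all assumptions made for the example. A real implementation would typically use a Gauss-Newton iteration on the projected problem.

```python
import math

def profiled_rss(b, x, y):
    """Return (rss, a_hat) for the model y ~ a * exp(-b * x) at a fixed b."""
    f = [math.exp(-b * xi) for xi in x]
    # For fixed b the model is linear in a, so the least squares
    # estimate of a has a closed form (a one-column linear regression).
    a_hat = sum(fi * yi for fi, yi in zip(f, y)) / sum(fi * fi for fi in f)
    rss = sum((yi - a_hat * fi) ** 2 for fi, yi in zip(f, y))
    return rss, a_hat

def golden_section(g, lo, hi, tol=1e-10):
    """Minimize a unimodal function g on the interval [lo, hi]."""
    invphi = (math.sqrt(5) - 1) / 2
    c = hi - invphi * (hi - lo)
    d = lo + invphi * (hi - lo)
    while hi - lo > tol:
        if g(c) < g(d):
            hi = d
        else:
            lo = c
        c = hi - invphi * (hi - lo)
        d = lo + invphi * (hi - lo)
    return (lo + hi) / 2

# Noise-free synthetic data (hypothetical values chosen for the example).
x = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
a_true, b_true = 2.5, 0.7
y = [a_true * math.exp(-b_true * xi) for xi in x]

# Only b needs a search interval (or starting value); a is profiled out.
b_hat = golden_section(lambda b: profiled_rss(b, x, y)[0], 0.01, 5.0)
rss_hat, a_hat = profiled_rss(b_hat, x, y)
print(b_hat, a_hat, rss_hat)
```

Note that the search runs over b alone. No initial estimate of a is ever supplied, because it is recovered from the linear least squares step at each trial value of b.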
I wrote the implementation of the nonlinear least squares algorithms in R and would be happy to help with such implementations in Julia. In fact, it would be a good idea, because I am supposed to be writing a second edition of our book on nonlinear regression (Bates and Watts, 1988, Wiley) but instead am writing Julia code for several other types of models.