I am not an expert in this, but I can suggest a few things:
The timestep is not ideal for X-H bonds, and even less so for gas-phase molecules, since the oscillation frequency is higher. I would use 0.5 fs.
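To make the timestep point concrete, here is a rough sketch that counts how many MD steps sample one X-H vibration period. The 3700 cm^-1 value is only an assumed, typical O-H stretch (it is not taken from your input), and the function names are mine.

```python
# Rough check: how many MD steps sample one X-H vibration period?
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def period_fs(wavenumber_cm1):
    """Vibrational period in femtoseconds for a mode given in cm^-1."""
    frequency_hz = C_CM_PER_S * wavenumber_cm1
    return 1.0e15 / frequency_hz


def steps_per_period(wavenumber_cm1, timestep_fs):
    """Number of MD steps that fall inside one vibrational period."""
    return period_fs(wavenumber_cm1) / timestep_fs


# 3700 cm^-1 is an assumed O-H stretch, used only for illustration.
for dt in (0.5, 1.0, 2.0):
    n = steps_per_period(3700.0, dt)
    print(f"dt = {dt} fs -> {n:.1f} steps per period")
```

A period of roughly 9 fs means that 0.5 fs gives about 18 points per oscillation, while 2 fs gives fewer than 5, which is too coarse to integrate that motion well.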
Another important point: the temperature is not well defined for a single step (it is a statistical property), and even less so if the system has only 50 atoms. That may prevent a "fast or regular" convergence of the temperature.
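As a rough estimate of how noisy the instantaneous temperature is expected to be, the canonical-ensemble result sigma_T / T = sqrt(2 / N_dof) can be evaluated directly. This is only a sketch: I take N_dof = 3N (ignoring constraints and removed centre-of-mass motion) and assume a 300 K reference temperature, which may not be your value.

```python
import math


def temperature_fluctuation(n_atoms, t_ref):
    """Expected standard deviation of the instantaneous temperature (K).

    Uses sigma_T = T * sqrt(2 / N_dof) with N_dof = 3 * n_atoms,
    which is an approximation (no constraints, no removed modes).
    """
    n_dof = 3 * n_atoms
    return t_ref * math.sqrt(2.0 / n_dof)


# 300 K is an assumed reference temperature, used only for illustration.
for n in (50, 300):
    sigma = temperature_fluctuation(n, 300.0)
    print(f"{n} atoms: about +/- {sigma:.1f} K around the average")
```

With 50 atoms the natural fluctuation is about 35 K at 300 K, against about 14 K for 300 atoms, so a noisy temperature curve is expected for a small system even when the thermostat is working properly.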
Also, you are not using a PBC system, so I do not know whether the kinetic energy, and therefore the temperature, is well defined.
In my case, I usually use a liquid system (300 atoms) with PBC and the same Nosé-Hoover parameters, except for TIMECON=50. However, I run classical molecular dynamics before the ab initio MD to obtain a better initial guess, one with a better correspondence between the kinetic energy and the potential energy (positions). Those systems reached the reference temperature and equilibrium in less than 1 ps (with the typical oscillation around the reference temperature, but with the average value matching well).
Judging from the temperature plot and the initial molecular geometry, I suspect you started from an optimized (relaxed) geometry. That can make it take longer to reach the reference temperature, because the thermostat must inject a large amount of kinetic energy to reach equilibrium.
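To illustrate why starting from a relaxed geometry costs kinetic energy, here is a toy sketch (a single 1D harmonic oscillator in reduced units, not your real system). It starts at the potential minimum with all its energy as kinetic energy and integrates with velocity Verlet. The function name and all parameters are hypothetical choices for the example.

```python
def mean_kinetic_fraction(n_steps=20000, dt=0.01, k=1.0, m=1.0, v0=1.0):
    """Time-averaged kinetic energy divided by the initial kinetic energy.

    Toy model: 1D harmonic oscillator started at the minimum (x = 0),
    integrated with velocity Verlet in reduced units.
    """
    x, v = 0.0, v0
    initial_ke = 0.5 * m * v0 * v0
    force = -k * x
    total_ke = 0.0
    for _ in range(n_steps):
        v += 0.5 * dt * force / m   # first half kick
        x += dt * v                 # drift
        force = -k * x              # force at the new position
        v += 0.5 * dt * force / m   # second half kick
        total_ke += 0.5 * m * v * v
    return total_ke / n_steps / initial_ke


print(f"average KE / initial KE = {mean_kinetic_fraction():.2f}")
```

On average about half of the initial kinetic energy ends up as potential energy. In a roughly harmonic system this means that velocities assigned at the reference temperature on top of a relaxed geometry give an effective temperature of about half that value, and the thermostat has to supply the rest.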
Sorry if this is not helpful. Regards