Hello Yan,
This is probably due to the frequency of logins that are being performed or stale ssh keys but it would help me understand the issue better if I had more context on what your scripts are doing. I also have a few more questions:
What is the OS you are running? I ask because we have seen these types of issues occur with Ubuntu based images that come with features like SSH Guard and disabling SSH guard through a startup script is what we recommend. Sample script is below:
#! /bin/bash
sudo apt-get remove --auto-remove sshguard
sudo apt-get purge --auto-remove sshguard
Also, is there any chance you are rotating SSH keys frequently? This would explain the abruptness you have observed where this just suddenly stopped working. The SSH keys might have been rotated during the period your test was running.
You can check this by navigating to Compute Engine > Metadata in your GCP console then click ‘SSH Keys’ and check for any expiry dates on keys. If they are short lived keys, switching to more persistent keys that do not expire is what we recommend in these cases.
We also have a script you can use to debug SSH issues[1] all you need to do is follow the steps on that page to get some logging around what can be causing the issue.
Hope that helps Yan.
[1] https://github.com/GoogleCloudPlatform/compute-ssh-diagnostic-sh