Hi.
I am new to the community, and to Selenium Grid too. Pleasure to meet you :)
I am not sure whether I have encountered a bug, so I am checking here before opening an issue.
I have deployed the selenium-chart in my k8s cluster (Selenium Grid 4.1.4) and am using the KEDA autoscaler to scale the browser pods. By default it scales them down to 0 when there are no jobs in the queue.
After encountering some timeout issues, we tested this by removing the autoscalers and scheduling jobs with no browser pods running (the browser deployments were still scaled to 0).
We found that the jobs fail after about a minute with the errors "ERROR webdriver: RequestError: socket hang up" and "ERROR @wdio/runner: Error: Failed to create session. socket hang up".
Knowing that Selenium's 'SE_SESSION_REQUEST_TIMEOUT' and 'SE_NODE_SESSION_TIMEOUT' default to 5 minutes, I concluded that either there is a bug and these timeouts aren't being applied (we also tried increasing 'SE_SESSION_REQUEST_TIMEOUT' to 10 minutes with the env variable), or something else is at play.
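In case it helps anyone reproduce this, here is a rough sketch of how the failure time can be compared against the configured timeout. It is not tied to wdio or to any Selenium API: `time_attempt` and `failed_before_timeout` are hypothetical helper names of mine, and `create_session` stands for whatever call opens a session in your client.

```python
import time
from typing import Callable, Tuple


def time_attempt(create_session: Callable[[], object]) -> Tuple[float, str]:
    """Run create_session once and return (elapsed_seconds, outcome)."""
    start = time.monotonic()
    try:
        create_session()
        outcome = "ok"
    except Exception as exc:  # any client error counts as a failed attempt
        outcome = f"failed: {exc}"
    return time.monotonic() - start, outcome


def failed_before_timeout(elapsed_seconds: float,
                          configured_timeout_seconds: float) -> bool:
    """True when the attempt ended before the configured timeout elapsed."""
    return elapsed_seconds < configured_timeout_seconds
```

With the numbers from this post, `failed_before_timeout(60, 300)` is `True`: the failure after about a minute arrives well before the 5 minute default, which is why I suspect the request is being cut off before that timeout can apply.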
We continued testing by restoring the KEDA autoscalers, but configured them to scale down to 1 instead of 0, and so far we no longer seem to be getting any timeouts.
It takes the browser pod a few minutes to become ready. Could it be that the timeout settings are only applied when at least one browser pod is available?
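For anyone who wants to check the same thing on their side, this is a minimal sketch of how I would look at whether the Grid has any node registered at the moment a job is scheduled. It assumes the Grid's `/status` endpoint returns JSON with `value.ready` and `value.nodes`; `fetch_status` and `summarize_status` are hypothetical helper names, and the Grid URL is a placeholder for your own service address.

```python
import json
from typing import Tuple
from urllib.request import urlopen


def summarize_status(payload: dict) -> Tuple[bool, int]:
    """Return (ready, node_count) from a Grid /status JSON payload.

    Assumes the payload shape {"value": {"ready": bool, "nodes": [...]}}.
    """
    value = payload.get("value", {})
    ready = bool(value.get("ready", False))
    nodes = value.get("nodes", [])
    return ready, len(nodes)


def fetch_status(grid_url: str, timeout_seconds: float = 5.0) -> dict:
    """GET <grid_url>/status and return the decoded JSON body."""
    url = f"{grid_url.rstrip('/')}/status"
    with urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)


# Example (placeholder URL, replace with your hub or router service):
# ready, node_count = summarize_status(fetch_status("http://selenium-hub:4444"))
```

Running this in a loop while a browser pod starts should show how long the Grid stays at zero nodes, which can then be compared with the point at which the session request fails.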