Hello all,
We have been running several BigBlueButton servers without any trouble
for a few months now, but yesterday one of them exhibited a strange
problem for a short while: for about an hour, users were getting the
"1002: Could not make a WebSocket connection" error, while the nginx
error log contained lines of the form:
upstream timed out (110: Connection timed out) while connecting to
upstream, client: <client IP>, server: <server hostname>,
request: "GET /ws HTTP/1.1", upstream: "https://<server IP>:7443/ws"
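
To pin down exactly when the timeouts started and stopped, something like
the following could be run over the nginx error log. It is only a minimal
sketch: the function name is made up for illustration, and it assumes the
standard nginx error log layout, where each entry is a single line starting
with "YYYY/MM/DD HH:MM:SS [error]".

```python
import re
from collections import Counter

# Standard nginx error log entries start with "YYYY/MM/DD HH:MM:SS [level]".
# Capture the timestamp down to the minute.
TIMESTAMP = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2} \[error\]")


def timeouts_per_minute(lines):
    """Count 'upstream timed out' errors for /ws requests, keyed by minute.

    Hypothetical helper, not part of nginx or BigBlueButton.
    """
    counts = Counter()
    for line in lines:
        # Only consider the WebSocket proxy timeouts quoted above.
        if "upstream timed out" not in line or '"GET /ws ' not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing it an open log file (the path depends on the nginx configuration)
and printing the sorted items gives a per-minute count that can be lined up
against the CPU, RAM and network graphs.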
I didn't restart BBB immediately, because I wanted to make sure there
were no sessions in progress on this server, but I checked
"systemctl status freeswitch.service", which reported it as "active
(running)". "journalctl -xe -u freeswitch.service" didn't show any
errors, either. Just as I started checking the logs, the errors
disappeared, and users were able to connect normally again (without any
sysadmin action).
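
Since the nginx message is about the TCP connect to port 7443 timing out,
and systemd only reports whether the process is alive, one way to catch a
recurrence might be a small connect probe along these lines. This is a
sketch under assumptions: the function name and the 5-second default
timeout are illustrative, and the port is simply the one from the log line
above.

```python
import socket
import time


def probe(host, port=7443, timeout=5.0):
    """Attempt a plain TCP connect and report how it went.

    Returns (ok, seconds_elapsed, error_text). Hypothetical helper; it only
    tests the TCP handshake, not TLS or the WebSocket upgrade.
    """
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
```

Calling probe() with the server's IP from cron or a loop and logging the
result with a timestamp would show whether the port stops accepting
connections while the service still reports "active (running)".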
CPU, RAM and network graphs didn't show anything out of the ordinary,
and the number of users connected to the server before the problem
started wasn't especially high (a few hours earlier, the server had
handled more than twice as many users).
Any ideas about the cause of the disruption, or where I might look to
determine it?
Thanks!
--
Alexandros Diamantidis *
ad...@hellug.gr