Hi all,
We're experiencing an issue that started out of nowhere on July 22nd. The server stopped responding, and we saw that it was out of memory, with high CPU usage from the garbage collector. It was sudden: the server had been stable for months, and the last change was made more than a month earlier.
The problem starts by itself about 5 minutes after startup, with no clients connected, and is always reproducible. The timing changes if I increase the number of database connections: with more connections it takes longer to appear. After excluding a lot of variables, such as clients, I tried reverting to a month-old backup of the database, and the problem went away. Since there are no clients, the problem seems to be data-related and triggered by one of the routines that HAPI runs periodically.
We receive batch loads from partners from time to time, and it seems that some of the data that came in is the cause. Has anyone ever come across this?
Sometimes there are connection resets or timeouts from the database when the problem occurs. I have a feeling that calls are made to the database but take a long time to return, so work starts to queue up, and that causes the high memory usage.
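One way the queueing theory could be checked is by sampling heap usage, accumulated GC time, and thread count over the first few minutes after startup. Below is a minimal sketch using only the standard java.lang.management beans. The class name HeapGcSampler and its methods are made up for illustration and are not part of HAPI, and the sampler only reports on the JVM it runs in, so it would have to be started inside the HAPI JVM (for example from a background thread) to say anything about the server.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of HAPI: samples the JVM it is running in.
class HeapGcSampler {

    /** Heap currently in use, in megabytes. */
    static long usedHeapMb() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed() / (1024L * 1024L);
    }

    /** Accumulated GC time since JVM start, summed over all collectors, in ms. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Number of live threads; a steady climb may hint at work piling up. */
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** One line of output per sample, easy to grep or plot later. */
    static String sample() {
        return String.format("heapUsedMb=%d gcMillis=%d threads=%d",
                usedHeapMb(), totalGcMillis(), liveThreads());
    }

    public static void main(String[] args) throws InterruptedException {
        // Number of samples is an arbitrary default for this sketch.
        int samples = args.length > 0 ? Integer.parseInt(args[0]) : 3;
        for (int i = 0; i < samples; i++) {
            System.out.println(sample());
            Thread.sleep(1000);
        }
    }
}
```

If heap and thread count climb together while the database calls are slow, that would fit the queueing explanation; if heap climbs with a flat thread count, a single large result set being loaded into memory would seem more likely.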
Details about our setup are:
HAPI version: 2.5
Java 1.8
Amazon deployed
12 GB of RAM allocated to HAPI's JVM
4 vCores
Oracle database (the hardware shows no sign of being maxed out; on the contrary)
Thanks,
Carlos