SIGTERM in AKS pods during upgrade to v14.0.0

Morten Jønby

unread,

Dec 20, 2021, 6:04:12 AM12/20/21

to Keycloak User

Hi.

We are upgrading from 11.0.3 to 14.0.0. We have done it several times now with success, but we are now running into an issue where the Keycloak receives a SIGTERM from the OS during the upgrade.

We did experience this during some of our tests and believe it was due to lack of heap and or OS memory, but we are now running on a node with 32GB of RAM and 16GB allocated heap, but still the upgrade fails!

We have 321 realms on this instance of Keycloak. Not sure it that's a lot.

It fails after ~5 minutes.

From some of our previous tests, it was the Migrate12to14() method (or something like that) which failed.

We are in desperate need of some kind of help, so would really appreciate some good suggestions.

Thanks in advance,

Morten

Morten Jønby

unread,

Dec 20, 2021, 6:26:07 AM12/20/21

to Keycloak User

By enabling DEBUG we see this:

11:14:06,614 INFO [org.hibernate.Version] (ServerService Thread Pool -- 62) HHH000412: Hibernate Core {5.3.20.Final}
11:14:06,615 INFO [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 62) HHH000206: hibernate.properties not found
11:14:06,705 INFO [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 62) HCANN000001: Hibernate Commons Annotations {5.0.5.Final}
11:14:06,989 INFO [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 62) HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL95Dialect
11:14:08,049 INFO [org.hibernate.engine.jdbc.env.internal.LobCreatorBuilderImpl] (ServerService Thread Pool -- 62) HHH000424: Disabling contextual LOB creation as createClob() method threw error : java.lang.reflect.InvocationTargetException
11:14:08,052 INFO [org.hibernate.type.BasicTypeRegistry] (ServerService Thread Pool -- 62) HHH000270: Type registration [java.util.UUID] overrides previous : org.hibernate.type.UUIDBinaryType@3327edae
11:14:08,056 INFO [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 62) Envers integration enabled? : true
11:14:08,375 INFO [org.hibernate.orm.beans] (ServerService Thread Pool -- 62) HHH10005002: No explicit CDI BeanManager reference was passed to Hibernate, but CDI is available on the Hibernate ClassLoader.
11:14:08,413 INFO [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 62) HV000001: Hibernate Validator 6.0.22.Final
11:14:09,161 INFO [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 62) HHH000397: Using ASTQueryTranslatorFactory
11:14:09,380 DEBUG [org.keycloak.connections.jpa.DefaultJpaConnectionProviderFactory] (ServerService Thread Pool -- 62) Calling migrateModel
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) new JtaTransactionWrapper
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) was existing? false
11:14:09,546 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 62) Migrating older model to 12.0.0
11:18:58,822 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
("core-service" => "management"),
("management-interface" => "http-interface")
]'
11:18:58,840 INFO [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-1) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.3.jar
11:18:58,840 INFO [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-2) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.4.jar
11:18:58,840 DEBUG [org.keycloak.services.DefaultKeycloakSessionFactory] (MSC service thread 1-1) undeploy

*****

Any idea what timeout this is and what property to set to change it?

Thanks.

Jan Lieskovsky

unread,

Dec 20, 2021, 6:42:09 AM12/20/21

to Morten Jønby, Keycloak User

On Mon, Dec 20, 2021 at 12:26 PM Morten Jønby <morten...@gmail.com> wrote:

By enabling DEBUG we see this:

11:14:06,614 INFO [org.hibernate.Version] (ServerService Thread Pool -- 62) HHH000412: Hibernate Core {5.3.20.Final}
11:14:06,615 INFO [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 62) HHH000206: hibernate.properties not found
11:14:06,705 INFO [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 62) HCANN000001: Hibernate Commons Annotations {5.0.5.Final}
11:14:06,989 INFO [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 62) HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL95Dialect
11:14:08,049 INFO [org.hibernate.engine.jdbc.env.internal.LobCreatorBuilderImpl] (ServerService Thread Pool -- 62) HHH000424: Disabling contextual LOB creation as createClob() method threw error : java.lang.reflect.InvocationTargetException
11:14:08,052 INFO [org.hibernate.type.BasicTypeRegistry] (ServerService Thread Pool -- 62) HHH000270: Type registration [java.util.UUID] overrides previous : org.hibernate.type.UUIDBinaryType@3327edae
11:14:08,056 INFO [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 62) Envers integration enabled? : true
11:14:08,375 INFO [org.hibernate.orm.beans] (ServerService Thread Pool -- 62) HHH10005002: No explicit CDI BeanManager reference was passed to Hibernate, but CDI is available on the Hibernate ClassLoader.
11:14:08,413 INFO [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 62) HV000001: Hibernate Validator 6.0.22.Final
11:14:09,161 INFO [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 62) HHH000397: Using ASTQueryTranslatorFactory
11:14:09,380 DEBUG [org.keycloak.connections.jpa.DefaultJpaConnectionProviderFactory] (ServerService Thread Pool -- 62) Calling migrateModel
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) new JtaTransactionWrapper
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) was existing? false
11:14:09,546 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 62) Migrating older model to 12.0.0
11:18:58,822 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the

Hello Morten,

try to set the jboss.as.management.blocking.timeout to longer period than the 5 minutes default.

HTH,

Jan

service container was 'add' at address '[
("core-service" => "management"),
("management-interface" => "http-interface")
]'
11:18:58,840 INFO [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-1) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.3.jar
11:18:58,840 INFO [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-2) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.4.jar
11:18:58,840 DEBUG [org.keycloak.services.DefaultKeycloakSessionFactory] (MSC service thread 1-1) undeploy

*****

Any idea what timeout this is and what property to set to change it?

Thanks.

mandag den 20. december 2021 kl. 12.04.12 UTC+1 skrev Morten Jønby:
Hi.
We are upgrading from 11.0.3 to 14.0.0. We have done it several times now with success, but we are now running into an issue where the Keycloak receives a SIGTERM from the OS during the upgrade.
We did experience this during some of our tests and believe it was due to lack of heap and or OS memory, but we are now running on a node with 32GB of RAM and 16GB allocated heap, but still the upgrade fails!

We have 321 realms on this instance of Keycloak. Not sure it that's a lot.
It fails after ~5 minutes.

From some of our previous tests, it was the Migrate12to14() method (or something like that) which failed.

We are in desperate need of some kind of help, so would really appreciate some good suggestions.

Thanks in advance,
Morten

--
You received this message because you are subscribed to the Google Groups "Keycloak User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keycloak-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/keycloak-user/5c5878fc-f867-4ca3-94fc-f1d427d2c86en%40googlegroups.com.

WELTON RODRIGO TORRES NASCIMENTO

unread,

Dec 20, 2021, 6:54:40 AM12/20/21

to Jan Lieskovsky, Morten Jønby, Keycloak User

By the logs, and considering this is running in kubernetes, I strongly believe the health check is killing the pod.

I had that issue before and the logs seems identical.

I suppose you could raise the healthy periods to a much higher number to check that and later, after the database is updated, lower it again to the previous value.

You said 5 minutes, this is either health checks or progressDeadlineSeconds, which is 5 minutes by default.

You can check the pod by kubectl describe pod xxxx to see which one is happening.

—

Welton Rodrigo Torres nascimento

Em 20 de dez. de 2021, à(s) 08:42, Jan Lieskovsky <jlie...@redhat.com> escreveu:

To view this discussion on the web visit https://groups.google.com/d/msgid/keycloak-user/CAPr-%2BaG-EQHqz8pP-vH3NhBKuWvRAXnBa3mYaUyQVOuvjRAYoQ%40mail.gmail.com.

Morten Jønby

unread,

Dec 20, 2021, 7:34:29 AM12/20/21

to WELTON RODRIGO TORRES NASCIMENTO, Jan Lieskovsky, Keycloak User

Hi Welton.

We have tried to set

startupProbe:

failureThreshold: 120

Did you have any other probes in mind?

This still didn't resolve the issue, but we got these warnings now (first time only):

12:06:39,391 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 61) Migrating older model to 12.0.0
12:11:34,311 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a470270:28de92f5:61c071c4:e in state RUN
12:11:34,315 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffff0a470270:28de92f5:61c071c4:e
12:11:35,369 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a470270:28de92f5:61c071c4:11 in state RUN
12:11:35,371 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffff0a470270:28de92f5:61c071c4:11

We couln't find the progressDeadlineSeconds and this seem to be for Deployment's only. We are using a StatefulSet.

Thanks for your help!

- Morten

Morten Jønby

unread,

Dec 20, 2021, 9:07:38 AM12/20/21

to WELTON RODRIGO TORRES NASCIMENTO, stho...@redhat.com, Jan Lieskovsky, Keycloak User

Hi all.

We seem to have upgraded successfully now, but perhaps some of this ought to be handled by the upgrade as alluded to by Welton.

We ended up setting the following:

1) Kubernetes StatefulSet:

startupProbe:

failureThreshold: 120

2) Blocking timeout

Setting this in the Helmchart:

-Djboss.as.management.blocking.timeout=600

3) Transaction timeout

Configured via startup script variable in Helmchart:

change-transaction-default-timeout.cli: |
embed-server --server-config=standalone-ha.xml --std-out=echo
echo Changing transaction default timeout
/subsystem=transactions/:write-attribute(name=default-timeout,value=3600)
echo Done
stop-embedded-server

We are not completely sure if all 3 are required, but it worked.

------

Mvh. Morten

Reply all

Reply to author

Forward