SIGTERM in AKS pods during upgrade to v14.0.0

98 views
Skip to first unread message

Morten Jønby

unread,
Dec 20, 2021, 6:04:12 AM12/20/21
to Keycloak User
Hi.
We are upgrading from 11.0.3 to 14.0.0. We have done it several times now with success, but we are now running into an issue where the Keycloak receives a SIGTERM from the OS during the upgrade.
We did experience this during some of our tests and believe it was due to lack of heap and or OS memory, but we are now running on a node with 32GB of RAM and 16GB allocated heap, but still the upgrade fails!

We have 321 realms on this instance of Keycloak. Not sure it that's a lot.
It fails after ~5 minutes.

From some of our previous tests, it was the Migrate12to14() method (or something like that) which failed.

We are in desperate need of some kind of help, so would really appreciate some good suggestions.

Thanks in advance,
Morten

Morten Jønby

unread,
Dec 20, 2021, 6:26:07 AM12/20/21
to Keycloak User
By enabling DEBUG we see this:

11:14:06,614 INFO  [org.hibernate.Version] (ServerService Thread Pool -- 62) HHH000412: Hibernate Core {5.3.20.Final}
11:14:06,615 INFO  [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 62) HHH000206: hibernate.properties not found
11:14:06,705 INFO  [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 62) HCANN000001: Hibernate Commons Annotations {5.0.5.Final}
11:14:06,989 INFO  [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 62) HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL95Dialect
11:14:08,049 INFO  [org.hibernate.engine.jdbc.env.internal.LobCreatorBuilderImpl] (ServerService Thread Pool -- 62) HHH000424: Disabling contextual LOB creation as createClob() method threw error : java.lang.reflect.InvocationTargetException
11:14:08,052 INFO  [org.hibernate.type.BasicTypeRegistry] (ServerService Thread Pool -- 62) HHH000270: Type registration [java.util.UUID] overrides previous : org.hibernate.type.UUIDBinaryType@3327edae
11:14:08,056 INFO  [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 62) Envers integration enabled? : true
11:14:08,375 INFO  [org.hibernate.orm.beans] (ServerService Thread Pool -- 62) HHH10005002: No explicit CDI BeanManager reference was passed to Hibernate, but CDI is available on the Hibernate ClassLoader.
11:14:08,413 INFO  [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 62) HV000001: Hibernate Validator 6.0.22.Final
11:14:09,161 INFO  [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 62) HHH000397: Using ASTQueryTranslatorFactory
11:14:09,380 DEBUG [org.keycloak.connections.jpa.DefaultJpaConnectionProviderFactory] (ServerService Thread Pool -- 62) Calling migrateModel
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) new JtaTransactionWrapper
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) was existing? false
11:14:09,546 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 62) Migrating older model to 12.0.0
11:18:58,822 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the service container was 'add' at address '[
    ("core-service" => "management"),
    ("management-interface" => "http-interface")
]'
11:18:58,840 INFO  [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-1) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.3.jar
11:18:58,840 INFO  [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-2) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.4.jar
11:18:58,840 DEBUG [org.keycloak.services.DefaultKeycloakSessionFactory] (MSC service thread 1-1) undeploy

*****

Any idea what timeout this is and what property to set to change it?

Thanks.

Jan Lieskovsky

unread,
Dec 20, 2021, 6:42:09 AM12/20/21
to Morten Jønby, Keycloak User
On Mon, Dec 20, 2021 at 12:26 PM Morten Jønby <morten...@gmail.com> wrote:
By enabling DEBUG we see this:

11:14:06,614 INFO  [org.hibernate.Version] (ServerService Thread Pool -- 62) HHH000412: Hibernate Core {5.3.20.Final}
11:14:06,615 INFO  [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 62) HHH000206: hibernate.properties not found
11:14:06,705 INFO  [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 62) HCANN000001: Hibernate Commons Annotations {5.0.5.Final}
11:14:06,989 INFO  [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 62) HHH000400: Using dialect: org.hibernate.dialect.PostgreSQL95Dialect
11:14:08,049 INFO  [org.hibernate.engine.jdbc.env.internal.LobCreatorBuilderImpl] (ServerService Thread Pool -- 62) HHH000424: Disabling contextual LOB creation as createClob() method threw error : java.lang.reflect.InvocationTargetException
11:14:08,052 INFO  [org.hibernate.type.BasicTypeRegistry] (ServerService Thread Pool -- 62) HHH000270: Type registration [java.util.UUID] overrides previous : org.hibernate.type.UUIDBinaryType@3327edae
11:14:08,056 INFO  [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 62) Envers integration enabled? : true
11:14:08,375 INFO  [org.hibernate.orm.beans] (ServerService Thread Pool -- 62) HHH10005002: No explicit CDI BeanManager reference was passed to Hibernate, but CDI is available on the Hibernate ClassLoader.
11:14:08,413 INFO  [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 62) HV000001: Hibernate Validator 6.0.22.Final
11:14:09,161 INFO  [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 62) HHH000397: Using ASTQueryTranslatorFactory
11:14:09,380 DEBUG [org.keycloak.connections.jpa.DefaultJpaConnectionProviderFactory] (ServerService Thread Pool -- 62) Calling migrateModel
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) new JtaTransactionWrapper
11:14:09,381 DEBUG [org.keycloak.transaction.JtaTransactionWrapper] (ServerService Thread Pool -- 62) was existing? false
11:14:09,546 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 62) Migrating older model to 12.0.0
11:18:58,822 ERROR [org.jboss.as.controller.management-operation] (Controller Boot Thread) WFLYCTL0348: Timeout after [300] seconds waiting for service container stability. Operation will roll back. Step that first updated the

Hello Morten,


HTH,
Jan

 
service container was 'add' at address '[
    ("core-service" => "management"),
    ("management-interface" => "http-interface")
]'
11:18:58,840 INFO  [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-1) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.3.jar
11:18:58,840 INFO  [org.keycloak.subsystem.server.extension.KeycloakProviderDeploymentProcessor] (MSC service thread 1-2) Undeploying Keycloak provider: com.stibo.keycloak.userfederation.adapter-1.0.4.jar
11:18:58,840 DEBUG [org.keycloak.services.DefaultKeycloakSessionFactory] (MSC service thread 1-1) undeploy

*****

Any idea what timeout this is and what property to set to change it?

Thanks.

mandag den 20. december 2021 kl. 12.04.12 UTC+1 skrev Morten Jønby:
Hi.
We are upgrading from 11.0.3 to 14.0.0. We have done it several times now with success, but we are now running into an issue where the Keycloak receives a SIGTERM from the OS during the upgrade.
We did experience this during some of our tests and believe it was due to lack of heap and or OS memory, but we are now running on a node with 32GB of RAM and 16GB allocated heap, but still the upgrade fails!

We have 321 realms on this instance of Keycloak. Not sure it that's a lot.
It fails after ~5 minutes.

From some of our previous tests, it was the Migrate12to14() method (or something like that) which failed.

We are in desperate need of some kind of help, so would really appreciate some good suggestions.

Thanks in advance,
Morten

--
You received this message because you are subscribed to the Google Groups "Keycloak User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to keycloak-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/keycloak-user/5c5878fc-f867-4ca3-94fc-f1d427d2c86en%40googlegroups.com.

WELTON RODRIGO TORRES NASCIMENTO

unread,
Dec 20, 2021, 6:54:40 AM12/20/21
to Jan Lieskovsky, Morten Jønby, Keycloak User
By the logs, and considering this is running in kubernetes, I strongly believe the health check is killing the pod.

I had that issue before and the logs seems identical.

I suppose you could raise the healthy periods to a much higher number to check that and later, after the database is updated, lower it again to the previous value.

You said 5 minutes, this is either health checks or progressDeadlineSeconds, which is 5 minutes by default. 

You can check the pod by kubectl describe pod xxxx to see which one is happening.

Welton Rodrigo Torres nascimento

Em 20 de dez. de 2021, à(s) 08:42, Jan Lieskovsky <jlie...@redhat.com> escreveu:



Morten Jønby

unread,
Dec 20, 2021, 7:34:29 AM12/20/21
to WELTON RODRIGO TORRES NASCIMENTO, Jan Lieskovsky, Keycloak User
Hi Welton.
We have tried to set

 startupProbe:
      failureThreshold: 120

Did you have any other probes in mind?

This still didn't resolve the issue, but we got these warnings now (first time only):

12:06:39,391 DEBUG [org.keycloak.migration.MigrationModelManager] (ServerService Thread Pool -- 61) Migrating older model to 12.0.0
12:11:34,311 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a470270:28de92f5:61c071c4:e in state RUN
12:11:34,315 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffff0a470270:28de92f5:61c071c4:e
12:11:35,369 WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff0a470270:28de92f5:61c071c4:11 in state RUN
12:11:35,371 WARN [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffff0a470270:28de92f5:61c071c4:11
We couln't find the progressDeadlineSeconds and this seem to be for Deployment's only. We are using a StatefulSet.

Thanks for your help!

- Morten

Morten Jønby

unread,
Dec 20, 2021, 9:07:38 AM12/20/21
to WELTON RODRIGO TORRES NASCIMENTO, stho...@redhat.com, Jan Lieskovsky, Keycloak User
Hi all.
We seem to have upgraded successfully now, but perhaps some of this ought to be handled by the upgrade as alluded to by Welton.

We ended up setting the following:

1) Kubernetes StatefulSet:
startupProbe:
      failureThreshold: 120

2) Blocking timeout
Setting this in the Helmchart:
-Djboss.as.management.blocking.timeout=600


3) Transaction timeout
Configured via startup script variable in Helmchart:

change-transaction-default-timeout.cli: |
  embed-server --server-config=standalone-ha.xml --std-out=echo
  echo Changing transaction default timeout
  /subsystem=transactions/:write-attribute(name=default-timeout,value=3600)
  echo Done
  stop-embedded-server



We are not completely sure if all 3 are required, but it worked. 

------
Mvh. Morten

Reply all
Reply to author
Forward
0 new messages