Hazelcast is not replicating tickets.

Mallory, Erik

Jul 19, 2017, 11:18:12 AM

to CAS Community

Hello,

I have configured two CAS 5.1 nodes and I am using a NetScaler to load balance them. I was doing some resiliency testing by logging into the CAS service, shutting down the HTTP service on the node I am connected to, and then refreshing the connection. When I do this, I get rerouted to the CAS login screen again. I see no errors or failures in the logs. Hazelcast logging is set to debug, and I see heartbeats and communication between the two nodes, but the service is not behaving as I expect.

Here is my configuration for Hazelcast:

cas.ticket.registry.hazelcast.pageSize=500
cas.ticket.registry.hazelcast.mapName=tickets
#cas.ticket.registry.hazelcast.configLocation=/data/cas/config/cas5/hazelcast.xml
cas.ticket.registry.hazelcast.cluster.evictionPolicy=LRU
cas.ticket.registry.hazelcast.cluster.maxNoHeartbeatSeconds=300
cas.ticket.registry.hazelcast.cluster.multicastEnabled=false
cas.ticket.registry.hazelcast.cluster.evictionPercentage=10
cas.ticket.registry.hazelcast.cluster.tcpipEnabled=true
cas.ticket.registry.hazelcast.cluster.members=appdev-523.wichita.edu,appdev-524.wichita.edu
cas.ticket.registry.hazelcast.cluster.loggingType=slf4j
cas.ticket.registry.hazelcast.cluster.instanceName=cas-dev
cas.ticket.registry.hazelcast.cluster.port=5701
cas.ticket.registry.hazelcast.cluster.portAutoIncrement=true
cas.ticket.registry.hazelcast.cluster.maxHeapSizePercentage=85
cas.ticket.registry.hazelcast.cluster.backupCount=1
cas.ticket.registry.hazelcast.cluster.asyncBackupCount=0
cas.ticket.registry.hazelcast.cluster.maxSizePolicy=USED_HEAP_PERCENTAGE
cas.ticket.registry.hazelcast.cluster.timeout=5

The logs generally look like this.

3.wichita.edu]:5701 [dev] [3.8.1] Applied new partition state. Version: 543, caller: [appdev-524.wichita.edu]:5701>

cas-2017-07-18-11-1.log:2017-07-18 11:12:54,087 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Received heartbeat from Member [appdev-524.wichita.edu]:5701 - 5a3577aa-4567-4629-a162-92d2cab86bd2 (now: 2017-07-18 11:12:54.085, timestamp: 2017-07-18 11:12:54.085)>

cas-2017-07-18-11-1.log:2017-07-18 11:12:54,087 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Setting cluster time diff to -2ms.>

cas-2017-07-18-11-1.log:2017-07-18 11:12:59,087 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Received heartbeat from Member [appdev-524.wichita.edu]:5701 - 5a3577aa-4567-4629-a162-92d2cab86bd2 (now: 2017-07-18 11:12:59.085, timestamp: 2017-07-18 11:12:59.085)>

cas-2017-07-18-11-1.log:2017-07-18 11:12:59,087 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Setting cluster time diff to -2ms.>

cas-2017-07-18-11-1.log:2017-07-18 11:13:04,087 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Received heartbeat from Member [appdev-524.wichita.edu]:5701 - 5a3577aa-4567-4629-a162-92d2cab86bd2 (now: 2017-07-18 11:13:04.085, timestamp: 2017-07-18 11:13:04.085)>

cas-2017-07-18-11-1.log:2017-07-18 11:13:04,087 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Setting cluster time diff to -2ms.>

cas-2017-07-18-11-1.log:2017-07-18 11:13:09,081 DEBUG [com.hazelcast.internal.partition.InternalPartitionService] - <[appdev-523.wichita.edu]:5701 [dev] [3.8.1] Master version should be greater than ours! Local version: 543, Master version: 543 Master: [appdev-524.wichita.edu]:5701>

Any help you can give would be greatly appreciated.

Thanks,

Erik Mallory

Server Analyst

Wichita State University

316.978.3502

Carlos Fernandez

Jul 19, 2017, 3:46:05 PM

to cas-...@apereo.org

Hi, Erik,

I had a similar problem using Hazelcast as the storage service for a Shibboleth installation. I found that the member list should only include *other* servers, not the local host. In your case, appdev-523 should have cas.ticket.registry.hazelcast.cluster.members set to appdev-524.wichita.edu, and vice versa. Once I set it up that way, Hazelcast stopped logging weird partition errors.
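
For illustration only, here is a minimal Python sketch of that suggestion: it takes one shared member list and prints the per-node property line with the local host left out. The helper name other_members is hypothetical (it is not part of CAS or Hazelcast), and the hostnames are the ones from Erik's configuration. Whether excluding the local host is actually required may depend on your setup, so treat this as a sketch of the idea rather than a confirmed fix.

```python
def other_members(all_members: str, local_host: str) -> str:
    """Return the comma-separated member list with the local host removed.

    Hypothetical helper for illustration; comparison is case-insensitive
    and surrounding whitespace in the list is ignored.
    """
    hosts = [h.strip() for h in all_members.split(",") if h.strip()]
    return ",".join(h for h in hosts if h.lower() != local_host.strip().lower())


# Shared list, as in the original configuration from this thread.
shared = "appdev-523.wichita.edu,appdev-524.wichita.edu"

# Print the property line each node would use under this suggestion.
for node in shared.split(","):
    print(f"# on {node}")
    print(f"cas.ticket.registry.hazelcast.cluster.members={other_members(shared, node)}")
```

Something like this could run in a deployment script, so the same source list can be kept on every server while each node still gets a list of only its peers.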

Best regards,
--
Carlos M. Fernández
Enterprise Systems Manager
Saint Joseph’s University
Philadelphia PA 19131
T: +1 610 660 1501

--
- CAS gitter chatroom: https://gitter.im/apereo/cas
- CAS mailing list guidelines: https://apereo.github.io/cas/Mailing-Lists.html
- CAS documentation website: https://apereo.github.io/cas
- CAS project website: https://github.com/apereo/cas
---
You received this message because you are subscribed to the Google Groups "CAS Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cas-user+unsubscribe@apereo.org.
To view this discussion on the web visit https://groups.google.com/a/apereo.org/d/msgid/cas-user/24FACB15-FA65-42F0-9595-F4D1353B7C63%40wichita.edu.

Mallory, Erik

Jul 19, 2017, 5:10:15 PM

to cas-...@apereo.org

Hello Carlos,

Thanks for the advice. I have made the change, but it still does not seem to be helping. I cleared all of my browsing data and tried again. This time the logs are a bit different.

I log into cas-dev successfully, and the logs on that node show:

2017-07-19 15:55:12,149 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: Supplied credentials: [f282c439]

ACTION: AUTHENTICATION_SUCCESS

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:55:12,149 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: Supplied credentials: [f282c439]

ACTION: AUTHENTICATION_SUCCESS

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:55:12,255 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: TGT-**********************************************Vn7zvmKG6w-cas-dev.wichita.edu

ACTION: TICKET_GRANTING_TICKET_CREATED

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:55:12,255 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: TGT-**********************************************Vn7zvmKG6w-cas-dev.wichita.edu

ACTION: TICKET_GRANTING_TICKET_CREATED

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:55:12,321 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: ST-1-wacoVmE2DgaKj4WZqGmx-cas-dev.wichita.edu for https://cas-dev.wichita.edu/cas-management/manage.html

ACTION: SERVICE_TICKET_CREATED

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:55:12,321 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: f282c439

WHAT: ST-1-wacoVmE2DgaKj4WZqGmx-cas-dev.wichita.edu for https://cas-dev.wichita.edu/cas-management/manage.html

ACTION: SERVICE_TICKET_CREATED

APPLICATION: CAS

WHEN: Wed Jul 19 15:55:12 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

Then I shut down access to the first node (shut down HTTP) and refresh. Here are the logs from the second node:

2017-07-19 15:56:06,157 DEBUG [com.hazelcast.internal.partition.InternalPartitionService] - <[156.26.183.186]:5701 [dev] [3.8.1] Master version should be greater than ours! Local version: 543, Master version: 543 Master: [156.26.183.187]:5701>

2017-07-19 15:56:06,157 DEBUG [com.hazelcast.internal.partition.operation.PartitionStateOperation] - <[156.26.183.186]:5701 [dev] [3.8.1] Applied new partition state. Version: 543, caller: [156.26.183.187]:5701>

2017-07-19 15:56:06,162 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[156.26.183.186]:5701 [dev] [3.8.1] Received heartbeat from Member [156.26.183.187]:5701 - ea6ee5d9-f6f2-433e-b11a-93df21a99a05 (now: 2017-07-19 15:56:06.161, timestamp: 2017-07-19 15:56:06.162)>

2017-07-19 15:56:06,162 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[156.26.183.186]:5701 [dev] [3.8.1] Setting cluster time diff to 0ms.>

2017-07-19 15:56:10,970 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: audit:unknown

WHAT: [event=success,timestamp=Wed Jul 19 15:56:10 CDT 2017,source=RankedAuthenticationProviderWebflowEventResolver]

ACTION: AUTHENTICATION_EVENT_TRIGGERED

APPLICATION: CAS

WHEN: Wed Jul 19 15:56:10 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:56:10,970 INFO [org.apereo.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN

=============================================================

WHO: audit:unknown

WHAT: [event=success,timestamp=Wed Jul 19 15:56:10 CDT 2017,source=RankedAuthenticationProviderWebflowEventResolver]

ACTION: AUTHENTICATION_EVENT_TRIGGERED

APPLICATION: CAS

WHEN: Wed Jul 19 15:56:10 CDT 2017

CLIENT IP ADDRESS: 156.26.183.205

SERVER IP ADDRESS: unknown

=============================================================

>

2017-07-19 15:56:11,162 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[156.26.183.186]:5701 [dev] [3.8.1] Received heartbeat from Member [156.26.183.187]:5701 - ea6ee5d9-f6f2-433e-b11a-93df21a99a05 (now: 2017-07-19 15:56:11.162, timestamp: 2017-07-19 15:56:11.162)>

2017-07-19 15:56:11,162 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[156.26.183.186]:5701 [dev] [3.8.1] Setting cluster time diff to 0ms.>

2017-07-19 15:56:15,466 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[156.26.183.186]:5701 [dev] [3.8.1] Sending MasterConfirmation to Member [156.26.183.187]:5701 - ea6ee5d9-f6f2-433e-b11a-93df21a99a05>

2017-07-19 15:56:16,162 DEBUG [com.hazelcast.internal.cluster.impl.ClusterHeartbeatManager] - <[156.26.183.186]:5701 [dev] [3.8.1] Received heartbeat from Member [156.26.183.187]:5701 - ea6ee5d9-f6f2-433e-b11a-93df21a99a05 (now: 2017-07-19 15:56:16.162, timestamp: 2017-07-19 15:56:16.162)>

2017-07-19 15:56:16,162 DEBUG [com.hazelcast.internal.cluster.ClusterService] - <[156.26.183.186]:5701 [dev] [3.8.1] Setting cluster time diff to 0ms.>

I’m not sure if it’s normal for the WHO: to be audit:unknown. It makes sense, because I’m already authenticated and I assume CAS would be checking for a ticket in Hazelcast rather than requiring authentication. Nonetheless, I’m kicked to the login page on the second node.

If anyone has any ideas, I’m all ears.


Larisa Tudos

Oct 12, 2017, 5:25:22 AM

to CAS Community, Erik.M...@wichita.edu

Hi All,

I have the same issue described by Erik - when one node goes down, another node kicks me to the login page. Please help to overcome the issue.

Thanks

Larisa

Waldbieser, Carl

Oct 12, 2017, 11:05:45 AM

to cas-user, Erik Mallory

We are running CAS v5.1.3 with Hazelcast. Both our production and Integration testing environments have 3 nodes. We have them behind a web proxy. We haven't had any issue with losing tickets if I take down a node and bring it back up, though the throughput of the live nodes does dip a bit as they attempt to rebalance the ticket load. The only Hazelcast settings we've changed from the default are as follows:

# Hazelcast
cas.ticket.registry.hazelcast.cluster.members={{csv_cluster_ip_addrs}}
cas.ticket.registry.hazelcast.crypto.signing.key={{hazelcast_signing_key}}
cas.ticket.registry.hazelcast.crypto.encryption.key={{hazelcast_encryption_key}}
cas.ticket.registry.hazelcast.cluster.backupCount=0
cas.ticket.registry.hazelcast.cluster.asyncBackupCount=2

The "{{...}}" placeholders need to be filled in with your own values.

Thanks,
Carl Waldbieser
ITS Systems Programmer
Lafayette College

--
- Website: https://apereo.github.io/cas
- Gitter Chatroom: https://gitter.im/apereo/cas
- List Guidelines: https://goo.gl/1VRrw7
- Contributions: https://goo.gl/mh7qDG


Adam Causey

Feb 2, 2018, 4:45:00 PM

to cas-...@apereo.org, Erik Mallory

Carl,

How did you generate the keys for your Hazelcast settings?

Also, do you have all of your nodes listed in the 'cas.ticket.registry.hazelcast.cluster.members' property, including the localhost IP? I synchronize my properties between servers, so being able to list the IPs of all servers here would be beneficial.

Thanks!

Adam
