Cas multiple service tickets created and multiple tickets failed validation for same user

459 views
Skip to first unread message

John Stevens II

unread,
Aug 2, 2016, 2:49:49 PM8/2/16
to CAS Community
I have a problem that randomly happens, after hitting my service url successfully I get redirected to CAS login page, I login successfully and get redirected to my service but the service fails to load with the browser error "performance.example.com redirected you too many times." 

Setup:
2 Active node servers behind F5 lb w/ source afffinity
shared couchbase ticket and service registry

When the issue occurs in the logs the service tickets are created on one server and are validated on the other server (fails validation)

CAS Server 1 (cas1.example.com):

=============================================================
WHO: user1
WHAT: Supplied credentials: [user1]
ACTION: AUTHENTICATION_SUCCESS
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,579 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
WHAT: TGT-**********************************************OsqfxL4sVn-cas1.example.com
ACTION: TICKET_GRANTING_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,600 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <Granted ticket [ST-68-YJvyjEaYZCeKhFhHxet1-cas1.example.com] for service [http://performance.example.com/cas/login/] and principal [user1]>
2016-08-02 14:35:07,608 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: user1
ACTION: SERVICE_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,706 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <Granted ticket [ST-69-IjRkj39xVHYQHIlTZaer-cas1.example.com] for service [http://performance.example.com/cas/login/] and principal [user1]>
2016-08-02 14:35:07,710 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: user1
ACTION: SERVICE_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,803 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <Granted ticket [ST-70-lOLGaHuNJ6GWz2euScXS-cas1.example.com] for service [http://performance.example.com/cas/login/] and principal [user1]>
2016-08-02 14:35:07,807 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: user1
ACTION: SERVICE_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,895 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <Granted ticket [ST-71-gTqf6iPzhHTA1JhXux0Z-cas1.example.com] for service [http://performance.example.com/cas/login/] and principal [user1]>
2016-08-02 14:35:07,899 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: user1
ACTION: SERVICE_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================

>
2016-08-02 14:35:07,983 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <Granted ticket [ST-72-vA1nKtvrcHWlgLODqn5e-cas1.example.com] for service [http://performance.example.com/cas/login/] and principal [user1]>
2016-08-02 14:35:07,987 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: user1
ACTION: SERVICE_TICKET_CREATED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:07 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.142
=============================================================



CAS Server 2 (cas2.example.com):

=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:10 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================

>
2016-08-02 14:35:22,082 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <ServiceTicket [ST-68-YJvyjEaYZCeKhFhHxet1-cas1.example.com] has expired.>
2016-08-02 14:35:22,091 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:22 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================

>
2016-08-02 14:35:22,174 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <ServiceTicket [ST-69-IjRkj39xVHYQHIlTZaer-cas1.example.com] has expired.>
2016-08-02 14:35:22,181 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:22 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================

>
2016-08-02 14:35:22,274 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <ServiceTicket [ST-70-lOLGaHuNJ6GWz2euScXS-cas1.example.com] has expired.>
2016-08-02 14:35:22,281 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:22 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================

>
2016-08-02 14:35:22,364 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <ServiceTicket [ST-71-gTqf6iPzhHTA1JhXux0Z-cas1.example.com] has expired.>
2016-08-02 14:35:22,370 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:22 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================

>
2016-08-02 14:35:22,579 INFO [org.jasig.cas.CentralAuthenticationServiceImpl] - <ServiceTicket [ST-72-vA1nKtvrcHWlgLODqn5e-cas1.example.com] has expired.>
2016-08-02 14:35:22,591 INFO [org.jasig.inspektr.audit.support.Slf4jLoggingAuditTrailManager] - <Audit trail record BEGIN
=============================================================
WHO: audit:unknown
ACTION: SERVICE_TICKET_VALIDATE_FAILED
APPLICATION: CAS
WHEN: Tue Aug 02 14:35:22 EDT 2016
CLIENT IP ADDRESS: 192.168.0.100
SERVER IP ADDRESS: 192.168.21.143
=============================================================




Misagh Moayyed

unread,
Aug 2, 2016, 4:18:58 PM8/2/16
to CAS Community

John Stevens II

unread,
Aug 2, 2016, 4:50:59 PM8/2/16
to CAS Community
Misagh,

I've looked at the docs and increased the the service ticket timeout to 30 seconds: 
 st.timeToKillInSeconds=30

Now I am no longer receiving the too many redirect errors and am able to login to my service and management web application but I am confused about something.

1. I was able to confirm that node1 is the only one having the problem (I have all my logs set to debug) and the nodes should be the same and I can't figure out why I need to increase the st ttl on node1 but not node2.
2. Node1 and node2 share ticket registries so I don't believe it is a problem with that.

Any insight would be helpful, Thanks for your help.

Misagh Moayyed

unread,
Aug 3, 2016, 1:22:30 AM8/3/16
to CAS Community
1. That is a bad idea. Your nodes need to share the same configuration. While this may work for now, it will eventually break. It works now because the gods are favoring node1 to issue tickets and node2 to validate them. Whoever issues the ticket gets to decide how long it should last. 
2. As the doc says, look into your app and figure out why it’s taking 20+ seconds to submit a validation event. 30-second timeouts are not unheard of though. 

How’s couchbase working out for you? 

John Stevens II

unread,
Aug 3, 2016, 7:17:55 AM8/3/16
to CAS Community
I agree I only increased the st ttl to see if it would fix the problem but that troublesome node is not active in the load balancer, it was disabled for troubleshooting.

The reason i'm confused is because the default st ttl is 10 seconds (correct me if i'm wrong) but my ticket is immediately invalidated, so less then 10 seconds and I can't find anything relevant in the logs as to why.

As for couchbase it seems to be working well outside of the issue you help me solve with the connection to the buckets timing out, since I increased the timeout to 60 seconds I haven't had any other problems.

Any suggestions would be great and as always your help is appreciated.

Misagh Moayyed

unread,
Aug 3, 2016, 11:50:59 AM8/3/16
to CAS Community

Your tickets are not immediately invalidated. You have ST-68-YJvyjEaYZCeKhFhHxet1-cas1.example.com issued at 14:35:07. When you trace the corresponding validation event, you see that request is received at 14:35:22. 22-7=15. 15 is greater than the default 10. Ergo... 


Since your nodes are sharing the same registry instance, that node1 vs node2 discussion is irrelevant. As you may have observed, you will just as well experience the same problem with a single node without couchbase even involved. As I said, you need to investigate the app to figure out why it’s taking 10+ seconds to issue a validation event. Fix it there. If you can’t, you might as well live with an adjusted timeout value. 

Ray Bon

unread,
Aug 3, 2016, 12:30:47 PM8/3/16
to cas-...@apereo.org
John,

Your service ticket is not 'immediately invalidated'. It seems unlikely your client takes so long to request st validation. It looks like it sends you back to CAS multiple times (5 times in 400 ms), then there is a 14 s delay before your client attempts to validate the tickets.
Could the servers be out of temporal sync (approx 15 s)? Server 2 is 'in the past', st tickets created here will have 'more time' to be validated which is why the config change has no effect. Server 1 st need a longer ttl because they are validated 'in the past'.

Ray
--
You received this message because you are subscribed to the Google Groups "CAS Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cas-user+u...@apereo.org.
To post to this group, send email to cas-...@apereo.org.
Visit this group at https://groups.google.com/a/apereo.org/group/cas-user/.
To view this discussion on the web visit https://groups.google.com/a/apereo.org/d/msgid/cas-user/0cbc4ad7-4c7e-4f57-9fc5-e825072931e0%40apereo.org.
For more options, visit https://groups.google.com/a/apereo.org/d/optout.

John Stevens II

unread,
Aug 3, 2016, 1:59:02 PM8/3/16
to CAS Community, rb...@uvic.ca
Ray/Misagh,

I thought of that when I was out to lunch and checked the servers just now and there was a 10+ second time difference on server 1. 

I corrected it and removed the 30 st ttyl I set in cas.properties, restarted and now everything is working.

Going to monitor the servers and will post any updates. 

Thanks for the help guys.

Misagh Moayyed

unread,
Aug 9, 2016, 11:25:12 AM8/9/16
to CAS Community
1. That is a bad idea. Your nodes need to share the same configuration. While this may work for now, it will eventually break. It works now because the gods are favoring node1 to issue tickets and node2 to validate them. Whoever issues the ticket gets to decide how long it should last. 
2. As the doc says, look into your app and figure out why it’s taking 20 seconds to submit a validation event.

How’s couchbase working out for you? 

-- 
Misagh
--
You received this message because you are subscribed to the Google Groups "CAS Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cas-user+u...@apereo.org.
To post to this group, send email to cas-...@apereo.org.
Visit this group at https://groups.google.com/a/apereo.org/group/cas-user/.

Misagh Moayyed

unread,
Aug 9, 2016, 11:26:06 AM8/9/16
to CAS Community

From: John Stevens II <jstev...@gmail.com>
Reply: John Stevens II <jstev...@gmail.com>
Date: August 2, 2016 at 11:49:57 AM
To: CAS Community <cas-...@apereo.org>
--
You received this message because you are subscribed to the Google Groups "CAS Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cas-user+u...@apereo.org.
To post to this group, send email to cas-...@apereo.org.
Visit this group at https://groups.google.com/a/apereo.org/group/cas-user/.

Misagh Moayyed

unread,
Aug 9, 2016, 11:26:13 AM8/9/16
to CAS Community

Your tickets are not immediately invalidated. You have ST-68-YJvyjEaYZCeKhFhHxet1-cas1.example.com issued at 14:35:07. When you trace the corresponding validation event, you see that request is received at 14:35:22. 22-7=15. 15 is greater than the default 10. Ergo... 

Since your nodes are sharing the same registry instance, that node1 vs node2 discussion is irrelevant. As you may have observed, you will just as well experience the same problem with a single node without couchbase even involved. As I said, you need to investigate the app to figure out why it’s taking 10+ seconds to issue a validation event. Fix it there. If you can’t, you might as well live with an adjusted timeout value. 

-- 
Misagh

From: John Stevens II <jstev...@gmail.com>
Reply: John Stevens II <jstev...@gmail.com>
Date: August 3, 2016 at 4:18:05 AM
To: CAS Community <cas-...@apereo.org>
Subject:  [cas-user] Re: Cas multiple service tickets created and multiple tickets failed validation for same user
I agree I only increased the st ttl to see if it would fix the problem but that troublesome node is not active in the load balancer, it was disabled for troubleshooting.

The reason i'm confused is because the default st ttl is 10 seconds (correct me if i'm wrong) but my ticket is immediately invalidated, so less then 10 seconds and I can't find anything relevant in the logs as to why.

As for couchbase it seems to be working well outside of the issue you help me solve with the connection to the buckets timing out, since I increased the timeout to 60 seconds I haven't had any other problems.

Any suggestions would be great and as always your help is appreciated.


On Wednesday, August 3, 2016 at 1:22:30 AM UTC-4, Misagh Moayyed wrote:
1. That is a bad idea. Your nodes need to share the same configuration. While this may work for now, it will eventually break. It works now because the gods are favoring node1 to issue tickets and node2 to validate them. Whoever issues the ticket gets to decide how long it should last. 
2. As the doc says, look into your app and figure out why it’s taking 20+ seconds to submit a validation event. 30-second timeouts are not unheard of though. 

How’s couchbase working out for you? 

On Tuesday, August 2, 2016 at 1:50:59 PM UTC-7, John Stevens II wrote:
Misagh,

I've looked at the docs and increased the the service ticket timeout to 30 seconds: 
 st.timeToKillInSeconds=30

Now I am no longer receiving the too many redirect errors and am able to login to my service and management web application but I am confused about something.

1. I was able to confirm that node1 is the only one having the problem (I have all my logs set to debug) and the nodes should be the same and I can't figure out why I need to increase the st ttl on node1 but not node2.
2. Node1 and node2 share ticket registries so I don't believe it is a problem with that.

Any insight would be helpful, Thanks for your help.

On Tuesday, August 2, 2016 at 4:18:58 PM UTC-4, Misagh Moayyed wrote:
See https://apereo.github.io/cas/4.2.x/installation/Troubleshooting-Guide.html

On Tuesday, August 2, 2016 at 11:49:49 AM UTC-7, John Stevens II wrote:
Reply all
Reply to author
Forward
0 new messages