On 09/10/2013 11:17 AM, Marvin S. Addison wrote:
>> had an aberrant issue yesterday with a
>> single user of a typically busy service looping on ST
>> request/validation after authentication (only one TGT).
>
> What does the audit log say about the validations? If they were
> successful then it's almost certainly a problem with the CAS client
> and/or user agent. In almost every case of a redirect loop we've seen,
> the root cause was not the CAS server.
Validations during the looping were successful for a while, almost two
hours. At one point I think I counted 12 ST requests per second (requests
on server A, validations on server B). Then ST validations started
failing. So, agreed, I suspect it's the CAS client and am working with
them. So far unable to identify what's different about this user.
That said, we are observing ST replication errors. They seem to come in
bursts. Haven't checked carefully, but they seem to correlate with ST
validation failures (SERVICE_TICKET_VALIDATE_FAILED).
> Sep 10, 2013 1:44:11 PM org.apache.catalina.core.StandardWrapperValve invoke
> INFO: 2013-09-10 13:44:11,773 ERROR [net.sf.ehcache.distribution.RMISynchronousCacheReplicator] - <Exception on replication of putNotification. null. Continuing...>
> java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextEntry(HashMap.java:894)
> at java.util.HashMap$EntryIterator.next(HashMap.java:934)
> at java.util.HashMap$EntryIterator.next(HashMap.java:932)
> at java.util.HashMap.writeObject(HashMap.java:1098)
> at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988)
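A rough way to check that correlation, as a sketch only: bucket both
kinds of log lines by minute and see what fraction of the minutes with a
validation failure also had a replication error. This assumes each
relevant line carries a YYYY-MM-DD HH:MM:SS timestamp, as the ehcache
line above does. The audit log layout may well differ, and
minute_buckets and overlap_ratio are names made up for illustration.

```python
import re
from collections import Counter

# Assumed timestamp shape, taken from the ehcache line quoted above.
TS = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}")


def minute_buckets(lines, marker):
    """Count lines containing `marker`, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        if marker not in line:
            continue
        match = TS.search(line)
        if match:  # lines without a recognizable timestamp are skipped
            counts[match.group(1)] += 1
    return counts


def overlap_ratio(failures, replication_errors):
    """Fraction of failure minutes that also saw a replication error."""
    if not failures:
        return 0.0
    shared = sum(1 for minute in failures if minute in replication_errors)
    return shared / len(failures)
```

A ratio near 1.0 over a day of logs would support the hunch. A low one
would suggest the two are unrelated, or that the clocks on the two
servers are too far apart for per-minute buckets to line up.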
Given that using VMs for CAS is new to us, will have to assess whether
(1) we're underpowered w.r.t. memory, (2) the instances are not
responsive enough (affected by demand on sibling VMs), (3) multicast+RMI
is insufficient for our needs (which didn't show up early on), etc.
Need to figure out what is an acceptable/baseline/historical number of
SERVICE_TICKET_VALIDATE_FAILED log entries.
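For the baseline, something along these lines might do as a first pass.
It is a sketch under assumptions: it expects the action name and a
YYYY-MM-DD HH:MM:SS timestamp on the same log line, which may not match
how the audit log is actually laid out (records spanning several lines
would need to be joined first). hourly_counts and baseline are
hypothetical helper names.

```python
import re
from collections import Counter
from statistics import median

# Assumed timestamp shape; captures the date and hour only.
HOUR = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")


def hourly_counts(lines, marker="SERVICE_TICKET_VALIDATE_FAILED"):
    """Count lines containing `marker`, keyed by 'YYYY-MM-DD HH'."""
    counts = Counter()
    for line in lines:
        match = HOUR.search(line) if marker in line else None
        if match:
            counts[match.group(1)] += 1
    return counts


def baseline(counts):
    """Summarize per-hour counts; hours with no failures are not included."""
    values = list(counts.values())
    if not values:
        return None
    return {"hours": len(values), "median": median(values), "max": max(values)}
```

Running it over a few weeks of older logs would give a historical
median and maximum per hour to compare against the current numbers.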
Tom.