Instability of my CAS system

Juan Carlos Giménez Moncada

Jun 16, 2016, 10:00:25 AM
to CAS Community
Hi, my CAS cluster consists of two nodes behind a load balancer. Each
node is a dual-CPU system with 6 GB RAM running Ubuntu 16.04 and
Tomcat 8 with java-8-openjdk.

Relevant Tomcat8 conf: -Xms512 -Xmx4096 -XX:+UseParallelGC
-XX:ParallelGCThreads=2 -XX:NewRatio=2

The CAS 4.1.6 container has an LDAP connection, a MySQL connection and
EHCache. There are approximately 60k users, and normally about 15k TGTs
are live.

Relevant EHCache configuration (the two caches are synchronous because I
need instant replication of PGTs as well as STs):

<bean id="abstractTicketCache" abstract="true"
class="org.springframework.cache.ehcache.EhCacheFactoryBean"
p:cacheManager-ref="cacheManager"
p:diskExpiryThreadIntervalSeconds="1"
p:diskPersistent="false"
p:eternal="false"
p:maxElementsInMemory="100000"
p:maxElementsOnDisk="110000"
p:memoryStoreEvictionPolicy="LRU"
p:overflowToDisk="true"
p:bootstrapCacheLoader-ref="ticketCacheBootstrapCacheLoader" />

<bean id="serviceTicketsCache"
class="org.springframework.cache.ehcache.EhCacheFactoryBean"
parent="abstractTicketCache"
p:cacheName="cas_st"
p:timeToIdle="0"
p:timeToLive="10"
p:cacheEventListeners-ref="ticketRMISynchronousCacheReplicator" />

<bean id="ticketGrantingTicketsCache"
class="org.springframework.cache.ehcache.EhCacheFactoryBean"
p:cacheName="cas_tgt"
parent="abstractTicketCache"
p:timeToIdle="14400"
p:timeToLive="43200"
p:cacheEventListeners-ref="ticketRMISynchronousCacheReplicator" />

<bean id="ticketRMISynchronousCacheReplicator"
class="net.sf.ehcache.distribution.RMISynchronousCacheReplicator"
c:replicatePuts="true"
c:replicatePutsViaCopy="true"
c:replicateUpdates="true"
c:replicateUpdatesViaCopy="true"
c:replicateRemovals="true" />

<bean id="ticketRMIAsynchronousCacheReplicator"
class="net.sf.ehcache.distribution.RMIAsynchronousCacheReplicator"
parent="ticketRMISynchronousCacheReplicator"
c:replicationInterval="10000"
c:maximumBatchSize="100" />

<bean id="ticketCacheBootstrapCacheLoader"
class="net.sf.ehcache.distribution.RMIBootstrapCacheLoader"
c:asynchronous="false"
c:maximumChunkSize="5000000" />

The system works normally, with JMX-reported memory averaging 1.8 GB, but
after an undetermined time (24h, 48h, 72h ...) the JMX daemon thread
count grows (to between 100k and 200k), saturating memory and hanging
the CAS container.

Any ideas about this behaviour, or where I can look to debug the error?

Thanks to all.

Christopher Myers

Jun 16, 2016, 10:15:02 AM
to cas-...@apereo.org, mon...@um.es
One thing we had to do on ours was set the stack size; the default was too big for the number of sessions we cache, so we set it to -Xss512k.
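
As a rough illustration of why the stack size matters at the thread counts reported above, here is a minimal sketch of the arithmetic. It assumes a default stack of 1024k (typical for 64-bit HotSpot on Linux, but check your own JVM) and that each thread reserves its full stack; actual resident memory is usually lower because stacks are committed lazily. The class and method names are made up for the example.

```java
// Rough estimate of the memory reserved for thread stacks alone.
// Assumption: every thread reserves its full stack size (-Xss); real
// resident usage is usually lower because stack pages are committed lazily.
final class StackBudget {

    /** Returns the reserved stack memory in MiB for the given thread count. */
    static long reservedMiB(long threads, long stackKiB) {
        return threads * stackKiB / 1024;
    }

    public static void main(String[] args) {
        long[] threadCounts = {1_000, 100_000, 200_000};
        // 1024k: assumed 64-bit Linux default; 512k: the -Xss512k setting
        long[] stackSizesKiB = {1024, 512};
        for (long threads : threadCounts) {
            for (long stackKiB : stackSizesKiB) {
                System.out.printf("%,d threads x %dk = %,d MiB%n",
                        threads, stackKiB, reservedMiB(threads, stackKiB));
            }
        }
    }
}
```

Even at 512k, 100k threads would reserve far more than a 6 GB box has, so a smaller stack only buys headroom; the thread growth itself still needs to be tracked down.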

Also, we don't use ehcache; we use Hazelcast because it was WAY easier to set up. We also have 2+1+1 nodes (two primary, one backup, and one "server of last resort" on a different set of hardware elsewhere).

You might also consider setting your Xms and Xmx to be the same value - you'll probably end up at the 4GB limit eventually anyhow, so starting Java out with the max will help prevent memory fragmentation.

To help with tuning ours, I kept JVisualVM running on the boxes, so I could watch what was happening when the servers would run out of memory. Also, you might look into thread dumping and using parsejstack on the dump.
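
If you don't have a dump parser handy, something along these lines can show which family of threads is multiplying. This is only a sketch, not what parsejstack does: it assumes a HotSpot-style dump (for example from `jstack <pid> > dump.txt`) where each thread header line starts with the quoted thread name, and the class and method names are invented for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Groups the threads in a jstack-style dump by normalized name so that a
// runaway thread family stands out. Assumes thread header lines start with
// the quoted thread name, as in HotSpot jstack output.
final class ThreadDumpSummary {

    /** Extracts the quoted thread name from a header line, or null if the line is not a header. */
    static String threadName(String line) {
        if (!line.startsWith("\"")) {
            return null;
        }
        int end = line.lastIndexOf('"');
        return end > 0 ? line.substring(1, end) : null;
    }

    /** Replaces every run of digits with '#' so numbered siblings share one key. */
    static String normalize(String name) {
        return name.replaceAll("\\d+", "#");
    }

    /** Counts threads per normalized name. */
    static Map<String, Integer> countByFamily(List<String> dumpLines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : dumpLines) {
            String name = threadName(line);
            if (name != null) {
                counts.merge(normalize(name), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a saved thread dump file
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        countByFamily(lines).entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .limit(20) // the 20 largest families are enough to spot a leak
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Run it against dumps taken a few hours apart; the family whose count keeps climbing is the place to start looking.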

These are our tuning parameters for Java 7; they don't work on Java 8 because of the massive changes that Oracle made in Java 8, but maybe they will help out somehow?

-Xms6g
-Xmx6g
-Xss512k
-Dorg.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER=true
-XX:+UseCompressedOops
-XX:MaxPermSize=256m
-XX:NewRatio=3
-XX:SurvivorRatio=8
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+DisableExplicitGC
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:+CMSScavengeBeforeRemark
-XX:CMSInitiatingOccupancyFraction=68
