Sakai 19.5 - Continuous Garbage Collection

Austin

Jan 4, 2021, 4:25:22 PM
to sakai-dev
Hello Sakai Devs,

We recently upgraded our production servers to Sakai 19.5. Since then we've been seeing continuous garbage collection far more often than before. Looking at some heap dumps with YourKit, which I'm not too familiar with, it looks like the Finalizer might be getting stuck.

On two occasions when I "fully expanded" the Finalizer class, it loaded a ton more finalizers. Right-clicking on them and choosing "Selected Object Class" showed:

referent  [Pending Finalization] java.util.jar.JarFile name = "/tomcat/webapps/lessonbuilder-tool/WEB-INF/lib/rsf-web-evolvers-1.1.jar"

In another instance of the continuous GC, the finalizer "Selected Object Class" was:

referent  [Pending Finalization] com.sun.crypto.provider.PBEKey

We tried forcing finalization by running a script that Hendrick sent us a while ago, but that didn't seem to fix it.
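
One way to check whether the finalizer queue is really backing up, without taking a heap dump, is to read the JVM's own count of objects pending finalization. The following is a minimal sketch using the standard `java.lang.management` API; it is not the script Hendrick sent, and the class name `FinalizerBacklog` and the one-second sampling interval are made up for illustration. Run standalone, it only reports on its own JVM. To watch the Tomcat process, the same value should be readable as the `ObjectPendingFinalizationCount` attribute of the `java.lang:type=Memory` MBean, assuming JMX access to that JVM is available.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class FinalizerBacklog {

    /** Returns the JVM's approximate count of objects awaiting finalization. */
    static int pendingFinalizers() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getObjectPendingFinalizationCount();
    }

    public static void main(String[] args) throws InterruptedException {
        // Sample a few times; a count that keeps growing suggests the
        // finalizer thread is not keeping up (interval is an arbitrary choice).
        int samples = 5;
        for (int i = 0; i < samples; i++) {
            System.out.println("pending finalization: " + pendingFinalizers());
            Thread.sleep(1000L);
        }
    }
}
```

A count that stays high or keeps climbing across samples would be consistent with a stuck or overloaded finalizer thread; a count near zero would point elsewhere.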

As for the Lessons tool, I found https://jira.sakaiproject.org/browse/SAK-37755, but that issue mentions importing / exporting .imscc files. In our case, I haven't been able to find evidence of that. However, one of our instructors mentioned they did a Site Import that included lessons.

Does anyone have any ideas? Or has anyone seen this in Sakai 19?

Thanks,

Austin

Austin

Jan 6, 2021, 8:16:33 PM
to sakai-dev
We hit this again today, but this time the object that looked stuck in the finalizer was

referent  [Pending Finalization] org.sakaiproject.user.impl.BasePreferencesService$BasePreferences
|_ m_properties  [Pending Finalization] org.sakaiproject.util.BaseResourcePropertiesEdit
|_ m_id  [Pending Finalization] java.lang.String "abcd-1234-efgh-5678"
|_ m_event  java.lang.String "prefs.upd"

The same user_id seemed stuck in the handful of finalizer objects I checked. Our servers are allocated 10G of heap, so I'd think we'd have plenty of memory, but it's strange that we've been seeing more of these after upgrading to 19.5.

Austin

Jan 6, 2021, 9:34:00 PM
to sakai-dev
Sorry, never mind my last email about the user preferences getting stuck. After a few hours it finally cleared on its own.

But my earlier email about the Lessons tool getting stuck still applies; those continued until we restarted the server during our daily backups.

Austin

Jan 8, 2021, 4:47:14 PM
to sakai-dev
Hello Again,

We hit this 3 times yesterday, where the GC ran all day until we restarted for backups. And it's happening again today on one of our servers. In each of these 4 instances the finalizer was stuck on a different object:

org.sakaiproject.user.impl.BasePreferencesService$BasePreferences
org.sakaiproject.content.impl.BaseContentService$BaseCollectionEdit
java.util.jar.JarFile name = "/home/sakai/tomcat/webapps/rubrics-service/WEB-INF/lib/thymeleaf-3.0.10.RELEASE.jar"
java.util.zip.ZipFile$ZipFileInflaterInputStream

However, in 3 out of those 4, I also noticed a few unreachable objects that looked like they might be taking up a lot of space:

[Unreachable] char[40529284] "Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care > Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care > Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care >...
[Unreachable] char[40529284] "Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care > Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care > Week 2: Screening & Health Maintenance/Promotion; ompetence: Special Populations; Preconception Care >...

What's strange about that is I found that it was related to a Lessons item, but for some reason that item has a pageId = 0 and siteId = 0 in the database. That is the only record in our database that has both the pageId and siteId = 0.  Maybe something is stuck in a loop trying to find that item's parents?  Would it be safe to just delete that record?
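
For scale: on Java 8 a `char[40529284]` takes 2 bytes per char, so each of those arrays is roughly 81 MB. The repeated "Week 2 ... > Week 2 ... >" pattern is what a parent-chain walk would produce if an item ended up being its own ancestor. The sketch below illustrates that idea only; it has not been checked against the Lessons source, and `Item`, `parentId`, and `breadcrumb` are hypothetical names, not Sakai classes or columns. It walks up a parent chain to build a breadcrumb and stops if it sees the same id twice.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class BreadcrumbSketch {

    /** Hypothetical stand-in for one Lessons item row; not Sakai's real class. */
    static final class Item {
        final long id;
        final long parentId;
        final String title;

        Item(long id, long parentId, String title) {
            this.id = id;
            this.parentId = parentId;
            this.title = title;
        }
    }

    /** Builds "root > ... > start" by following parent ids, guarding against cycles. */
    static String breadcrumb(Map<Long, Item> items, long startId) {
        Deque<String> titles = new ArrayDeque<>();
        Set<Long> visited = new HashSet<>();
        long current = startId;
        while (items.containsKey(current)) {
            if (!visited.add(current)) {
                break; // id already seen: the chain loops back on itself
            }
            Item item = items.get(current);
            titles.addFirst(item.title);
            current = item.parentId;
        }
        return String.join(" > ", titles);
    }

    public static void main(String[] args) {
        Map<Long, Item> items = new HashMap<>();
        // An item whose parent is itself, mimicking the pageId = 0 record.
        items.put(0L, new Item(0L, 0L, "Week 2"));
        System.out.println(breadcrumb(items, 0L)); // prints "Week 2"
    }
}
```

Without the `visited` check, the self-referencing item would keep prepending its own title until memory ran out, which would look a lot like the strings in the dump. Whether that is what actually happens here is only a guess.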

Thanks,

Austin
Attachment: GC-yourkit.png

Evandro Pires Alves

Jan 13, 2021, 5:10:00 AM
to Austin, sakai-dev

Hi, could this be related to a previous thread on this list with the subject "[sakai-dev] Performance Issues in Timed Test & Quizzes"?

I'll be doing a dummy test using the SAMIGO tool tomorrow but I'm still preparing to gather as much data as possible, in particular about memory utilization.


-- 
Evandro Pires Alves
===================
Central Services, office -1.62
273 330 804 - extension 43804

Earle Nietzel

Jan 13, 2021, 9:37:56 AM
to Evandro Pires Alves, Austin, sakai-dev
Austin, can you share your Java version and JVM options for that node?

-earle


Austin

Jan 13, 2021, 2:01:07 PM
to Earle Nietzel, sakai-dev
Sakai 19.5
Tomcat 9.0.37
Java 1.8.0_261-b25

RAM 24G

-Xmx10280m
-Xms10280m
-XX:MaxMetaspaceSize=1024m
-Xmn2500m
-XX:+UseCompressedOops
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=80
-XX:+DisableExplicitGC
-XX:+DoEscapeAnalysis
-Dhttp.agent=Sakai
-Djava.awt.headless=true
-Duser.timezone=Pacific/Honolulu
-Dsun.lang.ClassLoader.allowArraySyntax=true
-Djava.util.Arrays.useLegacyMergeSort=true
-Dorg.apache.jasper.compiler.Parser.STRICT_QUOTE_ESCAPING=false
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps


The GC issue happened several more times in the past few days, with most of them showing the same unreachable Lessons item. Thankfully, the servers seemed to be usable (especially since the semester just started) despite the GC running all day. We'll experiment with changing that Lessons item's pageId from 0 to something else to see if that helps. If not, we'll try deleting that record altogether. I don't think this is the only cause of the continuous GC, but hopefully it will reduce the frequency.
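
To put a number on "GC running all day", the accumulated collection time reported by the JVM can be compared with its uptime. This is a rough sketch using the standard `GarbageCollectorMXBean` API; the class name `GcOverhead` and the helper names are invented for illustration. As written it measures its own JVM, so it would need to run inside the Tomcat process, or the same beans would need to be read over JMX, to say anything about Sakai.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {

    /** Sums the collection time (ms) reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Share of uptime spent in GC, as a percentage. */
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gcTime = totalGcMillis();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        System.out.println("GC overhead %: " + overheadPercent(gcTime, uptime));
    }
}
```

Sampling this before and after one of the bad periods would show whether the old-generation collector's count and time are climbing much faster than on a healthy node.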

Earle Nietzel

Jan 14, 2021, 1:38:28 AM
to Austin, sakai-dev
Your options look reasonable. The only thing I would change is
-XX:CMSInitiatingOccupancyFraction=80

With a larger heap it's a good idea to start GC sooner; I would go to 70 or even 65. The goal is to make sure GC has enough time to finish.
If your eden GCs are fast at 2500m, you could raise that to 3G, which will also relieve some of the pressure on collecting the old generation. With Sakai, you should increase eden until either you reach 1/3 of the total heap or eden GCs slow down.
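
As a worked example of what those numbers mean for the posted settings (-Xmx10280m, -Xmn2500m), the arithmetic can be sketched as below. The class and method names are made up for illustration. Two assumptions to keep in mind: -Xmn sizes the whole young generation, of which eden is the largest part, and the occupancy fraction is taken as a percentage of the old generation. On Java 8 the fraction is also only a starting hint unless -XX:+UseCMSInitiatingOccupancyOnly is set.

```java
public class HeapSizingSketch {

    /** Old generation size in MB: total heap minus the young generation (-Xmn). */
    static long oldGenMb(long heapMb, long youngMb) {
        return heapMb - youngMb;
    }

    /** Old-generation occupancy (MB) at which a CMS cycle would start for a given percentage. */
    static long cmsTriggerMb(long oldGenMb, int fractionPercent) {
        return oldGenMb * fractionPercent / 100;
    }

    /** Upper bound for the young generation under the "one third of the heap" rule of thumb. */
    static long youngCapMb(long heapMb) {
        return heapMb / 3;
    }

    public static void main(String[] args) {
        long heapMb = 10280;  // from -Xmx10280m
        long youngMb = 2500;  // from -Xmn2500m
        long oldMb = oldGenMb(heapMb, youngMb);
        System.out.println("old generation MB: " + oldMb);
        for (int pct : new int[] {80, 70, 65}) {
            System.out.println("CMS starts at " + pct + "% = " + cmsTriggerMb(oldMb, pct) + " MB");
        }
        System.out.println("young generation cap (1/3 heap) MB: " + youngCapMb(heapMb));
    }
}
```

With these inputs the old generation is 7780 MB, so a CMS cycle would start at about 6224 MB at 80%, 5446 MB at 70%, and 5057 MB at 65%. One third of the heap is about 3426 MB, which leaves some room above the suggested 3G.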

But your options generally look good.

-earle

Earle Nietzel

Jan 26, 2021, 2:23:39 PM
to Austin, sakai-dev
I opened a JIRA, as some more information has come to light around this issue.

-earle
