High CPU consumption with agent on Windows

Pablo Maldonado

Feb 4, 2022, 3:16:13 PM
to schedulix
Hello, I would like to know why the Windows agent shows high CPU consumption in some cases, and what I can do about it. Thank you.

Ronald Jeninga

Feb 5, 2022, 4:50:58 PM
to schedulix
Hi,

the high CPU consumption is possibly due to the execution of WMIC that collects the start times of the currently running processes.
(In a Linux environment this is done similarly using ps).

There is a bug in Windows that can cause jobs to end in a BROKEN_FINISHED state after a time change to DST or back.
Starting with this Windows related bug, I had a lengthy conversation on GitHub with someone and during our efforts to find a workaround we also found performance issues on certain Windows releases.
As a side effect of the investigations I've developed a tiny program in C that directly asks the operating system for the required information and prints that to stdout.
If that executable is called instead of WMIC or powershell, both the performance issue and the DST issue are resolved.
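
For illustration only: on Java 9 or newer, process start times can also be queried from inside the JVM through the ProcessHandle API, without starting any helper program. The sketch below is not schedulix code, the class and method names are made up, and whether this approach avoids the DST and performance issues described here is untested.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration, not schedulix code: collects process start
// times in-process via the Java 9+ ProcessHandle API.
class StartTimes {
    // Maps PID -> start time (epoch milliseconds) for every process whose
    // start time the operating system lets this JVM see.
    static Map<Long, Long> collect() {
        Map<Long, Long> result = new HashMap<>();
        ProcessHandle.allProcesses().forEach(ph ->
            ph.info().startInstant()
              .map(Instant::toEpochMilli)
              .ifPresent(ms -> result.put(ph.pid(), ms)));
        return result;
    }

    public static void main(String[] args) {
        collect().forEach((pid, ms) -> System.out.println(pid + " " + ms));
    }
}
```

Processes whose start time is not visible to the JVM (for example because of missing privileges) are simply left out of the map.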

You can find that GitHub discussion on


You'll also find the source code of my "winps" and instructions on how to modify schedulix to get it running somewhere in there.

Unfortunately I haven't yet found the time to include it in the schedulix sources, or in the Windows archive on the website.

Best regards,

Ronald

Pablo Maldonado

Feb 7, 2022, 8:36:40 AM
to schedulix

Hi Ronald,

Thanks for the help. I understand that a fresh build of 2.9 or 2.10 would fix the problem.
I upgraded the server from version 2.9 to 2.10 just today and the agents are next, so that could be the solution.

thank you

Pablo Maldonado

Ronald Jeninga

Feb 7, 2022, 9:12:04 AM
to schedulix
Hi Pablo,

I've just checked and found that in the current 2.10 source, it is already winps.exe that is called.
If you have issues compiling it from source on Windows, I've uploaded the executable to our web server.
You can download the executable from: https://www.independit.de/Downloads/schedulix/winps.exe
This executable should be stored in $BICSUITEHOME/bin. Naturally in Windows this is called %BICSUITEHOME%\bin.
The Jobserver calls the executable with a fully qualified path (after replacing BICSUITEHOME).
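
As a rough sketch of what "fully qualified path" means here: the helper below builds <home>/bin/winps.exe from the value of BICSUITEHOME. The class and method names are hypothetical, and reading the value from an environment variable is an assumption made for the example; the actual schedulix code may obtain and substitute BICSUITEHOME differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not taken from the schedulix sources.
class WinpsLocation {
    // Builds <home>/bin/winps.exe as an absolute path.
    static Path resolve(String bicsuiteHome) {
        if (bicsuiteHome == null || bicsuiteHome.isEmpty()) {
            throw new IllegalStateException("BICSUITEHOME is not set");
        }
        return Paths.get(bicsuiteHome, "bin", "winps.exe").toAbsolutePath();
    }

    public static void main(String[] args) {
        // Assumption: the home directory is available as an environment variable.
        System.out.println(resolve(System.getenv("BICSUITEHOME")));
    }
}
```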

I'm not 100% sure I've already built rpms that use winps though.
Hence if you've upgraded from rpms, it might or might not use the winps executable.

Building the new rpms is on my TODO list as well.
I wanted to try to build docker containers to create the rpms, which turns out to be not that easy.
I have to add that I'm not a docker expert and there are still a few things I'll have to figure out.

Best regards,

Ronald

Pablo Maldonado

Feb 7, 2022, 9:35:48 AM
to schedulix
Hi Ronald, yes, I already have the agent zip for Windows, and winps.exe is included. Thanks

Pablo Maldonado

Feb 15, 2022, 1:47:50 PM
to schedulix
Hi Ronald,

I upgraded the agent from 2.9 to 2.10 on a Windows server, and I don't see much CPU or memory consumption by schedulix. But if I look at the memory with RAMMap, a Microsoft application, I get a huge number of winps.exe entries marked as session 0; they look like zombies. Can the use of winps.exe be disabled to see if this changes, or what else could be done?

Pablo Maldonado

Ronald Jeninga

Feb 15, 2022, 10:18:00 PM
to schedulix
Hi Pablo,

I'm afraid I don't know where those zombies come from and what they actually are.
I know exactly what a zombie is in a Unix/Linux environment, but I'm not that much a specialist in Windows.

Before we used winps.exe, we called a Microsoft program, WMIC, instead.
In the Java code that was just a matter of replacing one string with another.
This means no change in logic whatsoever.
It would be interesting to know if you would see a huge amount of WMIC processes after (temporarily) reverting the upgrade.
(Note that the 2.9 jobserver works perfectly with a 2.10 scheduling server. Hence by replacing the current BICsuite.jar file with the old one, you can easily downgrade a jobserver.
If your scheduling server runs on the same machine as the jobserver it isn't that easy though; you'd basically need to set up 2 installations).

In the end, since you don't see schedulix wasting many resources, those winps session 0 entries don't seem to burden the system.
I can be wrong there, of course.
We don't run winps.exe for fun; it is run because we need the information. Hence disabling it would break the jobserver.
So let's try to find out first what those Zombies are and how they are created. (in Unix: by not retrieving the exit code of a child process).

Best regards,

Ronald

Pablo Maldonado

Feb 16, 2022, 9:03:13 AM
to schedulix
Hi Ronald, perfect, I'll replace the BICsuite.jar from 2.9 with the one from 2.10 and we'll see what happens. Thanks


Pablo Maldonado

Ronald Jeninga

Feb 16, 2022, 9:25:53 AM
to schedulix
Hi Pablo,

eeeh, you mean the other way around? You've just upgraded to 2.10.
So 2.10 is already in place and 2.9 is what you (and I) want to test for comparison.

Best regards,

Ronald

Pablo Maldonado

Feb 16, 2022, 10:19:35 AM
to schedulix
Hi Ronald,

Sorry, I misunderstood you. I also have version 2.9 of the agent on another Windows machine, and yes, WMIC.exe appears there several times; I see this with RAMMap, as I said. However, the number of WMIC.exe entries I see there is very small compared to the number of winps.exe entries on the other computer.
The attached image shows part of the list of winps.exe processes that appear.

Pablo Maldonado

winps.png

Ronald Jeninga

Feb 16, 2022, 11:10:58 AM
to schedulix
Hi Pablo,

no problem, it is good that I asked.
Thank you for the screen shot. Although it perfectly matches your description, the picture gave me a better idea.

Since the WMIC and the winps processes are started in exactly the same way, I doubt that a mistake lies on the Java side.
Maybe in Windows there's something one can do to tell the OS that nobody is ever going to be interested in the corpse that remains after terminating the process.
It is at least a hypothesis and Google will be my friend in confirming or falsifying it.

Maybe you can help me a bit and find out if Microsoft's documentation tells you something about the nature of those zombies (and about who's responsible for cleaning)?

I think my time zone is a bit ahead of yours. I'll do my research tomorrow as it is past working hours here now.
Granted, that list of zombies looks ugly, but one can afford quite a few of them with their 20K size.

Best regards,

Ronald

Ronald Jeninga

Mar 9, 2022, 10:02:30 AM
to schedulix
Hi Pablo,

today I've created a little Java program that does nothing more than start winps.exe in a loop, using essentially the same code that is used by the jobservers.
It runs winps once per second, 500 times in a row.

Indeed, if I look at RAMMAP, I see all those winps processes pop up.
And it seems that all those entries are only removed after my little program has terminated.

So the good news is that I have a solid testcase now.
The bad news is that I don't have a clue yet what is going wrong (more precisely, how to release the handle my process seems to keep).

But I'm making progress. Just a little more patience please.

Best regards,

Ronald

Pablo Maldonado

Mar 11, 2022, 3:23:01 PM
to schedulix
Hi Ronald,

Sorry for replying so late, and first of all thank you for your help. From what I see, this doesn't happen only with schedulix: it also happens with the scripts that Nagios runs when they are compiled. If I convert them to PowerShell or bat scripts without compiling, the effect doesn't occur.

Pablo Maldonado

Ronald Jeninga

Mar 12, 2022, 4:22:25 AM
to schedulix
Hi Pablo,

Java seems to keep a handle to the former child process for some reason.
And because not all references to the process are released, not even after a p.waitFor(), Windows has no choice but to keep the entry in the process table.
Unfortunately I haven't yet found out how to release or even retrieve that handle (it is well hidden in the depths of the JVM).
It is possible that we are talking about an implementation flaw in Java (which might be resolved in Java 9 and later).
Note that the examples of how to start another process from within Java basically all show the same code which, by coincidence, matches our code.
This in turn might explain why not only schedulix is affected.

The good news is that I don't give up yet. But it's a challenging problem, that's for sure.

Best regards,

Ronald

Ronald Jeninga

Mar 14, 2022, 10:04:18 AM
to schedulix
Hi Pablo,

today I've made real progress, it seems.
I've been trying just about everything, including reading Java source code (the stuff in src.zip, if installed; more convenient is e.g. http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang).
I've tried the Runtime.exec() method instead of the ProcessBuilder (until I found that exec() is just a wrapper around the ProcessBuilder).
I've tried to destroy() processes explicitly (after they have terminated).
All to no avail :-(

Anyway, the only thing left to try was to see if the garbage collector does more than just clean up memory.
The hint that made me try this stated that the GC not only cleans up memory but also closes open files (if the object being collected happens to be an unreferenced file object).
Hence that could work for process handles as well.

After changing my test program:

import java.util.*;
import java.io.*;
import java.util.regex.*;
import java.text.*;
import java.lang.reflect.*;
import java.util.concurrent.TimeUnit;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class StartPS
{
        static Long STARTTIME_JITTER = 5000L;

        synchronized public static HashMap<String,Long> getStartTimes()
        {
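                HashMap<String,Long> result = new HashMap<String, Long>();
                final File tmp_file = new File("/tmp/starttimes.out");
                String tmpfilename;
                try {
                        tmpfilename = tmp_file.getCanonicalPath();
                } catch (IOException e) {
                        // fall back to the plain path, so that new File(tmpfilename) below never receives null
                        tmpfilename = tmp_file.getPath();
                }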

                try {
                        ProcessBuilder pb = new ProcessBuilder(".\\winps.exe");

                        pb.redirectOutput(new File(tmpfilename));
                        pb.redirectErrorStream(true);
                        Process p = pb.start();
                        p.getOutputStream().close();
                        p.getInputStream().close();
                        p.getErrorStream().close();
                        try {
                                p.waitFor();
                        } catch (InterruptedException ie) {
                                // maybe some cleanup here
                        }

                        try {
                                while (p.isAlive()) { // <- most likely unnecessary ;-)
                                        Thread.sleep(100);
                                        p.destroy();
                                        System.gc();
                                }
                                p = null;      // should help the garbage collector (eliminates a reference to the object p points to)
                                System.gc();   // <- most likely the key to success !

                        } catch (Exception e) {
                                System.out.println("Warning: " + e.toString());
                        }
                } catch (Exception e) {
                        throw new RuntimeException("(02310251044) Process start times : " + e.toString());
                }
                return result;
        }

        public static void main(String[] argv)
        {
                String sMax;
                int iMax = 500;

                if  (argv.length > 0) {
                        sMax = argv[0];
                        try {
                                iMax = Integer.parseInt(sMax);
                        } catch (Exception e) {
                                System.out.println("Oops : " + e.toString());
                                System.exit(1);
                        }
                }
                for (int i = 0; i < iMax; ++i) {
                        System.out.print("\r" + i);
                        getStartTimes();
                        try {
                                Thread.sleep(1000);
                        } catch (Exception e) {
                                // do nothing
                        }
                }
        }

}


I was able to run it and RAMMAP didn't show more than a single winps.exe in the list (in fact, none most of the time).
Compared to the hundreds of winps.exe entries I saw before, this makes it likely that the tiny change of calling System.gc() resolved the issue.
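
Condensed into a reusable helper, the pattern from the test program might look like the sketch below. The class and method names are made up for illustration, and the JVM's own launcher stands in for winps.exe so that the example runs anywhere. Keep in mind that System.gc() is only a request to the JVM; that it releases the Windows process handle is what was observed in the test above, not something the Java specification guarantees.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Condensed sketch of the pattern in the test program; names are hypothetical.
class ReleasingRunner {
    // Runs a command to completion, discards its output via a temporary
    // file, and returns the child's exit code.
    static int runAndRelease(List<String> command) throws IOException, InterruptedException {
        File out = File.createTempFile("child", ".out");
        try {
            ProcessBuilder pb = new ProcessBuilder(command);
            pb.redirectErrorStream(true);   // merge stderr into stdout
            pb.redirectOutput(out);         // stdout goes to the temporary file
            Process p = pb.start();
            p.getOutputStream().close();    // the child gets no stdin
            int exitCode = p.waitFor();
            p = null;       // drop our only reference to the Process object
            System.gc();    // ask the JVM to collect it (a request, not a guarantee)
            return exitCode;
        } finally {
            out.delete();
        }
    }

    public static void main(String[] args) throws Exception {
        // Uses this JVM's own launcher as a harmless stand-in for winps.exe.
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        for (int i = 0; i < 3; i++) {
            System.out.println("exit code: " + runAndRelease(Arrays.asList(java, "-version")));
        }
    }
}
```

Calling System.gc() after every child process has a cost of its own, so for a frequently called helper it may be worth measuring whether collecting less often is sufficient.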

My next step is to add the change to all relevant releases (2.9, 2.10, 2.11/Development) and to create new rpm packages.
I'll drop a note here as soon as I've completed the process.

Best regards,

Ronald

Pablo Maldonado

Mar 14, 2022, 12:43:41 PM
to schedulix
Hi Ronald

Thank you for this.

Ronald Jeninga

Mar 17, 2022, 11:36:13 AM
to schedulix
Hi Pablo, hi all,

just a moment ago I've released the new rpms (2.9 and 2.10, RHEL7 and RHEL8).

@Pablo: can you test the new release and tell me if it works for you as well as it did for me?

Best regards,

Ronald

Pablo Maldonado

Mar 18, 2022, 2:06:53 PM
to schedulix
Hi Ronald,

One question: I install everything on Windows, without rpm. Did you push the change to GitHub or only into the rpm?

If it is only in the rpm, give me some time and I'll look for a Linux computer and try it out with the rpm.


Thank you for this


Pablo Maldonado

Ronald Jeninga

Mar 19, 2022, 4:26:59 AM
to schedulix
Hi Pablo,

my workflow dictates that I need to get my Git sandbox right first.
Only with a clean sandbox do I run a "make rpm". This is not (yet) enforced though, so there's room for mistakes.
The good news is that I followed the procedure to the letter ;-)

So yes, GitHub is leading at the moment. (that means: all local repositories are either clean or old).
IOW: "cd $SDMSHOME/src && git pull && make new" should give you a fresh jar file.

Best regards,

Ronald

Pablo Maldonado

Mar 22, 2022, 8:14:27 PM
to schedulix
Hi Ronald,

To test the improvements, I compiled the schedulix software. I attach a document describing how I did it, in case something is wrong. Now I'm just going to swap the Linux libs for the Windows libs and install an agent for testing.

Pablo Maldonado
Compilacion_Schedulix_2.10.txt

Ronald Jeninga

Mar 23, 2022, 5:35:39 AM
to schedulix
Hi Pablo,

your recipe looks pretty good. I didn't test it, but on the other hand I'm definitely experienced when it comes to setting up a compilation environment.
What I didn't like, but here I blame myself, is the swt.jar.
I'll consider creating an extra jar file for the SDMSpopup program.
Everything else can be compiled without swt.jar.

I'm curious about your results.

Best regards and thank you for supporting us,

Ronald

Pablo Maldonado

Mar 23, 2022, 4:07:49 PM
to schedulix
Hi Ronald,

I updated the agent on a Windows machine. Before the update I looked with RAMMap and there were many WMIC.exe entries; when I stopped the service, they disappeared, and with the update they no longer show up. So it seems to be solved. I'll keep watching it for a few days and let you know.

Thank you very much,
Pablo

Ronald Jeninga

Mar 24, 2022, 5:28:42 AM
to schedulix
Hi Pablo,

thank you for the feedback!
Your observation matches mine. If I restart an old jobserver, I find many winps/WMIC entries in the Windows process table before the restart, and they have vanished after the restart.
This proves that Java keeps a handle "alive".
With the new code, I find at most a single winps entry in the process table.
Which proves that the (obsolete) handles are cleaned out by the garbage collector.

Thank you,

Ronald

Pablo Maldonado

Mar 25, 2022, 9:59:46 AM
to schedulix
Hi Ronald,

Today I checked the computer where I installed the updated schedulix 2.10, and it's perfect: the list of winps.exe entries no longer appears.

Thank you very much,
Pablo Maldonado

Ronald Jeninga

Mar 25, 2022, 10:22:28 AM
to schedulix
Hi Pablo,

Great! Thank you for the confirmation!
Enjoy your weekend!

Best regards,

Ronald
