Retrieve Monitoring Information Related to Multiple Running Processes using psutil at the same time | PSUTIL | KTH Master Thesis


Vamshi Pulluri

May 7, 2019, 7:21:50 AM
to psu...@googlegroups.com, Giampaolo Rodola'
 Hi Giampaolo,


My name is Vamshi, and I am a master's thesis student at KTH Royal Institute of Technology. My thesis project is about detecting security attacks using machine learning. The idea is to monitor an application running in a Linux environment using psutil, and then use the monitored data to build a machine learning model that detects when the application is under attack.


The application I plan to monitor starts more than one process when it runs on Linux, so I want to monitor all of these processes at the same time. To build the ML model, I need the metrics for all of these processes sampled at the same time, and I am wondering whether psutil can collect them. I have written some Python code to collect the metrics for multiple processes; it looks something like this:


import datetime

import psutil

def pids_data(multiple_pids):
    process_list = list()

    for pid in multiple_pids:
        # Two readings per PID; both finish before the next PID is started.
        for x in range(2):
            proc_info = None
            process_infos = {}

            try:
                process = psutil.Process(pid)
                proc_info = {"time": datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
                             # interval=2 blocks for 2 seconds on every call
                             "cpu_percent": process.cpu_percent(interval=2),
                             "memory_percent": process.memory_percent()
                             }
            except (psutil.ZombieProcess, psutil.AccessDenied, psutil.NoSuchProcess):
                proc_info = None

            if proc_info is not None:
                process_infos[process.name()] = proc_info
            process_list.append(process_infos)

    return process_list


The problem with the above code is that it does not collect the metrics of multiple PIDs at the same time. For example, when pids_data([9232, 9272]) is called, the function returns:


[{'abc': {'time': '2019-05-06 12:09:01', 'cpu_percent': 39.5, 'memory_percent': 9.826277910758757}}, {'abc': {'time': '2019-05-06 12:09:03', 'cpu_percent': 39.4, 'memory_percent': 9.826277910758757}}, {'def': {'time': '2019-05-06 12:09:07', 'cpu_percent': 0.5, 'memory_percent': 1.4573088996030759}}, {'def': {'time': '2019-05-06 12:09:09', 'cpu_percent': 0.5, 'memory_percent': 1.4573088996030759}}]


However, I want my output to be something like:


[{'abc': {'time': '2019-05-06 12:09:01', 'cpu_percent': 39.5, 'memory_percent': 9.826277910758757}}, {'abc': {'time': '2019-05-06 12:09:03', 'cpu_percent': 39.4, 'memory_percent': 9.826277910758757}}, {'def': {'time': '2019-05-06 12:09:01', 'cpu_percent': 0.5, 'memory_percent': 1.4573088996030759}}, {'def': {'time': '2019-05-06 12:09:03', 'cpu_percent': 0.5, 'memory_percent': 1.4573088996030759}}]



I am wondering whether it is possible to use psutil to collect metrics for multiple processes at the same time. If it is not possible, please let me know which tool or workaround I could use instead. I look forward to hearing from you.


Thanks,

Best Regards,

Vamshi Pulluri

Jim Crowell

May 8, 2019, 11:29:43 AM
to psutil
Not a very clear question. If I've understood, the problem is that you're getting two readings for process one followed by two readings for process two, whereas you want simultaneous readings for both processes.

But you've set it up to get sequential ones. What happens if you reverse the order of the loops, i.e.:

for x in range(2):
    for pid in multiple_pids:


??
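
A minimal sketch of the reversed loop order, written under a few assumptions that are not in the original code: the helper name collect_rounds and its rounds, interval and sleep parameters are made up for illustration, the Process objects are created once by the caller and reused, and cpu_percent() is called with interval=None so that it does not block. Each round waits once and then reads every process under one shared timestamp:

```python
import datetime
import time


def collect_rounds(procs, rounds=2, interval=2.0, sleep=time.sleep):
    """Take `rounds` samples of every process in `procs`.

    `procs` is a list of psutil.Process objects (or anything exposing the
    same name(), cpu_percent() and memory_percent() methods). All readings
    taken in one round share a single timestamp.
    """
    # Prime the CPU counters: the first non-blocking cpu_percent() call has
    # no earlier reading to compare against and returns a meaningless 0.0.
    for proc in procs:
        proc.cpu_percent(interval=None)

    samples = []
    for _ in range(rounds):      # outer loop: one pass per sampling round
        sleep(interval)          # wait once per round, not once per process
        stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        for proc in procs:       # inner loop: every process in this round
            samples.append({
                proc.name(): {
                    "time": stamp,
                    "cpu_percent": proc.cpu_percent(interval=None),
                    "memory_percent": proc.memory_percent(),
                }
            })
    return samples
```

It could be called as collect_rounds([psutil.Process(pid) for pid in (9232, 9272)]). The entries come back grouped by round rather than by process, so they would need re-sorting to match the ordering in the question, and the try/except from the original code is left out here for brevity.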

Giampaolo Rodola'

May 9, 2019, 3:54:22 AM
to psutil

Hello Vamshi. I'm glad to hear you're using psutil in your thesis. If you complete your thesis, I'd be curious to take a look at it. =)
I agree the question is not very clear. I do see a problem in your code, though: it blocks because of how you use cpu_percent(). If you need to monitor multiple processes during the entire lifetime of your app, you'll likely want to do it every X seconds in a non-blocking fashion and get "fresh" results every time. To do that, your Process instances have to be persistent (i.e. reusable: you don't want to discard them), and a generator function is one way to keep them around. Something like this, which you can call at regular intervals and which returns information for multiple processes in one shot:


    import psutil, time

    def get_procs_info(pids):
        """Accepts a list of process PIDs to monitor from now on.
        Return a generator which yields a {pid: {proc_info}, ...} struct
        every time next() is called.
        """
        procs = [psutil.Process(p) for p in pids]
        while True:
            data = {}
            for p in procs:
                data[p.pid] = p.as_dict(attrs=['cpu_percent', 'memory_percent'])
            yield data

    gen = get_procs_info([26080, 26085])
    print(next(gen))
    time.sleep(1)
    print(next(gen))


Hope this helps.
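
As a rough illustration of calling such a generator at regular intervals, the sketch below polls it a fixed number of times and flattens each {pid: {proc_info}} snapshot into rows that share one timestamp. The helper name poll, its count, interval and sleep parameters, and the row layout are assumptions made for this example and are not part of psutil:

```python
import datetime
import time


def poll(gen, count, interval=1.0, sleep=time.sleep):
    """Call next(gen) `count` times, `interval` seconds apart.

    `gen` is expected to yield {pid: {metric_name: value, ...}} snapshots,
    like the generator above. Returns a flat list of rows, one per process
    per snapshot; rows from the same snapshot carry the same timestamp.
    """
    rows = []
    for i in range(count):
        if i > 0:
            sleep(interval)      # pause between snapshots, not before the first
        snapshot = next(gen)     # one reading for every monitored process
        stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        for pid, info in snapshot.items():
            # Hypothetical row layout: timestamp and pid plus the metrics.
            rows.append({"time": stamp, "pid": pid, **info})
    return rows
```

For example, rows = poll(get_procs_info([26080, 26085]), count=5). With a non-blocking cpu_percent(), the CPU figures in the first snapshot will most likely be 0.0 and should be ignored, since there is no earlier reading to compare against.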