Hello.
Context:
I recently started my first job at this company and was given the task of finding a way to delete old executions without compromising the software, preferably without having to stop it. I am no expert, but I have a lot of seniors around me, so I can ask for help if this thread becomes too technical.
I was told this situation continued because the "execution history clean" functionality was not working at the time and no one solved the issue. I am still testing it to see if it already works but that is not the reason I am here today.
Problem:
As of now we are looking at 3-4 million total executions, of which around 1.5 million are old executions that need to be deleted to free up space and avoid future problems. So far there hasn't been any major impact on performance that I know of, which is surprising to me.
System:
We are running the free version of Rundeck (4.9.0) on Windows with a Microsoft SQL Server database.
Ideas:
From my research, even if the "execution history clean" functionality were working, we already have too many executions, with more being created every day, so this method wouldn't work on its own. I was also told to investigate the Rundeck API, but this method seems to be too slow. Some people tried it and ended up using the Rundeck CLI, which seems to be faster, and so did I. Even though it's better than nothing, it's still too slow and doesn't solve the problem.
Possible Solution:
My last hope is that I can manually delete the log files and the corresponding rows in the database with SQL commands. I found some examples online using this method. The problem is that I cannot guarantee that this is a viable solution, or that I can do it while the program is running without causing a problem. I tried this in a test environment and it seemed to be OK, but with a much smaller execution history.
I would like to note that the person who configured Rundeck is no longer with us.
After looking into the database I also noticed that some of the rows in the base_report table are not associated with any row in the execution table, so the number of actual executions might be lower.
Conclusion:
I need help figuring out the best approach to solve this problem, and whether the possible solution mentioned above is a viable option.
Thank you for your time.
Hi Rafael,
Another option is to use the rd CLI tool, focusing on the deletebulk subcommand (this deletes the project executions). If you run the rd executions deletebulk command, you can see the available parameters, like --older to specify the “older than” time. Please test this in a non-prod environment first :-)
You can put this in a script and run it as a scheduled job to clean your instance.
Regards!
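
A minimal sketch of such a cleanup script in Python, assuming the rd CLI is installed and already configured to reach the Rundeck instance. Only `--older` is named in this thread; `--project`, `--max` and `--confirm` are assumptions about the deletebulk options and should be checked against the help output of `rd executions deletebulk`. The project names, the `90d` age value and the function names are illustrative.

```python
import subprocess


def build_deletebulk_command(project, older="90d", max_count=20):
    """Build the argument list for one `rd executions deletebulk` call."""
    return [
        "rd", "executions", "deletebulk",
        "--project", project,      # assumed flag: project to clean
        "--older", older,          # "older than" time, as mentioned above
        "--max", str(max_count),   # assumed flag: cap on deletions per call
        "--confirm",               # assumed flag: skip the interactive prompt
    ]


def clean_projects(projects, older="90d", max_count=20, runner=subprocess.run):
    """Run one deletion batch per project; return the projects whose call failed."""
    failed = []
    for project in projects:
        result = runner(build_deletebulk_command(project, older, max_count))
        if result.returncode != 0:
            failed.append(project)
    return failed
```

Calling `clean_projects(["ProjectA", "ProjectB"])` from a scheduled job would run one small batch per project and report which projects need a retry. Keeping batches small and retrying failures is a design choice meant to limit load on the instance, not something the CLI requires.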
Hey racuna,
Thanks for the quick reply.
As I (somewhat) mentioned, I already tried this approach. I was able to delete about 100 executions every 10 minutes, which is not enough.
I configured it to delete 20 executions per project every 10 minutes (each run took 7-8 minutes on average). When I tried to increase the amount, the deletion would fail, and it still fails sometimes even with just 20.
Is this expected to be faster?
Regards.
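
To put that rate in perspective, here is a rough back-of-the-envelope estimate in Python using the figures from this thread (about 100 deletions per 10-minute cycle, against roughly 1.5 million old executions). The function name is illustrative, and the estimate ignores new executions created in the meantime, so the real duration would be longer.

```python
def days_to_clear(backlog, deleted_per_cycle, minutes_per_cycle):
    """Estimate how many days it takes to delete a backlog at a fixed rate."""
    cycles = backlog / deleted_per_cycle
    return cycles * minutes_per_cycle / (60 * 24)


# 1,500,000 / 100 = 15,000 cycles; 15,000 * 10 min = 150,000 min
print(round(days_to_clear(1_500_000, 100, 10)))  # → 104
```

At this rate the backlog would take more than three months of continuous running to clear.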
Hello again,
Thanks for the input.
I talked with a colleague, and performance itself does not seem to be the main issue, at least when analyzing the CPU/memory usage.
We will consider testing and eventually upgrading to the latest version if the execution history clean feature really does not work or the problem persists.
But first I want to ask again, more specifically, why some rows in the base_report table don't seem to point to any actual execution. Could these be slowing down the history clean?
I could be wrong, and they may actually be doing something. I came to this conclusion because when I look at the execution history in the UI and go to the oldest executions using the pagination, I get a message saying no executions were found.
In the test environment I was able to delete the old executions with the rundeck CLI scheduled script but I still had to remove these rows from the base_report with a SQL DELETE command so the correct executions count / pagination was shown. So far there hasn't been any problem, but again we are looking at a much smaller execution history. I also stopped the script and configured the clean feature, and so far it seems to be working fine.
Have you seen this before? I understand that it is risky to run commands directly on the database, but based on my observations, what do you think?
Best regards.
Hello, Rafael.
That table (base_report) should have the same number of rows as the execution table (which is linked to several other tables). So, if you delete an execution via API/RD-CLI/GUI, both tables should still have the same number of rows.
About this: “In the test environment I was able to delete the old executions with the rundeck CLI scheduled script but I still had to remove these rows from the base_report with a SQL DELETE command so the correct executions count / pagination was shown”
That’s strange; I tested deleting executions using RD-CLI/GUI, and the database stayed consistent (even on PostgreSQL). In your case, the database was most likely modified at some earlier point, causing it to lose consistency. (Can you test that procedure on 5.0.1?)
From my perspective, the best way to delete executions is to use the Rundeck tools :-)
Greetings!
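
The consistency problem discussed in this thread (base_report rows that have no matching row in the execution table) can be checked read-only with an anti-join before anything is deleted. The sketch below uses Python with an in-memory SQLite database purely to illustrate the query shape. The column name `jc_exec_id` is an assumption about how base_report references an execution and must be verified against the real schema; on SQL Server the column types may also require a cast in the join condition.

```python
import sqlite3

# Anti-join: report rows whose referenced execution no longer exists.
# `jc_exec_id` is an ASSUMED column name; check your actual base_report schema.
ORPHAN_QUERY = """
SELECT r.id
FROM base_report AS r
LEFT JOIN execution AS e ON e.id = r.jc_exec_id
WHERE e.id IS NULL
ORDER BY r.id
"""


def find_orphan_reports(conn):
    """Return ids of base_report rows with no matching execution row."""
    return [row[0] for row in conn.execute(ORPHAN_QUERY)]


def demo():
    """Build a tiny illustrative dataset and return its orphaned report ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE execution (id INTEGER PRIMARY KEY);
        CREATE TABLE base_report (id INTEGER PRIMARY KEY, jc_exec_id INTEGER);
        INSERT INTO execution (id) VALUES (1), (2);
        INSERT INTO base_report (id, jc_exec_id) VALUES (10, 1), (11, 2), (12, 3);
    """)
    return find_orphan_reports(conn)


print(demo())  # → [12]
```

Running the same SELECT against a copy of the production database would show how many orphaned rows exist, without modifying anything.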