cleanup artifacts & pipelines in GoCD

ashok reddy C k

Mar 11, 2025, 10:48:20 PM
to go...@googlegroups.com, ch...@thoughtworks.com
Hi Chad/Team,

We are using EFS storage for artifact pipelines. We have two environments: the staging environment and the production environment.

We have deleted the old pipeline builds under the artifactory folder, but we can still see their history in the GoCD UI.

How can we permanently clean up all pipeline history and VSM at the database level?

Could you please suggest a way to clean up the history while keeping only the last 50 builds, and deleting the rest from the pipeline artifacts and cache?

Chad Wilson

Mar 11, 2025, 11:50:58 PM
to ashok reddy C k, go...@googlegroups.com
There's no "supported" way to clean up the DB-level history. It could cause unintended consequences to fan-in, VSM and other parts of GoCD depending on the nature of your builds.

There are people who have done so to target specific errors, such as in https://github.com/gocd/gocd/issues/879#issuecomment-2069311458, but it's certainly not supported, and in the general case where your VSM has more complex inter-pipeline dependencies, you could easily create a lot of mess. Deleting from the DB also won't cause GoCD to automatically clean up the artifacts and/or artifact download cache as a side-effect.

I'd ask why you feel you need to clean up the DB level history?
  • If it's performance related, I'd prioritise moving away from H2 to PostgreSQL if you have a large setup and are still using H2.
  • If you are already on PostgreSQL and it's still performance related, it'd be good to narrow down the problem so we might be able to look at fixing it.
    • There are possibly some queries that are unexpectedly slow and, according to this user, the startup can get very slow in some extreme cases if you have a very large history of runs for a "current"/non-deleted pipeline (due to the way it tries to populate a cache from the DB).
For cleaning up the artifacts and build logs, there is also no mature "built-in" way of doing so. Most folks, I think, use some hacky cron jobs to find build jobs/runs that have not been modified in X days and remove them, although this can be tough to script if you have rules like keeping certain builds longer than others. For a more sophisticated approach to artifact maintenance, Ashwanth has a tool he wrote at https://github.com/ashwanthkumar/gocd-janitor which you might want to look at. However, I have not used it personally and it might need some modernizing changes.
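As a rough illustration of the cron-job approach, here is a minimal Python sketch. It is not a GoCD feature or API: the function names, the `keep_last` parameter and the assumed directory layout (`<artifacts_root>/<pipeline>/<run>/...`, one directory per pipeline run) are all assumptions for illustration, so check them against your own artifacts directory before running anything. It defaults to a dry run that only prints what would be removed.

```python
import shutil
import time
from pathlib import Path


def newest_mtime(path):
    """Latest modification time of a directory or anything beneath it."""
    times = [path.stat().st_mtime]
    times.extend(p.stat().st_mtime for p in path.rglob("*"))
    return max(times)


def find_stale_runs(artifacts_root, max_age_days, keep_last=0, now=None):
    """Return run directories that are candidates for deletion.

    ASSUMPTION: layout is <artifacts_root>/<pipeline>/<run>/..., where each
    <run> directory holds the artifacts and logs of one pipeline run.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for pipeline_dir in sorted(Path(artifacts_root).iterdir()):
        if not pipeline_dir.is_dir():
            continue
        runs = [(newest_mtime(d), d) for d in pipeline_dir.iterdir() if d.is_dir()]
        # Newest first, so the first keep_last runs are always retained.
        runs.sort(key=lambda item: item[0], reverse=True)
        for mtime, run in runs[keep_last:]:
            if mtime < cutoff:
                stale.append(run)
    return stale


def prune(artifacts_root, max_age_days, keep_last=0, dry_run=True):
    """Delete stale runs; with dry_run=True only report what would go."""
    stale = find_stale_runs(artifacts_root, max_age_days, keep_last)
    for run in stale:
        print(("would remove " if dry_run else "removing ") + str(run))
        if not dry_run:
            shutil.rmtree(run)
    return stale
```

Something like `prune("/path/to/artifacts/pipelines", max_age_days=90, keep_last=50)` (the path is a placeholder) would report runs untouched for 90 days while always sparing the 50 most recently modified runs of each pipeline. Pass `dry_run=False` only once the output looks right. This only touches files on disk, so the run history will still show in the GoCD UI.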

-Chad

ashok reddy C k

Mar 12, 2025, 4:55:42 AM
to Chad Wilson, go...@googlegroups.com
Thanks for the advice.