Pruning LFS data

Mikko Paukkila

Mar 1, 2022, 10:41:46 AM
to Repo and Gerrit Discussion
Hi!
We are starting to have quite a lot of LFS data in our Gerrit. What are the best ways to prune old LFS data that is no longer needed from the server?

Gerrit's lfsdata folder itself does not contain plain-text repository names or similar. Are there any scripts to get statistics on which repositories have the most LFS data?

If I remove a repository from Gerrit (with delete-project), does that clean up its LFS data too?

Br. Mikko


ld...@audiokinetic.com

Sep 13, 2022, 1:16:47 PM
to Repo and Gerrit Discussion
Hi Mikko, did you find a way to prune the LFS data?

Thanks,
Lawrence

Mikko Paukkila

Sep 14, 2022, 4:01:44 AM
to Repo and Gerrit Discussion
Hi!
I didn't find a way. I haven't been looking for a solution since then either, but I thought about this a little more:

Gerrit stores LFS data in a single folder, not under the individual Git repository folders, so I assume that git lfs prune can't be used (https://manpages.debian.org/testing/git-lfs/git-lfs-prune.1.en.html).

But the clean-up could be scripted using the following:

/etc/gerrit/git/my_repo.git$ git lfs ls-files --all -l
6c35d136f365fb96c907c72f01bd158ef7b35e12b32dfdsadss3e9aa80e7c37d - test.bin
....

Create that list before deleting a Git repository. Alternatively, loop over all Git repository folders and build a list of all LFS blobs that are currently referenced from the repositories/branches. I am not sure whether pruning against that list would clean up too much (for example, old versions of LFS files).
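
The loop over all repositories could be sketched roughly like this in Python. This is only a sketch under assumptions: it assumes the repositories are *.git directories under one root (such as /etc/gerrit/git above), that git lfs ls-files --all -l works in them as in the transcript, and that each output line has the form "<oid> <marker> <path>". The function names parse_ls_files and referenced_oids are made up for illustration.

```python
import subprocess
from pathlib import Path


def parse_ls_files(output: str) -> set[str]:
    """Extract OIDs from `git lfs ls-files --all -l` output.

    Each line is assumed to look like "<oid> <marker> <path>",
    where the marker is "-" or "*".
    """
    oids = set()
    for line in output.splitlines():
        parts = line.split(maxsplit=2)
        # Skip blank or unexpected lines instead of guessing.
        if len(parts) == 3 and parts[1] in ("-", "*"):
            oids.add(parts[0])
    return oids


def referenced_oids(git_root: str) -> dict[str, set[str]]:
    """Map each *.git directory under git_root to the OIDs it references."""
    result = {}
    for repo in sorted(Path(git_root).rglob("*.git")):
        if not repo.is_dir():
            continue
        proc = subprocess.run(
            ["git", "lfs", "ls-files", "--all", "-l"],
            cwd=repo, capture_output=True, text=True, check=True,
        )
        result[str(repo)] = parse_ls_files(proc.stdout)
    return result
```

Calling referenced_oids("/etc/gerrit/git") would then give one set of hashes per repository. It is worth checking on a test copy first that --all really covers every ref you care about before trusting the result.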

After you have the LFS blob list, you can find the LFS files in the lfsdata folder. The first two characters of the hash name the top-level folder and the next two name the subfolder:

/etc/gerrit/lfsdata/6c/35$ ls
6c35d136f365fb96c907c72f01bd158ef7b35e12b32dfdsadss3e9aa80e7c37d
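
The hash-to-folder mapping, and the reverse lookup for blobs that nothing references, might look like this in Python. It is a sketch that assumes the two-level layout shown above (lfsdata/<first two chars>/<next two chars>/<hash>). blob_path and unreferenced_blobs are hypothetical helper names, and the code only lists candidates; it deletes nothing.

```python
import os
from pathlib import Path


def blob_path(lfsdata: str, oid: str) -> Path:
    """Return where a blob with this hash would live under lfsdata."""
    return Path(lfsdata) / oid[:2] / oid[2:4] / oid


def unreferenced_blobs(lfsdata: str, referenced: set[str]) -> list[Path]:
    """List files under lfsdata whose names are not in the referenced set."""
    candidates = []
    for dirpath, _dirs, files in os.walk(lfsdata):
        for name in files:
            if name not in referenced:
                candidates.append(Path(dirpath) / name)
    return sorted(candidates)
```

The referenced set here should be the union of the hashes from every repository, not just one, otherwise blobs still in use elsewhere would show up as candidates.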

Make a backup before trying this :)

Br. Mikko

doug.r...@wandisco.com

Sep 14, 2022, 3:46:30 PM
to Repo and Gerrit Discussion
Folks:

On Wednesday, September 14, 2022 at 4:01:44 AM UTC-4 mikko.p...@gmail.com wrote:
Create that list before deleting a Git repository. Alternatively, loop over all Git repository folders and build a list of all LFS blobs that are currently referenced from the repositories/branches. I am not sure whether pruning against that list would clean up too much (for example, old versions of LFS files).

Be careful: the entries in the LFS cache are cross-repository.  Just because a repository being removed has a pointer to an LFS cache object does not mean that other repositories do not also have a pointer to it (you can check in the same blob to multiple repos).
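
One way to respect this when removing a single repository is to delete only the hashes that no other repository points to. A minimal Python sketch, assuming you already have a mapping from repository name to its set of LFS hashes (how that mapping is built is up to you; exclusive_oids is a made-up name):

```python
def exclusive_oids(per_repo: dict[str, set[str]], repo: str) -> set[str]:
    """Return the hashes referenced by `repo` and by no other repository."""
    others = set()
    for name, oids in per_repo.items():
        if name != repo:
            others |= oids
    return per_repo[repo] - others
```

Only the hashes returned for the repository being removed would be candidates for deletion from the shared LFS store; anything also present in another repository's set has to stay.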

Cheers.

Doug