orphaned blobs not deleted by gc


Krzysztof Kaczmarski

Jun 12, 2015, 5:25:35 PM
to disc...@googlegroups.com, Paweł Tobiś
Hi All,

We constantly upload and delete large amounts of data in Disco DDFS.
Uploads are done with the ddfs chunk command, and deletions with the command-line tool as well.
After deletion we can see that the tags have been renamed to +deleted*, but the blobs remain unchanged in the file system.
The disco log does not report any errors.
Running the garbage collector does not improve the situation either. We cannot tell which blob files
may be safely removed from the file system. We also observe free space in the cluster shrinking steadily until it
is completely full and unusable.

We set the following configuration:
DISCO_GC_AFTER=60*60*24 (1 day)
DDFS_PARANOID_DELETE=True
DDFS_TAG_REPLICAS=3
DDFS_BLOB_REPLICAS=3
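
One thing worth checking with DDFS_PARANOID_DELETE enabled: as far as I recall from the Disco settings documentation, the garbage collector then does not delete obsolete files but renames them with a !trash. prefix, so that an external process can verify and remove them. If that is what is happening here, GC alone would never free disk space. The following is a minimal Python sketch for listing such files and totalling their size. The "!trash" prefix is an assumption based on that documentation, the function names are made up for illustration, and the script only reports files; it deletes nothing.

```python
import os


def find_trash_files(ddfs_root, prefix="!trash"):
    """Return a sorted list of (path, size_in_bytes) for every file under
    ddfs_root whose file name starts with prefix.

    The "!trash" prefix is an assumption about how DDFS marks obsolete
    files when DDFS_PARANOID_DELETE is set; verify it on your cluster.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(ddfs_root):
        for name in filenames:
            if name.startswith(prefix):
                path = os.path.join(dirpath, name)
                found.append((path, os.path.getsize(path)))
    return sorted(found)


def total_size(entries):
    """Sum the sizes of the (path, size) pairs returned above."""
    return sum(size for _path, size in entries)
```

Usage would look like find_trash_files("/srv/disco/ddfs"), where the path is only a placeholder for your own DDFS_DATA directory on each node. Whether it is safe to remove the files it reports is something to confirm against the Disco documentation first.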

Can somebody point us to the source of the problem? How can we force GC to remove unused files?

Cheers,
Krzysztof



Erik Dubbelboer

Jun 13, 2015, 3:10:34 AM
to disc...@googlegroups.com, pawel...@orange.pl
I'm not sure if this has anything to do with your problem, but orphaned blobs and tags are only deleted after 5 days. See: https://github.com/discoproject/disco/blob/83874d93e74491c15699233865880ffb9ca7c2a0/master/src/ddfs/config.hrl#L130

Cheers,
Erik
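
To get a feel for what a five-day expiry means for disk usage, here is a rough back-of-the-envelope sketch in Python. It assumes that everything uploaded is deleted again shortly afterwards and then stays on disk until the expiry has passed; the 100 GB per day figure is purely hypothetical, and GC scheduling will add some delay on top of this.

```python
GB = 10**9


def orphaned_bytes_on_disk(daily_upload_bytes, blob_replicas, expiry_days):
    """Rough steady-state volume of deleted-but-not-yet-collected blob data.

    Every uploaded byte is stored blob_replicas times and, under the
    assumption above, lingers for expiry_days after it becomes orphaned.
    """
    return daily_upload_bytes * blob_replicas * expiry_days


if __name__ == "__main__":
    daily = 100 * GB  # hypothetical ingest rate, not taken from this thread
    for days in (5, 1):
        held = orphaned_bytes_on_disk(daily, blob_replicas=3, expiry_days=days)
        print("expiry of %d day(s): about %d GB held" % (days, held // GB))
```

With these made-up numbers, three replicas and a five-day expiry hold roughly five times as much dead data as a one-day expiry would.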

Krzysztof Kaczmarski

Jun 13, 2015, 8:20:03 AM
to disc...@googlegroups.com, pawel...@orange.pl
Is there any way to change this to one day?
We haven't got enough space to keep input data for five days...

K.
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "Disco-development" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/disco-dev/oRyxxUCmg-4/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> disco-dev+...@googlegroups.com
> <mailto:disco-dev+...@googlegroups.com>.
> To post to this group, send email to disc...@googlegroups.com
> <mailto:disc...@googlegroups.com>.
> Visit this group at http://groups.google.com/group/disco-dev.
> For more options, visit https://groups.google.com/d/optout.

Erik Dubbelboer

Jun 14, 2015, 3:50:42 AM
to disc...@googlegroups.com, pawel...@orange.pl
At the moment it's only possible to change this by modifying the source, then recompiling and installing your modified version.

Cheers,
Erik

Krzysztof Kaczmarski

Jun 15, 2015, 5:03:19 AM
to disc...@googlegroups.com, Paweł Tobiś
Can you point us to the place in the code that should be changed?

Thanks,
Krzysztof

Erik Dubbelboer

Jun 16, 2015, 3:53:06 AM
to disc...@googlegroups.com, pawel...@orange.pl

Krzysztof Kaczmarski

Jun 17, 2015, 3:08:03 AM
to disc...@googlegroups.com, pawel...@orange.pl
Thanks Erik, this is very helpful.

There is a comment for this line:

% When orphaned blob can be deleted.  This should be large enough that
% you can upload all the new blobs of a tag and perform the tag update
% within this time.

Is this valid only for adding a blob to an existing tag?
Will it matter if we just use ddfs chunk and create tags and blobs in one command?

Cheers,
Krzysztof

Erik Dubbelboer

Jun 19, 2015, 12:15:40 AM
to disc...@googlegroups.com, pawel...@orange.pl
ddfs chunk might be one command but internally it is implemented as multiple operations. It first creates and uploads the blobs and then creates a tag and adds them to it.
I would suggest setting it to 1 day or a couple of hours depending on how big the blobs are that you are putting into disco.

Cheers,
Erik
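
Since the blobs are uploaded before the tag that references them exists, they look orphaned for as long as the upload takes, so the expiry has to be comfortably longer than the slowest upload. A minimal Python sketch of that sanity check follows; the function names, the safety factor of 4, and the example sizes and throughput are all assumptions chosen for illustration, not values from Disco.

```python
def min_safe_expiry_seconds(total_bytes, throughput_bytes_per_s, safety_factor=4.0):
    """Estimate the shortest expiry that still covers one full upload.

    upload time = total_bytes / throughput, multiplied by a safety factor
    to allow for slow nodes, retries and the final tag update.
    """
    upload_seconds = total_bytes / throughput_bytes_per_s
    return upload_seconds * safety_factor


def is_expiry_safe(expiry_seconds, total_bytes, throughput_bytes_per_s,
                   safety_factor=4.0):
    """True if the chosen expiry is at least the estimated safe minimum."""
    needed = min_safe_expiry_seconds(total_bytes, throughput_bytes_per_s,
                                     safety_factor)
    return expiry_seconds >= needed


if __name__ == "__main__":
    size = 500 * 10**9   # hypothetical: 500 GB per ddfs chunk run
    rate = 50 * 10**6    # hypothetical: 50 MB/s sustained upload
    for label, expiry in (("1 day", 24 * 3600), ("2 hours", 2 * 3600)):
        print(label, "is safe:", is_expiry_safe(expiry, size, rate))
```

With these example numbers the upload takes about 2.8 hours, so a one-day expiry leaves plenty of room while a two-hour expiry would risk GC removing blobs before their tag is written.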