--
You received this message because you are subscribed to the Google Groups "Druid Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-developm...@googlegroups.com.
To post to this group, send email to druid-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-development/3b291c8c-5692-4da5-89b2-99698e292cc4%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Druid should take care of automatically dropping obsoleted segments for a time range. When you query for data, only the most recent data for a time range is scanned. If you really want to manually remove a segment, you can set the "used" flag for that segment to 'false' in your MySQL database. Disabling a datasource should cause all segments of that datasource to be removed. There is a config parameter called "druid.master.millisToWaitBeforeDeleting" which determines how long Druid waits before starting to drop segments. In more recent versions of Druid, you should be able to dynamically configure this parameter from the master console under the dynamic configuration link.
For the really simple answer, if you are just in a POC/dev environment and are not worried about potentially doing it wrong, you can just go to the segments table in MySQL and remove all of the rows for the segments you don't want.
This won't delete the segments from deep storage, but it will get them out of the Druid system.
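As a rough illustration of the "used" flag approach described above (as opposed to deleting rows outright), here is a minimal sketch. The table name `druid_segments`, the column names `id`, `dataSource` and `used`, and the helper `mark_unused` are assumptions for illustration only; check the actual schema of your metadata database first. An in-memory SQLite database stands in for MySQL so the sketch runs on its own; a MySQL driver would typically use `%s` placeholders instead of `?`.

```python
import sqlite3

# Assumed name of the segments table; the real name depends on your setup.
SEGMENTS_TABLE = "druid_segments"


def mark_unused(conn, segment_ids):
    """Set used = 0 for each given segment id; return how many rows changed."""
    changed = 0
    for segment_id in segment_ids:
        cur = conn.execute(
            f"UPDATE {SEGMENTS_TABLE} SET used = 0 WHERE id = ?",
            (segment_id,),
        )
        changed += cur.rowcount
    conn.commit()
    return changed


def demo():
    """Build a toy segments table and disable one of its two segments."""
    conn = sqlite3.connect(":memory:")  # stand-in for the MySQL metadata store
    conn.execute(
        f"CREATE TABLE {SEGMENTS_TABLE} "
        "(id TEXT PRIMARY KEY, dataSource TEXT, used INTEGER)"
    )
    conn.executemany(
        f"INSERT INTO {SEGMENTS_TABLE} VALUES (?, ?, ?)",
        [("seg_a", "events", 1), ("seg_b", "events", 1)],  # made-up rows
    )
    changed = mark_unused(conn, ["seg_a"])
    rows = conn.execute(
        f"SELECT id, used FROM {SEGMENTS_TABLE} ORDER BY id"
    ).fetchall()
    conn.close()
    return changed, rows


if __name__ == "__main__":
    print(demo())
```

Flipping the flag keeps the row in place, so it can be reviewed before anything is removed for good, whereas deleting the row discards the metadata immediately.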
On Wednesday, November 6, 2013 10:02:04 AM UTC-8, Rui Wang wrote:
Hi Fangjin,
Thanks for the note.
Btw, Fangjin, I have 2 additional questions:
1. By setting the 'used' flag to false in MySQL, does that
a. disable the segment, making it ready for deletion (or does disabling the datasource do this)? ...or
b. really remove the segment?
c. And in either case, it won't be used in a query, right?
2. After disabling the datasource, I do see that segments are being dropped. This is what we want, but it is going
very slowly. Is that expected? It looks like over 3 days, each machine dropped about 200GB of segments.
This won't delete the segments from deep storage, but it will get them out of the Druid system.
Thanks, Eric. In that case, will those segments become orphaned? Could you get them back into Druid if you wanted to?
2. by disabling the datasource, I do see that segments are being dropped. this is what we want...but it is going
very slow. is it the way it should be? looks like in over 3 days, each machine dropped about 200gb of segments.
Dropping should be very fast. Are you saying that after 3 days, not all the segments of the datasource have been dropped? That should definitely not be the case. Do you see segments not getting dropped at all, or dropped and then loaded back?
It was removing segments at a constant, slow speed: over 3 days, each machine removed about 200GB worth of segments. One thing I should mention: these segments were created when we had OOM problems in the Hadoop ingestion, so each segment is about 14MB, which is quite small, and we need to drop 35000 of them. Is this what makes it slow?
Thanks,
Rui
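To put the numbers above in perspective, here is a back-of-the-envelope calculation. It assumes decimal units (1 GB = 1000 MB) and that every segment is about 14MB; whether the 35000 segments are per machine or cluster-wide is not stated in the thread, so only the per-machine drop rate is derived. The function name `drop_stats` is made up for this sketch.

```python
SEGMENT_SIZE_MB = 14          # approximate segment size, from the thread
SEGMENT_COUNT = 35_000        # segments to drop, from the thread
DROPPED_GB_PER_MACHINE = 200  # observed per machine, from the thread
DAYS = 3                      # observation window, from the thread


def drop_stats():
    """Return (total GB to drop, segments dropped per machine, segments per minute)."""
    total_gb = SEGMENT_COUNT * SEGMENT_SIZE_MB / 1000
    segments_dropped = DROPPED_GB_PER_MACHINE * 1000 / SEGMENT_SIZE_MB
    per_minute = segments_dropped / (DAYS * 24 * 60)
    return total_gb, segments_dropped, per_minute


if __name__ == "__main__":
    total_gb, dropped, per_minute = drop_stats()
    print(f"total to drop: {total_gb:.0f} GB")
    print(f"dropped per machine in {DAYS} days: about {dropped:.0f} segments")
    print(f"rate: about {per_minute:.1f} segments per minute per machine")
```

Under these assumptions the 35000 segments add up to about 490 GB, and each machine has been dropping only about 3 segments per minute.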
Does it appear to have removed all of the small segments, or is it still working on them?