TSDB Delete 2 more hour data


Patrick Ma

May 20, 2020, 4:03:19 PM
to OpenTSDB
Hi, I'm having a problem with deletion in OpenTSDB. For some metrics, when I delete the data points in the time range [start, end], the data points in [end, end + 2h] are also deleted. I'm using OpenTSDB 2.3.4.

1. How to Reproduce
Here is the Python script that reproduces the problem. The test metric name is "test.fix_missing_data".
```
import json
import time
from random import randint, seed

import requests

METRIC = 'test.fix_missing_data'


def insert_test_data(start, end):
    # clear any existing data for the metric in [start, end] before inserting
    requests.delete('http://localhost:4242/api/query?m=none:{}{{}}{{}}&start={}&end={}'.format(METRIC, start, end))
    time.sleep(3)  # wait for the delete to take effect
    # seed the random number generator so runs are repeatable
    seed(1)
    DATA_INTERVAL = 30  # insert a data point every 30 seconds
    tags = {'tag4': 'v4', 'tag1': 'v1'}

    # insert test data in batches of buf_size points
    buf_size = 100
    buf = []
    for ts in range(start, end, DATA_INTERVAL):  # range works on Python 2 and 3
        value = randint(0, 10)
        buf.append({'metric': METRIC, 'timestamp': ts, 'value': value, 'tags': tags})
        if len(buf) >= buf_size:
            requests.post('http://localhost:4242/api/put?summary=true', data=json.dumps(buf))
            buf = []
    # flush whatever is left in the buffer
    if buf:
        requests.post('http://localhost:4242/api/put?summary=true', data=json.dumps(buf))


def delete(start, end):
    # literal braces are doubled ({{ }}) so str.format does not treat them as fields
    requests.delete('http://localhost:4242/api/query?&start={}&end={}&m=max:explicit_tags:test.fix_missing_data{{}}{{tag4=wildcard(*),tag1=wildcard(*)}}'.format(start, end))
```
Note that I'm using `explicit_tags` to delete the data points.

2. My Investigations

So I tried `scan --delete 1589785200 1589871600 none test.fix_missing_data tag4=v4 tag1=v1` (with 1589785200 as start and 1589871600 as end) and found that one extra hour of data was matched. I therefore expected at most one extra hour of data to be deleted. But as mentioned above, two extra hours of data are actually deleted.
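
A possible explanation for the one extra hour matched by `scan`, offered as an assumption rather than something verified in the OpenTSDB source: data points are stored in rows whose base timestamp is aligned to the hour, and the `end` used here falls exactly on an hour boundary, so an inclusive end would touch the row that starts at `end`. The sketch below only does the arithmetic; `ROW_SPAN`, `row_base`, and `rows_touched` are hypothetical names, and it does not talk to OpenTSDB.

```python
ROW_SPAN = 3600  # assumption: rows are keyed by an hour-aligned base timestamp


def row_base(ts):
    """Return the hour-aligned base timestamp of the row containing ts."""
    return ts - (ts % ROW_SPAN)


def rows_touched(start, end):
    """List the base timestamps of every hourly row overlapping [start, end]."""
    return list(range(row_base(start), row_base(end) + ROW_SPAN, ROW_SPAN))


start, end = 1589785200, 1589871600
bases = rows_touched(start, end)
print(len(bases))        # 25: 24 full hours plus the row that starts at `end`
print(bases[-1] == end)  # True: `end` is itself the base of a new row
```

If this is what happens, it would account for one extra hour but not for the second hour seen with `explicit_tags`.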

This seems to be related to the use of "explicit_tags". When I removed "explicit_tags" from the HTTP query, only one extra hour of data was deleted instead of two. The query I used was: 'http://localhost:4242/api/query?&start=1589785200&end=1589871600&m=max:test.fix_missing_data{}{tag4=wildcard(*),tag1=wildcard(*)}'.
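
To put a number on how far past `end` each delete reaches, the data after `end` can be queried once the delete has run, and the first surviving timestamp compared with `end`. A minimal sketch, assuming the default `/api/query` response shape in which each series carries a `dps` map of timestamp strings to values; `extra_deleted_seconds` is a hypothetical helper and `sample_dps` is made-up data for illustration, not real output.

```python
def extra_deleted_seconds(dps, end):
    """Return the gap in seconds between `end` and the first surviving
    data point after it, or None if no point after `end` survives."""
    later = sorted(int(ts) for ts in dps if int(ts) > end)
    if not later:
        return None
    return later[0] - end


# made-up example: the first surviving point is two hours after `end`
sample_dps = {"1589878800": 4, "1589878830": 7, "1589878860": 1}
print(extra_deleted_seconds(sample_dps, 1589871600))  # 7200
```

With points inserted every 30 seconds, a gap of about 30 would mean nothing beyond `end` was removed, while 3600 or 7200 would correspond to one or two extra hours.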

3. My Questions
My situation is that:
- I have to use explicit tags to match data points.
- I delete data on a daily basis, and I don't want to delete data outside the current day.

Can anyone suggest what I should do? Many thanks!
