Callbacks around series lifecycle in TSDB for the use of external users

GANESH VERNEKAR

unread,

Apr 21, 2020, 10:17:43 AM4/21/20

to Prometheus Developers

Hi All,

I would like to propose a callback for the TSDB for series creation and deletion. This would help external projects like Cortex and Thanos in implementing limits on series while having no performance impact on Prometheus with minimal code addition.

The callback would look something like this (names TBD) with no performance impact on TSDB.

type SeriesLifecycleCallback interface {

PreCreationCallback(labels.Labels) error

PostCreationCallback(labels.Labels)

PostDeletionCallback(labels.Labels)

}

1. SeriesLifecycleCallback

It would be a part of DB.Options

2. PreCreationCallback

It would be called just before holding the lock here. And if the method returns an error, it means we don’t want to create a new series.

Further the hashes would be checked as usual, and if we reach the step of adding a new hash, it would be either executed or skipped based on the result of `PreCreationCallback`.

If the series creation was skipped, the error returned by `PreCreationCallback` would be propagated back into the call stack and the `Add` or `AddFast` method would finally return back the error. (This means additional return values for the methods in that call stack).

3. PostCreationCallback

This would be called after releasing the lock here letting the caller know that the series was created.

4. PostDeletionCallback

It would be called after deleting a series during garbage collection here (after unlocking) so that external projects can do the bookkeeping.

----

In the context of Prometheus, the interface would be a noop, hence no performance impact.

Some reasoning behind the design decisions and what guarantees it provides:

The idea was to avoid having the callbacks inside the locked section which could lead to performance degradation. Hence the Series creation callback was split into 2 methods. PreCreationCallback method tells whether to create the series if we need to create and the next method `PostCreationCallback` to let the user know if it was created, so that appropriate bookkeeping can be done.

Since all the callbacks happen outside the locks (even if it was inside the lock, there could be concurrent calls for different shards of the stripe series), the user can expect some soft consistencies here. For example, the PreCreationCallback could be called concurrently by multiple series creation when you are at the edge of the series limit and all the series creation would be allowed, which might cause the limit to cross by a small number. Similarly during the deletion phase: some creation of series would fail as the deletion callback is called a little later. Hence the user is required to expect soft consistencies while using these callbacks.

What do you all think about this addition in TSDB?

Thanks,

Ganesh

Brian Brazil

unread,

Apr 21, 2020, 11:29:12 AM4/21/20

to GANESH VERNEKAR, Prometheus Developers

On Tue, 21 Apr 2020 at 15:17, 'GANESH VERNEKAR' via Prometheus Developers <prometheus...@googlegroups.com> wrote:

Hi All,

I would like to propose a callback for the TSDB for series creation and deletion. This would help external projects like Cortex and Thanos in implementing limits on series while having no performance impact on Prometheus with minimal code addition.

The callback would look something like this (names TBD) with no performance impact on TSDB.

type SeriesLifecycleCallback interface {
PreCreationCallback(labels.Labels) error
PostCreationCallback(labels.Labels)
PostDeletionCallback(labels.Labels)
}

1. SeriesLifecycleCallback

It would be a part of DB.Options

2. PreCreationCallback

It would be called just before holding the lock here. And if the method returns an error, it means we don’t want to create a new series.

Further the hashes would be checked as usual, and if we reach the step of adding a new hash, it would be either executed or skipped based on the result of `PreCreationCallback`.

If the series creation was skipped, the error returned by `PreCreationCallback` would be propagated back into the call stack and the `Add` or `AddFast` method would finally return back the error. (This means additional return values for the methods in that call stack).

3. PostCreationCallback

This would be called after releasing the lock here letting the caller know that the series was created.

4. PostDeletionCallback

It would be called after deleting a series during garbage collection here (after unlocking) so that external projects can do the bookkeeping.

----

In the context of Prometheus, the interface would be a noop, hence no performance impact.

It'd still cost a few cycles.

Some reasoning behind the design decisions and what guarantees it provides:

The idea was to avoid having the callbacks inside the locked section which could lead to performance degradation. Hence the Series creation callback was split into 2 methods. PreCreationCallback method tells whether to create the series if we need to create and the next method `PostCreationCallback` to let the user know if it was created, so that appropriate bookkeeping can be done.

Since all the callbacks happen outside the locks (even if it was inside the lock, there could be concurrent calls for different shards of the stripe series), the user can expect some soft consistencies here. For example, the PreCreationCallback could be called concurrently by multiple series creation when you are at the edge of the series limit and all the series creation would be allowed, which might cause the limit to cross by a small number. Similarly during the deletion phase: some creation of series would fail as the deletion callback is called a little later. Hence the user is required to expect soft consistencies while using these callbacks.

What do you all think about this addition in TSDB?

This seems to require adding error handling for cases where Prometheus cannot have errors, what happens to a scrape or rule evaluation when this fails?

I'm also not sure it's considering that the semantics of how this is all managed inside the tsdb is not what you might expect. For example a failed scrape due to hitting the sample limit can still cause series to be created, which usually doesn't come up but might now.

--

Brian Brazil

www.robustperception.io

GANESH VERNEKAR

unread,

Apr 21, 2020, 11:47:56 AM4/21/20

to Prometheus Developers

On Tuesday, April 21, 2020 at 8:59:12 PM UTC+5:30, Brian Brazil wrote:

On Tue, 21 Apr 2020 at 15:17, 'GANESH VERNEKAR' via Prometheus Developers <prometheus...@googlegroups.com> wrote:
Hi All,

I would like to propose a callback for the TSDB for series creation and deletion. This would help external projects like Cortex and Thanos in implementing limits on series while having no performance impact on Prometheus with minimal code addition.

The callback would look something like this (names TBD) with no performance impact on TSDB.

type SeriesLifecycleCallback interface {
PreCreationCallback(labels.Labels) error
PostCreationCallback(labels.Labels)
PostDeletionCallback(labels.Labels)
}

1. SeriesLifecycleCallback

It would be a part of DB.Options

2. PreCreationCallback

It would be called just before holding the lock here. And if the method returns an error, it means we don’t want to create a new series.

Further the hashes would be checked as usual, and if we reach the step of adding a new hash, it would be either executed or skipped based on the result of `PreCreationCallback`.

If the series creation was skipped, the error returned by `PreCreationCallback` would be propagated back into the call stack and the `Add` or `AddFast` method would finally return back the error. (This means additional return values for the methods in that call stack).

3. PostCreationCallback

This would be called after releasing the lock here letting the caller know that the series was created.

4. PostDeletionCallback

It would be called after deleting a series during garbage collection here (after unlocking) so that external projects can do the bookkeeping.

----

In the context of Prometheus, the interface would be a noop, hence no performance impact.

It'd still cost a few cycles.

But it would be very less to even notice.

Some reasoning behind the design decisions and what guarantees it provides:

The idea was to avoid having the callbacks inside the locked section which could lead to performance degradation. Hence the Series creation callback was split into 2 methods. PreCreationCallback method tells whether to create the series if we need to create and the next method `PostCreationCallback` to let the user know if it was created, so that appropriate bookkeeping can be done.

Since all the callbacks happen outside the locks (even if it was inside the lock, there could be concurrent calls for different shards of the stripe series), the user can expect some soft consistencies here. For example, the PreCreationCallback could be called concurrently by multiple series creation when you are at the edge of the series limit and all the series creation would be allowed, which might cause the limit to cross by a small number. Similarly during the deletion phase: some creation of series would fail as the deletion callback is called a little later. Hence the user is required to expect soft consistencies while using these callbacks.

What do you all think about this addition in TSDB?

This seems to require adding error handling for cases where Prometheus cannot have errors, what happens to a scrape or rule evaluation when this fails?

The Add and AddFast methods already returns an error. So failure in series creation would follow the same procedure. But thinking about it more, it will be a no-op for Prometheus, and I don't think the external users who would use this feature will have scraping in it (considering it as a "TSDB" package and not "Prometheus" package).

I'm also not sure it's considering that the semantics of how this is all managed inside the tsdb is not what you might expect. For example a failed scrape due to hitting the sample limit can still cause series to be created, which usually doesn't come up but might now.

Yes, all the changes in TSDB are not totally explored yet, I will work on a draft to see what all needs changing. Additionally, series is still created currently if the appender Commit() fails (a failed scrape?).

--
Brian Brazil
www.robustperception.io

Brian Brazil

unread,

Apr 21, 2020, 12:09:49 PM4/21/20

to GANESH VERNEKAR, Prometheus Developers

A failed scrape would be a Rollback. A WAL error is what would cause a Commit to fail, which is likely a full filesystem.

--

Brian Brazil

www.robustperception.io

GANESH VERNEKAR

unread,

Apr 22, 2020, 5:06:13 AM4/22/20

to Prometheus Developers

Yes, but series is still created in case of Rollback and not deleted till next compaction.

--
Brian Brazil
www.robustperception.io

GANESH VERNEKAR

unread,

Apr 27, 2020, 6:41:14 AM4/27/20

to Prometheus Developers

Hi,

I forgot to post the link to the PR implementing this proposal in an attempt to show the required changes, here it is: https://github.com/prometheus/prometheus/pull/7159

Coming back to the concerns regarding error in no-error cases and how would it work in case of failed scrapes:

Inside TSDB: The error created during series creation would be transported back to the Add method which already returns errors.

Outside TSDB where Add is used: In the case of Prometheus, it is a no-op, hence there no change. And it largely depends on how the external user handles the errors if we are talking about outside Prometheus.

Additionally, I don't see this being used along with scraping by an external user. But talking about scraping anyway, it would be a rollback when an error is returned. In this case the series won't be created and we have cases where series are not created in `Add` (empty label set or duplicate labels). Any problems in semantics here?

Thanks,

Ganesh

Reply all

Reply to author

Forward