Different expiration policies for results and errors


Giovanni Santini

Aug 31, 2019, 4:25:42 PM
to celery-users
Hi everyone,

I have a serious issue with my current setup: a fairly dense workflow is executed
many times in order to fetch metrics and send them, with the appropriate tags,
to a Graphite cluster.

My main issue is that I am using RabbitMQ as a result backend in order to use chords;
however, the results usually fill up the RabbitMQ queue.
Long story short, too many results are stored, which makes RabbitMQ crash or go nuts.

My approach was to ignore results completely, as sending to Graphite does not require
the result to be saved.
However, setting task_ignore_result in the app config breaks the chords and makes
the workflow unusable.

Thus my questions:
  1. Is it possible to store only errors? That is, can the chord results actually be
    removed from RabbitMQ once they have been read?
  2. Can we have 2 different expiration policies for results and errors? That would be
    really nice (results kept for 1 hour, errors for a few days for logging).
  3. Any suggestions on how to solve this issue?
Additional notes:
  1. I am using RabbitMQ as broker and result backend; I am not using the optimized library though.
  2. The workflow I've mentioned is really similar to this:
    group(
        get_entities(),
        group(
            get_meta_a(),  # A chorded group of tasks
            get_meta_b.s(),
            get_meta_c.s()
        ) | merge_metadatas.s(),
    ) | match_metadata_to_entity.s() | group(
        get_metrics.s(start=i, end=i+slice_size) | send_to_graphite.s()
        for i in range(0, count_entities(), slice_size)
    )

Thanks in advance!

Giovanni

Ing. Josue Balandrano Coronel

Aug 31, 2019, 6:58:17 PM
to celery...@googlegroups.com
I would suggest switching from RabbitMQ to Redis as a backend. It is much more stable.
Ignoring results will break workflows because Celery uses results to keep tabs on how a workflow is executed.
I don't think there's such granularity in result expiration. One thing you could do is implement your own error handler and store the errors somewhere else. Remember that Celery results are ephemeral and should be treated that way, not as properly stored values.
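
A minimal sketch of the "store the errors somewhere else" idea, using only the Python standard library. The names `record_failure` and `prune_failures`, the JSON-lines file and the three-day retention are assumptions made for illustration and are not part of Celery. The intent is that a custom base task class would call `record_failure` from its `on_failure(exc, task_id, args, kwargs, einfo)` override, so that errors get their own retention no matter how quickly results expire.

```python
import json
import time
from pathlib import Path

# Hypothetical retention: keep error records for three days.
ERROR_RETENTION_SECONDS = 3 * 24 * 3600


def record_failure(log_path, exc, task_id, args, kwargs, now=None):
    """Append one failure as a JSON line and return the stored entry."""
    entry = {
        "ts": time.time() if now is None else now,
        "task_id": task_id,
        "error": repr(exc),
        "args": repr(args),
        "kwargs": repr(kwargs),
    }
    with Path(log_path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return entry


def prune_failures(log_path, max_age=ERROR_RETENTION_SECONDS, now=None):
    """Drop entries older than max_age seconds; return how many were removed."""
    path = Path(log_path)
    if not path.exists():
        return 0
    cutoff = (time.time() if now is None else now) - max_age
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line]
    kept = [entry for entry in entries if entry["ts"] >= cutoff]
    path.write_text(
        "".join(json.dumps(entry) + "\n" for entry in kept), encoding="utf-8"
    )
    return len(entries) - len(kept)
```

Running `prune_failures` periodically (for example from a scheduled task) would give errors a lifetime of a few days, while the result backend keeps its own, shorter expiry.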

--
You received this message because you are subscribed to the Google Groups "celery-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to celery-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/celery-users/131d2a10-7c22-4f63-9006-9e0d54ea0150%40googlegroups.com.

Giovanni Santini

Sep 2, 2019, 4:25:42 AM
to celery-users
Hi Josue,

Thanks for the suggestion!
Regarding the backend switch, is it for the results or also for the broker?
I stopped ignoring the results except for a few tasks where I know I can.
I will consider this approach (a proper handler) if it becomes necessary in the future :)

Thanks a lot again!

Giovanni



Ing. Josue Balandrano Coronel

Sep 2, 2019, 7:31:47 AM
to celery...@googlegroups.com
Only for the results. When using more complicated workflows, like chords, results shouldn't be ignored. In the case of chords, results are mainly used by Celery internally.
For the broker, it is always better to have a proper AMQP implementation like RabbitMQ.
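
Put together, the advice in this thread amounts to roughly the following settings module. This is only a sketch: the module name `celeryconfig.py`, the hosts, ports and credentials are placeholders, and the one-hour value is taken from the original question. `broker_url`, `result_backend`, `result_expires` and `task_ignore_result` are standard Celery setting names. The single `result_expires` value is not split by outcome, which is consistent with the remark earlier in the thread that no finer granularity seems to be available.

```python
# celeryconfig.py (hypothetical module name; all URLs are placeholders)

# RabbitMQ stays in place as the message broker.
broker_url = "amqp://guest:guest@localhost:5672//"

# Redis replaces RabbitMQ as the result backend only.
result_backend = "redis://localhost:6379/0"

# Stored results expire after one hour (value in seconds).
result_expires = 3600

# Results are not ignored globally, since chords depend on them.
task_ignore_result = False
```

An app would typically load such a module with `app.config_from_object("celeryconfig")`.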


Giovanni Santini

Sep 3, 2019, 5:59:14 AM
to celery-users
Ok, thank you very much for the suggestion again :)

