Data Retention Policy in Jaeger with ES storage backend


Agung Pratama

Nov 5, 2018, 1:44:29 AM
to Jaeger Tracing
Hi All,

Our production Kubernetes workload uses Istio and Jaeger for tracing. I have set up the Jaeger installation to use Elasticsearch with a persistent volume (via the Helm chart). My question is:
- How do I set up data retention in Jaeger with the ES storage backend?

Currently I have a 500 GB PVC and a trace sampling rate of 1% (in the Istio sidecar container), and I am seeing very fast data growth. I want to avoid filling the disk, so I would like to set up a data retention policy. Is there any configuration or other means to do that?

Best,
Agung

Pavol Loffay

Nov 5, 2018, 5:04:34 AM
to agp.c...@gmail.com, jaeger-...@googlegroups.com
Hi Agung,

Jaeger stores data in daily indices, which are not cleaned up automatically. To clean old data you have to use Elasticsearch Curator. We also provide a script and a Docker image with a simplified API: https://github.com/jaegertracing/jaeger/tree/master/plugin/storage/es and https://www.jaegertracing.io/download/ (jaeger-es-index-cleaner). In k8s this usually runs as a cron job.
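
To make the daily-index cleanup concrete, here is a minimal Python sketch of the selection step. It is not the jaeger-es-index-cleaner code: it assumes index names of the form `jaeger-span-YYYY-MM-DD` and `jaeger-service-YYYY-MM-DD`, the function name `indices_to_delete` is made up for illustration, and the exact cutoff-day boundary is an assumption.

```python
from datetime import date, datetime, timedelta

# Assumption: daily indices are named "<prefix>YYYY-MM-DD",
# e.g. "jaeger-span-2018-11-05".
INDEX_PREFIXES = ("jaeger-span-", "jaeger-service-")


def indices_to_delete(index_names, keep_days, today):
    """Return the daily Jaeger indices dated more than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        for prefix in INDEX_PREFIXES:
            if not name.startswith(prefix):
                continue
            try:
                index_day = datetime.strptime(name[len(prefix):], "%Y-%m-%d").date()
            except ValueError:
                break  # prefix matches but the suffix is not a date; leave it alone
            if index_day < cutoff:
                expired.append(name)
            break
    return expired


names = [
    "jaeger-span-2018-10-30",
    "jaeger-service-2018-10-30",
    "jaeger-span-2018-11-07",
    ".kibana",
]
# prints ['jaeger-span-2018-10-30', 'jaeger-service-2018-10-30']
print(indices_to_delete(names, keep_days=7, today=date(2018, 11, 8)))
```

Indices that do not belong to Jaeger (such as `.kibana` above) are never selected, which is the main reason to match on the prefix rather than on age alone.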

I don't know what TTL you should use to manage disk resources ideally. You will have to do some calculations based on your traffic and the span data Istio generates. If you come up with an equation for an Istio deployment, feel free to post it back here.

Regards,


--
You received this message because you are subscribed to the Google Groups "Jaeger Tracing" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jaeger-tracin...@googlegroups.com.
To post to this group, send email to jaeger-...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jaeger-tracing/77816863-5a44-44ef-b854-08223590f3e0%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--

PAVOL LOFFAY

SOFTWARE ENGINEER

Red Hat

M: +41791562647    

Agung Pratama

Nov 8, 2018, 3:41:23 AM
to plo...@redhat.com, jaeger-...@googlegroups.com
Hi Pavol,

Thanks for the information. I just want to let you know that I've successfully set up a scheduled job in Kubernetes to clean up my Jaeger Elasticsearch data. I'll share the CronJob setup here, in case other people need it as well:
```
# batch/v1beta1 matches 2018-era clusters; CronJob is served from batch/v1
# since Kubernetes 1.21, and batch/v1beta1 was removed in 1.25.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: es-jaeger-cleaner
  namespace: data-devops
  labels:
    app: es-jaeger-cleaner
    env: stg
spec:
  # every day at 11:00 UTC
  schedule: "0 11 * * *"
  jobTemplate:
    metadata:
      labels:
        app: es-jaeger-cleaner
        env: stg
    spec:
      template:
        metadata:
          labels:
            app: es-jaeger-cleaner
            env: stg
        spec:
          containers:
            - name: es-jaeger-cleaner
              image: jaegertracing/jaeger-es-index-cleaner:latest
              # clean up ES data indices older than 7 days from now
              args: ["7", "<jaeger's elastic search client service/hostname>:9200"]
              env:
                - name: TIMEOUT
                  value: "300"
          restartPolicy: Never
```

Anyway, the data growth depends on several factors:
- the trace sampling rate on istio-proxy
- the number of services
- the number of active requests within the mesh

I don't have an exact formula either. But this is where disk usage monitoring plays a role: I can see the trend of data growth every day/week and set the data retention accordingly.
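
As a rough illustration of turning that trend into a retention value, here is a minimal Python sketch. It assumes one disk usage reading per day taken while no cleanup job is deleting indices, and an 80% fill target; the function name, the readings, and the headroom value are all hypothetical.

```python
import math


def max_retention_days(daily_usage_gb, capacity_gb, headroom=0.8):
    """Estimate how many days of indices fit on the volume.

    daily_usage_gb: disk usage readings taken once per day, oldest first.
    headroom: fraction of the volume that trace data may fill (assumed margin).
    """
    if len(daily_usage_gb) < 2:
        raise ValueError("need at least two daily readings")
    # Average growth per day across the observation window.
    growth_per_day = (daily_usage_gb[-1] - daily_usage_gb[0]) / (len(daily_usage_gb) - 1)
    if growth_per_day <= 0:
        raise ValueError("usage is not growing; no retention limit can be derived")
    return math.floor(capacity_gb * headroom / growth_per_day)


readings = [120.0, 155.0, 190.0, 225.0]  # hypothetical readings, 35 GB/day
print(max_retention_days(readings, capacity_gb=500))  # prints 11
```

With these made-up numbers a 500 GB volume holds about 11 days of data, so a cleaner argument of 7 would leave some slack for traffic spikes.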

Best,
Agung
--
Agung Pratama

Pavol Loffay

Nov 8, 2018, 3:45:04 AM
to agp.c...@gmail.com, jaeger-...@googlegroups.com
Thanks for sharing Agung!

Regards,



Juraci Paixão Kröhling

Nov 8, 2018, 4:18:35 AM
to jaeger-...@googlegroups.com
Agung,

Would you be able to send a PR with this code to the jaeger-kubernetes
repository?

https://git.io/fpTsv

I'm sure more people would benefit from having your code as a template for
their own usage. Bonus points if you also include a mention of it in the
README of that repo :)

And if you're feeling really generous, we could also make use of something
like that in the Jaeger Operator!

https://github.com/jaegertracing/jaeger-operator

Thanks,
Juraci

Agung Pratama

Nov 8, 2018, 4:46:23 AM
to jpkro...@redhat.com, jaeger-...@googlegroups.com
Hi Juraci,

Sure, I would be glad to create a PR for this. I'll let you know once the PR is ready for review.

Thanks,
Agung


Shubham Tiwari

Sep 20, 2022, 9:49:44 AM
to Jaeger Tracing
Hi Agung,

I am also using Istio and Jaeger for tracing in my Kubernetes production environment, but I need to set up a persistent volume in order to keep previous data. Could you please help me with this? Any sample application or steps on how to do that would be much appreciated.

Thanks and Regards,
Shubham Tiwari

Jonah Kowall

Sep 20, 2022, 2:11:08 PM
to Shubham Tiwari, Jaeger Tracing
To reply to both of you: using Istio (or any service mesh) to enable tracing is a really bad idea. You will get little more value than you get from logging unless you do the actual work of instrumenting the application itself. This is best summarized and expanded upon in Yuri's recent article, "Myth: service mesh can do distributed tracing of your application" by Yuri Shkuro (Medium, Aug 2022).

To answer the question about managing storage in Elasticsearch, I would look at the documentation here: https://www.jaegertracing.io/docs/1.38/deployment/#elasticsearch-rollover

Let me know if you have additional questions.


-Jonah Kowall
Google Voice - 617-500-3575
Twitter - @jkowall
