Hi Pavol,
Thanks for the information. I just want to let you know that I've successfully set up a scheduled job in Kubernetes to clean up my Jaeger Elasticsearch data. I'll share the CronJob manifest here, just in case other people need it as well.
```
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: es-jaeger-cleaner
  namespace: data-devops
  labels:
    app: es-jaeger-cleaner
    env: stg
spec:
  # runs every day at 11:00 UTC
  schedule: "0 11 * * *"
  jobTemplate:
    metadata:
      labels:
        app: es-jaeger-cleaner
        env: stg
    spec:
      template:
        metadata:
          labels:
            app: es-jaeger-cleaner
            env: stg
        spec:
          containers:
          - name: es-jaeger-cleaner
            image: jaegertracing/jaeger-es-index-cleaner:latest
            # delete ES indices older than 7 days, counted from now
            args: ["7", "<jaeger's elastic search client service/hostname>:9200"]
            env:
            # timeout (in seconds) for the cleaner's Elasticsearch requests
            - name: TIMEOUT
              value: "300"
          restartPolicy: Never
```
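If you want to verify it works without waiting for the schedule, you can trigger a one-off run from the CronJob; something like the below should do (the job name es-jaeger-cleaner-manual is just an example):
```
# create a one-off Job from the CronJob, then follow its logs
kubectl -n data-devops create job es-jaeger-cleaner-manual --from=cronjob/es-jaeger-cleaner
kubectl -n data-devops logs job/es-jaeger-cleaner-manual --follow
```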
Anyway, since the data growth depends on several factors:
- the trace sampling rate on istio-proxy,
- the number of services, and
- the number of active requests within the mesh,
I don't have an exact formula either. But this is where disk usage monitoring plays a role: I can watch the trend of data growth day over day and week over week, and set the data retention accordingly.
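If it helps, a quick way to eyeball that trend is Elasticsearch's _cat/indices API. This is just a sketch assuming the default daily jaeger-span-*/jaeger-service-* index naming; swap in your own ES hostname:
```
# list each Jaeger index with its doc count and on-disk size, sorted by name
curl -s "http://<jaeger's elastic search client service/hostname>:9200/_cat/indices/jaeger-*?v&h=index,docs.count,store.size&s=index"
```
Comparing the store.size of consecutive daily indices gives a rough per-day growth figure you can base the retention value on.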
Best,
Agung