Google Cloud Ops Agent too many errors

Anton Makarov

Feb 17, 2022, 6:03:24 AM
to Google Cloud Developers

On one of my servers in GCP, something is wrong with google-cloud-ops-agent. Fluent Bit, which the agent uses for logs, writes far too many error logs. In three days they grew to 88 GB, and we had already cleaned them up once before. I can't work out what exactly the logs mean. Can somebody help with this?

root@***:/var/log/google-cloud-ops-agent/subagents# tail -50 logging-module.log
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance

After I restarted google-cloud-ops-agent-fluent-bit.service, it went into an endless loop of starting and stopping, repeating the following:

root@***:/var/log/google-cloud-ops-agent/subagents# tail -300 logging-module.log
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.1] metadata_server set to http://metadata.google.internal
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] client_email is not defined, using a default one
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] private_key is not defined, fetching it from metadata server
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] worker #7 started

.....

[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238945.234513362.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238950.216326541.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238953.150198939.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238957.150224348.flb
[2022/02/15 18:15:46] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 18:15:46] [error] [engine] could not segregate backlog chunks
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopping...
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopped
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #1 stopping...

Restarting google-cloud-ops-agent-opentelemetry-collector.service and google-cloud-ops-agent.service did not help. Any ideas why this is happening and what the logs mean?

Lluis Munoz Ladron de Guevara

Feb 23, 2022, 10:51:25 AM
to Google Cloud Developers
Hi, 

This seems to be related to the problem described in this issue tracker, which explains how a corrupted file can cause these errors.

I encourage you to try the solution suggested in this comment.
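
Before trying any fix, it may help to confirm which chunk files the errors actually point at. The following is a minimal Python sketch, not part of the Ops Agent or Fluent Bit tooling, that counts the "format check failed" lines per chunk file in logging-module.log. The regular expression is based only on the log lines quoted above, and the function names `failing_chunks` and `report` are made up for illustration. It only reads the log and does not touch any buffer files.

```python
import re
from collections import Counter

# Matches lines such as (format taken from the log quoted in this thread):
# [2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
FORMAT_CHECK_RE = re.compile(
    r"\[error\] \[storage\] format check failed: (\S+\.flb)"
)


def failing_chunks(lines):
    """Count 'format check failed' errors per chunk file.

    `lines` is any iterable of log lines, so a very large log can be
    streamed from disk instead of being loaded into memory at once.
    """
    counts = Counter()
    for line in lines:
        counts.update(FORMAT_CHECK_RE.findall(line))
    return counts


def report(path):
    """Print each failing chunk with its error count (hypothetical helper)."""
    with open(path, errors="replace") as fh:
        for chunk, n in failing_chunks(fh).most_common():
            print(f"{n:10d}  {chunk}")
```

Calling `report("logging-module.log")` from the subagents log directory would list the chunk names sorted by how often they appear. In the excerpts above, every such error names the same file, tail.1/29458-1644260316.150179737.flb, which is consistent with a single corrupted chunk.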