Google Cloud Ops Agent too many errors
On one of my servers in GCP something is wrong with google-cloud-ops-agent. The Fluent Bit subagent it uses for logging is writing too many error logs: over three days it produced 88 GB, and we had already cleaned it up once before. I can't work out what these logs actually mean. Can somebody help with this?
root@***:/var/log/google-cloud-ops-agent/subagents# tail -50 logging-module.log
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [storage] [cio file] file is not mmap()ed: tail.1:29458-1644260316.150179737.flb
[2022/02/15 16:56:06] [error] [input chunk] error writing data from tail.1 instance
After restarting google-cloud-ops-agent-fluent-bit.service, it went into an endless start/stop loop, repeating the following:
root@***:/var/log/google-cloud-ops-agent/subagents# tail -300 logging-module.log
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.1] metadata_server set to http://metadata.google.internal
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] client_email is not defined, using a default one
[2022/02/15 18:15:46] [ warn] [output:stackdriver:stackdriver.1] private_key is not defined, fetching it from metadata server
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] worker #7 started
.....
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238945.234513362.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238950.216326541.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238953.150198939.flb
[2022/02/15 18:15:46] [ info] [input:storage_backlog:storage_backlog.2] register tail.1/29458-1644238957.150224348.flb
[2022/02/15 18:15:46] [error] [storage] format check failed: tail.1/29458-1644260316.150179737.flb
[2022/02/15 18:15:46] [error] [engine] could not segregate backlog chunks
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopping...
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #0 stopped
[2022/02/15 18:15:46] [ info] [output:stackdriver:stackdriver.0] thread worker #1 stopping...
Restarting google-cloud-ops-agent-opentelemetry-collector.service and google-cloud-ops-agent.service did not help. Any ideas why this is happening and what these logs mean?
Solution 1:[1]
You didn't mention the version that is experiencing this issue, or whether you've upgraded from an earlier version, but there was a bug in Ops agent versions prior to 2.7.1 that caused buffer corruption, which manifested in later versions as the error you are quoting ("format check failed"). The solution is to delete the corrupted files until the agent runs properly. See the public issue tracker for detailed instructions.
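As a rough sketch of that cleanup, the commands below check the installed agent version and then move the buffer chunk files aside before restarting the agent. They assume the default buffer location /var/lib/google-cloud-ops-agent/fluent-bit/buffers/ (the tail.1 subdirectory matches the chunk names in your log); adjust the path and package commands to your install, and note that any log data in the moved chunks will not be delivered to Cloud Logging.

# Check the installed Ops Agent version (the buffer-corruption bug affected versions before 2.7.1)
sudo dpkg -l google-cloud-ops-agent    # on RPM-based systems: sudo rpm -q google-cloud-ops-agent

# Stop the agent, move the suspect buffer chunks out of the way, then restart
sudo systemctl stop google-cloud-ops-agent
sudo mkdir -p /root/flb-buffers-backup
sudo mv /var/lib/google-cloud-ops-agent/fluent-bit/buffers/tail.1/*.flb /root/flb-buffers-backup/
sudo systemctl start google-cloud-ops-agent

# Confirm the restart loop is gone
sudo tail -50 /var/log/google-cloud-ops-agent/subagents/logging-module.log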
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Igor Peshansky |