Ansible "error loading fact - please check content", but not always

I am trying to retrieve the available memory using Ansible facts. When I run the script directly on the target machine, it works as expected. However, when I run it from a remote machine with Ansible, it only populates the data correctly some of the time, and most of the time fills it with the warning string "error loading fact - please check content". I haven't been able to find a pattern: sometimes it works twice in a row, sometimes only once every 4 or 5 attempts.

memory.fact

#!/bin/bash
echo "{ \"total_mb\": $(free -m | grep Mem: | awk '{print $2*0.95}') }"

output when running on the target machine

/etc/ansible/facts.d$ ./memory.fact
{ "total_mb": 60996,6 }

ansible command line

ansible <hostname> -m ansible.builtin.setup -a "filter=ansible_local"

output when it works correctly (I also checked the assigned value, and it seems to be correct)

PLAY [Ansible Ad-Hoc] *********************************************************************************************************************************************************************************************

TASK [ansible.builtin.setup] **************************************************************************************************************************************************************************************
ok: [hostname]

PLAY RECAP ********************************************************************************************************************************************************************************************************
hostname             : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

output when it doesn't work (I also checked the assigned value, and it's the warning string)

PLAY [Ansible Ad-Hoc] *********************************************************************************************************************************************************************************************

TASK [ansible.builtin.setup] **************************************************************************************************************************************************************************************
[WARNING]: error loading fact - please check content
ok: [hostname]

PLAY RECAP ********************************************************************************************************************************************************************************************************
hostname             : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

I've tried running the ansible command at maximum verbosity and adding debug=true to the config file, but the output is exactly the same whether or not it works (the only exceptions, obviously, being the assigned value in the ansible_facts JSON, the warning line, and the timestamps).

How can the output of the same command be different? What possible root cause should I investigate to fix it?

Also, in case I can't find the cause, is there a workaround I could use to tell Ansible to retry until it gets a correct value instead of a string? Because the task doesn't fail, the playbook only stops later, when the value is actually used.



Solution 1:[1]

In case anyone finds this from a search, I didn't find the reason why the memory fact wasn't collected properly sometimes. However, I found a hacky workaround to make it "work".

I modified my playbook so that the fact-gathering task retries until the fact contains a "total_mb" key.

# Re-run fact gathering until the local "memory" fact parsed correctly.
- name: regather local facts
  setup: filter=ansible_local
  until: '"total_mb" in ansible_local["memory"]'
  retries: 20
  delay: 2

If anyone has an idea of the real reason why this failed in the first place, I'm still interested to know.
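
One thing worth ruling out is the fact script's own output. The sample shown in the question, { "total_mb": 60996,6 }, contains a decimal comma and is therefore not valid JSON. Whether this explains the intermittent warning is unverified. Below is a minimal sketch of a memory.fact variant, assuming the comma comes from the locale in effect when awk formats the number. It pins the locale and prints an integer; the helper name total_mb is hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch of a memory.fact variant (assumption: the decimal comma seen in the
# original output comes from the active locale's number formatting).

# Pin the locale so numeric output never uses a decimal comma.
export LC_ALL=C

# total_mb (hypothetical helper): reads `free -m` style output on stdin and
# prints 95% of the total memory, truncated to an integer.
total_mb() {
  awk '/^Mem:/ { printf "%d", $2 * 0.95 }'
}

# Emit the fact as JSON, as the original script does.
echo "{ \"total_mb\": $(free -m | total_mb) }"
```

To check the result by hand on the target machine, the output can be piped through a JSON parser, for example ./memory.fact | python3 -m json.tool, which exits with an error on invalid JSON (this assumes python3 is installed there).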

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 adepierre