ROS nodes running but some connections are broken

Setup

I am running ROS nodes on two separate machines: my laptop (ROS Melodic on Ubuntu 18.04) and a VOXL computer (ROS Kinetic on Yocto).

Problem

The VOXL computer runs the roscore. Basic communication between the two devices is fine, and ROS_IP and ROS_MASTER_URI are set explicitly on each device.

However, there are some nodes (only a few!) that seem unconnected although they launch fine, with roswtf generating the error:

ERROR The following nodes should be connected but aren't:
 * /node_x -> /node_y
 * ...

When searching online, this type of error is usually attributed to DNS-related issues, but here both devices are successfully connected over the network and most nodes function properly; only a few do not.

Also, killing a node that has trouble communicating (rosnode kill node_z) and restarting it on its own (rosrun package node_z) lets it communicate properly again.

roscore related issue?

Furthermore, a configuration that works when several nodes interact on the same device, with the roscore on that same machine, breaks certain connections when the roscore runs on another device instead. What could cause this discrepancy?

Sensitivity to launch sequence

It seems that the order in which nodes or launch files are started has an impact.

Conclusion

I am not sure what causes the issue here or where to look to solve it...



Solution 1:[1]

Ok so it seems this behavior can be due to a firewall, in this case ufw. I was not aware the firewall was enabled and since some nodes could communicate while some couldn't, I did not think to check this first.

To fix the communication issue, all I had to do was to disable the firewall:

sudo ufw disable

After that, all nodes were able to communicate properly and roswtf is no longer reporting any error.
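Before turning the firewall off entirely, one way to confirm that it is what blocks a given node is to try opening a TCP connection to that node's XML-RPC port from the other machine. The following is only a minimal sketch, assuming Python 3 is available on the machine doing the check. The helper names can_connect and uri_reachable are made up for illustration, and the URI to test would be whatever rosnode list -a reports for the node in question.

```python
import socket
from urllib.parse import urlparse


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, unreachable host, or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def uri_reachable(uri, timeout=2.0):
    """Check a node URI of the form http://host:port/ (hypothetical helper)."""
    parsed = urlparse(uri)
    return can_connect(parsed.hostname, parsed.port, timeout)


# Example call (the address below is a placeholder, not taken from the question):
# uri_reachable("http://192.168.1.20:45123/")
```

If this returns False from the remote machine while the node is running, and True when run on the node's own machine, something between the two (such as ufw) is likely dropping inbound connections. It only probes the XML-RPC port; topic data travels over separate ports, which a firewall would typically block in the same way.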

Solution 2:[2]

The error "The following nodes should be connected but aren't:" is caused by a firewall blocking inbound or outbound connections. I faced a similar issue when I wanted to communicate from a MATLAB ROS node running on a Windows PC (Windows 10) with a master node (roscore) running on a Linux machine (Ubuntu 18.04). I finally got it working after allowing MATLAB to communicate over the network through the firewall.
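To see at a glance which node pairs are affected, the roswtf report can be reduced to a list of pairs. This is a rough sketch only: it assumes the report looks like the excerpt quoted in the question (a header line followed by " * /a -> /b" bullets), and the function name unconnected_pairs is hypothetical.

```python
import re

# Matches bullets such as " * /node_x -> /node_y"; trailing text is ignored.
PAIR = re.compile(r"^\s*\*\s+(/[\w/]+)\s*->\s*(/[\w/]+)")


def unconnected_pairs(report):
    """Return (source, destination) node pairs listed under the roswtf error."""
    pairs = []
    in_section = False
    for line in report.splitlines():
        if "should be connected but aren't" in line:
            in_section = True
            continue
        if not in_section:
            continue
        match = PAIR.match(line)
        if match:
            pairs.append((match.group(1), match.group(2)))
        elif line.strip() and not line.lstrip().startswith("*"):
            # A non-bullet, non-empty line ends the section.
            in_section = False
    return pairs
```

If every affected pair has a node on the same machine at one end, that machine's firewall settings are a reasonable first place to look.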

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 LoW
Solution 2 Vinu Raja Kumar C