How do xargs and GNU Parallel differ when parallelizing code?
Here's a basic question: how do xargs and GNU Parallel differ when parallelizing code?
And are there use cases in which you'd use one over the other?
I ask this because I have seen answers to parallelization questions where using either tool has been deemed acceptable by the community.
Solution 1:[1]
Some of the differences are covered on: https://www.gnu.org/software/parallel/parallel_alternatives.html#differences-between-xargs-and-gnu-parallel
TL;DR: xargs is faster because it has almost no overhead (~0.3 ms/job compared to GNU Parallel's ~3 ms/job). GNU Parallel is safer because it takes all sorts of precautions so you do not need to worry (e.g. output from two jobs running in parallel will not mix). GNU Parallel has loads of features that xargs does not have. GNU Parallel requires Perl; xargs does not. xargs is installed everywhere, whereas with GNU Parallel you have to use --embed (which embeds GNU Parallel in your own script) to make sure it is available everywhere.
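To make the comparison concrete, here is a minimal sketch of the same fan-out written with both tools. The job (an echo) and the four input lines are placeholders chosen for illustration, not something from the linked documentation. Note that -P is widely supported by xargs implementations but is not part of POSIX, and the GNU Parallel branch is skipped if GNU Parallel is not installed.

```shell
# Hypothetical workload: four inputs, one trivial job per input.

# xargs: -n 1 passes one argument per job, -P 4 runs up to four jobs at once.
# The inline sh -c script receives the argument as $1.
xargs_out=$(printf '%s\n' 1 2 3 4 | xargs -n 1 -P 4 sh -c 'echo "job $1"' sh | sort)

# GNU Parallel: -j 4 runs up to four jobs at once, {} is replaced by the input line.
# Guarded because another tool named "parallel" (or none) may be installed.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    parallel_out=$(printf '%s\n' 1 2 3 4 | parallel -j 4 echo "job {}" | sort)
fi

printf '%s\n' "$xargs_out"
```

Both pipelines are sorted afterwards because jobs running in parallel can finish in any order.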
So in general: if the primary concern is to avoid overhead (e.g. if your jobs take only a few ms each) or to avoid installing Perl (e.g. if your system is embedded and thus resource constrained), then use xargs (and take the relevant precautions depending on your input/output).
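One example of such a precaution on the input side, as a rough sketch: by default xargs splits its input on blanks and newlines, so file names containing spaces get broken apart. Passing NUL-delimited names avoids that. The temporary directory and file names below are made up for illustration, and -print0 / -0 are common extensions rather than POSIX options.

```shell
# Hypothetical test data: one plain file name and one containing a space.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" "$tmpdir/with space.txt"

# Default splitting: "with space.txt" is cut at the space and becomes two arguments.
unsafe_count=$(find "$tmpdir" -type f | xargs -n 1 echo | wc -l | tr -d ' ')

# NUL-delimited input: each file name reaches the job as exactly one argument.
safe_count=$(find "$tmpdir" -type f -print0 |
    xargs -0 -n 1 -P 2 sh -c 'printf "%s\n" "$1"' sh | wc -l | tr -d ' ')

rm -rf "$tmpdir"
echo "arguments seen without -0: $unsafe_count, with -0: $safe_count"
```

On the output side, a comparable precaution is to have each job write to its own file rather than to a shared stdout, since xargs does not keep the output of parallel jobs apart.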
Full disclosure: I have a vested interest in GNU Parallel.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |