Apache Beam FileIO match - What's the better/more efficient way to match files? [closed]

I'm wondering whether using a wildcard affects how Beam matches files. For instance, if I want to match a file with Apache Beam, is there an advantage to specifying a direct path to the file (e.g. folder/subfolder/file.txt)? Or would passing just a wildcard to the match() method be as efficient, or worse, in terms of the framework's performance?

Thanks



Solution 1:[1]

Compared to the cost of reading the files (and spinning up workers, if running on a distributed runner), the cost of matching will be negligible. On the other hand, multiple reads (each with its own direct path) will generally incur more overhead than a single read over a wildcard match.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 robertwb