BASH: Filter list of files by return value of another command

I have a series of directories with (mostly) video files in them, say:

test1
  1.mpg
  2.avi
  3.mpeg
  junk.sh
test2
  123.avi
  432.avi
  432.srt
test3
  asdf.mpg
  qwerty.mpeg

I create a variable (video_dir) with the directory names (based on other parameters) and use it with find to generate the basic list. I then filter by file type using another variable (video_types), because there are sometimes non-video files in the directories, by piping the list through egrep. Then I shuffle the list and save it to a file, which mplayer later uses to run a slideshow. I currently use the following command to accomplish that. I'm sure it's a horrible way to do it, but it works for me and it's quite fast even on big directories.

video_dir="/test1 /test2"
video_types='\.mpg$|\.avi$|\.mpeg$'

find ${video_dir} -type f    |
  egrep -i "${video_types}"  |
  shuf > "$TEMP_OUT"

I would now like to filter out files based on the height (vertical resolution) of the video. I can get that from:

mediainfo --Output='Video;%Height%' filename

This just prints a number. I have tried using find's -exec option to run that command on each file:

 find ${video_dir} -type f -exec mediainfo --Output='Video;%Height%' {} \;

But that just prints the list of heights, not the file names, and I can't figure out how to reject files based on a comparison such as height < 480. I could use a for loop, but that seems like a bad (slow) idea.

Using info from @mark-setchell, I modified it to:

video_dir="test1"

find ${video_dir} -type f   \
   -exec bash -c 'h=$(mediainfo --Output="Video;%Height%" "$1"); [[ $h -gt 480 ]]' _ {} \; -print

Which works.



Solution 1:[1]

You can replace your egrep with the following so you are still inside the find command (-iname is case insensitive and -o represents a logical OR):

find test1 test2 -type f                                       \
     \( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \) \
     NEXT_BIT

The NEXT_BIT can then -exec bash and exit with status 0 or 1 depending on whether you want the current file included or excluded. So it will look like this:

-exec bash -c 'H=$(mediainfo --Output=... "$1"); [ "$H" -lt 480 ] && exit 1; exit 0' _ {} \;

So, taking note of @tripleee's advice in the comments about superfluous exit statements, I get this:

find test1 test2 -type f                                       \
    \( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \)  \
    -exec bash -c 'h=$(mediainfo ...options... "$1"); [ "$h" -ge 480 ]' _ {} \; -print
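
As a rough sketch, the pieces above can be combined with the shuffle step from the question. The function name filter_by_height and its min_height parameter are made up for illustration. The sketch assumes mediainfo is on the PATH and prints a single integer for a video file; files for which it prints anything else are skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper (the name is illustrative, not part of find or mediainfo):
# print video files under the given directories whose height is >= min_height.
filter_by_height() {
  local min_height=$1; shift
  find "$@" -type f \
    \( -iname '*.mpg' -o -iname '*.avi' -o -iname '*.mpeg' \) \
    -exec bash -c '
      # $1 is the minimum height, $2 is the file name supplied by find
      h=$(mediainfo --Output="Video;%Height%" "$2")
      # Treat empty or non-numeric output as "no usable video stream"
      [[ $h =~ ^[0-9]+$ ]] && (( h >= $1 ))
    ' _ "$min_height" {} \; -print
}

# Possible usage, with the paths and output variable from the question:
# filter_by_height 480 /test1 /test2 | shuf > "$TEMP_OUT"
```

Passing the threshold as an argument avoids editing the single-quoted script whenever the limit changes.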

Solution 2:[2]

Here is my solution.

#!/bin/bash

shopt -s extglob

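video_dir=(/test1 /test2)
# Quoted so the pattern is stored as-is rather than expanded against the current directory
video_types='*.@(mpg|avi|mpeg|mp4)'

while IFS= read -r file; do
  # The unquoted right-hand side of == is matched as a glob pattern
  if [[ $file == $video_types ]]; then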
    h=$(mediainfo --Output="Video;%Height%"  "$file")
    (( h >= 480 )) && echo "$file"
  fi
done < <(find "${video_dir[@]}" -type f)

The *.@(...) pattern is extended-glob syntax, so extglob must be enabled (shopt -s extglob); an unquoted pattern of that form is a syntax error without it. With this solution, all of the processing happens inside the while read loop.
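
If file names may contain newlines or leading whitespace, the same loop can be driven by NUL-delimited output from find. This is only a sketch: list_tall_videos and min_height are illustrative names, the extension test is moved into find, and mediainfo is assumed to print a single integer for video files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print files under the given directories whose
# reported video height is at least min_height.
list_tall_videos() {
  local min_height=$1; shift
  local file h
  # -print0 and read -d '' keep each file name intact, whatever it contains
  while IFS= read -r -d '' file; do
    h=$(mediainfo --Output='Video;%Height%' "$file")
    # Skip files for which no numeric height was reported
    [[ $h =~ ^[0-9]+$ ]] || continue
    if (( h >= min_height )); then
      printf '%s\n' "$file"
    fi
  done < <(find "$@" -type f \
             \( -iname '*.mpg' -o -iname '*.avi' -o -iname '*.mpeg' -o -iname '*.mp4' \) \
             -print0)
}

# Possible usage:
# list_tall_videos 480 /test1 /test2
```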

Solution 3:[3]

This Q&A was focused on one particular case, so the accepted answer is not as general as it could be.

find

If the list of files comes from find, one can use its filtering facilities, e.g. -exec:

find ${video_dir} -type f \
     -exec COMMAND \; \
     -print

Here

  • COMMAND is not enclosed in quotes -- find reads everything after -exec and up to a \;
  • find will expand {} to the current file name (including path -- you might find -execdir helpful, which will cd to the file's directory and replace {} with the leaf file name)
  • The exit code of COMMAND is treated as follows:
    • 0 -> true
    • non-0 -> false

Note that you can build more complex expressions (e.g. -not -exec ...), which will be evaluated "from left to right, according to the rules of precedence ... -and is assumed where the operator is omitted." (per man find)
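
As a small illustration of a negated test, the sketch below uses grep -q as COMMAND and prints only the files for which it fails. The function name files_without is made up for this example. -not is the spelling used above; POSIX find spells it !.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print files under dir that do NOT contain the fixed string needle.
files_without() {
  local needle=$1 dir=$2
  # grep -q exits 0 on a match, so -exec is true for matching files;
  # -not inverts that, and the implicit -and then reaches -print.
  find "$dir" -type f -not -exec grep -qF -- "$needle" {} \; -print
}

# Possible usage:
# files_without TODO ./src
```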

xargs

If the list of files comes from elsewhere (and is available on stdin), you can use xargs as follows (from "If xargs is map, what is filter?"):

ls | xargs -I{} bash -c 'COMMAND "$1" && echo "$1"' _ {}
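
When names may contain quotes, spaces or newlines, a NUL-delimited variant may be preferable. The following is a sketch rather than a definitive implementation: keep_if is a made-up name, and it assumes an xargs that supports -0 together with -I (as GNU xargs does).

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep_if CMD [ARGS...]
# Reads NUL-delimited items on stdin and prints each item for which
# "CMD ARGS... ITEM" exits with status 0.
keep_if() {
  xargs -0 -I{} bash -c '
    item=$1; shift
    # "$@" now holds the command and its fixed arguments
    if "$@" "$item"; then
      printf "%s\n" "$item"
    fi
  ' _ {} "$@"
}

# Possible usage: keep only the non-empty files
# find . -type f -print0 | keep_if test -s
```

The item is passed as a positional parameter instead of being pasted into the script text, so quotes inside a name are not interpreted by the shell.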

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1
Solution 2    Jetchisel
Solution 3    Nickolay