How to 'cut' on null?

The Unix 'file' command has a -0 option that outputs a null character after the filename. This is supposedly useful in combination with 'cut'.

From man file:

-0, --print0
         Output a null character ‘\0’ after the end of the filename. Nice
         to cut(1) the output. This does not affect the separator which is
         still printed.

(Note, on my Linux, the '-F' separator is NOT printed - which makes more sense to me.)

How can you use 'cut' to extract a filename from output of 'file'?

This is what I want to do:

find . -type f | file -n0iNf - | cut -d<null> -f1

where <null> is the NUL character.

Well, that is what I am trying to do; what I actually want is to get all file names from a directory tree that have a particular MIME type. I filter the output with grep (not shown).

I want to handle all legal file names and not get stuck on names that contain, for example, colons. Hence, NUL would be an excellent delimiter.

I guess non-cut solutions are fine too, but I hate to give up on a simple idea.



Solution 1:[1]

Just specify an empty delimiter:

cut -d '' -f1

(N.B.: The space between the -d and the '' is important, so that the -d and the empty string get passed as separate arguments; if you write -d'', then that will get passed as just -d, and then cut will think you're trying to use -f1 as the delimiter, which it will complain about, with an error message that "the delimiter must be a single character".)
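
As a quick check of the empty-delimiter form, the following sketch feeds cut two lines shaped like 'file -0' output (a name, a NUL, then a MIME type). The sample names and types are made up for illustration, and the sketch assumes GNU cut; other implementations may reject the empty delimiter.

```shell
# Assumes GNU cut (coreutils); the sample file names and MIME types are invented.
# Each input line is: <file name> NUL <mime type>
first_fields=$(printf 'a:b.txt\0text/plain\nc d.png\0image/png\n' | cut -d '' -f1)

# Colons and spaces in the names survive, because only NUL separates the fields.
printf '%s\n' "$first_fields"
```

Under those assumptions this prints a:b.txt and c d.png, one per line.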

Solution 2:[2]

This works with GNU awk.

awk 'BEGIN{FS="\x00"}{print$1}'

Solution 3:[3]

  • ruakh's helpful answer works well on Linux.

  • On macOS, however, the cut utility doesn't accept '' as a delimiter argument; it fails with a "bad delimiter" error.

Here is a portable workaround that works on both platforms, via the tr utility; it only makes one assumption:

  • The input mustn't contain \1 control characters (START OF HEADING, U+0001) - which is unlikely in text.

  • You can substitute any character known not to occur in the input for \1; if that character can be represented verbatim in a string, the solution becomes simpler, because you no longer need the auxiliary command substitution ($(...)) with a printf call for the -d argument.

  • If your shell supports so-called ANSI C-quoted strings - which is true of bash, zsh and ksh - you can replace "$(printf '\1')" with $'\1'

(The following uses a simpler input command to demonstrate the technique.)

# In zsh, bash, ksh you can simplify "$(printf '\1')" to $'\1'
$ printf '[first field 1]\0[rest 1]\n[first field 2]\0[rest 2]' |
    tr '\0' '\1' | cut -d "$(printf '\1')" -f 1

[first field 1]
[first field 2]
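
The same tr-then-cut technique can be wrapped in a small function so that it reads like a single filter. This is only a sketch: the name first_nul_field is hypothetical, the sample input is invented to resemble 'file -0' output, and the function still relies on the assumption above that \1 never occurs in the input.

```shell
# Hypothetical helper: print the first NUL-delimited field of each input line.
# Relies on the assumption that the input contains no \1 (SOH) characters.
first_nul_field() {
    soh=$(printf '\1')                  # stand-in delimiter that cut accepts
    tr '\0' '\1' | cut -d "$soh" -f 1   # swap NUL for SOH, then cut on SOH
}

# Invented sample: a name containing colons, a NUL, then a MIME type.
result=$(printf 'name:with:colons\0 text/plain\nplain.txt\0 text/x-c\n' | first_nul_field)
printf '%s\n' "$result"
```

With this input the function prints name:with:colons and plain.txt, which shows that colons in a file name no longer interfere with field splitting.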

Alternatives to using cut:

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ruakh
Solution 2 mklement0
Solution 3