How to 'cut' on null?
The Unix 'file' command has a -0 option that outputs a null character after each filename. This is supposedly meant for use with 'cut'.
From man file:
-0, --print0
Output a null character ‘\0’ after the end of the filename. Nice
to cut(1) the output. This does not affect the separator which is
still printed.
(Note, on my Linux, the '-F' separator is NOT printed - which makes more sense to me.)
How can you use 'cut' to extract a filename from output of 'file'?
This is what I want to do:
find . "*" -type f | file -n0iNf - | cut -d<null> -f1
where <null> is the NUL character.
Well, that is what I am trying to do; what I ultimately want is to get all file names from a directory tree that have a particular MIME type. I use a grep (not shown) for that.
I want to handle all legal file names and not get stuck on file names with colons, for example, in their name. Hence, NUL would be excellent.
I guess non-cut solutions are fine too, but I hate to give up on a simple idea.
Solution 1:[1]
Just specify an empty delimiter:
cut -d '' -f1
(N.B.: The space between the -d and the '' is important, so that the -d and the empty string get passed as separate arguments; if you write -d'', then that will get passed as just -d, and then cut will think you're trying to use -f1 as the delimiter, which it will complain about, with an error message that "the delimiter must be a single character".)
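As a quick check of the empty-delimiter form, here is a minimal sketch that feeds simulated input (not real 'file' output) through it. It assumes GNU cut; the function name first_field_nul is made up for illustration.

```shell
# Hypothetical helper: print the part of each line before the first NUL byte.
# Assumes GNU cut, where an empty -d argument selects NUL as the delimiter.
first_field_nul() {
  cut -d '' -f1
}

# Simulated "name NUL description" lines; the first name contains a colon.
printf 'a:b.txt\0 text/plain\nplain.txt\0 text/plain\n' | first_field_nul
# a:b.txt
# plain.txt
```

Note that cut still works line by line, so a file name containing a newline would be split regardless of the delimiter.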
Solution 2:[2]
This works with GNU awk:
awk 'BEGIN{FS="\x00"}{print$1}'
Solution 3:[3]
ruakh's helpful answer works well on Linux.
On macOS, the cut utility doesn't accept '' as a delimiter argument; it fails with a "bad delimiter" error.
Here is a portable workaround that works on both platforms, via the tr utility; it only makes one assumption:
- The input mustn't contain \1 control characters (START OF HEADING, U+0001), which is unlikely in text.
- You can substitute any character known not to occur in the input for \1; if it's a character that can be represented verbatim in a string, that simplifies the solution because you won't need the auxiliary command substitution ($(...)) with a printf call for the -d argument.
- If your shell supports so-called ANSI C-quoted strings, which is true of bash, zsh and ksh, you can replace "$(printf '\1')" with $'\1'.
(The following uses a simpler input command to demonstrate the technique.)
# In zsh, bash, ksh you can simplify "$(printf '\1')" to $'\1'
$ printf '[first field 1]\0[rest 1]\n[first field 2]\0[rest 2]' |
tr '\0' '\1' | cut -d "$(printf '\1')" -f 1
[first field 1]
[first field 2]
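The same tr-based substitution can be extended to the MIME-type filtering mentioned in the question. The following is only a sketch under stated assumptions: the function name names_with_mime is hypothetical, the input is simulated "name NUL description" lines rather than real 'file' output, the MIME type is used as a basic regular expression, and the input contains no \1 bytes.

```shell
# Hypothetical helper: print the names whose description (the part after the
# NUL byte) mentions the given MIME type. Uses only tr, grep and cut.
# Assumes no \1 (U+0001) bytes in the input; $1 is treated as a regex.
names_with_mime() {
  nwm_mime=$1
  nwm_soh=$(printf '\1')        # placeholder delimiter standing in for NUL
  tr '\0' '\1' |
    grep -e "${nwm_soh}.*${nwm_mime}" |   # match only after the delimiter
    cut -d "$nwm_soh" -f 1
}

# Simulated input; the second path contains "text/plain" but is a PNG.
printf './a:b.txt\0 text/plain; charset=us-ascii\n./text/plain/pic.png\0 image/png; charset=binary\n' |
  names_with_mime text/plain
# ./a:b.txt
```

Anchoring the grep pattern on the placeholder delimiter keeps a path that merely contains the MIME string from matching.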
Alternatives to using cut:
- C. Paul Bond's helpful answer shows a portable awk solution.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | ruakh |
Solution 2 | C. Paul Bond |
Solution 3 | mklement0 |