How does ‘find -exec’ pass file names with spaces?

If I have a directory containing some files whose names have spaces, e.g.

$ ls -1 dir1
file 1
file 2
file 3

I can successfully copy all of them to another directory like this:

$ find dir1 -mindepth 1 -exec cp -t dir2 {} +

However, the output of find dir1 -mindepth 1 contains un-escaped spaces:

$ find dir1 -mindepth 1
dir1/file 1
dir1/file 2
dir1/file 3

If I use -print0 instead of -print, the output still contains un-escaped spaces:

$ find dir1 -mindepth 1 -print0
dir1/file 1dir1/file 2dir1/file 3

To copy these files manually using cp, I would need to escape the spaces; but it seems that this is unnecessary when cp’s arguments come from find, irrespective of whether I use + or ; at the end of the command.

What’s the reason for this?

Answers:


Method 1

The find command executes the command directly. The command, including the filename argument, will not be processed by the shell or anything else that might modify the filename. It’s very safe.

You are correct that there’s no need to escape filenames which are represented by {} on the find command line.

find passes the raw filename from disk directly into the internal argument list of the -exec command, in your case, the cp command.
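A quick way to see this is to let find run printf instead of cp. This is only a sketch: the scratch directory from mktemp and the file names (copied from the question) are assumptions for illustration, and -mindepth is the same non-POSIX extension the question already uses.

```shell
# Hypothetical scratch directory; the names mirror the question's example.
tmp=$(mktemp -d)
mkdir "$tmp/dir1"
touch "$tmp/dir1/file 1" "$tmp/dir1/file 2" "$tmp/dir1/file 3"

# printf reuses its format for every argument it receives, so each argument
# that find passes is printed on its own line, wrapped in brackets.
out=$(find "$tmp/dir1" -mindepth 1 -exec printf '[%s]\n' {} +)
printf '%s\n' "$out"

# One bracketed line per file, not one per word.
count=$(printf '%s\n' "$out" | wc -l)
echo "arguments received: $((count))"

rm -rf "$tmp"
```

Each bracket pair encloses a whole path, space included, which is what cp receives in the original command.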

Method 2

The question is two-part:

  • how does find manage to call programs using -exec without running into problems with spaces embedded in filenames, and
  • what good is the -print0 option?

For the first, find makes a system call, one of a family of related calls referred to as “exec”. It passes the filename as one element of the argument array given to this call, and the new program (started after find has created a new process) receives that array unchanged, so no information about the filename is lost.
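The need to escape spaces when typing a cp command by hand comes from the shell, which splits the command line into words before it builds the argument array. A minimal sketch of that splitting, assuming a POSIX-style shell such as sh or bash with the default IFS (the variable name is made up for the example):

```shell
# Hypothetical value standing in for one file name from the question.
name='dir1/file 1'

# Unquoted expansion: the shell splits the value on whitespace first.
set -- $name
unquoted=$#

# Quoted expansion: the value stays a single word.
set -- "$name"
quoted=$#

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

find never goes through this splitting step for {}, so there is nothing to quote or escape.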

The POSIX find feature + is explained as follows, in the rationale:

A feature of SVR4’s find utility was the -exec primary’s +
terminator. This allowed filenames containing special characters
(especially newline characters) to be grouped together without the
problems that occur if such filenames are piped to xargs. Other
implementations have added other ways to get around this problem,
notably a -print0 primary that wrote filenames with a null byte
terminator. This was considered here, but not adopted. Using a null
terminator meant that any utility that was going to process find’s
-print0 output had to add a new option to parse the null terminators
it would now be reading.

That “notably a -print0 primary” refers to GNU find and xargs, which solve the problem in a different way. It is also supported by FreeBSD find and xargs. If you add the -0 option (see the manual page) to the xargs call, that program reads input items terminated by null bytes instead of splitting on whitespace. In turn, xargs calls exec-functions to do its work. The main distinction between the -print0 and -0 feature versus the + feature is that the former passes the filenames over a pipe, while the latter does not. Developers find uses for almost any feature; the pipes are no exception.
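The difference between piping names to plain xargs and to xargs -0 can be sketched as follows. The scratch directory and file names are assumptions for illustration, and both -print0 and -0 are the extensions discussed above.

```shell
# Hypothetical scratch directory with the question's file names.
tmp=$(mktemp -d)
mkdir "$tmp/dir1"
touch "$tmp/dir1/file 1" "$tmp/dir1/file 2" "$tmp/dir1/file 3"

# Newline-separated names through plain xargs: xargs also splits on blanks,
# so "dir1/file 1" arrives as two separate arguments.
plain=$(cd "$tmp" && find dir1 -mindepth 1 -print | xargs printf '[%s]\n' | wc -l)

# Null-terminated names through xargs -0: only the null byte ends a name.
nul=$(cd "$tmp" && find dir1 -mindepth 1 -print0 | xargs -0 printf '[%s]\n' | wc -l)

echo "plain xargs: $((plain)) arguments, xargs -0: $((nul)) arguments"
rm -rf "$tmp"
```

The first pipeline hands six fragments to printf, the second hands it the three intact names.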

Back to OP’s example, which uses a -t option to cp: that is not found in POSIX cp. Rather, it is an extension (aka “nonstandard feature”) provided by GNU cp. The -0 extension of xargs would not improve this example, but there are other cases where it can be used effectively—keeping in mind that there is the portable alternative +, which GNU find accepts.
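Where cp -t is not available, one common workaround is to let find start a small inline sh script that puts the destination last. This is a sketch rather than part of the original answer: the scratch directories and the dest variable are made up for the example, and -type f replaces -mindepth 1 to stay within POSIX find.

```shell
# Hypothetical scratch directories with the question's file names.
tmp=$(mktemp -d)
mkdir "$tmp/dir1" "$tmp/dir2"
touch "$tmp/dir1/file 1" "$tmp/dir1/file 2" "$tmp/dir1/file 3"

# Inside the inline script, $0 is the literal "sh", $1 is the destination,
# and the remaining parameters are the file names appended by find.
# Quoting "$@" keeps every name as a single word.
find "$tmp/dir1" -type f -exec sh -c 'dest=$1; shift; cp "$@" "$dest"' sh "$tmp/dir2" {} +

copied=$(ls "$tmp/dir2" | wc -l)
echo "files copied: $((copied))"
rm -rf "$tmp"
```

The file names still travel from find to sh as exec arguments; the shell only sees them as already separated positional parameters.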

Method 3

(This should be a comment but it’s too large.)

For those who like to try things out:

Create a script that lists the positional parameters passed to it, call it list_positional_parameters.sh, and make it executable (chmod +x list_positional_parameters.sh):

#!/bin/bash

# http://tldp.org/LDP/Bash-Beginners-Guide/html/sect_09_07.html
# Try globbing patterns, e.g. "X[[:digit:]][[:digit:]]" to see what happens

if [ $# -lt 1 ]; then
   echo "Usage: $0 and then at least one parameter"
   exit 1
fi

counter=1

while (($#)); do
   echo "$counter = '$1'"
   # pop positional argument 1 off the stack of positional arguments
   shift
   (( counter++ ))
done

Run find with it on some directory $dir:

find "$dir" -exec ./list_positional_parameters.sh '{}' ';' | less

As expected, there is only a single parameter in all calls, the filename, whether there are spaces in its name or not.
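The same experiment also shows how ; and + differ in the number of invocations, not in how names are passed. A self-contained sketch, using an inline sh -c in place of the script above and a hypothetical scratch directory:

```shell
# Hypothetical scratch directory with three names containing spaces.
tmp=$(mktemp -d)
touch "$tmp/file 1" "$tmp/file 2" "$tmp/file 3"

# ';' runs the command once per file: three lines, each reporting 1.
per_file=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh {} ';')

# '+' batches the names: a single line reporting 3 for this small directory.
batched=$(find "$tmp" -type f -exec sh -c 'echo "$#"' sh {} +)

printf 'with ;\n%s\nwith +\n%s\n' "$per_file" "$batched"
rm -rf "$tmp"
```

In both forms every file name counts as exactly one parameter.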


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
