I want to recursively delete all files not accessed in a while in folder a, except all files in the subfolder b.
find a \( -name b -prune \) -o -type f -delete
However, I get an error message:
find: The -delete action automatically turns on -depth, but -prune does
nothing when -depth is in effect. If you want to carry on anyway,
just explicitly use the -depth option.
Adding -depth causes all files in b to be included, which must not happen.
Anyone know a safe way to make this work?
Answers:
Method 1
One way is to use -exec rm instead of -delete:
find a \( -name b -prune \) -o -type f -exec rm {} +
Alternatively, use -not -path instead of -prune:
find a -not -path "*/b*" -type f -delete
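To try the first command without risking real data, a throwaway tree helps. The sketch below assumes a shell with mktemp -d available; the file names (f1, sub/f2, keep.txt) are made up for illustration and are not from the question.

```shell
# Build a scratch tree: a/f1, a/sub/f2, a/b/keep.txt (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/a/sub" "$tmp/a/b"
touch "$tmp/a/f1" "$tmp/a/sub/f2" "$tmp/a/b/keep.txt"

# Prune anything named b, then remove every other regular file via rm.
(cd "$tmp" && find a \( -name b -prune \) -o -type f -exec rm {} +)

# List the regular files that survived.
remaining=$(cd "$tmp" && find a -type f)
echo "$remaining"
```

Only a/b/keep.txt should be left. The directories themselves stay in place, because the expression only ever hands regular files to rm.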
Explanation why -prune collides with -delete:
find complains when you try to use -delete with -prune because -delete implies -depth and -depth makes -prune ineffective.
observe the behaviour of find with and without -depth:
$ find foo/
foo/
foo/f1
foo/bar
foo/bar/b2
foo/bar/b1
foo/f2
There is no guarantee about the order in a single directory. But there is a guarantee that a directory is processed before its contents. Note foo/ before any foo/* and foo/bar before any foo/bar/*.
This can be reversed with -depth.
$ find foo/ -depth
foo/f2
foo/bar/b2
foo/bar/b1
foo/bar
foo/f1
foo/
Note that now all foo/* appear before foo/. Same with foo/bar.
more explanation:
-prune prevents find from descending into a directory. In other words, -prune skips the contents of the directory. In your case, -name b -prune means that when find reaches a directory named b, it skips that directory and all of its subdirectories.
-depth makes find process the contents of a directory before the directory itself. That means that by the time find gets to the directory entry b, its contents have already been processed. Thus -prune is ineffective with -depth in effect.
-delete implies -depth so that it can delete the contents first and then the empty directory. -delete refuses to delete non-empty directories.
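The effect can be seen without deleting anything by swapping -delete for -print. This is a minimal sketch on a scratch tree, again assuming mktemp -d is available and using made-up file names:

```shell
# Scratch tree: a/f1 and a/b/keep.txt (hypothetical names).
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b"
touch "$tmp/a/f1" "$tmp/a/b/keep.txt"

# Without -depth: b is pruned before find descends, so its contents are never seen.
without_depth=$(cd "$tmp" && find a -name b -prune -o -type f -print)

# With -depth: the contents of b are visited before b itself, so the prune comes too late.
with_depth=$(cd "$tmp" && find a -depth -name b -prune -o -type f -print | LC_ALL=C sort)

echo "without -depth:"; echo "$without_depth"
echo "with -depth:";    echo "$with_depth"
```

The first listing should contain only a/f1, while the second also contains a/b/keep.txt, which is exactly the file that -delete would then remove.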
Explanation of alternative method:
find a -not -path "*/b*" -type f -delete
This may or may not be easier to remember.
This command still descends into the directory b and processes every single file in it, only for -not to reject them. This can be a performance issue if the directory b is huge.
-path works differently than -name. -name only matches against the name (of the file or directory) while -path is matching against the entire path. For example observe the path /home/lesmana/foo/bar. -name bar will match because the name is bar. -path "*/foo*" will match because the string /foo is in the path. -path has some intricacies you should understand before using it. Read the man page of find for more details.
Beware that this is not 100% foolproof. There are chances of “false positives”. The way the command is written above, it will skip any file that has a parent directory whose name starts with b (positive). But it will also skip any file whose own name starts with b, regardless of its position in the tree (false positive). This can be fixed by writing a better expression than "*/b*". That is left as an exercise for the reader.
I assume that you used a and b as placeholders and the real names are more like allosaurus and brachiosaurus. If you put brachiosaurus in place of b then the amount of false positives will be drastically reduced.
At least the false positives will not be deleted, so it will not be as tragic. Furthermore, you can check for false positives by first running the command without -delete (but remember to add the implied -depth) and examining the output.
find a -not -path "*/b*" -type f -depth
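One possible answer to the exercise is to anchor the pattern on both sides with slashes, "*/b/*", so that it only matches paths passing through a directory named exactly b. The sketch below compares the two patterns as a dry run (-print stands in for -delete). The scratch tree and its file names are hypothetical, and mktemp -d is assumed to be available.

```shell
# Scratch tree with two deliberate false-positive candidates: a/bar/f2 and a/bnotes.txt.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/b" "$tmp/a/bar"
touch "$tmp/a/f1" "$tmp/a/b/keep.txt" "$tmp/a/bar/f2" "$tmp/a/bnotes.txt"

# Loose pattern from the answer: also spares a/bar/f2 and a/bnotes.txt.
loose=$(cd "$tmp" && find a -not -path "*/b*" -type f -print | LC_ALL=C sort)

# Tighter pattern: spares only files below a directory named exactly b.
tight=$(cd "$tmp" && find a -not -path "*/b/*" -type f -print | LC_ALL=C sort)

echo "loose pattern would delete:"; echo "$loose"
echo "tight pattern would delete:"; echo "$tight"
```

Note that the tighter pattern is not identical to -name b -prune: a regular file that is itself named b is skipped by the prune version but would be deleted here.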
Method 2
Just use rm instead of -delete:
find a -name b -prune -o -type f -exec rm -f {} +
Method 3
The above answers and explanations were very helpful.
I use the workarounds of “-exec rm {} +” or “-not -path … -delete”, but those can be much slower than “find … -delete”. I have seen “find … -delete” run 5x faster than “-exec rm {} +” on deep directories on an NFS filesystem.
The “-not -path” solution has the obvious overhead of looking at all the files in the excluded directories and below.
The “find .. -exec rm {} +” calls rm which does system calls:
fstatat(AT_FDCWD, path...); unlinkat(AT_FDCWD, path, 0)
The “find -delete” does system calls:
fd=open(dir,...); fchdir(fd); fstatat(AT_FDCWD, filename,...) unlinkat(dirfd, filename,...)
So the rm command run by “-exec rm {} +” does the full path-to-inode lookup twice per file, whereas “find -delete” does a stat and an unlink of the file name in the current directory. That is a big win when you are removing a lot of files in one directory.
(whine mode on (sorry))
It seems like the design of the interaction between -depth, -delete and -prune needlessly eliminates the most efficient way of doing the common action “delete files except those in -prune directories”
The combination of “-type f -delete” should be able to run without -depth since it is not trying to delete directories. Alternatively, if “find” had a “-deletefile” action that says do not delete directories, -depth would not need to be implied.
The xargs or find -exec calls to the rm command could be sped up if rm had an option to sort filenames, open directories, and do unlinkat(dir_fd,filename) instead of unlinking the full paths. It already does the unlinkat(dir_fd,filename) when recursing through directories with the -r option.
(whine mode off)
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.