One program created lots of nested sub-folders. I tried to remove them all with rm -fr *, but it’s very slow. Is there a faster way to delete them all?
Answers:
Method 1
The fastest way to remove them from that directory is to move them out of there, then remove them in the background:
mkdir ../.tmp_to_remove
mv -- * ../.tmp_to_remove
rm -rf ../.tmp_to_remove &
This assumes that your current directory is not the toplevel of some mounted partition (i.e. that ../.tmp_to_remove is on the same filesystem).
The -- after mv (as edited in by Stéphane) is necessary if you have any file/directory names starting with a -.
The above removes the files from your current directory in a fraction of a second, as it doesn’t have to recursively handle the subdirectories. The actual removal of the tree from the filesystem takes longer, but since it is out of the way, its actual efficiency shouldn’t matter that much.
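Note that mv -- * skips hidden entries, and a fixed ../.tmp_to_remove name can collide with an existing directory. Here is a minimal sketch of the same move-then-delete idea that handles both. It assumes a find that supports -mindepth/-maxdepth and a mktemp that accepts a template (true for the GNU and BSD tools, but neither is in POSIX). The function name fast_empty_dir and the fed_ variable names are made up for illustration.

```shell
# fast_empty_dir DIR: empty DIR quickly by moving its entries aside,
# then delete them in the background. Hypothetical helper name.
fast_empty_dir() {
    # Resolve DIR to an absolute path so that "." works as an argument.
    fed_target=$(cd -- "$1" >/dev/null 2>&1 && pwd -P) || return 1
    # "/" has no sibling location to move things to.
    [ "$fed_target" != / ] || return 1
    fed_parent=$(dirname "$fed_target")
    # Unique scratch directory next to DIR, so it is normally on the same
    # filesystem and mv is a cheap rename rather than a copy.
    fed_trash=$(mktemp -d "$fed_parent/.tmp_to_remove.XXXXXX") || return 1
    # find also lists hidden entries; sh -c receives the scratch directory
    # as $0 and a batch of entries as "$@".
    find "$fed_target" -mindepth 1 -maxdepth 1 \
        -exec sh -c 'mv -- "$@" "$0"/' "$fed_trash" {} + || return 1
    # The slow recursive removal happens out of the way.
    rm -rf -- "$fed_trash" &
}
```

Calling fast_empty_dir . empties the current directory. As with the commands above, this only helps when the parent directory is on the same filesystem.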
Method 2
rsync is surprisingly fast and simple. You have to create an empty directory first:
mkdir emptydir
rsync -a --delete emptydir/ yourdirectory/
yourdirectory/ is the directory from where you want to remove the files.
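The two steps can be wrapped so that the empty directory is created and cleaned up automatically. This is only a sketch: the name empty_with_rsync is hypothetical, and it uses -r instead of -a so that the mode and timestamps of the scratch directory are not copied onto the target. It assumes the directory name does not start with a -.

```shell
# empty_with_rsync DIR: delete everything inside DIR by syncing an empty
# directory over it. Hypothetical helper name.
empty_with_rsync() {
    command -v rsync >/dev/null 2>&1 || { echo "rsync not found" >&2; return 127; }
    ewr_empty=$(mktemp -d) || return 1
    # The trailing slashes make rsync sync the directory contents;
    # --delete then removes everything in DIR that the empty source lacks.
    rsync -r --delete "$ewr_empty"/ "$1"/
    ewr_status=$?
    rmdir "$ewr_empty"
    return "$ewr_status"
}
```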
Method 3
One problem with rm -rf *, or its more correct equivalent rm -rf -- *, is that the shell first has to list all the (non-hidden) files in the current directory, sort them and pass them to rm. If the list of files in the current directory is big, that adds some unnecessary overhead, and it could even fail if the list of files is too big.
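One way to sidestep both the sort and the argument-size limit is to let find hand the entries to rm in batches, instead of having the shell expand *. A sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD do, POSIX does not require them) and a directory name that does not start with a -. The name rm_contents is hypothetical.

```shell
# rm_contents DIR: remove everything inside DIR, hidden entries included,
# without expanding a glob in the shell. Hypothetical helper name.
rm_contents() {
    # "{} +" makes find run rm with as many paths per invocation as fit
    # within the system's argument-size limit, starting another rm if needed.
    find "$1" -mindepth 1 -maxdepth 1 -exec rm -rf -- {} +
}
```

For example, rm_contents . empties the current directory. The limit itself can be inspected with getconf ARG_MAX.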
Normally, you’d do rm -rf . instead (which would also have the benefit of deleting hidden files). But most rm implementations, including all POSIX-conformant ones, will refuse to do that. The reason is that some shells (including all POSIX ones) have the misfeature that the expansion of the .* glob includes . and .., which would mean that rm -rf .* would delete the current and parent directories. So rm has been modified to work around that misfeature of those shells.
Some shells like pdksh (and other Forsyth shell derivatives), zsh or fish don’t have that misfeature. zsh has an rm builtin, which you can enable with zmodload zsh/files, and since zsh’s .* doesn’t include . or .., it works OK with rm -rf . (a single dot). So in zsh, you can do:
zmodload zsh/files
rm -rf .
On Linux, you can do:
rm -rf /proc/self/cwd/
to empty the current directory or:
rm -rf /dev/fd/3/ 3< some/dir
to empty an arbitrary directory.
(note the trailing /)
On GNU systems, you can do:
find . -delete
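The same approach can empty an arbitrary directory while keeping the directory itself. A small sketch, assuming a find with -mindepth and -delete (not in POSIX, but available in GNU and BSD find) and a directory name that does not start with a -. The name delete_tree_contents is hypothetical.

```shell
# delete_tree_contents DIR: empty DIR but keep DIR itself.
# Hypothetical helper name.
delete_tree_contents() {
    # -mindepth 1 excludes DIR itself; -delete implies -depth, so the
    # contents of each directory are removed before the directory.
    find "$1" -mindepth 1 -delete
}
```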
Now, if the current directory only has a few entries and the bulk of the files are in subdirs, that won’t make a significant difference and rm -rf -- * will probably be the fastest you can get. It’s expected for rm -rf (or anything that removes every file) to be expensive as it means reading the content of all directories and calling unlink() on every entry. unlink() itself can be quite expensive as it involves modifying the deleted file’s inode, the directory containing the file, and some file system map or other of what areas are free.
rm and find (at least the GNU implementations) already sort the list of files by inode number in each directory which can make a huge difference in terms of performance on ext4 file systems as it reduces the number of changes to the underlying block devices when consecutive (or close to each other) inodes are modified in sequence.
rsync sorts the files by name which could drastically reduce performance unless the by-name order happens to match the by-inum order (like when the files have been created from a sorted list of file names).
One reason why rsync may be faster in some cases is that, unlike rm and find, it doesn’t appear to take safety precautions against race conditions that could cause it to descend into the wrong directory if a directory is replaced with a symlink while it’s working.
To optimize a bit further:
If you know the maximum depth of your directory tree, you can pass it to find:
find . -maxdepth 3 -delete
That saves find having to try and read the content of the directories at depth 3.
Method 4
The fastest is with rm -rf dirname. I used a snapshotted mountpoint of an ext3 filesystem on RedHat6.4 with 140520 files and 9699 directories. If rm -rf * is slow, it might be because your top-level directory entry has lots of files, and the shell is busy expanding *, which requires an additional readdir and sort. Go up a directory and do rm -rf dirname/.
| Method | Real time | Sys time | Variance (+/-) |
|---|---|---|---|
| find dir -delete | 0m8.108s | 0m3.668s | 0.055s |
| rm -rf dir | 0m7.956s | 0m3.640s | 0.081s |
| rsync -delete empty/ dir/ | 0m8.305s | 0m3.918s | 0.029s |
Notes:
- rsync version : 3.0.6
- rm/coreutils version: 8.4-19
- find/findutils version: 4.4.2-6
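To try a comparison like this on your own filesystem, you need a disposable tree to delete. The helper below is a sketch for building one. It is not the setup used for the numbers above; the name make_tree and its parameters are made up for illustration.

```shell
# make_tree ROOT NDIRS NFILES: create NDIRS subdirectories under ROOT,
# each holding NFILES empty files. Hypothetical helper name.
make_tree() {
    mt_root=$1 mt_ndirs=$2 mt_nfiles=$3
    mt_i=0
    while [ "$mt_i" -lt "$mt_ndirs" ]; do
        mkdir -p "$mt_root/d$mt_i" || return 1
        mt_j=0
        while [ "$mt_j" -lt "$mt_nfiles" ]; do
            # ": >" creates an empty file without starting a process.
            : > "$mt_root/d$mt_i/f$mt_j"
            mt_j=$((mt_j + 1))
        done
        mt_i=$((mt_i + 1))
    done
}
```

For example, run make_tree /tmp/bench/dir 100 100 and then time rm -rf /tmp/bench/dir, rebuilding the tree before timing each method. Expect results to vary with the filesystem and with how much of the metadata is already cached.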
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.