I have thousands of files named 1.txt, 2.txt, and so on. Some of those files are missing. What would be the easiest way to find out which ones are missing?
Answers:
Method 1
ub=1000 # Replace this with the largest existing file's number.
seq "$ub" | while read -r i; do
    [[ -f "$i.txt" ]] || echo "$i.txt is missing"
done
You can easily find the proper value for ub by doing ls | sort -n or similar. This relies on the files being in the format output by seq, notably here without leading zeroes.
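If the names do have leading zeroes (say 0001.txt, 0002.txt, ...), the same loop can build each name with printf instead of relying on seq's default output. This is only a rough sketch: the function name missing_padded and its two arguments (upper bound and digit width) are made up for illustration and are not part of the answer above.

```shell
# Hypothetical helper: report missing files whose names are zero-padded
# numbers, e.g. 0001.txt. Both arguments are assumptions for illustration.
missing_padded() {
    ub=$1      # largest expected number
    width=$2   # number of digits in each name, e.g. 4 for 0001.txt
    i=1
    while [ "$i" -le "$ub" ]; do
        # Build the padded name, e.g. i=7, width=4 gives 0007.txt
        name=$(printf "%0${width}d.txt" "$i")
        [ -f "$name" ] || echo "$name is missing"
        i=$((i + 1))
    done
}
```

Calling missing_padded 1000 4 would then check 0001.txt through 1000.txt in the current directory.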
Method 2
$ ls
1.txt 3.txt
$ seq 1 10 | xargs -I {} ls {}.txt >/dev/null
ls: cannot access 2.txt: No such file or directory
ls: cannot access 4.txt: No such file or directory
ls: cannot access 5.txt: No such file or directory
ls: cannot access 6.txt: No such file or directory
ls: cannot access 7.txt: No such file or directory
ls: cannot access 8.txt: No such file or directory
ls: cannot access 9.txt: No such file or directory
ls: cannot access 10.txt: No such file or directory
$
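The messages above come from ls itself on stderr, so their wording depends on the ls version and locale. If you only want the bare file names on stdout, one possible variation keeps xargs but swaps ls for a small test. The function name missing_names and its first/last arguments are hypothetical, added here only to make the sketch reusable.

```shell
# Hypothetical helper: print the names of missing N.txt files for N in
# the range $1..$2, one per line, on stdout.
missing_names() {
    # xargs substitutes each number for {} so the inner sh receives
    # e.g. "2.txt" as $1 and prints it only if no such file exists.
    seq "$1" "$2" | xargs -I {} sh -c '[ -e "$1" ] || printf "%s\n" "$1"' sh {}.txt
}
```

With only 1.txt and 3.txt present, missing_names 1 5 should list 2.txt, 4.txt and 5.txt.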
Method 3
This is the function I will be using:
missing () {
    # ub gets the largest file number; this assumes the directory
    # contains only the numbered .txt files.
    ub=$(ls | sort -n | tail -n 1 | xargs basename -s .txt)
    seq "$ub" | while read -r i; do
        [[ -f "$i.txt" ]] || echo "$i.txt is missing"
    done
}
Method 4
Another (bash):
comm -23 <(printf '%d.txt\n' {1..1000} | sort) <(ls *.txt | sort)
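Brace expansion like {1..1000} cannot take a variable as its upper bound, and process substitution is not available in plain sh. A rough equivalent that accepts the bound as an argument might look like the sketch below. The function name missing_comm and the temporary file are my own additions, not part of the one-liner above.

```shell
# Hypothetical helper: report which of 1.txt .. N.txt are absent from the
# current directory by comparing two sorted lists with comm.
missing_comm() {
    ub=$1
    present=$(mktemp) || return 1
    # Names that actually exist, sorted the same way as the expected list.
    ls | sort > "$present"
    # Expected names on stdin; comm -23 keeps lines unique to the first input.
    seq "$ub" | sed 's/$/.txt/' | sort | comm -23 - "$present"
    rm -f "$present"
}
```

The output comes out in sort's lexical order (10.txt before 2.txt), so pipe it through sort -n if you want numeric order.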
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.