I know I can find files using find: find . -type f -name 'sunrise'. Example result:
./sunrise ./events/sunrise ./astronomy/sunrise ./schedule/sunrise
I also know that I can determine the file type of a file: file sunrise. Example result:
sunrise: PEM RSA private key
But how can I find files by file type?
For example, my-find . -type f -name 'sunrise' -filetype=bash-script:
./astronomy/sunrise ./schedule/sunrise
Answers:
Method 1
“File types” on a Unix system are things like regular files, directories, named pipes, character special files, symbolic links etc. These are the types of file that find can filter on with its -type option.
The find utility cannot by itself distinguish between a “shell script”, a “JPEG image file” or any other type of regular file. These types of data may however be distinguished by the file utility, which looks at particular signatures within the files themselves to determine their type.
A common way to label the different types of data files is by their MIME type, and file is able to determine the MIME type of a file.
Using file with find to detect the MIME type of regular files, and using that to find only shell scripts:
find . -type f -exec sh -c '
    case $( file -bi "$1" ) in (*/x-shellscript*) exit 0; esac
    exit 1' sh {} \; -print
or, using bash,
find . -type f -exec bash -c '
    [[ "$( file -bi "$1" )" == */x-shellscript* ]]' bash {} \; -print
Add -name sunrise before the -exec if you wish to only detect scripts with that name.
The find command above will find all regular files in or below the current directory, and for each such file call a short in-line shell script. This script runs file -bi on the found file and exits with a zero exit status if the output of that command contains the string /x-shellscript. If the output does not contain that string, it exits with a non-zero exit status which causes find to continue immediately with the next file. If the file was found to be a shell script, the find command will proceed to output the file’s pathname (the -print at the end, which could also be replaced by some other action).
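Putting the -name test and the MIME check together gives something close to the my-find command asked for in the question. The following is a minimal sketch, not a definitive implementation: the function name find_scripts_named is made up for illustration, and it assumes a file utility that accepts -bi (on macOS, -bI would be needed instead).

```shell
#!/bin/sh
# find_scripts_named NAME [DIR]
# Hypothetical helper: print regular files called NAME under DIR
# (default: current directory) that file(1) reports as shell scripts.
find_scripts_named() {
    find "${2:-.}" -type f -name "$1" -exec sh -c '
        # Succeed only if the MIME type mentions x-shellscript.
        case $( file -bi "$1" ) in (*/x-shellscript*) exit 0; esac
        exit 1' sh {} \; -print
}
```

Run as find_scripts_named sunrise in the directory from the question, this would be expected to list only those sunrise files that file identifies as shell scripts.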
The file -bi command will output the MIME type of the file. For a shell script on Linux (and most other systems), this would be something like
text/x-shellscript; charset=us-ascii
while on systems with a slightly older variant of the file utility, it may be
application/x-shellscript
The common bit is the /x-shellscript substring.
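The substring test can be isolated in a small helper, which makes it easy to check against both output variants shown above. This is a minimal sketch; the function name is_shellscript_mime is hypothetical.

```shell
#!/bin/sh
# is_shellscript_mime MIME_STRING
# Hypothetical helper: succeed if the given MIME string (as printed by
# file -bi) contains the /x-shellscript substring, fail otherwise.
is_shellscript_mime() {
    case $1 in
        */x-shellscript*) return 0 ;;
        *)                return 1 ;;
    esac
}

# Both variants mentioned above are accepted:
is_shellscript_mime 'text/x-shellscript; charset=us-ascii' && echo "newer variant matches"
is_shellscript_mime 'application/x-shellscript' && echo "older variant matches"
```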
Note that on macOS, you would have to use file -bI instead of file -bi, because the -i option does something quite different there. The output on macOS is otherwise similar to that of a Linux system.
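One way to hide the difference between the two platforms is a small wrapper that picks the option by operating system. This is only a sketch: the name mime_of is made up, and it assumes that uname -s prints Darwin on macOS and that every other system accepts file -bi.

```shell
#!/bin/sh
# mime_of PATHNAME
# Hypothetical wrapper: print the MIME type of PATHNAME, choosing the
# option letter that file(1) expects on the current platform.
mime_of() {
    case $(uname -s) in
        Darwin) file -bI "$1" ;;   # macOS: -I asks for the MIME type
        *)      file -bi "$1" ;;   # assumed for Linux and other systems
    esac
}
```

Inside the find commands, the call file -bi "$1" could then be replaced by such a wrapper, provided the function is defined within the in-line script itself.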
Should you want to perform some custom action on each found shell script, you could do that with another -exec in place of the -print in the find commands above, but it would also be possible to do
find . -type f -exec sh -c '
    for pathname do
        case $( file -bi "$pathname" ) in
            */x-shellscript*) ;;
            *) continue
        esac
        # some code here that acts on "$pathname"
    done' sh {} +
or, with bash,
find . -type f -exec bash -c '
    for pathname do
        [[ "$( file -bi "$pathname" )" != */x-shellscript* ]] && continue
        # some code here that acts on "$pathname"
    done' bash {} +
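As a concrete illustration of the placeholder comment in the loops above, the sketch below prints each detected shell script together with its first line (typically the #! line). The function name list_script_interpreters and the choice of action are assumptions made for the example; on macOS, file -bI would be needed instead of file -bi.

```shell
#!/bin/sh
# list_script_interpreters [DIR]
# Hypothetical helper: for every shell script under DIR (default: .),
# print "pathname: first line of the file".
list_script_interpreters() {
    find "${1:-.}" -type f -exec sh -c '
        for pathname do
            case $( file -bi "$pathname" ) in
                */x-shellscript*) ;;
                *) continue
            esac
            # Example action: show the pathname next to its first line.
            printf "%s: %s\n" "$pathname" "$( head -n 1 "$pathname" )"
        done' sh {} +
}
```

Because the loop receives pathnames in batches through {} +, far fewer sh processes are started than with one -exec ... \; per file, although file itself still runs once per pathname.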
Method 2
You could exec file on every found file and then grep for the result you’re interested in.
# When looking for ASCII text
find . -type f -exec file {} \; | grep "ASCII"
# or for MS Word documents
find . -type f -exec file {} \; | grep "Microsoft Word"
I suggest making the search pattern as close as possible to the output you expect, to keep the number of false positives low.
Beware that files with newlines in their filenames may cause issues with this approach.
Method 3
A shorter form that doesn’t involve that much shell magic:
find . -exec sh -c 'file "$1" | grep -q "pattern"' sh {} \; -print
Method 4
I am not really sure if this is what you are looking for, but if selecting files by extension is enough, something like this may do:
find DIRECTORY -name "*.EXT" -type f -exec COMMAND {} \;
Where:
DIRECTORY is the directory to search for files in (and, optionally, operate on)
-name "*.EXT" selects by extension; the * wildcard matches anything, so this matches every file whose name ends in EXT
-type f restricts the match to regular files (not directories, for example)
-exec COMMAND runs COMMAND on every found file that has the EXT extension
For example, to calculate the SHA-512 sum of all files with the .odt extension inside the directory Docs/:
find Docs/ -name "*.odt" -type f -exec sha512sum {} \;
Method 5
Using perl’s File::LibMagic module:
perl -MFile::LibMagic=:easy -MFile::Find -le '
find sub {
print $File::Find::name if
$_ eq "sunrise" and
-f and
MagicFile($_) eq "PEM RSA private key"
}, @ARGV' -- .
File::LibMagic uses the same algorithms and bank of heuristics to guess a file’s type as file, but here since everything is done in one perl invocation, it’s much more efficient than having to run one file invocation for each file.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.