I have the md5sum of a file and I don't know where it is on my system. Is there an easy find option to identify a file based on its MD5, or do I need to write a small script?
I’m working on AIX 6 without the GNU tools.
Answers:
Method 1
Using find:
find /tmp/ -type f -exec md5sum {} + | grep '^file_md5sum_to_match'
If you are searching from /, you can exclude /proc and /sys, as in the find commands under Test Result below.
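The one-liner above can be wrapped in a small function that prunes /proc and /sys and prints only the matching paths. This is a minimal sketch, not a tested AIX solution: it assumes md5sum is available (as in the command above) and a find that supports -path, and the name find_by_md5 is made up for illustration.

```shell
# find_by_md5 ROOT HASH  (hypothetical helper name)
# Print the path of every regular file under ROOT whose MD5 equals HASH.
# The body runs in a subshell, so root and want do not leak to the caller.
find_by_md5() (
    root=$1
    want=$2
    # Prune /proc and /sys so find never descends into them,
    # then hash the remaining regular files in batches.
    find "$root" \( -path /proc -o -path /sys \) -prune -o -type f -exec md5sum {} + 2>/dev/null |
        # An md5sum line is 32 hex digits, two separator characters, then the
        # path, so the path starts at column 35.
        grep "^${want} " | cut -c35-
)
```

Called as find_by_md5 / 304a5fa2727ff9e6e101696a16cb0fc5, it prints one path per matching file and nothing when there is no match.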
I also did some testing: find takes more time but less CPU and RAM, whereas the Ruby script takes less time but more CPU and RAM.
Test Result
Find
[root@dc1 ~]# time find / -type f -not -path "/proc/*" -not -path "/sys/*" -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
304a5fa2727ff9e6e101696a16cb0fc5 /tmp/file1
real 6m20.113s
user 0m5.469s
sys 0m24.964s
Find with -prune
[root@dc1 ~]# time find / \( -path /proc -o -path /sys \) -prune -o -type f -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
304a5fa2727ff9e6e101696a16cb0fc5 /tmp/file1
real 6m45.539s
user 0m5.758s
sys 0m25.107s
Ruby Script
[root@dc1 ~]# time ruby findm.rb
File Found at: /tmp/file1
real 1m3.065s
user 0m2.231s
sys 0m20.706s
Method 2
Script Solution
#!/usr/bin/ruby -w
require 'find'
require 'digest/md5'
# MD5 sums to look for; each one is removed once a matching file is found.
file_md5sum_to_match = [ '304a5fa2727ff9e6e101696a16cb0fc5',
                         '0ce6742445e7f4eae3d32b35159af982' ]
Find.find('/') do |f|
  next if /(^\.|^\/proc|^\/sys)/.match(f) # skip
  next unless File.file?(f)
  begin
    md5sum = Digest::MD5.hexdigest(File.read(f))
  rescue
    puts "Error reading #{f} --- MD5 hash not computed."
    next # nothing to compare for this file
  end
  if file_md5sum_to_match.include?(md5sum)
    puts "File Found at: #{f}"
    file_md5sum_to_match.delete(md5sum)
  end
  file_md5sum_to_match.empty? && exit # if array empty then exit
end
Bash script solution that searches the most likely directories first, which works faster when the file is in one of them
#!/bin/bash
[[ -z $1 ]] && read -p "Enter MD5SUM to search file: " md5 || md5=$1
# Directories to try first, most likely locations first.
check_in=( '/home' '/opt' '/tmp' '/etc' '/var' '/usr' )
# Fallback: search everything else, pruning /proc, /sys and the directories above.
last_find_cmd="find / \( -path /proc -o -path /sys ${check_in[@]/\// -o -path /} \) -prune -o -type f -exec md5sum {} +"
last_element=$(( ${#check_in[@]} - 1 )) # index of the last directory
echo "Please wait... searching for file"
for d in "${!check_in[@]}"
do
    [[ $d == $last_element ]] && eval "$last_find_cmd" | grep "^${md5}" && exit
    find "${check_in[$d]}" -type f -exec md5sum {} + | grep "^${md5}" && exit
done
Test Result
[root@dc1 /]# time bash find.sh 304a5fa2727ff9e6e101696a16cb0fc5
Please wait... searching for file
304a5fa2727ff9e6e101696a16cb0fc5 /var/log/file1
real 0m21.067s
user 0m1.947s
sys 0m2.594s
Method 3
If you decide to install GNU find anyway (and since you indicated interest in one of your comments), you can try something like:
find / -type f \( -exec checkmd5 {} YOURMD5SUM \; -o -quit \)
and have checkmd5 compare the md5sum of the file it gets as its first argument with the second argument, print the name if they match, and exit with 1 on a match (0 otherwise). Because -exec is false only when checkmd5 exits non-zero, the -quit after -o runs on the first match and stops find.
checkmd5 (not tested):
#!/bin/bash
md=$(md5sum "$1" | cut -d' ' -f1)
if [ "$md" = "$2" ] ; then
    echo "$1"
    exit 1
fi
exit 0
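The same short-circuit can be tried without putting a separate checkmd5 script on the PATH, by passing the check inline through sh -c. This is only a sketch under the same assumptions as above (a find that supports -quit, and md5sum); find_first_md5 is a made-up name.

```shell
# find_first_md5 ROOT HASH  (hypothetical helper name)
# Print the first regular file under ROOT whose MD5 equals HASH, then stop.
find_first_md5() (
    root=$1
    want=$2
    # The inline script exits 1 on a match, which makes -exec false so the
    # -quit after -o runs; it exits 0 otherwise, so find keeps going.
    find "$root" -type f \( -exec sh -c '
        sum=$(md5sum < "$1" | cut -d" " -f1)
        if [ "$sum" = "$2" ]; then
            printf "%s\n" "$1"
            exit 1
        fi
        exit 0' sh {} "$want" \; -o -quit \)
)
```

This starts one sh per file, so it is slower per file than the batched -exec md5sum {} + form; the gain is that the search stops at the first hit.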
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.