Find a file when you know its checksum?

I have the md5sum of a file and I don’t know where it is on my system. Is there any easy option of find to identify a file based on its md5? Or do I need to develop a small script ?

I’m working on AIX 6 without the GNU tools.

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Using find:

find /tmp/ -type f -exec md5sum {} + | grep '^file_md5sum_to_match'

If you searching through / then you can exclude /proc and /sys see following find command example :

Also I had done some testing, find take more time and less CPU and RAM where ruby script is taking less time but more CPU and RAM

Test Result

Find

[<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="9fedf0f0ebdffbfcae">[email protected]</a> ~]# time find / -type f -not -path "/proc/*" -not -path "/sys/*" -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
304a5fa2727ff9e6e101696a16cb0fc5  /tmp/file1


real    6m20.113s
user    0m5.469s
sys     0m24.964s

Find with -prune

[<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="9ae8f5f5eedafef9ab">[email protected]</a> ~]# time find / ( -path /proc -o -path /sys ) -prune -o -type f -exec md5sum {} + | grep '^304a5fa2727ff9e6e101696a16cb0fc5'
304a5fa2727ff9e6e101696a16cb0fc5  /tmp/file1

real    6m45.539s
user    0m5.758s
sys     0m25.107s

Ruby Script

[<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="9be9f4f4efdbfff8aa">[email protected]</a> ~]# time ruby findm.rb
File Found at: /tmp/file1

real    1m3.065s
user    0m2.231s
sys     0m20.706s

Method 2

Script Solution

#!/usr/bin/ruby -w

require 'find'
require 'digest/md5'

file_md5sum_to_match = [ '304a5fa2727ff9e6e101696a16cb0fc5',
                         '0ce6742445e7f4eae3d32b35159af982' ]

Find.find('/') do |f|
  next if /(^.|^/proc|^/sys)/.match(f) # skip
  next unless File.file?(f)
  begin
        md5sum = Digest::MD5.hexdigest(File.read(f))
  rescue
        puts "Error reading #{f} --- MD5 hash not computed."
  end
  if file_md5sum_to_match.include?(md5sum)
       puts "File Found at: #{f}"
       file_md5sum_to_match.delete(md5sum)
  end
  file_md5sum_to_match.empty? && exit # if array empty then exit

end

Bash Script solution based on probability which works faster

#!/bin/bash
[[ -z $1 ]] && read -p "Enter MD5SUM to search file: " md5 || md5=$1

check_in=( '/home' '/opt' '/tmp' '/etc' '/var' '/usr'  )
last_find_cmd="find / \( -path /proc -o -path /sys ${check_in[@]///-o -path /} \) -prune -o -type f -exec md5sum {} +"
last_element=${#check_in}
echo "Please wait... searching for file"
for d in ${!check_in[@]}
do

        [[ $d == $last_element ]] && eval $last_find_cmd | grep "^${md5}" && exit

        find ${check_in[$d]} -type f -exec md5sum {} + | grep "^${md5}" && exit


done

Test Result

[<a href="https://getridbug.com/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="b8cad7d7ccf8dcdb89">[email protected]</a> /]# time bash find.sh 304a5fa2727ff9e6e101696a16cb0fc5
Please wait... searching for file
304a5fa2727ff9e6e101696a16cb0fc5  /var/log/file1

real    0m21.067s
user    0m1.947s
sys     0m2.594s

Method 3

If you decide to install gnu find anyway (and since you indicated interest in one of your comments), you can try something like:

find / -type f ( -exec checkmd5 {} YOURMD5SUM ; -o -quit )

and have checkmd5 compare the md5sum of the file it gets as argument compare to
the second argument and print the name if it matches and exit with 1 (instead of 0 otherwise). The -quit will have find stop once it is found.

checkmd5 (not tested):

#!/bin/bash

md=$(md5sum $1 |  cut -d' ' -f1)

if [ $md == $2 ] ; then
  echo $1
  exit 1
fi
exit 0


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x