In Linux and Unix systems there are two common search commands: locate and find.
What are the pros and cons of each? When does one have benefits over the other?
Answers:
Method 1
locate(1) has only one big advantage over find(1): speed.
find(1), though, has many advantages over locate(1):
- find(1) is primordial, going back to the very first version of AT&T Unix. You will even find it in cut-down embedded Linuxes via Busybox. It is all but universal.
  locate(1) is much younger than find(1). The earliest ancestor of locate(1) didn’t appear until 1983, and it wasn’t widely available as “locate” until 1994, when it was adopted into GNU findutils and into 4.4BSD.
- locate(1) is also nonstandard, thus it is not installed by default everywhere. Some POSIX type OSes don’t even offer it as an option, and where it is available, the implementation may be lacking features you want because there is no independent standard specifying the minimum feature set that must be available.
  There is a de facto standard, being BSD locate(1), but that is only because the other two main flavors of locate implement all of its options: -0, -c, -d, -i, -l, -m, -s, and -S. mlocate implements 6 additional options not in BSD locate: -b, -e, -P, -q, --regex and -w. GNU locate implements those six plus another four: -A, -D, -E, and -p. (I’m ignoring aliases and minor differences like -? vs -h vs --help.)
  The BSDs and Mac OS X ship BSD locate.
  Most Linuxes ship GNU locate, but Red Hat Linuxes and Arch ship mlocate instead. Debian doesn’t install either in its base install, but offers both versions in its default package repositories; if both are installed at once, “locate” runs mlocate.
  Oracle has been shipping mlocate in Solaris since 11.2, released in December 2014. Prior to that, locate was not installed by default on Solaris. (Presumably, this was done to reduce Solaris’ command incompatibility with Oracle Linux, which is based on Red Hat Enterprise Linux, which also uses mlocate.)
  IBM AIX still doesn’t ship any version of locate, at least as of AIX 7.2, unless you install GNU findutils from the AIX Toolbox for Linux Applications.
  HP-UX also appears to lack locate in the base system.
  Older “real” Unixes generally did not include an implementation of locate.
- find(1) has a powerful expression syntax, with many functions, Boolean operators, etc.
- find(1) can select files by more than just name. It can select by:
  - age
  - size
  - owner
  - file type
  - timestamp
  - permissions
  - depth within the subtree…
- When finding files by name, you can search using file globbing syntax in all versions of find(1), or in GNU or BSD versions, using regular expressions.
  Current versions of locate(1) accept glob patterns as find does, but BSD locate doesn’t do regexes at all. If you’re like me and have to use a variety of machine types, you find yourself preferring grep filtering to developing a dependence on -r or --regex.
  locate needs strong filtering more than find does because…
- find(1) doesn’t necessarily search the entire filesystem. You typically point it at a subdirectory, a parent containing all the files you want it to operate on. The typical behavior for a locate(1) implementation is to spew up all files matching your pattern, leaving it to grep filtering and such to cut its eruption down to size.
  (Evil tip: locate / will probably get you a list of all files on the system!)
  There are variants of locate(1) like slocate(1) which restrict output based on user permissions, but this is not the default version of locate in any major operating system.
- find(1) can do things to files it finds, in addition to just finding them. The most powerful and widely supported such operator is -exec, but there are others. In recent GNU and BSD find implementations, for example, you have the -delete and -execdir operators.
- find(1) runs in real time, so its output is always up to date.
  Because locate(1) relies on a database updated hours or days in the past, its output can be outdated. (This is the stale cache problem.) This coin has two sides:
  - locate can name files that no longer exist.
    GNU locate and mlocate have the -e flag to make it check for file existence before printing out the name of each file it discovered in the past, but this eats away some of the locate speed advantage, and isn’t available in BSD locate besides.
  - locate will fail to name files that were created since the last database update.
  You learn to be somewhat distrustful of locate output, knowing it may be wrong.
  There are ways to solve this problem, but I am not aware of any implementation in widespread use. For example, there is rlocate, but it appears to not work against any modern Linux kernel.
- find(1) never has any more privilege than the user running it.
  Because locate provides a global service to all users on a system, it wants to have its updatedb process run as root so it can see the entire filesystem. This leads to a choice of security problems:
  - Run updatedb as root, but make its output file world-readable so locate can run without special privileges. This effectively exposes the names of all files in the system to all users. This may be enough of a security breach to cause a real problem.
    BSD locate is configured this way on Mac OS X and FreeBSD.
  - Write the database as readable only by root, and make locate setuid root so it can read the database. This means locate effectively has to reimplement the OS’s permission system so it doesn’t show you files you can’t normally see. It also increases the attack surface of your system, specifically risking a root escalation attack.
  - Create a special “locate” user or group to own the database file, and mark the locate binary as setuid/setgid for that user/group so it can read the database. This doesn’t prevent privilege escalation attacks by itself, but it greatly mitigates the damage one could cause.
    mlocate is configured this way on Red Hat Enterprise Linux.
    You still have a problem, though, because if you can use a debugger on locate or cause it to dump core you can get at privileged parts of the database.
  I don’t see a way to create a truly “secure” locate command, short of running it separately for each user on the system, which negates much of its advantage over find(1).
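The selection criteria and actions in the list above can be sketched against a small throwaway tree. This is only an illustration: the directory layout and file names are made up for the demo. The -type, -size, -mtime, -perm, -name and -exec primaries are all specified by POSIX; -maxdepth is assumed to be available, which holds for GNU and BSD find.

```shell
#!/bin/sh
# Build a throwaway tree (all names here are made up for the demo).
work=$(mktemp -d)
mkdir -p "$work/src/deep" "$work/logs"
: > "$work/src/main.c"
: > "$work/src/deep/util.c"
: > "$work/run.sh"
chmod 755 "$work/run.sh"
# A 64 KiB file, and a file whose timestamp is set to 1 January 2000.
dd if=/dev/zero of="$work/logs/big.log" bs=1024 count=64 2>/dev/null
touch -t 200001010000 "$work/logs/old.log"

# Size: regular files larger than 32768 bytes.
big=$(find "$work" -type f -size +32768c)
# Age: regular files last modified more than 365 days ago.
old=$(find "$work" -type f -mtime +365)
# Permissions: regular files with the owner-execute bit set.
execs=$(find "$work" -type f -perm -100)
# Depth: only files directly inside src, not in its subdirectories.
shallow=$(find "$work/src" -maxdepth 1 -type f)
# Boolean operators: name matches *.c OR *.sh (parentheses escaped for the shell).
code=$(find "$work" -type f \( -name '*.c' -o -name '*.sh' \) | grep -c .)
# Action: run a command on each match instead of just printing its path.
names=$(find "$work/src" -type f -name '*.c' -exec basename {} \; | sort | tr '\n' ' ')

printf 'big: %s\nold: %s\nexecutable: %s\nshallow: %s\ncode files: %s\nnames: %s\n' \
    "$big" "$old" "$execs" "$shallow" "$code" "$names"
rm -rf "$work"
```

Every command starts from "$work" or a subdirectory of it rather than from /, which is the focused search that locate has to approximate with grep filtering.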
Bottom line, both are very useful. locate(1) is better when you’re just trying to find a particular file by name, which you know exists, but you just don’t remember where it is exactly. find(1) is better when you have a focused area to examine, or when you need any of its many advantages.
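One way to combine the two tools is to try locate first and fall back to find. The sketch below is an assumption-laden illustration, not a standard recipe: find_by_name is a hypothetical helper, the [ -e ] test is a portable stand-in for the -e flag that BSD locate lacks, and the grep filter keeps any path that contains the directory name rather than strictly those beneath it. It also assumes the name contains no glob characters.

```shell
#!/bin/sh
# find_by_name NAME DIR: hypothetical helper, not a standard command.
# Prints paths whose last component is NAME, preferring the locate database.
find_by_name() {
    name=$1
    dir=$2
    hits=""
    if command -v locate >/dev/null 2>&1; then
        # Fast path: query the database, keep paths containing "DIR/",
        # and drop stale entries whose files no longer exist.
        hits=$(locate "*/$name" 2>/dev/null | grep -F -- "$dir/" |
            while IFS= read -r path; do
                if [ -e "$path" ]; then printf '%s\n' "$path"; fi
            done)
    fi
    if [ -z "$hits" ]; then
        # Slow path: locate is missing or knows nothing, so walk the tree.
        hits=$(find "$dir" -name "$name" 2>/dev/null)
    fi
    printf '%s\n' "$hits"
}

# Demo on a freshly created file, which no locate database can know about yet.
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
: > "$demo/a/b/needle.txt"
result=$(find_by_name needle.txt "$demo")
echo "$result"
rm -rf "$demo"
```

Because the demo file was created a moment ago, the result comes from the find fallback whether or not locate is installed.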
Method 2
locate uses a prebuilt database, which should be regularly updated, while find iterates over a filesystem to locate files.
Thus, locate is much faster than find, but it can be inaccurate if the database (which can be seen as a cache) has not been updated (see the updatedb command).
Also, find offers more granularity, since you can filter files by any of their attributes, while locate only matches a pattern against file names.
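The cache behaviour can be illustrated without locate at all. The following is a toy model only: a plain text file produced by find stands in for the database, and grep stands in for locate. Real locate databases use their own formats, and the file names here are made up.

```shell
#!/bin/sh
tree=$(mktemp -d)
index=$(mktemp)
: > "$tree/old.txt"

# Stand-in for updatedb: take a snapshot of every file name.
find "$tree" -type f > "$index"

# The filesystem changes after the snapshot was taken.
rm "$tree/old.txt"
: > "$tree/new.txt"

# Stand-in for locate: search the snapshot, never the disk.
cached_old=$(grep -c '/old\.txt$' "$index")
cached_new=$(grep -c '/new\.txt$' "$index")
# find walks the live tree, so it sees the current state.
live_old=$(find "$tree" -name old.txt | grep -c .)
live_new=$(find "$tree" -name new.txt | grep -c .)

echo "index: old.txt=$cached_old new.txt=$cached_new"
echo "live:  old.txt=$live_old new.txt=$live_new"
rm -rf "$tree" "$index"
```

The snapshot still lists the deleted old.txt and misses new.txt, while the live search reports the opposite. Searching the snapshot is a single pass over one file, which is where the speed comes from.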
Method 3
A novice or occasional Unix user cannot use find successfully without a careful reading of the man page. Historically, some versions of find didn’t even default to the -print action, adding to the user-hostility.
locate is less flexible, but far more intuitive to use in the common case.
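For that common case, a find command boils down to three parts: where to start, what to match, and what to do with each match. A minimal sketch, with made-up file names:

```shell
#!/bin/sh
# Anatomy: find WHERE WHAT ACTION
demo=$(mktemp -d)
mkdir "$demo/notes"
: > "$demo/notes/todo.txt"
: > "$demo/notes/todo.bak"

# WHERE:  "$demo", the directory to start from
# WHAT:   -name '*.txt', quoted so the shell passes the glob through to find
# ACTION: -print, which POSIX find assumes when no action is given
explicit=$(find "$demo" -name '*.txt' -print)
implicit=$(find "$demo" -name '*.txt')

echo "$explicit"
rm -rf "$demo"
```

Both commands print the same path. Spelling out -print only matters on the historical implementations mentioned above.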
Method 4
A slight drawback of locate is that it may not be indexing the area of the file system you are interested in. On Debian-based desktop systems, for example Linux Mint 17.2, the /etc/updatedb.conf file is configured to exclude certain areas from consideration, including /tmp, /var/spool, and /home/.ecryptfs.
Ignoring /home/.ecryptfs prevents file names in encrypted directories from being exposed to unauthorised users. However, if your home directory is encrypted with ecryptfs, it also means your home directory is not indexed, and locate will therefore never find anything in your home directory. This might make it largely useless for you (it does for me). In addition to not finding results, the updatedb process will periodically load your disk for no benefit, and might as well be disabled if you are the main or only user of the system.
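To see whether a directory falls under one of the excluded areas, you can inspect the configuration yourself. The sketch below assumes an mlocate-style file with a PRUNEPATHS="..." line holding space-separated directories. is_pruned is a hypothetical helper, and the demo writes a sample file instead of reading the real /etc/updatedb.conf. It looks only at the path list, not at any other exclusion setting the file may contain.

```shell
#!/bin/sh
# is_pruned CONF DIR: hypothetical helper. Succeeds if DIR is at or below
# one of the directories listed on the PRUNEPATHS line of CONF.
is_pruned() {
    conf=$1
    target=$2
    # Extract the text between the double quotes on the PRUNEPATHS line.
    pruned=$(sed -n 's/^[[:space:]]*PRUNEPATHS[[:space:]]*=[[:space:]]*"\(.*\)".*$/\1/p' "$conf")
    # Word splitting on $pruned is intentional: the paths are space-separated.
    for p in $pruned; do
        case "$target/" in
            "$p"/*) return 0 ;;
        esac
    done
    return 1
}

# Sample configuration for the demo; on a real system you would pass
# /etc/updatedb.conf instead, assuming it uses this format.
sample=$(mktemp)
printf 'PRUNEPATHS="/tmp /var/spool /home/.ecryptfs"\n' > "$sample"

for dir in /home/.ecryptfs/alice /var/spool/mail /srv/data; do
    if is_pruned "$sample" "$dir"; then
        echo "$dir: skipped by updatedb"
    else
        echo "$dir: not excluded by PRUNEPATHS"
    fi
done
rm -f "$sample"
```

If the directory you care about is reported as skipped, locate will never return results from it, and find is the tool to use there.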
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.