How to time grep commands accurately?

I want to compare the speed of these two commands:

grep pattern1 files* 
grep pattern2 files*

Unfortunately, the first grep reads much of files* into memory
buffers, so the second grep runs very quickly, but for the wrong
reason.

How do I tell Linux (Fedora 11), “please stop caching disk reads
because I’m testing something”?

Answers:


Method 1

I don’t think you can easily tell the kernel to temporarily stop caching. What you can do is tell the system to drop the cache before each run:

As root:

sync; echo 3 > /proc/sys/vm/drop_caches

(This is documented in the kernel docs at Documentation/sysctl/vm.txt, which is handy if, like some of us, you can’t always remember offhand what the values 1, 2, and 3 do: 1 frees the page cache, 2 frees dentries and inodes, and 3 frees both.)
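To avoid retyping the drop step before every measurement, it can be wrapped together with the timed command. The following is a minimal bash sketch, not a definitive tool: the function name cold_time and the DROP_CACHES override variable are made up for illustration, and the function only has the intended effect when run as root against the real /proc file.

```shell
#!/bin/bash
# Sketch: time a command against a cold cache. Run as root.
# DROP_CACHES is a hypothetical override; it defaults to the real kernel file.
DROP_CACHES=${DROP_CACHES:-/proc/sys/vm/drop_caches}

cold_time() {
    sync                       # write out dirty pages so they can be freed
    echo 3 > "$DROP_CACHES"    # 3 = free page cache plus dentries and inodes
    time "$@"                  # bash keyword; prints real/user/sys to stderr
}

# Usage (as root), with the placeholder patterns and files from the question:
#   cold_time grep pattern1 files*
#   cold_time grep pattern2 files*
```

Because the cache is dropped immediately before each run, both grep commands start from roughly the same cold state, so the order in which they run should matter much less.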

Or alternately, of course, prime the cache and compare the cached performance. (I think both are useful numbers.)

Method 2

When timing things like this, I usually run the command once first to prime the cache, then run it again under time. For a comparison like this, you should be more concerned with CPU and elapsed times and less with I/O time.
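The warm-cache approach can be sketched the same way. This is only an illustration in bash: the function name warm_time and the choice to repeat the timed run several times are assumptions added here, not part of the answer.

```shell
#!/bin/bash
# Sketch: one untimed priming run, then N timed runs from a warm cache.
# warm_time is a hypothetical helper name.
warm_time() {
    local runs=$1 i
    shift
    "$@" > /dev/null 2>&1        # priming run: pulls the files into the cache
    for ((i = 0; i < runs; i++)); do
        time "$@" > /dev/null    # timing report goes to stderr
    done
}

# Usage, with the placeholder patterns and files from the question:
#   warm_time 3 grep pattern1 files*
#   warm_time 3 grep pattern2 files*
```

Repeating the timed run makes it easier to see whether the user and sys figures are stable. If they vary widely between runs, the files may not fit in the cache.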

In any case it is difficult to get fully accurate timings. If the input files exceed the size of memory available for buffers, then you will likely end up cycling all the files through buffer cache. Otherwise, you may just access all the data from buffer cache. In real life, there is often a mix of buffered data and data read from disk.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
