When to use grep, less, awk, sed

I’m entering the world of Linux, and at work I’m using grep more and more. In doing so, I’m finding that it’s sometimes not adequate for what I want.

I was struggling with grep a few days ago, and a colleague of mine, who is a senior Linux admin, told me to use awk. I was stunned by how fast I got a result.

So my question is when do you choose to use one over the other? What questions can I ask myself before going to work with grep and spending a lot of time, when I could have done it with awk and saved time?

Answers:

Method 1

For basic pattern matching, sed and awk can do everything grep does, but some things are easier to do with one tool than with the others.

grep foo can be written sed '/foo/!d' or awk /foo/, but consider:

grep -i foo would have to be sed '/[fF][oO][oO]/!d' unless you want to consider non-standard extensions like GNU’s sed '/foo/I!d'. Or with awk: awk 'tolower($0) ~ /foo/' or again using a GNU extension: awk -v IGNORECASE=1 /foo/.
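To see those equivalences side by side, here is a minimal sketch. The sample input and the function names (sample, ci_grep, ci_sed, ci_awk) are made up for illustration; only the standard forms mentioned above are used, not the GNU extensions.

```shell
#!/bin/sh
# Hypothetical sample input: two of the three lines contain "foo" in some case.
sample() {
  printf '%s\n' 'foo bar' 'Some FOO here' 'nothing to see'
}

# Case-sensitive match: each of these prints only "foo bar".
sample | grep foo
sample | sed '/foo/!d'
sample | awk '/foo/'

# Case-insensitive match using only standard features (illustrative helpers).
ci_grep() { grep -i foo; }
ci_sed()  { sed '/[fF][oO][oO]/!d'; }
ci_awk()  { awk 'tolower($0) ~ /foo/'; }

# Each of these prints "foo bar" and "Some FOO here".
sample | ci_grep
sample | ci_sed
sample | ci_awk
```

The sed version shows why grep -i is the more comfortable choice: the bracket expressions have to be spelled out for every letter of the pattern.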

Here is what each tool is good at, where the same task would be cumbersome with the others:

grep

grep is a simple tool but has very specialised modes of operation that are harder to reproduce with awk or sed:

  • grep -i for case insensitive matching (see above)
  • grep -Fe "$string" for fixed string search (export string; awk 'index($0, ENVIRON["string"])' with awk, no direct equivalent with sed).
  • (non standard) grep -r for recursive search
  • (non standard) grep -P/pcregrep for perl-like regexps (some sed implementations have perl-like regexp support, though not the most common ones)
  • (non standard) grep -o to return the matched portion (several lines of awk or sed to do the same)
  • (non standard) grep -A/B/C to return context around the match (again painful to do in a similar fashion with sed or awk)
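As a rough illustration of two items from that list, the sketch below compares fixed-string search in grep and awk, and shows what emulating grep -o takes in awk. The sample data and the helper names (sample, fixed_grep, fixed_awk, only_matching_awk) are assumptions for the example. The awk emulation is deliberately simplistic: it would loop forever on a pattern that can match the empty string and would mishandle anchors such as ^.

```shell
#!/bin/sh
# Hypothetical input containing a regexp metacharacter (the dot).
sample() {
  printf '%s\n' 'price: 1.50 USD' 'price: 1x50 USD' 'a.b.c'
}

# Fixed-string search with grep: the dot in the argument is matched literally.
fixed_grep() { grep -Fe "$1"; }

# The same with awk: pass the string through the environment, then use
# index() for a literal substring test.
fixed_awk() { string=$1 awk 'index($0, ENVIRON["string"])'; }

# Rough equivalent of the non-standard grep -o '[0-9][0-9]*': print every
# match on its own line, using match(), RSTART and RLENGTH.
only_matching_awk() {
  awk '{
    line = $0
    while (match(line, /[0-9]+/)) {
      print substr(line, RSTART, RLENGTH)
      line = substr(line, RSTART + RLENGTH)   # continue after this match
    }
  }'
}

sample | grep -e '1.50'      # regexp: "." matches any character, two lines print
sample | fixed_grep '1.50'   # literal: only the "1.50" line prints
sample | fixed_awk '1.50'    # same result as grep -F
sample | only_matching_awk   # prints 1, 50, 1, 50 on separate lines
```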

sed

  • s/foo/bar/: sed’s s command has features that are hard to implement in awk, such as:
      • s/foo\(.*\)bar/\1/g: capturing (though GNU awk has a gensub() extension for that)
      • s/foo/bar/3: replacing the 3rd occurrence on each line
  • (non-standard): in-place file editing (though it’s also supported by GNU awk now).
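The two s command features can be tried out with a short sketch like this one. The function names (between, third) and the input strings are invented for the example.

```shell
#!/bin/sh
# Capture group: keep only what sits between "foo" and "bar".
between() { sed 's/foo\(.*\)bar/\1/g'; }

# Numeric flag: replace only the 3rd occurrence of "foo" on each line.
third() { sed 's/foo/bar/3'; }

printf '%s\n' 'foo-middle-bar'  | between   # prints: -middle-
printf '%s\n' 'foo foo foo foo' | third     # prints: foo foo bar foo
```

Standard awk's sub() and gsub() offer neither back-references to groups nor a way to pick the Nth occurrence, which is why these are listed as sed strengths.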

awk

awk is the most feature-rich of the three.

  • good for dealing with numbers
  • good for dealing with input formatted in columns.
  • good for extracting and combining data from different sources, with its associative arrays.
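A small sketch of those three strengths together, assuming made-up data (a user name and a byte count per line) and hypothetical helper names (transfers, totals, join_depts). It also assumes mktemp is available, which is common but not guaranteed by POSIX.

```shell
#!/bin/sh
# Hypothetical data: user and bytes transferred, in whitespace-separated columns.
transfers() {
  printf '%s\n' 'alice 100' 'bob 250' 'alice 50' 'carol 10' 'bob 5'
}

# Numbers + columns + associative array: sum column 2 per user.
# The order of "for (u in sum)" is unspecified, so the result is sorted.
totals() {
  awk '{ sum[$1] += $2 } END { for (u in sum) print u, sum[u] }' | sort
}

# Combining two sources: $1 is a file of "user dept", $2 a file of "user total".
# While reading the first file (NR == FNR) only the lookup table is filled.
join_depts() {
  awk 'NR == FNR { dept[$1] = $2; next }
       { print $1, $2, (($1 in dept) ? dept[$1] : "unknown") }' "$1" "$2"
}

tmp=$(mktemp -d)
printf '%s\n' 'alice ops' 'bob dev' > "$tmp/depts"
transfers | totals > "$tmp/totals"
cat "$tmp/totals"                        # alice 150, bob 255, carol 10
join_depts "$tmp/depts" "$tmp/totals"    # adds ops, dev, unknown as a 3rd column
rm -rf "$tmp"
```

Note that the NR == FNR idiom misbehaves when the first file is empty, so treat this as a starting point and not a robust join.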

perl

perl, as a practical extraction and reporting tool, has the best of all of those. That’s what it was initially designed for (to be the tool that makes sed, awk and the like obsolete).

Mastering perl to do text processing does give a serious advantage. I’d recommend spending some time on it, even before looking at the less common sed commands for instance.

performance

As a rule of thumb, the more specialised the tool, the more efficient it is at its task. But that also depends very much on the implementation, the task and a few other factors, and performance can come with trade-offs that need to be taken into account.

For instance, some grep or sed implementations are very fast, but they don’t support multibyte characters, so they only work correctly on US-English text in multi-byte locales. Others are fast because they work on a small fixed-length buffer and thus can’t handle arbitrary input…


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
