I have a file with the following content:
zdk
aaa
b12
cdn
dke
kdn
Input 1: aaa and cdn
Output 1:
aaa
b12
cdn
Input 2: zdk and dke
Output 2:
zdk
aaa
b12
cdn
dke
I could use the commands below to achieve this:
grep -a aaa -A2 file # Output 1
grep -a zdk -A4 file # Output 2
But I don't know the exact position of the end pattern in the file, so I can't hard-code the number of lines to print (the file has 20000 rows).
How can I achieve this?
Answers:
Method 1
grep won’t help you here. This is a job better accomplished with sed using range expressions:
$ sed -n '/aaa/,/cdn/p' file
aaa
b12
cdn
$ sed -n '/zdk/,/dke/p' file
zdk
aaa
b12
cdn
dke
sed -n suppresses the automatic printing, so lines are printed only when explicitly requested. Here the p command does that for every line inside the range /aaa/,/cdn/, that is, from a line matching aaa through the next line matching cdn.
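To see the range behaviour concretely, here is a small sketch that recreates the sample file from the question. The temporary file and the pattern nomatch are assumptions made purely for illustration.

```shell
# Recreate the sample file from the question, one word per line.
# The temporary file is an assumption for this demo.
sample=$(mktemp)
printf '%s\n' zdk aaa b12 cdn dke kdn > "$sample"

# The range opens on the first line matching /aaa/ and closes on the
# next line matching /cdn/; p prints each line inside it.
sed -n '/aaa/,/cdn/p' "$sample"

# If the end pattern never matches ("nomatch" is a made-up pattern),
# the range stays open and sed prints through the end of the file.
sed -n '/aaa/,/nomatch/p' "$sample"
```

The second command is worth keeping in mind for a large file: a missing end marker does not produce an error, it just prints everything after the start line.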
These range expressions are also available in awk, where you can say:
awk '/zdk/,/dke/' file
Of course, all these conditions can be tightened with a stricter regex, such as sed -n '/^aaa$/,/^cdn$/p' file, to check that the lines consist of exactly aaa and cdn and nothing else.
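If the start and end markers arrive as variables, one possible sketch is to wrap the awk approach in a small function that compares whole lines as plain strings, so no regex escaping is needed. The name print_between and the variables first and last are hypothetical and not part of the original answer. Note that awk -v interprets backslash escapes in the value, so markers containing backslashes would need extra care.

```shell
# print_between START END FILE
# Hypothetical helper: prints every run of lines that starts at a line
# equal to START and ends at the next line equal to END, inclusive.
print_between() {
    awk -v first="$1" -v last="$2" '
        # Appending "" forces a string comparison, not a numeric one.
        ($0 "") == first { inside = 1 }   # open the range
        inside           { print }        # print while inside
        ($0 "") == last  { inside = 0 }   # close after the end line
    ' "$3"
}

# Sample file from the question (temporary file assumed for the demo).
sample=$(mktemp)
printf '%s\n' zdk aaa b12 cdn dke kdn > "$sample"

print_between aaa cdn "$sample"
print_between zdk dke "$sample"
```

Unlike the regex range, this only reacts to lines that equal the marker exactly, which behaves like the anchored /^aaa$/ form.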
Method 2
This can be done with sed in a single pass, using the w command to write each range to its own output file:
sed -n '
/^aaa$/,/^cdn$/w output1
/^zdk$/,/^dke$/w output2
' file
Method 3
Here is a grep command:
grep -o "aaa.*cdn" <(paste -sd_ file) | tr '_' '\n'
You can achieve a multiline match in grep, but that requires Perl-compatible regular expressions (-P), which are not supported on every platform (OS X, for example). As a workaround, paste joins all lines with a _ character, grep -o extracts the match, and tr turns the _ characters back into newlines. This assumes the file itself contains no _ characters.
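The same workaround can be sketched as a plain pipeline, which avoids the <( ) process substitution (a bash feature) and runs on the sample file from the question. The temporary file and the second input with two cdn lines are assumptions added for illustration.

```shell
# Sample file from the question (temporary file assumed for the demo).
sample=$(mktemp)
printf '%s\n' zdk aaa b12 cdn dke kdn > "$sample"

# Join lines with "_", extract the match, then restore the newlines.
paste -sd_ "$sample" | grep -o 'aaa.*cdn' | tr '_' '\n'

# Caveat: .* is greedy, so with a second "cdn" later in the input the
# match runs to the last one, unlike a sed range, which stops at the
# first. This made-up input prints aaa, b12, cdn, xyz, cdn.
printf '%s\n' aaa b12 cdn xyz cdn end | paste -sd_ - | grep -o 'aaa.*cdn' | tr '_' '\n'
```

Because of the greedy match, this approach fits best when the end pattern is known to occur only once after the start pattern.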
Alternatively you can use pcregrep which supports multi-line patterns (-M).
Or use ex:
ex +"/aaa/,/cdn/p" -scq! file
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.