If you were helping someone to learn the concept of pipes on the command line what example would you use? The example that actually came up was as follows:
cat whatever.txt | less
I feel like that’s not the best example, namely because there’s only one step. What’s a good, but fundamental, use of |?
Ideally the example I’ll present will use programs that have outputs themselves that can be run independently and then shown piped together.
Answers:
Method 1
I’m going to walk you through a somewhat complex example, based on a real life scenario.
Problem
Let’s say the command conky stopped responding on my desktop, and I want to kill it manually. I know a little bit of Unix, so I know that what I need to do is execute the command kill <PID>. In order to retrieve the PID, I can use ps or top or whatever tool my Unix distribution has given me. But how can I do this in one command?
Answer
$ ps aux | grep conky | grep -v grep | awk '{print $2}' | xargs kill
DISCLAIMER: This command only works in certain cases. Don’t copy and paste it into your terminal and start using it; it could kill processes unexpectedly. Instead, learn how to build it.
How it works
1- ps aux
This command will output the list of running processes and some info about them. The interesting info is that it’ll output the PID of each process in its 2nd column. Here’s an extract from the output of the command on my box:
$ ps aux
rahmu 1925 0.0 0.1 129328 6112 ? S 11:55 0:06 tint2
rahmu 1931 0.0 0.3 154992 12108 ? S 11:55 0:00 volumeicon
rahmu 1933 0.1 0.2 134716 9460 ? S 11:55 0:24 parcellite
rahmu 1940 0.0 0.0 30416 3008 ? S 11:55 0:10 xcompmgr -cC -t-5 -l-5 -r4.2 -o.55 -D6
rahmu 1941 0.0 0.2 160336 8928 ? Ss 11:55 0:00 xfce4-power-manager
rahmu 1943 0.0 0.0 32792 1964 ? S 11:55 0:00 /usr/lib/xfconf/xfconfd
rahmu 1945 0.0 0.0 17584 1292 ? S 11:55 0:00 /usr/lib/gamin/gam_server
rahmu 1946 0.0 0.5 203016 19552 ? S 11:55 0:00 python /usr/bin/system-config-printer-applet
rahmu 1947 0.0 0.3 171840 12872 ? S 11:55 0:00 nm-applet --sm-disable
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:38 conky -q
2- grep conky
I’m only interested in one process, so I use grep to find the entry corresponding to my program conky.
$ ps aux | grep conky
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:39 conky -q
rahmu 3233 0.0 0.0 7592 840 pts/1 S+ 16:55 0:00 grep conky
3- grep -v grep
As you can see in step 2, the command ps outputs the grep conky process in its list (it’s a running process after all). In order to filter it, I can run grep -v grep. The option -v tells grep to match all the lines excluding the ones containing the pattern.
$ ps aux | grep conky | grep -v grep
rahmu 1948 0.2 0.0 276000 3564 ? Sl 11:55 0:39 conky -q
NB: I would love to know a way to do steps 2 and 3 in a single grep call.
4- awk '{print $2}'
Now that I have isolated my target process, I want to retrieve its PID. In other words, I want to retrieve the 2nd word of the output. Lucky for me, most (all?) modern unices provide some version of awk, a scripting language that does wonders with tabular data. Our task becomes as easy as print $2.
$ ps aux | grep conky | grep -v grep | awk '{print $2}'
1948
5- xargs kill
I have the PID. All I need is to pass it to kill. To do this, I will use xargs.
xargs kill will read from the input (in our case from the pipe), form a command consisting of kill <items> (<items> are whatever it read from the input), and then execute the command created. In our case it will execute kill 1948. Mission accomplished.
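To see what xargs builds without risking a real process, a minimal sketch can swap kill for echo kill, so the constructed command is printed rather than run. The PIDs below are made up for illustration.

```shell
# Hypothetical PIDs fed on stdin; `echo kill` prints the command xargs would have run.
built=$(printf '%s\n' 1948 2001 | xargs echo kill)
echo "$built"   # prints: kill 1948 2001
```

Both items end up on one command line, which is why a pipe that yields several PIDs would kill several processes at once.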
Final words
Note that depending on what version of Unix you’re using, certain programs may behave a little differently (for example, ps might output the PID in column $3). If something seems wrong or different, read your vendor’s documentation (or better, the man pages). Also be careful, as long pipes can be dangerous. Don’t make any assumptions, especially when using commands like kill or rm. For example, if there were another user named ‘conky’ (or ‘Aconkyous’), my command might kill all of that user’s running processes too!
What I’m saying is be careful, especially for long pipes. It’s always better to build it interactively as we did here, than make assumptions and feel sorry later.
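The false-positive risk described above can be reproduced safely on canned text. This is only a sketch: the sample lines stand in for ps aux output, the user named conky is hypothetical, and it assumes the command name sits in column 11, as it does in the extract earlier in this answer.

```shell
# Canned stand-in for `ps aux` (hypothetical data; no real processes are touched).
sample_ps() {
cat <<'EOF'
rahmu     1948  0.2  0.0 276000  3564 ?        Sl   11:55   0:39 conky -q
conky     2100  0.0  0.0  21000  3000 pts/2    Ss   12:01   0:00 bash
rahmu     3233  0.0  0.0   7592   840 pts/1    S+   16:55   0:00 grep conky
EOF
}

# Matching anywhere on the line also catches the user called "conky".
naive=$(sample_ps | grep conky | grep -v grep | awk '{print $2}' | xargs echo)

# Comparing only the command column (field 11 here) keeps just the real target.
strict=$(sample_ps | awk '$11 == "conky" {print $2}' | xargs echo)

echo "naive:  $naive"    # naive:  1948 2100
echo "strict: $strict"   # strict: 1948
```

Printing the PIDs with echo before handing them to kill is a cheap way to check each stage of the pipe.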
Method 2
My favourite is this one:
youtube-dl "$1" -q -o - | ffmpeg -i - "$2"
This downloads a video from the YouTube URL passed as $1 and writes the result to the file named by $2. Note how the video is written quietly (-q) to STDOUT (-o -), piped to ffmpeg, and used as input there via -i -.
Especially for Linux newbies, this might be a practical example of why the command line can be useful and make things easier than using GUI tools. I am not sure how long it would take to download a video from YouTube and convert its sound to an mp3 with GUI tools. The above line can do that in a few seconds.
Method 3
The general use (read: the way I use it most of the time) is when, for some reason, I have to run some data through several tools to carry out different processing tasks.
So I’d say the use of pipes is as glue to assemble several building blocks (the different UNIX tools) together. Like Ulrich said, sort and uniq is a common stanza.
Depending on the audience, if you want to highlight this usage of pipes, you could, for example, start with: “Hey, this syllabus has links to several interesting PDFs with papers and lecture notes, but some of them are repeated. Can I somehow automate this?”
Then you could show how lynx --dump --listonly gets the list of links, how grep could filter for links ending in .pdf, how colrm or sed could get rid of the numbers lynx writes to the left of each URL, how sort and uniq could get rid of duplicates, and finally how wget -i - can be used to retrieve the files (using --wait to be gentle on the server, of course).
I’m afraid this is a complex example. On the other hand, it may help show the power of pipes when you just pipe it all together and have the shell run it at once.
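The middle of that pipeline can be tried offline. In this sketch the numbered list is canned text assumed to resemble what lynx --dump --listonly prints, the example.org URLs are made up, and the final wget -i - step is left out so nothing touches the network.

```shell
# Hypothetical link list in the numbered style lynx is assumed to produce.
links() {
cat <<'EOF'
   1. https://example.org/notes/lecture1.pdf
   2. https://example.org/index.html
   3. https://example.org/papers/paper1.pdf
   4. https://example.org/notes/lecture1.pdf
EOF
}

# Keep only PDFs, strip the leading "N. " numbering, then drop duplicates.
pdfs=$(links | grep '\.pdf$' | sed 's/^ *[0-9]*\. *//' | sort | uniq)
echo "$pdfs"
```

Each stage can be run on its own first (links, then links | grep, and so on), which is the point of showing it step by step.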
Method 4
I don’t exactly know about good, but piping through grep has to be one of the most common uses, possibly followed by wc -l. (Yes, grep has the little-known -c switch.)
Another common stanza is | sort | uniq, if only because uniq requires its input to be sorted.
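Both points are easy to check on a tiny made-up word list. The sketch below compares uniq with and without a preceding sort, and grep | wc -l against grep -c.

```shell
# Hypothetical input with a repeated but non-adjacent word.
words=$(printf '%s\n' pear apple pear fig)

# uniq only collapses adjacent duplicates, so unsorted input keeps both "pear" lines.
unsorted=$(printf '%s\n' "$words" | uniq | wc -l | tr -d ' ')
sorted=$(printf '%s\n' "$words" | sort | uniq | wc -l | tr -d ' ')

# Counting matches: a pipe into wc -l and grep's own -c give the same number.
via_wc=$(printf '%s\n' "$words" | grep pear | wc -l | tr -d ' ')
via_c=$(printf '%s\n' "$words" | grep -c pear)

echo "uniq alone: $unsorted, sort | uniq: $sorted"   # uniq alone: 4, sort | uniq: 3
echo "grep | wc -l: $via_wc, grep -c: $via_c"        # grep | wc -l: 2, grep -c: 2
```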
Method 5
This is the first thing that came to mind for me…
mysqldump is a console application that sends data, schema and optionally procedures and functions to stdout. Usually it gets redirected to a file for backup.
mysqldump <options> > mydb.dump
That would give you an uncompressed sql script. To save space, you could compress it with bzip2.
bzip2 mydb.dump
Alternatively, you could do both in one step:
mysqldump <options> | bzip2 > mydb.dump.bz2
In this example above, the stdout from mysqldump is piped to bzip2 which then has its output redirected to a file.
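The same shape can be tried without a database. In this sketch fake_dump is a hypothetical stand-in for mysqldump, and gzip is swapped in for bzip2 only because it is more commonly installed; bzip2 reads stdin and writes stdout in the same way.

```shell
# Hypothetical stand-in for mysqldump: it just writes a little SQL to stdout.
fake_dump() {
    printf '%s\n' 'CREATE TABLE t (id INT);' 'INSERT INTO t VALUES (1);'
}

out=$(mktemp)                  # temporary file for the compressed dump
fake_dump | gzip > "$out"      # producer | compressor > file
restored=$(gzip -dc "$out")    # decompress again to check the round trip
rm -f "$out"

echo "$restored"
```

The uncompressed dump never lands on disk, which is the saving the one-step version gives you.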
Method 6
Not that you need it for this example, but:
$ ps aux | grep -v grep | grep conky
…reversing the order of the greps preserves the colorization, but is MUCH less efficient. Presumably on large lists, color wouldn’t matter.
Also, this webpage suggests:
https://stackoverflow.com/questions/9375711/more-elegant-ps-aux-grep-v-grep
> Johnsyweb answered Feb 21 '12 at 10:31
> The usual trick is this:
> ps aux | grep '[t]erminal'
> This will match lines containing terminal, which grep '[t]erminal' does not!
> It also works on many flavours of Unix.
…but that will not work if you are seeking a single letter (like process ‘X’).
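Both the trick and its single-letter limitation can be checked on canned lines. This is a sketch with made-up ps-style data, including the grep processes as they would appear in a live listing.

```shell
# Hypothetical `ps aux`-style lines, including the grep commands themselves.
sample_ps() {
cat <<'EOF'
rahmu     1948  0.2  0.0 276000  3564 ?        Sl   11:55   0:39 conky -q
rahmu     3233  0.0  0.0   7592   840 pts/1    S+   16:55   0:00 grep [c]onky
root       900  0.5  1.0 300000 40000 tty7     Ss+  11:50   1:02 X :0
rahmu     3240  0.0  0.0   7592   840 pts/1    S+   16:56   0:00 grep [X]
EOF
}

# "[c]onky" matches the text "conky" but not the literal text "[c]onky".
conky_hits=$(sample_ps | grep -c '[c]onky')

# "[X]" matches any line containing X, and the literal "[X]" contains one.
x_hits=$(sample_ps | grep -c '[X]')

echo "conky: $conky_hits, X: $x_hits"   # conky: 1, X: 2
```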
Method 7
I finally get to share this mess of a oneliner I made about a year and a half ago…
while read in; do host "$in"; done < sites.txt | grep -iv "GOOGLE" | grep -E '1.2.3.4|5.6.7.8' | sed -e 's/has address 216.70.91.72//' | sed -e 's/has address 94.23.33.92//' | while read sites; do curl -sL -w "%{http_code} %{url_effective}\n" "$sites" -o /dev/null; done | grep -ivE '4.*|5.*' | sed -e 's/200//' | sed -e 's/HTTP/http/'
It…
- Reads sites.txt
- Runs “host” on each (in hindsight, dig +short would’ve made this a ton easier)
- Removes the lines that contain “GOOGLE” – these are the mx records
- Gets the lines that have one of two IPs
- Gets the http status code from each site in the list
- Removes the sites that return 4xx or 5xx
- Strips the “200” from sites that returned that
- Replaces “HTTP” with “http” – purely aesthetic, no real reason.
This could’ve been done a lot better with a single Python script, I bet.
Method 8
Here’s an example I use in my job of multiple pipes in one command. This uses gawk to search the MySQL general query log ($OFILE) and find any denied logins. It then sorts that list by name, pipes that list to uniq, which counts the occurrences, and then pipes it to sort one last time to sort the counted list numerically …
gawk '{ for (x=1;x<=NF;x++) if ( $x~"Access" && $(x+4)~".*@.*") print $(x+4)}' $OFILE | sort | uniq -c | sort -n
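The counting tail of that command can be tried on its own. In this sketch the user@host values are made up and stand in for whatever gawk extracts from the log.

```shell
# Hypothetical user@host values standing in for the gawk output.
denied() {
    printf '%s\n' 'bob@host' 'alice@host' 'bob@host' 'bob@host' 'alice@host' 'carol@host'
}

# sort groups identical names, uniq -c counts each group, sort -n orders by count.
report=$(denied | sort | uniq -c | sort -n)
echo "$report"

# The last line of the report is the most frequent offender.
worst=$(printf '%s\n' "$report" | tail -n 1 | awk '{print $2}')
echo "most denied: $worst"   # most denied: bob@host
```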
Method 9
Pipes work best with filters and translators
find /usr/bin/ |                  #produce
sed 's:.*/::' |                   #translate: strip directory part
grep -i '^z' |                    #filter   : select items starting with z
xargs -d '\n' aFinalConsumer      #consume
This way, the data can flow from one program to the next one bufferful at a time, and at no point does all the data have to be in memory at once.
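An endless producer makes the streaming point concrete, since its whole output could never be held in memory. A minimal sketch using yes, which is not part of the original answer:

```shell
# `yes z` would print "z" forever; head reads three lines and exits, which ends the pipeline.
count=$(yes z | head -n 3 | wc -l | tr -d ' ')
echo "$count"   # 3
```

The pipeline finishes because each stage only ever sees the data currently in the pipe, not the full stream.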
Method 10
cat filename | less is a terrible use of piping, since you can just do less filename.
Here is an example of pipes that I use every day (but may also be a bad example): ls -la | more -c
The scott hoffman and njsg answers are better examples.
Method 11
How about: pattern search and display the last occurring pattern?
Shows how to use tail, sed and grep in a pipe context that can be built up step by step.
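The example itself is not shown in this answer, so the following is only one possible reading of it, with a made-up log and a made-up ERROR pattern: grep selects the matching lines, tail keeps the last one, and sed trims it.

```shell
# Hypothetical log contents.
log() {
cat <<'EOF'
10:01 INFO  start
10:02 ERROR disk full
10:03 INFO  retry
10:04 ERROR network down
EOF
}

# Search for the pattern, keep the last match, strip the timestamp and level.
last=$(log | grep ERROR | tail -n 1 | sed 's/^[0-9:]* ERROR *//')
echo "$last"   # network down
```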
Method 12
Run this one in any directory where you want a sorted analysis of folder sizes (then scroll down with the END key):
du -m | sort -n | less
Sorted by folder size.
Method 13
Here is an example which I used for setting the DISPLAY variable when xauth was not an option …
export DISPLAY=`who am i |awk '{print $NF}' | sed 's/[()]//g'`":0.0"
The first command gets the data needed, i.e., the hostname or IP.
The second command keeps just that data (the last field).
Finally, the last command strips the parentheses from the data.
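The three stages can be followed on a canned line. In this sketch who_line is a hypothetical example of what who am i prints for a remote login; the real format may differ between systems.

```shell
# Hypothetical line in the shape `who am i` prints for a remote login.
who_line='rahmu    pts/1        2012-02-21 16:55 (192.168.1.20)'

# awk keeps the last field, sed removes the parentheses, then ":0.0" is appended.
display=$(printf '%s\n' "$who_line" | awk '{print $NF}' | sed 's/[()]//g')":0.0"
echo "$display"   # 192.168.1.20:0.0
```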
Method 14
You can use command piping anywhere you feel that the output of the first command can be fed as input to the next.
Examples.
- With text files, you can pass the text file to grep to find particular lines of text. Then you can pass the output to sed or awk to modify or print a certain part of each line.
cat example.txt | grep {some_line} | awk {some_command}
- When working with processes, you can use piping to send commands to kill a process.
It’s simply the concept that if you feel the output of a command you ran can be the input of another command, you can pipe them.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.