Command to display first few and last few lines of a file

I have a file with many rows, and each row starts with a timestamp, like

[Thread-3] (21/09/12 06:17:38:672) logged message from code.....

So, I frequently check two things in this log file:

  1. The first few rows, which contain the global conditions and the start time.
  2. The last few rows, which contain the exit status and some other info.

Is there any quick handy single command that could let me display just the first and last few lines of a file?

Answers:

Method 1

@rush is right about using head + tail being more efficient for large files, but for small files (< 20 lines), some lines may be output twice.

{ head; tail;} < /path/to/file

would be equally efficient, but wouldn’t have the problem above.
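To see the difference on a short file, here is a small sketch. It assumes a `head` that, when reading a regular file, leaves the file offset just past the lines it printed (GNU head does this); the temporary file and the variable names are purely illustrative.

```shell
# Illustrative only: build a 15-line file in a temporary location.
f=$(mktemp)
seq 15 > "$f"

# Two independent reads of the file: lines 6-10 show up in both outputs,
# so 20 lines are printed in total.
separate=$( head "$f"; tail "$f" )

# One shared file descriptor: tail starts where head stopped, so every
# line is printed exactly once (15 lines in total).
shared=$( { head; tail; } < "$f" )

printf '%s\n' "$separate" | wc -l
printf '%s\n' "$shared" | wc -l
rm -f "$f"
```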

Method 2

You can use sed or awk to do it with one command. However, you'll lose speed, because sed and awk need to read through the whole file anyway.
From a speed point of view, it's much better to write a function, or to use a combination of tail + head each time. This has the downside of not working if the input is a pipe; however, you can use process substitution if your shell supports it (see the example below).

first_last () {
    head -n 10 -- "$1"
    tail -n 10 -- "$1"
}

and just launch it as

first_last "/path/to/file_to_process"

To use it with process substitution (bash, zsh, and ksh-like shells only):

first_last <( command )

P.S. You can even add a grep to check whether your "global conditions" exist.
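One way the grep idea could look is sketched below. The function name `first_last_check` and the pattern argument are hypothetical, and the sketch assumes the "global conditions" can be recognised by a regular expression somewhere in the first 10 lines.

```shell
# first_last_check FILE PATTERN   (hypothetical helper, not from the answer)
# Prints the first and last 10 lines of FILE, then reports on stderr whether
# PATTERN occurs in the first 10 lines. Returns 1 if it does not.
first_last_check () {
    head -n 10 -- "$1"
    tail -n 10 -- "$1"
    if head -n 10 -- "$1" | grep -q -e "$2"; then
        echo "marker found in header" >&2
    else
        echo "marker NOT found in header" >&2
        return 1
    fi
}
```

It would be called as `first_last_check /path/to/file_to_process 'some pattern'`, where the pattern is whatever identifies your global conditions.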

Method 3

The { head; tail; } solution doesn't work on pipes (or sockets, or any other non-seekable input). head reads in blocks, so it may consume more data than it prints, and because it can't seek back on a pipe, the read position may end up past the point where tail is meant to start.

So, you could use a tool that reads one character at a time like the shell’s read (here using a function that takes the number of head lines and tail lines as arguments).

head_tail() {
  n=0
  while [ "$n" -lt "$1" ]; do
    IFS= read -r line || { printf %s "$line"; break; }
    printf '%s\n' "$line"
    n=$(($n + 1))
  done
  tail -n "${2-$1}"
}
seq 100 | head_tail 5 10
seq 20 | head_tail 5

or implement tail in awk for instance as:

head_tail() {
  awk -v h="$1" -v t="${2-$1}" '
    {l[NR%t]=$0}
    NR<=h
    END{
      n=NR-t+1
      if(n <= h) n = h+1
      for (;n<=NR;n++) print l[n%t]
    }'
}
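As a reading aid, here is a commented copy of the same awk approach with two pipe inputs. The second call illustrates that input shorter than head + tail is printed without duplicated lines; the comments are my reading of the code rather than part of the original answer.

```shell
# Keep the last t lines in a ring buffer indexed by NR % t, print the first
# h lines as they arrive, and at the end print only those buffered lines
# that were not already printed by the head part.
head_tail() {
  awk -v h="$1" -v t="${2-$1}" '
    { l[NR % t] = $0 }          # remember this line in the ring buffer
    NR <= h                     # head: print the first h lines immediately
    END {
      n = NR - t + 1            # line number where the tail starts
      if (n <= h) n = h + 1     # skip lines the head part already printed
      for (; n <= NR; n++) print l[n % t]
    }'
}

seq 100 | head_tail 2 3   # 1 2 98 99 100, one per line
seq 7 | head_tail 5 10    # 1 through 7, each exactly once
```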

With sed:

head_tail() {
  sed -e "1,${1}b" -e :1 -e "$(($1+${2-$1})),\$!{N;b1" -e '}' -e 'N;D'
}

(though beware that some sed implementations have a low limitation on the size of their pattern space, so would fail for big values of the number of tail lines).

Method 4

With the -u (--unbuffered) option of GNU sed, you can use sed -u 2q as an unbuffered alternative to head -n2:

$ seq 100|(sed -u 2q;tail -n2)
1
2
99
100

When (head -n2;tail -n2) is used in a pipeline, it doesn’t print the last lines when the last lines are part of a block of input that is consumed by head:

$ seq 1000|(head -n2;tail -n2)
1
2
999
1000
$ seq 100|(head -n2;tail -n2)
1
2

If the input is a regular file and not a pipeline, you can just use (head -n2;tail -n2):

$ seq 100 >a;(head -n2;tail -n2)<a
1
2
99
100

Method 5

Using bash process substitution, you can do the following:

make_some_output | tee >(tail -n 2) >(head -n 2; cat >/dev/null) >/dev/null

Note that the lines are not guaranteed to be in order, though for files longer than about 8kB, they very likely will be. This 8kB cutoff is the typical size of the read buffer, and is related to the reason | { head; tail; } doesn't work for small files.

The cat >/dev/null is necessary to keep the head pipeline alive. Otherwise tee will quit early, and while you’ll get output from tail, it’ll be from somewhere in the middle of the input, rather than the end.

Finally, why the >/dev/null instead of, say, moving tail to another |? In the following case:

make_some_output | tee >(head -n 2; cat >/dev/null) | tail -n 2  # doesn't work

head's stdout is fed into the pipe to tail rather than the console, which isn't what we want at all.

Method 6

Using ed (which will read the entire file into RAM though):

# cf. http://wiki.bash-hackers.org/howto/edit-ed
printf '%s\n' 'H' '1,10p' '$-10,$p' 'q' | ed -s file

Method 7

Stephane’s first solution in a function so that you can use arguments (works in any Bourne-like or POSIX shell):

head_tail() {
    head "$@";
    tail "$@";
}

Now you can do this:

head_tail -n 5 < /path/to/file

This of course assumes that you’re looking at only one file and like Stephane’s solution works (reliably) only on regular (seekable) files.
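If you do need several files, one possible extension is to loop over them and redirect each one in turn. This is only a sketch: the name `head_tail_files` and the `==> file <==` header format are my own choices, and it still relies on each argument being a regular (seekable) file.

```shell
# head_tail_files N FILE...   (hypothetical helper, not from the answer)
# For each FILE, print a header line, then its first N and last N lines.
# Note: n and f are ordinary (global) shell variables in POSIX sh.
head_tail_files() {
    n=$1
    shift
    for f in "$@"; do
        printf '==> %s <==\n' "$f"
        # head and tail share one file descriptor per file, so tail
        # continues from where head stopped reading.
        { head -n "$n"; tail -n "$n"; } < "$f"
    done
}
```

For example, `head_tail_files 5 first.log second.log` (hypothetical file names) would print five lines from each end of both files.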

Method 8

I ran into something like this today where I needed only the last line and a few lines from the front of a stream and came up with the following.

sed -n -e '1{h}' -e '2,3{H}' -e '${H;x;p}'

I read this as: initialize the hold space with the contents of the first line, append lines 2-3 in the hold space, at EOF append the last line to the hold space, swap hold-and-pattern space, and print the pattern space.

Perhaps someone with more sed-fu than I have can figure out how to generalize this to print the last few lines of the stream indicated in this question but I didn’t need it and could not find an easy way to do math based on the $ address in sed or perhaps by managing the hold space so that only the last few lines are in it when EOF is reached.

Method 9

You may try Perl, if you have it installed:

perl -e '@_ = <>; @_=@_[0, -3..-1]; print @_'

This will work for most files, but it reads the whole file into memory before processing it. If you're not familiar with Perl slices, "0" in square brackets means "take the first line", and "-3..-1" means "take the last three lines". You may tailor both of them to your needs. If you need to process really large files (what counts as "large" may depend on your RAM and perhaps swap size), you may want to go for:

perl -e 'while($_=<>){@_=(@_,$_)[0,-3..-1]}; print @_'

It may be somewhat slower, because it takes a slice on every iteration, but its memory use is independent of the file size.

Both commands should work both in pipes and with regular files.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
