Why does awk do full buffering when reading from a pipe?

I’m reading from a serial port connected to a GPS device that sends NMEA strings.

A simplified invocation to illustrate my point:

  $ awk '{ print $0 }' /dev/ttyPSC9
  $GPGGA,073651.000,6310.1043,N,01436.1539,E,1,07,1.0,340.2,M,33.3,M,,0000*56
  $GPGSA,A,3,28,22,09,27,01,19,17,,,,,,2.3,1.0,2.0*39
  $GPRMC,073651.000,A,6310.1043,N,01436.1539,E,0.42,163.42,070312,,,A*67
  $GPGGA,073652.000,6310.1043,N,01436.1540,E,1,07,1.0,339.2,M,33.3,M,,0000*55
  $GPGSA,A,3,28,22,09,27,01,19,17,,,,,,2.3,1.0,2.0*39

If I instead try to read from a pipe, awk buffers the input before sending it to stdout.

  $ cat /dev/ttyPSC9 | awk '{ print $0 }'
  <long pause>
  $GPGGA,073651.000,6310.1043,N,01436.1539,E,1,07,1.0,340.2,M,33.3,M,,0000*56
  $GPGSA,A,3,28,22,09,27,01,19,17,,,,,,2.3,1.0,2.0*39
  $GPRMC,073651.000,A,6310.1043,N,01436.1539,E,0.42,163.42,070312,,,A*67
  $GPGGA,073652.000,6310.1043,N,01436.1540,E,1,07,1.0,339.2,M,33.3,M,,0000*55
  $GPGSA,A,3,28,22,09,27,01,19,17,,,,,,2.3,1.0,2.0*39

How can I avoid the buffering?

Edit: Kyle Jones suggested that cat is buffering its output, but that doesn’t appear to be happening:

  $ strace cat /dev/ttyPSC9 | awk '{ print $0 }'
  write(1, "2,"..., 2)                    = 2
  read(3, "E"..., 4096)                   = 1
  write(1, "E"..., 1)                     = 1
  read(3, ",0"..., 4096)                  = 2

Thinking about it: I assumed that a program uses line buffering when writing to a terminal and “regular” (full) buffering in all other cases. Why, then, is cat not buffering more? Is the serial port signaling EOF? If so, why does cat not terminate?

Answers:


Method 1

I know it is an old question, but a one-liner may help those who come here searching:

  cat /dev/ttyPSC9 | awk '{ print $0; system("") }'

system("") does the trick and is POSIX compliant. On non-POSIX systems, beware.

A more specific function, fflush(), does the same, but it is not available in older versions of awk.

An important piece of information from the docs regarding the use of system(""):

gawk treats this use of the system() function as a special case and is
smart enough not to run a shell (or other command interpreter) with
the empty command. Therefore, with gawk, this idiom is not only
useful, it is also efficient.
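
For awk versions that do provide fflush(), the same one-liner might look like the sketch below. A printf of two placeholder lines stands in for the serial port so that the snippet runs anywhere, and the function name flush_each_line is made up for illustration.

```shell
# Hypothetical helper: echo every input record and flush stdout right away.
# Assumes an awk that provides fflush() (gawk, mawk and BWK awk do).
flush_each_line() {
    awk '{ print $0; fflush() }'
}

# Placeholder data standing in for: cat /dev/ttyPSC9
printf '%s\n' 'sentence 1' 'sentence 2' | flush_each_line
```

Against the real device this would be used as cat /dev/ttyPSC9 | flush_each_line.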

Method 2

It is likely to be buffering in awk, not cat. In the first case, awk believes it is interactive because its input and output are TTYs (even though they’re different TTYs – I’m guessing that awk is not checking that). In the second, the input is a pipe so it runs non-interactively.

You will need to explicitly flush in your awk program. This is not portable, though.

For more background and details on how to flush output, read: http://www.gnu.org/software/gawk/manual/html_node/I_002fO-Functions.html
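
Without the GPS hardware at hand, the effect of an explicit flush can be observed with a slow producer. The sketch below is only an illustration: slow_source and flush_demo are made-up names, and the trailing cat is there so that awk's stdout is a pipe, where output is normally fully buffered.

```shell
# Made-up stand-in for the serial port: emits one line per second.
slow_source() {
    for i in 1 2 3; do
        printf 'line %s\n' "$i"
        sleep 1
    done
}

# awk writes into a pipe here, so without the flush its output would
# typically be held back until awk exits.
flush_demo() {
    slow_source | awk '{ print $0; system("") }' | cat
}

flush_demo
```

With the flush in place, the lines should appear about one second apart; dropping system("") usually makes all three show up together at the end. If they still arrive in one burst, the awk in use may be buffering its input as well (mawk documents a -W interactive option for that case).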


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.
