Why do newline characters get lost when using command substitution?

I have a text file named links.txt which looks like this

link1
link2
link3

I want to loop through this file line by line and perform an operation on every line. I know I can do this with a while loop, but since I am learning, I wanted to try a for loop.
I actually used command substitution like this

a=$(cat links.txt)

Then used the loop like this

for i in $a; do ###something###;done

Also I can do something like this

for i in $(cat links.txt); do ###something###; done

Now my question: when I stored the output of the cat command in the variable a, the newline characters between link1, link2 and link3 were removed and replaced by spaces

echo $a

outputs

link1 link2 link3

and then I used the for loop.
Is a newline always replaced by a space when we do a command substitution?

Regards

Answers:


Method 1

Newlines get swapped out whenever an unquoted expansion is split into words, because the newline is one of the shell's default field separators. To keep them, quote the expansion:

$ a="$(cat links.txt)"
$ echo "$a"
link1
link2
link3

Now, since I used quotes whenever I was manipulating the data, the newline characters (\n) were never treated as separators by the shell, and therefore remained. If you forget the quotes at some point, these special characters will be lost.

The very same behaviour will occur if you use your loop on lines containing spaces. For instance, given the following file…

mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt

The output will depend on whether or not you use quotes:

$ for i in $(cat links.txt); do echo $i; done
mypath1/file
with
spaces.txt
mypath2/filewithoutspaces.txt

$ for i in "$(cat links.txt)"; do echo "$i"; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
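
One caveat about the quoted loop above, sketched here with a throwaway file from mktemp (the file and the counter names are illustrative, not part of the question): the quoted substitution yields a single word, so the loop body runs only once, with the whole file in $i. The output looks right only because echo prints the embedded newlines. The sketch assumes the default IFS.

```shell
# Hypothetical sample file; its name comes from mktemp, not from the question.
tmp=$(mktemp)
printf '%s\n' 'mypath1/file with spaces.txt' 'mypath2/filewithoutspaces.txt' > "$tmp"

quoted=0
for i in "$(cat "$tmp")"; do      # quoted: one word, so one iteration
    quoted=$((quoted + 1))
done

unquoted=0
for i in $(cat "$tmp"); do        # unquoted: split on spaces, tabs and newlines
    unquoted=$((unquoted + 1))
done

echo "quoted iterations:   $quoted"
echo "unquoted iterations: $unquoted"
rm -f "$tmp"
```

So quoting the substitution keeps the text intact but does not give you one iteration per line.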

Now, if you don’t want to use quotes, there is a special shell variable which can be used to change the shell field separator (IFS). If you set this separator to the newline character, you will get rid of most problems.

$ IFS=$'\n'; for i in $(cat links.txt); do echo $i; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt

For the sake of completeness, here is another example, which does not rely on command output substitution. After some time, I found out that this method was considered more reliable by most users due to the very behaviour of the read utility.

$ cat links.txt | while IFS= read -r i; do echo "$i"; done

Here is an excerpt from read's man page:

The read utility shall read a single line from standard input.

Since read gets its input line by line, you’re sure it won’t break whenever a space shows up. Just pass it the output of cat through a pipe, and it’ll iterate over your lines just fine.

Edit: I can see from other answers and comments that people are quite reluctant when it comes to the use of cat. As jasonwryan said in his comment, a more proper way to read a file in shell is to use stream redirection (<), as you can see in val0x00ff’s answer here. However, since the question isn’t “how to read/process a file in shell programming“, my answer focuses more on the quotes behaviour, and not the rest.

Method 2

The newlines were lost because the shell performed field splitting on the unquoted result of the command substitution.

From the POSIX Command Substitution section:

The shell shall expand the command substitution by executing command
in a subshell environment (see Shell Execution Environment) and
replacing the command substitution (the text of command plus the
enclosing "$()" or backquotes) with the standard output of the
command, removing sequences of one or more <newline> characters at the
end of the substitution. Embedded <newline> characters before the end
of the output shall not be removed; however, they may be treated as
field delimiters and eliminated during field splitting, depending on
the value of IFS and quoting that is in effect. If the output contains
any null bytes, the behavior is unspecified.

Default IFS value (at least in bash):

$ printf '%q\n' "$IFS"
$' \t\n'

In your case, you neither set IFS nor use double quotes, so the newline characters are eliminated during field splitting.

You can preserve newlines, for example by setting IFS to empty:

$ IFS=
$ a=$(cat links.txt)
$ echo "$a"
link1
link2
link3
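
The other half of the POSIX text quoted above can be checked with a small sketch (the sample string and the variable names are made up for illustration): the assignment itself does no field splitting, so embedded newlines survive even without quotes, while trailing newlines are always stripped by the substitution.

```shell
# Three lines followed by two extra trailing newlines.
a=$(printf 'link1\nlink2\nlink3\n\n\n')

# What we expect to be left: embedded newlines kept, trailing ones removed.
expected=$(printf 'link1\nlink2\nlink3')

if [ "$a" = "$expected" ]; then
    echo "embedded newlines kept, trailing newlines stripped"
fi
echo "length of a: ${#a}"
```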

Method 3

To add my emphasis, for loops iterate over words. If your file is:

one two
three four

Then this will emit four lines:

for word in $(cat file); do echo "$word"; done

To iterate over the lines of a file, do this:

while IFS= read -r line; do
    # do something with "$line" <-- quoted almost always
done < file
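
A small extension of the loop above, offered as a sketch rather than a requirement: read returns a non-zero status when it reaches end of file before seeing a newline, so a final line without a trailing newline is stored in the variable but the loop body is skipped. A common idiom is to also test whether the variable is non-empty. The sample data and the names tmp, count and last are assumptions for illustration.

```shell
tmp=$(mktemp)
printf 'one two\nthree four' > "$tmp"   # note: no newline after the last line

count=0
last=
# The extra test lets the body run for an unterminated final line.
while IFS= read -r line || [ -n "$line" ]; do
    count=$((count + 1))
    last=$line
done < "$tmp"

echo "lines seen: $count"
echo "last line:  $last"
rm -f "$tmp"
```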

Method 4

You can use bash's read builtin. Also take a look at mapfile.

while read -r link
do
    printf '%s\n' "$link"
done < links.txt

Or using mapfile

mapfile -t myarray < links.txt
for link in "${myarray[@]}"; do printf '%s\n' "$link"; done
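
mapfile was added in bash 4.0, so it is missing from older shells. As a rough sketch of a fallback, a read loop can fill the same array one line at a time. The temporary file and its contents are assumptions for illustration; only the array name myarray comes from the answer.

```shell
#!/usr/bin/env bash
tmp=$(mktemp)
printf '%s\n' 'link1' 'link 2 with spaces' 'link3' > "$tmp"

myarray=()
while IFS= read -r link; do
    myarray+=("$link")            # one array element per input line
done < "$tmp"

echo "elements: ${#myarray[@]}"
for link in "${myarray[@]}"; do printf '%s\n' "$link"; done
rm -f "$tmp"
```

Quoting "${myarray[@]}" keeps each line as one word, even when it contains spaces.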

Method 5

The newlines are replaced with spaces because of how echo works: the unquoted $a is split into separate arguments, and echo joins its arguments with single spaces. In truth, you can iterate with for over anything you want, but you have to specify the field delimiter first:

string=abababababababababababa IFS=a
for c in $string
do printf %s "$c"
done

OUTPUT

bbbbbbbbbbb

But this isn’t behavior unique to a for loop – this happens for any field split expansion:

printf %s $string
bbbbbbbbbbb

For example, if you want to print only the first 10 bytes from any non-blank line in a file…

###original:
first "line"
<second>"line"
<second>"line"
<second>line and so on%
(IFS='
'; printf '%.10s\n' $(cat file))
###output
first "lin
<second>"l
<second>"l
<second>li

There is a reason I specify non-blank above: the newline is one of three special bytes in $IFS. While any other separator will get you an empty argument when two or more occur in succession, any run of spaces, tabs, or newlines only ever counts as a single delimiter.

And so for example:

(IFS=0;printf 'ten lines!%s\n' $(printf "%010d"))

ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!

But…

(IFS=' ';printf 'one line%s\n' $(printf "%10s"))
one line

In both cases printf prints 10 filler characters – in the first case it prints 10 zeroes and in the second 10 spaces. In the first case every 0 generates a null field and the second printf gets 10 empty arguments for each of which it writes its format string, but all of the spaces printed in the second case amount to nothing at all.
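
The difference can also be seen by counting fields directly. This is a minimal sketch: count_fields is a hypothetical helper that just prints how many arguments it received, and the variable names are made up for illustration.

```shell
zeros=$(printf '%010d' 0)     # ten "0" characters
spaces=$(printf '%10s' '')    # ten space characters

count_fields() { echo "$#"; } # hypothetical helper: prints its argument count

IFS=0
zero_fields=$(count_fields $zeros)     # each 0 terminates an empty field
IFS=' '
space_fields=$(count_fields $spaces)   # a run of IFS whitespace yields no field
unset IFS                              # back to default splitting behaviour

echo "fields from zeros:  $zero_fields"
echo "fields from spaces: $space_fields"
```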

You should note that this is not the only kind of field generation that the shell will do with unquoted expansions – by default it will also glob. Doing things like:

for line in $(cat file)

Can lead to very unexpected results because there is a very real chance that some of those lines will contain shell globs that match real files – and suddenly $line doesn’t refer to a line of input any more but instead an on-disk filename.

If you plan to use $IFS for any splitting it is always a good idea to:

set -f

…first, which will instruct the shell not to glob while you do. When you are through you can re-enable it with set +f.
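
As a sketch of the effect, assuming a scratch directory created with mktemp -d and made-up file names, the same unquoted loop can be run with and without globbing:

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt
printf '%s\n' '*.txt' 'plain line' > input    # the first line is a glob pattern

# Split on newlines only (a literal newline between the quotes).
IFS='
'
globbed=0
for line in $(cat input); do globbed=$((globbed + 1)); done   # *.txt becomes a.txt b.txt

set -f                                        # disable pathname expansion
literal=0
for line in $(cat input); do literal=$((literal + 1)); done   # *.txt stays as written
set +f                                        # re-enable globbing
unset IFS

echo "with globbing:    $globbed words"
echo "without globbing: $literal words"
cd / && rm -rf "$dir"
```

With globbing enabled, the pattern line turns into two file names, so the loop sees three words instead of the two lines in the file.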


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
