I have a text file named links.txt which looks like this
link1
link2
link3
I want to loop through this file line by line and perform an operation on every line. I know I can do this using while loop but since I am learning, I thought to use a for loop.
I actually used command substitution like this
a=$(cat links.txt)
Then used the loop like this
for i in $a; do ###something###;done
Also I can do something like this
for i in $(cat links.txt); do ###something###; done
Now my question is: when I stored the output of the cat command in the variable a, the newline characters between link1, link2 and link3 were removed and replaced by spaces
echo $a
outputs
link1 link2 link3
and then I used the for loop.
Is a newline always replaced by a space when we do a command substitution?
Regards
Answers:
Method 1
Newlines get swapped out whenever an expansion is left unquoted, because the newline is one of the characters the shell splits words on. In order to keep them, you need to quote the expansion every time you use it:
$ a="$(cat links.txt)"
$ echo "$a"
link1
link2
link3
Now, since I used quotes whenever I was manipulating the data, the shell never split it into words, and the newline characters (\n) therefore remained. If you forget the quotes at some point, the unquoted expansion is split on whitespace and these characters will be lost.
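The difference can be checked with a small sketch. It assumes a throwaway file created with mktemp as a stand-in for links.txt; the variable names unquoted and quoted are only for illustration.

```shell
# Create a throwaway three-line file (stand-in for links.txt).
tmp=$(mktemp)
printf 'link1\nlink2\nlink3\n' > "$tmp"

a=$(cat "$tmp")

# Unquoted: the shell splits $a into three words and echo joins them with spaces.
unquoted=$(echo $a)
# Quoted: echo receives a single argument that still contains the newlines.
quoted=$(echo "$a")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"

rm -f "$tmp"
```

Note that the variable a holds the newlines in both cases; they only disappear at the point where it is expanded without quotes.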
The very same behaviour will occur if you use your loop on lines containing spaces. For instance, given the following file…
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
The output will depend on whether or not you use quotes:
$ for i in $(cat links.txt); do echo $i; done
mypath1/file
with
spaces.txt
mypath2/filewithoutspaces.txt
$ for i in "$(cat links.txt)"; do echo "$i"; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
Now, if you don’t want to use quotes, there is a special shell variable which can be used to change the shell field separator (IFS). If you set this separator to the newline character, you will get rid of most problems.
$ IFS=$'\n'; for i in $(cat links.txt); do echo $i; done
mypath1/file with spaces.txt
mypath2/filewithoutspaces.txt
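As a rough check of the effect of IFS on the loop, the sketch below counts iterations with the default separator and with a newline-only separator. It assumes a temporary sample file, uses a literal newline instead of $'\n' so that it also runs in plain sh, and changes IFS inside a subshell so the setting does not leak into the rest of the session. The counter names are made up for the example.

```shell
# Hypothetical sample file with one path containing spaces.
tmp=$(mktemp)
printf 'mypath1/file with spaces.txt\nmypath2/filewithoutspaces.txt\n' > "$tmp"

# Default IFS: every space and newline starts a new word.
count_default=0
for i in $(cat "$tmp"); do
    count_default=$((count_default + 1))
done

# Newline-only IFS, set inside the command substitution's subshell.
count_newline=$(
    IFS='
'
    n=0
    for i in $(cat "$tmp"); do
        n=$((n + 1))
    done
    echo "$n"
)

echo "default IFS: $count_default iterations"
echo "newline IFS: $count_newline iterations"

rm -f "$tmp"
```

With the default IFS the loop body runs once per word, with the newline-only IFS once per line.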
For the sake of completeness, here is another example, which does not rely on command output substitution. After some time, I found out that this method was considered more reliable by most users due to the very behaviour of the read utility.
$ cat links.txt | while read i; do echo $i; done
Here is an excerpt from read's man page:
The read utility shall read a single line from standard input.
Since read gets its input line by line, you’re sure it won’t break whenever a space shows up. Just pass it the output of cat through a pipe, and it’ll iterate over your lines just fine.
Edit: I can see from other answers and comments that people are quite reluctant when it comes to the use of cat. As jasonwryan said in his comment, a more proper way to read a file in shell is to use stream redirection (<), as you can see in val0x00ff’s answer here. However, since the question isn’t “how to read/process a file in shell programming“, my answer focuses more on the quotes behaviour, and not the rest.
Method 2
The newlines were lost, because the shell had performed field splitting after command substitution.
In POSIX Command Substitution section:
The shell shall expand the command substitution by executing command
in a subshell environment (see Shell Execution Environment) and
replacing the command substitution (the text of command plus the
enclosing “$()” or backquotes) with the standard output of the
command, removing sequences of one or more <newline> characters at the
end of the substitution. Embedded <newline> characters before the end
of the output shall not be removed; however, they may be treated as
field delimiters and eliminated during field splitting, depending on
the value of IFS and quoting that is in effect. If the output contains
any null bytes, the behavior is unspecified.
Default IFS value (at least in bash):
$ printf '%q\n' "$IFS"
$' \t\n'
In your case, you neither set IFS nor use double quotes, so the newline characters are eliminated during field splitting.
You can preserve newlines, for example by setting IFS to empty:
$ IFS=
$ a=$(cat links.txt)
$ echo "$a"
link1
link2
link3
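To see how the field count depends on IFS and on quoting, here is a minimal sketch. The helper count_args is a hypothetical name introduced for the example; it just prints how many arguments it received.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

text=$(printf 'link1\nlink2\nlink3')

# Default IFS (space, tab, newline): the unquoted expansion splits into 3 fields.
fields_default=$(count_args $text)

# Empty IFS: field splitting is disabled, so the expansion stays one field.
fields_empty=$(IFS=; count_args $text)

# Double quotes: no splitting, whatever IFS is.
fields_quoted=$(count_args "$text")

echo "default=$fields_default empty=$fields_empty quoted=$fields_quoted"
```

The IFS change happens inside the command substitution's subshell, so the surrounding shell keeps its default value.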
Method 3
To add my emphasis, for loops iterate over words. If your file is:
one two
three four
Then this will emit four lines:
for word in $(cat file); do echo "$word"; done
To iterate over the lines of a file, do this:
while IFS= read -r line; do
    # do something with "$line" <-- quoted almost always
done < file
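A short sketch of what the IFS= and -r parts of that idiom appear to buy you, assuming a temporary one-line file with leading spaces and a backslash in it (the variable names plain and exact are illustrative):

```shell
tmp=$(mktemp)
# One line with two leading spaces and a backslash, written literally.
printf '%s\n' '  path\to file' > "$tmp"

# Plain read: strips leading whitespace and treats the backslash as an escape.
read plain < "$tmp"

# IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
IFS= read -r exact < "$tmp"

printf 'plain: [%s]\n' "$plain"
printf 'exact: [%s]\n' "$exact"

rm -f "$tmp"
```

Because IFS= is written as a prefix to read, it only applies to that one command.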
Method 4
You can use read from bash. Also have a look at mapfile.
while read -r link
do
    printf '%s\n' "$link"
done < links.txt
Or using mapfile
mapfile -t myarray < links.txt
for link in "${myarray[@]}"; do printf '%s\n' "$link"; done
Method 5
The newlines are replaced with spaces because that’s how echo works – it concatenates its arguments on spaces. echo replaces argument delimiters with a space. In truth, you can iterate with for over anything you want, but you have to specify the field delimiter first:
string=abababababababababababa
IFS=a
for c in $string
do printf %s "$c"
done
OUTPUT
bbbbbbbbbbb
But this isn’t behavior unique to a for loop – this happens for any field split expansion:
printf %s $string
bbbbbbbbbbb
For example, if you want to print only the first 10 bytes from any non-blank line in a file…
###original:
first "line"
<second>"line"
<second>"line"
<second>line and so on

(IFS='
'; printf '%.10s\n' $(cat file))

###output
first "lin
<second>"l
<second>"l
<second>li
There is a reason I specify non-blank above – the newline is one of three special bytes in $IFS. While every other separator will get you an empty argument when 2 or more occur in succession, any sequence of spaces, tabs, or newlines can only ever count as a single field separator.
And so for example:
(IFS=0;printf 'ten lines!%s\n' $(printf "%010d"))
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
ten lines!
But…
(IFS=' ';printf 'one line%s\n' $(printf "%010s"))
one line
In both cases printf prints 10 filler characters – in the first case it prints 10 zeroes and in the second 10 spaces. In the first case every 0 generates a null field and the second printf gets 10 empty arguments for each of which it writes its format string, but all of the spaces printed in the second case amount to nothing at all.
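The same contrast can be measured by counting fields directly. This is only a sketch: count_args is a hypothetical helper, and the ten-character strings are built with printf to avoid miscounting them by hand.

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

zeros=$(printf '%010d' 0)    # ten zeroes
spaces=$(printf '%10s' '')   # ten spaces

# Each non-whitespace separator terminates a field: ten zeroes, ten empty fields.
n_zeros=$(IFS=0; count_args $zeros)

# A run of IFS whitespace is a single separator, and here it delimits nothing.
n_spaces=$(IFS=' '; count_args $spaces)

echo "IFS=0 -> $n_zeros fields, IFS=' ' -> $n_spaces fields"
```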
You should note that this is not the only kind of field generation that the shell will do with unquoted expansions – by default it will also glob. Doing things like:
for line in $(cat file)
Can lead to very unexpected results because there is a very real chance that some of those lines will contain shell globs that match real files – and suddenly $line doesn’t refer to a line of input any more but instead an on-disk filename.
If you plan to use $IFS for any splitting it is always a good idea to:
set -f
…first, which will instruct the shell not to glob while you do. When you are through you can re-enable it with set +f.
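A small sketch of the globbing problem and of what set -f changes. It assumes a throwaway directory with two made-up files, a.txt and b.txt, and an input file whose single line happens to be a glob pattern; count_args is again a hypothetical helper.

```shell
# Work in a throwaway directory containing two hypothetical files.
orig_dir=$PWD
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt
printf '%s\n' '*.txt' > input   # one line that is also a glob pattern

count_args() { echo "$#"; }

# Globbing on: the single line *.txt expands to the two matching filenames.
with_glob=$(count_args $(cat input))

# Globbing off: the line is passed through as the literal string *.txt.
set -f
without_glob=$(count_args $(cat input))
literal=$(printf '%s' $(cat input))
set +f

echo "with globbing: $with_glob, without: $without_glob ($literal)"

cd "$orig_dir" && rm -rf "$dir"
```

With globbing enabled the one input line turns into two filenames; with set -f it stays a single literal word.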
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.