I would like to remove all leading and trailing spaces and tabs from each line in an output.
Is there a simple tool like trim I could pipe my output into?
Example file:
test space at back
test space at front
TAB at end
TAB at front
sequence of some space in the middle
some empty lines with differing TABS and spaces:
test space at both ends
Answers:
Method 1
awk '{$1=$1;print}'
or shorter:
awk '{$1=$1};1'
This trims leading and trailing space or tab characters [1] and also squeezes sequences of tabs and spaces into a single space.
That works because when you assign something to one of the fields, awk rebuilds the whole record (as printed by print) by joining all fields ($1, …, $NF) with OFS (space by default).
To also remove blank lines, change it to awk 'NF{$1=$1;print}' (where NF, used as the condition, selects only the records for which the Number of Fields is non-zero; it has to be tested before the assignment, because assigning to $1 creates a field even on an empty record). Do not do awk '$1=$1' as sometimes suggested, as that would also remove lines whose first field is any representation of 0 supported by awk (0, 00, -0e+12…).
[1] (and possibly other blank characters, depending on the locale and the awk implementation)
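The behaviour described above can be checked on a small sample. This is a minimal sketch assuming a POSIX awk; the function names trim_squeeze and trim_squeeze_nonblank are made up for the illustration, and the second one tests NF before the assignment so that blank records are skipped before any field is touched.

```shell
# Hypothetical helper names, not part of any standard tool.

# Trim both ends and squeeze inner runs of blanks to a single space.
trim_squeeze() {
    awk '{$1=$1};1'
}

# Same, but also drop lines that are empty or contain only blanks.
# A line consisting of "0" is kept, because NF is 1 for it.
trim_squeeze_nonblank() {
    awk 'NF{$1=$1;print}'
}

printf '  a \t b  \n \t \n0\n' | trim_squeeze           # three lines: "a b", "", "0"
printf '  a \t b  \n \t \n0\n' | trim_squeeze_nonblank  # two lines: "a b", "0"
```

The sample deliberately contains a line holding only 0, which is the case where awk '$1=$1' would wrongly drop the line.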
Method 2
The command can be condensed like so if you’re using GNU sed:
$ sed 's/^[ \t]*//;s/[ \t]*$//' < file
Example
Here’s the above command in action.
$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah
You can use hexdump to confirm that the sed command is stripping the desired characters correctly.
$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009
Character classes
You can also use character class names instead of literally listing the sets like this, [ \t]:
$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < file
Example
$ echo -e " \t blahblah \t " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
Most of the GNU tools that make use of regular expressions (regex) support these classes (here with their equivalent in the typical C locale of an ASCII-based system (and there only)).
[[:alnum:]] - [A-Za-z0-9] Alphanumeric characters
[[:alpha:]] - [A-Za-z] Alphabetic characters
[[:blank:]] - [ \t] Space or tab characters only
[[:cntrl:]] - [\x00-\x1F\x7F] Control characters
[[:digit:]] - [0-9] Numeric characters
[[:graph:]] - [!-~] Printable and visible characters
[[:lower:]] - [a-z] Lower-case alphabetic characters
[[:print:]] - [ -~] Printable (non-Control) characters
[[:punct:]] - [!-/:-@[-`{-~] Punctuation characters
[[:space:]] - [ \t\v\f\n\r] All whitespace chars
[[:upper:]] - [A-Z] Upper-case alphabetic characters
[[:xdigit:]] - [0-9a-fA-F] Hexadecimal digit characters
Using these instead of literal sets always seems like a waste of space, but if you’re concerned with your code being portable, or having to deal with alternative character sets (think international), then you’ll likely want to use the class names instead.
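One practical difference between the classes shows up with DOS-style line endings. The following is a minimal sketch, assuming a sed that supports POSIX character classes; trim_blank and trim_space are illustrative names only. Since [[:blank:]] covers just space and tab, a trailing carriage return survives (and shields the space before it), while [[:space:]] removes it.

```shell
# Illustrative helper names, not standard commands.

# Trim spaces and tabs only.
trim_blank() {
    sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'
}

# Trim any whitespace, including a carriage return left by DOS line endings.
trim_space() {
    sed 's/^[[:space:]]*//;s/[[:space:]]*$//'
}

# od -c makes the remaining invisible characters visible.
printf '  value \r\n' | trim_blank | od -c
printf '  value \r\n' | trim_space | od -c
```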
Method 3
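xargs without arguments does that, because it runs echo on the words it reads. Note that it also squeezes inner whitespace, joins all input lines into one, and treats quotes and backslashes specially.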
Example:
trimmed_string=$(echo "no_trimmed_string" | xargs)
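A quick sketch of what xargs does with more than one line of input, assuming the default behaviour of running echo on the collected words. The variable name out is only for illustration.

```shell
# Two padded input lines go in; xargs hands all the words to a single echo.
out=$(printf '  a   b  \n  c  \n' | xargs)

# Brackets make the absence of leading and trailing blanks visible.
printf '[%s]\n' "$out"   # prints [a b c]
```

So this is handy for a single short string, but it is not a per-line trim.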
Method 4
As suggested by Stéphane Chazelas in the accepted answer, you can now
create a script /usr/local/bin/trim:
#!/bin/bash
awk '{$1=$1};1'
and give that file executable rights:
chmod +x /usr/local/bin/trim
Now you can pass every output to trim for example:
cat file | trim
(for the comments below: I used this before: while read i; do echo "$i"; done
which also works fine, but is slower)
Method 5
If you store lines as variables, you can use bash to do the job:
Remove leading whitespace from a string:
shopt -s extglob
printf '%s\n' "${text##+([[:space:]])}"
Remove trailing whitespace from a string:
shopt -s extglob
printf '%s\n' "${text%%+([[:space:]])}"
Remove all whitespace from a string:
printf '%s\n' "${text//[[:space:]]}"
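To trim both ends of a variable in one go without enabling extglob, a different technique can be used: nested parameter expansions that work in any POSIX shell. This is a sketch, not the extglob method from this answer, and trim_var is a hypothetical function name.

```shell
# Hypothetical helper: print its first argument without leading/trailing whitespace.
# The body runs in a subshell, so the variable "text" does not leak out.
trim_var() (
    text=$1
    # ${text%%[![:space:]]*} is the leading run of whitespace; strip it as a prefix.
    text=${text#"${text%%[![:space:]]*}"}
    # ${text##*[![:space:]]} is the trailing run of whitespace; strip it as a suffix.
    text=${text%"${text##*[![:space:]]}"}
    printf '%s\n' "$text"
)

padded=$(printf '  \t hello  world \t ')
trim_var "$padded"   # prints: hello  world
```

Whitespace between the words is left untouched, which matches what the two extglob expansions do when applied one after the other.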
Method 6
To remove all leading and trailing spaces from a given line thanks to a ‘piped’ tool, I can identify 3 different ways which are not completely equivalent. These differences concern the spaces between words of the input line. Depending on the expected behaviour, you’ll make your choice.
Examples
To explain the differences, let us consider this dummy input line:
" \t A \tB\tC \t "
tr
$ echo -e " \t A \tB\tC \t " | tr -d "[:blank:]"
ABC
tr is really a simple command. In this case, it deletes every space and tab character, including those between words.
awk
$ echo -e " \t A \tB\tC \t " | awk '{$1=$1};1'
A B C
awk deletes leading and trailing blanks and squeezes every run of blanks between words into a single space.
sed
$ echo -e " \t A \tB\tC \t " | sed 's/^[ \t]*//;s/[ \t]*$//'
A 	B	C
In this case, sed deletes leading and trailing blanks without touching the spaces and tabs between words.
Remark:
In the case of one word per line, tr does the job.
Method 7
sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
If you’re reading a line into a shell variable, read does that already unless instructed otherwise.
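The effect of read can be seen by comparing the default IFS with an empty one. This is a small sketch for a POSIX shell with the default IFS (space, tab, newline); trim_lines and keep_lines are names invented for the example.

```shell
# Default IFS: read strips leading and trailing blanks from each line.
# The "|| [ -n ... ]" part keeps a final line that lacks a trailing newline.
trim_lines() {
    while read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}

# Empty IFS: read is "instructed otherwise" and leaves the line as it is.
keep_lines() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}

# od -c shows which blanks are left.
printf '  \t a  b \t \n' | trim_lines | od -c
printf '  \t a  b \t \n' | keep_lines | od -c
```

As with sed, the blanks between words are preserved; only the two ends are affected. A shell loop like this is much slower than sed or awk on large inputs.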
Method 8
sed is a great tool for that:
# substitute ("s/")
sed 's/^[[:blank:]]*//; # parts of lines that start ("^") with a space/tab
s/[[:blank:]]*$//' # or end ("$") with a space/tab
# with nothing (/)
You can use it for your case by either piping in the text, e.g.
<file sed -e 's/^[[...
or by acting on it ‘inline’ if your sed is the GNU one:
sed -i 's/...' file
but changing the source this way is “dangerous” as it may be unrecoverable when it doesn’t work right (or even when it does!), so back up first (or use -i.bak, which also has the benefit of being portable to some BSD seds)!
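Here is a minimal sketch of the -i.bak variant on a throwaway file, assuming a sed that accepts a backup suffix attached to -i (GNU sed does, as do some BSD seds). The function name trim_in_place and the temporary file are only for illustration.

```shell
# Illustrative helper: trim every line of the given file in place,
# keeping the untouched original next to it as FILE.bak.
trim_in_place() {
    sed -i.bak 's/^[[:blank:]]*//;s/[[:blank:]]*$//' "$1"
}

tmp=$(mktemp)
printf '  one \n\ttwo\t\n' > "$tmp"

trim_in_place "$tmp"
cat "$tmp"          # the trimmed lines: one, two
od -c "$tmp.bak"    # the original bytes are still in the backup

rm -f "$tmp" "$tmp.bak"
```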
Method 9
An answer you can understand in a glance:
#!/usr/bin/env python3
import sys
for line in sys.stdin:
    print(line.strip())
Bonus: replace str.strip([chars]) with arbitrary characters to trim or use .lstrip() or .rstrip() as needed.
Like rubo77’s answer, save as script /usr/local/bin/trim and give permissions with chmod +x.
Method 10
If the string one is trying to trim is short and continuous/contiguous, one can simply pass it as a parameter to any bash function:
trim(){
echo $@
}
a=" some random string "
echo ">>`trim $a`<<"
Output
>>some random string<<
Method 11
Using Raku (formerly known as Perl_6):
raku -ne '.trim.put;'
Or more simply:
raku -pe '.=trim;'
As a previous answer suggests (thanks, @Jeff_Clayton!), you can create a trim alias in your bash environment:
alias trim="raku -pe '.=trim;'"
Finally, to only remove leading/trailing whitespace (e.g. unwanted indentation), you can use the appropriate trim-leading or trim-trailing command instead.
Method 12
trimpy () {
python3 -c 'import sys
for line in sys.stdin: print(line.strip())'
}
trimsed () {
gsed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
trimzsh () {
local out="$(</dev/stdin)"
[[ "$out" =~ '^\s*(.*\S)\s*$' ]] && out="$match[1]" || out=''
print -nr -- "$out"
}
# example usage
echo " hi " | trimpy
Bonus: replace str.strip([chars]) with arbitrary characters to trim or use .lstrip() or .rstrip() as needed.
Method 13
I wrote this shell function using awk
awkcliptor(){
awk -e 'BEGIN{ RS="^$" } {gsub(/^[\n\t ]*|[\n\t ]*$/,"");print ;exit}' "$1" ; }
BEGIN{ RS="^$" }:
in the beginning before start parsing set record
separator to none i.e. treat the whole input as
a single record
gsub(this,that):
substitute this regexp with that string
/^[\n\t ]*|[\n\t ]*$/:
of that string catch any pre newline space and tab class
or post newline space and tab class and replace them with
empty string
print;exit:
then print and exit
"$1":
and pass the first argument of the function to be
process by awk
how to use:
copy above code , paste in shell, and then enter to
define the function.
then you can use awkcliptor as a command with first argument as the input file
sample usage:
echo '
ggggg
' > a_file
awkcliptor a_file
output:
ggggg
or
echo -e "\n ggggg \n\n "|awkcliptor
output:
ggggg
Method 14
For those of us without enough space in the brain to remember obscure sed syntax, just reverse the string, cut the 1st field with a delimiter of space, and reverse it back again.
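cat file | rev | cut -d' ' -f1 | rev
Note that this keeps only the last space-delimited field of each line: it strips leading spaces from single-word lines, but it drops every other word on multi-word lines and returns an empty line when the input line ends with a space.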
Method 15
You will be adding this to your little Bash library. I can almost bet on it!
This has the benefit of not adding a newline character to the end of your output, as will happen with echo throwing off your expected output. Moreover, these solutions are reusable, do not require modifying the shell options, can be called in-line with your pipelines, and are posix compliant. This is the best answer, by far. Modify to your liking.
Output tested with od -cb, something some of the other solutions might want to do with their output.
BTW: The correct quantifier is the +, not the *, as you want the replacement to be triggered upon 1 or more whitespace characters!
ltrim (that you can pipe input into)
function ltrim ()
{
sed -E 's/^[[:space:]]+//'
}
rtrim (that you can pipe input into)
function rtrim ()
{
sed -E 's/[[:space:]]+$//'
}
trim (the best of both worlds, and yes, you can pipe to it)
function trim ()
{
ltrim | rtrim
}
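The ltrim | rtrim pipeline can also be folded into a single sed process. This is only a sketch of that variant, assuming a sed that supports -E (GNU and BSD seds do); it uses the name() form of function definition, which plain POSIX shells accept as well.

```shell
# One sed call doing both substitutions, so only one process is started.
trim() {
    sed -E 's/^[[:space:]]+//; s/[[:space:]]+$//'
}

printf '  \t both ends \t \n' | trim   # prints: both ends
```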
Method 16
The translate command (tr) would work, although it deletes every space and tab, including those between words:
cat file | tr -d '[:blank:]'
Method 17
for bash example:
alias trim="awk '{\$1=\$1};1'"
usage:
echo -e " hello\t\tkitty " | trim | hexdump -C
result:
00000000  68 65 6c 6c 6f 20 6b 69  74 74 79 0a              |hello kitty.|
0000000c
Method 18
Remove start space and tab and end space and tab:
alias strip='python3 -c "from sys import argv; print(argv[1].strip(\" \t\"))"'
Remove every space and tab:
alias strip='python3 -c "from sys import argv; print(argv[1].replace(\"\t\", \"\").replace(\" \", \"\"))"'
Give the string to strip as an argument. To make it pipeable, use sys.stdin.read() instead of argv.
Method 19
simple enough for my purposes was this:
_text_=" one two three "
echo "$_text_" | { read __ ; echo ."$__". ; }
… giving …
.one two three.
… if you want to squeeze the spaces then …
echo .$( echo $_text_ ).
… gives …
.one two three.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.