I have a problem with the output of a program. I need to launch a command in bash and take its output (a string) and split it to add new lines in certain places. The string looks like this:
battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500
Basically it is a series of xxx.yy.zz: value pairs, but the value may contain spaces.
Here’s the output I’d like to get:
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
I have an idea to search for first dot and then look back from that position for space to put a new line there, but I’m not sure how to achieve it in Bash.
Answers:
Method 1
With GNU sed, you can match each contiguous string (i.e. without whitespace) terminated by : and then place a newline before all but the first one:
sed 's/[^[:space:]]\+:/\n&/g2'
If your version of sed does not support combining the g flag with a number (a GNU extension), you can use a plain g modifier:
sed 's/[^[:space:]]\{1,\}:/\
&/g'
which works the same except that it also prints a newline before the first key. You could use perl -pe 's/\S+:/\n$&/g' with the same proviso (there may be a perl equivalent of GNU sed’s g2, but I don’t know it).
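If the extra leading newline is a problem and GNU sed is not available, one possible variation (a sketch of my own, not part of the answer above) is to match the space in front of each key and replace that space with the newline. The first key has no space before it, so it is left alone. The function name split_keys is just a placeholder, and the sketch assumes that each key is separated from the preceding value by a single space and that no value contains a word ending in a colon:

```shell
# Hypothetical helper: reads one line on stdin, writes one "key: value" per line.
split_keys() {
    # " key:" becomes "<newline>key:"; backslash-newline is the portable
    # way to put a newline into a sed replacement.
    sed 's/ \([^[:space:]]\{1,\}:\)/\
\1/g'
}

printf '%s\n' 'battery.charge: 90 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500' | split_keys
# battery.charge: 90
# device.mfr: MGE UPS SYSTEMS
# device.model: Pulsar Evolution 500
```

Because the function reads standard input, the output of any command can be piped straight into it.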
Method 2
Pure bash solution, no external tools used to process the strings, just parameter expansion:
#! /bin/bash
str='battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500'
# Split on colons: every field after the first holds "value next.key".
IFS=: read -a fields <<< "$str"
for (( i=0 ; i < ${#fields[@]} ; i++ )) ; do
    f=${fields[i]}
    notfirst=$(( i>0 ))
    last=$(( i+1 == ${#fields[@]} ))
    # Unquoted on purpose: word splitting drops the leading space of the value.
    (( notfirst )) && echo -n ${f% *}
    start=('' $'\n' ' ')
    colon=('' ': ')
    echo -n "${start[notfirst + last]}${f##* }${colon[!last]}"
done
echo
Explanation: $notfirst and $last are booleans. The part before the last space ${f% *} isn’t printed for the first field, as there is no such thing. $start and $colon hold various strings that separate the fields: at the first item, notfirst + last is 0, so nothing is prepended, for the rest of the lines, $notfirst is 1, so a newline is printed, and for the last line, the addition gives 2, so a space is printed. Then, the part after the last space is printed ${f##* }. Colon is printed for all lines except the last one.
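To make the two parameter expansions concrete, here is a small sketch that applies them to two of the fields produced by the colon split. The variable names first, middle, value and key are only illustrative and do not appear in the script above:

```shell
first='battery.charge'            # field 0: contains no space
middle=' 90 battery.charge.low'   # field 1: "value next.key"

value=${middle% *}    # remove the shortest suffix matching " *"
key=${middle##* }     # remove the longest prefix matching "* "
printf '[%s] [%s]\n' "$value" "$key"
# [ 90] [battery.charge.low]

# With no space in the field, both expansions return it unchanged,
# which is why the script skips the ${f% *} part for the first field.
printf '[%s] [%s]\n' "${first% *}" "${first##* }"
# [battery.charge] [battery.charge]
```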
Method 3
A perl solution:
$ perl -pe 's{\S+:}{$seen++ ? "\n$&" : "$&"}ge' file
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
Explanation
- \S+: matches a string of non-whitespace characters that ends with :.
- For all matched strings, we insert a newline before them ("\n$&"), except the first one ($seen++).
Method 4
It’s easier using a tool that supports lookarounds:
$ s="battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500"
$ grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
If you wanted the result in an array:
$ IFS=$'\n' foo=($(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s"))
$ for i in "${!foo[@]}"; do echo "$i<==>${foo[i]}"; done
0<==>battery.charge: 90
1<==>battery.charge.low: 30
2<==>battery.runtime: 3690
3<==>battery.voltage: 230.0
4<==>device.mfr: MGE UPS SYSTEMS
5<==>device.model: Pulsar Evolution 500
EDIT: Explanation of the regex:
'\S+:\s+.*?(?=\s+\S+:|$)'
- \S+ matches one or more non-whitespace characters
- : matches :
- \s+ matches one or more spaces after the :
- .*? denotes a non-greedy match
- (?=\s+\S+:|$) is a lookahead assertion to determine if there is:
  - one or more spaces followed by a string (non-whitespace characters) and a colon, or
  - end of string
So the string is split into parts like battery.charge: 90, … device.mfr: MGE UPS SYSTEMS, …
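For the array variant, a possible alternative is mapfile, which reads one array element per line. Unlike the IFS=$'\n' foo=($(...)) form, it does not leave IFS changed in the current shell and does not subject the values to pathname expansion. This is only a sketch: it assumes bash 4 or later (for mapfile) and a grep built with -P support, and the array name pairs is my own choice:

```shell
s='battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500'

# -t strips the trailing newline from each line before storing it.
mapfile -t pairs < <(grep -oP '\S+:\s+.*?(?=\s+\S+:|$)' <<< "$s")

for i in "${!pairs[@]}"; do
    printf '%s<==>%s\n' "$i" "${pairs[i]}"
done
```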
Method 5
Here’s a naive approach that should work assuming you don’t care that tabs and newlines in the input (if any) are converted to plain spaces.
The idea is simple: split the input on whitespace and print every token, prepending a newline to tokens that end with : (and re-adding a space in front of the others). The $count variable and the related if only serve to prevent an initial empty line; they could be removed if that’s not a problem. (The script assumes the input is in a file called input in the current directory.)
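#! /bin/bash
count=0
# Unquoted $(<input) is split on whitespace into individual tokens.
for i in $(<input) ; do
    fmt=
    if [[ $i =~ :$ ]] ; then
        # A token ending in ":" is a key and starts a new line (except the first).
        if [[ $count -gt 0 ]] ; then
            fmt="\n%s"
        else
            fmt="%s"
        fi
        ((count++))
    else
        fmt=" %s"
    fi
    printf "$fmt" "$i"
done
echo
echo "Num items: $count"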
I hope someone can come up with a nicer alternative though.
$ cat input
battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500
$ ./t.sh
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
Num items: 6
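Since the question is about processing the output of a command rather than a file, the same token-by-token idea can be wrapped in a function that reads standard input. The following is a sketch under my own assumptions, not the answerer’s code: the name split_pairs is hypothetical, and read -a is used instead of an unquoted $(<input) so that tokens such as * are not expanded as file names.

```shell
# Hypothetical helper: reads whitespace-separated tokens from stdin and
# starts a new line before every token that ends with ":".
split_pairs() {
    local tok out=''
    local -a toks
    # -d '' reads up to end of input; the array is filled even though
    # read returns non-zero when it hits EOF.
    read -r -d '' -a toks
    for tok in "${toks[@]}"; do
        if [[ -z $out ]]; then
            out=$tok                 # very first token: no separator
        elif [[ $tok == *: ]]; then
            out+=$'\n'$tok           # key: start a new line
        else
            out+=" $tok"             # part of a value: re-add the space
        fi
    done
    printf '%s\n' "$out"
}

printf '%s\n' 'battery.charge: 90 device.mfr: MGE UPS SYSTEMS' | split_pairs
# battery.charge: 90
# device.mfr: MGE UPS SYSTEMS
```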
Method 6
You can use awk(1) with the following script split.awk:
BEGIN { RS=" "; first=1; }
first { first=0; printf "%s", $1; next; }
/[a-z]+\.[^:]+:/ { printf "\n%s", $1; next; }
{ printf " %s", $1 }
END { printf "\n" }
When you run
awk -f split.awk input.dat
you will get
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
The idea is to let awk split the input whenever it sees a space (by setting the record separator RS in line 1). It then matches xxx.yy.zz: keys in lines 2 and 3 (distinguishing the very first match from subsequent ones), while line 4 matches whenever lines 2 and 3 do not. Line 5 just prints the final newline.
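To see what setting RS to a space does, this small sketch prints each record with its record number. The function name show_records and the shortened sample input are only for illustration:

```shell
# Hypothetical helper: print every space-separated record with its number.
show_records() {
    # Using $1 instead of $0 drops the newline that ends the last record.
    awk 'BEGIN { RS=" " } { print NR ": " $1 }'
}

printf 'a.b: 1 c.d: 2\n' | show_records
# 1: a.b:
# 2: 1
# 3: c.d:
# 4: 2
```

Every word becomes its own record, so the script only has to decide, record by record, whether to print a newline or a space in front of it.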
Method 7
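Here is a short awk script to demo awk’s power. It relies on patsplit(), which is a GNU awk (gawk) extension.
awk '
len=patsplit($0, namesArr, "[^ :]+: ", valuesArr) {
    # namesArr[1..len] holds the keys; valuesArr[i] holds the text after key i
    for (i = 1; i <= len; i++)
        print namesArr[i] valuesArr[i]
}' input.txt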
input.txt
battery.charge: 90 battery.charge.low: 30 battery.runtime: 3690 battery.voltage: 230.0 device.mfr: MGE UPS SYSTEMS device.model: Pulsar Evolution 500
output:
battery.charge: 90
battery.charge.low: 30
battery.runtime: 3690
battery.voltage: 230.0
device.mfr: MGE UPS SYSTEMS
device.model: Pulsar Evolution 500
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.