I have a file that includes details about VMs running in a hypervisor. We run a command and redirect its output to a file, and the data is available in the format below.
Virtual Machine : OL6U5
ID : 0004fb00000600003da8ce6948c441bb
Status : Running
Memory : 65536
Uptime : 17835 Minutes
Server : MyOVS1.vmorld.com
Pool : HA-POOL
HA Mode: false
VCPU : 16
Type : Xen PVM
OS : Oracle Linux 6
Virtual Machine : OL6U6
ID : 0004fb00000600003da8ce6948c441bc
Status : Running
Memory : 65536
Uptime : 17565 Minutes
Server : MyOVS2.vmorld.com
Pool : NON-HA-POOL
HA Mode: false
VCPU : 16
Type : Xen PVM
OS : Oracle Linux 6
Virtual Machine : OL6U7
ID : 0004fb00000600003da8ce6948c441bd
Status : Running
Memory : 65536
Uptime : 17835 Minutes
Server : MyOVS1.vmorld.com
Pool : HA-POOL
HA Mode: false
VCPU : 16
Type : Xen PVM
OS : Oracle Linux 6
This output differs from hypervisor to hypervisor, since on some hypervisors we have 50+ VMs running. The file above is just an example from a hypervisor where only 3 VMs are running, so the redirected file is expected to contain information about several VMs (N of them).
We need to get these details into the format below using awk/sed or a shell script:
Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool         HA     VCPU  Type     OS
OL6U5            0004fb00000600003da8ce6948c441bb  Running  65536   17835   MyOVS1.vmworld.com  HA-POOL      false  16    Xen PVM  Oracle Linux 6
OL6U6            0004fb00000600003da8ce6948c441bc  Running  65536   17565   MyOVS2.vmworld.com  NON-HA-POOL  false  16    Xen PVM  Oracle Linux 6
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17835   MyOVS1.vmworld.com  HA-POOL      false  16    Xen PVM  Oracle Linux 6
Answers:
Method 1
If you have the rs (reshape) utility available, you can do the following:
rs -Tzc: < input.txt
This gives the output format exactly as specified in the question, even down to the dynamic column widths.
-T transposes the input data
-z sizes the columns appropriately from the max in each column
-c: uses colon as the input field separator
This works for arbitrarily sized tables, e.g.:
$ echo "Name:Alice:Bob:Carol
Age:12:34:56
Eyecolour:Brown:Black:Blue" | rs -Tzc:
Name   Age  Eyecolour
Alice  12   Brown
Bob    34   Black
Carol  56   Blue
$
rs is available by default on OS X (and likely on other BSD machines). It can be installed on Ubuntu (and the rest of the Debian family) with:
sudo apt-get install rs
Method 2
EDIT: Extensible to any number of output rows, in a simple one-liner for loop:
for ((i=1;i<=2;i++)); do cut -d: -f "$i" input | paste -sd: ; done | column -t -s:
Original answer:
You can do this as a one-liner using bash process substitution:
paste -sd: <(cut -d: -f1 input) <(cut -d: -f2 input) | column -t -s:
The -s option to paste makes it handle each file one at a time. The : delimiter set in paste is “caught” by the -s option to column at the end, to pretty up the format by making the fields line up.
The cut commands in the two process substitutions pull out the first field and the second field, respectively.
Whether there are blank lines in the input or not doesn’t matter, as column -t -s: will clean up the output regardless. (There were blank lines in the original input specified in the question, but they’ve since been removed. The above command works regardless of blank lines.)
Input – contents of file named “input” in above command:
Virtual_Machine:OL6U7
ID:0004fb00000600003da8ce6948c441bd
Status:Running
Memory:65536
Uptime:17103
Server:MyOVS1.vmworld.com
Pool:HA-POOL
HA:false
VCPU:16
Type:Xen PVM
OS:Oracle Linux 6
Output:
Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6
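The for loop in this method hardcodes the number of fields (2). A minimal sketch of the same cut/paste idea that reads the field count from the first line instead is shown below. The helper name transpose_colon is hypothetical, and the sketch assumes every line has the same number of colon-separated fields as the first one.

```shell
#!/bin/sh
# Sketch: transpose a colon-separated file with cut and paste, one pass per field.
# transpose_colon is a hypothetical helper name, not part of the original answer.
transpose_colon() {
    file=$1
    # Assumption: the first line has as many fields as every other line.
    nfields=$(head -n 1 "$file" | awk -F: '{ print NF }')
    nfields=${nfields:-0}   # an empty file yields no output
    i=1
    while [ "$i" -le "$nfields" ]; do
        # Field i of every line, joined into one colon-separated row.
        cut -d: -f"$i" "$file" | paste -sd: -
        i=$((i + 1))
    done
}

# Demo on a throwaway file.
demo=$(mktemp)
printf 'Name:Alice\nAge:12\nEyecolour:Brown\n' > "$demo"
transpose_colon "$demo"
rm -f "$demo"
```

As in the answer above, the result can be piped through column -t -s: to line the fields up. The file is still read once per field, so this trades speed for simplicity.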
Method 3
If walking the file twice is not a (big) problem (will store only one line in memory):
awk -F : '{printf("%s\t ", $1)}' infile
echo
awk -F : '{printf("%s\t ", $2)}' infile
For a general number of fields, this becomes the following (which may walk the file many times):
#!/bin/bash
rowcount=2
for (( i=1; i<=rowcount; i++ )); do
    awk -v i="$i" -F : '{printf("%s\t ", $i)}' infile
    echo
done
But for a really general transpose, this will work:
awk '$0!~/^$/{ i++;
        n=split($0,arr,":");
        for (j=1; j<=n; j++) {             # numeric loop; "for (j in arr)" yields string indices
            out[i,j]=arr[j];
        }
        if (maxr<n){ maxr=n }              # max number of output rows.
    }
    END {
        maxc=i                             # max number of output columns.
        for (j=1; j<=maxr; j++) {
            for (i=1; i<=maxc; i++) {
                printf( "%s\t", out[i,j])  # out field separator.
            }
            printf( "%s\n","" )
        }
    }' infile
And to make it pretty (using tab \t as the output field separator):
./script | column -t -s $'\t'
Virtual_Machine  ID                                Status   Memory  Uptime  Server              Pool     HA     VCPU  Type     OS
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17103   MyOVS1.vmworld.com  HA-POOL  false  16    Xen PVM  Oracle Linux 6
The code above for a general transpose will store the whole matrix in memory.
That could be a problem for really big files.
Update for new text.
To process the new text posted in the question, it seems to me that two passes of awk are the best answer. The first pass, which runs only until the field names start to repeat, prints the header titles. The second pass prints only field 2. In both cases, I added a way to remove leading and trailing spaces (for better formatting).
#!/bin/bash
{
awk -F: 'BEGIN{ sl="Virtual Machine"}
         $1~sl && head == 1 { head=0; exit 0}
         $1~sl && head == 0 { head=1; }
         head == 1 {
             gsub(/^[ \t]+/,"",$1);   # remove leading spaces
             gsub(/[ \t]+$/,"",$1);   # remove trailing spaces
             printf( "%s\t", $1)
         }
         ' infile
#echo
awk -F: 'BEGIN { sl="Virtual Machine"}
         $1~sl { printf( "%s\n", "") }
         {
             gsub(/^[ \t]+/,"",$2);   # remove leading spaces
             gsub(/[ \t]+$/,"",$2);   # remove trailing spaces
             printf( "%s\t", $2)
         }
         ' infile
echo
} | column -t -s "$(printf '%b' '\t')"
The surrounding { ... } | column -t -s "$(printf '%b' '\t')" is there to format the whole table in a pretty way.
Please note that the "$(printf '%b' '\t')" could be replaced with $'\t' in ksh, bash, or zsh.
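If reading the file twice is undesirable, the same table can probably be built in one awk pass. The following is only a sketch, not part of the original answer: the function name vm_table is hypothetical, and it assumes every record starts with a "Virtual Machine" line and lists the same keys in the same order. It splits each line at the first colon and keeps one output row per VM in memory.

```shell
#!/bin/sh
# Sketch: single-pass conversion of "key : value" records into a tab-separated table.
# vm_table is a hypothetical name; a new row starts at each "Virtual Machine" line.
vm_table() {
    awk '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    index($0, ":") == 0 { next }              # skip blank lines and lines without a colon
    {
        pos = index($0, ":")                  # split at the first colon only
        key = trim(substr($0, 1, pos - 1))
        val = trim(substr($0, pos + 1))
        if (key == "Virtual Machine") nrec++  # start of a new record
        sep = (ncol[nrec]++ ? "\t" : "")      # no separator before the first column
        if (nrec == 1) header = header sep key
        row[nrec] = row[nrec] sep val
    }
    END {
        if (nrec == 0) exit
        print header
        for (r = 1; r <= nrec; r++) print row[r]
    }' "$1"
}

# Demo on a throwaway two-VM file.
demo=$(mktemp)
printf 'Virtual Machine : OL6U5\nStatus : Running\nVirtual Machine : OL6U6\nStatus : Running\n' > "$demo"
vm_table "$demo"
rm -f "$demo"
```

The output can be aligned in the same way as above, for example vm_table infile | column -t -s "$(printf '\t')". Values are kept verbatim, so an Uptime of "17835 Minutes" keeps its unit.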
Method 4
declare -a COLS
declare -a DATA
# Split each "key:value" line on the colon and collect both halves.
while IFS=':' read -ra fields; do
    COLS+=("${fields[0]}")
    DATA+=("${fields[1]}")
done < /path/to/input.txt
HEADER=""
ROW=""
# Build the header line from the keys and the data line from the values.
for i in "${!COLS[@]}"; do
    HEADER="${HEADER}${COLS[$i]} "
    ROW="${ROW}${DATA[$i]} "
done
echo "$HEADER"
echo "$ROW"
Method 5
Using awk, store off the key and value and print them out in the end.
#!/usr/bin/awk -f
BEGIN {
    CNT=0
    FS=":"
}
{
    HDR[CNT]=$1;
    ENTRY[CNT]=$2;
    CNT++;
}
END {
    for (x = 0; x < CNT; x++)
        printf "%s\t",HDR[x]
    print ""
    for (x = 0; x < CNT; x++)
        printf "%s\t",ENTRY[x]
}
Then just run awk -f ./script.awk ./input.txt
Method 6
With gnu datamash and column from util-linux:
datamash -t: transpose <infile | column -t -s:
This works with more than two columns but assumes there are no empty lines in your input file; with empty lines in between (like in your initial input sample) you would get an error like:
datamash: transpose input error: line 2 has 0 fields (previous lines had 2);
so to avoid that you’ll have to squeeze them before processing with datamash:
tr -s '\n' <infile | datamash -t: transpose | column -t -s:
Otherwise, in this particular case (only two columns), with zsh and the same column:
list=(${(f)"$(<infile)"})
printf '%s\n' ${(j;:;)list[@]%:*} ${(j;:;)list[@]#*:} | column -t -s:
(${(f)"$(<infile)"}) reads the lines in an array; ${(j;:;)list[@]%:*} joins (with :) the first field of each element and ${(j;:;)list[@]#*:} joins (again with :) the second field of each element; these are both printed, e.g. the output is
Virtual_Machine:ID:Status:Memory:Uptime:Server:Pool:HA:VCPU:Type:OS
OL6U7:0004fb00000600003da8ce6948c441bd:Running:65536:17103:MyOVS1.vmworld.com:HA-POOL:false:16:Xen PVM:Oracle Linux 6
which is then piped to column -t -s:
Method 7
cat <(head -n 11 virtual.txt | cut -d: -f1) <(sed 's/.*: //' virtual.txt) | xargs -d '\n' -n 11 | column -t
The number of lines per Virtual Machine is hardcoded in this case: 11. It would be better to count it beforehand, store it in a variable, and use that variable in the code.
Explanation
- cat <(command 1) <(command 2): the <() construction makes a command's output appear like a temporary file. Therefore, cat concatenates the two files and pipes the result further.
  - command 1: head -n 11 virtual.txt | cut -d: -f1 gives us the future column headers. One Virtual Machine entry is the first eleven lines, and head is used to get it. cut splits this entry into two columns and prints only the first one.
  - command 2: sed 's/.*: //' virtual.txt gives us the future column values. sed removes all unneeded text and leaves only the values.
- xargs -d '\n' -n 11: each input item is terminated by a newline. This command takes the items and prints them 11 per line.
- column -t is needed for pretty-printing. It displays our lines in table form. Otherwise, each line would have a different width.
Output
Virtual Machine  ID                                Status   Memory  Uptime         Server             Pool         HA Mode  VCPU  Type     OS
OL6U5            0004fb00000600003da8ce6948c441bb  Running  65536   17835 Minutes  MyOVS1.vmorld.com  HA-POOL      false    16    Xen PVM  Oracle Linux 6
OL6U6            0004fb00000600003da8ce6948c441bc  Running  65536   17565 Minutes  MyOVS2.vmorld.com  NON-HA-POOL  false    16    Xen PVM  Oracle Linux 6
OL6U7            0004fb00000600003da8ce6948c441bd  Running  65536   17835 Minutes  MyOVS1.vmorld.com  HA-POOL      false    16    Xen PVM  Oracle Linux 6
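The hardcoded 11 in this method could be derived from the file itself. One possible sketch follows; the function name lines_per_vm is hypothetical, and it assumes each record begins with a line starting with "Virtual Machine" and that there are no blank lines between records.

```shell
#!/bin/sh
# Sketch: count how many lines make up the first VM record.
# lines_per_vm is a hypothetical helper; it stops at the second "Virtual Machine" line.
lines_per_vm() {
    awk '
    /^Virtual Machine/ && ++seen == 2 { print NR - 1; found = 1; exit }
    END { if (!found) print NR }   # only one record: the whole file is that record
    ' "$1"
}

# Demo on a throwaway file with two records of three lines each.
demo=$(mktemp)
printf 'Virtual Machine : A\nID : 1\nStatus : Running\nVirtual Machine : B\nID : 2\nStatus : Running\n' > "$demo"
lines_per_vm "$demo"
rm -f "$demo"
```

The result can then replace both occurrences of 11, for example n=$(lines_per_vm virtual.txt) followed by head -n "$n" and xargs -d '\n' -n "$n" in the pipeline above.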
Method 8
Use datamash and its transpose option to swap rows and columns in a file.
datamash -t: transpose < infile.txt
By default, transpose verifies that the input has the same number of fields in each line and fails with an error otherwise. You can disable this strict mode with --no-strict to allow missing values:
datamash -t: --no-strict transpose < infile.txt
Also you can use --filler to set the missing-field filler value:
datamash -t: --no-strict --filler " " transpose < infile.txt
derived from datamash manual
Method 9
If your data is in separate files in a directory, you can use:
for file in "$DIRECTORY"/*
do
    # Read each file line by line and keep everything after the first colon.
    while IFS= read -r line
    do
        value=$(printf '%s\n' "$line" | cut -d: -f2-)
        printf "%s\t" "${value}" >> bigfile
    done < "$file"
    echo " " >> bigfile
done
You may need to massage the number of \t (tab) characters on the printf line if your variable values are of different lengths.
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0.