Parse XML to get node value in bash script?

I would like to know how I can get the value of a node with the following paths:

config/global/resources/default_setup/connection/host
config/global/resources/default_setup/connection/username
config/global/resources/default_setup/connection/password
config/global/resources/default_setup/connection/dbname

from the following XML:

<?xml version="1.0"?>
<config>
    <global>
        <install>
            <date><![CDATA[Tue, 11 Dec 2012 12:31:25 +0000]]></date>
        </install>
        <crypt>
            <key><![CDATA[70e75d7969b900b696785f2f81ecb430]]></key>
        </crypt>
        <disable_local_modules>false</disable_local_modules>
        <resources>
            <db>
                <table_prefix><![CDATA[]]></table_prefix>
            </db>
            <default_setup>
                <connection>
                    <host><![CDATA[localhost]]></host>
                    <username><![CDATA[root]]></username>
                    <password><![CDATA[pass123]]></password>
                    <dbname><![CDATA[testdb]]></dbname>
                    <initStatements><![CDATA[SET NAMES utf8]]></initStatements>
                    <model><![CDATA[mysql4]]></model>
                    <type><![CDATA[pdo_mysql]]></type>
                    <pdoType><![CDATA[]]></pdoType>
                    <active>1</active>
                </connection>
            </default_setup>
        </resources>
        <session_save><![CDATA[files]]></session_save>
    </global>
    <admin>
        <routers>
            <adminhtml>
                <args>
                    <frontName><![CDATA[admin]]></frontName>
                </args>
            </adminhtml>
        </routers>
    </admin>
</config>

I also want to assign each value to a variable for further use. Let me know your ideas.

Answers:

Method 1

Using bash and xmllint (as given by the tags):

xmllint --version  #  xmllint: using libxml version 20703

# Note: Newer versions of libxml / xmllint have a --xpath option which 
# makes it possible to use xpath expressions directly as arguments. 
# --xpath also enables precise output in contrast to the --shell & sed approaches below.
#xmllint --help 2>&1 | grep -i 'xpath'

{
# the given XML is in file.xml
host="$(echo "cat /config/global/resources/default_setup/connection/host/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
username="$(echo "cat /config/global/resources/default_setup/connection/username/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
password="$(echo "cat /config/global/resources/default_setup/connection/password/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
dbname="$(echo "cat /config/global/resources/default_setup/connection/dbname/text()" | xmllint --nocdata --shell file.xml | sed '1d;$d')"
printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
}

# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb

In case there is just an XML string and the use of a temporary file is to be avoided, file descriptors are the way to go with xmllint (which is given /dev/fd/3 as a file argument here):

set +H
{
xmlstr='<?xml version="1.0"?>
<config>
    <global>
        <install>
            <date><![CDATA[Tue, 11 Dec 2012 12:31:25 +0000]]></date>
        </install>
        <crypt>
            <key><![CDATA[70e75d7969b900b696785f2f81ecb430]]></key>
        </crypt>
        <disable_local_modules>false</disable_local_modules>
        <resources>
            <db>
                <table_prefix><![CDATA[]]></table_prefix>
            </db>
            <default_setup>
                <connection>
                    <host><![CDATA[localhost]]></host>
                    <username><![CDATA[root]]></username>
                    <password><![CDATA[pass123]]></password>
                    <dbname><![CDATA[testdb]]></dbname>
                    <initStatements><![CDATA[SET NAMES utf8]]></initStatements>
                    <model><![CDATA[mysql4]]></model>
                    <type><![CDATA[pdo_mysql]]></type>
                    <pdoType><![CDATA[]]></pdoType>
                    <active>1</active>
                </connection>
            </default_setup>
        </resources>
        <session_save><![CDATA[files]]></session_save>
    </global>
    <admin>
        <routers>
            <adminhtml>
                <args>
                    <frontName><![CDATA[admin]]></frontName>
                </args>
            </adminhtml>
        </routers>
    </admin>
</config>
'

# exec issue
#exec 3<&- 3<<<"$xmlstr"
#exec 3<&- 3< <(printf '%s' "$xmlstr")
exec 3<&- 3<<EOF
$(printf '%s' "$xmlstr")
EOF

{ read -r host; read -r username; read -r password; read -r dbname; } < <(
       echo "cat /config/global/resources/default_setup/connection/*[self::host or self::username or self::password or self::dbname]/text()" | 
          xmllint --nocdata --shell /dev/fd/3 | 
          sed -e '1d;$d' -e '/^ *--* *$/d'
       )

printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"

exec 3<&-
}
set -H


# output
# host: localhost
# username: root
# password: pass123
# dbname: testdb

Method 2

Although there are a lot of answers already, I’ll chime in with xml2.

$ xml2 < test.xml
/config/global/install/date=Tue, 11 Dec 2012 12:31:25 +0000
/config/global/crypt/key=70e75d7969b900b696785f2f81ecb430
/config/global/disable_local_modules=false
/config/global/resources/db/table_prefix
/config/global/resources/default_setup/connection/host=localhost
/config/global/resources/default_setup/connection/username=root
/config/global/resources/default_setup/connection/password=pass123
/config/global/resources/default_setup/connection/dbname=testdb
/config/global/resources/default_setup/connection/initStatements=SET NAMES utf8
/config/global/resources/default_setup/connection/model=mysql4
/config/global/resources/default_setup/connection/type=pdo_mysql
/config/global/resources/default_setup/connection/pdoType
/config/global/resources/default_setup/connection/active=1
/config/global/session_save=files
/config/admin/routers/adminhtml/args/frontName=admin

With a little magic you can even set those as variables directly:

$ eval $(xml2 < test.xml | tr '/, ' '___' | grep =)
$ echo $_config_global_resources_default_setup_connection_host          
localhost
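
If you would rather not eval the parser output, a more cautious variant is to read the path=value lines into an associative array. This is only a sketch: it assumes bash 4 or newer (for declare -A), the array name cfg is made up for illustration, and a here-document with a few of the lines shown above stands in for the real xml2 output.

```bash
#!/usr/bin/env bash
# Sketch: load xml2-style "path=value" lines into an associative array
# instead of eval-ing them. Requires bash 4+ for `declare -A`.
# The array name `cfg` is an assumption made for this example.
declare -A cfg

# In real use the loop would read from:  < <(xml2 < test.xml)
# Here a few lines of the output shown above stand in for it.
while IFS='=' read -r path value; do
  # Lines without "=value" (empty elements) are skipped.
  [[ -n $value ]] || continue
  cfg[$path]=$value
done <<'EOF'
/config/global/install/date=Tue, 11 Dec 2012 12:31:25 +0000
/config/global/resources/db/table_prefix
/config/global/resources/default_setup/connection/host=localhost
/config/global/resources/default_setup/connection/username=root
EOF

host=${cfg[/config/global/resources/default_setup/connection/host]}
printf 'host: %s\n' "$host"
```

Because nothing is evaluated as shell code, values containing spaces, quotes or $ are stored literally.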

Method 3

Using xmllint and the --xpath option, it is very easy. You can simply do this:

XML_FILE=/path/to/file.xml

HOST=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/host)' "$XML_FILE")
USERNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/username)' "$XML_FILE")
PASSWORD=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/password)' "$XML_FILE")
DBNAME=$(xmllint --xpath 'string(/config/global/resources/default_setup/connection/dbname)' "$XML_FILE")

If you need to get to an element’s attribute, that’s also easy using XPath. Imagine you have the file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<addon id="screensaver.turnoff"
       name="Turn Off"
       version="0.10.0"
       provider-name="Dag Wieërs">
  ..snip..
</addon>

The needed shell statements would be:

VERSION=$(xmllint --xpath 'string(/addon/@version)' $ADDON_XML)
AUTHOR=$(xmllint --xpath 'string(/addon/@provider-name)' $ADDON_XML)

Method 4

The following works when run against your test data:

{ read -r host; read -r username; read -r password; read -r dbname; } \
  < <(xmlstarlet sel -t -m /config/global/resources/default_setup/connection \
      -v ./host -n \
      -v ./username -n \
      -v ./password -n \
      -v ./dbname -n)

This puts the content into variables host, username, password and dbname.

Method 5

A pure bash function, just for the unfortunate case when you are not allowed to install anything appropriate. This may, and probably will, fail on more complicated XML:

function xmlpath()
{
  local expr="${1//\// }"
  local path=()
  local chunk tag data

  while IFS='' read -r -d '<' chunk; do
    IFS='>' read -r tag data <<< "$chunk"

    case "$tag" in
      '?'*) ;;
      '!--'*) ;;
      '![CDATA['*) data="${tag:8:${#tag}-10}" ;;
      ?*'/') ;;
      '/'?*) unset path[${#path[@]}-1] ;;
      ?*) path+=("$tag") ;;
    esac

    [[ "${path[@]}" == "$expr" ]] && echo "$data"
  done
}

Usage:

bash-4.1$ xmlpath 'config/global/resources/default_setup/connection/host' < MagePsycho.xml
localhost

Known issues:

  • slow
  • searches only by tag names
  • no character entity decoding
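
The last issue can be worked around for simple cases. The helper below is a hypothetical addition, not part of the original answer: xml_unescape is a made-up name, and it only decodes the five predefined XML entities (numeric references such as &#38; are left untouched).

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original answer): decode the five
# predefined XML entities in a string using only bash parameter expansion.
xml_unescape() {
  local s=$1
  local dq='"' sq="'"
  s=${s//&lt;/<}
  s=${s//&gt;/>}
  s=${s//&quot;/$dq}
  s=${s//&apos;/$sq}
  # &amp; must be decoded last so that "&amp;lt;" becomes "&lt;", not "<".
  # The backslash keeps "&" literal in bash 5.2+, where an unquoted "&"
  # in the replacement would otherwise stand for the matched text.
  s=${s//&amp;/\&}
  printf '%s\n' "$s"
}

xml_unescape 'a &lt; b &amp;&amp; c'   # prints: a < b && c
```

You could pipe the output of xmlpath through such a helper, or call it on the captured value before using it.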

Method 6

Using xq (from https://kislyuk.github.io/yq/) to just get those strings out:

#!/bin/sh

set -- \
        config/global/resources/default_setup/connection/host \
        config/global/resources/default_setup/connection/username \
        config/global/resources/default_setup/connection/password \
        config/global/resources/default_setup/connection/dbname

IFS=:

xq -r --arg path_string "$*" \
        'getpath(($path_string | split(":") | map(split("/")))[])' file.xml

This gives the path expressions to xq as a :-delimited list in the variable $path_string. This string is subsequently split into its constituent paths, and these are then further split into path elements, so that one path internally may look like

[
  "config",
  "global",
  "resources",
  "default_setup",
  "connection",
  "dbname"
]

The path arrays are given to the getpath() function which extracts the values located at those paths.

The output, for the given XML document, will be

localhost
root
pass123
testdb
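
The :-delimited list comes from the way the shell expands "$*": the positional parameters are joined with the first character of IFS. A minimal sketch of just that step, with no xq involved (the variable names path_string and old_ifs are only for illustration):

```bash
#!/bin/sh
# Sketch: "$*" joins the positional parameters using the first character
# of IFS as the separator.
set -- \
        config/global/resources/default_setup/connection/host \
        config/global/resources/default_setup/connection/username

old_ifs=$IFS        # remember the current IFS so it can be restored
IFS=:
path_string="$*"    # host path and username path, joined with ":"
IFS=$old_ifs

printf '%s\n' "$path_string"
```

Restoring IFS afterwards (or setting it inside a subshell or command substitution) keeps the change from affecting word splitting in the rest of the script.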

Creating shell assignment statements instead:

#!/bin/sh

set -- \
        config/global/resources/default_setup/connection/host \
        config/global/resources/default_setup/connection/username \
        config/global/resources/default_setup/connection/password \
        config/global/resources/default_setup/connection/dbname

eval "$(
    IFS=:
    xq -r --arg path_string "$*" '
            ($path_string | split(":") | map(split("/"))[]) as $path |
            "\($path[-1])=\(getpath($path)|@sh)"' file.xml
)"

printf 'host = "%s"\n' "$host"
printf 'user = "%s"\n' "$username"
printf 'pass = "%s"\n' "$password"
printf 'database = "%s"\n' "$dbname"

For the given paths and XML document, the xq statement above would create the output

host='localhost'
username='root'
password='pass123'
dbname='testdb'

This would be safe to eval to assign the host, username, password, and dbname shell variables.

The output of the script would be

host = "localhost"
user = "root"
pass = "pass123"
database = "testdb"

Method 7

You can call the PHP command-line interface from a bash script to handle tasks that are awkward in shell and would span many lines of code, such as XML parsing. First write your solution as a PHP script, then pass it any parameters it needs in CLI mode. This gives you access to PHP's XML parsers.

This assumes your environment lets you run PHP in CLI mode via ssh/shell access.

php -f yourxmlparser.php

Do all the parsing within your PHP file, and make use of the command-line parameters it can take.

You can then assign the values it returns to shell variables and continue with the rest of your shell script.

The other way is to use grep to match the required value within the XML file, if you are sure that the structure of the file will not change over time.

Method 8

This answer uses only sh/bash commands.
/test.xml is the XML file from the question.

#!/bin/sh

cat /test.xml | while read line;do
[ "$(echo "$line" | grep "<host>")" ]&& echo "host: $(echo $line |  cut -f3 -d'[' | cut -f1 -d']')"
[ "$(echo "$line" | grep "<username>")" ]&& echo "username: $(echo $line |  cut -f3 -d'[' | cut -f1 -d']')"
[ "$(echo "$line" | grep "<password>")" ]&& echo "password: $(echo $line |  cut -f3 -d'[' | cut -f1 -d']')"
[ "$(echo "$line" | grep "<dbname")" ]&& echo "dbname: $(echo $line |  cut -f3 -d'[' | cut -f1 -d']')"
done

output:

host: localhost
username: root
password: pass123
dbname: testdb
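
Because the while loop above runs on the receiving end of a pipe, anything assigned inside it is lost when the loop ends. A variant that keeps the values in shell variables could look like the sketch below. It is bash-only (it uses [[ =~ ]] and printf -v), assumes each element sits on one line with its value wrapped in CDATA, and uses a here-document with a few lines of the question's XML in place of the real file.

```bash
#!/usr/bin/env bash
# Sketch: same line-based idea, but the values end up in shell variables.
# Assumes one element per line and CDATA-wrapped values, as in the question.
re='<(host|username|password|dbname)><!\[CDATA\[(.*)\]\]></'

# In real use, replace the here-document with:  done < /test.xml
while IFS= read -r line; do
  if [[ $line =~ $re ]]; then
    # BASH_REMATCH[1] is the tag name, BASH_REMATCH[2] the CDATA content.
    # The tag name is restricted by the regex, so it is safe as a variable name.
    printf -v "${BASH_REMATCH[1]}" '%s' "${BASH_REMATCH[2]}"
  fi
done <<'EOF'
                    <host><![CDATA[localhost]]></host>
                    <username><![CDATA[root]]></username>
                    <password><![CDATA[pass123]]></password>
                    <dbname><![CDATA[testdb]]></dbname>
                    <model><![CDATA[mysql4]]></model>
EOF

printf '%s\n' "host: $host" "username: $username" "password: $password" "dbname: $dbname"
```

Redirecting the file into the loop, rather than piping cat into it, is what lets the variables survive after done.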

If you want to write these values to files, use this method:

#!/bin/sh

cat /test.xml | while read line;do
[ "$(echo "$line" | grep "<host>")" ]&& echo "$line" |  cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/host
[ "$(echo "$line" | grep "<username>")" ]&& echo "$line" |  cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/username
[ "$(echo "$line" | grep "<password>")" ]&& echo "$line" |  cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/password
[ "$(echo "$line" | grep "<dbname")" ]&& echo "$line" |  cut -f3 -d'[' | cut -f1 -d']' > /config/global/resources/default_setup/connection/dbname
done

Note that this method overwrites the output files each time it runs, so any data previously stored in them will be lost.

Method 9

Using Raku (formerly known as Perl_6):

I recognize the OP requested a bash script, but since other answers have deviated from this requirement, here’s a Raku solution (4 one-liners):

raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<host>).>>.cdata>>.data.put;'
raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<username>).>>.cdata>>.data.put;'
raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<password>).>>.cdata>>.data.put;'
raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); $xml.lookfor(:TAG<dbname>).>>.cdata>>.data.put;'

OUTPUT:

localhost
root
pass123
testdb

Briefly, Raku is called at the bash command line, and -M module XML is loaded with the command -MXML. The xml file is opened with open-xml and stored in the $xml object. Then the $xml object is queried recursively for desired tags [ in point of fact, the lookfor(...) code is a shortcut for elements(..., :RECURSE) ]. Then the CDATA values are extracted.

There are other ways to get the desired data, such as simply walking the XML-parse tree:

raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); .cdata.map(*.data).put for $xml.nodes[1].nodes[7].nodes[3].nodes[1].nodes[1,3,5,7];'

Which can be simplified to:

raku -MXML -e 'my $xml=open-xml($*ARGFILES.Str); .cdata>>.data.put for $xml.nodes[1][7][3][1][1,3,5,7];'

The two lines of code above each return:

localhost
root
pass123
testdb

https://github.com/raku-community-modules/XML
https://raku.org/
[For alternative solutions in Raku, there’s also the LibXML module, which provides bindings to the (possibly faster) libxml2 library. See https://modules.raku.org/dist/LibXML:cpan:WARRINGD].

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
