Parse one field from a JSON array into a bash array

I have a JSON output that contains a list of objects stored in a variable. (I may not be phrasing that right)

[
  {
    "item1": "value1",
    "item2": "value2",
    "sub items": [
      {
        "subitem": "subvalue"
      }
    ]
  },
  {
    "item1": "value1_2",
    "item2": "value2_2",
    "sub items_2": [
      {
        "subitem_2": "subvalue_2"
      }
    ]
  }
]

I need all the values for item2 in an array for a bash script to be run on Ubuntu 14.04.1.

I have found a bunch of ways to get the entire result into an array, but not just the items I need.

Answers:

Method 1

Using jq:

$ cat json
[
  {
    "item1": "value1",
    "item2": "value2",
    "sub items": [
      {
        "subitem": "subvalue"
      }
    ]
  },
  {
    "item1": "value1_2",
    "item2": "value2_2",
    "sub items_2": [
      {
        "subitem_2": "subvalue_2"
      }
    ]
  }
]

CODE:

arr=( $(jq -r '.[].item2' json) )
printf '%s\n' "${arr[@]}"

OUTPUT:

value2
value2_2

Method 2

The following is actually buggy:

# BAD: Output line of * is replaced with list of local files; can't deal with whitespace
arr=( $( curl -k "$url" | jq -r '.[].item2' ) )
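
To see the whitespace problem without a network call, here is a minimal sketch in which a hypothetical function emit_values stands in for the curl | jq -r pipeline. The function name and its sample values are assumptions made for illustration, not part of the original answer.

```shell
#!/usr/bin/env bash
# emit_values is a hypothetical stand-in for: curl -k "$url" | jq -r '.[].item2'
# It prints one value per line; the first value contains a space.
emit_values() {
  printf '%s\n' 'first value' 'second'
}

# Unquoted command substitution: the shell splits on every space, tab and newline.
split_arr=( $(emit_values) )

# readarray -t (bash 4.0+): exactly one array element per output line.
readarray -t line_arr < <(emit_values)

printf 'word-split: %d elements\n' "${#split_arr[@]}"
printf 'readarray:  %d elements\n' "${#line_arr[@]}"
```

The word-split array ends up with more elements than there are values, because "first value" is broken in two. The same unquoted expansion is also subject to filename globbing, which is why a value of * would be replaced by the local file list.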

If you have bash 4.4 or newer, a best-of-all-worlds option is available:

# BEST: Supports bash 4.4+, with failure detection and newlines in data
{ readarray -t -d '' arr && wait "$!"; } < <(
  set -o pipefail
  curl --fail -k "$url" | jq -j '.[].item2 | (., "\u0000")'
)

…whereas with bash 4.0, you can have terseness at the cost of failure detection and literal newline support:

# OK (with bash 4.0), but can't detect failure and doesn't support values with newlines
readarray -t arr < <(curl -k "$url" | jq -r '.[].item2' )

…or bash 3.x compatibility and failure detection, but without newline support:

# OK: Supports bash 3.x; no support for newlines in values, but can detect failures
IFS=$'\n' read -r -d '' -a arr < <(
  set -o pipefail
  curl --fail -k "$url" | jq -r '.[].item2' && printf '\0'
)

…or bash 3.x compatibility and newline support, but without failure detection:

# OK: Supports bash 3.x and supports newlines in values; does not detect failures
arr=( )
while IFS= read -r -d '' item; do
  arr+=( "$item" )
done < <(curl --fail -k "$url" | jq -j '.[] | (.item2, "\u0000")')
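
The NUL-delimited pattern with failure detection can be tried offline. The sketch below assumes bash 4.4 or newer and uses a hypothetical function produce in place of the curl | jq -j pipeline. The function name, its sample values and the exit code 3 are made up for illustration.

```shell
#!/usr/bin/env bash
# produce is a hypothetical stand-in for:
#   curl --fail -k "$url" | jq -j '.[].item2 | (., "\u0000")'
# It emits two NUL-terminated values; the first contains a literal newline.
produce() {
  printf '%s\0' $'line one\nline two' 'plain'
}

# Success case: read NUL-delimited items, then collect the producer's exit status.
{ readarray -t -d '' ok_arr && wait "$!"; } < <(produce)
ok_status=$?

# Failure case: the producer exits non-zero after writing its output.
{ readarray -t -d '' bad_arr && wait "$!"; } < <(produce; exit 3)
bad_status=$?

printf 'ok:  %d elements, status %d\n' "${#ok_arr[@]}" "$ok_status"
printf 'bad: %d elements, status %d\n' "${#bad_arr[@]}" "$bad_status"
```

In both cases the array is populated. Only the status captured from wait shows that the second producer failed, and the plain readarray and while read variants lose exactly that information.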

Method 3

Use jq to produce a shell statement that you evaluate:

eval "$( jq -r '@sh "arr=( \([.[].item2]) )"' file.json )"

Given the JSON document in your question, the call to jq will produce the string

arr=( 'value2' 'value2_2' )

which is then evaluated by your shell. Evaluating that string will create the named array arr with the two elements value2 and value2_2:

$ eval "$( jq -r '@sh "arr=( \([.[].item2]) )"' file.json )"
$ printf '"%s"\n' "${arr[@]}"
"value2"
"value2_2"

The @sh operator in jq takes care to properly quote the data for the shell.

Alternatively, move the arr=( ... ) part out of the jq expression:

eval "arr=( $( jq -r '@sh "\([.[].item2])"' file.json ) )"

Now, jq only generates the quoted list of elements, which is then inserted into arr=( ... ) and evaluated.

If you need to read data from a curl command, use curl ... | jq -r ... in place of jq -r ... file.json in the commands above.
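
To illustrate why the quoting matters, the sketch below evaluates a hand-written string in the style of @sh output, so it runs without jq. The sample values are assumptions chosen to include an embedded single quote and repeated spaces; they were not captured from a real jq run.

```shell
#!/usr/bin/env bash
# Hand-written sample in the style of jq's @sh output for a list of strings.
# An embedded single quote is written as '\'' inside the quoted word.
generated=$(cat <<'EOF'
'value2' 'it'\''s here' 'two  spaces'
EOF
)

# The shell re-parses the quoted words, so each one becomes a single element.
eval "arr=( $generated )"

printf '<%s>\n' "${arr[@]}"
```

Because every word arrives single-quoted, eval performs no word splitting or globbing on the values themselves. Passing unquoted data to eval instead would not be safe.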

Method 4

To handle arbitrary values:

#!/bin/bash
cat <<EOF >json
[
  {
    "item1": "value1",
    "item2": "  val'\"ue2  ",
    "sub items": [
      {
        "subitem": "subvalue"
      }
    ]
  },
  {
    "item1": "value1_2",
    "item2": "  value\n2_2  ",
    "sub items_2": [
      {
        "subitem_2": "subvalue_2"
      }
    ]
  }
]
EOF
eval "arr=( $(jq -r ' .[].item2 | @sh ' json) )"
printf '<%s>\n' "${arr[@]}"

Output:

<  val'"ue2  >
<  value
2_2  >

Putting together some of the other solutions in this question mentioned in comments and answers from @CharlesDuffy, @StéphaneChazelas, @Kusalananda:

#!/bin/bash
v1='a\nb  \n'
v2='c'\''\"d  '   # v2 will contain <c'\"d  >
printf '$v1=<%s>\n$v2=<%s>\n\n' "$v1" "$v2"

>json printf "%s\n" "[ \"$v1\", \"$v2\" ]"
printf 'JSON data: '; cat json
printf '\n'

eval "arr=( $( cat json | jq -r '.[] | @sh ' ) )"
printf '$arr[0]:<%s>\n$arr[1]:<%s>\n\n' "${arr[@]}"

set --
eval "set -- $( cat json | jq -r '[.[]] | @sh ' )"
printf '$1:<%s>\n$2:<%s>\n\n' "$1" "$2"

{ readarray -td '' arr2 && wait "$!"; } < <(
   cat json | jq -j '.[] | (., "\u0000") '
)
printf 'rc=%s\n$arr[0]:<%s>\n$arr[1]:<%s>\n\n' "$?" "${arr2[@]}"

{ readarray -td '' arr3 && wait "$!"; } < <(
   { echo x; cat json; } | jq -j '.[] | (., "\u0000") '
)
printf 'rc=%s\n' "$?"

Output:

$v1=<a\nb  \n>
$v2=<c'\"d  >

JSON data: [ "a\nb  \n", "c'\"d  " ]

$arr[0]:<a
b  
>
$arr[1]:<c'"d  >

$1:<a
b  
>
$2:<c'"d  >

rc=0
$arr[0]:<a
b  
>
$arr[1]:<c'"d  >

parse error: Invalid numeric literal at line 2, column 0
rc=4

Method 5

Thanks to sputnick I got to this:

arr=( $(curl -k https://localhost/api | jq -r '.[].item2') )

The JSON I have is the output from an API. All I needed to do was remove the file argument and pipe the output of curl to jq. This works great and saved some steps.

Method 6

As an easy alternative, look at the jtc tool (https://github.com/ldn-softdev/jtc), which achieves the same thing as the jq example:

bash $ arr=( $(jtc -w '<item2>l+0' file.json) )
bash $ printf '%s\n' "${arr[@]}"
"value2"
"value2_2"
bash $

Explanation of the -w option: angle brackets <...> specify a search of the entire JSON, the suffix l instructs jtc to search labels rather than values, and +0 instructs it to find all occurrences rather than just the first one.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
