Checking a nested dictionary using a dot notation string “a.b.c.d.e”, automatically creating missing levels

Given the following dictionary:

d = {"a":{"b":{"c":"winning!"}}}

I have this string (from an external source, and I can’t change this metaphor).

k = "a.b.c"

I need to determine if the dictionary has the key 'c', so I can add it if it doesn’t.

This works swimmingly for retrieving a dot notation value:

reduce(dict.get, k.split("."), d)

but I can’t figure out how to ‘reduce’ a has_key check or anything like that.

My ultimate problem is this: given "a.b.c.d.e", I need to create all the elements necessary in the dictionary, but not stomp them if they already exist.

Answers:

Method 1

You could use an infinite, nested defaultdict:

>>> from collections import defaultdict
>>> infinitedict = lambda: defaultdict(infinitedict)
>>> d = infinitedict()
>>> d['key1']['key2']['key3']['key4']['key5'] = 'test'
>>> d['key1']['key2']['key3']['key4']['key5']
'test'

Given your dotted string, here’s what you can do:

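>>> import operator
>>> from functools import reduce  # reduce is no longer a builtin in Python 3
>>> keys = "a.b.c".split(".")
>>> lastplace = reduce(operator.getitem, keys[:-1], d)
>>> keys[-1] in lastplace  # dict.has_key() was removed in Python 3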
False

You can set a value:

>>> lastplace[keys[-1]] = "something"
>>> reduce(operator.getitem, keys, d)
'something'
>>> d['a']['b']['c']
'something'
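One caveat with the defaultdict approach: walking the path with operator.getitem creates every missing intermediate level as a side effect, even if you only wanted to check. The sketch below shows a check that does not mutate anything. The helper name has_path is made up for this illustration; it relies only on the fact that membership tests (key in d) do not trigger a defaultdict's default factory.

```python
from collections import defaultdict


def infinitedict():
    """Nested defaultdict, equivalent to the lambda in the answer above."""
    return defaultdict(infinitedict)


def has_path(d, dotted):
    """Return True if every key in the dotted path exists; never mutates d.

    has_path is a hypothetical helper name used only for this sketch.
    """
    node = d
    for key in dotted.split("."):
        # "key in node" does not call the default factory, so nothing is created.
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


d = infinitedict()
d["a"]["b"]["c"] = "something"

print(has_path(d, "a.b.c"))  # True
print(has_path(d, "a.x.y"))  # False
print("x" in d["a"])         # False: the check above created nothing
```

The same function works on plain dicts too, since defaultdict is a dict subclass.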

Method 2

… or using recursion:

def put(d, keys, item):
    if "." in keys:
        key, rest = keys.split(".", 1)
        if key not in d:
            d[key] = {}
        put(d[key], rest, item)
    else:
        d[keys] = item

def get(d, keys):
    if "." in keys:
        key, rest = keys.split(".", 1)
        return get(d[key], rest)
    else:
        return d[keys]

Method 3

How about an iterative approach?

def create_keys(d, keys):
    for k in keys.split("."):
        if k not in d:
            d[k] = {}  # if the key isn't there yet, add it to d
        d = d[k]       # go one level down and repeat

If you need the last key to map to something other than a dictionary, you can pass the value as an additional argument and set it after the loop:

def create_keys(d, keys, value):
    keys = keys.split(".")
    for k in keys[:-1]:
        if k not in d:
            d[k] = {}
        d = d[k]
    d[keys[-1]] = value
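Note that the second create_keys always overwrites the final key. If the goal is to leave an existing value alone, as the question asks, one possible variation is to use dict.setdefault at every level, including the last. This is only a sketch; the name ensure_path and its default parameter are assumptions for illustration, not part of the answer above.

```python
def ensure_path(d, dotted, default=None):
    """Create missing levels along a dotted path without overwriting anything.

    ensure_path is a hypothetical name used only for this sketch.
    Returns the value stored at the final key (existing or newly set).
    """
    *parents, last = dotted.split(".")
    node = d
    for key in parents:
        # setdefault returns the existing value, or inserts {} and returns it.
        node = node.setdefault(key, {})
    # The leaf is only set when absent, so existing data is not stomped.
    return node.setdefault(last, default)


d = {"a": {"b": {"c": "winning!"}}}
print(ensure_path(d, "a.b.c", "ignored"))  # winning!
ensure_path(d, "a.x.y", "new")
print(d)  # {'a': {'b': {'c': 'winning!'}, 'x': {'y': 'new'}}}
```

With this sketch, a path that runs through an existing non-dict value (for example "a.b.c.d.e" against the dictionary above, where "c" holds a string) raises AttributeError rather than replacing the string, so the caller has to decide how to handle that conflict.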

Method 4

d = {"a":{}}
k = "a.b.c".split(".")

def f(d, i):
    if i >= len(k):
        return "winning!"
    c = k[i]
    d[c] = f(d.get(c, {}), i + 1)
    return d

print(f(d, 0))
# {'a': {'b': {'c': 'winning!'}}}

Method 5

I found this discussion very useful, but I only needed to get a value (not set one), and I ran into issues when a key was not present. So, to add another option: you can combine reduce with an adjusted dict.get() that returns None when a key is missing:

from functools import reduce
import re
from typing import Any, Optional

def find_key(dot_notation_path: str, payload: dict) -> Any:
    """Try to get a deep value from a dict based on a dot-notation"""

    def get_despite_none(payload: Optional[dict], key: str) -> Any:
        """Try to get value from dict, even if dict is None"""
        if not payload or not isinstance(payload, (dict, list)):
            return None
        # can also access lists if needed, e.g., if key is '[1]'
        if (num_key := re.match(r"^\[(\d+)\]$", key)) is not None:
            try:
                return payload[int(num_key.group(1))]
            except IndexError:
                return None
        else:
            return payload.get(key, None)

    found = reduce(get_despite_none, dot_notation_path.split("."), payload)
   
    # compare to None, as the key could exist and be empty
    if found is None:
        raise KeyError()
    return found

In my use case, I need to find a key within an HTTP request payload, which can often include lists as well. The following examples work:

payload = {
    "haystack1": {
        "haystack2": {
            "haystack3": None, 
            "haystack4": "needle"
        }
    },
    "haystack5": [
        {"haystack6": None}, 
        {"haystack7": "needle"}
    ],
    "haystack8": {},
}

find_key("haystack1.haystack2.haystack4", payload)
# "needle"
find_key("haystack5.[1].haystack7", payload)
# "needle"
find_key("[0].haystack5.[1].haystack7", [payload, None])
# "needle"
find_key("haystack8", payload)
# {}
find_key("haystack1.haystack2.haystack4.haystack99", payload)
# KeyError

EDIT: added list accessor
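One consequence of using None as the "not found" marker is that a key whose stored value is None (such as haystack3 above) also raises KeyError. If that distinction matters, a private sentinel object can stand in for "missing". The following is a dict-only sketch under that assumption; find_key_strict and _MISSING are names invented for this example, and the list accessor is left out to keep it short.

```python
from functools import reduce

_MISSING = object()  # hypothetical sentinel, distinct from any real value


def find_key_strict(dot_notation_path, payload):
    """Dict-only lookup where a stored None counts as found.

    find_key_strict is a hypothetical name used only for this sketch.
    """

    def step(node, key):
        # Once the path is broken, keep propagating the sentinel.
        if node is _MISSING or not isinstance(node, dict):
            return _MISSING
        return node.get(key, _MISSING)

    found = reduce(step, dot_notation_path.split("."), payload)
    if found is _MISSING:
        raise KeyError(dot_notation_path)
    return found


payload = {"haystack1": {"haystack2": {"haystack3": None, "haystack4": "needle"}}}
print(find_key_strict("haystack1.haystack2.haystack3", payload))  # None
print(find_key_strict("haystack1.haystack2.haystack4", payload))  # needle
```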


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
