I am using the builtin Python ElementTree module. It is straightforward to access children, but what about parent or sibling nodes? – can this be done efficiently without traversing the entire tree?
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
There’s no direct support in the form of a parent attribute, but you can perhaps use the patterns described here to achieve the desired effect. The following one-liner is suggested (updated from the linked-to post to Python 3.8) to create a child-to-parent mapping for a whole tree, using the method xml.etree.ElementTree.Element.iter:
parent_map = {c: p for p in tree.iter() for c in p}
Method 2
Vinay’s answer should still work, but for Python 2.7+ and 3.2+ the following is recommended:
parent_map = {c:p for p in tree.iter() for c in p}
getiterator() is deprecated in favor of iter(), and it’s nice to use the new dict list comprehension constructor.
Secondly, while constructing an XML document, it is possible that a child will have multiple parents, although this gets removed once you serialize the document. If that matters, you might try this:
parent_map = {}
for p in tree.iter():
for c in p:
if c in parent_map:
parent_map[c].append(p)
# Or raise, if you don't want to allow this.
else:
parent_map[c] = [p]
# Or parent_map[c] = p if you don't want to allow this
Method 3
You can use xpath ... notation in ElementTree.
<parent>
<child id="123">data1</child>
</parent>
xml.findall('.//child[@id="123"]...')
>> [<Element 'parent'>]
Method 4
As mentioned in Get parent element after using find method (xml.etree.ElementTree) you would have to do an indirect search for parent.
Having xml:
<a> <b> <c>data</c> <d>data</d> </b> </a>
Assuming you have created etree element into xml variable, you can use:
In[1] parent = xml.find('.//c/..')
In[2] child = parent.find('./c')
Resulting in:
Out[1]: <Element 'b' at 0x00XXXXXX> Out[2]: <Element 'c' at 0x00XXXXXX>
Higher parent would be found as:secondparent=xml.find('.//c/../..') being <Element 'a' at 0x00XXXXXX>
Method 5
Pasting here my answer from https://stackoverflow.com/a/54943960/492336:
I had a similar problem and I got a bit creative. Turns out nothing prevents us from adding the parent info ourselves. We can later strip it once we no longer need it.
def addParentInfo(et):
for child in et:
child.attrib['__my_parent__'] = et
addParentInfo(child)
def stripParentInfo(et):
for child in et:
child.attrib.pop('__my_parent__', 'None')
stripParentInfo(child)
def getParent(et):
if '__my_parent__' in et.attrib:
return et.attrib['__my_parent__']
else:
return None
# Example usage
tree = ...
addParentInfo(tree.getroot())
el = tree.findall(...)[0]
parent = getParent(el)
while parent:
doSomethingWith(parent)
parent = getParent(parent)
stripParentInfo(tree.getroot())
Method 6
The XPath ‘..’ selector cannot be used to retrieve the parent node on 3.5.3 nor 3.6.1 (at least on OSX),
eg in interactive mode:
import xml.etree.ElementTree as ET
root = ET.fromstring('<parent><child></child></parent>')
child = root.find('child')
parent = child.find('..') # retrieve the parent
parent is None # unexpected answer True
The last answer breaks all hopes…
Method 7
Got an answer from
https://towardsdatascience.com/processing-xml-in-python-elementtree-c8992941efd2
Tip: use ‘…’ inside of XPath to return the parent element of the current element.
for object_book in root.findall('.//*[@name="The Hunger Games"]...'):
print(object_book)
Method 8
If you are using lxml, I was able to get the parent element with the following:
parent_node = next(child_node.iterancestors())
This will raise a StopIteration exception if the element doesn’t have ancestors – so be prepared to catch that if you may run into that scenario.
Method 9
Another way if just want a single subElement’s parent and also known the subElement’s xpath.
parentElement = subElement.find(xpath+"/..")
Method 10
Look at the 19.7.2.2. section: Supported XPath syntax …
Find node’s parent using the path:
parent_node = node.find('..')
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0