How do I use a default namespace in an lxml xpath query?

I have an xml document in the following format:

<feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:gsa="http://schemas.google.com/gsa/2007">
  ...
  <entry>
    <id>https://ip.ad.dr.ess:8000/feeds/diagnostics/smb://ip.ad.dr.ess/path/to/file</id>
    <updated>2011-11-07T21:32:39.795Z</updated>
    <app:edited xmlns:app="http://purl.org/atom/app#">2011-11-07T21:32:39.795Z</app:edited>
    <link rel="self" type="application/atom+xml" href="https://ip.ad.dr.ess:8000/feeds/diagnostics" rel="nofollow noreferrer noopener" rel="nofollow noreferrer noopener"/>
    <link rel="edit" type="application/atom+xml" href="https://ip.ad.dr.ess:8000/feeds/diagnostics" rel="nofollow noreferrer noopener" rel="nofollow noreferrer noopener"/>
    <gsa:content name="entryID">smb://ip.ad.dr.ess/path/to/directory</gsa:content>
    <gsa:content name="numCrawledURLs">7</gsa:content>
    <gsa:content name="numExcludedURLs">0</gsa:content>
    <gsa:content name="type">DirectoryContentData</gsa:content>
    <gsa:content name="numRetrievalErrors">0</gsa:content>
  </entry>
  <entry>
    ...
  </entry>
  ...
</feed>

I need to retrieve all entry elements using xpath in lxml. My problem is that I can’t figure out how to use an empty namespace. I have tried the following examples, but none work. Please advise.

import lxml.etree as et

tree=et.fromstring(xml)

The various things I have tried are:

for node in tree.xpath('//entry'):

or

namespaces = {None:"http://www.w3.org/2005/Atom" ,"openSearch":"http://a9.com/-/spec/opensearchrss/1.0/" ,"gsa":"http://schemas.google.com/gsa/2007"}

for node in tree.xpath('//entry', namespaces=ns):

or

for node in tree.xpath('//"{http://www.w3.org/2005/Atom}entry"'):

At this point I just don’t know what to try. Any help is greatly appreciated.

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Something like this should work:

import lxml.etree as et

ns = {"atom": "http://www.w3.org/2005/Atom"}
tree = et.fromstring(xml)
for node in tree.xpath('//atom:entry', namespaces=ns):
    print node

See also http://lxml.de/xpathxslt.html#namespaces-and-prefixes.

Alternative:

for node in tree.xpath("//*[local-name() = 'entry']"):
    print node

Method 2

Use findall method.

for item in tree.findall('{http://www.w3.org/2005/Atom}entry'): 
    print item


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x