lxml Archives - Magenaut

Error while installing lxml through pip: Microsoft Visual C++ 14.0 is required

August 20, 2022 by Magenaut

I am on a Windows 10 machine and recently moved from Python 2.7 to 3.5. When I try to install lxml through pip, the installation stops and throws this error message:

Cannot install Lxml on Mac OS X 10.9

August 18, 2022 by Magenaut

I want to install lxml so that I can then install Scrapy.

How to remove an element in lxml

August 18, 2022 by Magenaut

I need to completely remove elements, based on the contents of an attribute, using Python’s lxml. Example:
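The question's own example is not reproduced in this excerpt. As an illustration only, here is a minimal sketch of one common approach: collect the matching elements first, then detach each one from its parent. The helper name `remove_by_attribute`, the sample document, and the `status` attribute are all hypothetical.

```python
from lxml import etree


def remove_by_attribute(root, attr, value):
    """Detach every element under root whose attribute `attr` equals `value`."""
    # Collect matches first so the tree is not modified while iterating over it.
    # iter("*") visits elements only, skipping comments and processing instructions.
    matches = [el for el in root.iter("*") if el.get(attr) == value]
    for el in matches:
        parent = el.getparent()
        if parent is not None:  # the root has no parent, so it is left alone
            parent.remove(el)


# Hypothetical sample document, not taken from the original question.
doc = etree.fromstring(
    '<items>'
    '<item status="ok">a</item>'
    '<item status="stale">b</item>'
    '<item status="ok">c</item>'
    '</items>'
)
remove_by_attribute(doc, "status", "stale")
remaining = [el.text for el in doc]
```

Note that removing an element in lxml also drops its tail text (the text that follows the closing tag), so mixed-content documents may need that text copied elsewhere first.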

Parsing HTML in Python – lxml or BeautifulSoup? Which of these is better for what kinds of purposes?

August 17, 2022 by Magenaut

From what I can make out, the two main HTML parsing libraries in Python are lxml and BeautifulSoup. I’ve chosen BeautifulSoup for a project I’m working on, but I chose it for no particular reason other than finding the syntax a bit easier to learn and understand. But I see a lot of people seem to favour lxml and I’ve heard that lxml is faster.

Get all text inside a tag in lxml

August 17, 2022 by Magenaut

I’d like to write a code snippet that would grab all of the text inside the <content> tag, in lxml, in all three instances below, including the code tags. I’ve tried tostring(getchildren()) but that would miss the text in between the tags. I didn’t have very much luck searching the API for a relevant function. Could you help me out?

How can this function be rewritten to implement OrderedDict?

August 17, 2022 by Magenaut

I have the following function which does a crude job of parsing an XML file into a dictionary. Unfortunately, since Python dictionaries are not ordered, I am unable to cycle through the nodes as I would like. How do I change this so it outputs an ordered dictionary which reflects the original order of the …
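The function from the question is not shown in this excerpt, so the following is not a rewrite of it. It is a rough sketch of the general idea, assuming a simple document in which sibling tags are unique: build the result as a `collections.OrderedDict`, inserting keys in document order. The name `element_to_ordered_dict` and the sample XML are hypothetical, and the standard-library `xml.etree.ElementTree` is used here as an assumption; `lxml.etree` exposes the same calls.

```python
from collections import OrderedDict
import xml.etree.ElementTree as ET


def element_to_ordered_dict(element):
    """Convert an element into nested OrderedDicts that keep document order."""
    children = list(element)
    if not children:
        # Leaf node: return its text (None when the element is empty).
        return element.text
    result = OrderedDict()
    for child in children:
        # Assumption: sibling tags are unique. A repeated tag would
        # overwrite the earlier entry stored under the same key.
        result[child.tag] = element_to_ordered_dict(child)
    return result


# Hypothetical sample document.
root = ET.fromstring(
    "<config><zeta>1</zeta><alpha>2</alpha><mid><b>x</b><a>y</a></mid></config>"
)
parsed = element_to_ordered_dict(root)
top_level_keys = list(parsed.keys())
```

On Python 3.7 and later, a plain `dict` also preserves insertion order, so `OrderedDict` is mainly needed on older interpreters or when order-sensitive equality matters.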

Remove namespace and prefix from xml in python using lxml

using lxml and iterparse() to parse a big (+- 1Gb) XML file

How do I use a default namespace in an lxml xpath query?

builtins.TypeError: must be str, not bytes