beautifulsoup Archives - Page 2 of 6

UnicodeEncodeError: ‘charmap’ codec can’t encode characters

August 21, 2022 by Magenaut

I’m trying to scrape a website, but it gives me an error.

How to find elements by class

August 21, 2022 by Magenaut

I’m having trouble parsing HTML elements by their “class” attribute using BeautifulSoup. The code looks like this:

Extracting an attribute value with BeautifulSoup

August 21, 2022 by Magenaut

I am trying to extract the content of a single “value” attribute in a specific “input” tag on a webpage. I use the following code:

BeautifulSoup Grab Visible Webpage Text

August 21, 2022 by Magenaut

Basically, I want to use BeautifulSoup to grab strictly the visible text on a webpage. For instance, this webpage is my test case. I mainly want just the body text (the article) and maybe a few tab names here and there. I have tried the suggestion in this SO question, but it returns lots of <script> tags and HTML comments, which I don’t want. I can’t figure out which arguments to pass to findAll() to get only the visible text on a webpage.
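One possible way to approach this, sketched under assumptions rather than taken from the original post: collect every text node and drop comments and anything whose parent tag is never rendered. The helper names `is_visible` and `visible_text`, the `INVISIBLE_PARENTS` set, and the sample HTML are all hypothetical, and the list of hidden tags is not exhaustive (text hidden by CSS, for example, is not detected).

```python
from bs4 import BeautifulSoup
from bs4.element import Comment

# Assumed, non-exhaustive set of parents whose text is not shown to the reader.
INVISIBLE_PARENTS = {"style", "script", "head", "title", "meta", "[document]"}


def is_visible(text_node):
    """Return True if a text node is likely rendered on the page."""
    if isinstance(text_node, Comment):
        return False  # HTML comments are never displayed
    return text_node.parent.name not in INVISIBLE_PARENTS


def visible_text(html):
    """Join all non-empty, likely-visible text nodes with single spaces."""
    soup = BeautifulSoup(html, "html.parser")
    text_nodes = soup.find_all(string=True)  # every string in the tree
    return " ".join(t.strip() for t in text_nodes if is_visible(t) and t.strip())


# Hypothetical page used only for illustration.
sample_html = """
<html><head><title>Ignored</title><script>var x = 1;</script></head>
<body><!-- hidden comment --><h1>Headline</h1><p>Article body.</p></body></html>
"""

print(visible_text(sample_html))
```

Filtering on the parent tag keeps the logic in plain Python, so the set of excluded tags can be adjusted per site without changing the search call.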

Can we use XPath with BeautifulSoup?

August 20, 2022 by Magenaut

I am using BeautifulSoup to scrape a URL, and I had the following code to find the td tag whose class is 'empformbody':

Python BeautifulSoup parsing table

August 20, 2022 by Magenaut

I’m learning Python requests and BeautifulSoup. As an exercise, I’ve chosen to write a quick NYC parking ticket parser. I am able to get an HTML response, which is quite ugly. I need to grab the lineItemsTable and parse all the tickets.
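A minimal sketch of how such a table might be parsed, assuming that lineItemsTable is the table's `id` attribute (the excerpt does not say whether it is an id or a class). The `parse_table` helper, the column names, and the sample rows are hypothetical stand-ins, not the real response from the ticket site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real response is not shown in the excerpt.
sample_html = """
<table id="lineItemsTable">
  <tr><th>Ticket</th><th>Amount</th></tr>
  <tr><td>1001</td><td>$65.00</td></tr>
  <tr><td>1002</td><td>$115.00</td></tr>
</table>
"""


def parse_table(html, table_id):
    """Return the table's rows as lists of cell strings (header row included)."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []  # no table with that id on the page
    rows = []
    for tr in table.find_all("tr"):
        # Collect header and data cells alike, with surrounding whitespace removed.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


rows = parse_table(sample_html, "lineItemsTable")
for row in rows:
    print(row)
```

In practice the `html` argument would be the text of the response fetched with requests; it is inlined here so the sketch runs without network access.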

Scraping: SSL: CERTIFICATE_VERIFY_FAILED error for http://en.wikipedia.org

August 19, 2022 by Magenaut

I’m practicing the code from ‘Web Scraping with Python’, and I keep having this certificate problem:

TypeError: a bytes-like object is required, not ‘str’ in Python and CSV

August 19, 2022 by Magenaut

TypeError: a bytes-like object is required, not ‘str’

Using BeautifulSoup to extract text without tags

August 18, 2022 by Magenaut

My webpage looks like this:

Only extracting text from this element, not its children

August 18, 2022 by Magenaut

I want to extract only the text from the top-most element of my soup; however, soup.text gives the text of all the child elements as well: