utf-8
Convert Unicode to ASCII without errors in Python
My code just scrapes a web page, then converts it to Unicode.
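One possible approach, sketched here for Python 3 and not taken from the original question: decompose accented characters with unicodedata.normalize and then drop whatever still has no ASCII equivalent. The helper name to_ascii is hypothetical.

```python
import unicodedata

def to_ascii(text: str) -> str:
    """Best-effort ASCII conversion: strip accents, drop everything else."""
    # NFKD splits a character such as "é" into "e" plus a combining accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # errors="ignore" silently discards code points with no ASCII form,
    # so this never raises, but it is lossy.
    return decomposed.encode("ascii", errors="ignore").decode("ascii")

print(to_ascii("café naïve"))
```

Because the conversion is lossy, text in scripts with no ASCII decomposition (for example CJK) comes back empty; errors="replace" would insert "?" instead.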
UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0xef in position 1
I’m having a few issues trying to encode a string to UTF-8. I’ve tried numerous things, including using string.encode('utf-8') and unicode(string), but I get the error:
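A minimal Python 3 sketch of how this kind of error can arise, assuming the data is really UTF-8 bytes that something tries to decode as ASCII. The sample bytes are hypothetical, chosen so that byte 0xef sits at position 1.

```python
# Hypothetical input: "x" followed by U+FF5E, which UTF-8 encodes as EF BD 9E.
raw = "x\uff5e".encode("utf-8")

try:
    # Decoding as ASCII fails on the first byte above 0x7f.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    error_message = str(exc)
    print(error_message)

# Decoding with the codec the bytes were actually written in succeeds,
# and the resulting str can be re-encoded to UTF-8 without loss.
text = raw.decode("utf-8")
round_trip = text.encode("utf-8")
```

In Python 2, calling encode('utf-8') on a byte string triggers an implicit ASCII decode first, which is a common source of this message.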
Convert UTF-8 with BOM to UTF-8 with no BOM in Python
Two questions here. I have a set of files which are usually UTF-8 with BOM. I’d like to convert them (ideally in place) to UTF-8 with no BOM. It seems like codecs.StreamRecoder(stream, encode, decode, Reader, Writer, errors) would handle this, but I don’t really see any good examples of its usage. Would this be the best way to handle this?
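A simpler alternative to StreamRecoder, sketched under the assumption that each file fits in memory: read the raw bytes, check for codecs.BOM_UTF8, and rewrite the file without it. The function name strip_utf8_bom and the sample file are hypothetical.

```python
import codecs
import os
import tempfile

def strip_utf8_bom(path):
    """Remove a leading UTF-8 BOM from the file at `path`, in place.

    Returns True if a BOM was found and removed, False otherwise.
    """
    with open(path, "rb") as f:
        data = f.read()
    if not data.startswith(codecs.BOM_UTF8):
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as f:
        f.write(data[len(codecs.BOM_UTF8):])
    return True

# Demo on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "wb") as f:
        f.write(codecs.BOM_UTF8 + "héllo\n".encode("utf-8"))
    first_pass = strip_utf8_bom(sample)   # BOM present, removed
    second_pass = strip_utf8_bom(sample)  # already clean, no change
    with open(sample, "rb") as f:
        stripped = f.read()
```

When the text is being decoded anyway, opening the file with encoding="utf-8-sig" also skips a leading BOM if one is present.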
Python and BeautifulSoup encoding issues
I’m writing a crawler with Python using BeautifulSoup, and everything was going swimmingly till I ran into this site: http://www.elnorte.ec/ I’m getting the contents with the requests library: r = requests.get('http://www.elnorte.ec/'); content = r.content. If I print the content variable at that point, all the Spanish special characters seem to be working …
Write to UTF-8 file in Python
I’m really confused with the codecs.open function. When I do:
How to convert a file to utf-8 in Python?
I need to convert a bunch of files to utf-8 in Python, and I have trouble with the “converting the file” part.
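A minimal sketch of the "converting the file" step in Python 3, assuming the source encoding of each file is already known (detecting it is a separate problem). The function name convert_to_utf8 and the Latin-1 sample are hypothetical.

```python
import os
import tempfile

def convert_to_utf8(path, source_encoding):
    """Re-encode the text file at `path` from `source_encoding` to UTF-8."""
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)

# Demo on a throwaway Latin-1 file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "wb") as f:
        f.write(b"caf\xe9\n")  # "café" in Latin-1
    convert_to_utf8(sample, "latin-1")
    with open(sample, "rb") as f:
        converted = f.read()
```

This reads the whole file into memory and overwrites it in place; for large files or a safer workflow, writing to a temporary file and renaming it would be the more cautious choice.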
How to convert a string to utf-8 in Python
I have a browser which sends utf-8 characters to my Python server, but when I retrieve it from the query string, the encoding that Python returns is ASCII. How can I convert the plain string to utf-8?
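In Python 3, one way to handle this is to let urllib.parse.parse_qs decode the percent-encoded bytes as UTF-8. The query string below is a made-up example, not the asker's data.

```python
from urllib.parse import parse_qs

# Hypothetical query string as a browser would send it:
# "José" and "Köln" percent-encoded as UTF-8 bytes.
query = "name=Jos%C3%A9&city=K%C3%B6ln"

# parse_qs decodes percent-escapes using the given encoding
# and returns a dict mapping each key to a list of values.
params = parse_qs(query, encoding="utf-8")
print(params["name"][0], params["city"][0])
```

UTF-8 is already the default for the encoding parameter; it is spelled out here only to make the decoding step visible.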
How do I check if a string is unicode or ascii?
What do I have to do in Python to figure out which encoding a string has?
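A string does not record its own encoding, so the practical check is usually "does this value contain only ASCII?". A small sketch for Python 3 follows; the helper name is_ascii is hypothetical.

```python
def is_ascii(value):
    """Return True if a str or bytes value contains only ASCII characters."""
    try:
        if isinstance(value, bytes):
            value.decode("ascii")   # bytes: every byte must be below 0x80
        else:
            value.encode("ascii")   # str: every code point must be below 128
    except UnicodeError:
        # Covers both UnicodeDecodeError and UnicodeEncodeError.
        return False
    return True

print(is_ascii("hello"), is_ascii("héllo"))
```

On Python 3.7 and later, str and bytes also provide an isascii() method that performs the same check.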
python requests.get() returns improperly decoded text instead of UTF-8?
When the server sends the header 'Content-Type: text/html', requests.get() returns improperly decoded text.
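A likely explanation, assuming the header carries no charset parameter: requests falls back to ISO-8859-1 for text/* responses, so UTF-8 bytes are decoded with the wrong codec. The standard-library sketch below reproduces the effect without a network call; the response body is a hypothetical sample.

```python
# Hypothetical response body: UTF-8 bytes, as the server actually sent them.
body = "España".encode("utf-8")

# What the text looks like when decoded with the ISO-8859-1 fallback.
wrong = body.decode("iso-8859-1")

# What it looks like when decoded with the codec it was written in.
right = body.decode("utf-8")

print(wrong, right)

# With requests, the equivalent fix is to set the encoding before
# reading the text, for example:
#   r = requests.get(url)
#   r.encoding = "utf-8"   # or r.apparent_encoding
#   text = r.text
```

Setting r.encoding only changes how r.text is decoded; r.content always holds the raw bytes.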