pandas to_csv output quoting issue

I’m having trouble getting pandas’ DataFrame.to_csv(...) to quote strings the way I want.

import pandas as pd

text = 'this is "out text"'
df = pd.DataFrame(index=['1'], columns=['1', '2'])
df.loc['1', '1'] = 123
df.loc['1', '2'] = text
df.to_csv('foo.txt', index=False, header=False)

The output is:

123,"this is ""out text"""

But I would like:

123,this is "out text"

Does anyone know how to get this right?

Answers:

Method 1

You could pass quoting=csv.QUOTE_NONE, for example:

>>> df.to_csv('foo.txt',index=False,header=False)
>>> !cat foo.txt
123,"this is ""out text"""
>>> import csv
>>> df.to_csv('foo.txt',index=False,header=False, quoting=csv.QUOTE_NONE)
>>> !cat foo.txt
123,this is "out text"

but in my experience it’s better to quote more, rather than less.
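One reason to prefer more quoting is that the default output reads back cleanly. Here is a minimal sketch of that round trip; the column names "a" and "b" and the in-memory buffers are my own choices for illustration, not part of the question.

```python
import io

import pandas as pd

# Same values as in the question; the column names are illustrative only.
df = pd.DataFrame({"a": [123], "b": ['this is "out text"']})

# Default quoting (csv.QUOTE_MINIMAL): the field is quoted and
# the embedded double quotes are doubled.
buf = io.StringIO()
df.to_csv(buf, index=False, header=False)
written = buf.getvalue()

# Reading the text back undoes the quoting and recovers the original string.
restored = pd.read_csv(io.StringIO(written), header=None)

print(written.strip())
print(restored.iloc[0, 1])
```

So the doubled quotes look odd in the file, but any CSV-aware reader should give you back the original string.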

Method 2

Note: there is currently a small error in the Pandas to_string documentation. It says:

  • quoting : int, Controls whether quotes should be recognized. Values are taken from csv.QUOTE_* values. Acceptable values are 0, 1, 2, and
    3 for QUOTE_MINIMAL, QUOTE_ALL, QUOTE_NONE, and QUOTE_NONNUMERIC,
    respectively.

But this reverses how csv defines the QUOTE_NONE and QUOTE_NONNUMERIC variables: csv.QUOTE_NONNUMERIC is 2 and csv.QUOTE_NONE is 3.

In [13]: import csv
In [14]: csv.QUOTE_NONE
Out[14]: 3
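
If you want to check all four values yourself rather than trust any documentation, a quick sketch using only the standard library (the dictionary name is just for illustration):

```python
import csv

# Collect the four quoting constants by name so their values can be compared.
quoting_constants = {
    "QUOTE_MINIMAL": csv.QUOTE_MINIMAL,
    "QUOTE_ALL": csv.QUOTE_ALL,
    "QUOTE_NONNUMERIC": csv.QUOTE_NONNUMERIC,
    "QUOTE_NONE": csv.QUOTE_NONE,
}

for name, value in quoting_constants.items():
    print(name, value)
```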

Method 3

To use quoting=csv.QUOTE_NONE, you need to set the escapechar, e.g.

# Create a tab-separated file with quotes
$ echo abc$'\t'defg$'\t'$'"xyz"' > in.tsv
$ cat in.tsv
abc     defg    "xyz"

# Gotcha: by default the quotes in `"..."` disappear when reading
$ python3
>>> import pandas as pd
>>> import csv
>>> df = pd.read_csv("in.tsv", sep="\t")
>>> df
Empty DataFrame
Columns: [abc, defg, xyz]
Index: []


# When reading in pandas, to keep the `"..."` quotes,
# you have to explicitly turn quote handling off
>>> df = pd.read_csv("in.tsv", sep="\t", quoting=csv.QUOTE_NONE)
>>> df
Empty DataFrame
Columns: [abc, defg, "xyz"]
Index: []

# To print out without the quotes.
>>> df.to_csv("out.tsv", sep="\t", quoting=csv.QUOTE_NONE, quotechar="", escapechar="\\")
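
The escapechar matters as soon as a field contains the delimiter itself. A small sketch of that case, assuming a comma-separated output and made-up data (the column names and the value "x, y" are mine, not from the question):

```python
import csv
import io

import pandas as pd

# Illustrative data: the second field contains the delimiter (a comma).
df = pd.DataFrame({"a": [123], "b": ["x, y"]})

# Without an escapechar, QUOTE_NONE has no way to protect the comma,
# so the underlying csv writer raises an error.
try:
    df.to_csv(io.StringIO(), index=False, header=False, quoting=csv.QUOTE_NONE)
    raised = False
except csv.Error:
    raised = True

# With an escapechar, the comma inside the field is escaped instead.
buf = io.StringIO()
df.to_csv(buf, index=False, header=False, quoting=csv.QUOTE_NONE, escapechar="\\")
escaped = buf.getvalue()

print(raised)
print(escaped.strip())
```

Whatever reads the file afterwards has to know about the same escapechar, otherwise the backslash ends up in the data.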

Method 4

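To use quoting=csv.QUOTE_NONE without an escapechar:

Replace each comma character , (Unicode: U+002C) in your df with a single low-9 quotation mark character (Unicode: U+201A), which looks almost identical but is not treated as a delimiter.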

After this, you can simply use:

import csv
df.to_csv('foo.txt', index=False, header=False, quoting=csv.QUOTE_NONE)
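
The replacement step itself is not shown above, so here is one possible sketch of it. It assumes the commas only occur in string cells and uses DataFrame.replace with regex=True; the sample data and column names are made up.

```python
import csv
import io

import pandas as pd

# Illustrative data: the text field contains an ASCII comma.
df = pd.DataFrame({"a": [123], "b": ['this, is "out text"']})

# Swap every ASCII comma (U+002C) in string cells for the look-alike U+201A.
# Numeric cells are left untouched.
cleaned = df.replace(",", "\u201a", regex=True)

buf = io.StringIO()
cleaned.to_csv(buf, index=False, header=False, quoting=csv.QUOTE_NONE)
output = buf.getvalue()

print(output.strip())
```

Keep in mind that this changes the data itself: anyone reading the file gets U+201A where the original had a comma.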

Method 5

If you don’t want to bother importing csv, you can simply pass the integer value of csv.QUOTE_NONE, which is 3:

df.to_csv('foo.txt', index=False, header=False, quoting=3)


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
