Can I modify a CSV file inline using Python’s CSV library, or similar technique?
Current I am processing a file and updating the first column (a name field) to change the formatting. A simplified version of my code looks like this:
with open('tmpEmployeeDatabase-out.csv', 'w') as csvOutput:
writer = csv.writer(csvOutput, delimiter=',', quotechar='"')
with open('tmpEmployeeDatabase.csv', 'r') as csvFile:
reader = csv.reader(csvFile, delimiter=',', quotechar='"')
for row in reader:
row[0] = row[0].title()
writer.writerow(row)
The philosophy works, but I am curious if I can do an inline edit so that I’m not duplicating the file.
I’ve tried the follow, but this appends the new records to the end of the file instead of replacing them.
with open('tmpEmployeeDatabase.csv', 'r+') as csvFile:
reader = csv.reader(csvFile, delimiter=',', quotechar='"')
writer = csv.writer(csvFile, delimiter=',', quotechar='"')
for row in reader:
row[1] = row[1].title()
writer.writerow(row)
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
No, you should not attempt to write to the file you are currently reading from. You can do it if you keep seeking back after reading a row but it is not advisable, especially if you are writing back more data than you read.
The canonical method is to write to a new, temporary file and move that into place over the old file you read from.
from tempfile import NamedTemporaryFile
import shutil
import csv
filename = 'tmpEmployeeDatabase.csv'
tempfile = NamedTemporaryFile('w+t', newline='', delete=False)
with open(filename, 'r', newline='') as csvFile, tempfile:
reader = csv.reader(csvFile, delimiter=',', quotechar='"')
writer = csv.writer(tempfile, delimiter=',', quotechar='"')
for row in reader:
row[1] = row[1].title()
writer.writerow(row)
shutil.move(tempfile.name, filename)
I’ve made use of the tempfile and shutil libraries here to make the task easier.
Method 2
There is no underlying system call for inserting data into a file. You can overwrite, you can append, and you can replace. But inserting data into the middle means reading and rewriting the entire file from the point you made your edit down to the end.
As such, the two ways to do this are either (a) slurp the entire file into memory, make your edits there, and then dump the result back to disk, or (b) open up a temporary output file where you write your results while you read the input file, and then replace the old file with the new one once you get to the end. One method uses more ram, the other uses more disk space.
Method 3
If you just want to modify a csv file inline by using Python, you may just employ pandas:
import pandas as pd
df = pd.read_csv('yourfilename.csv')
# modify the "name" in row 1 as "Lebron James"
df.loc[1, 'name'] = "Lebron James"
# save the file using the same name
df.to_csv("yourfilename.csv")
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0