CSV Dictionary, comparing 2 CSV files and replacing values based on matching values

I am trying to compare between 2 CSV files and if one of the columns’ values match, then I
will need to replace the values of the other csv file with the second csv file’s values.

Example:

Book1.csv:

Alfa,Beta,Charlie,Delta,Echo,Foxtrot,Golf,Hotel,India,Juliett,Kilo
A1,B1,C1,D1,E1,F1,G1,H1,I1,J1,
A2,B2,C2,D2,E2,F2,G2,H2,I2,J2,1
A3,B3,C3,D3,E3,F3,G3,H3,I3,J3,
A4,B4,C4,D4,E4,F4,G4,H4,I4,J4,
A5,B5,C5,D5,E1,F5,G5,H5,I5,J5,1
A6,B6,C6,D6,E6,F6,G6,H6,I6,J6,1
A7,B7,C7,D7,E7,F7,G7,H7,I7,J7,
A8,B8,C8,D8,E8,F8,G8,H8,I8,J8,1
A9,B9,C9,D9,E9,F9,G9,H9,I9,J9,

Book2.csv:

Oscar,Papa,Lima
A1_x,B1,K2
A2,B2,X2
A3_x,B3,L2
A4_x,B4,K2
A5,B5,J2
A6,B6,A2
A7_x,B7,AS
A8,B8,S3
A9_x,B9,S1

If Book2.csv column “Oscar”’s values are equal to Book1.csv “Alfa”’s values (A2==A2, A3!=A3, A4==A4), then Book2.csv‘s “Lima” values will overwrite whatever is in Book1.csv‘s “Beta” values.

So the output for Beta_New will look like this (it looks like this because of the code below where I am switching and calling the columns by a dictionary:

xtest_file.csv:

Beta_New,Echo_New,Foxtrot_New_ALL,Hotel_New,India_New,Charlie_New
X2,E2,F2,H2,I2,C2
J2,E5,F5,H5,I5,C5
A2,E6,F6,H6,I6,C6
S3,E8,F8,H8,I8,C8

My code so far:

import csv

fieldnames_dict = {
    'Beta': 'Beta_New',
    'Echo': 'Echo_New',
    'Foxtrot': 'Foxtrot_New_ALL',
    'Hotel': 'Hotel_New',
    'India': 'India_New',
    'Charlie': 'Charlie_New'
}

1_open_cd_csv = open("book1.csv", "r", encoding="utf-8", errors='ignore')
1_reader_cd_csv = csv.DictReader(1_open_cd_csv, delimiter=',', quotechar='"')
1_header_csv = next(1_reader_cd_csv)

2_open_cd_csv = open("book2.csv", "r", encoding="utf-8", errors='ignore')
2_reader_cd_csv = csv.DictReader(2_open_cd_csv, delimiter=',', quotechar='"')
2_header_csv = next(2_reader_cd_csv)

open_output_test_csv = open(xtest_file.csv, "w", encoding="utf-8", errors='ignore')
writer_output_test_csv = csv.DictWriter(open_output_test_csv, delimiter=',',
                                        quotechar='"', quoting=csv.QUOTE_ALL,
                                        fieldnames=list(fieldnames_dict.values()))
writer_output_test_csv.writeheader()

for row_in in 1_reader_cd_csv:
    if row_in['Kilo'] == "1":
        row_out = {new: row_in[old] for old, new in fieldnames_dict.items()}
        writer.writerow(row_out)
        #if 2_reader_cd_csv[Oscar]'s values = 1_reader_cd_csv[Alfa]'s values:
            #then the output for "Beta_New" = the values for 2_reader_cd_csv[Lima]

or mabye something like this:

for row_in in 1_reader_cd_csv:
    alfa_match = 2_reader_cd_csv['Oscar'] matches with = row_in['Alfa']
    if alfa_match != none and row_in['Kilo'] == "1":
        row_out = {new: row_in[old] for old, new in fieldnames_dict.items()}
        #then the output for "Beta_New" = the values for 2_reader_cd_csv[Lima]
        writer.writerow(row_out)

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

Break your code into smaller chunks for each part:

  1. Read rows of “book1.csv” where Kilo=”1″ to file1
  2. Read rows of “book2.csv” to file2
  3. Replace values of “Beta” in file1 based on file2 values of Oscar
  4. Write back to a new csv file
with open("book1.csv", "r") as infile:
    reader = csv.DictReader(infile)
    file1 = <div class="su-row"></div>=='1']

with open("book2.csv") as infile:
    reader = csv.DictReader(infile)
    file2 = <div class="su-row"></div>

output = list()
oscars = <div class="su-row"></div> for row in file2]
for row in file1:
    if row["Alfa"] in oscars:
        row["Beta"] = [r["Lima"] for r in file2 if r["Oscar"]==row["Alfa"]][0]
    output.append({new:row[old] for old,new in fieldnames_dict.items()})

with open("output.csv", "w", newline="") as outfile:
    writer = csv.DictWriter(outfile,fieldnames=list(fieldnames_dict.values()))
    writer.writeheader()
    for row in output:
        writer.writerow(row)
output.csv:
Beta_New,Echo_New,Foxtrot_New_ALL,Hotel_New,India_New,Charlie_New
X2,E2,F2,H2,I2,C2
J2,E1,F5,H5,I5,C5
A2,E6,F6,H6,I6,C6
S3,E8,F8,H8,I8,C8

Method 2

If Book2.csv column “Oscar”‘s values are equal to Book1.csv “Alfa”‘s
values (A2==A2, A3!=A3, A4==A4), then Book2.csv’s “Lima” values will
overwrite whatever is in Book1.csv’s “Beta” values.

You might want to give pandas a try:

import pandas as pd

#read csvs as dataframes
df1 = pd.read_csv("book1.csv")
df2 = pd.read_csv("book2.csv")

#replace 'Beta' in first df with the value in 'Lima' where 'Alfa' matches 'Oscar'. 
df1['Beta'].where(df1['Alfa'] != df2['Oscar'], df2['Lima'], inplace=True)
#store as csv
df1.to_csv('new_file.csv')

Output df1:

Alfa Beta Charlie Delta Echo Foxtrot Golf Hotel India Juliett Kilo
0 A1 B1 C1 D1 E1 F1 G1 H1 I1 J1 nan
1 A2 X2 C2 D2 E2 F2 G2 H2 I2 J2 1
2 A3 B3 C3 D3 E3 F3 G3 H3 I3 J3 nan
3 A4 B4 C4 D4 E4 F4 G4 H4 I4 J4 nan
4 A5 J2 C5 D5 E1 F5 G5 H5 I5 J5 1
5 A6 A2 C6 D6 E6 F6 G6 H6 I6 J6 1
6 A7 B7 C7 D7 E7 F7 G7 H7 I7 J7 nan
7 A8 S3 C8 D8 E8 F8 G8 H8 I8 J8 1
8 A9 B9 C9 D9 E9 F9 G9 H9 I9 J9 nan


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x