I am reading data from a CSV file (xyz.CSV) that contains the data below:
col1,col2,col3,col4
name1,empId1,241682-27638-USD-CIGGNT ,1
name2,empId2,241682-27638-USD-OCGGINT ,1
name3,empId3,241942-37190-USD-GGDIV ,2
name4,empId4,241942-37190-USD-CHYOF ,1
name5,empId5,241942-37190-USD-EQPL ,1
name6,empId6,241942-37190-USD-INT ,1
name7,empId7,242066-15343-USD-CYJOF ,3
name8,empId8,242066-15343-USD-CYJOF ,3
name9,empId9,242066-15343-USD-CYJOF ,3
name10,empId10,241942-37190-USD-GGDIV ,2
When I iterate over it with a loop, I can print the data row by row, but only the column 1 data, using the code below.
file=open( path +"xyz.CSV", "r")
reader = csv.reader(file)
for line in reader:
    t=line[0]
    print(t)
With the code above I can only get the first column.
If I try to print line[1] or line[2], it gives me the error below.
file=open( path +"xyz.CSV", "r")
reader = csv.reader(file)
for line in reader:
    t=line[1],line[2]
    print(t)

    t=line[1],line[2]
IndexError: list index out of range
Please suggest how to print the data of column 2 or column 3.
Answers:
Method 1
Here is how I got the 2nd and 3rd columns:
import csv
# backslashes are doubled so the trailing one does not escape the closing quote
path = 'c:\\temp\\'
file = open(path + "xyz.CSV", "r")
reader = csv.reader(file)
for line in reader:
    t = line[1], line[2]
    print(t)
Here are the results:
('col2', 'col3')
('empId1', '241682-27638-USD-CIGGNT ')
('empId2', '241682-27638-USD-OCGGINT ')
('empId3', '241942-37190-USD-GGDIV ')
('empId4', '241942-37190-USD-CHYOF ')
('empId5', '241942-37190-USD-EQPL ')
('empId6', '241942-37190-USD-INT ')
('empId7', '242066-15343-USD-CYJOF ')
('empId8', '242066-15343-USD-CYJOF ')
('empId9', '242066-15343-USD-CYJOF ')
('empId10', '241942-37190-USD-GGDIV ')
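As an alternative to positional indexes, the columns can also be read by header name. The following is only a minimal sketch, assuming Python 3 and assuming the file really starts with the col1,col2,col3,col4 header shown in the question; the in-memory `sample` is a hypothetical stand-in for xyz.CSV.

```python
import csv
import io

# Hypothetical stand-in for xyz.CSV: the header plus the first two sample rows.
sample = io.StringIO(
    "col1,col2,col3,col4\n"
    "name1,empId1,241682-27638-USD-CIGGNT ,1\n"
    "name2,empId2,241682-27638-USD-OCGGINT ,1\n"
)

# DictReader takes the first row as field names, so the header is not yielded as data.
pairs = []
for record in csv.DictReader(sample):
    # strip() removes the trailing space that col3 carries in the sample data.
    pairs.append((record["col2"], record["col3"].strip()))

print(pairs)
```

With a real file, `sample` would be replaced by the object returned from open(); whether to strip the trailing space depends on how the values are used afterwards.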
Method 2
Although this is a pretty old question, I want to share my suggestion: I found it easier to read the CSV into a pandas DataFrame and access the data from there.
import pandas
df = pandas.read_csv('<path/to/your/csv/file>')
print(df)
#OUTPUT
# col1 col2 col3 col4
#0 name1 empId1 241682-27638-USD-CIGGNT 1
#1 name2 empId2 241682-27638-USD-OCGGINT 1
#2 name3 empId3 241942-37190-USD-GGDIV 2
#3 name4 empId4 241942-37190-USD-CHYOF 1
#4 name5 empId5 241942-37190-USD-EQPL 1
#5 name6 empId6 241942-37190-USD-INT 1
#6 name7 empId7 242066-15343-USD-CYJOF 3
#7 name8 empId8 242066-15343-USD-CYJOF 3
#8 name9 empId9 242066-15343-USD-CYJOF 3
#9 name10 empId10 241942-37190-USD-GGDIV 2
#you can access any column using
df['col2']
#OUTPUT
#0 empId1
#1 empId2
#2 empId3
#3 empId4
#4 empId5
#5 empId6
#6 empId7
#7 empId8
#8 empId9
#9 empId10
#Name: col2, dtype: object
#Or print a specific value using
df['col2'][0]
Update: I was mainly using pandas in my project, so I found it easier to use it to read the CSV as well. There are other dedicated libraries available for reading CSV files (writing your own CSV reader should also take only a few lines of code).
Method 3
Your first line only has one column, so indexing line[1] fails on it and the loop never reaches the remaining rows. To solve this, just skip the first row:
>>> with open(path, "r") as file:
...     reader = csv.reader(file)
...     for idx, line in enumerate(reader):
...         if idx > 0:
...             t = line[1], line[2]
...             print(t)
...
('empId1', '241682-27638-USD-CIGGNT ')
('empId2', '241682-27638-USD-OCGGINT ')
('empId3', '241942-37190-USD-GGDIV ')
('empId4', '241942-37190-USD-CHYOF ')
('empId5', '241942-37190-USD-EQPL ')
('empId6', '241942-37190-USD-INT ')
('empId7', '242066-15343-USD-CYJOF ')
('empId8', '242066-15343-USD-CYJOF ')
('empId9', '242066-15343-USD-CYJOF ')
('empId10', '241942-37190-USD-GGDIV ')
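If it is not certain that only the first line is short, a length check can guard every row. This is a rough sketch under assumptions, not the answerer's code: it assumes Python 3, and the one-column title line in `sample` is hypothetical data made up to reproduce the situation described above.

```python
import csv
import io

# Hypothetical file content: a one-column first line followed by regular rows.
sample = io.StringIO(
    "some title line\n"
    "name1,empId1,241682-27638-USD-CIGGNT ,1\n"
    "name2,empId2,241682-27638-USD-OCGGINT ,1\n"
)

reader = csv.reader(sample)
first_row = next(reader, None)  # consume the first row; None if the file is empty

selected = []
for row in reader:
    if len(row) < 3:
        # Too short to have a third column: report it instead of raising IndexError.
        print("skipping line", reader.line_num, row)
        continue
    selected.append((row[1], row[2]))

print(selected)
```

Using next() avoids the idx > 0 test on every iteration, and the len(row) check also covers blank or truncated lines later in the file.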
Method 4
Hope this clears up the issue:
import csv
file = open("xyz.CSV", "r")
reader = csv.reader(file)
for line in reader:
    t = line[0] + "," + line[1]
    print(t)
Method 5
import csv
csv_file = open("xyz.csv", "r")
reader = csv.reader(csv_file)
for row in reader:
    print(" ".join(row[:2]))
Output:
col1 col2
name1 empId1
name2 empId2
name3 empId3
name4 empId4
name5 empId5
name6 empId6
name7 empId7
name8 empId8
name9 empId9
name10 empId10
Just index the row with a slice. Below is the code for printing the 2nd and 3rd columns.
import csv
csv_file = open("xyz.csv", "r")
reader = csv.reader(csv_file)
for row in reader:
    print(" ".join(row[1:3]))
Output:
col2 col3
empId1 241682-27638-USD-CIGGNT
empId2 241682-27638-USD-OCGGINT
empId3 241942-37190-USD-GGDIV
empId4 241942-37190-USD-CHYOF
empId5 241942-37190-USD-EQPL
empId6 241942-37190-USD-INT
empId7 242066-15343-USD-CYJOF
empId8 242066-15343-USD-CYJOF
empId9 242066-15343-USD-CYJOF
empId10 241942-37190-USD-GGDIV
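One side effect of slicing is worth noting: unlike row[1], a slice does not raise IndexError when a row has too few columns. The small sketch below illustrates this with two hand-written lists; `short_row` is a hypothetical one-column row, not data from the question.

```python
full_row = ["name1", "empId1", "241682-27638-USD-CIGGNT ", "1"]
short_row = ["only one column"]  # hypothetical row with a single field

# Slicing past the end of a list is allowed and simply returns fewer items.
full_slice = full_row[1:3]
short_slice = short_row[1:3]
print(full_slice)
print(short_slice)

# Plain indexing on the same short row raises IndexError instead.
try:
    short_row[1]
    index_error = None
except IndexError as exc:
    index_error = str(exc)
print(index_error)
```

So the slice version keeps running on a malformed file, but it prints an empty line for a short row rather than reporting the problem.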
Method 6
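There is a simple method; you can read more about it in the Python CSV docs:
import csv
filename = "xyz.CSV"
data = []
with open(filename, 'r') as csvfile:
    # the file in the question is comma-separated, so use ',' as the delimiter
    spamreader = csv.reader(csvfile, delimiter=',', quotechar='"')
    for row in spamreader:
        data.append(row)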
Method 7
You can also read CSV data without importing pandas or csv:
with open('testdata.csv', 'r') as f:
    results = []
    for line in f:
        # drop the trailing newline so it does not end up in the last field
        words = line.rstrip('\n').split(',')
        results.append((words[0], words[1:]))
    print(results)
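A caveat for the manual split approach: it treats every comma as a separator, including commas inside quoted fields. The sketch below compares it with csv.reader on a single made-up line; the quoted "Smith, John" value is hypothetical and does not appear in the question's data, and Python 3 is assumed.

```python
import csv
import io

# Hypothetical row whose second field contains a comma inside quotes.
line = 'name1,"Smith, John",empId1\n'

# Manual split: the quoted field is cut in two and keeps its quote characters.
naive = line.rstrip("\n").split(",")

# csv.reader understands the quoting and returns three fields.
parsed = next(csv.reader(io.StringIO(line)))

print(naive)
print(parsed)
```

For the file in the question, which has no quoted fields, both approaches give the same columns.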
Method 8
You can use tablebase.
Step 1: Open and store your CSV file.
import tablebase
MyTable = tablebase.CsvTable("<path/to/your/csv/file>")
Step 2: Get your column.
print(MyTable.get_col("ColumnName"))
This will return a list of your column content.
Method 9
To read from and write to a text file in Python, you can use the syntax below:
f = open('helloworld.txt', 'r')
message = f.read()
print(message)
f.close()
f = open('helloworld.txt', 'w')
f.write('hello world')
f.close()
To read the CSV file, use the code below:
results = []
with open("C:/Users/Prateek/Desktop/TA Project/data1.csv") as inputfile:
    for line in inputfile:
        results.append(line.strip().split(','))
Method 10
import pandas as pd
# load the preprocessed CSV data
data_preprocessed = pd.read_csv('file_name.csv')
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.