sort csv by column

I want to sort a CSV table by date. Started out being a simple task:

import sys
import csv

reader = csv.reader(open("files.csv"), delimiter=";")

for id, path, title, date, author, platform, type, port in reader:
    print date

I used Python’s CSV module to read in a file with that structure:

id;file;description;date;author;platform;type;port
  • The date is ISO-8601, therefore I can sort it quite easily without parsing: 2003-04-22 e. g.
  • I want to sort the by date, newest entries first
  • How do I get this reader into a sortable data-structure? I think with some effort I could make a datelist: datelist += date, split and sort. However I have to re-identify the complete entry in the CSV table. It’s not just sorting a list of things.
  • csv doesn’t seem to have a built in sorting function

The optimal solution would be to have a CSV client that handles the file like a database. I didn’t find anything like that.

I hope somebody knows some nice sorting magic here 😉

Answers:

Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.

Method 1

import operator
sortedlist = sorted(reader, key=operator.itemgetter(3), reverse=True)

or use lambda

sortedlist = sorted(reader, key=lambda row: row[3], reverse=True)

Method 2

To sort by MULTIPLE COLUMN (Sort by column_1, and then sort by column_2)

with open('unsorted.csv',newline='') as csvfile:
    spamreader = csv.DictReader(csvfile, delimiter=";")
    sortedlist = sorted(spamreader, key=lambda row:(row['column_1'],row['column_2']), reverse=False)


with open('sorted.csv', 'w') as f:
    fieldnames = ['column_1', 'column_2', column_3]
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    for row in sortedlist:
        writer.writerow(row)

Method 3

The reader acts like a generator. On a file with some fake data:

>>> import sys, csv
>>> data = csv.reader(open('data.csv'),delimiter=';')
>>> data
<_csv.reader object at 0x1004a11a0>
>>> data.next()
['a', ' b', ' c']
>>> data.next()
['x', ' y', ' z']
>>> data.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

Using operator.itemgetter as Ignacio suggests:

>>> data = csv.reader(open('data.csv'),delimiter=';')
>>> import operator
>>> sortedlist = sorted(data, key=operator.itemgetter(2), reverse=True)
>>> sortedlist
[['x', ' y', ' z'], ['a', ' b', ' c']]

Method 4

for sorting csv by column, i would use something like this

import pandas
csvData = pandas.read_csv('myfile.csv')
csvData.sort_values(["date"], axis=0, ascending=[False], inplace=True)
print(csvData)


All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0

0 0 votes
Article Rating
Subscribe
Notify of
guest

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments
0
Would love your thoughts, please comment.x
()
x