How do I sort data from a CSV file numerically in Python?
I am writing a program that takes students' scores from a CSV file and needs to sort them from highest to lowest score. The CSV file looks like this:
josh 12
john 6
fred 8
harry 7
I have tried to put the items in a list like this:
Mylist=[]
csvfile = open (classname,'r')
reader = csv.reader(csvfile)
for row in reader:
    Mylist.append(row)
Then I reverse each row to put the numerical value first:
Mynewlist = []
for each in Mylist:
    value2 = ''.join(each[0])
    value1 = ''.join(each[1])
    mynewlist.append(value1,value2)
That did not work. I get this error:
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    main()
  File "\\SRV-FILE3\ca231$\task 3\3.py", line 143, in main
    value1 = ''.join(each[1])
IndexError: list index out of range
I use ''.join(each[1]) to convert each value to a string and then append them in the opposite order. I then planned to use .sort() to sort them numerically, but I can't get them to append to a list.
Does anyone know how to sort the contents of a CSV file by its numerical value?
Solution 1:[1]
I think you're overcomplicating things. Assuming you have the data as a list of lists:
data = [("josh", "12"), ("john", "6"), ("fred", "8"), ("harry", "7")]
This could come from CSV of course, it doesn't matter to the sorting. You can sort just by calling sorted():
sorted(data, key=lambda x: int(x[1]))
The lambda is a function that picks the second element of each sub-list as the key, i.e. the score, and converts it to a number for sorting. This gives:
[('john', '6'), ('harry', '7'), ('fred', '8'), ('josh', '12')]
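The int() conversion in the key matters because scores read from a file are strings, and strings compare character by character. Here is a minimal sketch using the same data list as above. The reverse=True argument is an addition, to get the highest-to-lowest order the question asks for:

```python
data = [("josh", "12"), ("john", "6"), ("fred", "8"), ("harry", "7")]

# Without int(), the scores compare as text, so "12" sorts before "6"
as_text = sorted(data, key=lambda x: x[1])

# With int(), the scores compare as numbers; reverse=True puts the highest first
by_score = sorted(data, key=lambda x: int(x[1]), reverse=True)

print(as_text)
print(by_score)
```

The first list comes out as josh, john, harry, fred (text order), the second as josh, fred, harry, john (numeric order, highest first).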
Solution 2:[2]
If all your CSV contains is a name and a number, and your names are unique, then:
- Store the CSV contents in a dict as {name: score}.
- Use the code below to sort based on values (scores in your case):
import operator
x = {"josh": 12, "john": 6, "fred": 8, "harry": 7}
# items() works on Python 2 and 3; iteritems() exists only on Python 2
sorted_x = sorted(x.items(), key=operator.itemgetter(1))
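To get from the file to the {name: score} dict, something like the following could work. It is only a sketch: load_scores is a hypothetical helper, the space delimiter is an assumption based on the sample in the question, and io.StringIO stands in for an open file. The extra "john 9" line is made up to show why the names need to be unique.

```python
import csv
import io
import operator

def load_scores(lines):
    """Build a {name: score} dict from space-separated 'name score' lines."""
    scores = {}
    for row in csv.reader(lines, delimiter=" "):
        if len(row) >= 2:  # ignore blank or incomplete lines
            scores[row[0]] = int(row[1])  # a repeated name overwrites the earlier score
    return scores

# io.StringIO stands in for an open file here (made-up sample data)
sample = io.StringIO("josh 12\njohn 6\nfred 8\nharry 7\njohn 9\n")
x = load_scores(sample)
sorted_x = sorted(x.items(), key=operator.itemgetter(1), reverse=True)
print(sorted_x)
```

Only one entry for john survives (the later score, 9), which is why a dict is a poor fit when the same student can appear more than once.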
Solution 3:[3]
You could do something like this (create a dictionary out of your values):
my_dict = {}
for row in reader:
    # Convert the score to int so it sorts numerically rather than as text
    my_dict[row[0]] = int(row[1])
Then you can build a sorted representation of the dictionary (a dict does not keep its entries sorted by value, so this will be a list):
import operator
sorted_dict = sorted(my_dict.items(), key=operator.itemgetter(1))
It's worth noting that there are better / simpler ways to do this (pandas, for instance), but at least you learn a different approach :)
Solution 4:[4]
import csv
result = []
with open("data", 'r') as f:
    r = csv.reader(f, delimiter=' ')
    # next(r, None) # skip the headers
    for row in r:
        result.append(row[:2])  # keep only the name and the score
# sort by the numeric part, converted to int so that '12' sorts after '8'
print(sorted(result, key=lambda row: int(row[1])))
Output:
[['john', '6'], ['harry', '7'], ['fred', '8'], ['josh', '12']]
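An IndexError like the one in the question usually means that some row has fewer fields than expected, for example because the file is space-separated but was parsed with the default comma delimiter, or because it contains a blank line. Here is a sketch under those assumptions. The helper name read_rows and the sample text are hypothetical:

```python
import csv
import io

def read_rows(lines, delimiter=" "):
    """Return [name, score] rows, skipping blank or incomplete lines."""
    rows = []
    for row in csv.reader(lines, delimiter=delimiter):
        row = [field for field in row if field]  # drop empty fields left by extra spaces
        if len(row) < 2:
            continue  # blank line or missing score
        rows.append(row[:2])
    return rows

# Made-up sample with a blank line and a trailing space
sample = "josh 12\n\njohn 6 \nfred 8\nharry 7\n"

# Parsed with the default comma delimiter, every row has at most one field,
# so indexing row[1] would raise IndexError
comma_rows = list(csv.reader(io.StringIO(sample)))

rows = read_rows(io.StringIO(sample))
result = sorted(rows, key=lambda row: int(row[1]), reverse=True)
print(result)
```

Checking len(row) before indexing keeps one malformed line from aborting the whole sort.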
Solution 5:[5]
You can utilize pandas for this.
import pandas as pd
df = pd.read_csv('students.csv', header=None)
df.columns = ['Name', 'Score']
# DataFrame.sort() has been removed from pandas; sort_values() is its replacement
df.sort_values('Score', ascending=False, inplace=True)
At the end of this, you will have a data frame that looks like this:
    Name  Score
0   josh     12
2   fred      8
3  harry      7
1   john      6
The code reads your CSV file and explicitly states that there isn't a header. By default, pandas assumes that the first row contains column headers. Since there aren't any headers, we then add them: Name and Score. Finally, we sort in place, based on the Score column. You could leave the original dataframe unchanged by removing the inplace= parameter and doing this:
sorted_df = df.sort_values('Score', ascending=False)
After this line, you'd have your original data in df and the sorted data in sorted_df.
Solution 6:[6]
If your data in the csv file looks like this:
josh 12
john 6
fred 8
harry 7
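Then you can create a dictionary and use key=d.__getitem__:
import csv
with open('yourfile.csv', 'r', newline='') as f:
    # The default delimiter is a comma; pass delimiter=' ' for space-separated columns
    reader = csv.reader(f)
    d = {}
    for row in reader:
        d[row[0]] = int(row[1])
# Sort the names by their scores, highest first
k = sorted(d, key=d.__getitem__, reverse=True)
# Pair each name with its score (a list, so it prints its contents on Python 3)
sorted_d = [(name, d[name]) for name in k]
print(sorted_d)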
Output:
[('josh', 12), ('fred', 8), ('harry', 7), ('john', 6)]
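If two students can have the same score, it may help to sort the (name, score) pairs directly and let the name break ties. This is a small sketch, not part of the answer above. The scores are made up so that fred and harry tie:

```python
# Hypothetical scores in which fred and harry tie
d = {"josh": 12, "john": 6, "harry": 8, "fred": 8}

# Sort by score descending; tied scores fall back to alphabetical name order
sorted_d = sorted(d.items(), key=lambda item: (-item[1], item[0]))
print(sorted_d)
```

Negating the score inside the key gives descending scores while keeping names in ascending order, which reverse=True alone cannot do.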
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | labheshr |
Solution 3 | |
Solution 4 | |
Solution 5 | Andy |
Solution 6 | Joe T. Boka |