How to find the frequency of words in a list created from a .csv file
I am trying to write a program that first reads in the name of an input file and then reads the file using the csv.reader() method. The file contains a list of words separated by commas. The program should output the words and their frequencies (the number of times each word appears in the file) without any duplicates.
The file input1.csv contains: hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy
So far I have this:
import csv
with open('input1.csv', 'r') as wordsfile:
    words_reader = csv.reader(wordsfile)
    for row in words_reader:
        for word in row:
            count = row.count(word)
            print(word, count)
But my output is this: "hello 1 cat 2 man 2 hey 2 dog 2 boy 2 Hello 1 man 2 cat 2 woman 1 dog 2 Cat 1 hey 2 boy 2"
I am trying to produce this output but without any duplicates. I'm stumped, and any help would be appreciated.
Solution 1:[1]
Try using set()
import csv
with open('input1.csv', 'r') as wordsfile:
    words_reader = csv.reader(wordsfile)
    for row in words_reader:
        list_of_words = set(row)
        for word in list_of_words:
            count = row.count(word)
            print(word, count)
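I am not very familiar with the csv library and I wasn't sure whether row is a list, so sorry if this throws an error. (csv.reader does yield each row as a list of strings, so the code above should work as written.) If row were a single comma-separated string instead, you could probably split it first:
row = row.split(',')
list_of_words = set(row)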
Hope it helps.
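One thing to keep in mind with the set() approach is that a set has no defined order, so the words may print in a different order than they appear in the file. Below is a minimal sketch of the same idea that sorts the set for reproducible output. It assumes the sample data from the question; the in-memory `sample` string and the helper name `unique_word_counts` are made up for illustration and stand in for the real file.

```python
import csv
import io

# In-memory stand-in for input1.csv (sample data from the question).
sample = "hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy\n"

def unique_word_counts(text):
    """Return (word, count) pairs, one per distinct word in each row, sorted by word."""
    pairs = []
    for row in csv.reader(io.StringIO(text)):
        # csv.reader yields each row as a list of strings, so list.count works.
        # sorted() gives a stable order; iterating the bare set would not.
        for word in sorted(set(row)):
            pairs.append((word, row.count(word)))
    return pairs

for word, count in unique_word_counts(sample):
    print(word, count)
```

Note that the sort is case-sensitive, so capitalized words such as Cat and Hello come before the lowercase ones.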
Solution 2:[2]
import csv
input1 = input()
with open(input1, 'r') as wordsfile:
    words_reader = csv.reader(wordsfile)
    for row in words_reader:
        list_of_words = row
        no_duplicates_in_list = list(dict.fromkeys(list_of_words))
        listlength = len(no_duplicates_in_list)
        for i in range(listlength):
            print(no_duplicates_in_list[i], list_of_words.count(no_duplicates_in_list[i]))
Pretty much the same as Aryman's answer, but the words are printed in the same order as they first appear in the csv.
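The order-preserving step relies on dict.fromkeys, which keeps only the first occurrence of each key; dictionaries keep insertion order in Python 3.7 and later. A small sketch of just that step, using the question's sample words as a hard-coded list (the variable names here are illustrative, not from the answer):

```python
# The question's sample row, hard-coded so the snippet runs without a file.
row = ["hello", "cat", "man", "hey", "dog", "boy", "Hello",
       "man", "cat", "woman", "dog", "Cat", "hey", "boy"]

# dict.fromkeys keeps the first occurrence of each word, in insertion order.
unique_in_order = list(dict.fromkeys(row))

# Iterate over the words directly instead of indexing with range(len(...)).
for word in unique_in_order:
    print(word, row.count(word))
```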
Solution 3:[3]
Alright, so I'm pretty basic with Python but I was able to figure this out in about an hour of trying different for loops etc. I ended up sticking to using lists as that is what the assignment indicated in the instructions. In order to get rid of the duplicates within the first list, I made a second list and nested an if statement that only adds words that aren't contained within it, resulting in a new list of one copy of each word from the first.
import csv
filename = input()
words = []
new_words = []
with open(filename, 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter = ',')
    for row in reader:
        for word in row:
            words.append(word)
for word in words:
    freq = words.count(word)
    if word not in new_words:
        new_words.append(word)
        print(word, freq)
Solution 4:[4]
import csv
name = input()
with open(name, 'r') as myfile:
    Reader = csv.reader(myfile, delimiter=',')
    dictionary = dict()
    for l in Reader:
        for m in l:
            if m in dictionary:
                dictionary[m] = dictionary[m] + 1
            else:
                dictionary[m] = 1
for n in list(dictionary.keys()):
    print("{} {}".format(n, dictionary[n]))
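The manual dictionary tally above can also be written with collections.Counter from the standard library. This is an alternative sketch rather than what the answer uses; the helper name `count_words` and the in-memory sample are assumptions made so the snippet runs without a file.

```python
import csv
import io
from collections import Counter

def count_words(csvfile):
    """Tally every word across all rows of an open CSV file object."""
    counts = Counter()
    for row in csv.reader(csvfile):
        # update() adds one to the tally for each word in the row.
        counts.update(row)
    return counts

# In-memory stand-in for the input file (sample data from the question).
sample = io.StringIO("hello,cat,man,hey,dog,boy,Hello,man,cat,woman,dog,Cat,hey,boy\n")
counts = count_words(sample)

# Counter is a dict subclass, so on Python 3.7+ it iterates in first-seen order.
for word, freq in counts.items():
    print(word, freq)
```

With a real file, the same function could be called as count_words(open_file) inside a with open(...) block. Unlike the row.count() approaches, this makes a single pass over the words.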
Solution 5:[5]
import csv
words = {}
user_file = input()
with open(user_file, "r") as csvfile:
    inputreader = csv.reader(csvfile)
    for row in inputreader:
        listofwords = row
        for i in row:
            if i in words:
                words[i] += 1
            else:
                words[i] = 1
for i in words:
    print(i, words[i])
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | |
Solution 3 | GoldenTeacher91 |
Solution 4 | Flair |
Solution 5 | Joshua Desir |