How to efficiently count the number of occurrences of each word in Python

Joey Joestar

I am trying to count the occurrences of each word in a file, such that the output is something like

the: 102
me: 100
etc

Here is the code I have so far.

from pathlib import Path
from collections import Counter
import string

filepath = Path('input.txt')

with open(filepath) as f:
    content = f.readlines()

word_list = sum((
    (s.strip('\n').translate(str.maketrans('', '', string.punctuation))).split(' ')
    for s in content
), [])

for key,value in Counter(word_list).items():
    print(f'{key} : {value}')

However, this takes an extremely long time when the input file is large. How do I make this workable for large files?

SuperStormer

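I changed f.readlines() to iterating directly over f, and replaced sum with list.extend in a loop. sum(..., []) builds a brand-new list on every addition, so its cost grows quadratically with the number of words, whereas list.extend appends in place.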

from pathlib import Path
from collections import Counter
import string

filepath = Path('input.txt')

with open(filepath) as f:
    word_list = []
    for s in f:
        word_list.extend((s.strip('\n').translate(str.maketrans('', '', string.punctuation))).split(' '))

for key,value in Counter(word_list).items():
    print(f'{key} : {value}')

This works almost instantly on my test file.
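As a possible further refinement (a sketch, not part of the answer above): the intermediate word_list can be skipped entirely by feeding each line's words straight into the Counter, which keeps memory proportional to the number of distinct words. The helper name count_words and the constant PUNCT_TABLE are hypothetical names chosen for illustration. Note one behavioural difference: this sketch uses split() with no argument, which splits on any run of whitespace and drops empty strings, whereas split(' ') in the code above also counts empty strings produced by consecutive spaces.

```python
from collections import Counter
import string

# Build the punctuation-stripping table once, not once per line.
PUNCT_TABLE = str.maketrans('', '', string.punctuation)


def count_words(lines):
    """Count words from any iterable of text lines (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        # split() with no argument splits on whitespace runs and
        # discards empty strings, so no strip('\n') is needed.
        counts.update(line.translate(PUNCT_TABLE).split())
    return counts
```

Assuming the same input.txt as in the question, an open file object can be passed directly, for example count_words(f) inside the with block, and the result printed with the same for loop over .items() (or over .most_common() to list the most frequent words first).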
