Python - Calculate average for every column in a csv file

Pabloo LR

I'm new to Python and I'm trying to get the average of every column (or row) of a csv file, and then select the values that are higher than double the average of their column (or row). My file has hundreds of columns and contains float values like these:

845.123,452.234,653.23,...
432.123,213.452,421.532,...
743.234,532,432.423,...

I've tried several changes to my code to get the average for every column (separately), but at the moment my code looks like this:

def AverageColumn (c):
    f=open(csv,"r")
    average=0
    Sum=0
    column=len(f)
    for i in range(0,column):
        for n in i.split(','):
            n=float(n)
            Sum += n
        average = Sum / len(column)
    return 'The average is:', average

    f.close()


csv="MDT25.csv"
print AverageColumn(csv)

But I always get an error like "f has no len()" or "'int' object is not iterable"...

I'd really appreciate it if someone could show me how to get the average for every column (or row, as you prefer), and then select the values that are higher than double the average of their column (or row). I'd rather not import modules such as csv, but whatever you prefer. Thanks!

monkut

Here's a cleanup of your function, but it probably doesn't do what you want it to do. Currently, it computes the average of all values in all columns:

def average_column(csv):
    f = open(csv, "r")
    total = 0.0
    value_count = 0
    for row in f:
        for column in row.split(','):
            total += float(column)
            # count every value so the sum is divided by the right number
            value_count += 1
    f.close()
    average = total / value_count
    return 'The average is:', average
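
Since you would rather not import anything, here is a minimal sketch of a per-column version that uses only built-ins. The name column_averages is just an illustrative choice, and the sketch assumes every non-blank line contains only comma-separated numbers (no header row, no quoted fields):

```python
def column_averages(lines):
    """Return a list with the average of each column.

    `lines` is any iterable of comma-separated text lines,
    for example an open file object.
    """
    totals = []  # running sum per column
    counts = []  # number of values seen per column
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        for idx, field in enumerate(line.split(',')):
            if idx == len(totals):
                # first time this column index is seen
                totals.append(0.0)
                counts.append(0)
            totals[idx] += float(field)
            counts[idx] += 1
    return [total / count for total, count in zip(totals, counts)]


sample = ["1.0,2.0,3.0", "3.0,4.0,5.0"]
print(column_averages(sample))  # [2.0, 3.0, 4.0]
```

Because a file object yields its lines when iterated, an open file (for example, the result of opening MDT25.csv inside a with block) can be passed in place of the sample list.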

I would use the csv module (which makes csv parsing easier), with a Counter object to manage the column totals and a context manager to open the file (no need for a close()):

import csv
from collections import Counter

def average_column (csv_filepath):
    column_totals = Counter()
    with open(csv_filepath,"rb") as f:
        reader = csv.reader(f)
        row_count = 0.0
        for row in reader:
            for column_idx, column_value in enumerate(row):
                try:
                    n = float(column_value)
                    column_totals[column_idx] += n
                except ValueError:
                    print "Error -- ({}) Column({}) could not be converted to float!".format(column_value, column_idx)                    
            row_count += 1.0

    # row_count already equals the number of rows read, so no adjustment
    # is needed (this assumes the file has no header row, as in the sample data)

    # make sure column index keys are in order
    column_indexes = sorted(column_totals)

    # calculate per column averages using a list comprehension
    averages = [column_totals[idx]/row_count for idx in column_indexes]
    return averages
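
For the second part of the question (selecting the values that are higher than double the average of their column), one possible approach is a second pass over the data once the averages are known. The sketch below is only an illustration: the function name values_above_double_average is hypothetical, and it assumes the rows have already been parsed into lists of floats and that averages is a list of per-column averages in column order.

```python
def values_above_double_average(rows, averages):
    """Return (row_index, column_index, value) for every value that is
    greater than twice the average of its column."""
    selected = []
    for row_idx, row in enumerate(rows):
        for col_idx, value in enumerate(row):
            if value > 2 * averages[col_idx]:
                selected.append((row_idx, col_idx, value))
    return selected


# Small made-up example: the averages of the two columns are 4.0 and 10.0
rows = [[1.0, 10.0], [1.0, 10.0], [10.0, 10.0]]
averages = [4.0, 10.0]
print(values_above_double_average(rows, averages))  # [(2, 0, 10.0)]
```

Returning the row and column index along with each value makes it possible to locate the selected values in the original file afterwards.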
