Python | How to make a program that calculates strings


I'm trying to create a Python script using pandas that can import a .txt file and calculate the average of each subject.

I'm trying to turn this "file.txt":

code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0

into this:

subject1 average is: 4.3
subject2 average is: 6
subject3 average is: 7.5
  • subject1 is calculated like this: (6 + 4 + 3) / 3
  • subject2 is calculated like this: (7 + 5) / 2 <-- a 0 means that person didn't participate, so their 0 is not added and doesn't count toward the average
  • subject3 is calculated like this: (8 + 7) / 2 <-- same as above

I'm also trying to figure out a way for the script to be flexible, so that more subjects and more people can be added (for example, 5 instead of 3).

This is my code until now:

import numpy as np
import pandas as pd

# read input file (columns are separated by whitespace, hence sep=r'\s+')
df = pd.read_csv('file.txt', sep=r'\s+')

# calculate each person's mean, ignoring 0 values
df['mean'] = df.iloc[:, 2:].astype(float).replace(0, np.nan).mean(1)

# iterate rows and print results
for name, mean in df.set_index('name')['mean'].items():
    print(f'{name} has average of {mean:.2f}')
  • It calculates the average of each person (horizontally),
  • but I can't figure out how to do it vertically, for each subject.

thanks for the help guys ^_^


The argument 1 that you pass to DataFrame.mean is the axis along which the mean is calculated. The default is axis=0, which gives one mean per column; by passing 1 you are explicitly telling it to calculate the row-wise mean. Remove that argument and you should be good.

In [155]: df.iloc[:, 2:].astype(float).replace(0, np.nan).mean()
subject1    4.333333
subject2    6.000000
subject3    7.500000
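
Putting it together, here is a minimal sketch of the whole script that prints the per-subject averages in roughly the format you asked for. A few things in it are my assumptions rather than anything from your post: the helper name subject_averages is made up for illustration, io.StringIO stands in for 'file.txt' so the snippet runs on its own, and I assume the columns are separated by whitespace and that every column after code and name is a subject.

```python
import io

import numpy as np
import pandas as pd

# Sample data copied from the question; io.StringIO stands in for 'file.txt'.
SAMPLE = """code name subject1 subject2 subject3
1234 Ali 6 0 8
1235 Carl 4 7 7
1236 Jason 3 5 0
"""


def subject_averages(source):
    """Return the per-subject mean, treating 0 as 'did not participate'.

    Hypothetical helper for illustration; `source` may be a path or a
    file-like object.
    """
    # Assumption: columns are separated by whitespace, as in the sample.
    df = pd.read_csv(source, sep=r"\s+")
    # Assumption: every column after 'code' and 'name' is a subject.
    scores = df.iloc[:, 2:].astype(float).replace(0, np.nan)
    # No axis argument: mean() defaults to axis=0, one mean per column.
    # NaN values are skipped, so the zeros do not count toward the average.
    return scores.mean()


averages = subject_averages(io.StringIO(SAMPLE))
for subject, mean in averages.items():
    print(f"{subject} average is: {mean:.1f}")
```

With the sample data this prints 4.3, 6.0 and 7.5 for the three subjects. Because the subject columns are selected by position rather than by name, adding more subject columns or more rows to the file should not require changing the code; to read your real file, pass 'file.txt' instead of the StringIO object.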

