I have a CSV file with 25000 rows. I want to write the average of every 30 rows to another CSV file.
Here is an example with 9 rows averaged in groups of 3, so the new CSV file has 3 rows (3, 1, 2):
| H |
========
| 1 |---\
| 3 | |------------->| 3 |
| 5 |---/
| -1 |---\
| 3 | |------------->| 1 |
| 1 |---/
| 0 |---\
| 5 | |------------->| 2 |
| 1 |---/
What I did:
import numpy as np
import pandas as pd
m_path = "ALL0001.CSV"
m_df = pd.read_csv(m_path, usecols=['Col-01'])
m_arr = np.array([])
temp = m_df.to_numpy()
step = 30
# Start at 0 so the first row is included; the last slice may be shorter than step
for i in range(0, len(temp), step):
    # np.append returns a new array, so assign the result back to m_arr
    m_arr = np.append(m_arr, np.average(temp[i:i + step]))
m_df = pd.DataFrame({'Column1': m_arr})
m_df.to_csv('AVG.csv')
This works well, but is there a better solution?
You can use integer division of the index by step to label consecutive groups, then pass those labels to groupby and aggregate with mean:
step = 30
m_df = pd.read_csv(m_path, usecols=['Col-01'])
df = m_df.groupby(m_df.index // step).mean()
Or:
df = m_df.groupby(np.arange(len(m_df)) // step).mean()
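The two forms only differ when the index is not the default 0, 1, 2, ... range. Here is a minimal sketch of that difference, assuming a made-up index (the labels 10, 11, 12, 20, ... are hypothetical, chosen only to show the effect): grouping by m_df.index // step uses the index labels, while np.arange(len(m_df)) // step uses row positions.

```python
import numpy as np
import pandas as pd

# The 9 sample values from the question, with a hypothetical non-default
# index such as you might get after filtering rows.
m_df = pd.DataFrame(
    {"H": [1, 3, 5, -1, 3, 1, 0, 5, 1]},
    index=[10, 11, 12, 20, 21, 22, 30, 31, 32],
)
step = 3

# Groups by index label: 10 // 3 == 3, 12 // 3 == 4, ... so the rows are
# NOT split into consecutive blocks of three.
by_label = m_df.groupby(m_df.index // step).mean()

# Groups by row position: rows 0-2, 3-5, 6-8, regardless of the index.
by_position = m_df.groupby(np.arange(len(m_df)) // step).mean()

print(len(by_label))                # number of groups when using labels
print(by_position["H"].tolist())    # block means when using positions
```

If the frame comes straight from read_csv with no index column, both forms give the same result.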
Sample data:
step = 3
df = m_df.groupby(m_df.index // step).mean()
print (df)
H
0 3
1 1
2 2
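For completeness, a rough end-to-end sketch that also writes the result out. It assumes a single column named H and reads from an in-memory string instead of a real file (csv_text and out are stand-ins, not part of the original question); one extra row is added so the last group is incomplete.

```python
import io
import pandas as pd

# Hypothetical stand-in for the input file: the 9 sample values plus one
# extra row (7), so the final group has fewer than `step` rows.
csv_text = "H\n1\n3\n5\n-1\n3\n1\n0\n5\n1\n7\n"
m_df = pd.read_csv(io.StringIO(csv_text), usecols=["H"])

step = 3
# Rows 0-2 get label 0, rows 3-5 label 1, and so on; the trailing row
# forms its own smaller group and is averaged by itself.
result = m_df.groupby(m_df.index // step).mean()

# index=False keeps the group labels out of the output.
out = io.StringIO()  # replace with a path such as "AVG.csv" to write a file
result.to_csv(out, index=False)
print(out.getvalue())
```

With 25000 rows and step = 30, the same approach would give 834 rows, because 25000 = 833 * 30 + 10: the last row averages only the final 10 values.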