I'm working on code that reads timestamps from a CSV file. The problem is that I need to store a count for every hour of a 12-hour interval. Here's my code:
import pandas as pd

data = pd.read_csv("2021-08-13.csv", parse_dates=['time'], infer_datetime_format=True)
# .copy() avoids SettingWithCopyWarning when assigning to the filtered frame
datafilter = data[data.lane == "Lane 1"].copy()
datafilter['time'] = pd.to_datetime(datafilter['time'], errors='coerce')
df = datafilter['time'].groupby(datafilter.time.dt.to_period("H")).agg('count')
Printing df gives me:
2021-08-18 01:00 20
2021-08-18 02:00 8
2021-08-18 03:00 8
2021-08-18 04:00 13
2021-08-18 05:00 15
2021-08-18 06:00 17
2021-08-18 07:00 23
2021-08-18 08:00 27
2021-08-18 09:00 27
2021-08-18 10:00 28
2021-08-18 11:00 17
2021-08-18 12:00 12
No matter how hard I try, I cannot find a way to store this the way I want. For example, when there are no records in the CSV file between 6:00 and 7:00, the line with that timestamp disappears. How can I make it print like this?
2021-08-18 00:00 32
2021-08-18 01:00 0 <---
2021-08-18 02:00 8
Use Grouper to fill in the missing hours in between:
df = datafilter.groupby(pd.Grouper(freq='H', key='time'))['time'].count()
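As a minimal sketch of what Grouper does here, the following uses made-up timestamps (the `events` frame is hypothetical, not the asker's CSV) with no rows in the 02:00 hour. It uses the lowercase 'h' frequency alias, which newer pandas releases prefer over 'H'.

```python
import pandas as pd

# Hypothetical sample: two events at 01:xx, none at 02:xx, one at 03:xx.
events = pd.DataFrame({
    "time": pd.to_datetime([
        "2021-08-18 01:05:00",
        "2021-08-18 01:40:00",
        "2021-08-18 03:15:00",
    ])
})

# Grouper bins the rows by hour and keeps the empty bins that fall
# between the first and last timestamp, so 02:00 appears with a count of 0.
hourly = events.groupby(pd.Grouper(key="time", freq="h"))["time"].count()
print(hourly)
```

Note that only the hours between the first and last record are filled; hours before the first record or after the last one are still absent.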
If you also need the 00:00 and 12:00 rows before and after the data, use reindex. For this sample input:
time a
0 2021-08-18 01:00:00 20
1 2021-08-18 03:00:00 8
2 2021-08-18 04:00:00 13
3 2021-08-18 05:00:00 15
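# Midnight of the first day, and midnight of the last day plus 12 hours
first = datafilter['time'].min().normalize()
last = datafilter['time'].max().normalize() + pd.Timedelta(hours=12)
# Full hourly index; hours without records get a count of 0
r = pd.date_range(first, last, freq='H')
df = datafilter.groupby(pd.Grouper(freq='H', key='time'))['time'].count().reindex(r, fill_value=0)
print(df)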
2021-08-18 00:00:00 0
2021-08-18 01:00:00 1
2021-08-18 02:00:00 0
2021-08-18 03:00:00 1
2021-08-18 04:00:00 1
2021-08-18 05:00:00 1
2021-08-18 06:00:00 0
2021-08-18 07:00:00 0
2021-08-18 08:00:00 0
2021-08-18 09:00:00 0
2021-08-18 10:00:00 0
2021-08-18 11:00:00 0
2021-08-18 12:00:00 0
Freq: H, Name: time, dtype: int64
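The reindex idea can also be wrapped in a small helper. This is only a sketch: `hourly_counts` and its `hours` parameter are hypothetical names, it assumes all timestamps fall on a single day, and it counts with `floor` plus `value_counts` rather than Grouper. Unlike the code above, the window always starts at midnight of the first timestamp's day and runs a fixed number of hours; records outside that window are dropped by the reindex.

```python
import pandas as pd


def hourly_counts(times: pd.Series, hours: int = 12) -> pd.Series:
    """Count timestamps per hour from midnight through `hours` hours later."""
    # Unparseable values become NaT and are dropped before counting.
    times = pd.to_datetime(times, errors="coerce").dropna()
    start = times.min().normalize()
    # hours + 1 points so that both 00:00 and the closing hour are present.
    window = pd.date_range(start, periods=hours + 1, freq="h")
    # Truncate each timestamp to its hour, then count occurrences per hour.
    counts = times.dt.floor("h").value_counts()
    # Hours with no records get a count of 0.
    return counts.reindex(window, fill_value=0)


# Hypothetical sample mirroring the four rows shown above.
sample = pd.Series(pd.to_datetime([
    "2021-08-18 01:10:00",
    "2021-08-18 03:20:00",
    "2021-08-18 04:30:00",
    "2021-08-18 05:45:00",
]))
result = hourly_counts(sample)
print(result)
```

With the sample above, the result has 13 rows, from 00:00 through 12:00, with zeros for every hour that has no record.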