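I am downloading data from an API whose maximum lookback period is 833 days, but from my testing I know their data goes back to 2002. The snippet below defines two dates, "end" (today) and "start" (833 days before today), which are passed into an API call. Note that they need to be strings formatted this way for the API to accept them.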
from datetime import datetime, timedelta

d = datetime.today()
end = str(d.year) + "-" + str(d.month) + "-" + str(d.day)
lookbook_period = 833
# Take the current date and subtract the max time delta of 833 days to get the 'start' var
time_delta = timedelta(days=lookbook_period)
now = datetime.now()
# Split the result and keep only the date part, in the required timestamp format.
start = str(now - time_delta).split(" ")[0]
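What I want to do is download the data in 833-day chunks as dataframes and then stitch them together into a single CSV or dataframe. So far I have the following, but I'm not sure how to write a function that changes the dates on each pass.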
import pandas as pd

def time_machine():
    df_total = pd.DataFrame()  # empty frame to collect the downloaded chunks
    start_str = str(2002) + "-0" + str(5) + "-0" + str(1)
    start = datetime(2002, 5, 1)
    print(start)
    # Number of days from 2002-05-01 to now
    rolling_td = timedelta(days=(datetime.today() - start).days)
    print(rolling_td, "\n")
    # API maximum number of lookback days
    max_td = timedelta(days=833)
    # The function would do something similar to this and, on each pass, call the API and save the data to a dataframe or CSV.
    s1 = start + max_td
    print(s1)
    s2 = s1 + max_td
    print(s2)
    s3 = s2 + max_td
    print(s3)
    d = datetime.today()
    end = str(d.year) + "-" + str(d.month) + "-" + str(d.day)
    print(d)
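Any suggestions, or tools/libraries to look at, would be greatly appreciated. I have been testing things with a while loop, but I'm still fumbling blindly and getting stuck in that loop.
Here is rough pseudocode for what I think I need, but I'm still not sure how to get to the next part.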
while count > 0 and count > 833:
    start =
    end =
    call the API to download the first set of data
    check date range:
        get most recent date and add 833 days to it
    download next section
    repeat
if count < 833:
    calculate required dates for start and end
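If you define the date ranges first, you can iterate over each 833-day period and pull the data from the API. Then, on each iteration, you append the data to a dataframe (or CSV).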
import datetime as dt

# Date range to pull data over
start_date = dt.date(2002, 5, 1)
end_date = dt.date.today()
delta = dt.timedelta(days=832)  # 832 so you have a range of 833 days inclusive

# Iterating from start date, recording date ranges of 833 days
date_ranges = []
temp_start_date = start_date
# <= rather than < so a final one-day range (start == end) is not dropped
while temp_start_date <= end_date:
    temp_end_date = temp_start_date + delta
    if temp_end_date > end_date:
        temp_end_date = end_date
    date_ranges.append([temp_start_date, temp_end_date])
    temp_start_date = temp_end_date + dt.timedelta(days=1)

# For each date range, pass dates into API
# Initialise dataframe here
for start_date, end_date in date_ranges:
    start_date_str = start_date.strftime("%Y-%m-%d")
    end_date_str = end_date.strftime("%Y-%m-%d")
    # Input to API with start and end dates in correct string format
    # Process data into dataframe
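Since the question asks for a function that changes the dates on each pass, the same loop could be wrapped in a small generator. This is only a sketch: the name `iter_date_ranges` and the `max_days` parameter are made up for illustration and are not part of any library. It treats both ends of each range as inclusive and uses `<=` so that a final one-day range is not dropped.

```python
import datetime as dt
from typing import Iterator, Tuple


def iter_date_ranges(
    start: dt.date, end: dt.date, max_days: int = 833
) -> Iterator[Tuple[dt.date, dt.date]]:
    """Yield consecutive (range_start, range_end) pairs covering start..end.

    Both ends are inclusive and each pair spans at most max_days days.
    (Hypothetical helper, named here for illustration only.)
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    step = dt.timedelta(days=max_days - 1)  # inclusive span of max_days days
    one_day = dt.timedelta(days=1)
    range_start = start
    while range_start <= end:
        range_end = min(range_start + step, end)
        yield range_start, range_end
        range_start = range_end + one_day


# Print every window from 2002-05-01 to today in the string format used above
for range_start, range_end in iter_date_ranges(dt.date(2002, 5, 1), dt.date.today()):
    print(range_start.strftime("%Y-%m-%d"), range_end.strftime("%Y-%m-%d"))
```

A generator keeps the date arithmetic in one place, so the code that calls the API only ever sees a ready-made start and end for each window.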
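There is no need to count out the 833 days by hand. As you said, the API takes a start date and an end date as parameters, so you only need to work those out for each date range.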
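For the stitching step, one possible shape is sketched below using only the standard library. The real API is not shown in the question, so `fetch_chunk` is a hypothetical stand-in that returns made-up rows, and `download_to_csv` is likewise an illustrative name. The sketch assumes each call returns a list of dicts with the same keys.

```python
import csv
import datetime as dt
import io


def fetch_chunk(start_str, end_str):
    """Hypothetical stand-in for the real API call; returns made-up rows."""
    return [
        {"date": start_str, "value": 1.0},
        {"date": end_str, "value": 2.0},
    ]


def download_to_csv(date_ranges, out_file, fetch=fetch_chunk):
    """Call fetch once per date range and append every row to out_file as CSV."""
    writer = None
    total_rows = 0
    for range_start, range_end in date_ranges:
        rows = fetch(range_start.strftime("%Y-%m-%d"), range_end.strftime("%Y-%m-%d"))
        if not rows:
            continue  # nothing returned for this window
        if writer is None:
            # Write the header once, using the keys of the first row
            writer = csv.DictWriter(out_file, fieldnames=list(rows[0].keys()))
            writer.writeheader()
        writer.writerows(rows)
        total_rows += len(rows)
    return total_rows


# Two example windows of 833 days each; in practice these come from the date-range loop
example_ranges = [
    (dt.date(2002, 5, 1), dt.date(2004, 8, 10)),
    (dt.date(2004, 8, 11), dt.date(2006, 11, 21)),
]
buffer = io.StringIO()  # swap for open("data.csv", "w", newline="") to write a file
row_count = download_to_csv(example_ranges, buffer)
print(row_count)
print(buffer.getvalue())
```

If you would rather end up with a dataframe, a common pattern is to collect one DataFrame per window in a list and call `pd.concat` once after the loop, which avoids repeatedly growing a single frame.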