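I am concatenating 100 CSV files with names like XXX_XX_20112020.csv into a single file, e.g. master.csv.
Can I extract the date from each file name, create a new column, and automatically fill in that date for every record from that file? Should I do this before or after concatenating, and how?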
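import datetime
import os
import pandas as pd

folder = 'folder_with_csvs'
frames = []
for file in os.listdir(folder):
    if not file.endswith('.csv'):
        continue  # skip anything in the folder that is not a CSV
    # Take the part after the last underscore and before the ".csv" extension
    date_for_file = file.split('_')[-1].split('.')[0]
    date_for_file = datetime.datetime.strptime(date_for_file, "%d%m%Y").date()
    # os.listdir returns bare file names, so join them with the folder path
    df = pd.read_csv(os.path.join(folder, file))
    # Put the date in the `POST_DATE` column for every record of this file
    df['POST_DATE'] = date_for_file
    frames.append(df)
# Finally, concatenate once (cheaper than growing a DataFrame inside the loop)
master_df = pd.concat(frames, ignore_index=True)
# index=False keeps the row index out of the output file
master_df.to_csv('master.csv', index=False)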
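On the "before or after" part of the question: the date has to be added before concatenating, while each DataFrame can still be traced to its file name. If some files in the folder might not follow the naming scheme, the parsing step could be moved into a small helper that reports a bad name instead of raising. The following is only a sketch under assumptions: the helper name `date_from_filename` and the regular expression are hypothetical, and the DDMMYYYY reading of the suffix is taken from the `"%d%m%Y"` format used above.

```python
import datetime
import re
from typing import Optional

# Assumed pattern: an underscore, then 8 digits (DDMMYYYY), then ".csv" at the end
_DATE_SUFFIX = re.compile(r"_(\d{8})\.csv$", re.IGNORECASE)


def date_from_filename(name: str) -> Optional[datetime.date]:
    """Return the date encoded at the end of `name`, or None if there is none."""
    match = _DATE_SUFFIX.search(name)
    if match is None:
        return None
    try:
        return datetime.datetime.strptime(match.group(1), "%d%m%Y").date()
    except ValueError:
        # Eight digits that do not form a valid day/month/year
        return None


print(date_from_filename("XXX_XX_20112020.csv"))  # 2020-11-20
print(date_from_filename("notes.txt"))            # None
```

Inside the loop, a file for which the helper returns None could then be skipped or logged, and otherwise the returned value assigned to `df['POST_DATE']` as before.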