I'm having trouble concatenating three DataFrames with pandas. The rows of one of my DataFrames do not line up with the rows of the other two (see the code and output below):
import requests
import pandas as pd
from bs4 import BeautifulSoup

List = ['LU0526609390:EUR', 'IE00BHBX0Z19:EUR', 'LU1076093779:EUR', 'LU1116896363:EUR']
df = pd.DataFrame(List, columns=['List'])
urls = 'https://markets.ft.com/data/funds/tearsheet/summary?s=' + df['List']

dfs = []
results = pd.DataFrame()
for url in urls:
    print(url)
    r = requests.get(url).content
    soup = BeautifulSoup(r, 'html.parser')
    elemList = soup.find('title')
    df0 = pd.DataFrame(elemList, columns=['Fund Name'])
    df0["Fund Name"] = df0["Fund Name"].str.replace("summary - FT.com", "", regex=True)
    table1 = soup.find_all('table')[0]
    table2 = soup.find_all('table')[1]
    df1 = pd.read_html(str(table1), index_col=0)[0].T
    df2 = pd.read_html(str(table2), index_col=0)[0].T
    df = pd.concat([df0, df1, df2], axis=1)
    dfs.append(df)

pd.concat(dfs).to_csv(r'/Users/Test.csv', index=False)
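My output looks like this:
It seems that the rows of my df0 DataFrame (column: 'Fund Name') do not line up with the rows of my other DataFrames. I'd appreciate it if someone could tell me why this happens. Thanks!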
The idea is to add the Fund Name column as the first column with DataFrame.insert:
dfs = []
results = pd.DataFrame()
for url in urls:
    print(url)
    r = requests.get(url).content
    soup = BeautifulSoup(r, 'html.parser')
    elemList = soup.find('title')
    table1 = soup.find_all('table')[0]
    table2 = soup.find_all('table')[1]
    df1 = pd.read_html(str(table1), index_col=0)[0].T
    df2 = pd.read_html(str(table2), index_col=0)[0].T
    # print (df2)
    # Join only the two tables first
    df = pd.concat([df1, df2], axis=1)
    # Then add the fund name as the first column of the joined frame
    df.insert(0, 'Fund Name', elemList)
    df["Fund Name"] = df["Fund Name"].str.replace("summary - FT.com", "", regex=True)
    dfs.append(df)
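
As for why the rows were misaligned: pd.concat with axis=1 aligns rows on their index labels. df0 is built from a list-like, so it gets the default index 0, while the transposed tables presumably carry the label of their former column, so the labels do not match and the name lands on a separate row. The sketch below reproduces this without any network access. The fund name, the table values and the index label 1 are made up for illustration and are not taken from the FT pages.

```python
import pandas as pd

# Hypothetical stand-in for the title frame: default RangeIndex, so its row label is 0.
df0 = pd.DataFrame(["Example Fund A"], columns=["Fund Name"])

# Hypothetical stand-ins for the transposed tables: after .T the single row
# is labelled with the former column label (assumed to be 1 here), not 0.
df1 = pd.DataFrame({1: ["1.23", "EUR"]}, index=["Price", "Currency"]).T
df2 = pd.DataFrame({1: ["Equity", "2014"]}, index=["Fund type", "Launch year"]).T

# concat(axis=1) aligns on index labels; 0 and 1 differ, so two rows come out
# and the fund name sits on a different row from the table values.
misaligned = pd.concat([df0, df1, df2], axis=1)

# Approach from the answer: join the tables, then insert the name as a column.
aligned = pd.concat([df1, df2], axis=1)
aligned.insert(0, "Fund Name", "Example Fund A")

# Alternative: drop the index labels so every frame shares the same RangeIndex.
reset = pd.concat([d.reset_index(drop=True) for d in (df0, df1, df2)], axis=1)

print(misaligned)
print(aligned)
print(reset)
```

Either approach gives a single row per fund. DataFrame.insert avoids index alignment altogether, while reset_index(drop=True) keeps the original three-frame concat but only makes sense when every frame has exactly one row.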