Python, Jupyter Notebook: downloading an Excel file from a URL

哈西克·迪内什

I am currently trying to access some data from the ABS website:

https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release#data-download

Specifically, Table 5.

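The name of the Excel file changes with every release. I want to keep my data up to date by downloading the file automatically and loading it into a data frame.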

Progress so far:

Thanks to BeautifulSoup, I can get the list of URLs on the page with the following code.

#####Step 1: start by importing all of the necessary packages#####
import requests #requesting URLs
import urllib.request #requesting URLs
import pandas as pd #for simplifying data operations (e.g. creating dataframe objects)
from bs4 import BeautifulSoup #for web-scraping operations

#####Step 2: connect to the URL in question for scraping#####
url = 'https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release' 
response = requests.get(url) #Connect to the URL using the "requests" package
response #if successful then it will return 200

#####Step 3: read in the URL via the "BeautifulSoup" package#####
soup = BeautifulSoup(response.text, 'html.parser') 

#####Step 4: html print#####
for link in soup('a'):
    print(link.get('href'))

##how to get the link to table 5?##
# url = ?

##last step to save into data frame##
ws = pd.read_excel(url, sheet_name='Payroll jobs index-SA4', skiprows=5)
巴维亚·帕里克

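On that page, each XLSX download link sits inside a div with a specific class. You can use the find_all method to return the list of those elements, then take the one at index 1 and read its href: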

import requests
import pandas as pd  # needed for read_excel below
from bs4 import BeautifulSoup

url = 'https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/latest-release'
response = requests.get(url)
response  # shows <Response [200]> in the notebook on success
soup = BeautifulSoup(response.text, 'html.parser')

# find_all returns every div with this class; index 1 selects the second one
url = soup.find_all("div", class_="abs-data-download-right")[1].find("a")['href']
pd.read_excel(url, sheet_name='Payroll jobs index-SA4', skiprows=5, engine='openpyxl')
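Picking the link by position (index 1) depends on the order of the download boxes on the page. As a rough alternative, the link could be chosen by its file name instead. The sketch below assumes that the `_DO005.xlsx` suffix seen in the sample output keeps identifying Table 5 in later releases, which the source does not confirm; `pick_table_href` is a hypothetical helper, not part of any library.

```python
import re
from urllib.parse import urljoin

# The page URL used in the answer above.
PAGE_URL = (
    "https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/"
    "weekly-payroll-jobs-and-wages-australia/latest-release"
)


def pick_table_href(hrefs, table_number, base_url=PAGE_URL):
    """Return the absolute URL of the first link whose file name ends in
    _DO<table_number padded to 3 digits>.xlsx, or None if nothing matches.

    Assumption: the DO00N naming convention is inferred from the sample
    output only and may change between releases.
    """
    pattern = re.compile(rf"_DO{table_number:03d}\.xlsx$", re.IGNORECASE)
    for href in hrefs:
        # link.get('href') returns None for anchors without an href attribute
        if href and pattern.search(href):
            # urljoin leaves absolute URLs unchanged and resolves relative ones
            return urljoin(base_url, href)
    return None
```

With the `soup` object from above, it might be used as `url = pick_table_href([a.get('href') for a in soup.find_all('a')], 5)`, checking for `None` before passing the result to `pd.read_excel`.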

To list all of the URLs:

urls=soup.find_all("div",class_="abs-data-download-right")
for i in urls:
    print(i.find("a")['href'])

Output:

https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/6160055001_DO004.xlsx
https://www.abs.gov.au/statistics/labour/earnings-and-work-hours/weekly-payroll-jobs-and-wages-australia/week-ending-31-july-2021/6160055001_DO005.xlsx
....
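Since the file name and the release folder both change over time, it may also be handy to keep a local copy named after the release. The following is a minimal sketch using only the standard library. It assumes the URL layout shown in the output above (release folder followed by file name); `local_name_for` and `download` are hypothetical helper names, and the `User-Agent` header is a precaution rather than a documented requirement of the site.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import Request, urlopen


def local_name_for(url):
    """Build a local file name from the last two path segments of the URL,
    e.g. '<release-folder>_<file-name>.xlsx'.

    Assumption: the URL path ends with .../<release-folder>/<file-name>.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    release, filename = parts[-2], parts[-1]
    return f"{release}_{filename}"


def download(url, folder="."):
    """Save url into folder and return the local Path.

    The download is skipped when a file with that name already exists.
    """
    target = Path(folder) / local_name_for(url)
    if not target.exists():
        # Assumption: a browser-like User-Agent avoids being rejected by some servers
        request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(request) as response:
            target.write_bytes(response.read())
    return target
```

The returned path can then be passed to pandas in place of the URL, for example `pd.read_excel(download(url), sheet_name='Payroll jobs index-SA4', skiprows=5, engine='openpyxl')`.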
