I am trying to read URLs from a file and then print a response header for each one, using the following script:
    import requests
    file = open('urls.txt','r')
    for url in file:
        print(url)
        r = requests.head(url)
        print(r.headers["Server"])
I keep getting this error message:
    Traceback (most recent call last):
      File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 156, in _new_conn
        conn = connection.create_connection(
      File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 61, in create_connection
        for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
      File "/usr/lib/python3.8/socket.py", line 918, in getaddrinfo
        for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    socket.gaierror: [Errno -2] Name or service not known
Can anyone help? Thanks!
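End-of-line characters:
Every line in your urls.txt file ends with an end-of-line character. Something like this:
    www.google.com\n
    https://stackoverflow.com/\n
    https://github.com/\n
    https://www.google.com/
When you read the file, the \n character is read along with each URL, and that is what causes requests.head(url) to fail.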
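To see the trailing character for yourself, print each line with repr(). This is a minimal sketch; io.StringIO and the two sample URLs are stand-ins for the real urls.txt so that the snippet runs without a file on disk.

```python
import io

# Stand-in for open('urls.txt', 'r'); the contents are illustrative only.
fake_file = io.StringIO("https://github.com/\nhttps://www.google.com/")

# Iterating a file object yields each line with its newline still attached.
lines = [line for line in fake_file]
for line in lines:
    print(repr(line))  # repr() makes the hidden \n visible
```

Only the last line comes back without a newline, because the sample text does not end with one.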
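Removing the end-of-line characters:
This is a simple fix. To remove the end-of-line character you can use the Python string method .strip(), which removes the newline as well as any leading or trailing whitespace.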
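A minimal sketch of the .strip() approach, again using io.StringIO with made-up contents in place of urls.txt. Skipping blank lines is an extra precaution added here, not something the original question requires.

```python
import io

# Stand-in for the real file: note the leading spaces, the blank line,
# and the trailing newline, all of which .strip() has to deal with.
fake_file = io.StringIO("  https://github.com/\n\nhttps://www.google.com/\n")

# Strip each line and keep only the ones that are not empty afterwards.
urls = [line.strip() for line in fake_file if line.strip()]
print(urls)
```

Each entry in urls can then be passed to requests.head() without the stray newline.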
Another option is to use splitlines(). It takes care of the EOL characters for you. Sample code:
    temp = open(filename,'r').read().splitlines()
Side note: always use a with statement when reading or writing files, because it automatically closes the file for you as soon as you leave its scope.
    import requests
    with open("urls.txt", "r") as url_file:
        temp = url_file.read().splitlines()
    for url in temp:
        print(url)
        r = requests.head(url)
        print(r.headers["Server"])
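One more thing to watch for: the first entry in the sample file, www.google.com, has no scheme, and requests rejects such URLs with a MissingSchema error even after the newline is gone. The sketch below shows one possible way to tidy the list before making any requests. The helper name clean_urls and the choice of https as the default scheme are assumptions for illustration, not part of the original answer.

```python
def clean_urls(lines, default_scheme="https"):
    """Strip each line, skip blanks, and prefix a scheme when none is present.

    Hypothetical helper; default_scheme="https" is an assumption.
    """
    cleaned = []
    for line in lines:
        url = line.strip()          # drop the newline and surrounding whitespace
        if not url:                 # ignore empty lines
            continue
        if "://" not in url:        # e.g. "www.google.com"
            url = f"{default_scheme}://{url}"
        cleaned.append(url)
    return cleaned


# Sample input mirroring what iterating over urls.txt would yield.
sample = ["www.google.com\n", "https://stackoverflow.com/\n", "\n", "https://github.com/"]
urls = clean_urls(sample)
print(urls)
```

With a cleaned list you can loop as before and call requests.head(url). Using r.headers.get("Server") instead of r.headers["Server"] avoids a KeyError when a server does not send that header.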