I am trying to extract a URL (link) from a web page. I use "find_element_by_css_selector" to get the item I want, and that item contains a URL. How can I extract this URL?
I tried:
prod_item = browser.find_elements_by_css_selector('div.col-lg-2')
print(prod_item[0].get_attribute('href'))
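But I get None as the output. I would like to keep using a CSS selector, because the page has many similar items and 'div.col-lg-2' is the class they all share. How can I fix this and get the link?
Here is the complete code now: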
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException
url = 'https://auctionmaxx.com/Browse?page=0'
browser = webdriver.Firefox()
browser.get(url)
prod_item = WebDriverWait(browser, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.col-lg-2[href]")))
print(prod_item[4].get_attribute('href'))
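To print the value of the href attribute, you have to induce WebDriverWait for visibility_of_all_elements_located(). The href sits on the <a> element nested inside the div rather than on the div itself, which is why calling get_attribute('href') on the div returns None. You can use one of the following locator strategies:
Using CSS_SELECTOR: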
browser.get("https://auctionmaxx.com/Browse?page=0")
prod_item = WebDriverWait(browser, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.col-lg-2>div a")))
print(prod_item[0].get_attribute('href'))
Using CSS_SELECTOR in a single line:
browser.get("https://auctionmaxx.com/Browse?page=0")
print(WebDriverWait(browser, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div.col-lg-2>div a")))[0].get_attribute('href'))
Console output:
https://auctionmaxx.com/Listing/Details/321939965/NEW-PUREX-LAUNDRY-DETERGENT-924L
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
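Since the question mentions that the page has many similar items, the same locator could also be used to gather every link rather than only the one at index 0. The following is a minimal sketch, not part of the original answer: the helper names collect_hrefs and listing_links are hypothetical, and it assumes the "div.col-lg-2>div a" selector from above still matches the page.

```python
def collect_hrefs(elements):
    """Return the non-empty href values from WebElement-like objects."""
    hrefs = []
    for element in elements:
        # get_attribute returns None when the element has no such attribute
        href = element.get_attribute("href")
        if href:
            hrefs.append(href)
    return hrefs


def listing_links(browser, timeout=20):
    """Wait for the anchors inside each div.col-lg-2 and return their hrefs.

    Hypothetical helper: `browser` is an already-opened Selenium WebDriver
    that has loaded the listing page.
    """
    # Imported here so collect_hrefs stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    anchors = WebDriverWait(browser, timeout).until(
        EC.visibility_of_all_elements_located(
            (By.CSS_SELECTOR, "div.col-lg-2>div a")
        )
    )
    return collect_hrefs(anchors)
```

After browser.get(...) has loaded the page, calling listing_links(browser) should return a list of URLs that can be printed in a loop. If no matching element becomes visible within the timeout, WebDriverWait raises TimeoutException, which the caller may want to catch.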