I've seen similar questions asked before, but none of the solutions worked for me
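I'm looking for a specific class in an HTML page, but I always get None back. I've seen several posts here describing the same problem, but none of the solutions worked for me. Here is my attempt. I'm looking for the player tag that contains the player's name, e.g. "Chase Young".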
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import requests
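url = "https://www.nfl.com/draft/tracker/prospects/allPositions?college=allColleges&page=1&status=ALL&year=2020"
# url is only a string, so the page has to be fetched before it can be parsed
response = requests.get(url)
soup = BeautifulSoup(response.content, 'lxml')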
match = soup.find('div', class_ = 'css-gu7inl')
print(match)
# Prints None
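I tried another way of finding the match, but it still returns None:
match = soup.find("div", {"class": "css-gu7inl"})  # match is still None
It seems the fetched HTML doesn't contain the whole page, so I tried Selenium, which I had seen recommended in similar posts, but I still got nothing: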
driver = webdriver.Chrome("chromedriver")
driver.get(url)
soup = BeautifulSoup(driver.page_source, 'lxml')
items=soup.select(".css-gu7inl")
print(items) # Empty list
What am I doing wrong here?
The data is rendered by JavaScript, so use WebDriverWait() together with visibility_of_all_elements_located() to wait until the elements are visible.
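One way to check this kind of diagnosis without starting a browser is to test whether the class name appears anywhere in the raw HTML. The sketch below is only an illustration: `ClassFinder`, `has_class`, and both HTML strings are hypothetical stand-ins (not the real nfl.com markup), and it uses the standard library's `html.parser` rather than BeautifulSoup so that it runs without extra dependencies.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any start tag carries the given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value can be None
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.found = True


def has_class(html, class_name):
    """Return True if any element in html has class_name among its classes."""
    finder = ClassFinder(class_name)
    finder.feed(html)
    return finder.found


# Hypothetical markup: what a server might send before any script has run...
static_html = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
# ...and what the DOM might look like once the script has filled it in.
rendered_html = '<div id="root"><div class="css-gu7inl"><a class="css-1fwlqa">chase young</a></div></div>'

print(has_class(static_html, "css-gu7inl"))    # False
print(has_class(rendered_html, "css-gu7inl"))  # True
```

If the check returns False for the raw response but the element is visible in the browser, the element is most likely added later by JavaScript, and a browser-driven approach such as the Selenium code that follows is needed.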
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
url='https://www.nfl.com/draft/tracker/prospects/allPositions?college=allColleges&page=1&status=ALL&year=2020'
driver = webdriver.Chrome()
driver.get(url)
WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,'.css-gu7inl')))
soup = BeautifulSoup(driver.page_source, 'lxml')
items=soup.select(".css-gu7inl")
Players=[item.select_one('a.css-1fwlqa').text for item in items]
print(Players)
Output:
['chase young', 'jeff okudah', 'derrick brown', 'isaiah simmons', 'joe burrow', "k'lavon chaisson", 'jedrick wills', 'tua tagovailoa', 'ceedee lamb', 'jerry jeudy', "d'andre swift", 'c.j. henderson', 'mekhi becton', 'mekhi becton', 'patrick queen', 'henry ruggs iii', 'henry ruggs iii', 'javon kinlaw', 'laviska shenault jr.', 'yetur gross-matos']
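The output above contains a couple of names twice ('mekhi becton' and 'henry ruggs iii'). The answer doesn't say why; one guess, and it is only a guess, is that the page renders more than one matching element for some rows. If a list without repeats is wanted, a minimal sketch of order-preserving de-duplication could look like this (the `players` list is a shortened copy of the output, and `unique_players` is a name introduced here for illustration):

```python
# Shortened copy of the scraped output, including the repeated names
players = [
    'c.j. henderson', 'mekhi becton', 'mekhi becton', 'patrick queen',
    'henry ruggs iii', 'henry ruggs iii', 'javon kinlaw',
]

# dict keys are unique and keep insertion order (Python 3.7+),
# so this drops repeats while preserving the original ranking order
unique_players = list(dict.fromkeys(players))

print(unique_players)
# ['c.j. henderson', 'mekhi becton', 'patrick queen', 'henry ruggs iii', 'javon kinlaw']
```

A plain `set(players)` would also remove the repeats, but it would lose the order in which the prospects appear on the page.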