Unable to get element text using Selenium with Chrome

潘万

I'm trying to scrape medical terms from Merriam-Webster's Medical Dictionary using Python, with Chrome as the Selenium WebDriver. This is what I have so far:

    from os import path
    from selenium import webdriver

    # Adding an ad-blocker to Chrome to speed up page load times
    options = webdriver.ChromeOptions()
    options.add_extension(path.abspath("ublock-origin.crx"))

    # Declaring the Selenium webdriver
    driver = webdriver.Chrome(chrome_options = options)

    # Fetching the "A" terms as a test set
    driver.get("https://www.merriam-webster.com/browse/medical/a")

    scraped_words = []  # The list that will hold each word
    page_num = 1
    while page_num < 55:  # There are 54 pages of "A" terms
        try:
            for i in range(4):  # There are 3 columns per page of words
                column = "/html/body/div/div/div[5]/div[2]/div[1]/div/div[3]/ul/li[" + str(i) + "]/a"
                number_of_words = len(driver.find_elements_by_xpath(column))
                for j in range(number_of_words):
                    word = driver.find_elements_by_xpath(column + "[" + str(j) + "]")
                    scraped_words.append(word)
            driver.find_element_by_class_name("fa-angle-right").click()  # Next page
            page_num += 1  # Increment page number to keep track of current page
        except:
            driver.close()

    # Write out words to a file
    with open("medical_terms.dict", "w") as text_file:
        for i in range(len(scraped_words)):
            text_file.write(str(scraped_words[i]))
            text_file.write("\n")

    driver.close()

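The code above fetches all of the items, since len(scraped_words) returns the expected count. However, because I didn't specify that I want each element's text, I get element identifiers (I think?) rather than the text. If I instead use word = driver.find_elements_by_xpath(column + "[" + str(j) + "]").text to ask for the element's text, I get the following error: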

Traceback (most recent call last):
  File "mw_download.py", line 20, in <module>
    number_of_words = len(driver.find_elements_by_xpath(column))
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 325, in find_elements_by_xpath
    return self.find_elements(by=By.XPATH, value=xpath)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 817, in find_elements
    'value': value})['value']
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: no such session
  (Driver info: chromedriver=2.31.488774 (7e15618d1bf16df8bf0ecf2914ed1964a387ba0b),platform=Mac OS X 10.12.6 x86_64)


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "mw_download.py", line 27, in <module>
    driver.close()
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 541, in close
    self.execute(Command.CLOSE)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: no such session
  (Driver info: chromedriver=2.31.488774 (7e15618d1bf16df8bf0ecf2914ed1964a387ba0b),platform=Mac OS X 10.12.6 x86_64)

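What strikes me as odd is that the only code I changed between the two runs is on line 22, yet the error message points to line 20.

Any help understanding what is happening here, and what I can do to fix it, would be greatly appreciated! :+)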

ViníciusAguiar

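You just need to build a list of words by accessing the text of each element. Change this:

    word = driver.find_elements_by_xpath(column + "[" + str(j) + "]")

to:

    word = [i.text for i in driver.find_elements_by_xpath(column + "[" + str(j) + "]")]

Because .find_elements_by_xpath always returns a list, accessing .text on the result directly will not work; .text has to be read from each element in that list.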
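The difference between reading .text from the list and reading it from each element can be sketched without a browser. This is only an illustration: FakeElement and texts_of are hypothetical names that do not come from Selenium, and FakeElement merely mimics the .text attribute of a real WebElement.

```python
class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (assumption: only .text matters here)."""

    def __init__(self, text):
        self.text = text


def texts_of(elements):
    """Return the text of every element in a list, one string per element."""
    return [element.text for element in elements]


# find_elements_by_xpath returns a list; it is modelled here with fake elements.
found = [FakeElement("abdomen"), FakeElement("abduct")]

# Reading .text from the list itself fails, which is what happens in the question.
try:
    found.text
except AttributeError as error:
    list_error = str(error)

# Reading .text from each element works.
words = texts_of(found)

scraped_words = []
scraped_words.extend(words)  # extend keeps a flat list of strings; append would nest a list
print(scraped_words)
```

This also accounts for the traceback pointing at line 20. The AttributeError raised by .text is swallowed by the bare except, which calls driver.close(), but the while loop keeps running because page_num was never incremented. On the next pass the first call to find_elements_by_xpath hits a closed session and raises "no such session", and the second driver.close() in the except block then fails the same way. Catching specific exceptions and leaving the loop on failure (for example with break) would surface the original error instead.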


Edited on 2020-11-8
