Parsing an integer from a "td" tag using BeautifulSoup

Posted by mufit in Dev

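I have read a lot of articles about BeautifulSoup, but I still don't understand it. I need an example.

I want to get the value of "PD/DD", which is 1,9.

This is the source: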

<div class="table vertical">
    <table>
        <tbody>
            <tr>
                <th>F/K</th>
                <td>A/D</td>
            </tr>
            <tr>
                <th>FD/FAVÖK</th>
                <td>19,7</td>
            </tr>
            <tr>
    HERE-->    <th>PD/DD</th> 
    HERE-->    <td>1,9</td> 
            </tr>
            <tr>
                <th>FD/Satışlar</th>
                <td>5,1</td>
            </tr>
            <tr>
                <th>Yabancı Oranı (%)</th>
                <td>2,43</td>
            </tr>
            <tr>
                <th>Ort Hacim (mn$) 3A/12A</th>
                <td>1,3 / 1,6</td>
            </tr>

My code is:

import requests
from bs4 import BeautifulSoup

a = "afyon"

url_bank = "https://www.isyatirim.com.tr/tr-tr/analiz/hisse/sayfalar/sirket-karti.aspx?hisse={}".format(a.upper())

response_bank = requests.get(url_bank)
html_content_bank = response_bank.content
soup_bank = BeautifulSoup(html_content_bank, "html.parser")
b = soup_bank.find_all("div", {"class": "table vertical"})

for i in b:
    children = i.findChildren("td", recursive=True)

    for child in children:
        l = []  # a new list is created for every cell
        l_text = child.text
        l.append(l_text)
        print(l)

When I run this code, it prints a series of lists with one element each:

['Afyon Çimento                 ']
['11.04.1990']
['Çimento üretip satmak ve ana faaliyet konusu ile ilgili her türlü yan sanayi kuruluşlarına iştirak etmek.']
['(0216)5547000']
['(0216)6511415']
['Kısıklı Cad. Sarkusyan-Ak İş Merkezi S Blok kat:2 34662 Altunizade - Üsküdar / İstanbul']
['A/D']
['19,7']
['1,9']
['5,1']
['2,43']
['1,3 / 1,6']
['407,0 mnTL']
['395,0 mnTL']
['-']

How can I get only the PD/DD value? I am expecting something like this:

PD/DD : 1,9

哈尔

My preference:

With bs4 4.7.1, you can use :contains to target the th by its text value and then take the adjacent sibling td.

import requests
from bs4 import BeautifulSoup

a="afyon"
url_bank = "https://www.isyatirim.com.tr/tr-tr/analiz/hisse/sayfalar/sirket-karti.aspx?hisse={}".format(a.upper())
response_bank = requests.get(url_bank)
html_content_bank = response_bank.content
soup_bank = BeautifulSoup(html_content_bank, "html.parser")
print(soup_bank.select_one('th:contains("PD/DD") + td').text)
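The same idea can be tried offline. The following is a minimal sketch, assuming the table looks like the snippet in the question (the inline `html` string is a shortened copy of it, not the live page). It also assumes Soup Sieve 2.1 or later, where `:-soup-contains` is the preferred spelling and `:contains` is deprecated. The `None` check is an addition for the case where the label is missing.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question (assumed structure).
html = """
<div class="table vertical">
    <table>
        <tbody>
            <tr><th>F/K</th><td>A/D</td></tr>
            <tr><th>FD/FAVÖK</th><td>19,7</td></tr>
            <tr><th>PD/DD</th><td>1,9</td></tr>
        </tbody>
    </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Find the th whose text contains "PD/DD", then take the td right after it.
cell = soup.select_one('th:-soup-contains("PD/DD") + td')

# select_one returns None when nothing matches, so guard before reading text.
value = cell.get_text(strip=True) if cell is not None else None
print("PD/DD : {}".format(value))
```

Matching on the label text tends to survive row reordering better than a positional selector, although it still depends on the label staying unchanged.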

You can also use :nth-of-type for positional matching (3rd row, 1st column):

soup_bank.select_one('.vertical table:not([class]) tr:nth-of-type(3) td:nth-of-type(1)').text

Since select_one returns only the first match, we can shorten this to:

soup_bank.select_one('.vertical table:not([class]) tr:nth-of-type(3) td').text
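As a rough illustration of the positional approach, here is a sketch run against an inline copy of the question's table rather than the live page. The `html` string is an assumed, shortened structure; on the real site the row order may differ or change over time.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question (assumed structure).
html = """
<div class="table vertical">
    <table>
        <tbody>
            <tr><th>F/K</th><td>A/D</td></tr>
            <tr><th>FD/FAVÖK</th><td>19,7</td></tr>
            <tr><th>PD/DD</th><td>1,9</td></tr>
            <tr><th>FD/Satışlar</th><td>5,1</td></tr>
        </tbody>
    </table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Third row of the class-less table inside the ".vertical" container.
row_selector = ".vertical table:not([class]) tr:nth-of-type(3)"

# select_one returns the first match, so "td" alone picks the first cell.
value = soup.select_one(row_selector + " td").text
label = soup.select_one(row_selector + " th").text

print("{} : {}".format(label, value))
```

This relies on PD/DD being the third row, so it would silently return a different ratio if the site inserted or removed a row above it.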

If the id is static:

soup_bank.select_one('#ctl00_ctl45_g_76ae4504_9743_4791_98df_dce2ca95cc0d tr:nth-of-type(3) td').text

You already know the label PD/DD, but it can also be retrieved with:

soup_bank.select_one('.vertical table:not([class]) tr:nth-of-type(3) th').text

If these ids remain static, at least for a while, then:

soup_bank.select_one('#ctl00_ctl45_g_76ae4504_9743_4791_98df_dce2ca95cc0d tr:nth-of-type(3) th').text
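To get output in the `PD/DD : 1,9` form the question asks for, and a numeric value as the title suggests, one option is to collect every th/td pair into a dictionary. This is only a sketch: the inline `html` string is a shortened copy of the question's table, `to_float` is a hypothetical helper, and it assumes the comma is a decimal separator, as the values in the question suggest.

```python
from bs4 import BeautifulSoup

# Shortened copy of the table from the question (assumed structure).
html = """
<div class="table vertical">
    <table>
        <tbody>
            <tr><th>F/K</th><td>A/D</td></tr>
            <tr><th>FD/FAVÖK</th><td>19,7</td></tr>
            <tr><th>PD/DD</th><td>1,9</td></tr>
        </tbody>
    </table>
</div>
"""


def to_float(text):
    """Hypothetical helper: read '1,9' as 1.9, return None if not numeric."""
    try:
        return float(text.replace(",", "."))
    except ValueError:
        return None


soup = BeautifulSoup(html, "html.parser")

# Map each row's th label to its td text.
ratios = {}
for row in soup.select("div.vertical tr"):
    th = row.find("th")
    td = row.find("td")
    if th is not None and td is not None:
        ratios[th.get_text(strip=True)] = td.get_text(strip=True)

print("PD/DD : {}".format(ratios["PD/DD"]))

# Numeric form of the same value, assuming a decimal comma.
pd_dd = to_float(ratios["PD/DD"])
```

Cells such as `A/D` are not numbers, so `to_float` returns `None` for them instead of raising.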


Edited on 2021-07-22
