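How to retrieve information from an XML tspan tag

Username

I am currently using BeautifulSoup to parse XML, but I can't work out how to get information out of a tspan tag. The XML I am parsing looks like this: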

<text
           transform="matrix(0,-1,-1,0,7931,3626)"
           style="font-variant:normal;font-weight:normal;font-size:92.3259964px;font-family:Arial;-inkscape-font-specification:ArialMT;writing-mode:lr-tb;fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:none"
           id="text60264"><tspan
             x="0 61.581444 123.16289 184.74432 251.4037 323.23334 384.81476 410.48138 436.14801 497.72946 559.31091 625.97028 692.62964 754.21112 805.54437 831.211 918.3667 979.94818 1005.6148 1077.4445 1144.1038 1200.515 1226.1816 1256.9261 1318.5076 1390.3373 1421.0818 1446.7484 1472.415 1526.3334 1587.9149 1649.4963 1716.1556 1782.8151 1844.3965 1895.7297 1921.3964 2008.5521 2070.1335 2095.8003 2167.6299 2234.2893 2290.7004"
             y="0"
             sodipodi:role="line"
             id="tspan60262">APPROX. BARREL WEIGHT (KG): &lt;BARREL WEIGHT&gt;</tspan></text>

I can get the text from the text tag, but I am trying to get the x="0 61.581..." attribute so that I can change it to just x="0". So far, my code only gets the tspan XML tags:

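from bs4 import BeautifulSoup

# Read the SVG file; the context manager closes it afterwards
with open('1c E37 Face Electric Operation No No.svg', 'r') as infile:
    contents = infile.read()

soup = BeautifulSoup(contents, 'lxml')

# This prints each whole <tspan> tag, not its x attribute
items = soup.find_all('tspan')
for item in items:
    print(item)

QHarr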

You can also use a CSS selector. By adding the x attribute to the selector, you only get the elements that have that attribute:

from bs4 import BeautifulSoup as bs

s = '''
<text
transform="matrix(0,-1,-1,0,7931,3626)"
style="font-variant:normal;font-weight:normal;font-size:92.3259964px;font-family:Arial;-inkscape-font-specification:ArialMT;writing-mode:lr-tb;fill:#000000;fill-opacity:1;fill-rule:nonzero;stroke:none"
id="text60264"><tspan
 x="0 61.581444 123.16289 184.74432 251.4037 323.23334 384.81476 410.48138 436.14801 497.72946 559.31091 625.97028 692.62964 754.21112 805.54437 831.211 918.3667 979.94818 1005.6148 1077.4445 1144.1038 1200.515 1226.1816 1256.9261 1318.5076 1390.3373 1421.0818 1446.7484 1472.415 1526.3334 1587.9149 1649.4963 1716.1556 1782.8151 1844.3965 1895.7297 1921.3964 2008.5521 2070.1335 2095.8003 2167.6299 2234.2893 2290.7004"
 y="0"
 sodipodi:role="line"
 id="tspan60262">APPROX. BARREL WEIGHT (KG): &lt;BARREL WEIGHT&gt;</tspan></text>
'''

soup = bs(s, 'lxml')
print([i['x'] for i in soup.select('tspan[x]')])
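The question also asks how to change the attribute to just x="0". A minimal sketch of one way to do that follows, assuming a shortened stand-in for the SVG fragment and Python's built-in 'html.parser' so that it runs without lxml installed (the 'lxml' parser used above should behave the same way for this step). The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the fragment in the question (assumed sample data).
svg = '''
<text id="text60264"><tspan
 x="0 61.581444 123.16289"
 y="0"
 id="tspan60262">APPROX. BARREL WEIGHT (KG)</tspan></text>
'''

# Assumption: 'html.parser' is used here only to avoid the lxml dependency.
soup = BeautifulSoup(svg, 'html.parser')

# Tag attributes behave like dictionary entries, so assigning to
# tspan['x'] overwrites the existing value.
for tspan in soup.select('tspan[x]'):
    tspan['x'] = '0'

new_x_values = [tspan['x'] for tspan in soup.select('tspan[x]')]
print(new_x_values)  # ['0']

# str(soup) gives the modified markup, which could then be written to a file.
modified_markup = str(soup)
```

One caveat if the result is written back to a full SVG file: the Beautiful Soup documentation notes that its HTML parsers convert tag and attribute names to lowercase, so for mixed-case SVG names such as viewBox it is probably safer to parse with the 'xml' feature (which requires lxml) instead.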
