How can I extract the ID and LABEL (10870, 7th Phase JP Nagar) from the HTML below?
<input id="filter_data" type="hidden" value="{"Locality"
:{"Top_Results_Array"
:{"0"
:{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0},"1"
:{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0},"2"
:{"ID":"2265","LABEL":"Koramangala","SELECTED":"","COUNT":118.0},"3"
:{"ID":"11646","LABEL":"BTM 2nd Stage","SELECTED":"","COUNT":118.0}},"More_Locality_Array"
:{"0"
:{"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0},"1"
:{"ID":"5467","LABEL":"Hulimavu","SELECTED":"","COUNT":95.0},"2"
:{"ID":"2261","LABEL":"HSR Layout","SELECTED":"","COUNT":94.0},"3"
:{"ID":"2293","LABEL":"Jigani","SELECTED":"","COUNT":91.0},"4"
:{"ID":"2249","LABEL":"Bannerghatta Road","SELECTED":"","COUNT":83.0},"5"
:{"ID":"2264","LABEL":"Kanakpura Road","SELECTED":"","COUNT":83.0},"6":
I tried the following Python code, but it only fetches the value of the input (id=filter_data):
for loc in soup.find_all('input', id='filter_data'):
    print(loc.get('value'))
This is the output I get:
{"Locality":{"Top_Results_Array":{
"0":{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0}
,"1":{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0}
,"2":{"ID":"2265","LABEL":"Koramangala","SELECTED":"","COUNT":118.0}
,"3":{"ID":"11646","LABEL":"BTM 2nd Stage","SELECTED":"","COUNT":118.0}}
,"More_Locality_Array":{"0":{
"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0}
,"1":{"ID":"5467","LABEL":"Hulimavu","SELECTED":"","COUNT":95.0}
,"2":{"ID":"2261","LABEL":"HSR Layout","SELECTED":"","COUNT":94.0}
,"3":{"ID":"2293","LABEL":"Jigani","SELECTED":"","COUNT":91.0}
,"4":{"ID":"2249","LABEL":"Bannerghatta Road","SELECTED":"","COUNT":83.0}
,"5":{"ID":"2264","LABEL":"Kanakpura Road","SELECTED":"","COUNT":83.0}
But I need the output below:
10870 7th Phase JP Nagar
2259 Electronic City
2265 Koramangala
11646 BTM 2nd Stage
2277 Bellandur
5467 Hulimavu
2261 HSR Layout
. .
Can you help me solve this?
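One approach I can suggest is to parse the value as JSON and extract the information you need. The catch is that the value comes back as a Unicode string. Try the code below; once you have result, you can pull the data out however you like. The parsed data is made of ordinary lists, dicts, and so on, so you can read the values you need from them.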
import json
# soup is the BeautifulSoup object from the question
exp = soup.find_all('input', attrs={"id": "filter_data"})
abc = exp[0].get('value')  # len(exp) == 1
# On Python 3 the value is already a str, so no .decode('utf-8') step is needed
result = json.loads(abc)
result
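To see the part of the result that holds the localities, check:
print(result["Locality"])  # the locality arrays sit under the "Locality" key
Look through the dictionary and decide which pieces you want.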
dict(result)
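To make "look through the dictionary" a bit more concrete, here is a minimal sketch that lists the nested keys of the parsed JSON. The sample string is a shortened stand-in for the value returned by loc.get('value'), and describe is a hypothetical helper name, not part of json or BeautifulSoup.

```python
import json

# Shortened stand-in for the string returned by loc.get('value') (assumption).
sample = (
    '{"Locality":{"Top_Results_Array":{'
    '"0":{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0}},'
    '"More_Locality_Array":{'
    '"0":{"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0}}}}'
)


def describe(node, depth=0, max_depth=2):
    """Return one indented line per dict key, down to max_depth levels (hypothetical helper)."""
    lines = []
    if isinstance(node, dict) and depth <= max_depth:
        for key, child in node.items():
            lines.append("  " * depth + key)
            lines.extend(describe(child, depth + 1, max_depth))
    return lines


structure = describe(json.loads(sample))
print("\n".join(structure))
```

With the real page, raising max_depth shows the per-entry keys (ID, LABEL, SELECTED, COUNT) as well.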
Play around with the JSON and you will get what you want. I hope this helps.
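For the "ID LABEL" output asked for in the question, something along these lines might work. It is only a sketch: sample is a shortened stand-in for loc.get('value'), iter_localities is a hypothetical helper name, and it assumes every value under "Locality" is a dict keyed by "0", "1", ... whose entries carry "ID" and "LABEL", as in the output shown in the question.

```python
import json

# Shortened stand-in for the string returned by loc.get('value') (assumption).
sample = (
    '{"Locality":{"Top_Results_Array":{'
    '"0":{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0},'
    '"1":{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0}},'
    '"More_Locality_Array":{'
    '"0":{"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0}}}}'
)


def iter_localities(value):
    """Yield (ID, LABEL) pairs from every array under "Locality" (hypothetical helper)."""
    data = json.loads(value)
    for group in data["Locality"].values():
        # Keys are the strings "0", "1", ...; sort them numerically to keep page order.
        for key in sorted(group, key=int):
            entry = group[key]
            yield entry["ID"], entry["LABEL"]


pairs = list(iter_localities(sample))
for loc_id, label in pairs:
    print(loc_id, label)
```

In the real script you would pass loc.get('value') instead of sample. If the page stores anything other than these arrays under "Locality", the loop would need a check before indexing.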