Getting data from complex HTML tags using Python BeautifulSoup

edyvedy13

I have the following HTML data:

<div class="display-info">
    <div class="record-icon pubtype"><span class="pubtype-icon pt-academicJournal" title="Academic Journal"> </span>
        <p class="caption">Academic Journal</p>
    </div>By: Stein, Mark. <strong>Organization Studies</strong>. 2007, Vol. 28 Issue 8, p1223-1241. 19p. Abstract: While the literature on front-line service work utilizes a variety of productive images, I argue that these images do not capture certain of the more problematic experiences of front-line service employees. Drawing on words used by these workers themselves, and using concepts from psychoanalysis and its application to organizational dynamics, I therefore propose a new image, that of toxicity. I argue that — especially when under severe pressure from customers — front-line workers may have the unconscious fantasy that they have been polluted by toxic substances. The unconscious experience of the entry of toxic material is likely to result in further <strong>contagion</strong> of relationships such as those among employees and between employees and customers. This may also result in workers retaliating against customers by exacting revenge on them. A downward spiralling of relationships may follow, with the result that large parts of the work environment are experienced as toxic. The implications for theory are explored. In conclusion, I argue that the theme of toxicity helps us connect the employee-customer interface with a deep reservoir of primordial human experience that links the body with emotions. [ABSTRACT FROM AUTHOR] DOI: 10.1177/0170840607079527. (<cite>AN: 26198405</cite>)
    <p class="subjectResults"><strong>Subjects:
    </strong>Industrial relations; Personnel management; Customer relations; Corporate image; Public relations; Consumer behavior; Sales personnel; Administration of Human Resource Programs (except Education, Public Health, and Veterans' Affairs Programs); Human Resources Consulting Services; Public Relations Agencies; Psychoanalysis; Social interaction</p><span class="record-additional"><span class="item add-to-folder"><a class="folder-toggle item-not-in-folder" data-folder='{"db":"bth","uiTerm":"26198405","uiTag":"AN","ebookFormat":"false","abookFormat":"false","title":"Toxicity and the Unconscious Experience of the Body at the Employee--Customer Interface. ","resultID":"50","doid":"","segid":""}' data-isaddtofolder="true" data-itemid="50" href="#" id="add_50" name="addToFolder" title="To print, e-mail, or save multiple items">Add to folder</a> <a class="folder-toggle item-in-folder" data-folder='{"db":"bth","uiTerm":"26198405","uiTag":"AN","ebookFormat":"false","abookFormat":"false","title":"Toxicity and the Unconscious Experience of the Body at the Employee--Customer Interface. ","resultID":"50","doid":"","segid":""}' data-isaddtofolder="false" data-itemid="50" href="#" id="added_50" style="display: none;" title="Remove result from folder">Remove from folder</a></span><span class="result-list-cite-ref-label"><a data-title="Cited References" href="javascript:__doLinkPostBack('','sl~~ref||su~~50','_top');" id="references50" title="Cited References">Cited References: (92) </a></span><span class="result-list-cite-link"><a data-title="Times Cited in this Database" href="javascript:__doLinkPostBack('','sl~~cit||su~~50','_top');" id="citations50" title="Times Cited in this Database">Times Cited in this Database: (20) </a></span> </span>
    <div class="record-formats-wrapper externalLinks"><span><span class="custom-link"><a class="ils-link" href="/ehost/SmartLink/OpenIlsLink?sid=42487fcc-c655-469f-b8ed-2802260b3983@sessionmgr102&amp;vid=15&amp;sl=smartlink&amp;st=ilslink_new&amp;sv=sdbn%253Dbth%2526pbt%253DAcademic%2520Journal%2526issn%253D01708406%2526ttl%253DOrganization%252520Studies%2526stp%253DC%2526asi%253DY%2526ldc%253DCheck%252520full%252520text%252520availability%2526lna%253DFull%252520Text%252520Finder%252520%25252D%252520INSEAD%2526lca%253DfullText%2526lo%255Fan%253D26198405&amp;su=http%3A%2F%2Fresolver%2Eebscohost%2Ecom%2Fopenurl%3Fcustid%3Ds8362180%26group%3Dmain%26authtype%3Dip%2Cuid%26sid%3DEBSCO%3Abth%26genre%3Darticle%26issn%3D01708406%26ISBN%3D%26volume%3D28%26issue%3D8%26date%3D20070801%26spage%3D1223%26pages%3D1223%2D1241%26title%3DOrganization%20Studies%26atitle%3DToxicity%2520and%2520the%2520Unconscious%2520Experience%2520of%2520the%2520Body%2520at%2520the%2520Employee%2D%2DCustomer%2520Interface%2E%26aulast%3DStein%252C%2520Mark%26id%3DDOI%3A10%2E1177%2F0170840607079527" id="linkILSLink50_1" onblur="self.status='';return true" onfocus="self.status='check full text availability.';return true" onmouseout="self.status='';return true" onmouseover="self.status='check full text availability.';return true" target="_new" title="check full text availability."><img align="middle" alt="check full text availability." border="0" class="icon-image" data-defer-image="https://s3.amazonaws.com/libapps/customers/2023/images/logo-INSEAD_blanc-sur-vert_250.jpg" id="imgILSLink50_1" src="https://if.ebsco-content.com/interfacefiles/17.232.0.2749/blank.gif"/>Check full text availability</a></span></span>
    </div>
</div>

I need to get the author line and the abstract:

By: Stein, Mark.

Abstract: While the literature on front-line service work utilizes a variety of productive images, I argue that these images do not capture certain of the more problematic experiences of front-line service employees. Drawing on words used by these workers themselves, and using concepts from psychoanalysis and its application to organizational dynamics, I therefore propose a new image, that of toxicity. I argue that — especially when under severe pressure from customers — front-line workers may have the unconscious fantasy that they have been polluted by toxic substances. The unconscious experience of the entry of toxic material is likely to result in further <strong>contagion</strong> of relationships such as those among employees and between employees and customers. This may also result in workers retaliating against customers by exacting revenge on them. A downward spiralling of relationships may follow, with the result that large parts of the work environment are experienced as toxic. The implications for theory are explored. In conclusion, I argue that the theme of toxicity helps us connect the employee-customer interface with a deep reservoir of primordial human experience that links the body with emotions.

With soup.select(".display-info")[0].text I get:

 Academic JournalBy: Stein, Mark. Organization Studies. 2007, Vol. 28 Issue 8, p1223-1241. 19p. Abstract: While the literature on front-line service work utilizes a variety of productive images, I argue that these images do not capture certain of the more problematic experiences of front-line service employees. Drawing on words used by these workers themselves, and using concepts from psychoanalysis and its application to organizational dynamics, I therefore propose a new image, that of toxicity. I argue that — especially when under severe pressure from customers — front-line workers may have the unconscious fantasy that they have been polluted by toxic substances. The unconscious experience of the entry of toxic material is likely to result in further contagion of relationships such as those among employees and between employees and customers. This may also result in workers retaliating against customers by exacting revenge on them. A downward spiralling of relationships may follow, with the result that large parts of the work environment are experienced as toxic. The implications for theory are explored. In conclusion, I argue that the theme of toxicity helps us connect the employee-customer interface with a deep reservoir of primordial human experience that links the body with emotions. [ABSTRACT FROM AUTHOR] DOI: 10.1177/0170840607079527. (AN: 26198405)Subjects:
    Industrial relations; Personnel management; Customer relations; Corporate image; Public relations; Consumer behavior; Sales personnel; Administration of Human Resource Programs (except Education, Public Health, and Veterans' Affairs Programs); Human Resources Consulting Services; Public Relations Agencies; Psychoanalysis; Social interactionAdd to folder Remove from folderCited References: (92) Times Cited in this Database: (20)  Check full text availability 
Andrej Kesely

For this task it is better to use re and bs4 together.
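
A minimal sketch of why the combination helps, run on a shortened, hypothetical fragment modeled on the question's markup (the fragment and the names html, node, joined and lines are assumptions for illustration). Passing a separator to get_text() puts each text node on its own line, which gives a line-oriented regex something to anchor on:

```python
from bs4 import BeautifulSoup

# Shortened, hypothetical fragment modeled on the question's markup.
html = (
    '<div class="display-info">'
    '<div class="record-icon"><p class="caption">Academic Journal</p></div>'
    'By: Stein, Mark. <strong>Organization Studies</strong>. 2007, Vol. 28. '
    'Abstract: Some text. [ABSTRACT FROM AUTHOR]'
    '</div>'
)

node = BeautifulSoup(html, 'html.parser').select_one('.display-info')

# Default: text nodes are concatenated with nothing between them,
# so "Academic Journal" runs straight into "By:".
joined = node.get_text()

# With a separator, each stripped text node lands on its own line,
# so a pattern such as r'By:.*' stops at the tag boundary.
lines = node.get_text(strip=True, separator='\n').split('\n')

print(joined)
print(lines)
```

Here the second element of lines is exactly the author line, because the <strong> tag that follows it starts a new text node.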

If the variable txt contains the HTML text from the question, then this script:

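import re
from textwrap import wrap

from bs4 import BeautifulSoup

soup = BeautifulSoup(txt, 'html.parser')

# Put each text node on its own line so the regexes can work line by line.
txt = soup.select_one('.display-info').get_text(strip=True, separator='\n')

# The author is everything from "By:" to the end of that line.
author = re.findall(r'By:.*', txt)[0]
# The abstract runs from "Abstract:" up to the "[ABSTRACT FROM AUTHOR]" marker.
abstract = re.findall(r'Abstract:.*?(?=\[ABSTRACT FROM AUTHOR\])', txt, flags=re.S)[0]

print(author)
print(*wrap(abstract.replace('\n', ' ')), sep='\n')

# or in case Python2 just:
# print author
# print abstract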

Prints:

By: Stein, Mark.
Abstract: While the literature on front-line service work utilizes a
variety of productive images, I argue that these images do not capture
certain of the more problematic experiences of front-line service
employees. Drawing on words used by these workers themselves, and
using concepts from psychoanalysis and its application to
organizational dynamics, I therefore propose a new image, that of
toxicity. I argue that — especially when under severe pressure from
customers — front-line workers may have the unconscious fantasy that
they have been polluted by toxic substances. The unconscious
experience of the entry of toxic material is likely to result in
further contagion of relationships such as those among employees and
between employees and customers. This may also result in workers
retaliating against customers by exacting revenge on them. A downward
spiralling of relationships may follow, with the result that large
parts of the work environment are experienced as toxic. The
implications for theory are explored. In conclusion, I argue that the
theme of toxicity helps us connect the employee-customer interface
with a deep reservoir of primordial human experience that links the
body with emotions.
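
If the same extraction has to run over many records, indexing findall(...)[0] raises IndexError whenever a record lacks an author or abstract. The sketch below is one possible variation, not part of the original answer: it wraps the same two patterns in a hypothetical helper, extract_author_abstract, that uses re.search and returns None for anything it cannot find. It assumes every record uses the same .display-info container and the same "[ABSTRACT FROM AUTHOR]" marker as the question's HTML.

```python
import re

from bs4 import BeautifulSoup


def extract_author_abstract(html):
    """Return (author, abstract); either value is None if not found.

    Hypothetical helper; the selector and marker are taken from the
    question's HTML and may not hold for other records.
    """
    node = BeautifulSoup(html, 'html.parser').select_one('.display-info')
    if node is None:
        return None, None

    text = node.get_text(strip=True, separator='\n')

    author_match = re.search(r'By:.*', text)
    abstract_match = re.search(
        r'Abstract:.*?(?=\[ABSTRACT FROM AUTHOR\])', text, flags=re.S
    )

    author = author_match.group(0) if author_match else None
    abstract = None
    if abstract_match:
        # Collapse the newlines introduced by the separator into spaces.
        abstract = ' '.join(abstract_match.group(0).split())
    return author, abstract


sample = (
    '<div class="display-info">By: Doe, Jane. <strong>Some Journal</strong>. '
    '2020. Abstract: First <strong>bold</strong> part. '
    '[ABSTRACT FROM AUTHOR]</div>'
)
author, abstract = extract_author_abstract(sample)
print(author)
print(abstract)
```

Because the abstract pattern relies on a lookahead for the marker, a record without "[ABSTRACT FROM AUTHOR]" yields None for the abstract rather than a partial match.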
