Remove CDATA from XML

FallenAngel

I am working with a SOAP API using python-suds.

The API returns a result, and suds parses it according to the WSDL. The result contains a field that holds XML data:

(MyServiceResult){
    errorMsg = "Error Message here..."
    sessionId = "..."
    outputDataXML = "<![CDATA[<Results>.....<Details>....</Details></Results>]]>"
    errorCode = "00"
 }

I planned to use xml.etree.ElementTree to parse the XML in outputDataXML. But because the returned data starts with <![CDATA[, the XML parser fails with

ParseError: syntax error: line 1, column 0

What is the best approach in such a situation, other than using a regex?

unutbu

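A CDATA section is only valid inside an element, so wrap the data in a dummy root element first. Then call ET.fromstring once to extract the text from the CDATA, and call ET.fromstring a second time to parse that text as XML: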

import xml.etree.ElementTree as ET

d = '<![CDATA[<Results>.....<Details>....</Details></Results>]]>'
fix = '<root>{}</root>'.format(d)

content = ET.fromstring(fix).text
print(repr(content))
# '<Results>.....<Details>....</Details></Results>'

results = ET.fromstring(content)
print(ET.tostring(results))
# <Results>.....<Details>....</Details></Results>
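
The two steps above can be folded into a small helper. This is only a sketch: the function name parse_wrapped_xml and the element name wrapper are made up for illustration, and it assumes the payload is either a single CDATA section or a single plain XML element with no XML declaration.

```python
import xml.etree.ElementTree as ET


def parse_wrapped_xml(payload):
    """Parse an XML string that may or may not be wrapped in a CDATA section.

    Hypothetical helper, not part of suds or ElementTree.
    """
    # Wrap the payload in a throwaway element so that a top-level CDATA
    # section becomes legal character data. "wrapper" is an arbitrary name.
    wrapper = ET.fromstring('<wrapper>{}</wrapper>'.format(payload))

    if len(wrapper):
        # The payload was plain XML, so it was parsed as a child element.
        return wrapper[0]

    # The payload was CDATA, so it was kept as text and needs a second parse.
    return ET.fromstring(wrapper.text)


results = parse_wrapped_xml('<![CDATA[<Results><Details>ok</Details></Results>]]>')
print(results.tag)                   # Results
print(results.find('Details').text)  # ok
```

With this shape, the calling code does not need to know whether the service wrapped the field in CDATA. An empty payload would still raise an error in the second parse, so a real caller would probably want to check for that first.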
