I am a beginner in Python and XPath, and I need to read an XML document with non-uniform nodes (similar to the one below) using XPath. The format of the output to be written to a file is also shown below. The code uses the lxml library.
Please help me build a correct XPath expression.
Source XML
<Classes>
<German>
<Student>
<Span><a href="">John</a></Span>
</Student>
<Student>
<Span>Adam</Span>
</Student>
</German>
<English>
<Student>
<Span>Mary</Span>
</Student>
</English>
<French>
<Student>
<Span><a href="">Anil</a></Span>
</Student>
<Student>
<Span><a href="">Jack</a></Span>
</Student>
</French>
<Spanish>
<Student>
<Span>Mary</Span>
</Student>
<Student>
<Span>Jack</Span>
</Student>
</Spanish>
</Classes>
Expected output
German
John
Adam
English
Mary
French
Anil
Jack
Spanish
Mary
Jack
Thanks, Nikhil
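This code should help. It parses the document with lxml.html, which lowercases tag names, so the XPath expressions use lowercase names and each class tag is capitalized again before printing: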
from lxml import html
xml_content = """<Classes>
<German>
<Student>
<Span><a href="">John</a></Span>
</Student>
<Student>
<Span>Adam</Span>
</Student>
</German>
<English>
<Student>
<Span>Mary</Span>
</Student>
</English>
<French>
<Student>
<Span><a href="">Anil</a></Span>
</Student>
<Student>
<Span><a href="">Jack</a></Span>
</Student>
</French>
<Spanish>
<Student>
<Span>Mary</Span>
</Student>
<Student>
<Span>Jack</Span>
</Student>
</Spanish>
</Classes>"""
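# lxml.html lowercases tag names, so the XPath uses lowercase names.
tree = html.fromstring(xml_content)
classes = tree.xpath('//classes/*')
for language_class in classes:
    # Restore the leading capital for display, e.g. "german" -> "German".
    print(language_class.tag.capitalize())
    # //text() also picks up names wrapped in <a> inside <span>.
    for student in language_class.xpath('.//student/span//text()'):
        print(" {}".format(student))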
Output:
German
 John
 Adam
English
 Mary
French
 Anil
 Jack
Spanish
 Mary
 Jack
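
If you would rather keep the tag names exactly as written in the XML, one possible alternative is to parse with lxml.etree instead of lxml.html, since the XML parser is case-sensitive. The sketch below is not the code from the answer above. It runs on a shortened copy of the XML from the question, and the helper names class_rosters and format_rosters are made up for illustration. It uses the XPath string(.) function to read a student's name whether or not it is wrapped in an <a> element.

```python
from lxml import etree

# Shortened copy of the question's XML (two classes only) to keep the sketch small.
xml_content = """<Classes>
  <German>
    <Student><Span><a href="">John</a></Span></Student>
    <Student><Span>Adam</Span></Student>
  </German>
  <English>
    <Student><Span>Mary</Span></Student>
  </English>
</Classes>"""


def class_rosters(xml_text):
    """Return a list of (class_name, [student_names]) pairs in document order."""
    root = etree.fromstring(xml_text)
    rosters = []
    # The XML parser is case-sensitive, so tag names are matched as written.
    for language_class in root.xpath('/Classes/*'):
        names = []
        for span in language_class.xpath('./Student/Span'):
            # string(.) concatenates all descendant text, so it handles both
            # <Span>Adam</Span> and <Span><a href="">John</a></Span>.
            names.append(span.xpath('string(.)').strip())
        rosters.append((language_class.tag, names))
    return rosters


def format_rosters(rosters):
    """Render rosters as text: class name, then one indented name per line."""
    lines = []
    for class_name, names in rosters:
        lines.append(class_name)
        lines.extend(" " + name for name in names)
    return "\n".join(lines)


rosters = class_rosters(xml_content)
print(format_rosters(rosters))
```

To write the result to a file rather than print it, the string returned by format_rosters can be passed to the write method of a file opened in text mode.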