Python Parsing XML

Parsing XML with Python using the standard library xml.etree.ElementTree. It converts XML text into a tree of Element objects you can traverse and query.

Basic Parsing

import xml.etree.ElementTree as ET

data = '''<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>'''

tree = ET.fromstring(data)
print('Name:', tree.find('name').text)
print('Attr:', tree.find('email').get('hide'))

Parsing from a File

tree = ET.parse('data.xml')
root = tree.getroot()
print(root.tag)

Iterating over Children

for child in root:
    print(child.tag, child.attrib)

Use iter() to walk the entire tree recursively:

for elem in root.iter('phone'):
    print(elem.text.strip())

Accessing Attributes and Text

elem = root.find('phone')
elem.text.strip()       # text content
elem.get('type')        # attribute value (None if missing)
elem.attrib             # dict of all attributes

Finding Multiple Elements

# All direct children matching a tag
root.findall('item')

# XPath-style expressions
root.findall('./items/item')
root.find('./items/item[@id="1"]')

Handling Namespaces

XML namespaces appear as {uri}tag. Register a prefix to keep queries readable:

ns = {'ns': 'http://example.com/schema'}
root.find('ns:item', ns)

For larger or more complex XML, consider lxml which is faster and supports full XPath:

pip install lxml

social