The Python Oracle

Parsing an arbitrary XML file with ElementTree

Full question
https://stackoverflow.com/questions/1551...

Answer 1 links:
[using BeautifulSoup]: http://www.crummy.com/software/Beautiful.../

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #xml #elementtree




ACCEPTED ANSWER

Score 4


Using ElementTree:

import xml.etree.ElementTree as et

# Passing a filename lets parse() open and close the file itself
raw_data = et.parse("file.xml")
data_root = raw_data.getroot()

# Walk two levels down: each child of the root, then each grandchild
for children in data_root:
    for child in children:
        print(child.tag, child.text, children.tag, children.text)

That will give you an overview of the XML tags and the text inside them. You can add more loops to step further into the tree, and perform checks to see whether any of the children contain further levels. I find this method useful when the names of the XML tags vary and do not follow an already known standard.
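
If the depth of the tree is not known in advance, one alternative to adding a loop per level is to recurse. The following is only a sketch and not part of the original answer: the `walk` helper and the `SAMPLE` document are hypothetical names made up for illustration, and the only library calls used are `et.fromstring` and plain iteration over an element.

```python
import xml.etree.ElementTree as et

# Hypothetical sample document; any well-formed XML would do.
SAMPLE = "<root><a><b>one</b><c><d>two</d></c></a><e>three</e></root>"


def walk(element, depth=0):
    """Yield (depth, tag, text) for element and all of its descendants."""
    # element.text is None when a tag has no leading text
    text = (element.text or "").strip()
    yield depth, element.tag, text
    # Iterating over an element yields its direct children
    for child in element:
        yield from walk(child, depth + 1)


rows = list(walk(et.fromstring(SAMPLE)))
for depth, tag, text in rows:
    # Indent each tag by its depth in the tree
    print("  " * depth + tag, text)
```

For a file on disk, `et.parse("file.xml").getroot()` could be passed to `walk` in place of `et.fromstring(SAMPLE)`.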




ANSWER 2

Score 0


An example using BeautifulSoup:

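import sys
from bs4 import BeautifulSoup

path = sys.argv[1]
# Read the whole file, closing it as soon as the block exits
with open(path) as f:
    markup = f.read()
# Name the parser explicitly; html.parser ships with Python and lowercases tag names
soup = BeautifulSoup(markup, "html.parser")

for table in soup.find_all("target_table"):
    for loc in table.find_all("rep"):
        print(loc.xlocation.string + ", " + loc.ylocation.string)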

Output

nextXREL, nextYREL