Pretty printing XML in Python
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Pretty Printing Xml In Python
00:12 Answer 1 Score 195
00:30 Accepted Answer Score 472
00:39 Answer 3 Score 49
01:14 Answer 4 Score 123
01:41 Thank you
--
Full question
https://stackoverflow.com/questions/7497...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #xml #prettyprint
#avk47
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Pretty Printing Xml In Python
00:12 Answer 1 Score 195
00:30 Accepted Answer Score 472
00:39 Answer 3 Score 49
01:14 Answer 4 Score 123
01:41 Thank you
--
Full question
https://stackoverflow.com/questions/7497...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #xml #prettyprint
#avk47
ACCEPTED ANSWER
Score 472
import xml.dom.minidom
dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()
ANSWER 2
Score 195
lxml is recent, updated, and includes a pretty print function
import lxml.etree as etree
x = etree.parse("filename")
print etree.tostring(x, pretty_print=True)
Check out the lxml tutorial: https://lxml.de/tutorial.html
ANSWER 3
Score 123
Another solution is to borrow this indent function, for use with the ElementTree library that's built in to Python since 2.5.
Here's what that would look like:
from xml.etree import ElementTree
def indent(elem, level=0):
i = "\n" + level*" "
j = "\n" + (level-1)*" "
if len(elem):
if not elem.text or not elem.text.strip():
elem.text = i + " "
if not elem.tail or not elem.tail.strip():
elem.tail = i
for subelem in elem:
indent(subelem, level+1)
if not elem.tail or not elem.tail.strip():
elem.tail = j
else:
if level and (not elem.tail or not elem.tail.strip()):
elem.tail = j
return elem
root = ElementTree.parse('/tmp/xmlfile').getroot()
indent(root)
ElementTree.dump(root)
ANSWER 4
Score 49
Here's my (hacky?) solution to get around the ugly text node problem.
uglyXml = doc.toprettyxml(indent=' ')
text_re = re.compile('>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL)
prettyXml = text_re.sub('>\g<1></', uglyXml)
print prettyXml
The above code will produce:
<?xml version="1.0" ?>
<issues>
<issue>
<id>1</id>
<title>Add Visual Studio 2005 and 2008 solution files</title>
<details>We need Visual Studio 2005/2008 project files for Windows.</details>
</issue>
</issues>
Instead of this:
<?xml version="1.0" ?>
<issues>
<issue>
<id>
1
</id>
<title>
Add Visual Studio 2005 and 2008 solution files
</title>
<details>
We need Visual Studio 2005/2008 project files for Windows.
</details>
</issue>
</issues>
Disclaimer: There are probably some limitations.