Pretty printing XML in Python
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Pretty Printing Xml In Python
00:12 Answer 1 Score 195
00:30 Accepted Answer Score 472
00:39 Answer 3 Score 49
01:14 Answer 4 Score 123
01:41 Thank you
--
Full question
https://stackoverflow.com/questions/7497...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #xml #prettyprint
#avk47
    Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Pretty Printing Xml In Python
00:12 Answer 1 Score 195
00:30 Accepted Answer Score 472
00:39 Answer 3 Score 49
01:14 Answer 4 Score 123
01:41 Thank you
--
Full question
https://stackoverflow.com/questions/7497...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #xml #prettyprint
#avk47
ACCEPTED ANSWER
Score 472
import xml.dom.minidom
dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()
ANSWER 2
Score 195
lxml is recent, updated, and includes a pretty print function
import lxml.etree as etree
x = etree.parse("filename")
print etree.tostring(x, pretty_print=True)
Check out the lxml tutorial: https://lxml.de/tutorial.html
ANSWER 3
Score 123
Another solution is to borrow this indent function, for use with the ElementTree library that's built in to Python since 2.5.
Here's what that would look like:
from xml.etree import ElementTree
def indent(elem, level=0):
    i = "\n" + level*"  "
    j = "\n" + (level-1)*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for subelem in elem:
            indent(subelem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = j
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = j
    return elem        
root = ElementTree.parse('/tmp/xmlfile').getroot()
indent(root)
ElementTree.dump(root)
ANSWER 4
Score 49
Here's my (hacky?) solution to get around the ugly text node problem.
uglyXml = doc.toprettyxml(indent='  ')
text_re = re.compile('>\n\s+([^<>\s].*?)\n\s+</', re.DOTALL)    
prettyXml = text_re.sub('>\g<1></', uglyXml)
print prettyXml
The above code will produce:
<?xml version="1.0" ?>
<issues>
  <issue>
    <id>1</id>
    <title>Add Visual Studio 2005 and 2008 solution files</title>
    <details>We need Visual Studio 2005/2008 project files for Windows.</details>
  </issue>
</issues>
Instead of this:
<?xml version="1.0" ?>
<issues>
  <issue>
    <id>
      1
    </id>
    <title>
      Add Visual Studio 2005 and 2008 solution files
    </title>
    <details>
      We need Visual Studio 2005/2008 project files for Windows.
    </details>
  </issue>
</issues>
Disclaimer: There are probably some limitations.