et_xmlfile - An incremental writer for Python’s xml.etree
- Author:
Elias Rabel, Charlie Clark, Daniel Hillier and contributors
- Source code:
- Issues:
- Generated:
Oct 25, 2024
- License:
MIT/Expat
- Version:
2.0.0
et_xmlfile
XML can use lots of memory, and et_xmlfile is a low-memory library for creating large XML files. Although the standard library already includes an incremental parser, iterparse, it has no equivalent for writing XML. With et_xmlfile, once an element has been added to the tree, it is written to the file or stream and the memory is then cleared.
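For comparison, the read side mentioned above can be sketched with the standard library alone. This is a minimal illustration, not part of et_xmlfile; the sample document and the tag names rows and row are invented for the example.

```python
from io import BytesIO
from xml.etree.ElementTree import iterparse

# Hypothetical sample document; in practice this would be a large file.
data = BytesIO(b"<rows><row>1</row><row>2</row><row>3</row></rows>")

total = 0
# "end" events fire once an element has been read completely.
for event, elem in iterparse(data, events=("end",)):
    if elem.tag == "row":
        total += int(elem.text)
        elem.clear()  # drop the element's contents once it has been processed

print(total)
```

Each row is handled and cleared as soon as it has been parsed, which is the reading counterpart of what et_xmlfile aims to offer for writing.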
This module is based upon the xmlfile module from lxml with the aim of allowing code to be developed that will work with both libraries. It was developed initially for the openpyxl project, but is now a standalone module.
The code was written by Elias Rabel as part of the Python Düsseldorf openpyxl sprint in September 2014.
Proper support for incremental writing was provided by Daniel Hillier in 2024.
Note on performance
The code was not developed with performance in mind, but it turned out to be faster than the existing SAX-based implementation, although it is generally slower than lxml’s xmlfile. There is one area where an optimisation for lxml may negatively affect the performance of et_xmlfile: using the .element() method on the xmlfile context manager. It is therefore recommended simply to create Elements and write these directly, as in the sample code.
Sample code:
from io import BytesIO
from xml.etree.ElementTree import Element
from et_xmlfile import xmlfile
out = BytesIO()
with xmlfile(out) as xf:
    el = Element("root")
    xf.write(el)  # write the XML straight to the file-like object
assert out.getvalue() == b"<root />"
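Building on the sample, the following is a minimal sketch of streaming several children under one enclosing tag. It assumes that .element(), the method mentioned in the performance note, can be used as a context manager that opens and closes a tag in the same way as in lxml's xmlfile. The tag names rows and row are hypothetical. .element() is used only once for the enclosing tag, and the children are created as Elements and written directly, as recommended.

```python
from io import BytesIO
from xml.etree.ElementTree import Element, fromstring
from et_xmlfile import xmlfile

out = BytesIO()
with xmlfile(out) as xf:
    # Assumption: element() works as a context manager, as in lxml's xmlfile.
    with xf.element("rows"):  # hypothetical enclosing tag
        for i in range(3):
            el = Element("row")  # hypothetical child tag
            el.text = str(i)
            xf.write(el)  # each child is handed to the writer as it is created

# Parse the result back to check its structure without relying on exact bytes.
root = fromstring(out.getvalue())
texts = [child.text for child in root]
```

Checking the parsed structure rather than the raw bytes keeps the example independent of serialisation details such as how empty elements are written.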