Merging two XML documents with ElementTree
A simple recipe for merging several XML documents in Python with the ElementTree module
In my current project (for Headnet.dk) we get data from an external source, a Xerox DocuShare instance. DocuShare provides an XML API over HTTP. The API looks quite complex, and after a quick search I couldn't find a way to get all documents using the search functionality. The recipe below merges several XML files into a single document.
import xml.etree.ElementTree as ET

# FILES (list of XML file names) and URLS['destination'] (output path)
# are defined elsewhere in the project.
root = None
for filename in FILES:
    tree = ET.parse(filename)
    if root is None:
        # the first document becomes the tree all the others are merged into
        root = tree.getroot()
    else:
        # append the top-level children of every further document
        root.extend(tree.getroot())

# Python 3: encoding='unicode' makes tostring() return str instead of bytes
merged = ET.tostring(root, encoding='unicode')

# this line is hacky and totally stupid, but it looks like
# that solr xpath parser is broken more than they stated
# and cannot handle xml with ns0: namespace :(
correct_data = merged.replace('ns0:', '')

with open(URLS['destination'], 'w') as filewrite:
    filewrite.write(correct_data)
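
The string replacement of 'ns0:' can also be avoided by removing the namespace from the tags before serializing. The following is only a minimal sketch of that idea, not the code used in the project: the helper names `merge_xml` and `strip_namespaces`, the example namespace URI, and the two tiny in-memory documents are all made up for illustration. It only rewrites element tags; namespaced attributes are left untouched.

```python
import io
import xml.etree.ElementTree as ET


def merge_xml(sources):
    """Merge the top-level children of every source into the first root.

    `sources` is an iterable of file names or binary file objects
    (hypothetical helper, not part of ElementTree).
    """
    root = None
    for source in sources:
        current = ET.parse(source).getroot()
        if root is None:
            root = current          # the first document is the merge target
        else:
            root.extend(current)    # append the children of the other roots
    return root


def strip_namespaces(root):
    """Drop the '{uri}' prefix from every tag so no 'ns0:' gets serialized."""
    for element in root.iter():
        if isinstance(element.tag, str) and element.tag.startswith('{'):
            element.tag = element.tag.split('}', 1)[1]
    return root


# Two in-memory documents stand in for saved responses (made-up data).
first = io.BytesIO(b'<r xmlns="http://example.com/ns"><doc>a</doc></r>')
second = io.BytesIO(b'<r xmlns="http://example.com/ns"><doc>b</doc></r>')

merged = strip_namespaces(merge_xml([first, second]))
merged_xml = ET.tostring(merged, encoding='unicode')
print(merged_xml)  # <r><doc>a</doc><doc>b</doc></r>
```

Compared with the string replacement, this cannot accidentally change text content that happens to contain 'ns0:', and the namespace declaration disappears from the output as well.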
