Include one XML within another XML and parse it with Python
I want to include an XML file in another XML file and parse the result with Python. I am trying to achieve this through XInclude. There is a file1.xml which looks like:
<?xml version="1.0"?>
<root>
<document xmlns:xi="http://www.w3.org/2001/XInclude">
<xi:include href="file2.xml" parse="xml" />
</document>
<test>some text</test>
</root>
and file2.xml, which looks like:
<para>This is a paragraph.</para>
Now in my Python code I tried to access it like this:
from xml.etree import ElementTree, ElementInclude
tree = ElementTree.parse("file1.xml")
root = tree.getroot()
for child in root:
    print(child.tag)
It prints the tags of all child elements of root:
document
test
Now when I try to print the child objects directly like
print(root.document)
print(root.test)
it says that root doesn't have children named test or document. How am I supposed to access the content of file2.xml, then?
I know that I can access XML elements from Python with a schema, like this:
from lxml import etree, objectify
schema = etree.XMLSchema(objectify.fromstring(configSchema))
xmlParser = objectify.makeparser(schema=schema)
cfg = objectify.fromstring(xmlContents, xmlParser)
print(cfg.elementName)  # access element
But since one XML file is included in another here, I am not sure how to write the schema. How can I solve this?
Solution 1:[1]
Not sure why you want to use XInclude, but including an XML file in another one is a basic mechanism of SGML and XML, and can be achieved without XInclude by declaring an external entity, as simply as:
<!DOCTYPE root [
<!ENTITY externaldoc SYSTEM "file2.xml">
]>
<root>
<document>
&externaldoc;
</document>
<test>some text</test>
</root>
Solution 2:[2]
The snippet below parses the two documents separately and inserts the second one into the first:
import xml.etree.ElementTree as ET
xml1 = '''<?xml version="1.0"?>
<root>
<test>some text</test>
</root>'''
xml2 = '''<para>This is a paragraph.</para>'''
root1 = ET.fromstring(xml1)
root2 = ET.fromstring(xml2)
root1.insert(0,root2)
para_value = root1.find('.//para').text
print(para_value)
Output:
This is a paragraph.
Solution 3:[3]
You need to make xml.etree include the files referenced with xi:include. I have added the key line to your original example:
from xml.etree import ElementTree, ElementInclude
tree = ElementTree.parse("file1.xml")
root = tree.getroot()
# Here you make the parser actually include every referenced file
ElementInclude.include(root)
# and now you are good to go
for child in root:
    print(child.tag)
For a detailed reference about includes in Python, see the "XInclude support" section of the official Python documentation: https://docs.python.org/3/library/xml.etree.elementtree.html
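To see the whole round trip in one place, here is a minimal, self-contained sketch for Python 3. It writes the two files from the question into a temporary directory, expands the xi:include, and then reaches the included element with find(). The helper name load_with_includes and the custom loader are my own illustration, not part of the original answer; the loader is only there so that href is resolved relative to file1.xml instead of the current working directory.

```python
import os
import tempfile
from xml.etree import ElementTree, ElementInclude

# The two documents from the question, reproduced so the sketch is runnable.
FILE1 = """<?xml version="1.0"?>
<root>
  <document xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:include href="file2.xml" parse="xml" />
  </document>
  <test>some text</test>
</root>
"""
FILE2 = "<para>This is a paragraph.</para>"


def load_with_includes(path):
    """Hypothetical helper: parse `path` and expand its xi:include elements."""
    base_dir = os.path.dirname(os.path.abspath(path))

    def loader(href, parse, encoding=None):
        # Assumption: hrefs are plain relative file names next to `path`.
        target = os.path.join(base_dir, href)
        return ElementInclude.default_loader(target, parse, encoding)

    root = ElementTree.parse(path).getroot()
    ElementInclude.include(root, loader=loader)  # modifies root in place
    return root


with tempfile.TemporaryDirectory() as tmp:
    for name, content in (("file1.xml", FILE1), ("file2.xml", FILE2)):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as fh:
            fh.write(content)
    root = load_with_includes(os.path.join(tmp, "file1.xml"))

# Children are reached with find()/findall(), not attribute access.
child_tags = [child.tag for child in root]
para_text = root.find("document/para").text
print(child_tags)
print(para_text)
```

Note that ElementTree elements do not expose children as attributes, so root.document fails even after the include has been expanded; root.find("document") is the equivalent lookup.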
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | imhotap |
Solution 2 | |
Solution 3 | Manolo Conesa |