Add new element under specific element category in XML using Python
Edit 2: It seems I got the persistence of the changes wrong. The script only records the new elements for the last record in the CSV file. How do I keep the added elements in the XML tree after they are added?
#########edit end#########
Edit: After running the code from furas, I can achieve what I want, with some caveats.
My CSV is:
orderid,price,tax,gift
1,23.00,0.00,false
Order 1's first product is already in the XML. The script will add the second product to the XML.
If I add a 3rd product:
orderid,price,tax,gift
1,23.00,0.00,false
1,44.00,0.00,false
and run it again, the third product is not added. Moreover, if I change the price from 23.00 to 55.00 and run it, the price in the XML remains 23.00. Is this some cache? I am running on Linux Mint. Very odd. My code is:
from lxml import etree
import io

csv=open('orders_export (4).csv').readlines()
for i in csv[1:]:
    P=i.split(',')
    if len(P[2])==0:
        orderid=P[0]
        tree = etree.parse(io.BytesIO(allxml.encode()))
        root = tree.getroot()
        items=root.find('.//{http://www.demandware.com/xml/impex/order/2006-10-31}order[@order-no="#PRIME1036"]/{http://www.demandware.com/xml/impex/order/2006-10-31}product-lineitems')
        new_item = etree.SubElement(items, 'product-lineitem')
        net_price = etree.SubElement(new_item, 'net_price')
        net_price.text = P[18]
        tax = etree.SubElement(new_item, 'tax')
        tax.text = '0.0'
        net_price2 = etree.SubElement(new_item, 'gross-price')
        net_price2.text = P[18]
        tree = etree.ElementTree(root)
        c+=1
tree.write('output.xml', xml_declaration=True, encoding='UTF-8')
################# end edit###################
I have an XML file showing some store orders.
Each order has only one product at the moment, e.g. order 1: one pair of jeans; order 2: one ball.
Some of the orders actually have more than one product. I have a CSV with the extra products per order, and I am stuck on how to add them to the right order.
My XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
  <order order-no="#PRIME1036">
    <product-lineitems>
      <product-lineitem>
        <net-price>29.99</net-price>
        <tax>0.83</tax>
        <gift>false</gift>
      </product-lineitem>
    </product-lineitems>
  </order>
  <order>
  .
  ..
Any idea how I can add the extra product(s) under the order with the specific order-no attribute #PRIME1036? I.e. select the order by its attribute, find the element, and add subelements under it?
I need to get:
<product-lineitems>
  <product-lineitem>
    <net-price>29.99</net-price>
    <tax>0.83</tax>
    <gift>false</gift>
  </product-lineitem>
  <product-lineitem>
    <net-price>999.99</net-price>
    <tax>0</tax>
    <gift>false</gift>
  </product-lineitem>
</product-lineitems>
Solution 1:[1]
Python has special modules to work with XML
- standard xml (xml.dom, xml.etree)
- lxml
- BeautifulSoup
You can read from a file with root = etree.parse(filename).getroot()
or from a string with root = etree.fromstring(text.encode()).
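A minimal sketch of the reading options above. It uses the standard-library xml.etree.ElementTree (listed above as an alternative) rather than lxml, so it runs without extra packages; parse(), fromstring() and getroot() have the same names in lxml. The temporary orders.xml file and the trimmed sample document are assumptions made for the example.

```python
import io
import os
import tempfile
import xml.etree.ElementTree as ET

# Trimmed sample document (assumption: same root and namespace as in the question).
text = '''<?xml version="1.0" encoding="UTF-8"?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
  <order order-no="#PRIME1036"/>
</orders>'''

# 1) Parse bytes held in memory; fromstring() returns the root element directly.
root_from_bytes = ET.fromstring(text.encode("utf-8"))

# 2) Parse a file-like object; parse() returns an ElementTree, so call getroot().
root_from_stream = ET.parse(io.BytesIO(text.encode("utf-8"))).getroot()

# 3) Parse a file on disk (a temporary file stands in for the real export).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "orders.xml")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(text)
    root_from_file = ET.parse(path).getroot()

# All three routes yield the same namespace-qualified root tag.
print(root_from_bytes.tag)
print(root_from_bytes.tag == root_from_stream.tag == root_from_file.tag)
```

Note that the root tag already carries the namespace in braces, which is why the plain name orders would not match it in a search.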
lxml uses XPath to search for elements. Because your file has xmlns="...", find() needs the namespaces; otherwise you would have to search for {http://www.demandware.com/xml/impex/order/2006-10-31}order and {http://www.demandware.com/xml/impex/order/2006-10-31}product-lineitems instead of order and product-lineitems.
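To illustrate the namespace point, here is a small sketch comparing three lookups. It uses the standard-library xml.etree.ElementTree instead of lxml so that it runs without extra packages; the prefix o is an arbitrary name chosen for this example, and the sample document is trimmed down from the question.

```python
import xml.etree.ElementTree as ET

NS = "http://www.demandware.com/xml/impex/order/2006-10-31"

# Trimmed sample (assumption: same structure as the file in the question).
text = f'''<orders xmlns="{NS}">
  <order order-no="#PRIME1036">
    <product-lineitems/>
  </order>
</orders>'''
root = ET.fromstring(text)

# 1) Plain tag names: nothing matches, because every element is in the namespace.
plain = root.find('.//order[@order-no="#PRIME1036"]/product-lineitems')

# 2) Fully qualified names: the namespace URI in braces before each tag.
qualified = root.find(
    f'.//{{{NS}}}order[@order-no="#PRIME1036"]/{{{NS}}}product-lineitems'
)

# 3) Prefix map: bind a prefix of your choice to the URI and use it in the path.
prefixed = root.find(
    './/o:order[@order-no="#PRIME1036"]/o:product-lineitems',
    namespaces={"o": NS},
)

print(plain)                  # None
print(qualified is prefixed)  # True
```

The prefix map keeps the path readable, while the fully qualified form needs no extra argument; both select the same element.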
Next, you can use SubElement(element, tag) (or element.append(Element(tag))) to add a tag, and .text to set the text in this tag.
Finally, you can use etree.tostring(root) to generate a string, or etree.ElementTree(root).write(filename) to write to a file.
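The two ways of adding a tag, and the two ways of producing output, can be sketched side by side. This is only an illustration using the standard-library xml.etree.ElementTree rather than lxml; the helper q() is a hypothetical name introduced here to build namespace-qualified tags, and an in-memory buffer stands in for the output file.

```python
import io
import xml.etree.ElementTree as ET

NS = "http://www.demandware.com/xml/impex/order/2006-10-31"
# Ask ElementTree to serialize NS as the default namespace (no "ns0:" prefixes).
ET.register_namespace("", NS)


def q(tag):
    """Hypothetical helper: return the namespace-qualified form of a tag."""
    return f"{{{NS}}}{tag}"


items = ET.Element(q("product-lineitems"))

# Variant 1: SubElement creates the child and attaches it in one call.
first = ET.SubElement(items, q("product-lineitem"))
ET.SubElement(first, q("net-price")).text = "29.99"

# Variant 2: build the element separately, then append it.
second = ET.Element(q("product-lineitem"))
price = ET.Element(q("net-price"))
price.text = "999.99"
second.append(price)
items.append(second)

# Output option 1: serialize to a string.
xml_text = ET.tostring(items, encoding="unicode")
print(xml_text)

# Output option 2: write through an ElementTree (a buffer replaces the file here).
buffer = io.BytesIO()
ET.ElementTree(items).write(buffer, xml_declaration=True, encoding="UTF-8")
print(buffer.getvalue()[:5])  # b'<?xml'
```

Both variants end up with the same structure; SubElement is shorter when the parent is already known, while append is handy when the element is built elsewhere first.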
text = '''<?xml version="1.0" encoding="UTF-8"?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
  <order order-no="#PRIME1036">
    <product-lineitems>
      <product-lineitem>
        <net-price>29.99</net-price>
        <tax>0.83</tax>
        <gift>false</gift>
      </product-lineitem>
    </product-lineitems>
  </order>
</orders>'''
from lxml import etree
import io

# I have to encode because the XML declares `encoding="UTF-8"`
#root = etree.fromstring(text.encode())
#tree = etree.parse(filename)
tree = etree.parse(io.BytesIO(text.encode()))
root = tree.getroot()

print('nsmap:', root.nsmap)

# find element
items = root.find('.//order[@order-no="#PRIME1036"]/product-lineitems', namespaces=root.nsmap)

# add new subelement (tag is `net-price`, with a hyphen, to match the existing items)
new_item = etree.SubElement(items, 'product-lineitem')

net_price = etree.SubElement(new_item, 'net-price')
net_price.text = '999.99'

tax = etree.SubElement(new_item, 'tax')
tax.text = '0'

gift = etree.SubElement(new_item, 'gift')
gift.text = 'false'

print('\n--- result ---\n')

# `pretty_print=True` in `tostring()` and `write()` doesn't work when items are added to old XML,
# so reformat the tree with `indent()` first
etree.indent(root, space="  ")

print(etree.tostring(root, xml_declaration=True, encoding='UTF-8').decode())

tree = etree.ElementTree(root)
tree.write('output.xml', xml_declaration=True, encoding='UTF-8')
Result:
<?xml version='1.0' encoding='UTF-8'?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
  <order order-no="#PRIME1036">
    <product-lineitems>
      <product-lineitem>
        <net-price>29.99</net-price>
        <tax>0.83</tax>
        <gift>false</gift>
      </product-lineitem>
      <product-lineitem>
        <net-price>999.99</net-price>
        <tax>0</tax>
        <gift>false</gift>
      </product-lineitem>
    </product-lineitems>
  </order>
</orders>
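Regarding the follow-up edits in the question (only the last CSV record being kept): one likely cause is that the tree is parsed again for every row, which throws away the items added for earlier rows. A minimal sketch of the alternative is to parse once, add an item per row, and serialize once at the end. It uses the standard-library xml.etree.ElementTree and csv modules instead of lxml so it runs as-is. The CSV layout, the assumption that the orderid column holds the order-no value, and the helper q() are all made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET

NS = "http://www.demandware.com/xml/impex/order/2006-10-31"
ET.register_namespace("", NS)  # serialize NS as the default namespace

xml_text = f'''<orders xmlns="{NS}">
  <order order-no="#PRIME1036">
    <product-lineitems>
      <product-lineitem>
        <net-price>29.99</net-price>
        <tax>0.83</tax>
        <gift>false</gift>
      </product-lineitem>
    </product-lineitems>
  </order>
</orders>'''

# Assumed CSV layout: orderid carries the order-no attribute value.
csv_text = '''orderid,price,tax,gift
#PRIME1036,23.00,0.00,false
#PRIME1036,44.00,0.00,false
'''


def q(tag):
    """Hypothetical helper: return the namespace-qualified form of a tag."""
    return f"{{{NS}}}{tag}"


# Parse ONCE, before the loop, so additions from earlier rows are kept.
root = ET.fromstring(xml_text)

for row in csv.DictReader(io.StringIO(csv_text)):
    order_no = row["orderid"]
    items = root.find(
        f'.//o:order[@order-no="{order_no}"]/o:product-lineitems',
        namespaces={"o": NS},
    )
    if items is None:
        continue  # no matching order in the XML; skip this row

    new_item = ET.SubElement(items, q("product-lineitem"))
    ET.SubElement(new_item, q("net-price")).text = row["price"]
    ET.SubElement(new_item, q("tax")).text = row["tax"]
    ET.SubElement(new_item, q("gift")).text = row["gift"]

# Serialize ONCE, after every row has been applied.
result = ET.tostring(root, encoding="unicode")
print(result.count("<product-lineitem>"))  # 3
```

With a real file, the same shape applies: call parse() before the loop and write() after it, so that every row modifies the same in-memory tree.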
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |