Parsing an ARXML file with Python
I have an ARXML file that I need to parse, and the order in which the information is extracted matters.
- I can parse the file and get the information, but not in the sequence in which it appears in the file.
- How can I parse the information in the <MSR-QUERY-ARG SI="HtmlAnchor"> elements?
Btw: Where can I upload the arxml file?
from xml.etree import ElementTree as ET
import csv

fpath = "test.arxml"
tree = ET.parse(fpath)
root = tree.getroot()
ns = {'ns': 'http://autosar.org/schema/r4.0'}

# First pass: SHORT-NAME of every TRACE directly under a CHAPTER
for arpackage in tree.findall('.//ns:CHAPTER/ns:TRACE', namespaces=ns):
    print(arpackage.findall('.//ns:SHORT-NAME', namespaces=ns)[0].text)

# Second pass: first MSR-QUERY-ARG of every MSR-QUERY-P-1 under a CHAPTER
for arpackage in tree.findall('.//ns:CHAPTER/ns:MSR-QUERY-P-1', namespaces=ns):
    print(arpackage.findall('.//ns:MSR-QUERY-ARG', namespaces=ns)[0].text)
Solution 1:[1]
Another approach, using the simplified_scrapy library:
from simplified_scrapy import SimplifiedDoc, utils, req
html = utils.getFileContent('test.arxml')
doc = SimplifiedDoc(html)
names = doc.selects('TRACE').selects('SHORT-NAME>text()')
msrs = doc.selects('MSR-QUERY-P-1').select('MSR-QUERY-ARG@SI="HtmlAnchor">text()')
print(names)
print(msrs)
Result:
[['S_001'], ['S_002'], ['S_003'], ['S_004'], ['S_005'], ['S_006'], ['S_007'], ['S_008'], ['S_009'], ['S_010'], ['S_011'], ['S_012'], ['S_013'], ['S_014'], ['S_015'], ['S_016'], ['S_017'], ['S_018'], ['S_019'], ['S_020'], ['S_021'], ['S_022'], ['S_023'], ['S_024'], ['S_025'], ['S_026'], ['S_027'], ['S_028'], ['S_029'], ['S_030'], ['S_031'], ['S_032'], ['S_033'], ['S_034'], ['S_035'], ['S_036'], ['S_037'], ['S_038'], ['S_039']]
['AAA_001', 'AAA_002', 'AAA_003']
Here are more examples, including parsing and updating: https://github.com/yiyedata/simplified-scrapy-demo/tree/master/doc_examples
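The result above comes back as two separate lists, so the relative order of the names and the anchors is not visible. A minimal standard-library sketch that keeps both kinds of entries in the order they appear in the file could look like the following. It relies on `Element.iter()`, which walks the tree in document order, and on the namespace from the question. The sample XML, the function name `items_in_document_order`, and the assumption that every TRACE has a direct SHORT-NAME child are hypothetical, since the real file layout is not shown; unlike the question's XPath, it also does not require TRACE to sit directly under CHAPTER.

```python
from xml.etree import ElementTree as ET

NS = "http://autosar.org/schema/r4.0"  # namespace taken from the question

# Hypothetical sample data; the layout of the real file is an assumption.
SAMPLE = f"""<AUTOSAR xmlns="{NS}">
  <CHAPTER>
    <TRACE><SHORT-NAME>S_001</SHORT-NAME></TRACE>
    <MSR-QUERY-P-1>
      <MSR-QUERY-ARG SI="HtmlAnchor">AAA_001</MSR-QUERY-ARG>
      <MSR-QUERY-ARG SI="Other">ignored</MSR-QUERY-ARG>
    </MSR-QUERY-P-1>
    <TRACE><SHORT-NAME>S_002</SHORT-NAME></TRACE>
  </CHAPTER>
</AUTOSAR>"""


def items_in_document_order(root):
    """Return (kind, text) pairs in the order they occur in the document."""
    trace_tag = f"{{{NS}}}TRACE"
    short_name_tag = f"{{{NS}}}SHORT-NAME"
    query_arg_tag = f"{{{NS}}}MSR-QUERY-ARG"

    items = []
    # iter() visits every element depth-first, which is document order.
    for element in root.iter():
        if element.tag == trace_tag:
            # Assumption: SHORT-NAME is a direct child of TRACE.
            name = element.find(short_name_tag)
            if name is not None and name.text:
                items.append(("TRACE", name.text.strip()))
        elif element.tag == query_arg_tag and element.get("SI") == "HtmlAnchor":
            # Keep only the arguments marked SI="HtmlAnchor".
            items.append(("MSR-QUERY-ARG", (element.text or "").strip()))
    return items


for kind, text in items_in_document_order(ET.fromstring(SAMPLE)):
    print(kind, text)
```

For a real file, `ET.parse("test.arxml").getroot()` would replace `ET.fromstring(SAMPLE)`. A single pass is used instead of two `findall` loops because each `findall` call returns only one kind of element, which loses the interleaving between the two kinds.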
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | dabingsou |