YAML anchor definitions loading in PyYAML
I'm using PyYAML. Is there a way to define a YAML anchor so that it won't be part of the data structure loaded by yaml.load? (I can remove "wifi_parm" from the dictionary, but I'm looking for a smarter way.)
example.yaml:
wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  name: connectivity
  <<: *wifi_params
test2:
  name: connectivity_5ghz
  <<: *wifi_params
load_example.py:
import pprint
import yaml

with open('example.yaml', 'r') as f:
    # safe_load: recent PyYAML releases require an explicit Loader for yaml.load()
    result = yaml.safe_load(f)
pprint.pprint(result)
prints:
{'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
 'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1},
 'wifi_parm': {'key': 2, 'ssid': 1}}
I need:
{'test1': {'key': 2, 'name': 'connectivity', 'ssid': 1},
 'test2': {'key': 2, 'name': 'connectivity_5ghz', 'ssid': 1}}
Solution 1:[1]
The anchor information in PyYAML is discarded before you get the result from yaml.load(). This is according to the YAML 1.1 specification that PyYAML follows ("... anchor names are a serialization detail and are discarded once composing is completed"). This has not changed in the YAML 1.2 specification (from 2009). You cannot do this in PyYAML by walking over your result (recursively) and testing what values might be anchors, without extensively modifying the parser.
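Although the loaded result no longer carries anchor names, they are still visible one layer earlier, in the event stream that `yaml.parse()` yields before composing. The following is a minimal sketch of using that to drop anchor definitions, assuming a single document whose root is a mapping with plain string keys. `load_without_anchor_definitions` is a hypothetical helper name, not part of PyYAML.

```python
import yaml

EXAMPLE = """\
wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  name: connectivity
  <<: *wifi_params
test2:
  name: connectivity_5ghz
  <<: *wifi_params
"""


def load_without_anchor_definitions(text):
    """Load YAML text and drop top-level keys whose value defines an anchor.

    Hypothetical helper: assumes one document, a root mapping, string keys.
    """
    anchored = []
    depth = 0    # number of currently open collections
    key = None   # pending top-level key; None while the next scalar is a key
    for event in yaml.parse(text, Loader=yaml.SafeLoader):
        if isinstance(event, (yaml.MappingStartEvent, yaml.SequenceStartEvent)):
            if depth == 1 and key is not None:
                # A collection that is the value of a top-level key.
                if event.anchor:
                    anchored.append(key)
                key = None
            depth += 1
        elif isinstance(event, (yaml.MappingEndEvent, yaml.SequenceEndEvent)):
            depth -= 1
        elif isinstance(event, yaml.ScalarEvent) and depth == 1:
            if key is None:
                key = event.value          # this scalar is a top-level key
            else:
                if event.anchor:           # anchored scalar value
                    anchored.append(key)
                key = None
        elif isinstance(event, yaml.AliasEvent) and depth == 1:
            key = None                     # alias used as a top-level value

    data = yaml.safe_load(text)
    for name in anchored:
        data.pop(name, None)
    return data


if __name__ == "__main__":
    print(load_without_anchor_definitions(EXAMPLE))
```

The text is parsed twice here (once for events, once for loading), which keeps the sketch short at the cost of some extra work on large files.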
In my ruamel.yaml (which is YAML 1.2), in round-trip mode, I preserve the anchors and aliases for anchors that are actually used to alias mappings or sequences (anchors and aliases are currently not preserved for scalars, nor are "unused" anchors):
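import sys
import ruamel.yaml
# Older function-style API; newer ruamel.yaml releases use ruamel.yaml.YAML() instead.
with open('example.yaml') as f:
    result = ruamel.yaml.round_trip_load(f)
ruamel.yaml.round_trip_dump(result, sys.stdout)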
gives:
wifi_parm: &wifi_params
  ssid: 1
  key: 2
test1:
  <<: *wifi_params
  name: connectivity
test2:
  <<: *wifi_params
  name: connectivity_5ghz
You can then walk the mapping (or, recursively, the tree), find the anchored node and delete it, without knowing the key's name:
import sys
import ruamel.yaml
from ruamel.yaml.comments import merge_attrib

with open('example.yaml') as f:
    result = ruamel.yaml.round_trip_load(f)

keys_to_delete = []
for k in result:
    v = result[k]
    if v.yaml_anchor():  # this value defines an anchor
        keys_to_delete.append(k)
    for merge_data in v.merge:  # update the dict with the merge data
        v.update(merge_data[1])
    if hasattr(v, merge_attrib):  # drop the merge info so '<<' is not dumped again
        delattr(v, merge_attrib)
for k in keys_to_delete:
    del result[k]

ruamel.yaml.round_trip_dump(result, sys.stdout)
gives:
test1:
  name: connectivity
  ssid: 1
  key: 2
test2:
  name: connectivity_5ghz
  ssid: 1
  key: 2
Doing this generically and recursively (i.e. for anchors and aliases that are anywhere in the tree) is possible as well. The update would be as easy as above, but you would need to keep track of how to delete each anchored node, which doesn't have to be a mapping value; it could be a sequence item.
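The bookkeeping described above (remembering where each anchored node lives so it can be deleted later) can be sketched as follows. Note that this sketch swaps in PyYAML's event stream rather than ruamel.yaml's round-trip objects, and it makes several assumptions: a single document, plain string mapping keys, and no anchors defined inside a merge (`<<`) value. The function names `anchor_definition_paths` and `strip_anchor_definitions` are hypothetical.

```python
import yaml

EXAMPLE = """\
defaults:
  - &wifi
    ssid: 1
    key: 2
tests:
  - name: connectivity
    <<: *wifi
"""

_NODE_EVENTS = (yaml.ScalarEvent, yaml.AliasEvent,
                yaml.MappingStartEvent, yaml.SequenceStartEvent)
_NO_KEY = object()  # sentinel: the mapping is waiting for its next key


def anchor_definition_paths(text):
    """Return the path (keys and indexes from the root) of each anchored node."""
    paths = []
    stack = []  # open collections as [is_mapping, path, pending key or next index]
    for event in yaml.parse(text, Loader=yaml.SafeLoader):
        if isinstance(event, (yaml.MappingEndEvent, yaml.SequenceEndEvent)):
            stack.pop()
            continue
        if not isinstance(event, _NODE_EVENTS):
            continue  # stream and document markers
        path = ()
        if stack:
            frame = stack[-1]
            if frame[0]:  # parent is a mapping
                if frame[2] is _NO_KEY:
                    frame[2] = event.value  # assumes a scalar (string) key
                    continue
                path = frame[1] + (frame[2],)
                frame[2] = _NO_KEY
            else:         # parent is a sequence
                path = frame[1] + (frame[2],)
                frame[2] += 1
        # Aliases also carry an anchor name, but they reference, not define.
        if event.anchor and not isinstance(event, yaml.AliasEvent) and path:
            paths.append(path)
        if isinstance(event, yaml.MappingStartEvent):
            stack.append([True, path, _NO_KEY])
        elif isinstance(event, yaml.SequenceStartEvent):
            stack.append([False, path, 0])
    return paths


def strip_anchor_definitions(text):
    """Load the text, then delete every node that defined an anchor."""
    data = yaml.safe_load(text)
    # Reverse document order: children before parents, later items first,
    # so earlier sequence indexes stay valid while deleting.
    for path in reversed(anchor_definition_paths(text)):
        parent = data
        for step in path[:-1]:
            parent = parent[step]
        del parent[path[-1]]
    return data
```

Deleting in reverse document order is the design choice that handles sequence items: removing a later item never shifts the index of an earlier one.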
Solution 2:[2]
I wanted to do this today too, and instead of switching to ruamel.yaml as @Anthon suggests, I found the pyyaml-keep-anchors repository, which allowed me to continue using pyyaml. Here's the example from that repo, which worked out of the box for me.
import yaml
from yaml_keep_anchors.yaml_anchor_parser import AliasResolverYamlLoader

with open('example/example.yaml', 'r') as fh:
    data = yaml.load(fh, Loader=AliasResolverYamlLoader)

assert data['key_three'].anchor_name == 'anchor'
assert data['key_two']['sub_key'].anchor_name == 'anchor_val'
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Ani |