`pyyaml` can't parse `pydantic` object if `typing` module is used
Let me start off by saying I wanted to open an issue in the pydantic
repo. Once I started rubber duck debugging, I came to the conclusion that it's actually pyyaml
that isn't working right, but I'm not so sure anymore.

    from dataclasses import dataclass
    from functools import partial
    from typing import List, Type

    import yaml
    from pydantic import BaseModel

    yaml_input = """
    !Foo
    name: foo
    bar:
      - !Bar
        name: bar
    """


    def get_loader():
        loader = yaml.SafeLoader
        for tag_name, tag_constructor in tag_model_map.items():
            loader.add_constructor(tag_name, tag_constructor)
        return loader


    def dynamic_constructor_mapping(model_class: Type[BaseModel], loader: yaml.SafeLoader,
                                    node: yaml.nodes.MappingNode) -> BaseModel:
        return model_class(**loader.construct_mapping(node))


    def get_constructor_for_mapping(model_class: Type[BaseModel]):
        return partial(dynamic_constructor_mapping, model_class)


    class Bar(BaseModel):
        name: str


    class Foo1(BaseModel):
        name: str
        bar: list


    class Foo2(BaseModel):
        name: str
        bar: List


    class Foo3(BaseModel):
        name: str
        bar: List[Bar]


    @dataclass
    class Foo4:
        name: str
        bar: List[Bar]


    foos = [Foo1, Foo2, Foo3, Foo4]
    for foo_cls in foos:
        tag_model_map = {
            "!Foo": get_constructor_for_mapping(foo_cls),
            "!Bar": get_constructor_for_mapping(Bar),
        }
        print(f"{foo_cls.__qualname__} loaded {yaml.load(yaml_input, Loader=get_loader())}")

which prints

    Foo1 loaded name='foo' bar=[Bar(name='bar')]
    Foo2 loaded name='foo' bar=[]
    Foo3 loaded name='foo' bar=[]
    Foo4 loaded Foo4(name='foo', bar=[Bar(name='bar')])

- list of `pydantic` objects is parsed correctly if `list` is used in static typing
- list of `pydantic` objects is NOT parsed correctly if `List` is used in static typing
- list of `pydantic` objects is NOT parsed correctly if `List[Bar]` is used in static typing
- list of `dataclass` objects is always parsed correctly
The constructor seems to be returning the correct object in all examples so I don't understand where the problem lies.
pydantic==1.8.2
Python 3.8.10
Solution 1:[1]
So this is just a problem I've noticed with YAML in general, but it seems to me that code for de/serializing YAML to dataclasses is overall more complicated than needed.
If you don't need the data validation features that `pydantic` provides, you could also check out the `dataclass-wizard` library, which provides a helper `YAMLWizard` Mixin class that could be used for working with YAML data. Note that this does rely on the `pyyaml` library as well.
Here is a simple example:

    from __future__ import annotations

    from dataclasses import dataclass

    from dataclass_wizard import YAMLWizard

    yaml_input = """
    name: foo
    bar:
      - name: bar
    """


    @dataclass
    class Foo(YAMLWizard):
        name: str
        bar: list[Bar]


    @dataclass
    class Bar:
        name: str


    instance = Foo.from_yaml(yaml_input)
    print(f'Loaded: {instance}')

To install `dataclass-wizard` along with `pyyaml`, you can include the `yaml` extra:

    pip install dataclass-wizard[yaml]

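If you would rather stay with `pydantic` and `pyyaml`, the output in the question is consistent with how PyYAML builds nested collections. With the default `deep=False`, `construct_mapping` can hand back a list that is still empty and only gets filled in once loading continues, so anything that copies the list during construction keeps the empty copy. Below is a minimal sketch of that effect and of passing `deep=True` as a possible workaround. It is not taken from the question: `CopyingFoo` is a hypothetical stand-in for a model that copies its `List` field during validation (an assumption about what pydantic does here), and `construct` and `load_with` are illustrative helper names.

```python
from functools import partial

import yaml

YAML_INPUT = """
!Foo
name: foo
bar:
  - !Bar
    name: bar
"""


class Bar:
    def __init__(self, name):
        self.name = name


class CopyingFoo:
    """Hypothetical stand-in for a validating model: it copies `bar` on init."""

    def __init__(self, name, bar):
        self.name = name
        self.bar = list(bar)  # snapshot of the list as it is at construction time


def construct(model_class, deep, loader, node):
    # deep=True asks PyYAML to finish building nested nodes before returning.
    return model_class(**loader.construct_mapping(node, deep=deep))


def load_with(deep):
    # Subclass so the tags are not registered on the shared yaml.SafeLoader class.
    class Loader(yaml.SafeLoader):
        pass

    Loader.add_constructor("!Foo", partial(construct, CopyingFoo, deep))
    Loader.add_constructor("!Bar", partial(construct, Bar, deep))
    return yaml.load(YAML_INPUT, Loader=Loader)


lazy_result = load_with(deep=False)  # list was copied while still empty
deep_result = load_with(deep=True)   # list was complete before the copy
print(len(lazy_result.bar), len(deep_result.bar))
```

If the assumption holds, changing the question's constructor to call `loader.construct_mapping(node, deep=True)` should make `Foo2` and `Foo3` load the nested `Bar` as well. Subclassing `yaml.SafeLoader` is optional, but it keeps the custom tags from leaking into every other use of the safe loader.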
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |