`pyyaml` can't parse `pydantic` object if `typing` module is used

Let me start by saying that I originally wanted to open an issue in the pydantic repo. Once I started rubber-duck debugging, I came to the conclusion that it's actually pyyaml that isn't working right, but I'm not so sure anymore.

from dataclasses import dataclass
from functools import partial
from typing import List, Type

import yaml
from pydantic import BaseModel

yaml_input = """
!Foo
name: foo
bar:
    - !Bar
      name: bar
  """


def get_loader():
    loader = yaml.SafeLoader
    for tag_name, tag_constructor in tag_model_map.items():
        loader.add_constructor(tag_name, tag_constructor)
    return loader


def dynamic_constructor_mapping(model_class: Type[BaseModel], loader: yaml.SafeLoader,
                        node: yaml.nodes.MappingNode) -> BaseModel:
    return model_class(**loader.construct_mapping(node))


def get_constructor_for_mapping(model_class: Type[BaseModel]):
    return partial(dynamic_constructor_mapping, model_class)


class Bar(BaseModel):
    name: str


class Foo1(BaseModel):
    name: str
    bar: list


class Foo2(BaseModel):
    name: str
    bar: List


class Foo3(BaseModel):
    name: str
    bar: List[Bar]


@dataclass
class Foo4:
    name: str
    bar: List[Bar]


foos = [Foo1, Foo2, Foo3, Foo4]

for foo_cls in foos:
    tag_model_map = {
        "!Foo": get_constructor_for_mapping(foo_cls),
        "!Bar": get_constructor_for_mapping(Bar),
    }
    print(f"{foo_cls.__qualname__} loaded {yaml.load(yaml_input, Loader=get_loader())}")

This prints:

Foo1 loaded name='foo' bar=[Bar(name='bar')]
Foo2 loaded name='foo' bar=[]
Foo3 loaded name='foo' bar=[]
Foo4 loaded Foo4(name='foo', bar=[Bar(name='bar')])

In other words:

  • the list of pydantic objects is parsed correctly if `list` is used as the annotation
  • the list of pydantic objects is NOT parsed correctly if `List` is used as the annotation
  • the list of pydantic objects is NOT parsed correctly if `List[Bar]` is used as the annotation
  • the list is always parsed correctly when the container is a dataclass

The constructor seems to return the correct object in all four cases, so I don't understand where the problem lies.

pydantic==1.8.2
Python 3.8.10 


Solution 1:[1]

This is a problem I've noticed with YAML in general: code for de/serializing YAML to dataclasses tends to be more complicated than it needs to be.

If you don't need the data validation features that pydantic provides, you could also check out dataclass-wizard, which provides a YAMLWizard mixin class for working with YAML data. Note that it relies on the pyyaml library as well.

Here is a simple example:

from __future__ import annotations

from dataclasses import dataclass
from dataclass_wizard import YAMLWizard


yaml_input = """
name: foo
bar:
    - name: bar
"""


@dataclass
class Foo(YAMLWizard):
    name: str
    bar: list[Bar]


@dataclass
class Bar:
    name: str


instance = Foo.from_yaml(yaml_input)
print(f'Loaded: {instance}')

To install dataclass-wizard along with pyyaml, you can include the yaml extra:

pip install dataclass-wizard[yaml]
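
If you would rather keep the tagged-YAML approach from the question, one likely explanation for the empty lists is that pyyaml builds nested sequences lazily. By default, construct_mapping(node) is shallow: a nested list is handed to your constructor while still empty and is only filled in afterwards. A model that copies the list during validation (as pydantic appears to do for `List` and `List[Bar]`) keeps the empty copy, while one that stores the same list object sees it filled later. The sketch below reproduces this with pyyaml alone. The plain `Foo` and `Bar` classes and the `make_loader` helper are hypothetical stand-ins for illustration, not the classes from the question.

```python
import yaml

yaml_input = """
!Foo
name: foo
bar:
    - !Bar
      name: bar
"""


class Bar:
    def __init__(self, name):
        self.name = name


class Foo:
    def __init__(self, name, bar):
        self.name = name
        # Copy the list at construction time, standing in for a model
        # that builds a new list while validating its input.
        self.bar = list(bar)


def make_loader(deep):
    # Subclass SafeLoader so the constructors are not registered globally.
    class Loader(yaml.SafeLoader):
        pass

    def constructor_for(cls):
        def constructor(loader, node):
            # deep=True makes pyyaml finish building nested nodes
            # before the mapping is returned.
            return cls(**loader.construct_mapping(node, deep=deep))
        return constructor

    Loader.add_constructor("!Foo", constructor_for(Foo))
    Loader.add_constructor("!Bar", constructor_for(Bar))
    return Loader


shallow = yaml.load(yaml_input, Loader=make_loader(deep=False))
deep = yaml.load(yaml_input, Loader=make_loader(deep=True))
print(len(shallow.bar), len(deep.bar))
```

If this is indeed the cause, passing deep=True to construct_mapping in dynamic_constructor_mapping should be enough to make Foo2 and Foo3 load their `bar` entries.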

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
