Given a Cerberus JSON Schema file, how can I generate a dictionary object?

Problem

I have a Cerberus JSON schema file and need to generate JSON files from it, filling each field with a default value chosen by its data type. The _type key in each dictionary object names the data type to check against.

A sample Cerberus JSON schema:

{
    "schema": {
        "vars": {
            "forest_config": {
                "_type": "list",
                "_required": false,
                "_not_empty": false,
                "_item_schema": {
                    "_type": "dict",
                    "always_create_dl_here": {
                        "_required": false,
                        "_type": "bool"
                    },
                    "ldap_domain": {
                        "_type": "string",
                        "_required": false
                    },
                    "dc_dn": {
                        "_type": "string",
                        "_required": true,
                        "_not_empty": true,
                        "_regex": "^(DC|O)=[\\w\\-\\. &]+(,(DC|O)=[\\w\\-\\. &]+)*$"
                    },
                    "user_ou": {
                        "_type": "string",
                        "_required": true,
                        "_not_empty": false,
                        "_regex": "^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$"
                    },
                    "group_ou": {
                        "_type": "string",
                        "_required": true,
                        "_not_empty": false,
                        "_regex": "^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$"
                    },
                    "dl_group_ou": {
                        "_type": "string",
                        "_required": true,
                        "_not_empty": false,
                        "_regex": "^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$"
                    }
                }
            }
        }
    }
}

I tried writing my own recursive function, but I am stuck:

from typing import Dict


def get_default_value(data_type: str):
    if data_type == 'string':
        return ''
    elif data_type == 'dict':
        return {}
    elif data_type == 'bool':
        return False
    elif data_type == 'list':
        return []
    elif data_type == 'int':
        return 0
    elif data_type == 'enum':  # this needs to be handled in the calling function
        return ''


new_obj = {}


def recursively_generate_schema_obj(dict_obj: Dict):
    if isinstance(dict_obj, dict):
        var_name = None
        for key, value in dict_obj.items():
            if not str(key).startswith("_"):
                var_name = key

            if var_name and '_type' not in dict_obj:
                new_obj[var_name] = {}
            elif var_name and '_type' in dict_obj and dict_obj['_type'] == 'list':
                new_obj[var_name] = [recursively_generate_schema_obj(dict_obj)]
            else:
                new_obj[var_name] = get_default_value(dict_obj['_type'])

            recursively_generate_schema_obj(value)


recursively_generate_schema_obj(schema_data)

Current output of my code:

{'schema': {}, 'vars': {}, 'forest_config': {}, None: '', 'always_create_dl_here': {}, 'ldap_domain': {}, 'dc_dn': {}, 'user_ou': {}, 'group_ou': {}, 'dl_group_ou': {}}

This is definitely wrong. It should look like this:

{
    "schema": {
        "vars": {
            "forest_config": [
                {
                    "always_create_dl_here": false,
                    "ldap_domain": "",
                    "dc_dn": "",
                    "user_ou": "",
                    "group_ou": "",
                    "dl_group_ou": ""
                }
            ]
        }
    }
}

Ask

Can someone help me figure this out and explain some possible better approaches?



Solution 1:[1]

You were pretty close to something that works. I modified the function to take a var_name and a json_obj along with the var_schema. It adds var_name to the supplied json_obj, using the _type from var_schema to pick the default value. If var_schema defines further variables, the function calls itself for each one, passing that variable's schema details and the object the variable should be added to.

I'm not familiar with the Cerberus JSON schema you mention, so you'll probably want to add some extra code to the get_default_value function. Hopefully the details for your enum type are in the schema.
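As a rough idea of what that extra handling might look like, here is a minimal sketch that swaps the if/elif chain for a lookup table and adds an enum branch. It rests on assumptions that are not taken from your schema: the key name _allowed (a hypothetical place where an enum's permitted values might be listed) and the names DEFAULT_FACTORIES and default_for are made up for illustration.

```python
from typing import Any, Callable, Dict

# Factories rather than literal values, so every call returns a fresh
# list/dict instead of sharing one mutable object between fields.
DEFAULT_FACTORIES: Dict[str, Callable[[], Any]] = {
    "string": str,   # str()  -> ''
    "dict": dict,    # dict() -> {}
    "bool": bool,    # bool() -> False
    "list": list,    # list() -> []
    "int": int,      # int()  -> 0
}


def default_for(var_schema: dict) -> Any:
    """Return a default value for one schema node (illustrative helper)."""
    data_type = var_schema.get("_type", "dict")
    if data_type == "enum":
        # ASSUMPTION: permitted values live under a hypothetical "_allowed" key.
        # Fall back to an empty string when the key is missing or empty.
        allowed = var_schema.get("_allowed") or [""]
        return allowed[0]
    try:
        return DEFAULT_FACTORIES[data_type]()
    except KeyError:
        # Fail loudly instead of silently returning None for unknown types.
        raise ValueError(f"unsupported _type: {data_type!r}") from None


print(default_for({"_type": "bool"}))                          # False
print(default_for({"_type": "enum", "_allowed": ["a", "b"]}))  # a
```

Raising on an unknown type is a design choice: a chain of if/elif branches with no final else returns None for anything it does not recognise, which can hide typos in the schema.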

def get_default_value(var_schema: dict):
    data_type = var_schema.get('_type', 'dict')
    if data_type == 'string':
        return ''
        # return var_schema.get('_regex', '')  # use this if you want to get the regex as the default value
    elif data_type == 'dict':
        return {}
    elif data_type == 'bool':
        return False
    elif data_type == 'list':
        return []
    elif data_type == 'int':
        return 0
    elif data_type == 'enum':  # this needs extra handling?
        return ''


def recursively_generate_schema_obj(var_schema: dict, var_name=None, json_obj=None):
    """ Add the supplied var_name to the supplied json_obj. The type of variable added is determined by the '_type'
    field in the supplied var_schema.

    Once the variable has been added, find variables in the supplied schema and recursively call this function to
    add any found.

    :param var_name: Name of variable to add to supplied json_obj
    :param var_schema: Schema with details of variables to add below this variable
    :param json_obj: The json object to add the variable to. This can be a dict or a list. If it's a list, then the
    supplied var_name is not used and an item of type defined in the var_schema is simply appended to the list
    :return: The json_obj
    """
    if isinstance(var_schema, dict):

        next_level = get_default_value(var_schema)
        if json_obj is None:
            json_obj = next_level
        elif isinstance(json_obj, list):
            json_obj.append(next_level)
        else:
            json_obj[var_name] = next_level

        # Find each sub field in the schema and add it to the next level
        # These are keys without an '_' or, for the items to add to a list the '_item_schema'.
        for key, value in var_schema.items():
            if not str(key).startswith("_") or key == '_item_schema':
                recursively_generate_schema_obj(value, key, next_level)

    return json_obj


schema_data = {
    'schema': {
        'vars': {
            'forest_config': {
                '_type': 'list',
                '_required': False,
                '_not_empty': False,
                '_item_schema': {
                    '_type': 'dict',
                    'always_create_dl_here': {'_required': False, '_type': 'bool'},
                    'ldap_domain': {'_type': 'string', '_required': False},
                    'dc_dn': {'_type': 'string', '_required': True, '_not_empty': True,
                              '_regex': '^(DC|O)=[\\w\\-\\. &]+(,(DC|O)=[\\w\\-\\. &]+)*$'},
                    'user_ou': {'_type': 'string','_required': True,'_not_empty': False,
                                '_regex': '^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$'},
                    'group_ou': {'_type': 'string', '_required': True, '_not_empty': False,
                                 '_regex': '^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$'},
                    'dl_group_ou': {'_type': 'string', '_required': True, '_not_empty': False,
                                    '_regex': '^((CN|OU)=[\\w\\-\\. &]+(,(CN|OU)=[\\w\\-\\. &]+)*)?$'}
                }
            }
        }
    }
}

new_obj = recursively_generate_schema_obj(schema_data)
print(new_obj)

The above code produces the following (reformatted here for readability; print emits it on a single line):

{
    'schema': {
        'vars': {
            'forest_config': [{
                'always_create_dl_here': False, 
                'ldap_domain': '', 
                'dc_dn': '', 
                'user_ou': '', 
                'group_ou': '', 
                'dl_group_ou': ''
            }]
        }
    }
}
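Since the goal in the question is to end up with JSON files, the generated dictionary can be written out with the standard json module. This is only a sketch: the helper name write_json, the file name forest_defaults.json and the shortened stand-in dictionary are assumptions for illustration, and a temporary directory is used so the example leaves nothing behind in your project.

```python
import json
import tempfile
from pathlib import Path


def write_json(obj: dict, path: Path) -> None:
    """Serialize obj to path as indented JSON (illustrative helper)."""
    with path.open("w", encoding="utf-8") as handle:
        json.dump(obj, handle, indent=4)
        handle.write("\n")


# Shortened stand-in for the dictionary generated from the schema.
generated = {
    "schema": {
        "vars": {
            "forest_config": [
                {"always_create_dl_here": False, "dc_dn": ""}
            ]
        }
    }
}

# Hypothetical output location inside a throwaway temporary directory.
out_path = Path(tempfile.mkdtemp()) / "forest_defaults.json"
write_json(generated, out_path)
print(out_path.read_text(encoding="utf-8"))
```

json.dump converts Python's False to JSON's false, so the file matches the lowercase form shown in the expected output in the question.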

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 pcoates