'Convert two types into a single type with Serde

I'm writing for a program that hooks into a web service which sends back JSON.

When a certain property isn't there it provides a empty object, with all its fields as empty strings instead of excluding the value. When the property exists, some of the properties are u64. How can I have it so Serde handles this case?

Rust Structs

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    x: u64,
    y: u64,
    name: String,
}

Example JSON

{
    "foo":[
        {
            "points":{
                "x":"",
                "y":"",
                "name":""
            }
        },
        {
            "points":{
                "x":78,
                "y":92,
                "name":"bar"
            }
        }
    ]
}


Solution 1:[1]

Serde supports an interesting selection of attributes that can be used to customize the serialization or deserialization for a type while still using the derived implementation for the most part.

In your case, you need to be able to decode a field that can be specified as one of multiple types, and you don't need information from other fields to decide how to decode the problematic fields. The #[serde(deserialize_with="$path")] annotation is well suited to solve your problem.

We need to define a function that will decode either an empty string or an integer value into an u64. We can use the same function for both fields, since we need the same behavior. This function will use a custom Visitor to be able to handle both strings and integers. It's a bit long, but it makes you appreciate all the work that Serde is doing for you!

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;

use serde::Deserializer;
use serde::de::{self, Unexpected};
use std::fmt;

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "deserialize_u64_or_empty_string")]
    x: u64,
    #[serde(deserialize_with = "deserialize_u64_or_empty_string")]
    y: u64,
    name: String,
}

struct DeserializeU64OrEmptyStringVisitor;

impl<'de> de::Visitor<'de> for DeserializeU64OrEmptyStringVisitor {
    type Value = u64;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("an integer or a string")
    }

    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v)
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v == "" {
            Ok(0)
        } else {
            Err(E::invalid_value(Unexpected::Str(v), &self))
        }
    }
}

fn deserialize_u64_or_empty_string<'de, D>(deserializer: D) -> Result<u64, D::Error>
where
    D: Deserializer<'de>,
{
    deserializer.deserialize_any(DeserializeU64OrEmptyStringVisitor)
}

fn main() {
    let value = serde_json::from_str::<WebResponse>(
        r#"{
        "foo": [
            {
                "points": {
                    "x": "",
                    "y": "",
                    "name": ""
                }
            },
            {
                "points": {
                    "x": 78,
                    "y": 92,
                    "name": "bar"
                }
            }
        ]
    }"#,
    );
    println!("{:?}", value);
}

Cargo.toml:

[dependencies]
serde = "1.0.15"
serde_json = "1.0.4"
serde_derive = "1.0.15"

Solution 2:[2]

In str_or_u64, we use an untagged enum to represent either a string or a number. We can then deserialize the field into that enum and convert it to a number.

We annotate the two fields in Points using deserialize_with to tell it to use our special conversion:

use serde::{Deserialize, Deserializer}; // 1.0.124
use serde_json; // 1.0.64

#[derive(Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "str_or_u64")]
    x: u64,
    #[serde(deserialize_with = "str_or_u64")]
    y: u64,
    name: String,
}

fn str_or_u64<'de, D>(deserializer: D) -> Result<u64, D::Error>
where
    D: Deserializer<'de>,
{
    #[derive(Deserialize)]
    #[serde(untagged)]
    enum StrOrU64<'a> {
        Str(&'a str),
        U64(u64),
    }

    Ok(match StrOrU64::deserialize(deserializer)? {
        StrOrU64::Str(v) => v.parse().unwrap_or(0), // Ignoring parsing errors
        StrOrU64::U64(v) => v,
    })
}

fn main() {
    let input = r#"{
        "foo":[
            {
                "points":{
                    "x":"",
                    "y":"",
                    "name":""
                }
            },
            {
                "points":{
                    "x":78,
                    "y":92,
                    "name":"bar"
                }
            }
        ]
    }"#;

    dbg!(serde_json::from_str::<WebResponse>(input));
}

See also:

Solution 3:[3]

Just adding a note for future viewers: in case it is helpful, I have implemented the solution from the accepted answer and published it as a crate serde-this-or-that.

I've added a section on Performance to explain that an approach with a custom Visitor as suggested, should perform overall much better than a version with an untagged enum, which does also work.

Here is a shortened implementation of the accepted solution above (should have the same result):

use serde::Deserialize;
use serde_json::from_str;
use serde_this_or_that::as_u64;

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "as_u64")]
    x: u64,
    #[serde(deserialize_with = "as_u64")]
    y: u64,
    name: String,
}

fn main() {
    let value = from_str::<WebResponse>(
        r#"{
        "foo": [
            {
                "points": {
                    "x": "",
                    "y": "",
                    "name": ""
                }
            },
            {
                "points": {
                    "x": 78,
                    "y": 92,
                    "name": "bar"
                }
            }
        ]
    }"#,
    );
    println!("{:?}", value);
}

The Cargo.toml would look like:

[dependencies]
serde = { version = "1.0.136", features = ["derive"] }
serde_json = "1.0.79"
serde-this-or-that = "0.4"

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3