Are Elasticsearch object properties really just flat properties with a namespace?

The Elasticsearch docs (https://www.elastic.co/guide/en/elasticsearch/reference/current/object.html) state that, internally, object properties are essentially just flat properties with a namespace. However, when I index this document:

POST storage-index/_doc
{
  "person": {
    "lastName":"Miller" 
  },
  "person.lastName":"Smith"
}

The index contains this:

    "_source" : {
      "person" : {
        "lastName" : "Miller"
      },
      "person.lastName" : "Smith"
    }

It gets even weirder when I query the index. Both of the following searches return the document:

Object property:

POST /storage-index/_search
{
  "query": {
    "query_string": {
      "query": "person.lastName:Miller"
    }
  }
}

Flat property:

POST /storage-index/_search
{
  "query": {
    "query_string": {
      "query": "person.lastName:Smith"
    }
  }
}

What am I missing?



Solution 1:[1]

The key to this question is that Elasticsearch can store multiple values (an array) in any field. In your example, both "Miller" and "Smith" end up in the same field, person.lastName, so that field effectively holds an array.

Here is another simple example.

Let's create an index with dynamic mapping by indexing a document:

PUT my-index-000001/_doc/1
{ 
  "region": "US",
  "manager": { 
    "age":     30,
    "name": { 
      "first": "John",
      "last":  "Smith"
    }
  }
}

Then look at the mapping of the index with GET my-index-000001/_mapping:

{
  "my-index-000001" : {
    "mappings" : {
      "properties" : {
        "manager" : {
          "properties" : {
            "age" : {
              "type" : "long"
            },
            "name" : {
              "properties" : {
                "first" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                },
                "last" : {
                  "type" : "text",
                  "fields" : {
                    "keyword" : {
                      "type" : "keyword",
                      "ignore_above" : 256
                    }
                  }
                }
              }
            }
          }
        },
        "region" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}

manager.name.first and manager.name.last are mapped in exactly the same way: each is a text field with a keyword sub-field.

Let's add another document with the format in your question:

PUT my-index-000001/_doc/2
{ 
  "region": "US",
  "manager": { 
    "age":     30,
    "name": { 
      "first": "Lucy",
      "last":  "James"
    },
    "name.first": "Kate"
  }
}

So what does the mapping of the index look like now? Does it gain an additional manager.name.first field? No, the mapping doesn't change. The existing manager.name.first field simply holds two values for this document instead of one.

The document above stores two first names under the same field, so for indexing purposes it is equivalent to the next document:

PUT my-index-000001/_doc/3
{ 
  "region": "US",
  "manager": { 
    "age":     30,
    "name": [
      { 
        "first": "Lucy",
        "last":  "James"
      },
      { 
        "first": "Kate"
      }
    ]
  }
}

The two source formats look different, but the indexed form is the same for both:

{
  "region" :        "US",
  "manager.age":    30,
  "manager.name.first" : [ "Lucy", "Kate" ],
  "manager.name.last" :  "James"
}
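
As a rough illustration of the flattening described above, here is a minimal Python sketch. The flatten function is a hypothetical helper written for this answer, not Elasticsearch code. It only assumes that nested objects and dotted keys end up under the same full path, and that arrays contribute all of their values to that path.

```python
from collections import defaultdict


def flatten(doc):
    """Collect every leaf value of a JSON-like document under its full dotted path.

    Hypothetical helper for illustration only; it mimics the idea of
    flattening, not Elasticsearch's actual implementation.
    """
    fields = defaultdict(list)

    def walk(node, path):
        if isinstance(node, dict):
            # Descend into objects, extending the path with each key.
            for key, value in node.items():
                walk(value, f"{path}.{key}" if path else key)
        elif isinstance(node, list):
            # Arrays do not add a path segment; every item shares the path.
            for item in node:
                walk(item, path)
        else:
            # Leaf value: append it to whatever is already stored at this path.
            fields[path].append(node)

    walk(doc, "")
    return dict(fields)


# Document 2: nested object plus a dotted key inside "manager".
doc2 = {
    "region": "US",
    "manager": {
        "age": 30,
        "name": {"first": "Lucy", "last": "James"},
        "name.first": "Kate",
    },
}

# Document 3: the same data written as an array of objects.
doc3 = {
    "region": "US",
    "manager": {
        "age": 30,
        "name": [{"first": "Lucy", "last": "James"}, {"first": "Kate"}],
    },
}

print(flatten(doc2)["manager.name.first"])  # ['Lucy', 'Kate']
print(flatten(doc2) == flatten(doc3))       # True
```

In this sketch both documents flatten to the same set of paths and values, which is why a query on manager.name.first cannot tell them apart.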

Query with:

GET my-index-000001/_search
{
  "query": {
    "match": {
      "manager.name.first": "kate"
    }
  }
}

Both documents (2 and 3) match:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 0.43445712,
    "hits" : [
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 0.43445712,
        "_source" : {
          "region" : "US",
          "manager" : {
            "age" : 30,
            "name" : {
              "first" : "Lucy",
              "last" : "James"
            },
            "name.first" : "Kate"
          }
        }
      },
      {
        "_index" : "my-index-000001",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 0.43445712,
        "_source" : {
          "region" : "US",
          "manager" : {
            "age" : 30,
            "name" : [
              {
                "first" : "Lucy",
                "last" : "James"
              },
              {
                "first" : "Kate"
              }
            ]
          }
        }
      }
    ]
  }
}

Each hit's _source is simply returned in the format in which it was originally submitted; only the indexed form is flattened.
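
The split between what is searched and what is returned can be modeled with a small toy index in Python. Everything here (ToyIndex, leaf_values, the lowercase-and-split tokenizer) is a hypothetical stand-in chosen for illustration, under the assumption that matching runs against flattened field paths while the original JSON is kept aside untouched. It is not how Elasticsearch is implemented.

```python
def leaf_values(node, path=""):
    """Yield (dotted_path, value) pairs for every leaf of a JSON-like document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from leaf_values(value, f"{path}.{key}" if path else key)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_values(item, path)
    else:
        yield path, node


class ToyIndex:
    """Toy model: postings are built from flattened paths, sources are kept verbatim."""

    def __init__(self):
        self.sources = {}   # doc id -> original JSON, never modified
        self.postings = {}  # (field path, token) -> set of doc ids

    def index(self, doc_id, source):
        self.sources[doc_id] = source
        for field, value in leaf_values(source):
            # Crude stand-in for an analyzer: lowercase and split on whitespace.
            for token in str(value).lower().split():
                self.postings.setdefault((field, token), set()).add(doc_id)

    def match(self, field, text):
        ids = set()
        for token in text.lower().split():
            ids |= self.postings.get((field, token), set())
        # Hits carry the stored source, not the flattened form.
        return [{"_id": i, "_source": self.sources[i]} for i in sorted(ids)]


idx = ToyIndex()
idx.index("2", {"region": "US", "manager": {
    "age": 30,
    "name": {"first": "Lucy", "last": "James"},
    "name.first": "Kate",
}})
idx.index("3", {"region": "US", "manager": {
    "age": 30,
    "name": [{"first": "Lucy", "last": "James"}, {"first": "Kate"}],
}})

hits = idx.match("manager.name.first", "kate")
print([hit["_id"] for hit in hits])  # ['2', '3']
```

Both toy hits come back with their own original structure: document 2 still has the "name.first" key, and document 3 still has the array.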

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1