Bulk insert documents, but if a document exists, only update the provided fields

I have an index which contains data as follows:

{
  "some_field": string, -- exists in my database
  "some_other_field": string, -- exists in my database
  "another_field": string -- does NOT exist in my database
}

I have a script which grabs data from a database and performs a bulk insert. However, only some of the fields come from the database, as marked above.

If a document already exists, I still want to update the fields that come from the database, but without overwriting/deleting the field that does not come from the database.

I am using the bulk API to do this; however, I lose all data relating to another_field when running the script. Looking at the bulk docs, I can't find any option to simply update an existing doc.

I am unable to share the script, but I hope this is enough information to shed some light on possible solutions.



Solution 1:[1]

TL;DR

Yes, it is possible: use the index action. As the docs explain:

(Optional, string) Indexes the specified document. If the document exists, replaces the document and increments the version. The following line must contain the source data to be indexed.

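But make sure to provide the _id of the document when you want to update it. Keep in mind that index replaces the whole document, so any field missing from the payload (another_field in the question) is lost.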
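For the partial update the question asks for, the bulk API also offers an update action, which merges a partial doc into the stored document and, with doc_as_upsert, creates the document when it does not exist yet. Below is a minimal Python sketch that only builds the request body; build_bulk_update_body, the rows list, and the "id" key column are hypothetical names used for illustration, not part of the original script.

```python
import json


def build_bulk_update_body(index_name, rows, id_field="id"):
    """Build an NDJSON bulk body with one partial update (or insert) per row.

    `rows` and `id_field` are assumptions: each row is a dict fetched from the
    database, and `id_field` names the column used as the document _id.
    """
    lines = []
    for row in rows:
        # Send only the fields that come from the database, minus the key column.
        fields = {k: v for k, v in row.items() if k != id_field}
        # Action line: "update" merges into the stored document instead of replacing it.
        lines.append(json.dumps({"update": {"_index": index_name, "_id": str(row[id_field])}}))
        # Payload line: "doc" is the partial document; doc_as_upsert inserts it if missing.
        lines.append(json.dumps({"doc": fields, "doc_as_upsert": True}))
    # The bulk API expects newline-delimited JSON, with a newline after every line.
    return "".join(line + "\n" for line in lines)


rows = [{"id": 1, "some_field": "data", "some_other_field": "data"}]
body = build_bulk_update_body("71177773", rows)
print(body)
```

The resulting string can be sent as the body of POST /_bulk with the Content-Type: application/x-ndjson header. Since only some_field and some_other_field appear under doc, another_field on an existing document is left untouched.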

To understand

I created a toy project to reproduce and understand the behaviour:

# post a single document

POST /71177773/_doc
{
  "some_field": "data",
  "some_other_field": "data"
}

GET /71177773/_search

# Try to "update" without providing an _id
POST /_bulk
{"index":{"_index":"71177773"}}
{"some_field":"data","some_other_field":"data","another_field":"data"}

# 2 Documents exist now
GET /71177773/_search 

# Try the same command, but provide the _id of the first document
POST /_bulk
{"index":{"_index":"71177773", "_id": "<Id of the document>"}}
{"some_field":"data","some_other_field":"data","another_field":"data"}

# It seems it worked
GET /71177773/_search 

If your question was:

Is Elasticsearch smart enough to recognise that I want to update an existing document without providing the _id?

I am afraid it is not possible.
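Since the _id has to come from the client, one common workaround is to derive it deterministically from whatever identifies the row in the database, so that every run of the script targets the same document. A minimal sketch follows, assuming the row has stable key columns; make_doc_id, customer_id, and region are hypothetical names, not taken from the question.

```python
import hashlib


def make_doc_id(row, key_fields):
    """Derive a stable _id from the columns that identify a row.

    `key_fields` is an assumption: the names of the columns that form the
    row's primary key in the source database.
    """
    # Join with a separator unlikely to occur in values, so ("ab", "c") and
    # ("a", "bc") do not produce the same string.
    raw = "\x1f".join(str(row[name]) for name in key_fields)
    # Hash to get a fixed-length id regardless of how long the key values are.
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


row = {"customer_id": 42, "region": "eu", "some_field": "data"}
first = make_doc_id(row, ["customer_id", "region"])

# Changing a non-key field does not change the id, so the same document is hit.
changed = dict(row, some_field="new data")
second = make_doc_id(changed, ["customer_id", "region"])
print(first == second)
```

If the table already has a single primary key column, using its value directly as the _id is simpler and works just as well.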

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Paulo