MongoDB insertMany and skip duplicates
I'm trying to insertMany() items into my Mongo database, but I would like to skip duplicate IDs.
I'm using Node.js and mongodb.
I have some data:
const myExampleData = [
  {_id:'someId1', name:'I am example'},
  {_id:'someId2', name:'I am second example'}
];
and I would like to insert them like this:
dbo.collection(collectionName).insertMany(myExampleData).catch(err=>{
  console.error(err);
});
Let's suppose that someId1 already exists. I don't want to overwrite it; I just want to skip it. As it stands, someId2 is not inserted: the operation stops as soon as the duplicate key exception is thrown.
Is there a way to insertMany and skip duplicates?
Possible duplicate question.
I found the thread "MongoDB insert without duplicates", which suggests using update() with upsert instead of insert(). That could be fine for one item, but what about many items? As far as I know, updateMany() would update all matching documents with the same value, whereas I would like to insert different values.
Solution 1:[1]
Looking at your statement:
Let's suppose that someId1 already exists. I don't want to overwrite it. I just want to skip it.
Since you only want to skip duplicate documents rather than update them with the latest data, there is no need to use .update(). You can still do this with .insertMany() by passing the ordered flag in the options:
ordered: Optional. A boolean specifying whether the mongod instance should perform an ordered or unordered insert. Defaults to true.
db.collection.insertMany(
  [ <document 1> , <document 2>, ... ],
  {
    ordered: <boolean>
  }
)
Your code :
dbo.collection(collectionName).insertMany(myExampleData, { ordered: false }).catch(err=>{
  // Duplicates still surface here as duplicate key errors; the other documents are inserted.
  console.error(err);
})
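Since you're checking against _id, which has a default unique index, any incoming duplicate raises a duplicate key error. With ordered: false the insert becomes unordered: the server attempts every document instead of stopping at the first failure, so the non-duplicate documents are still inserted. Note that the duplicates are still reported: the returned promise rejects with a bulk write error that lists a duplicate key error (code 11000) for each skipped document, which you can catch and ignore.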
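A minimal sketch of how the rejection from an unordered insert could be handled, so that duplicates are skipped quietly while any other failure still surfaces. The helper names onlyDuplicateKeyErrors and insertManySkippingDuplicates are made up for illustration, and the sketch assumes the driver's bulk write error exposes a writeErrors list whose entries carry a numeric code (the shape used by recent versions of the Node.js driver; check it against the version you run):

```javascript
const DUPLICATE_KEY = 11000; // MongoDB server error code for a duplicate key

// True only when the error carries write errors and every one of them is a
// duplicate key error. Assumption: err.writeErrors is a single entry or an array.
function onlyDuplicateKeyErrors(err) {
  const writeErrors = [].concat(err.writeErrors || []);
  return writeErrors.length > 0 &&
    writeErrors.every(e => e.code === DUPLICATE_KEY);
}

// `collection` is assumed to be a driver collection,
// e.g. dbo.collection(collectionName).
async function insertManySkippingDuplicates(collection, docs) {
  try {
    const result = await collection.insertMany(docs, { ordered: false });
    return { inserted: result.insertedCount, skipped: 0 };
  } catch (err) {
    // Anything other than "some documents were duplicates" is a real failure.
    if (!onlyDuplicateKeyErrors(err)) throw err;
    const skipped = [].concat(err.writeErrors).length;
    // In an unordered insert every document is attempted, so the rest went in.
    return { inserted: docs.length - skipped, skipped };
  }
}
```

Called as insertManySkippingDuplicates(dbo.collection(collectionName), myExampleData), this would resolve with the two counts instead of rejecting when someId1 already exists.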
Solution 2:[2]
You can use bulkWrite to perform the operation and get a similar result. For example, in the mongo shell:
db.collectionName.bulkWrite(
  [
    { updateOne :
      {
        "filter" : { _id:'someId1' },
        "update" : { $set : { name:'I am example' } },
        "upsert" : true
      }
    },
    { updateOne :
      {
        "filter" : { _id:'someId2' },
        "update" : { $set : { name:'I am second example' } },
        "upsert" : true
      }
    },
  ]
);
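Running the above inserts the two documents the first time it runs and gives no errors on subsequent runs. Note that because it uses $set, a later run would overwrite the name field of an existing document if the incoming value differed.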
Solution 3:[3]
Building on Yahya's answer: if you want to insert only non-duplicates and completely skip updating existing documents, use $setOnInsert:
db.collectionName.bulkWrite(
  [
    { updateOne :
      {
        "filter" : { name: 'I am example' }, // the 'deduplication' filter, can be any property, not just _id
        "update" : { $setOnInsert : { name:'I am example', something: 'else' } },
        "upsert" : true
      }
    },
    { updateOne :
      {
        "filter" : { name: 'I am second example' }, // each document is matched on its own value
        "update" : { $setOnInsert : { name:'I am second example', more: 'properties' } },
        "upsert" : true
      }
    },
  ]
);
With the above code, both documents are inserted on the first run. On later runs nothing is inserted or modified, because each filter matches a document that already exists. This also works for duplicates passed within the same bulk operation, as long as they match on the 'deduplication' filter criteria.
With $setOnInsert, the upsert does not modify an existing document on later runs.
In essence you achieve: "insert many in one query, create the documents that do not yet exist according to the match (filter) criteria, and don't update existing ones."
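To apply this to an arbitrary array such as myExampleData from the question, the operations can be generated rather than written by hand. A rough sketch, keyed on _id; toInsertIfAbsentOps and insertIfAbsent are hypothetical helper names, not part of the driver:

```javascript
// Build one upsert per document, keyed on _id. $setOnInsert only takes effect
// when the upsert creates a document, so existing documents are left untouched.
function toInsertIfAbsentOps(docs) {
  return docs.map(({ _id, ...fields }) => ({
    updateOne: {
      filter: { _id },
      // Assumption: each document has at least one field besides _id;
      // older server versions may reject an empty $setOnInsert.
      update: { $setOnInsert: fields },
      upsert: true
    }
  }));
}

// `collection` is assumed to be a driver collection,
// e.g. dbo.collection(collectionName).
async function insertIfAbsent(collection, docs) {
  const result = await collection.bulkWrite(toInsertIfAbsentOps(docs));
  return result.upsertedCount; // number of documents actually created
}

const myExampleData = [
  { _id: 'someId1', name: 'I am example' },
  { _id: 'someId2', name: 'I am second example' }
];

// Inspect the generated operations without touching a database.
console.log(JSON.stringify(toInsertIfAbsentOps(myExampleData), null, 2));
```

If someId1 already exists, insertIfAbsent(dbo.collection(collectionName), myExampleData) would be expected to create only someId2 and leave someId1 as it is.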
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | whoami - fakeFaceTrueSoul |
Solution 2 | Yahya |
Solution 3 | Maoration |