How to match multiple array elements without using unwind?
I have a collection which contains documents with multiple arrays. These are generally quite large, but for the purposes of explanation consider the following two documents:
{
    "obj1": [
        { "a": "a", "b": "b" },
        { "a": "a", "b": "c" },
        { "a": "a", "b": "b" }
    ],
    "obj2": [
        { "a": "a", "b": "b" },
        { "a": "a", "b": "c" }
    ]
},
{
    "obj1": [
        { "a": "c", "b": "b" }
    ],
    "obj2": [
        { "a": "c", "b": "c" }
    ]
}
The idea is to return only the array elements that match the query. Multiple matches are required, across multiple arrays, so this is beyond what projection with the positional $ operator can do. The desired result would be:
{
    "obj1": [
        { "a": "a", "b": "b" },
        { "a": "a", "b": "b" }
    ],
    "obj2": [
        { "a": "a", "b": "b" }
    ]
}
A traditional approach would be something like this:
db.objects.aggregate([
    { "$match": {
        "obj1": {
            "$elemMatch": { "a": "a", "b": "b" }
        },
        "obj2.b": "b"
    }},
    { "$unwind": "$obj1" },
    { "$match": {
        "obj1.a": "a",
        "obj1.b": "b"
    }},
    { "$unwind": "$obj2" },
    { "$match": { "obj2.b": "b" }},
    { "$group": {
        "_id": "$_id",
        "obj1": { "$addToSet": "$obj1" },
        "obj2": { "$addToSet": "$obj2" }
    }}
])
But using $unwind on both arrays makes the pipeline use a lot of memory and slows things down. There are also possible problems with $addToSet, which discards duplicate elements, and splitting the $group stages for each array can make things even slower.
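To see where the cost comes from, here is a minimal plain JavaScript sketch (not MongoDB code) that emulates the $unwind and $match stages above on the first sample document and counts the intermediate documents. The `unwind` helper and the `_id` value are assumptions made for illustration only.

```javascript
// Rough emulation of $unwind: one output document per array element,
// each still carrying a full copy of every other field.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    doc[field].map((el) => ({ ...doc, [field]: el }))
  );
}

// First sample document from the question (the _id is hypothetical).
const docs = [{
  _id: 1,
  obj1: [{ a: "a", b: "b" }, { a: "a", b: "c" }, { a: "a", b: "b" }],
  obj2: [{ a: "a", b: "b" }, { a: "a", b: "c" }]
}];

const step1 = unwind(docs, "obj1");
const step2 = step1.filter((d) => d.obj1.a === "a" && d.obj1.b === "b");
const step3 = unwind(step2, "obj2");
const step4 = step3.filter((d) => d.obj2.b === "b");

// Document counts after each stage.
const counts = [step1.length, step2.length, step3.length, step4.length];
console.log(counts); // [ 3, 2, 4, 2 ]
```

Every document produced by the first unwind still holds the whole obj2 array, and the second unwind multiplies the surviving documents again, which is why large arrays become expensive.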
So I am looking for a process that is not so intensive but arrives at the same result.
Solution 1:[1]
Since MongoDB 3.2 we have the $filter operator, which makes this quite simple:
db.objects.aggregate([
    { "$match": {
        "obj1": {
            "$elemMatch": { "a": "a", "b": "b" }
        },
        "obj2.b": "b"
    }},
    { "$project": {
        "obj1": {
            "$filter": {
                "input": "$obj1",
                "as": "el",
                "cond": {
                    "$and": [
                        { "$eq": [ "$$el.a", "a" ] },
                        { "$eq": [ "$$el.b", "b" ] }
                    ]
                }
            }
        },
        "obj2": {
            "$filter": {
                "input": "$obj2",
                "as": "el",
                "cond": { "$eq": [ "$$el.b", "b" ] }
            }
        }
    }}
])
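As a rough illustration of what each $filter expression does, here is a plain JavaScript sketch (it does not talk to MongoDB) that applies the same conditions with `Array.prototype.filter` to the first sample document. The variable names are assumptions for illustration.

```javascript
// First sample document from the question.
const doc = {
  obj1: [{ a: "a", b: "b" }, { a: "a", b: "c" }, { a: "a", b: "b" }],
  obj2: [{ a: "a", b: "b" }, { a: "a", b: "c" }]
};

// Like { "$filter": { input, as: "el", cond } }: keep every element for which
// the condition is true, preserving order and duplicates.
const filtered = {
  obj1: doc.obj1.filter((el) => el.a === "a" && el.b === "b"),
  obj2: doc.obj2.filter((el) => el.b === "b")
};

console.log(JSON.stringify(filtered));
// {"obj1":[{"a":"a","b":"b"},{"a":"a","b":"b"}],"obj2":[{"a":"a","b":"b"}]}
```

Both identical matches in obj1 are kept, which is the desired result from the question.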
For earlier versions, MongoDB 2.6 introduced the $map operator, which can act on arrays in place without the need to $unwind. Combined with the logical operators and the set operators that were also added to the aggregation framework, this gives a solution to this problem and others:
db.objects.aggregate([
    { "$match": {
        "obj1": {
            "$elemMatch": { "a": "a", "b": "b" }
        },
        "obj2.b": "b"
    }},
    { "$project": {
        "obj1": {
            "$setDifference": [
                { "$map": {
                    "input": "$obj1",
                    "as": "el",
                    "in": {
                        "$cond": [
                            { "$and": [
                                { "$eq": [ "$$el.a", "a" ] },
                                { "$eq": [ "$$el.b", "b" ] }
                            ]},
                            "$$el",
                            false
                        ]
                    }
                }},
                [false]
            ]
        },
        "obj2": {
            "$setDifference": [
                { "$map": {
                    "input": "$obj2",
                    "as": "el",
                    "in": {
                        "$cond": [
                            { "$eq": [ "$$el.b", "b" ] },
                            "$$el",
                            false
                        ]
                    }
                }},
                [false]
            ]
        }
    }}
])
The core of this is the $map operator, which works like an internalized $unwind: it processes every element of the array, but also lets operations act on those elements in the same statement. Typically this would take several pipeline stages, but here it can be done within a single $project, $group or $redact stage.
In this case the inner processing uses the $cond operator, which evaluates a logical condition and returns a different result for true or false. The condition uses the $eq operator to test the values of fields within the current element, in much the same way as a separate $match pipeline stage would. The $and operator combines the results of multiple conditions on the element, much as the $elemMatch operator does within a $match pipeline stage.
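The intermediate array produced by $map with $cond can be pictured with a plain JavaScript sketch (again not MongoDB code; the names are illustrative assumptions):

```javascript
// obj1 from the first sample document in the question.
const obj1 = [{ a: "a", b: "b" }, { a: "a", b: "c" }, { a: "a", b: "b" }];

// Like $map with $cond: visit every element and emit either the element
// itself (condition true) or the placeholder false (condition false).
const mapped = obj1.map((el) =>
  el.a === "a" && el.b === "b" ? el : false
);

console.log(JSON.stringify(mapped));
// [{"a":"a","b":"b"},false,{"a":"a","b":"b"}]
```

The array keeps its original length at this point; the non-matching positions hold false and still need to be removed.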
Finally, since $cond returns either the current element or false when the condition is not met, we need to "filter" any false values out of the array produced by the $map operation. This is where the $setDifference operator comes in: it compares two input arrays and returns the difference. So when the mapped array is compared to an array containing only false as its element, the result is the elements returned from $map without the false values that $cond emitted when the conditions were not met. Note that $setDifference treats its inputs as sets, so identical matching elements are collapsed into one and the order of the result is not guaranteed.
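A minimal JavaScript sketch of that last step follows. The `setDifference` function is a hypothetical stand-in that approximates the operator's set semantics by comparing JSON strings, which is a simplification of how MongoDB compares values.

```javascript
// Approximation of $setDifference: values of `left` that are not in `right`,
// with duplicates removed because the inputs are treated as sets.
function setDifference(left, right) {
  const key = (value) => JSON.stringify(value);
  const excluded = new Set(right.map(key));
  const seen = new Set();
  const result = [];
  for (const value of left) {
    const k = key(value);
    if (!excluded.has(k) && !seen.has(k)) {
      seen.add(k);
      result.push(value);
    }
  }
  return result;
}

// Output of the $map step for obj1 in the first sample document.
const mappedObj1 = [{ a: "a", b: "b" }, false, { a: "a", b: "b" }];

const difference = setDifference(mappedObj1, [false]);
console.log(JSON.stringify(difference));
// [{"a":"a","b":"b"}]
```

The false placeholder is gone, but so is the second identical match, so this approach fits best when the array elements are unique.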
The result keeps only the matching elements of each array without having to run through separate pipeline stages for $unwind, $match and $group.
Solution 2:[2]
To return more than one match, build the list of conditions dynamically:
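const { timeSlots } = req.body;

// Build one $eq test per requested slot; $or keeps an element if any test passes.
const ts = timeSlots.map((slot) => ({
    $eq: ['$$timeSlots.id', slot.id],
}));

const products = await Product.aggregate<ProductDoc>([
    {
        $match: {
            // Mongoose does not cast aggregation pipeline values; convert this
            // to an ObjectId first if _id is stored as one.
            _id: req.params.productId,
            recordStatus: RecordStatus.Active,
        },
    },
    {
        $project: {
            // Keep only the time slots whose id was requested.
            timeSlots: {
                $filter: {
                    input: '$timeSlots',
                    as: 'timeSlots',
                    cond: { $or: ts },
                },
            },
            name: 1,
            mrp: 1,
        },
    },
]);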
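The condition-building step can be isolated and inspected without a database. The sketch below is one way to do that; `buildTimeSlotFilter` and the sample slot ids are hypothetical names and values, not part of the original answer.

```javascript
// Hypothetical helper: builds the $filter expression for the $project stage
// from a list of requested slots, one $eq test per slot combined with $or.
function buildTimeSlotFilter(timeSlots) {
  const conditions = timeSlots.map((slot) => ({
    $eq: ["$$slot.id", slot.id]
  }));
  return {
    $filter: {
      input: "$timeSlots",
      as: "slot",
      cond: { $or: conditions }
    }
  };
}

// Sample input standing in for req.body.timeSlots.
const expr = buildTimeSlotFilter([{ id: 1 }, { id: 2 }]);
console.log(JSON.stringify(expr));
// {"$filter":{"input":"$timeSlots","as":"slot","cond":{"$or":[{"$eq":["$$slot.id",1]},{"$eq":["$$slot.id",2]}]}}}
```

The returned object can be used as the value of `timeSlots` in the $project stage. Printing it first makes it easy to check the generated conditions.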
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | |
Solution 2 | Rafiq |