'Slow MongoDB aggregate with $sort on _id field
I have a records collection with the following indexes:
{"_id":1}
{"car.make":1,"city":1,"car.mileage":1}
And performing the following query:
db.records.aggregate([
{
"$match":{
"car.mileage":{"$in":[1000,2000,3000,4000]},
"car.make":"Honda",
"city":{"$in":["Miami","San Francisco","New York", "Chigaco", "Seattle", "Boston"]}
}
},
{
"$sort":{"_id":-1}
}
])
The query without the $sort clause finished in a few milliseconds but adding the $sort clause makes it takes around 2 minutes. This query should return around 40 documents from a collection of 6m documents. Any clues about what could cause this huge difference in query time?
After additional testing, this problem goes away by sorting on a different field like creation_date
even if creation_date
is not indexed. Any ideas why the _id
field would perform so much worse than the unindexed creation_date
field in this aggregation?
Solution 1:[1]
I ran into the same problem today and I'm speculating here but I believe in this case sorting by _id sorts all the entries in the collection before other operations (I said speculating because if you omit the $match clause and keep only the $sort clause even then you get your data in milliseconds).
The workaround that helped me was projection. If you use a $project clause between $match and $sort then you will get your data in milliseconds again. So you can either use fields like creation_date or if you must use _id then use $project before sorting it.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | codebuilder123 |