'Poor Write perfomance with MongoDB 5.0.8 in a PSA (Primary-Secondary-Arbiter) setup
I have some write performance struggle with MongoDB 5.0.8 in an PSA (Primary-Secondary-Arbiter) deployment when one data bearing member goes down.
I am aware of the "Mitigate Performance Issues with PSA Replica Set" page and the procedure to temporarily work around this issue.
However, in my opinion, the manual intervention described here should not be necessary during operation. So what can I do to ensure that the system continues to run efficiently even if a node fails? In other words, as in MongoDB 4.x with the option "enableMajorityReadConcern=false".
As I understand the problem has something to do with the defaultRWConcern. When configuring a PSA Replica Set in MongoDB you are forced to set the DefaultRWConcern. Otherwise the following message will appear when rs.addArb is called:
MongoServerError: Reconfig attempted to install a config that would change the implicit default write concern. Use the setDefaultRWConcern command to set a cluster-wide write concern and try the reconfig again.
So I did
db.adminCommand({
"setDefaultRWConcern": 1,
"defaultWriteConcern": {
"w": 1
},
"defaultReadConcern": {
"level": "local"
}
})
I would expect that this configuration causes no lag when reading/writing to a PSA System with only one data bearing node available.
But I observe "slow query" messages in the mongod log like this one:
{
"t": {
"$date": "2022-05-13T10:21:41.297+02:00"
},
"s": "I",
"c": "COMMAND",
"id": 51803,
"ctx": "conn149",
"msg": "Slow query",
"attr": {
"type": "command",
"ns": "<db>.<col>",
"command": {
"insert": "<col>",
"ordered": true,
"txnNumber": 4889253,
"$db": "<db>",
"$clusterTime": {
"clusterTime": {
"$timestamp": {
"t": 1652430100,
"i": 86
}
},
"signature": {
"hash": {
"$binary": {
"base64": "bEs41U6TJk/EDoSQwfzzerjx2E0=",
"subType": "0"
}
},
"keyId": 7096095617276968965
}
},
"lsid": {
"id": {
"$uuid": "25659dc5-a50a-4f9d-a197-73b3c9e6e556"
}
}
},
"ninserted": 1,
"keysInserted": 3,
"numYields": 0,
"reslen": 230,
"locks": {
"ParallelBatchWriterMode": {
"acquireCount": {
"r": 2
}
},
"ReplicationStateTransition": {
"acquireCount": {
"w": 3
}
},
"Global": {
"acquireCount": {
"w": 2
}
},
"Database": {
"acquireCount": {
"w": 2
}
},
"Collection": {
"acquireCount": {
"w": 2
}
},
"Mutex": {
"acquireCount": {
"r": 2
}
}
},
"flowControl": {
"acquireCount": 1,
"acquireWaitCount": 1,
"timeAcquiringMicros": 982988
},
"readConcern": {
"level": "local",
"provenance": "implicitDefault"
},
"writeConcern": {
"w": 1,
"wtimeout": 0,
"provenance": "customDefault"
},
"storage": {},
"remote": "10.10.7.12:34258",
"protocol": "op_msg",
"durationMillis": 983
}
The collection involved here is under proper load with about 1000 reads and 1000 writes per second from different (concurrent) clients.
MongoDB 4.x with "enableMajorityReadConcern=false" performed "normal" here and I have not noticed any loss of performance in my application. MongoDB 5.x doesn't manage that and in my application data is piling up that I can't get written away in a performant way.
So my question is, if I can get the MongoDB 4.x behaviour back. A write guarantee from the single data bearing node which is available in the failure scenario would be OK for me. But in a failure scenario, having to manually reconfigure the faulty node should actually be avoided.
Thanks for any advice!
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|