Hi Jared,
- How do we estimate for the purposes of setting resource limits the number of data files created when using WiredTiger?
1a. WiredTiger creates one data file per namespace (collection or index), correct? Does it ever split a namespace into multiple files e.g. if the data gets too large?
Unlike MMAPv1, WiredTiger does not split a collection's data across multiple files as it grows. WiredTiger creates one file per collection and one file per index. Since the _id field is always indexed, each collection consists of at least two WiredTiger files.
1b. Should we still budget one file handle per data file when using WiredTiger?
Yes, WiredTiger uses one file handle per collection, and one file handle per index.
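Putting 1a and 1b together, the file-handle budget can be estimated directly from the schema. The sketch below is a hypothetical helper, not a MongoDB API; it assumes one file per collection, one file per index, and counts the mandatory _id index for every collection.

```python
# Hypothetical helper (not part of MongoDB): estimate the number of
# WiredTiger data files, and therefore file handles, for a set of
# collections. Assumes one file per collection and one per index,
# with the automatic _id index counted for each collection.

def estimate_wt_files(collections):
    """collections: dict mapping collection name -> number of
    *secondary* indexes (the _id index is added automatically)."""
    total = 0
    for _name, secondary_indexes in collections.items():
        # 1 collection file + 1 _id index file + one file per secondary index
        total += 1 + 1 + secondary_indexes
    return total

# A collection with no secondary indexes still needs two files.
print(estimate_wt_files({"users": 0}))             # -> 2
print(estimate_wt_files({"users": 3, "logs": 1}))  # -> 8
```

Note this only covers data files; mongod also holds handles for journal files, sockets, and so on, so treat the result as a lower bound.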
- How do we estimate for the purposes of setting resource limits the number of internal threads created when using WiredTiger?
2a. Does the following statement (from the doc linked above) cover WiredTiger as well?
mongod uses background threads for a number of internal processes, including TTL collections, replication, and replica set health checks, which may require a small number of additional resources.
Yes. However, the threads mentioned above are not part of the storage engine; they are fundamental to the operation of the mongod server process.
2b. Are there any thread pools in WiredTiger that grow (meaningfully) with load, data, cores, RAM, or some other variable resource?
No. Under normal operation WiredTiger will spawn a fixed number of internal threads.
Please note that for production purposes, I would recommend setting the ulimit values to those listed on the Recommended ulimit Settings page, both for best performance and to ensure that the mongod process is not artificially restricted by sub-optimal ulimit settings.
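As a quick sanity check, the current process limits can be compared against the recommended values. The sketch below uses Python's standard resource module; 64000 is the figure MongoDB's ulimit documentation commonly cites for both open files and processes, but verify it against the docs for your version.

```python
# Minimal sketch (not part of MongoDB tooling): compare the current
# soft rlimits against the value commonly recommended for mongod.
import resource

RECOMMENDED = 64000  # assumed value from MongoDB's ulimit guidance

def check_limits():
    soft_files, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    soft_procs, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "open_files_ok": soft_files >= RECOMMENDED,
        "processes_ok": soft_procs >= RECOMMENDED,
    }

print(check_limits())
```

Running this in the same environment (and as the same user) that runs mongod shows whether the limits would constrain the server before it even starts.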
Best regards,
Kevin
Hi Jared,
Is it fair to say the test is no longer relevant and the warning can be safely ignored? I see no correlation between processes and file handles in any of the guidelines on resource capacity planning.
The rlimits warning you saw concerns the general case, where we recommend that the limit of number of processes should be at least 0.5 times the number of files.
However, it seems that you have very specific requirements for your deployment, which are not covered by the “general” case. Therefore, if your calculations and testing determined that the rlimit warning does not apply to your use case, you may be able to ignore the warning.
Best regards,
Kevin
Hi Jared,
I don’t see how this is a safe general assumption. nNamespaces is driven by your schema (obviously) whereas nConnections is driven by your system architecture (e.g. how many app servers you need and how big their connection pools need to be to handle the planned load). It doesn’t seem safe to make general claims about how your schema should vary with your system architecture.
The recommendation isn’t supposed to be a guideline about architecture, but rather a warning about the relationship between number of processes (threads) the mongod/mongos is expected to run vs. the number of sockets (file descriptors) the process is allowed to have open.
So the point of the warning is really to say “If you set your file descriptor limit to X in order to accommodate Y open connections, then also set your process limit to Y” with the general estimation that half of the file descriptors are expected to be used for sockets instead of disk files.
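The heuristic behind the warning can be written out directly. This is a sketch of the 0.5 rule described above, nothing more: with roughly half of the file descriptors expected to go to sockets, the process (thread) limit should be at least half the file descriptor limit.

```python
# Sketch of the rlimit heuristic: processes >= 0.5 * files.

def min_process_limit(nofile_limit):
    """Minimum process (thread) limit implied by a given
    file descriptor limit, per the 0.5 rule of thumb."""
    return nofile_limit // 2

# e.g. a 64000 file descriptor limit implies a process limit >= 32000
print(min_process_limit(64000))  # -> 32000
```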
Best regards,
Kevin