Questions about: Best practice for storing huge lists


Eldad Yamin

Sep 19, 2017, 11:34:28 AM
to mongodb-dev

Hi,
I’m trying to evaluate Mongo's capabilities.


Is Mongo capable of storing multiple lists, each holding between 100,000 and 100,000,000 records?
The records are stored in a “data series” fashion (or as a delayed queue) and queried accordingly.


Example
List dataset structure:

  • id
  • list_id # the list the record belongs to
  • next_check timestamp
  • status
  • some other fields…

Typical use case:
Select all records that have next_check in the past and a specific status.

SELECT * FROM records
WHERE next_check < now()
  AND status = X
LIMIT :limit OFFSET :offset

Then I can perform several actions:

  • Update the record with a new next_check/status values.
  • OR delete the record and insert a new one.

Questions

What I’m trying to understand is this:

  1. Whether Mongo can handle such a huge dataset:
    • Huge lists (multiple!) of 100,000 up to 100,000,000 records.
    • We don’t have a lot of parallel reads: at most 10 reads per list (for parallelization).
    • Once a record is read, it is going to be updated with a new status/next_check or deleted completely.
  2. What is the best way to store and query such a structure?
  3. Finally, is there any Mongo limitation I need to pay attention to (e.g. don’t put more than 1000 records in a partition)?
Thanks!

Kevin Adistambha

Oct 3, 2017, 11:27:09 PM
to mongodb-dev

Hi

If Mongo can handle such a huge dataset

A huge dataset is not a problem for MongoDB (see the success stories page for some relevant examples). However, the way you describe structuring your data could be a problem. First, a compound index cannot cover more than one array field of the same document (I assume that by lists you mean arrays; see Multikey indexes). Second, MongoDB stores documents as BSON (a binary representation of JSON), and a single document is limited to 16 MB (see MongoDB Limits and Thresholds), so an array of 100,000,000 items in one document would exceed this limit.
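To put rough numbers on the document size limit, here is a back-of-the-envelope check. The 64 bytes per record is purely an assumption for illustration (your real records may be larger), and the estimate ignores BSON framing overhead, so treat it as an optimistic bound rather than a measurement.

```python
# MongoDB's per-document BSON limit of 16 MB, expressed in bytes.
MAX_BSON_DOC_BYTES = 16 * 1024 * 1024

# Assumption for illustration only: each embedded record takes ~64 bytes.
ASSUMED_RECORD_BYTES = 64


def fits_in_one_document(n_records: int, bytes_per_record: int) -> bool:
    """Optimistic check: does an array of n_records fit in one document?"""
    return n_records * bytes_per_record <= MAX_BSON_DOC_BYTES


def max_records(bytes_per_record: int) -> int:
    """Upper bound on records per document at the given record size."""
    return MAX_BSON_DOC_BYTES // bytes_per_record


if __name__ == "__main__":
    for n in (100_000, 100_000_000):
        print(n, fits_in_one_document(n, ASSUMED_RECORD_BYTES))
    print("max records per document:", max_records(ASSUMED_RECORD_BYTES))
```

Even at one byte per record, 100,000,000 records would not fit in a single document, which is why storing each list as one array is not workable at the upper end of your range.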

What is the best way to store and query such structure?

The MongoDB University page offers many free courses on schema design, operations, security, etc. Courses such as M001 or M201 may answer your questions.
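As a starting point, one possible layout (a sketch, not the only option) is to store each record as its own document carrying a list_id field, instead of one array per list. The helpers below only build the filter, update, and index specifications as plain Python structures; the function names and the field values are hypothetical. The `collection` argument is assumed to be a PyMongo `Collection`, whose `create_index`, `find`, and `find_one_and_update` methods are the only driver calls used.

```python
from datetime import datetime, timezone

# Hypothetical compound index: equality fields first, range field last.
# With PyMongo this would be passed as collection.create_index(DUE_INDEX).
DUE_INDEX = [("list_id", 1), ("status", 1), ("next_check", 1)]


def due_filter(list_id, status, now=None):
    """Filter for records of one list whose next_check is in the past."""
    if now is None:
        now = datetime.now(timezone.utc)
    return {"list_id": list_id, "status": status, "next_check": {"$lt": now}}


def reschedule_update(new_status, next_check):
    """Update document that sets a new status and next_check."""
    return {"$set": {"status": new_status, "next_check": next_check}}


def fetch_due(collection, list_id, status, batch_size=100):
    """Read a batch of due records, oldest next_check first."""
    cursor = collection.find(due_filter(list_id, status))
    return list(cursor.sort("next_check", 1).limit(batch_size))


def claim_next(collection, list_id, status, new_status, next_check):
    """Atomically pick one due record and reschedule it.

    find_one_and_update modifies a single matching document atomically,
    so parallel readers should not claim the same record.
    """
    return collection.find_one_and_update(
        due_filter(list_id, status),
        reschedule_update(new_status, next_check),
        sort=[("next_check", 1)],
    )
```

With around 10 parallel readers per list, something like claim_next may be a better fit than limit/offset paging, because each reader takes a different record and the offset does not shift as records are updated or deleted.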

Regarding schema design, I find these links helpful:

and finally, is there any Mongo limitation I need to pay attention to (e.g. don’t put more than 1000 records in a partition)?

For limits and thresholds, the MongoDB Limits and Thresholds page may be useful.

Best regards,
Kevin
