Regarding bulk read on mongodb db path using wt library

Dawood Bin Mansoor

May 28, 2024, 9:28:48 AM

to wiredtiger-users

Hi,

I have a requirement to read a MongoDB dbpath using WiredTiger.

Here is what I am doing

1. Open a connection on the directory

2. Open a session on the connection

3. Open a cursor on the session

Now, I open a cursor on one WT file and iterate over each key/value pair in it, using WT_CURSOR::next, get_key, and get_value. Since I am doing sequential reads, what can I do to decrease the number of reads issued against the directory? Is there something like a ranged read that issues only one I/O operation to the file and returns multiple sequential records? If not, is there some sort of cache policy that can be set for this particular operation?

I have just begun experimenting with WiredTiger and as such am not really familiar with the database internals.

Hope you can indulge me.

Monica Ng

May 28, 2024, 10:17:08 PM

to wiredtig...@googlegroups.com

Hi Dawood,

Thanks for your interest in WiredTiger.

WiredTiger does not support range queries - I’d advise checking with MongoDB to see if they provide anything like this. In WiredTiger, we recently introduced the pre-fetching feature, which when enabled, attempts to read records into the cache before the application needs to access them, reducing time spent waiting on I/O. This sounds like it is likely to help in your scenario of doing sequential reads, as pre-fetching will attempt to read in the next set of pages preemptively.

The pre-fetch API consists of both connection and session level settings to allow for more control over where pre-fetching is enabled, as it is not advisable to turn pre-fetching on for all workloads. On the connection level, I'd suggest the configuration "prefetch=(available=true,default=false)" to allow pre-fetching on the database connection. Then, initialise the session(s) doing sequential reads with the configuration "prefetch=(enabled=true)".

Given that this feature was only recently introduced, it is currently not in general use. It can be used by building from source, though any usage will be at your own risk. A stable version of the pre-fetching feature can be found after the following commit: https://github.com/wiredtiger/wiredtiger/tree/01e0bb1f9cd41947e5134de33666e8dec38d1c9f.

Hope this helps to answer your questions.

Kind regards,

Monica


Dawood Bin Mansoor

May 29, 2024, 5:29:56 AM

to wiredtiger-users

Hey Monica,

Thanks for the detailed answer.

This should help us, will get back to you.
