tiered object switch

Matan Tennenhaus

Aug 13, 2024, 5:32:17 AM
to wiredtiger-users
Hi,
In the documentation I see that when a checkpoint with flush_tier=(enabled=true) is taken, "a new active file is created, data is checkpointed to the old active file". In contrast to this statement, I see in the code that the active block handle is switched to the new one in __checkpoint_prepare, which of course happens before the file is synced. Therefore the tree's checkpoint writes to the block layer go to the new file and not the old one.
1) Is this a mistake in the documentation?
2) If it's configurable, how can I force the switch to happen only after the checkpoint completes and not before?

Many thanks,
Matan.

Will Korteland

Aug 16, 2024, 1:35:54 AM
to wiredtig...@googlegroups.com
Hi Matan,

Thanks for your continued interest in WiredTiger!

The "active" block handle is indeed where new writes will go. This stands in contrast to writes that have been performed, but need to be flushed. Note that a flush happens after the data is on a local disk: the flush operation moves data out to some other "tier" of storage, so isn't strictly required for durability. It's been a little while since I touched this part of tiered storage, but my understanding is that switching dhandles means new writes go to the new file, and the flush is scheduled as future work for a later stage of checkpoint. Hopefully this explains why I think your two statements are true, but not contradictory - if my explanation is unclear or I've misunderstood your question, please let me know.

As for configuration - no, unfortunately this isn't configurable. May I ask why you're interested in changing this? Perhaps it's possible to address your needs some other way.

Regards,
 - Will

Matan Tennenhaus

Aug 19, 2024, 4:15:35 AM
to wiredtiger-users
Hi Will,
I observe that if I make new updates/inserts and then execute a checkpoint with flush_tier, the new block of the page (along with the newly selected updates) is written to the new tiered file because, as I said, the tier switch happens in the checkpoint prepare phase, before the tree is reconciled.
As for motivation, I want to switch the NVMe device that receives the writes, and to do this only after the tree is flushed (= reconciled).
Thanks.
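
The ordering discussed in this thread can be illustrated with a small toy model. This is only a sketch of the sequence as described above (switch in the prepare phase, then reconciliation writes, then the flush of the previous object as later work); it is not WiredTiger code, and every name in it (TieredTable, update, checkpoint, demo) is hypothetical.

```python
# Toy model only: all names here are hypothetical, not WiredTiger APIs.
from dataclasses import dataclass, field


@dataclass
class TieredTable:
    # objects[i] holds the blocks written to tiered object i
    objects: list = field(default_factory=lambda: [[]])
    # index of the object that receives new block writes
    active: int = 0
    # indexes of objects that have been copied out to the other tier
    flushed: set = field(default_factory=set)
    # updates that have not been reconciled (written) yet
    dirty: list = field(default_factory=list)

    def update(self, value):
        self.dirty.append(value)

    def checkpoint(self, flush_tier=False):
        to_flush = None
        # Prepare phase: with flush_tier, the active object is switched first.
        if flush_tier:
            to_flush = self.active
            self.objects.append([])
            self.active = len(self.objects) - 1
        # Reconciliation: dirty data goes to whichever object is active now.
        self.objects[self.active].extend(self.dirty)
        self.dirty.clear()
        # Later stage: the previously active object is flushed to the other tier.
        if to_flush is not None:
            self.flushed.add(to_flush)


def demo():
    t = TieredTable()
    t.update("a")
    t.checkpoint()                 # plain checkpoint: "a" lands in object 0
    t.update("b")
    t.checkpoint(flush_tier=True)  # the switch happens before "b" is written
    return t.objects, t.active, sorted(t.flushed)
```

In this model, the update made just before the flush_tier checkpoint ends up in the new object (index 1), while the object that is flushed (index 0) contains only data written by earlier checkpoints. That matches both the observation about where the reconciled block lands and the explanation that the flush applies to writes that were already performed.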
