Difficulties in understanding varieties of LSNs

bai...@gmail.com

Jan 27, 2021, 9:08:56 AM

to wiredtiger-users

Hi. I have spent two weeks exploring WT's WAL implementation and am having real difficulty making sense of the different kinds of LSNs manipulated by WT_LOG and WT_LOGSLOT.

For now, I have a rough sense of what each type of LSN is and how it relates to the others, but that sense is vague and uncertain.

Susan's two articles [1,2] about the WAL really help, and I have no problem understanding the join-release mechanism. But the many LSNs, such as log.alloc_lsn, slot.slot_release_lsn, slot.slot_start_lsn, slot.slot_end_lsn, and log.write_start_lsn, and the rules for reading and updating them are somewhat complicated and undocumented.

So I'm wondering whether there is a more elaborate explanation of these LSNs, maybe like the one in [3]? Or maybe some kind of flow diagram describing their relations?

Suggestions and guidance are welcome.

[1] https://engineering.mongodb.com/post/breaking-the-wiredtiger-logjam-the-write-ahead-log-1-2
[2] https://engineering.mongodb.com/post/breaking-the-wiredtiger-logjam-the-wait-free-solution-2-2

[3] https://dev.mysql.com/doc/dev/mysql-server/latest/PAGE_INNODB_REDO_LOG.html

bai...@gmail.com

Jan 28, 2021, 1:37:18 AM

to wiredtiger-users

Specific question 1:

/*
 * __wt_log_force_write --
 *     Force a switch and release and write of the current slot. Wrapper function that takes the
 *     lock.
 */
int
__wt_log_force_write(WT_SESSION_IMPL *session, bool retry, bool *did_work)

It seems that __wt_log_force_write doesn't necessarily write to OS buffers when the slot being switched is not in the `done` state. This is confusing: does __wt_log_force_write not do the write itself, and just switch the slot?

Specific question 2:

In __log_slot_close, this code is quite unclear to me:

slot->slot_end_lsn = slot->slot_start_lsn;
end_offset = WT_LOG_SLOT_JOINED_BUFFERED(old_state) + slot->slot_unbuffered;
slot->slot_end_lsn.l.offset += (uint32_t)end_offset;
WT_STAT_CONN_INCRV(session, log_slot_consolidated, end_offset);
/*
 * XXX Would like to change so one piece of code advances the LSN.
 */
log->alloc_lsn = slot->slot_end_lsn;
WT_ASSERT(session, log->alloc_lsn.l.file >= log->write_lsn.l.file);

sue.l...@mongodb.com

Feb 1, 2021, 2:40:48 PM

to wiredtiger-users

> It seems that __wt_log_force_write doesn't necessarily write to OS buffers when the slot being switched is not in the `done` state. This is confusing: does __wt_log_force_write not do the write itself, and just switch the slot?

The slot is actually written in '__wt_log_release'. When '__wt_log_force_write' finds that the slot is not in the "done" state, it means that a thread is actively using the slot at that moment. That active thread, rather than the thread that forced the switch in '__wt_log_force_write', will therefore be the one to call '__wt_log_release'. So yes, in some circumstances the caller will not actually do the write, but it forces the switch, which causes that slot to be written as soon as it is no longer in use.
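As a rough illustration of that handoff, here is a minimal single-threaded sketch. It is not the WiredTiger code: the `handoff_slot` type, its fields, and the `sketch_*` functions are hypothetical names, and the real implementation packs this state into an atomically updated slot state rather than plain fields. The sketch only shows the rule that whichever thread brings a closed slot to the "done" state is the one that writes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified slot; not the real WT_LOGSLOT layout. */
typedef struct {
    bool closed;       /* no new threads may join this slot */
    uint64_t joined;   /* bytes reserved by threads that joined */
    uint64_t released; /* bytes whose copy into the slot has finished */
    bool written;      /* set once the slot contents were handed to the OS */
} handoff_slot;

/* "Done" in this sketch: closed, and every joined byte has been released. */
static bool
sketch_slot_done(const handoff_slot *slot)
{
    return (slot->closed && slot->joined == slot->released);
}

/* Stand-in for the actual write of the slot buffer. */
static void
sketch_slot_write(handoff_slot *slot)
{
    assert(sketch_slot_done(slot));
    slot->written = true;
}

/*
 * Forced switch: close the slot, but write it only if no thread is still
 * copying into it. Returns true if the caller did the write itself.
 */
static bool
sketch_force_write(handoff_slot *slot)
{
    slot->closed = true;
    if (!sketch_slot_done(slot))
        return (false); /* an active thread will write on release */
    sketch_slot_write(slot);
    return (true);
}

/*
 * A joined thread finishes its copy. If that completes a closed slot,
 * this thread performs the write instead of the thread that forced the switch.
 */
static bool
sketch_release(handoff_slot *slot, uint64_t size)
{
    slot->released += size;
    if (!sketch_slot_done(slot))
        return (false);
    sketch_slot_write(slot);
    return (true);
}
```

With one thread still holding 100 bytes in the slot, `sketch_force_write` returns false and leaves the slot unwritten; the later `sketch_release(&slot, 100)` call is the one that writes it.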

> In __log_slot_close, this code is quite unclear to me:

When switching out the active slot for a new active slot, the thread doing the switch first has to close the existing active slot and then initialize one of the free slots as the new active slot. After the existing active slot is closed to new threads joining, that slot's information is finalized with the amount of data that the slot represents. The slot has a starting LSN. We compute how much data was written in this slot while it was the active slot, and we add that to the starting LSN to get the ending LSN. We then set 'alloc_lsn' to that ending LSN: it marks the new end of the log and becomes the starting LSN of the next active slot. So each slot represents a contiguous slice of the write-ahead log.
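The arithmetic can be sketched in a few lines of C. This is a simplified model under stated assumptions, not the real structures: `sketch_lsn`, `sketch_slot`, and both functions are hypothetical names, the joined byte count is kept in a plain field instead of being decoded from the slot state, and log file rollover is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical simplified LSN: a log file number plus a byte offset in it. */
typedef struct {
    uint32_t file;
    uint32_t offset;
} sketch_lsn;

/* Hypothetical simplified slot; not the real WT_LOGSLOT layout. */
typedef struct {
    sketch_lsn start_lsn;     /* where this slot's slice of the log begins */
    sketch_lsn end_lsn;       /* where it ends, known only after close */
    uint64_t joined_buffered; /* bytes reserved in the slot buffer */
    uint64_t unbuffered;      /* bytes of a record too large for the buffer */
} sketch_slot;

/* Make a free slot the active one: it starts at the current end of the log. */
static void
sketch_slot_activate(sketch_slot *slot, const sketch_lsn *alloc_lsn)
{
    slot->start_lsn = slot->end_lsn = *alloc_lsn;
    slot->joined_buffered = slot->unbuffered = 0;
}

/* Close the active slot: finalize its end LSN and advance the allocation LSN. */
static void
sketch_slot_close(sketch_slot *slot, sketch_lsn *alloc_lsn)
{
    uint64_t end_offset = slot->joined_buffered + slot->unbuffered;

    slot->end_lsn = slot->start_lsn;
    slot->end_lsn.offset += (uint32_t)end_offset;
    *alloc_lsn = slot->end_lsn; /* the next active slot will start here */
}
```

For example, a slot activated at offset 128 that collects 300 bytes closes with an ending offset of 428, and the next slot activated from the same allocation LSN starts at 428, so the two slots cover adjacent ranges of the log.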
