Yes, I understand. Unfortunately, Sophia does not support a tailing iterator yet. I might implement this for the next release.
I've been writing a small tutorial on how to build your own streaming asynchronous replication system using Sophia.
Assume we are building a replication system (basic master-slave replication), or integrating into an existing one, and using Sophia for data storage.
Let's say the system is a *database*.
The following issues must be resolved to complete the integration:
- write-ahead log support
- database recovery
- schema storage
- replication
- garbage-collection
Write-Ahead Log
- the database must have its own WAL (write-ahead log) implementation
- all database operations are written to the WAL
- each operation is marked with a unique, monotonically increasing u64 LSN (log sequence number)
- all records of a multi-statement transaction have the same LSN
- the WAL implementation must be fault-tolerant: no partially written transactions, corruption detection, sync policies, etc.
- Sophia's own WAL is turned off (log.enable = 0)
- on each Sophia transaction commit, we must pass the LSN
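
To make the WAL requirements above a bit more concrete, here is a minimal sketch in Python of a log record format carrying an LSN and a checksum. The record layout, the function names (encode_record, encode_transaction, decode_log) and the use of CRC32 are all my assumptions for illustration; none of this is a Sophia format or API. It only detects torn or corrupt records, so a real implementation would also need transaction boundaries and a sync policy.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not a Sophia format):
# crc32 (u32) | lsn (u64) | payload length (u32), little-endian, then payload.
HEADER = struct.Struct("<IQI")


def encode_record(lsn, payload):
    """Serialize one record; the CRC covers the LSN, the length and the payload."""
    body = struct.pack("<QI", lsn, len(payload)) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def encode_transaction(lsn, statements):
    """All statements of one multi-statement transaction share the same LSN."""
    return b"".join(encode_record(lsn, s) for s in statements)


def decode_log(buf):
    """Yield (lsn, payload) pairs, stopping at the first torn or corrupt record."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        crc, lsn, length = HEADER.unpack_from(buf, offset)
        end = offset + HEADER.size + length
        if end > len(buf):
            break  # partially written tail
        if zlib.crc32(buf[offset + 4:end]) != crc:
            break  # corruption detected
        yield lsn, buf[offset + HEADER.size:end]
        offset = end
```

Reading stops at the first bad record, so everything yielded before that point can be replayed safely.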
Database recovery
Open an existing database:
Log files must be replayed into Sophia, since some of those records might not yet have been written to Sophia storage. To accomplish this, Sophia has a TPR (Two-Phase Recovery) mode (sophia.recover = 2).
- the first sp_open(env) starts TPR
- replay the database log files, passing the original LSN on each commit
- Sophia checks whether the transaction has already been committed to storage; if not, it is scheduled for commit
- the second sp_open(env) marks TPR completion, starts the thread pool, etc.
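
The replay logic can be illustrated with a toy model in Python. ToyStorage is a hypothetical stand-in, not the Sophia API, and the single durable_lsn watermark is my simplifying assumption about how "already committed" is decided; the point is only to show that replaying the whole log is safe because transactions already on disk are skipped.

```python
class ToyStorage:
    """Hypothetical stand-in for the storage engine (not the Sophia API)."""

    def __init__(self, data, durable_lsn):
        self.data = dict(data)          # what is already on disk
        self.durable_lsn = durable_lsn  # assumption: highest LSN known to be on disk
        self.recovering = True          # phase one: accepting replayed commits
        self.applied = []               # LSNs that actually had to be re-applied

    def commit(self, lsn, writes):
        """Apply a replayed transaction unless it is already in storage."""
        if lsn <= self.durable_lsn:
            return False                # already committed, nothing to do
        self.data.update(writes)
        self.applied.append(lsn)
        return True

    def finish_recovery(self):
        """Phase two: recovery is complete, normal operation may start."""
        self.recovering = False


def recover(storage, log):
    """Replay every logged transaction with its original LSN, then finish."""
    for lsn, writes in log:
        storage.commit(lsn, writes)
    storage.finish_recovery()
    return storage
```

For example, with a storage at durable_lsn=1 and a log holding LSNs 1 to 3, only LSNs 2 and 3 are re-applied.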
Schema storage
The database must store information about the Sophia configuration and its databases somewhere.
The schema should be initialized before Sophia startup.
Replication
Two major cases must be supported:
- the replica is fresh and new (create a Sophia cursor and feed all data)
- the replica already has data (position your log reader at its last LSN, then feed data).
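
The two cases can be sketched as one function. This is only an illustration under my own assumptions: feed_replica and the message tuples are hypothetical, the snapshot argument stands in for a full cursor scan, and I assume the snapshot comes with the LSN it is consistent with so that log streaming can continue from there.

```python
def feed_replica(replica_lsn, snapshot, log):
    """Build the list of messages to send to a replica.

    replica_lsn: None for a fresh replica, otherwise the last LSN it applied.
    snapshot:    (snapshot_lsn, rows), where rows stands in for a cursor scan.
    log:         iterable of (lsn, payload), ordered by LSN.
    """
    messages = []
    if replica_lsn is None:
        # Fresh replica: feed all data first, then continue from the snapshot LSN.
        snapshot_lsn, rows = snapshot
        messages.extend(("row", key, value) for key, value in rows)
        replica_lsn = snapshot_lsn
    # Replica with data: position the log reader after its last LSN and feed.
    messages.extend(("log", lsn, payload) for lsn, payload in log if lsn > replica_lsn)
    return messages
```

A real system would stream these messages instead of building a list, but the positioning logic is the same.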
Garbage-Collection
GC may be implemented to remove old log files. A log file may be removed once the following conditions have been checked:
(a) its log records have been successfully added to Sophia storage:
Sophia *Checkpoints* can be used to ensure that any in-memory data has been flushed to disk.
(b) whether some replica is behind, with an LSN lower than the one in the log file:
There are different situations here; in some cases the replica can be told to reinitialize.
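
One conservative way to combine both conditions is sketched below. The function name, the (name, max_lsn) description of a log file and the checkpoint_lsn argument are my assumptions; this version simply keeps every file that a lagging replica may still need and does not model the case where a replica is told to reinitialize.

```python
def removable_logs(log_files, checkpoint_lsn, replica_lsns):
    """Return the names of log files that are safe to remove.

    log_files:      list of (name, max_lsn), max_lsn being the highest LSN in the file.
    checkpoint_lsn: assumption: every LSN up to this one is in storage on disk.
    replica_lsns:   last LSN applied by each connected replica.
    """
    # The slowest replica decides how much log must be kept for catch-up.
    slowest = min(replica_lsns, default=checkpoint_lsn)
    limit = min(checkpoint_lsn, slowest)
    return [name for name, max_lsn in log_files if max_lsn <= limit]
```

With files ending at LSNs 10, 20 and 30, a checkpoint at 25 and replicas at 15 and 40, only the first file can be removed.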
Thanks,
Dmitry