Tailing/Sequence iterator

Arthur Silva

Jan 30, 2016, 3:18:06 PM

to Sophia database

Hello,

Is it possible to create an iterator that tails the database (like tailing a log)? Maybe by specifying an LSN or TSN. This would be really helpful on systems that do async replication.

Also, is it possible to go back in time after a transaction has been committed, like in forestdb?

Regards,
Arthur

Dmitry Simonenko

Jan 31, 2016, 3:25:20 AM

to Sophia database

Hi Arthur,

Yes, I understand. Unfortunately, Sophia cannot do a tailing iterator yet. I might implement this for the next release.

Current systems that use Sophia have their own WAL.

Sophia has several modes that allow you to disable its own WAL, forge LSN numbers, etc.: http://sophia.systems/v2.1/admin/integration.html

It is possible to have a point-in-time view: http://sophia.systems/v2.1/admin/view.html

I've been writing a small tutorial on how to make your own streaming async replication system using Sophia.

It might be useful:

- - - - - - - - - - - - - - - - - - - - - -

This is a continuation of the original discussion: https://github.com/pmwkaa/sophia/issues/81

Assume we are building a replication system (basic master-slave replication), or integrating into an existing one, and using Sophia for data storage.

Let's say the system is a *database*.

The following issues must be resolved to complete the integration:

write-ahead log support
database recovery
schema storage
replication
garbage-collection

Write-Ahead Log

the database must have its own WAL (write-ahead log) implementation
all database operations are written to the WAL
each operation is marked with a unique, monotonically increasing u64 LSN (log sequence number)
all records of a multi-statement transaction have the same LSN
the WAL implementation must be fault-tolerant: no partly written transactions, corruption detection, sync policies, etc.
the Sophia WAL is turned off (log.enable = 0)
on Sophia transaction commit, we must pass the LSN
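
To make the requirements above concrete, here is a minimal sketch of such a log in Python. It does not call Sophia at all; the class name, the record layout and the in-memory buffer (standing in for a synced log file) are all assumptions made for illustration. Each transaction is written as one checksummed frame under a single new LSN, so a torn or corrupted tail is dropped as a whole on read.

```python
import struct
import zlib


class WriteAheadLog:
    """Toy append-only log (hypothetical, not part of Sophia)."""

    # Frame header: lsn (u64), body length (u32), CRC32 of the body (u32).
    HEADER = struct.Struct("<QII")

    def __init__(self):
        self.buf = bytearray()  # stands in for a log file on disk
        self.lsn = 0            # last LSN handed out

    def commit(self, statements):
        """Append every statement of one transaction under a single new LSN."""
        self.lsn += 1
        # Length-prefix each statement so the frame can be split again on read.
        body = b"".join(struct.pack("<I", len(s)) + s for s in statements)
        self.buf += self.HEADER.pack(self.lsn, len(body), zlib.crc32(body))
        self.buf += body
        return self.lsn

    def transactions(self):
        """Yield (lsn, statements); stop at the first torn or corrupt frame."""
        pos = 0
        while pos + self.HEADER.size <= len(self.buf):
            lsn, size, crc = self.HEADER.unpack_from(self.buf, pos)
            start = pos + self.HEADER.size
            body = bytes(self.buf[start:start + size])
            if len(body) < size or zlib.crc32(body) != crc:
                break  # partly written or corrupted transaction: ignore it
            statements, off = [], 0
            while off < size:
                (n,) = struct.unpack_from("<I", body, off)
                statements.append(body[off + 4:off + 4 + n])
                off += 4 + n
            yield lsn, statements
            pos = start + size
```

The LSN returned by commit() is the value that would then be passed along with the matching Sophia transaction commit. Sync policies and log file rotation are left out of the sketch.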

Database recovery

Open an existing database:

Log files must be replayed to Sophia, since some of their records might not yet have been written to Sophia storage. To accomplish this, Sophia has a TPR (Two-Phase Recovery) mode (sophia.recover = 2).

the first sp_open(env) starts TPR
replay the database log files, passing the original LSN on each commit
Sophia checks whether the transaction has already been committed to storage; if not, it is scheduled for commit
the second sp_open(env) marks TPR completion, starts the thread pool, etc.

details: http://sophia.systems/v2.1/admin/integration.html
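
The replay logic can be modelled without Sophia as follows. This is only a sketch: the Storage class is a hypothetical stand-in for the engine, and the single "last durable LSN" comparison is an assumed simplification of whatever check Sophia performs internally.

```python
class Storage:
    """Hypothetical stand-in for the storage engine during recovery."""

    def __init__(self, data=None, last_lsn=0):
        self.data = dict(data or {})
        self.last_lsn = last_lsn  # assumed: highest LSN already durable
        self.online = False

    def commit(self, lsn, writes):
        """Apply a replayed transaction unless storage already contains it."""
        if lsn <= self.last_lsn:
            return False  # already committed to storage, nothing to do
        self.data.update(writes)
        self.last_lsn = lsn
        return True


def recover(storage, log):
    """Replay (lsn, writes) pairs in LSN order, then bring storage online."""
    applied = [lsn for lsn, writes in log if storage.commit(lsn, writes)]
    storage.online = True  # plays the role of the second open
    return applied
```

For example, if storage already holds everything up to LSN 2 and the log contains LSNs 1 to 3, only the transaction with LSN 3 is applied during replay.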

Schema storage

The database must store information about the Sophia configuration and its databases somewhere.

The schema should be initialized before Sophia startup.

Replication

Two major cases must be supported:

the replica is fresh and new (create a Sophia cursor and feed all data)
the replica has data (position your log reader by its last LSN (log sequence number), then feed data)
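
Both cases can be sketched in one function. Everything here is an assumption made for illustration: the function name, the message tuples, and the idea that the full data set read through a cursor is consistent as of a known snapshot_lsn. No Sophia calls are made.

```python
def feed_replica(replica_lsn, snapshot, snapshot_lsn, log):
    """Build the list of messages to send to one replica.

    replica_lsn  -- last LSN the replica applied, or None if it is fresh
    snapshot     -- dict with the full data set (as read through a cursor)
    snapshot_lsn -- LSN the snapshot is consistent with (assumed known)
    log          -- list of (lsn, writes) pairs in increasing LSN order
    """
    messages = []
    if replica_lsn is None:
        # Fresh replica: feed all data first.
        for key in sorted(snapshot):
            messages.append(("row", key, snapshot[key]))
        replica_lsn = snapshot_lsn
    # Position the log reader right after the replica's last LSN.
    for lsn, writes in log:
        if lsn > replica_lsn:
            messages.append(("txn", lsn, writes))
    return messages
```

A fresh replica receives the rows followed by any transactions newer than the snapshot; a replica that already has data receives only the log tail after its own LSN.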

Garbage-Collection

GC may be implemented to remove old log files. A log file may be removed if the following conditions are true:

(a) its log records have been successfully added to Sophia storage:

Sophia *Checkpoints* can be used to ensure that any in-memory data has gone to disk:

http://sophia.systems/v2.1/admin/compaction.html

(b) no replica is behind with an LSN lower than the one in the log file:

There are different situations here; in some cases a lagging replica can instead be told to reinitialize.
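
The two conditions can be combined into a small check. This is a sketch under assumptions: the function name, the (name, max_lsn) description of a log file, and the notion of a checkpoint LSN up to which data is known to be on disk are hypothetical and not taken from the Sophia API.

```python
def removable_logs(log_files, checkpoint_lsn, replica_lsns):
    """Return the names of log files that are safe to delete.

    log_files      -- list of (name, max_lsn), max_lsn being the highest
                      LSN stored in that file
    checkpoint_lsn -- everything up to this LSN is on disk in storage (a)
    replica_lsns   -- last applied LSN of every replica we must serve (b)
    """
    # A file is still needed if storage or any replica has not passed it.
    safe_up_to = min([checkpoint_lsn, *replica_lsns])
    return [name for name, max_lsn in log_files if max_lsn <= safe_up_to]
```

Leaving a far-behind replica out of replica_lsns lets its log files be collected, at the price of having to reinitialize that replica later.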

Thanks,

Dmitry
