Tailing/Sequence iterator

Arthur Silva

Jan 30, 2016, 3:18:06 PM
to Sophia database
Hello,

Is it possible to create a tailing iterator over the database, perhaps by specifying an LSN or TSN to start from? This would be really helpful on systems that do async replication.

Also, is it possible to go back in time after a transaction has been committed, as in ForestDB?

Regards,
Arthur

Dmitry Simonenko

Jan 31, 2016, 3:25:20 AM
to Sophia database
Hi Arthur,

Yes, I understand. Unfortunately, Sophia does not support a tailing iterator yet. I might implement this for the next release.

Current systems that use Sophia have their own WAL.
Sophia has several modes that allow you to disable its own WAL, forge LSN numbers, etc.: http://sophia.systems/v2.1/admin/integration.html

It is possible to have a point-in-time view: http://sophia.systems/v2.1/admin/view.html

I've been writing a small tutorial on how to build your own streaming async replication system using Sophia.
It might be useful:

- - - - - - - - - - - - - - - - - - - - - -

This is a continuation of the original discussion: https://github.com/pmwkaa/sophia/issues/81

Assume we are building a replication system (basic master-slave replication), or integrating into an existing one, and using Sophia for data storage.
Let's say the system is a *database*.

The following issues must be resolved to complete the integration:
  • write-ahead log support
  • database recovery
  • schema storage
  • replication
  • garbage-collection

Write-Ahead Log

  • the database must have its own WAL (write-ahead log) implementation
  • all database operations are written to the WAL
  • each operation is marked with a unique, monotonically increasing u64 LSN (log sequence number)
  • all records of a multi-statement transaction have the same LSN
  • the WAL implementation must be fault-tolerant: no partly written transactions, corruption detection, sync policies, etc.
  • the Sophia WAL is turned off (log.enable = 0)
  • on Sophia transaction commit, we must pass the LSN
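
The framing requirements above can be illustrated with a minimal sketch. It is written in Python for brevity and does not use Sophia's API; the record layout (LSN, body length, CRC32) and all function names are assumptions made up for this example, not a format Sophia defines. Each transaction is written as one frame, so a torn write drops the whole transaction rather than part of it.

```python
import struct
import zlib

# Hypothetical on-disk layout (an assumption, not a Sophia format):
#   frame header: lsn (u64), body length (u32), crc32 of body (u32)
#   body: a sequence of statements, each prefixed by its length (u32)
FRAME = struct.Struct("<QII")
STMT = struct.Struct("<I")


def encode_transaction(lsn, statements):
    """Pack all statements of one transaction into a single frame with one LSN."""
    body = b"".join(STMT.pack(len(s)) + s for s in statements)
    return FRAME.pack(lsn, len(body), zlib.crc32(body)) + body


def decode_body(body):
    """Split a verified frame body back into its statements."""
    statements, offset = [], 0
    while offset < len(body):
        (length,) = STMT.unpack_from(body, offset)
        offset += STMT.size
        statements.append(body[offset:offset + length])
        offset += length
    return statements


def read_log(buf):
    """Yield (lsn, statement) pairs; stop at the first torn or corrupt frame."""
    offset = 0
    while offset + FRAME.size <= len(buf):
        lsn, length, crc = FRAME.unpack_from(buf, offset)
        start = offset + FRAME.size
        body = buf[start:start + length]
        if len(body) < length or zlib.crc32(body) != crc:
            break  # partial write or corruption: ignore the rest of the file
        for stmt in decode_body(body):
            yield lsn, stmt
        offset = start + length


log = encode_transaction(1, [b"set a=1"]) + encode_transaction(2, [b"set b=2", b"del a"])
complete = list(read_log(log))
torn = list(read_log(log[:-3]))  # simulate a crash in the middle of the last write
```

Here `complete` holds three statements (the last two share LSN 2), while `torn` holds only the first transaction, because the second frame fails the length check. Sync policy (when to call fsync) is left out of the sketch.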

Database recovery

Open an existing database:

Log files must be replayed into Sophia, since some of their records might not yet have been written to Sophia storage. To accomplish this, Sophia has a TPR (Two-Phase Recovery) mode (sophia.recover = 2).

  • the first sp_open(env) starts TPR
  • replay the database log files, passing the original LSN on each commit
  • Sophia checks whether the transaction has already been committed to storage; if not, it is scheduled for commit
  • the second sp_open(env) marks TPR completion, starts the thread pool, etc.
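
The replay logic of the steps above can be modelled with a small Python sketch. The `Storage` class is a stand-in invented for this example and is not Sophia's API; it assumes the storage knows the highest LSN it already holds durably and silently skips anything at or below it, which is only an approximation of the check described above.

```python
class Storage:
    """Stand-in for the storage engine (hypothetical, not Sophia's API)."""

    def __init__(self, data, durable_lsn):
        self.data = dict(data)
        self.durable_lsn = durable_lsn  # highest LSN already on disk
        self.recovering = True          # set by the "first open"

    def commit(self, lsn, key, value):
        """Apply a replayed write unless storage already contains it."""
        if lsn <= self.durable_lsn:
            return False                # already in storage: skip
        self.data[key] = value          # otherwise schedule/apply the write
        return True

    def finish_recovery(self):
        """The "second open": recovery is complete, normal operation starts."""
        self.recovering = False


def recover(storage, log):
    """Replay every log record with its original LSN; return how many applied."""
    applied = 0
    for lsn, key, value in log:
        if storage.commit(lsn, key, value):
            applied += 1
    storage.finish_recovery()
    return applied


# Storage already holds LSN 1; the log also contains LSN 2 (two statements) and 3.
storage = Storage({"a": 1}, durable_lsn=1)
log = [(1, "a", 1), (2, "b", 2), (2, "c", 3), (3, "a", 5)]
applied = recover(storage, log)
```

After the replay, `applied` is 3 and `storage.data` contains the writes from LSN 2 and 3 on top of what was already stored. Note that `durable_lsn` is not advanced during replay, so both statements of the LSN 2 transaction are applied.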

Schema storage

The database must store information about the Sophia configuration and its databases somewhere.
The schema should be initialized before Sophia startup.

Replication

Two major cases must be supported:

  • the replica is fresh and new (create a Sophia cursor and feed all data)
  • the replica already has data (position your log reader at the replica's last LSN, then feed data)
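
The two cases can be sketched as a single feed function. This is an illustration in Python under stated assumptions: `snapshot` stands in for what a storage cursor would return, `snapshot_lsn` is the LSN at which that snapshot was taken, and a replica LSN of 0 is taken to mean "fresh". None of these names come from Sophia.

```python
def feed_replica(replica_lsn, snapshot, snapshot_lsn, log):
    """Build the list of messages a master would send to one replica.

    replica_lsn  -- last LSN the replica has applied (0 means a fresh replica)
    snapshot     -- dict standing in for a full cursor scan of the storage
    snapshot_lsn -- LSN at which the snapshot was taken
    log          -- iterable of (lsn, operation) pairs from the WAL
    """
    messages = []
    if replica_lsn == 0:
        # Fresh replica: send every row, then continue from the snapshot LSN.
        for key, value in sorted(snapshot.items()):
            messages.append(("row", key, value))
        replica_lsn = snapshot_lsn
    # Existing replica (or fresh one after the full copy): send only newer records.
    for lsn, op in log:
        if lsn > replica_lsn:
            messages.append(("log", lsn, op))
    return messages


snapshot = {"a": 1, "b": 2}
log = [(1, "set a=1"), (2, "set b=2"), (3, "del a")]
fresh = feed_replica(0, snapshot, 2, log)
behind = feed_replica(2, snapshot, 2, log)
```

A fresh replica receives both rows followed by the LSN 3 record; a replica already at LSN 2 receives only the LSN 3 record.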

Garbage-Collection

GC may be implemented to remove old log files. A log file may be removed if the following conditions are true:

(a) its log records have been successfully added to Sophia storage.
Sophia *Checkpoints* can be used to ensure that in-memory data has been written to disk.

(b) no replica is behind it, that is, no replica's LSN is lower than the one in the log file.
If a replica is behind, there are different options; in some cases the replica can be told to reinitialize.
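
One possible GC policy, sketched in Python: a log file is removable only when its highest LSN is covered both by the last checkpoint and by every replica. The function name and the `(name, max_lsn)` file description are assumptions for this example. This is the conservative variant; a system could instead delete the file anyway and force a lagging replica to reinitialize.

```python
def removable_logs(log_files, checkpoint_lsn, replica_lsns):
    """Return the names of log files that are safe to delete.

    log_files      -- list of (name, max_lsn): highest LSN stored in each file
    checkpoint_lsn -- highest LSN known to be on disk in the storage
    replica_lsns   -- last applied LSN of every connected replica
    """
    # With no replicas, only the checkpoint limits what can be removed.
    slowest_replica = min(replica_lsns, default=checkpoint_lsn)
    safe_lsn = min(checkpoint_lsn, slowest_replica)
    return [name for name, max_lsn in log_files if max_lsn <= safe_lsn]


files = [("0001.log", 100), ("0002.log", 200), ("0003.log", 300)]
with_lagging_replica = removable_logs(files, 250, [220, 150])
without_replicas = removable_logs(files, 250, [])
```

With a replica stuck at LSN 150 only `0001.log` can go; with no replicas the checkpoint at 250 also frees `0002.log`.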

Thanks,
Dmitry