rocksdb for backend

Subba G

Aug 31, 2018, 7:51:44 PM
to etcd-dev
I would like to add a new backend to etcd; my first attempt uses RocksDB.
I have it working without key versioning, since RocksDB overwrites a key's value on update.
I would appreciate your help.

Thanks
Subba 

Joe Betz

Sep 1, 2018, 10:40:14 AM
to Subba G, etcd-dev
Would be great to see benchmarks!  Is that the intent?

Sam Batschelet

Sep 6, 2018, 11:10:24 PM
to Subba G, etcd-dev
Subba,
I am very curious about your motivation/use case. Can you elaborate a little on the project?

Best,

Sam

Subba G

Sep 7, 2018, 9:06:14 AM
to s...@hexfusion.com, etcd...@googlegroups.com
Motivation for using RocksDB as a backend for etcd, with a brief history of our project:
We already use RocksDB to store cluster metadata, which is on the order of tens of GB.
We have implemented multi-Paxos to manage consistent replicas, and we now intend to replace Paxos with Raft to simplify things.
We looked into etcd; however, it uses memory-mapped files (bbolt) and is designed for read-heavy workloads.
For our project we are weighing the pros and cons of three options: migrating our data to stock etcd, taking just the Raft implementation from etcd,
or porting etcd to use a RocksDB backend.

Thanks 
Subba

Xiang Li

Sep 7, 2018, 8:10:01 PM
to gsai...@gmail.com, Sam Batschelet, etcd...@googlegroups.com
Replacing boltdb with RocksDB should not be hard. I initially used LevelDB when designing the interface, and switched to a B+tree because of the spiky resource usage of LSM trees during compaction.

Subba G

Sep 7, 2018, 10:15:42 PM
to Xiang Li, Sam Batschelet, etcd...@googlegroups.com
Xiang,
I have tried it. All unit tests pass except the ones below, probably because RocksDB overwrites the previous value whereas boltdb keeps previous versions of the key.
I would appreciate it if you could guide me.

=== RUN   TestKVRestore

{"level":"info","msg":"compact tree index","revision":1}

{"level":"info","msg":"compact tree index","revision":1}

{"level":"info","msg":"resume scheduled compaction","meta-bucket-name":"meta","meta-bucket-name-key":"scheduledCompactRev","scheduled-compact-revision":1}

{"level":"info","msg":"finished scheduled compaction","compact-revision":1,"took":"127.455µs"}

--- FAIL: TestKVRestore (0.22s)

    kv_test.go:662: #0: kvs history = [[{Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1} {Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1}] [] [] [] [] [{Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1} {Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1}] [] [] [] []], want [[{Key:[102 111 111] CreateRevision:2 ModRevision:4 Version:3 Value:[98 97 114 50] Lease:3} {Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1}] [] [{Key:[102 111 111] CreateRevision:2 ModRevision:2 Version:1 Value:[98 97 114 48] Lease:1}] [{Key:[102 111 111] CreateRevision:2 ModRevision:3 Version:2 Value:[98 97 114 49] Lease:2}] [{Key:[102 111 111] CreateRevision:2 ModRevision:4 Version:3 Value:[98 97 114 50] Lease:3}] [{Key:[102 111 111] CreateRevision:2 ModRevision:4 Version:3 Value:[98 97 114 50] Lease:3} {Key:[102 111 111 50] CreateRevision:5 ModRevision:5 Version:1 Value:[98 97 114 48] Lease:1}] [] [] [] []]

    kv_test.go:662: #1: kvs history = [[{Key:[102 111 111] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 49] Lease:2}] [] [] [] [{Key:[102 111 111] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 49] Lease:2}] [] [] [] [] []], want [[{Key:[102 111 111] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 49] Lease:2}] [] [{Key:[102 111 111] CreateRevision:2 ModRevision:2 Version:1 Value:[98 97 114 48] Lease:1}] [] [{Key:[102 111 111] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 49] Lease:2}] [] [] [] [] []]

    kv_test.go:662: #2: kvs history = [[{Key:[102 111 111] CreateRevision:2 ModRevision:3 Version:2 Value:[98 97 114 49] Lease:2}] [] [] [{Key:[102 111 111] CreateRevision:2 ModRevision:3 Version:2 Value:[98 97 114 49] Lease:2}] [] [] [] [] [] []], want [[{Key:[102 111 111] CreateRevision:2 ModRevision:3 Version:2 Value:[98 97 114 49] Lease:2}] [] [{Key:[102 111 111] CreateRevision:2 ModRevision:2 Version:1 Value:[98 97 114 48] Lease:1}] [{Key:[102 111 111] CreateRevision:2 ModRevision:3 Version:2 Value:[98 97 114 49] Lease:2}] [] [] [] [] [] []]


=== RUN   TestKVSnapshot

2018-09-07 19:40:06.015515 I | mvcc/backend: creating snapshot using checkpoint in directory /tmp/etcd_backend_test566355646/database.checkpoint

--- FAIL: TestKVSnapshot (0.09s)

    kv_test.go:705: kvs = [{Key:[102 111 111 50] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 50] Lease:3} {Key:[102 111 111 50] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 50] Lease:3} {Key:[102 111 111 50] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 50] Lease:3}], want [{Key:[102 111 111] CreateRevision:2 ModRevision:2 Version:1 Value:[98 97 114] Lease:1} {Key:[102 111 111 49] CreateRevision:3 ModRevision:3 Version:1 Value:[98 97 114 49] Lease:2} {Key:[102 111 111 50] CreateRevision:4 ModRevision:4 Version:1 Value:[98 97 114 50] Lease:3}]


Thanks

Subba

Xiang Li

Sep 8, 2018, 11:59:30 AM
to Subba G, Sam Batschelet, etcd...@googlegroups.com
Boltdb does not do that. It also overwrites values after the transactions are committed. I think you have not implemented the snapshot-related interface correctly.
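
To illustrate the point that key history does not have to come from the storage engine: as far as I understand etcd's mvcc layer, each write is stored under a key derived from its revision rather than under the user key, so an engine that overwrites on update still retains every version. The sketch below is not etcd code. `overwriteStore`, `revKey`, `encode`, and `history` are hypothetical names, and the in-memory map stands in for an engine such as RocksDB or boltdb. The revisions and values mirror the `foo` entries in the failing test output above.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// overwriteStore mimics a plain key-value engine that keeps only the
// latest value for each key (hypothetical stand-in for a real engine).
type overwriteStore struct {
	data map[string][]byte
}

func newOverwriteStore() *overwriteStore {
	return &overwriteStore{data: make(map[string][]byte)}
}

// Put stores a copy of value under key, replacing any previous value.
func (s *overwriteStore) Put(key, value []byte) {
	s.data[string(key)] = append([]byte(nil), value...)
}

// Range returns the values whose keys fall in [start, end), in key order.
func (s *overwriteStore) Range(start, end []byte) [][]byte {
	keys := make([]string, 0, len(s.data))
	for k := range s.data {
		kb := []byte(k)
		if bytes.Compare(kb, start) >= 0 && bytes.Compare(kb, end) < 0 {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	out := make([][]byte, 0, len(keys))
	for _, k := range keys {
		out = append(out, s.data[k])
	}
	return out
}

// revKey encodes a revision as 8 big-endian bytes, so that byte order
// matches numeric order. The exact encoding is an assumption.
func revKey(rev uint64) []byte {
	b := make([]byte, 8)
	binary.BigEndian.PutUint64(b, rev)
	return b
}

// encode packs the user key and value into one record (hypothetical format).
func encode(userKey, value string) []byte {
	return []byte(userKey + "=" + value)
}

// history lists the records written at revisions in [from, to).
func history(s *overwriteStore, from, to uint64) []string {
	var out []string
	for _, v := range s.Range(revKey(from), revKey(to)) {
		out = append(out, string(v))
	}
	return out
}

func main() {
	s := newOverwriteStore()
	// Three updates to the same user key land under three distinct
	// engine keys, so nothing is overwritten.
	s.Put(revKey(2), encode("foo", "bar0"))
	s.Put(revKey(3), encode("foo", "bar1"))
	s.Put(revKey(4), encode("foo", "bar2"))
	fmt.Println(history(s, 2, 5)) // prints [foo=bar0 foo=bar1 foo=bar2]
}
```

If a port keyed the engine by user key instead, only the last write would survive, which would be consistent with the repeated latest entries seen in the failing output.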

Subba G

Feb 16, 2019, 7:28:48 PM
to Xiang Li, Sam Batschelet, etcd-dev
I have completed the changes for running etcd with a RocksDB backend. However, I would like to make it easy to plug in any other DB engine, for example Badger.
If I submit this to the community, how should I structure the code so that it is modular and changes related to one engine do not break the others?

Thanks
Subba
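
One possible way to structure this, sketched under assumptions rather than taken from etcd: keep a small engine interface plus a registry, and let each engine register itself from its own package, in the style of `database/sql` drivers. `Engine`, `Factory`, `Register`, `Open`, and `memEngine` below are all hypothetical names. The in-memory engine stands in for RocksDB or Badger so that the example runs without external dependencies.

```go
package main

import (
	"fmt"
	"sync"
)

// Engine is a hypothetical minimal storage interface. A real etcd
// backend would also need transactions, range reads, and snapshots.
type Engine interface {
	Put(key, value []byte) error
	Get(key []byte) ([]byte, error)
	Close() error
}

// Factory opens an engine rooted at the given path.
type Factory func(path string) (Engine, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register makes an engine available under a name. Each engine package
// would call this from its own init function.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := factories[name]; dup {
		panic("engine registered twice: " + name)
	}
	factories[name] = f
}

// Open looks up a registered engine by name and opens it.
func Open(name, path string) (Engine, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("unknown engine %q", name)
	}
	return f(path)
}

// memEngine is an in-memory stand-in for a real engine.
type memEngine struct {
	data map[string][]byte
}

func (m *memEngine) Put(key, value []byte) error {
	m.data[string(key)] = append([]byte(nil), value...)
	return nil
}

func (m *memEngine) Get(key []byte) ([]byte, error) {
	v, ok := m.data[string(key)]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

func (m *memEngine) Close() error { return nil }

func init() {
	Register("mem", func(string) (Engine, error) {
		return &memEngine{data: map[string][]byte{}}, nil
	})
}

func main() {
	e, err := Open("mem", "")
	if err != nil {
		panic(err)
	}
	defer e.Close()
	if err := e.Put([]byte("foo"), []byte("bar")); err != nil {
		panic(err)
	}
	v, err := e.Get([]byte("foo"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", v) // prints bar
}
```

With this layout the core code depends only on the interface, and each engine package can be built and tested separately. Engines that need cgo bindings could be placed behind a build tag so that a default build does not require them.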
