User timestamp feature

Sandhya Sundaresan

Aug 2, 2021, 2:17:28 PM
to rocksdb
Hi, I see this feature was listed as "experimental" in 2019. Is that still the case? I don't see the option in WriteOptions or ReadOptions in the code yet.
As a related question, what is the best way to delete a range of keys based on a timestamp range? Prepending the timestamp to the key as a prefix is one option, but I understand it can be inefficient for point lookups.

Thanks
Sandhya

Yanqin Jin

Aug 2, 2021, 4:25:07 PM
to Sandhya Sundaresan, rocksdb
Hi Sandhya,

It's still experimental in the sense that:
  • it's currently supported in a subset of RocksDB features: Get, Put, WriteBatch (with only Put and Delete), Delete, Iterator, and basic garbage collection (not necessarily the most efficient implementation), and
  • new APIs and/or options may be added, and
  • (less likely) existing user-defined timestamp APIs and/or options are also subject to change in future RocksDB releases
It's available in both ReadOptions and WriteOptions. You can find them in options.h (https://github.com/facebook/rocksdb/blob/master/include/rocksdb/options.h#L1336). The Java binding is missing them though.

> Delete range of keys based on timestamp range
This is currently not supported.

One thing to note is that, for the same user key K, we currently have the following constraint:
suppose <K, ts1, seq1> and <K, ts2, seq2> are two versions of the key.
If ts1 < ts2, then seq1 <= seq2.

Therefore, care must be taken when writing keys to the db: for a given key, a later write should not carry a smaller timestamp than an earlier one.
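
To make the constraint concrete: sequence numbers grow with write order, so for a single key the timestamps passed by the application have to be non-decreasing from one write to the next. Below is a minimal sketch of an application-side check, assuming 64-bit integer timestamps. `TimestampOrderGuard` and `Admit` are hypothetical names for illustration and are not part of RocksDB.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical application-side helper; not part of RocksDB.
// Remembers the largest timestamp admitted so far for each user key.
class TimestampOrderGuard {
 public:
  // Returns true if writing `key` at `ts` keeps the timestamps for that key
  // non-decreasing, and records `ts` as the latest one. Returns false
  // otherwise and leaves the recorded state unchanged.
  bool Admit(const std::string& key, uint64_t ts) {
    auto it = latest_.find(key);
    if (it != latest_.end() && ts < it->second) {
      // An older timestamp would be paired with a newer sequence number.
      return false;
    }
    latest_[key] = ts;
    return true;
  }

 private:
  std::unordered_map<std::string, uint64_t> latest_;
};
```

An application could call `Admit` before each write and skip or reorder writes that return false. The map is in-memory only, so this sketch does not cover restarts or multiple writer processes.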

Yanqin



Sandhya Sundaresan

Aug 2, 2021, 4:39:02 PM
to Yanqin Jin, rocksdb
Hi Yanqin,
This is helpful, thank you so much!
Sandhya

Saikat Maitra

Feb 3, 2022, 4:21:30 PM
to rocksdb
Hi,

I noticed that in the latest release tag, 6.28.2, ReadOptions has a Java binding for the user-defined timestamp, but WriteOptions does not.

When I try ReadOptions with a timestamp, I get the following error locally:

Assertion failed: (timestamp_size_ == user_comparator_.timestamp_size()), function DBIter, file db/db_iter.cc, line 92.

What is the default timestamp size, and can I use the timestamp support in ReadOptions to get or iterate over keys?

Regards,
Saikat

Yanqin Jin

Feb 4, 2022, 10:56:22 AM
to Saikat Maitra, rocksdb
We will remove timestamp from WriteOptions starting from RocksDB 7.0, and that's why we didn't include it in WriteOptions in the Java binding as part of https://github.com/facebook/rocksdb/pull/9295, which will be merged soon. From 7.0, we will add new overloaded versions of Put, Delete, SingleDelete, etc. (Merge and DeleteRange are not implemented) that accept an additional timestamp argument.

There is no default timestamp size. The size of the timestamp for a column family is determined by the Comparator, which you pass in the options when you create the column family or open the db with the default column family.
DBIter supports timestamps. You can refer to db_with_timestamp_basic_test.cc for examples.
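
Since the comparator fixes the timestamp width, every timestamp handed to a read or write has to be exactly that many bytes; the assertion quoted above fires when the two sizes differ. Here is a minimal sketch of a fixed-width encoding, assuming a 64-bit timestamp stored as 8 bytes with the least significant byte first. The helper names, the 8-byte width, and the byte order are assumptions for illustration; they have to match whatever your comparator actually expects.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Assumed width: 8 bytes for a uint64_t timestamp. The real width is
// whatever the column family's comparator reports.
constexpr size_t kU64TsSize = sizeof(uint64_t);

// Encodes `ts` as exactly 8 bytes, least significant byte first.
std::string EncodeU64Ts(uint64_t ts) {
  std::string out(kU64TsSize, '\0');
  for (size_t i = 0; i < kU64TsSize; ++i) {
    out[i] = static_cast<char>((ts >> (8 * i)) & 0xff);
  }
  return out;
}

// Decodes a buffer produced by EncodeU64Ts into `*ts`.
// Returns false, leaving `*ts` untouched, if the length is not 8 bytes.
bool DecodeU64Ts(const std::string& buf, uint64_t* ts) {
  if (buf.size() != kU64TsSize) return false;
  uint64_t v = 0;
  for (size_t i = 0; i < kU64TsSize; ++i) {
    v |= static_cast<uint64_t>(static_cast<unsigned char>(buf[i])) << (8 * i);
  }
  *ts = v;
  return true;
}
```

The encoded string is what would be wrapped in a slice and passed as the timestamp; checking its length against the comparator's timestamp size before the call is a cheap way to catch a mismatch early.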



Saikat Maitra

Feb 4, 2022, 12:03:56 PM
to rocksdb
Hi Yanqin,

Thank you for your prompt response. Yes, I saw the PR changes for WriteOptions, but since the setTimestamp method was missing, I was thinking about adding it locally and opening a PR. I really appreciate the explanation of why it is missing from the WriteOptions Java binding. I also like the new API approach for 7.0, where Put, Delete, SingleDelete, etc. support the timestamp: it will be much easier to use, since we can pass it directly in read/write operations without setting the Read/Write Options first.

The comparator test is very helpful. I was thinking along similar lines: we can use a Long or an Instant for the timestamp and use the comparator to filter out keys.

My last question: as a user of the Java bindings, do you see a path forward for trying the user timestamp feature in the 6.28.x+ releases without timestamp support in WriteOptions?
 
Regards,
Saikat

Saikat Maitra

Feb 4, 2022, 4:11:07 PM
to rocksdb
Hi Yanqin,

Here is the PR that adds timestamp support to WriteOptions in the Java API, as mentioned in my previous message. My thinking is that if these changes are accepted, we should be able to use the user timestamp feature as is in 6.28.x+ versions and then adapt to the new Put, Delete, and SingleDelete APIs in the 7.0 release.


Regards,
Saikat

Yanqin Jin

Feb 4, 2022, 4:42:22 PM
to Saikat Maitra, rocksdb
Since we merge pull requests only to the main branch, from which WriteOptions::timestamp has already been removed, we should not add it back, given that it was contrary to our coding/design convention. We made this change because we are about to release RocksDB 7.0 and because the feature itself is experimental.
I would suggest starting from RocksDB 7.0. The Java bindings for the new APIs are also missing there, so it would be great if you could contribute them.
If you are stuck with a version below 7.0, then I am afraid you will have to patch your local branch with the proposed changes. WriteOptions should still have timestamp there; you just need to add the Java bindings.

Thanks,
Yanqin


Saikat Maitra

Feb 4, 2022, 5:29:28 PM
to rocksdb
Hi Yanqin,

Thank you for your quick response. I see the new Put, Delete and SingleDelete APIs with timestamp support in the PR below; I will look into whether I can add support for them in the Java bindings.


I will also try patching my local branch with the WriteOptions changes and see if that meets my use case until RocksDB 7.0 is released.

Happy Weekend.

Regards,
Saikat

Yanqin Jin

Feb 4, 2022, 5:39:05 PM
to Saikat Maitra, rocksdb
Sure thing. Let us know if you run into any issues.

Thanks,
Yanqin


farsee...@gmail.com

Jul 5, 2022, 12:23:14 AM
to rocksdb
Hi Yanqin,
I could not find this timestamp ordering constraint in the current main branch (v7.4) code base. Has this constraint been lifted since v7.0?

Yanqin Jin

Jul 5, 2022, 1:37:42 PM
to farsee...@gmail.com, rocksdb
Thanks for the interest. I don't think this constraint is enforced in the code (otherwise we need to perform a read for each Put/Delete/SingleDelete/etc to find out the latest ts and seq of the keys), but it's a contract between application and RocksDB. Although we may relax it in the future, we currently have not lifted it.

Yanqin
