Re: Bigtable question - CAS design, millisecond/microsecond granularity and filters

Solomon Duskis

Nov 27, 2018, 1:10:36 PM
to bastien...@nuwa.be, google-cloud-b...@googlegroups.com, Douglas Mcerlean, Sandy Ghai
+google-cloud-b...@googlegroups.com for wider visibility,
+some targeted Cloud Bigtable team mates.

We did discuss allowing microseconds.  Doug and Sandy, what do you think?

Solomon Duskis | Google Cloud clients | sdu...@google.com | 914-462-0531


On Tue, Nov 27, 2018 at 12:04 PM Bastien Duclaux <bastien...@nuwa.be> wrote:
Dear Solomon

I’m working on a large project using Bigtable, and I’ve been reading many of your answers about Bigtable on Stack Overflow.
Please forgive the direct email, but I could not find answers to my questions in the documentation, on GitHub, or on SO, so I was hoping you could help!

My backend is in C++, but that does not matter.
I’m implementing a method to safely modify a row in a concurrent environment with many threads: basically check-and-set (CAS) optimistic locking based on the latest cell timestamp.
The modifications I need are not supported by the ReadModifyWrite API, as I have to append to and then sort the contents of a cell.
The modified cell can be up to 1 MB (it stores a large vector of int64 ids).

My idea is to call CheckAndMutateRow with a predicate that is true only if the current cell version matches the previously fetched cell, in order to avoid a race condition.
The pseudocode looks like this:


while (not ok)
{
    row = get_last_cell(row_key, family, col)
    timestamp = row.timestamp()

    modified_row = change_row_contents(row.value())

    predicate = FilterChain(
        ColumnRangeClosed(family, col, col),
        Latest(1),
        TimestampRangeMicros(timestamp, timestamp + 1)  // The range is [start, end), so this only matches if the latest cell still has exactly this timestamp
    )

    ok = CheckAndMutateRow(
        row_key,
        predicate,
        { MutateRow(modified_row) },
        {}  // Do nothing if the row has changed, to avoid a race condition
    )
}


I was hoping to use the timestamp in microseconds, but I received this error from the API:

Error in field 'predicate filter' : Error in field 'chained filter list' : Error in element #2 : Error in field 'timestamp_range_filter' : Timestamp granularity mismatch. Expected a multiple of 1000 (millisecond granularity), but got 1543334700819999.


So it seems the API cannot use microsecond-level resolution!

This is unfortunate: a lot can happen in a millisecond on a hot key, so a CAS operation based on a millisecond-level predicate seems too risky to me, given how many workers and threads I run.

So I have two questions:

- Why not remove the millisecond restriction in the API and allow full microsecond ranges, as the Bigtable backend does? Was there a reason the public API was designed this way initially?

- What predicate would you suggest for checking that the cell is truly the latest one during the CheckAndMutateRow call?

I could use a ValueRegexFilter with the contents of the cell, but that means sending the cell contents back to the Bigtable server, which is terrible for performance given the size of my cells.

Another approach would be to store, in a dedicated cell, an int64 value that is auto-incremented on every CheckAndMutateRow update, and to use that cell in the predicate. This avoids sending the full contents of the large cell and refers only to the small counter cell, so it should be fast, and even safer than a microsecond timestamp.

Your ideas would be very much appreciated, as you’re an expert in these topics :-)

Thanks a lot,

Bastien
