Using lightweight transactions (CAS) for Insert

741 views
Skip to first unread message

Li Wang

unread,
Nov 7, 2013, 9:11:15 PM11/7/13
to java-dri...@lists.datastax.com
Hi,

I need to do a conditional insert. Is "IF NOT EXIST" the only condition we can check for the INSERT statement, or can we do something more powerful?

All the samples I have seen for INSERT use the "IF NOT EXIST" condition.

INSERT INTO users (login, email, name, login_time)
VALUES ('jdoe', 'jd...@abc.com', 'Jane Doe', )
IF NOT EXIST

Thanks,

Li

Michael Dykman

unread,
Nov 7, 2013, 9:22:18 PM11/7/13
to java-dri...@lists.datastax.com

You can always declare a unique index on any fields you like, then do

     insert ...  on duplicate key ignore

To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.

Blair Zajac

unread,
Nov 7, 2013, 9:37:29 PM11/7/13
to java-dri...@lists.datastax.com

Robert Mullen

unread,
Nov 8, 2013, 10:31:49 AM11/8/13
to java-dri...@lists.datastax.com
Not sure how relevant this information is to your particular situation but it might help so I thought I'd share.  If you are inserting a new record for a duplicate primary key Cassandra does not enforce unique-ness. Inserting a duplicate row key will upsert the columns contained in the insert statement rather than return a unique constraint violation.

Because of this we don't even bother with the "IF NOT EXISTS" in our inserts.

Alex Popescu

unread,
Nov 8, 2013, 11:33:53 AM11/8/13
to java-dri...@lists.datastax.com
On Friday, November 8, 2013, Robert Mullen wrote:
Not sure how relevant this information is to your particular situation but it might help so I thought I'd share.  If you are inserting a new record for a duplicate primary key Cassandra does not enforce unique-ness. Inserting a duplicate row key will upsert the columns contained in the insert statement rather than return a unique constraint violation.

Because of this we don't even bother with the "IF NOT EXISTS" in our inserts.


I might be misreading your comment but the behavior of IF NOT EXIST is the following:

- if the primary key already exists, the INSERT will return False and the existing row values
- if the primary key doesn't exist, the new row is inserted and the operation returns True

So basically it helps you ensure you are not overridding some existing values.
 

On Thursday, November 7, 2013 7:11:15 PM UTC-7, Li Wang wrote:
Hi,

I need to do a conditional insert. Is "IF NOT EXIST" the only condition we can check for the INSERT statement, or can we do something more powerful?

All the samples I have seen for INSERT use the "IF NOT EXIST" condition.

INSERT INTO users (login, email, name, login_time)
VALUES ('jdoe', 'jd...@abc.com', 'Jane Doe', )
IF NOT EXIST

Thanks,

Li

To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.


--

:- a)


Alex Popescu
Sen. Product Manager @ DataStax
@al3xandru

Mullen, Robert

unread,
Nov 8, 2013, 11:44:14 AM11/8/13
to java-dri...@lists.datastax.com
So that's the difference, without the "IF NOT EXISTS" the INSERT will be an upsert and override the original values, with it, it will preserve the original values.  This is all good to know. 
Thanks,
Rob

Li Wang

unread,
Nov 8, 2013, 2:20:33 PM11/8/13
to java-dri...@lists.datastax.com
Thanks for all the replies.  I agreed with Rob. There is a difference between using "IF NOT EXISTS" or not for the INSERT.

In my specific scenario,  I have a table defined as the following. The typical query I have is to find all the events since a particular time. To handle the potential clock skew issue within the cluster, when inserting a new event, I need to make sure the event_time is always AFTER (greater than) the max event_time of the given user.  I wonder if I can achieve this by using CAS transactions provided in 2.0.  I checked the https://github.com/riptano/cassandra-dtest/blob/master/cql_tests.py#L3064, and only found the sample of using "IF NOT EXISTS", which is not what I am looking for.


 CREATE TABLE user_events (
    username varchar,
    event_time timeuuid,
    event_name  varchar,
    event_data blob,
    PRIMARY KEY (username, event_time, event_name)
) WITH CLUSTERING ORDER BY (event_time DESC);


Thanks,

Li

Techy Teck

unread,
Nov 8, 2013, 2:50:07 PM11/8/13
to java-driver-user
As a part of this discussion, I tried using IF NOT EXISTS feature but it's not working for me -

If I execute the below query from CQL shell, it doesn't work for me at all... This is what I get -

cqlsh:pp> insert into test_new (employee_id, employee_name, value, last_modified_date) values ('1', 'e29',  'some_new_value', now()) if not exists
;
Bad Request: line 1:123 missing EOF at 'if'

Is there anything I am missing here as I am running using Cassandra CQL Shell? I am running Cassandra 1.2.3


Joaquin Casares

unread,
Nov 8, 2013, 3:00:30 PM11/8/13
to java-dri...@lists.datastax.com
Depending on how discrete your event_time is, IF NOT EXISTS may not work for you. You'd instead have to perform a read at a CL.QUORUM, if you wrote at a CL.QUORUM, then check if you should write. But by the time you issue the write, a newer timestamp may have already been inserted either way.

Do keep in mind, however, that reading before writing is also a Cassandra anti-pattern. If you can avoid having to do a read-before-write (IF NOT EXISTS also counts as a read-before-write) that'd be ideal. Doing so should also make your application more performant. You should probably consider the IF NOT EXISTS option for those rare cases when you require lightweight transactions and you can't make your application work without it.

Perhaps you can check upon read for cases where clock skew may have occurred and handle those cases then?


Techy Tech: This is a feature that's new to Cassandra 2.0: http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0


Cheers,

Joaquin Casares
DataStax
Software Engineer in Test



Reply all
Reply to author
Forward
0 new messages