UPDATE ... IF EXISTS updates only the columns that are not set in the database

1,701 views
Skip to first unread message

tanya.m...@gmail.com

unread,
Aug 11, 2017, 6:03:44 AM8/11/17
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi all :)

I have an update query which is using IF EXISTS, but I've noticed that it updates only the columns that are not set in the db. 
I cannot share the original code, but I've made a simple test project to illustrate the problem. (The project is written in Scala)

I'm using the latest version of the driver: "com.datastax.cassandra" % "cassandra-driver-core" % "3.3.0"

I have a simple table: 

CREATE TABLE user_test (

id uuid,

firstname text,

lastname text,

randomfield text,

PRIMARY KEY (id)

)


This is my insert:
 def insertUser(id: UUID, firstName: Option[String], lastName: Option[String], randomField: Option[String]): ResultSetFuture = {
  val insertQuery: Insert = QueryBuilder.insertInto(keyspace, TABLE_NAME)
.using(ttl(86400))
.value("id", id)

if(firstName.isDefined) insertQuery.value("firstname", firstName.get)
if(lastName.isDefined) insertQuery.value("lastname", lastName.get)
if(randomField.isDefined) insertQuery.value("randomfield", randomField.get)

println(insertQuery)

session.executeAsync(insertQuery)

}

I have a test: 
cassandra.insertUser(id, Some("Scrappy"), Some("Doo"), None)

The query (as printed in console): INSERT INTO mykeyspace.user_test (id,firstname,lastname) VALUES (2918728c-bba7-433f-a063-6bb0a44850c2,'Scrappy','Doo') USING TTL 86400;

The row in the database:





This is my update: 
def updateUser(id: UUID, newFirstName: Option[String], newLastName: Option[String], randomField: Option[String]): ResultSetFuture = {
  val updateQuery = QueryBuilder.update(keyspace, TABLE_NAME)
.where(QueryBuilder.eq("id", id))

if(newFirstName.isDefined) updateQuery.`with`(QueryBuilder.set("firstname", newFirstName.get))
if(newLastName.isDefined) updateQuery.`with`(QueryBuilder.set("lastname", newLastName.get))
if(randomField.isDefined) updateQuery.`with`(QueryBuilder.set("randomfield", randomField.get))
updateQuery.ifExists()

println(updateQuery)
session.executeAsync(updateQuery)
}

The update test: 

cassandra.updateUser(id, Some("Scooby"), Some("Dooby Doo"), Some("this update found the id"))

The query as printed in the console:
UPDATE mykeyspace.user_test SET firstname='Scooby',lastname='Dooby Doo',randomfield='this update found the id' WHERE id=2918728c-bba7-433f-a063-6bb0a44850c2 IF EXISTS;


However, only the 'random' field is updated:




If I run the update query directly in the DevCenter or I connect to the db using cqlsh, then the whole row is updated.

If I remove this line >> updateQuery.ifExists() << from my code, then the whole row is updated.

Is this a bug or am I doing something wrong?

Cheers,
Tanya

Andrew Tolbert

unread,
Aug 11, 2017, 11:06:01 AM8/11/17
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Tanya,

I agree that this doesn't seem to be the right behavior.  Can you share what version of Cassandra / DSE that you are running?  Perhaps there is a known issue with that version.

Another thing that comes to mind is that changes in cassandra are applied at the cell (column) level and that cassandra uses timestamps and 'last write wins' semantics to decide what the most recent data is.

Is it possible that the queries ran in very close succession in a multi-node environment where it's possible that the clocks aren't in sync between your cassandra nodes?  For example, if the timestamp of your insert was 10 and the timestamp of your update was 9, the insert values would take precedence which is why you see 'Scrappy Doo' instead of 'Scooby Dooby Doo'.

If you still have the data you can verify this by executing the following query and comparing the writetimes.

cqlsh:mykeyspace> select id, firstname, writetime(firstname), lastname, writetime(lastname), randomfield, writetime(randomfield) from mykeyspace.user_test;


 id                                   | firstname | writetime(firstname) | lastname  | writetime(lastname) | randomfield              | writetime(randomfield)

--------------------------------------+-----------+----------------------+-----------+---------------------+--------------------------+------------------------

 2918728c-bba7-433f-a063-6bb0a44850c2 |    Scooby |     1502463404035000 | Dooby Doo |    1502463404035000 | this update found the id |       1502463404035000 


If the writetime of randomfield is less than the write times of firstname and lastname, that could indicate that it's some kind of timestamp assignment issue.  In my case the writetimes are all the same, which (very likely) means that all the writes came from the same statement.

Note that this wouldn't be a problem if your statements were executed from the same client/jvm process and your Cassandra version is 2.1 or greater, since the driver uses client timestamps by default instead of relying on the Cassandra nodes.

Thanks!
Andy

Tanya Moldovan

unread,
Aug 11, 2017, 5:22:49 PM8/11/17
to java-dri...@lists.datastax.com
Hey Andrew,

Thank you for the reply!

We are using Cassandra version 3.0.13 (DSE 5.0.9).

Indeed the write timestamps are different, the one of the 'randomfield' is smaller than the others.

Is it possible that the queries ran in very close succession in a multi-node environment where it's possible that the clocks aren't in sync between your cassandra nodes? 

Yes, this is exactly the case. The queries indeed ran in very close succession and we do have multiple nodes! Though I do not know how/if the clocks are being in sync. It might be possible that they aren't.

But there's one thing I don't understand. If the problem is caused by the fact that the clock's aren't in sync, then why the update without ifExists() works? Shouldn't it fail for simple update too?

Regards,
Tanya


Tatiana Moldovan
Developer
---
skype:   tanya.moldovan


--
You received this message because you are subscribed to the Google Groups "DataStax Java Driver for Apache Cassandra User Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-user+unsubscribe@lists.datastax.com.

Andrew Tolbert

unread,
Aug 11, 2017, 5:51:50 PM8/11/17
to DataStax Java Driver for Apache Cassandra User Mailing List
Hi Tatiana,

Yes, this is exactly the case. The queries indeed ran in very close succession and we do have multiple nodes! Though I do not know how/if the clocks are being in sync. It might be possible that they aren't.

Ah, that explains it, partially at least!  Another thing that I didn't think about at the time is that it usually not a good practice to mix light weight transactions (IF EXISTS, IF NOT EXISTS) with non-lightweight transactions.  I was just looking at the documentation and found this note:

Lightweight transactions will block other lightweight transactions from occurring, but will not stop normal read and write operations from occurring. Lightweight transactions use a timestamping mechanism different than for normal operations and mixing LWTs and normal operations can result in errors. If lightweight transactions are used to write to a row within a partition, only lightweight transactions for both read and write operations should be used. 
 
But there's one thing I don't understand. If the problem is caused by the fact that the clock's aren't in sync, then why the update without ifExists() works? Shouldn't it fail for simple update too?

The reason why it succeeds is because there is data there.  It is just really odd because the data that is being inserted has an older timestamp than the data that exists.

Thanks,
Andy
To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-us...@lists.datastax.com.

Tanya Moldovan

unread,
Aug 13, 2017, 11:21:37 AM8/13/17
to java-dri...@lists.datastax.com
Hey,

 Another thing that I didn't think about at the time is that it usually not a good practice to mix light weight transactions (IF EXISTS, IF NOT EXISTS) with non-lightweight transactions. 

Indeed, if I transform my insert query also in a lightweight transaction, then UPDATE ... IF EXISTS works fine. 
Now I understand what is happening!

Thank you for your help,
Have a nice weekend,
Tanya

Tatiana Moldovan
Developer
---
skype:   tanya.moldovan


To unsubscribe from this group and stop receiving emails from it, send an email to java-driver-user+unsubscribe@lists.datastax.com.

Reply all
Reply to author
Forward
0 new messages