mutator.addDeletion(k, cf) does not work


David Hawthorne

Jun 9, 2011, 8:24:46 PM
to hector-users
I'm having fun over here playing with hector 0.8. :)

code:

{
    String ks = "TEST";
    String cf = "TEST_CF";
    String key = "counter_super_test";

    // se and le are the string serializer and long serializer, respectively.

    Mutator<String> mutator = HFactory.createMutator(ks, se);

    mutator.addDeletion(key, cf);
    mutator.execute();

    //
    // this line throws
    // [junit] me.prettyprint.hector.api.exceptions.HInvalidRequestException:
    //     InvalidRequestException(
    //         why:invalid operation for commutative columnfamily TEST_CF
    //     )
    //
    // mutator.superDelete(key, cf, 1234567900L, le);
    //

    List<HCounterSuperColumn<Long, String>> counterSuperColumns = Arrays.asList(
        HFactory.createCounterSuperColumn(new Long(1234567900),
            Arrays.asList(
                HFactory.createCounterColumn("1", 50L, se),
                HFactory.createCounterColumn("2", 10L, se)
            ),
            le, se
        ),
        HFactory.createCounterSuperColumn(new Long(1234568200),
            Arrays.asList(
                HFactory.createCounterColumn("2", 10L, se),
                HFactory.createCounterColumn("3", 20L, se)
            ),
            le, se
        )
    );

    for (HCounterSuperColumn sc : counterSuperColumns)
    {
        mutator.addCounter(key, cf, sc);
    }

    mutator.execute();
}

After 7 runs, the value at ks: TEST, cf: TEST_CF, key: counter_super_test, supercolumn: 1234567900, subcolumn: "1" is 350. There's no exception from the addDeletion or the mutator.execute() call right beneath it; it's just silently not doing anything.

In the cassandra system.log, I see the following output, but not from the addDeletion execution. It's from the next execution, which is itself only partially succeeding: the value of the first counter I'm fetching for comparison is always increasing. The mutator.execute() calls aren't throwing an exception.


ERROR [ReplicateOnWriteStage:44] 2011-06-09 16:57:29,912 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[ReplicateOnWriteStage:44,5,main]
java.lang.RuntimeException: java.lang.IllegalArgumentException: ColumnFamily ColumnFamily(TEST_CF -deleted at 1307663849899- [SuperColumn(1234568200 [32:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 70} *]@1307663849909!-9223372036854775808,33:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 140} *]@1307663849909!-9223372036854775808,]),]) already has modifications in this mutation: ColumnFamily(TEST_CF -deleted at 1307663849899- [SuperColumn(1234567900 [31:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 350} *]@1307663849909!-9223372036854775808,32:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 70} *]@1307663849909!-9223372036854775808,]),])
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
	at java.lang.Thread.run(Thread.java:680)
Caused by: java.lang.IllegalArgumentException: ColumnFamily ColumnFamily(TEST_CF -deleted at 1307663849899- [SuperColumn(1234568200 [32:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 70} *]@1307663849909!-9223372036854775808,33:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 140} *]@1307663849909!-9223372036854775808,]),]) already has modifications in this mutation: ColumnFamily(TEST_CF -deleted at 1307663849899- [SuperColumn(1234567900 [31:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 350} *]@1307663849909!-9223372036854775808,32:false:[{445c4e90-92de-11e0-0000-242d50cf1f97, 7, 70} *]@1307663849909!-9223372036854775808,]),])
	at org.apache.cassandra.db.RowMutation.add(RowMutation.java:117)
	at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:115)
	at org.apache.cassandra.service.StorageProxy$5$1.runMayThrow(StorageProxy.java:455)
	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)

Nate McCall

Jun 9, 2011, 8:50:52 PM
to hector...@googlegroups.com
You can't put more than one mutation for the same column-family/key/column-name combination into a batch mutation. (If asked this out of the blue, I would have answered incorrectly, btw.) This is not just a limitation with counters, but with any mutation.
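
A minimal sketch of what that means in practice, reusing the mutator, key, cf and se from the snippet above (illustrative only, not a definitive Hector pattern): if two updates would target the same column family/key/column name, flush the batch in between rather than queueing both.

// Two updates to the same cf/key/column name cannot share one batch,
// so flush the mutator in between instead of queueing both.
mutator.addCounter(key, cf, HFactory.createCounterColumn("1", 50L, se));
mutator.execute();   // first mutation goes out on its own

mutator.addCounter(key, cf, HFactory.createCounterColumn("1", 25L, se));
mutator.execute();   // second mutation for the same column, in a separate batch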

David Hawthorne

Jun 9, 2011, 11:39:10 PM
to hector-users
Cool, thanks. What should I do about the addDeletion? I don't think
it would be affected by the behavior you mentioned, since execute() is
called immediately afterwards. Or do I need to create a separate
Mutator object for each call to execute()?

David Hawthorne

Jun 10, 2011, 12:21:32 AM
to hector-users
On second thought, that implies that for every subcolumn in a supercolumn that you want to update, you must make a separate call to addCounter() and execute(). Am I understanding you correctly? I hope not; that seems like a nasty limitation when you're working with large-ish supercolumns. Is there a way to batch-execute multiple counter mutations in the same supercolumn?


Nate McCall

Jun 10, 2011, 12:03:50 PM
to hector...@googlegroups.com
The subcolumn adds a level, so the same keyspace/cf/super_col/sub_col combination cannot appear more than once in the same mutation.

See o.a.c.db.RowMutation in the Cassandra source tree for more details.
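
One way to avoid the "already has modifications in this mutation" error from the original post is to flush between the two super columns. A rough sketch against the loop above, reusing counterSuperColumns, mutator, key and cf (not verified against Hector 0.8):

// One execute() per super column, so no two mutations for the same
// keyspace/cf/key end up in a single batch.
for (HCounterSuperColumn sc : counterSuperColumns) {
    mutator.addCounter(key, cf, sc);
    mutator.execute();   // flush before queueing the next super column under this key
}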

David Hawthorne

Jun 10, 2011, 3:26:14 PM
to hector-users
I'm going to play around with the batch mutation stuff a little, but
in the meantime I have some more information about the addDeletion
problem.

Before calling addDeletion(key, cf), if I list TEST_CF in the cli, it
shows the one row and the two super columns with two subcolumns each,
as expected. After calling execute(), the row still shows up (not
expected), but the super columns within it are gone. It looks like
it's behaving as if it's truncating the row.

I'm not entirely sure at this point what I might have to do to make
sure cassandra really deletes those supercolumns internally, so I used
nodetool and flushed/compacted/scrubbed the TEST keyspace and the
TEST_CF cf.

At this point, list TEST_CF still shows the row. I waited 10 hours and then re-ran the tests, hoping that the first counter inserted (TEST/TEST_CF/counter_super_test/1234567900/"1") would equal 50 (incrementing +50 from non-existence). That turned out not to be the case. All the counter values are 4x what they should be, which makes sense, since I had already run the tests 3 times before this one and haven't deleted the keyspace from the cli to force the issue.

[default@TEST] list TEST_CF;
Using default limit of 100
-------------------
RowKey: counter_super_test
=> (super_column=1234567900,
(counter=1, value=200)
(counter=2, value=40))
=> (super_column=1234568200,
(counter=2, value=40)
(counter=3, value=80))

1 Row Returned.

Here are the lines from cassandra system.log from last night when I
was running nodetool commands against it:

INFO [RMI TCP Connection(22)-127.0.0.1] 2011-06-10 01:28:09,008 ColumnFamilyStore.java (line 1011) Enqueuing flush of Memtable-TEST_CF@967966535(720/900 serialized/live bytes, 15 ops)
INFO [FlushWriter:8] 2011-06-10 01:28:09,009 Memtable.java (line 237) Writing Memtable-TEST_CF@967966535(720/900 serialized/live bytes, 15 ops)
INFO [FlushWriter:8] 2011-06-10 01:28:09,016 Memtable.java (line 254) Completed flushing /var/lib/cassandra/data/TEST/TEST_CF-g-1-Data.db (373 bytes)
ERROR [CompactionExecutor:21] 2011-06-10 01:28:22,329 CompactionManager.java (line 510) insufficient space to compact even the two smallest files, aborting
INFO [CompactionExecutor:22] 2011-06-10 01:28:43,479 CompactionManager.java (line 632) Scrubbing SSTableReader(path='/var/lib/cassandra/data/TEST/TEST_CF-g-1-Data.db')
INFO [CompactionExecutor:22] 2011-06-10 01:28:43,545 CompactionManager.java (line 773) Scrub of SSTableReader(path='/var/lib/cassandra/data/TEST/TEST_CF-g-1-Data.db') complete: 1 rows in new sstable and 0 empty (tombstoned) rows dropped
INFO [NonPeriodicTasks:1] 2011-06-10 01:28:53,529 SSTable.java (line 159) Deleted /var/lib/cassandra/data/system/Schema-g-16

I saw that the "insufficient space to compact" message is a known bug in the 0.8 release. I did a quick poke and changed min_compaction_threshold as suggested, but it had no effect (same message).

Nate McCall

Jun 10, 2011, 4:07:50 PM
to hector...@googlegroups.com
My understanding of the compaction bug is that it is keeping the CF
from compacting until you have more than min_compaction_threshold
SSTables. Because of this, you still have range ghosts of the deleted
row. See http://wiki.apache.org/cassandra/FAQ#range_ghosts for more
information on this.

David Hawthorne

Jun 10, 2011, 6:25:56 PM
to hector-users
That explains why the row still shows up, even though it's empty,
which is good to know. I'm still a little confused as to why the
previous columns get reinstantiated with their previous values as a
starting point for the next increment, though. I read through some of
the FAQ on tombstoning and GCGraceSeconds, but it seems like a lot of
it wouldn't apply to a one-host test cluster.

This is looking more like a cassandra thing than a hector thing, but
I'm not sure if this is expected behavior or not, or if I just need to
configure something differently to "yes, really, delete it already,
like now would be just *great*, thaaaaanks".

Think I should send this over to the cassandra users mailing list and
see what they say?


Nate McCall

Jun 10, 2011, 6:42:15 PM
to hector...@googlegroups.com
Perhaps I misread the above - you are saying that the following:
1. insert super counter column
2. delete super counter column
2a. verify deletion by getting back 'tombstone' in the CLI (this is
normal and a good thing for a distributed system to do)
3. insert same super counter column from #1

results in the original values from #1 being added to the values inserted in #3?

Also, were all counter mutations done via Hector, or some via the CLI?
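
For reference, the sequence in the numbered list above, as a rough Hector sketch (reusing mutator, key, cf, se and le from the original post; the super column name and values are just illustrative):

// 1. insert a counter sub column inside a super column
mutator.addCounter(key, cf, HFactory.createCounterSuperColumn(
        new Long(1234567900),
        Arrays.asList(HFactory.createCounterColumn("1", 50L, se)),
        le, se));
mutator.execute();

// 2. delete the row, which tombstones the super column
mutator.addDeletion(key, cf);
mutator.execute();

// 2a. `list TEST_CF;` in the CLI now shows the key with no columns (a range ghost)

// 3. insert the same super counter column again; the question is whether
//    the counter restarts from 0 or resumes from the pre-delete value
mutator.addCounter(key, cf, HFactory.createCounterSuperColumn(
        new Long(1234567900),
        Arrays.asList(HFactory.createCounterColumn("1", 50L, se)),
        le, se));
mutator.execute();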

David Hawthorne

Jun 10, 2011, 7:29:21 PM
to hector-users
That's exactly what I'm seeing. The values are all being inserted via
hector, never via the CLI. I'm only using the CLI to verify things
independently of hector.

Weird, eh?

Nate McCall

Jun 10, 2011, 7:32:47 PM
to hector...@googlegroups.com
Do open an issue then (there have been 'coming back to life' bugs once
or twice before):
https://issues.apache.org/jira/browse/CASSANDRA

Also, it would be interesting to see if you could reproduce this on the CLI.

Nate McCall

Jun 13, 2011, 12:44:45 PM
to hector...@googlegroups.com
Saw this go by on the cassandra-user list and wanted to call it out:
https://issues.apache.org/jira/browse/CASSANDRA-2101

Good explanation of why this may not be working as anticipated.
