Thanks Jan! That's reassuring that it can work for others, so I'll push ahead with trying to get it working for me.
I'm using the TimeWindowCompactionStrategy.java that ships with the mainline Cassandra 3.8 and later builds (I'm on 3.9). I doubt anything broke between the jeffjirsa version and the mainline build, but I'll look into that possibility. I've confirmed in the Cassandra debug.log that TWCS is running; here are some sample messages I've pulled out:
DEBUG [CompactionExecutor:70] 2017-02-07 04:34:49,439 TimeWindowCompactionStrategy.java:300 - No compaction necessary for bucket size 3 , key 1483200000, now 1483200000
DEBUG [CompactionExecutor:76] 2017-02-07 07:51:06,102 TimeWindowCompactionStrategy.java:111 - TWCS skipping check for fully expired SSTables
DEBUG [CompactionExecutor:76] 2017-02-07 07:51:06,102 TimeWindowCompactionStrategy.java:286 - Using STCS compaction for first window of bucket: data files...
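(From what I can tell, that last message is expected rather than a problem: TWCS deliberately uses STCS-style compaction within the newest, still-open window and only treats older windows as candidates for a per-window compaction. It's the bucket-key behavior further down that worries me.)

For completeness, here is the table definition: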
CREATE TABLE metricdb.data_points (
key blob,
column1 blob,
value blob,
PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE
AND CLUSTERING ORDER BY (column1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_size': '60', 'compaction_window_unit': 'MINUTES', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.1
AND speculative_retry = 'NONE';
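For reference, this is my rough paraphrase of how I read the window bucketing in the 3.9 TimeWindowCompactionStrategy.java (a simplified sketch, not the actual code): the SSTable's maximum cell timestamp, which Cassandra stores in microseconds, gets floored to the window size, and that floor value, in milliseconds, is the bucket "key" printed in the debug log.

import java.util.concurrent.TimeUnit;

public class TwcsWindowSketch
{
    // Simplified paraphrase of my reading of the 3.9 bucketing logic:
    // floor the SSTable's max cell timestamp (microseconds) to the
    // window size; the floor, in milliseconds, is the bucket key.
    static long windowKeyMillis(long maxTimestampMicros, int windowSizeMinutes)
    {
        long seconds = TimeUnit.MICROSECONDS.toSeconds(maxTimestampMicros);
        long windowSeconds = 60L * windowSizeMinutes;
        long lower = seconds - (seconds % windowSeconds); // floor to window start
        return TimeUnit.SECONDS.toMillis(lower);
    }

    public static void main(String[] args)
    {
        // A cell written 2017-02-07 07:41:00 UTC, stamped in the expected
        // epoch microseconds (1486453260000000), lands in the 07:00 bucket:
        System.out.println(windowKeyMillis(1486453260000000L, 60)); // 1486450800000
    }
}

So with hourly windows, the keys in the debug log should advance by 3600000 (one hour in milliseconds) as the day goes on.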
And here is what my data files look like. Note that my compaction_window_size is 60 minutes, so after running for a day I expected roughly one SSTable per hour-long window; instead the files are still being merged into STCS-style size tiers.
[root@itm3650c data_points-53f5ef70ecae11e6a39505e3aa874c40]# ls -laht *-Data.*
-rw-r--r-- 1 root root 23M Feb 7 07:41 mc-1396-big-Data.db
-rw-r--r-- 1 root root 25M Feb 7 07:40 mc-1395-big-Data.db
-rw-r--r-- 1 root root 66M Feb 7 07:40 mc-1394-big-Data.db
-rw-r--r-- 1 root root 750M Feb 7 07:37 mc-1379-big-Data.db
-rw-r--r-- 1 root root 65M Feb 7 07:33 mc-1385-big-Data.db
-rw-r--r-- 1 root root 758M Feb 7 06:05 mc-1242-big-Data.db
-rw-r--r-- 1 root root 3.0G Feb 7 05:01 mc-1117-big-Data.db
-rw-r--r-- 1 root root 3.0G Feb 6 22:49 mc-568-big-Data.db
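To sanity-check which window each file should belong to, my next step is to dump the max cell timestamps, e.g. `/opt/ibm/cassandra/tools/bin/sstablemetadata mc-1396-big-Data.db | grep -i timestamp` (assuming sstablemetadata lives under tools/bin on this install, as in the stock 3.9 tarball). If the maximum timestamps turn out not to be epoch microseconds, that would explain the bucketing.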
I think I've narrowed down my issue to this:
DEBUG [CompactionExecutor:2] 2017-02-06 15:55:28,283 TimeWindowCompactionStrategy.java:300 - No compaction necessary for bucket size 2 , key 1483200000, now 1483200000
...
DEBUG [CompactionExecutor:76] 2017-02-07 07:33:31,169 TimeWindowCompactionStrategy.java:300 - No compaction necessary for bucket size 3 , key 1483200000, now 1483200000
Whatever I'm doing, it isn't incrementing the key and now values: across more than fifteen hours of logs they both stay pinned at 1483200000. And as shown in the code snippet above, key < now is required to trigger the trimToThreshold(bucket, maxThreshold) call that would compact an already-closed window.
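One hypothesis I want to rule out, plugging my log values into the sketch above: if the cells were being stamped with epoch milliseconds instead of the epoch microseconds Cassandra and TWCS assume, the math lands exactly on my stuck key. This is just arithmetic on the numbers from my logs, not a confirmed diagnosis:

import java.util.concurrent.TimeUnit;

public class StuckKeySketch
{
    public static void main(String[] args)
    {
        // Hypothetical: a cell written 2017-02-07 07:41:00 UTC but stamped in
        // epoch milliseconds (1486453260000), then misread as microseconds.
        long misreadMicros = 1486453260000L;                           // really ms
        long seconds = TimeUnit.MICROSECONDS.toSeconds(misreadMicros); // 1486453 "s"
        long lower = seconds - (seconds % 3600L);                      // 1483200
        System.out.println(TimeUnit.SECONDS.toMillis(lower));          // 1483200000 <- my stuck key
        // A full real-world day only advances "seconds" by ~86, so the key
        // would stay pinned at 1483200000 for over a month of wall time.
    }
}

And KairosDB sits between my writers and Cassandra, so the timestamp resolution it uses on writes is exactly the kind of thing I need to check.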
I'll keep reading TimeWindowCompactionStrategy.java for clues, and try to confirm or rule out the timestamp-resolution theory above.
I'm inserting 0.5 million records per minute through KairosDB. I'm watching the compaction threads with `/opt/ibm/cassandra/bin/nodetool tpstats | grep CompactionExecutor` and they are not always busy, so resource usage is not the bottleneck here.