create table foo^.cellstore.index (
Size,
CompressedSize,
KeyCount
);
For each column family, there would be one qualified column for each block in the CellStore indexes. The column qualifier would have the format: <filename>:<hex-offset>. Also, the row key would be the same as the row key in the CellStore index entries (we assume that's what most people will want to aggregate this info on). So for example, the CellStore index block entry for file 2/2/default/ZwmE_ShYJKgim-IL/cs103 at offset 0x28A61 might generate the following keys:
To query the cellstore.index pseudo-table for table foo to find an estimate of large rows, you would issue a query along the lines of the following:
SELECT sum(Size) FROM foo^.cellstore.index WHERE sum(Size) > 100000000;
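To make the intent of that query concrete, here is a minimal client-side sketch in Python of the aggregation it describes: sum Size over all index block entries that share a row key, then keep the rows whose total exceeds the threshold. The entries, file names, and the estimate_large_rows helper are hypothetical and made up for illustration; they are not part of the proposal or of any Hypertable API.

```python
from collections import defaultdict

# Hypothetical index entries: (row key, column qualifier, Size in bytes).
# Qualifiers follow the <filename>:<hex-offset> idea; the exact rendering
# of the offset is an assumption, and all values are invented.
entries = [
    ("apple", "file_a:0", 60_000_000),
    ("apple", "file_a:10000", 50_000_000),
    ("banana", "file_a:20000", 65_536),
]

def estimate_large_rows(entries, threshold):
    """Sum Size per row key and return the rows whose total exceeds threshold."""
    totals = defaultdict(int)
    for row_key, _qualifier, size in entries:
        totals[row_key] += size
    return {row: total for row, total in totals.items() if total > threshold}

large = estimate_large_rows(entries, 100_000_000)
print(large)
```

Because each index entry describes a whole block, the totals are only estimates of row size, which matches the "estimate of large rows" wording above.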
Please respond with any feedback or questions. Thanks!
- Doug