TSDB compaction won't save space over TSDB appends; they're effectively equivalent. But the default config for OpenTSDB is to write an individual column per data point and enable compactions. Compactions and appends both save a ton of space over individual columns because:
1) Each column has an 8-byte timestamp in storage associated with the write time (or the data point time in OpenTSDB 2.4 with date-tiered HBase compactions).
2) During serialization, HBase returns the row key with every column, which is really inefficient for us.
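To put rough numbers on those two points, here is a back-of-the-envelope sketch based on the standard HBase KeyValue layout. The function names are made up for illustration, and the 13-byte row key (3-byte metric UID, 4-byte base time, one 6-byte tag pair), 2-byte qualifiers and 8-byte values are assumptions. It also ignores compression, data block encoding and any extra bytes a merged column may carry, so treat the result as an upper bound on the difference rather than what you'll see on disk.

```python
def keyvalue_size(row_len, family_len, qualifier_len, value_len):
    """Approximate uncompressed size in bytes of one HBase KeyValue (cell)."""
    return (
        4 + 4            # key length + value length prefixes
        + 2 + row_len    # row key length prefix + row key (repeated per cell)
        + 1 + family_len # column family length prefix + family name
        + qualifier_len
        + 8              # cell timestamp
        + 1              # key type (Put, Delete, ...)
        + value_len
    )


def row_sizes(points, row_len=13, family_len=1, qualifier_len=2, value_len=8):
    """Return (individual_columns_bytes, single_merged_column_bytes) for one row.

    Assumed defaults: 13-byte row key, 1-byte family, second-resolution
    qualifiers and 8-byte values.
    """
    individual = points * keyvalue_size(row_len, family_len, qualifier_len, value_len)
    # Compaction or append: one cell whose qualifier and value are the
    # concatenation of every data point's qualifier and value.
    merged = keyvalue_size(
        row_len, family_len, points * qualifier_len, points * value_len
    )
    return individual, merged


if __name__ == "__main__":
    # One data point every 10 seconds fills an hourly row with 360 points.
    print(row_sizes(360))
```

With these assumptions the per-cell overhead (row key, timestamp and length fields) is paid 360 times for individual columns but only once for the merged column.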
So the recommended configurations depend on priorities:
A) If you need to save as much space as possible but have lots of CPU and IO, try OpenTSDB's appends with compaction disabled. But watch the region servers' resources to make sure they aren't running out of IO.
B) If you need as much write throughput as possible, use OpenTSDB's default puts and disable compactions.
C) If you have low write throughput and enough TSDs and region servers, then try appends, or just use puts with compactions enabled.
Yahoo is working on a better append co-processor that gives us the space savings without the region server impact.
Yeah, I need to update that doc. The qualifier for data points was only two bytes when OpenTSDB only supported second timestamp resolution. When we added millisecond resolution, the qualifiers could be 2 or 4 bytes depending on whether the DP had second or millisecond resolution. (When we add nanoseconds it may be 6 or 8 bytes.)
So the offset and flag encoding is similar but the resolution is different.
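Here's a minimal sketch of the two qualifier layouts as I understand them from the OpenTSDB storage docs: seconds use a 12-bit offset plus 4 flag bits, and milliseconds use a 4-bit 0xF marker, a 22-bit offset, 2 reserved bits and the same 4 flag bits. The function and constant names are just for illustration, not the actual TSDB code, so check the bit layout against the source before relying on it.

```python
import struct

FLAG_BITS = 4            # low 4 bits: float flag + (value length - 1)
MS_FLAG_BITS = 6         # 2 reserved bits + 4 flag bits in the ms layout
MS_MARKER = 0xF0000000   # top 4 bits set marks a millisecond qualifier
ROW_SPAN_MS = 3600 * 1000  # each row covers one hour


def encode_qualifier(timestamp_ms, flags, millisecond=False):
    """Encode the column qualifier for one data point.

    timestamp_ms: data point time in milliseconds since the epoch.
    flags: 4-bit value, e.g. 0x7 for an 8-byte integer.
    """
    if not 0 <= flags <= 0xF:
        raise ValueError("flags must fit in 4 bits")
    # Offset from the start of the hourly row the point belongs to.
    offset_ms = timestamp_ms % ROW_SPAN_MS
    if millisecond:
        # 4 bytes: marker | 22-bit ms offset | 2 reserved bits | flags
        return struct.pack(">I", MS_MARKER | (offset_ms << MS_FLAG_BITS) | flags)
    # 2 bytes: 12-bit second offset | flags
    return struct.pack(">H", ((offset_ms // 1000) << FLAG_BITS) | flags)


if __name__ == "__main__":
    ts = 1356998400000 + 5250  # 5.25 s into the row starting 2013-01-01 00:00 UTC
    print(encode_qualifier(ts, 0x7).hex())                    # second resolution
    print(encode_qualifier(ts, 0x7, millisecond=True).hex())  # millisecond resolution
```

The flag bits sit in the same low-order position in both layouts; only the width and unit of the offset change.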
Hope that helps :)