Hi Sebastian,

No news from my side on this PR -- I've done a fair bit of testing with it, and I'm quite confident in its ability to handle the cases described in my earlier post here (and I haven't run into any real problems with it yet). I'm not actively working on it further at the moment, but from my perspective it's pretty much ready to go.

If there's an issue with getting it merged in and you want to test it, another option is to simply build it from my branch[1].

- Gabriel
{
"metrics": [
{
"tags": {
"type": [
"percent_inodes"
],
"type_instance": [
"used"
]
},
"name": "collectd.df",
"group_by": [
{
"name": "tag",
"tags": [
"host",
"plugin_instance"
]
}
]
}
],
"plugins": [],
"cache_time": 0,
"start_relative": {
"value": "1",
"unit": "hours"
}
}
Kairos-A (release 1.2.2)
  query without filters
    Query Time: 32,995 ms
    Sample Size: 3,814,889
    Data Points: 3,814,889
  query with filters
    Query Time: 5,979 ms
    Sample Size: 313,100
    Data Points: 313,100

Kairos-B (your branch)
  query without filters
    Query Time: 30,597 ms
    Sample Size: 3,758,180
    Data Points: 3,758,180
  query with filters
    Query Time: 6,376 ms
    Sample Size: 306,853
    Data Points: 306,853

Kairos-A (1.2.2)
  Query Time: 11,222 ms
  Sample Size: 1,205,747
  Data Points: 1,205,747

Kairos-B (your branch)
  Query Time: 6,683 ms
  Sample Size: 1,204,574
  Data Points: 1,204,574

Kairos-A (1.2.2)
  Query Time: 4,153 ms
  Sample Size: 1,440
  Data Points: 1,440

Kairos-B (your branch)
  Query Time: 165 ms
  Sample Size: 1,440
  Data Points: 1,440
write_cluster: {
  # name of the cluster as it shows up in client specific metrics
  name: "write_cluster"
  keyspace: "kairosdb"
  replication: "{'class': 'NetworkTopologyStrategy','us-east' : '3'}"
  cql_host_list: ["cas01", "cas02", "cas03"]

  # Set this if this kairosdb node connects to cassandra nodes in multiple datacenters.
  # Not setting this will select cassandra hosts using the RoundRobinPolicy, while
  # setting this will use DCAwareRoundRobinPolicy.
  #local_dc_name: "<local dc name>"

  # Control the required consistency for cassandra operations.
  # Available settings are cassandra version dependent:
  read_consistency_level: "ONE"
  write_consistency_level: "TWO"

  # The number of times to retry a request to C* in case of a failure.
  request_retry_count: 2

  connections_per_host: {
    local.core: 4
    local.max: 100
    remote.core: 4
    remote.max: 10
  }

  # If using cassandra 3.0 or later, consider increasing this value.
  max_requests_per_connection: {
    local: 128
    remote: 128
  }

  max_queue_size: 500

  # For cassandra authentication use the following:
  #auth.[prop name]=[prop value]
  # example:
  #auth.user_name=admin
  #auth.password=eat_me

  # Set this property to true to enable SSL connections to your C* cluster.
  # Follow the instructions found here:
  # http://docs.datastax.com/en/developer/java-driver/3.1/manual/ssl/
  # to create a keystore and pass the values into Kairos using the -D switches.
  use_ssl: false
}

read_cluster: [
  {
    name: "read_cluster"
    keyspace: "kairosdb"
    replication: "{'class': 'NetworkTopologyStrategy','us-east' : '3'}"
    cql_host_list: ["cas01", "cas02", "cas03"]
    #local_dc_name: "<local dc name>"
    read_consistency_level: "ONE"
    write_consistency_level: "TWO"

    connections_per_host: {
      local.core: 4
      local.max: 100
      remote.core: 4
      remote.max: 10
    }

    max_requests_per_connection: {
      local: 128
      remote: 128
    }

    max_queue_size: 500
    use_ssl: false

    # Start and end date are optional configuration parameters.
    # The start and end date set bounds on the data in this cluster;
    # queries that do not include this time range will not be sent
    # to this cluster.
    #start_time: "2001-07-04T12:08-0700"
    #end_time: "2001-07-04T12:08-0700"
  }
]
}
Thanks for the kind words, Brian :-)

I don't think I specifically read about the technique I used there, but I'm sure it was inspired by other things I've seen in the past.

- Gabriel
Hi Brian,

From what I recall, the main reasoning for this approach was performance. As you mentioned, it also saves a little on storage space. That space saving is paid back repeatedly, since this table generally has a much higher read load than write load: the overhead of parsing tag pairs from strings instead of a 32-bit integer, sending that data over the wire, keeping it in various caches and key indexes within Cassandra, and so on, could add up to a fairly significant impact over time.

On the other hand, storing only the hash of course makes debugging a bit more difficult, and it's additional logic to consider, so there's certainly also a case for just storing the plaintext tag pair.

- Gabriel
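To make the trade-off concrete, here's a minimal sketch of the idea in Java: hash a canonical serialization of the tag pairs down to a 32-bit value and store that instead of the full string. The names (TagPairHash, canonicalize, hash32) and the choice of FNV-1a are illustrative, not the actual KairosDB code:

import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch, not the actual KairosDB implementation: store a
// 32-bit hash of the tag pairs in the index row instead of the full
// "key=value;key=value" string.
public class TagPairHash {

    // Serialize tags in a canonical (sorted) order so the same tag set
    // always produces the same hash.
    static String canonicalize(Map<String, String> tags) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(tags).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sb.toString();
    }

    // FNV-1a, a simple 32-bit hash; any stable 32-bit hash would do.
    static int hash32(String s) {
        int h = 0x811c9dc5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    public static void main(String[] args) {
        Map<String, String> tags = Map.of("host", "web01", "type", "used");
        // 4 bytes on disk and on the wire, versus the full tag string.
        System.out.println(hash32(canonicalize(tags)));
    }
}

Given only the 32-bit value there's no way to recover the tag pairs, which is exactly the debugging cost mentioned above.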
Hey Brian,

I've started work on this (I'll be working on https://github.com/kairosdb/kairosdb/compare/develop...metricly:feature/index-backfill). A couple of questions come to mind as I start this work:

- Do you see this endpoint as sync or async? As I bubble up the index statements, I see that your add statements are all batched. Since these are reindex statements, is it worth performing them synchronously as a sort of rate-limiting mechanism? I'd prefer that, as I'll be backfilling a lot of data and want to push KDB as fast as it can go without overloading it; otherwise my backfill scripts outside of KDB will need to rate-limit for KDB. If performance would be much worse, though, then I should still batch them.
- In my endpoint I expect to run a normal query for the data (turn a metric name and time period into a Kairos query object and call all the normal methods), then feed those datapoints directly into the createIndexStatements method as if they were new datapoints. This won't rewrite the datapoints and will only insert the necessary index rows (roughly the flow sketched below). Does this seem reasonable to you?

Thanks,
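For illustration, that second point might look roughly like the sketch below. None of these types or method names are the real KairosDB/branch API (createIndexStatements exists in the branch, but its signature here is a guess); the sketch only shows the shape: query existing datapoints, then build and synchronously execute only the index statements.

import java.util.List;
import java.util.Map;

// Hypothetical sketch of the backfill flow described above.
public class IndexBackfillSketch {

    interface IndexWriter {
        // Stands in for the branch's createIndexStatements(...).
        List<String> createIndexStatements(String metric, Map<String, String> tags);
        void executeSync(String statement);
    }

    interface Datastore {
        // Stands in for the normal query path: metric + time range -> rows.
        Iterable<Map<String, String>> queryTags(String metric, long start, long end);
    }

    static void backfill(Datastore ds, IndexWriter writer,
                         String metric, long start, long end) {
        for (Map<String, String> tags : ds.queryTags(metric, start, end)) {
            // Feed existing datapoints back through the index builder as if
            // they were new; the datapoints themselves are not rewritten.
            for (String stmt : writer.createIndexStatements(metric, tags)) {
                // Synchronous execution doubles as rate limiting: the
                // backfill can only go as fast as Cassandra acknowledges.
                writer.executeSync(stmt);
            }
        }
    }
}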
On Sun, Mar 10, 2019 at 12:55 PM Brian Conn <> wrote:
Sounds good, I'll give it a shot this week.
--- Brian Conn
Ok, I've made a little progress and have more questions.

- I think this is what you meant by modeling the API off of queryMetricTags: https://github.com/kairosdb/kairosdb/compare/develop...metricly:feature/index-backfill#diff-5cc760fcd7185d42167a9ce3ff854de4R405
- When indexing a small number of samples, the batch size gets enormous. I assume I'm creating duplicate statements. How could I deduplicate them to make this a more reasonable batch size?
- Are you thinking query params or a POST body for the endpoint? https://github.com/kairosdb/kairosdb/compare/develop...metricly:feature/index-backfill#diff-606a7a359f8892fae94d43380a4ed07fR345
- Anything else I'm missing, or is this the bones of it?
Thanks for the quick feedback, I'm glad I'm on the right track. Sounds good on query params.

I was getting a huge number of insert statements. I started batch-submitting when I hit a few thousand statements, but it still seemed like far too many, and my test data only had a few unique tag values. I think it's because I call createIndexStatements on every single resulting row: https://github.com/kairosdb/kairosdb/compare/develop...metricly:feature/index-backfill#diff-5cc760fcd7185d42167a9ce3ff854de4R414. I'm not sure I understand what you're suggesting -- don't I need to create insert statements for every matching row?

Lastly, would you prefer I open a PR and continue this conversation there?

Thanks again,
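For reference, one way to keep the batch size sane would be to deduplicate before batching: remember which index row keys have already been generated and only emit an insert the first time each one appears. A self-contained sketch follows; the (metric, tag key, tag value) row-key layout is made up for illustration, and the branch's actual layout and statements (CQL inserts) may differ.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative sketch of deduplicating index inserts during a backfill.
public class IndexDedup {

    static List<String> dedupedIndexKeys(String metric,
                                         Iterable<Map<String, String>> rows) {
        Set<String> seen = new HashSet<>();
        List<String> toInsert = new ArrayList<>();
        for (Map<String, String> tags : rows) {
            for (Map.Entry<String, String> e : new TreeMap<>(tags).entrySet()) {
                String rowKey = metric + '\0' + e.getKey() + '\0' + e.getValue();
                // Only emit an insert the first time a row key is seen;
                // every later datapoint with the same tag pair is a
                // duplicate as far as the index is concerned.
                if (seen.add(rowKey)) {
                    toInsert.add(rowKey);
                }
            }
        }
        return toInsert;
    }
}

With only a few unique tag values, this collapses thousands of per-row statements down to a handful.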