Hi all,
I am trying to set up an M3DB cluster on three physical nodes to store Prometheus data.
I installed the binaries on the nodes; the m3dbnode processes are running and
communicating with each other.
I tried to set up an unaggregated namespace using the "database/create" API:
curl -X POST http://localhost:7201/api/v1/database/create -d '{
  "type": "cluster",
  "namespaceName": "metrics_shortterm",
  "retentionTime": "744h",
  "numShards": "512",
  "replicationFactor": "3",
  "hosts": [
    {
      "id": "myhost01",
      "isolationGroup": "rack01",
      "zone": "embedded",
      "weight": 100,
      "address": "111.222.33.44",
      "port": 9000
    },
    {
      "id": "myhost02",
      "isolationGroup": "rack02",
      "zone": "embedded",
      "weight": 100,
      "address": "111.222.33.45",
      "port": 9000
    },
    {
      "id": "myhost03",
      "isolationGroup": "rack03",
      "zone": "embedded",
      "weight": 80,
      "address": "111.222.33.46",
      "port": 9000
    }
  ]
}'
In the output of "curl http://localhost:7201/api/v1/placement" I see many
shards in "INITIALIZING" state (even 24 hours later). I do see network
connections between the nodes on port 9000 (output of 'netstat'), but there is
no progress. During the database creation I find this in the logs:
Nov 13 16:33:56 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659236.796988,"msg":"initial topology / placement value received"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659246.9907064,"msg":"cluster tchannelthrift: listening","address":"0.0.0.0:9001"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659246.9920237,"msg":"cluster httpjson: listening","address":"0.0.0.0:9003"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659246.9921637,"msg":"cluster database initializing topology"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659246.992175,"msg":"cluster database resolving topology"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659246.9921849,"msg":"cluster database resolved topology"}
Nov 13 16:34:06 myhost01.mydomain.de m3dbnode[103829]: {"level":"warn","ts":1573659246.992204,"msg":"topology has no shard set for host ID","hostID":"myhost01.mydomain.de"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0082498,"msg":"creating namespaces watch"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0083137,"msg":"waiting for dynamic namespace registry initialization, if this takes a long time, make sure that a namespace is configured"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0100417,"msg":"initial namespace value received"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.010194,"msg":"resolving namespaces with namespace watch"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"warn","ts":1573659247.010264,"msg":"can not update namespace schema to empty","namespace":"metrics_shortterm","currentSchema"
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0103104,"msg":"updating database namespaces","adds":"[metrics_shortterm]","updates":"[]","removals":"[]"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0199006,"msg":"using registered interrupt handler"}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.020883,"msg":"bootstrap finished","namespace":"metrics_shortterm","duration":0.000017814}
Nov 13 16:34:07 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659247.0209155,"msg":"bootstrapped"}
Nov 13 16:34:37 myhost01.mydomain.de m3dbnode[103829]: {"level":"info","ts":1573659277.4726896,"msg":"successfully updated topology","numHosts":3}
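For reference, the shard states from the placement output can be tallied with a small script. This is only a minimal sketch: it assumes the placement response has the shape {"placement": {"instances": {<id>: {"shards": [{"id": ..., "state": ...}]}}}}, and the function names shard_states and host_in_placement are made up for illustration. The second helper checks whether a given host ID (for example the "hostID" value from the warning in the logs above) appears as an instance ID in the placement.

```python
from collections import Counter


def _instances(placement_response):
    # Assumed response shape: {"placement": {"instances": {id: {...}}}}
    return placement_response.get("placement", {}).get("instances", {})


def shard_states(placement_response):
    """Return {instance_id: Counter of shard states} for a parsed placement response."""
    return {
        instance_id: Counter(
            shard.get("state", "UNKNOWN") for shard in instance.get("shards", [])
        )
        for instance_id, instance in _instances(placement_response).items()
    }


def host_in_placement(placement_response, host_id):
    """Return True if host_id is one of the instance IDs in the placement."""
    return host_id in _instances(placement_response)
```

The input would be the parsed JSON from "curl http://localhost:7201/api/v1/placement", e.g. loaded with json.load().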
When I try to write a metric into the namespace (using "curl -sS -X POST
localhost:9003/writetagged -d ...") I get this error message:
{"error":{"message":"Error({Type:INTERNAL_ERROR Message:failed to meet consistency level majority with 0/3 success, 3 nodes responded, errors: [error writing to host myhost02: connection pool has no connections error writing to host myhost03: connection pool has no connections error writing to host hollow01: Error({Type:INTERNAL_ERROR Message:not responsible for shard 32})]})","data":{"type":"INTERNAL_ERROR","message":"failed to meet consistency level majority with 0/3 success, 3 nodes responded, errors: [error writing to host myhost02: connection pool has no connections error writing to host myhost03: connection pool has no connections error writing to host hollow01: Error({Type:INTERNAL_ERROR Message:not responsible for shard 32})]"}}}
Any idea how to find out what's wrong or missing? Or how to create the database correctly?
Additionally:
What attribute would I need to add if I want an aggregated namespace?
What would be the advantage of running a separate etcd on the nodes
(instead of using the embedded one)?
Thanks for any help...