Hi all,
I posted this on IRC the other day but am not sure whether my question
was seen.
Every so often Brooklyn fails to poll a Redis server's status. The root
error, raised when Brooklyn polls the entity's service.isUp sensor, is:
Caused by: java.lang.IllegalStateException: (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at brooklyn.util.internal.ssh.sshj.SshjTool.checkConnected(SshjTool.java:329) ~[brooklyn-core-0.7.0-SNAPSHOT.jar:na]
Brooklyn sets the entity's status to on-fire, but the problem is really
on Brooklyn's side: the machine and the Redis process are still running.
Several other entities share the same machine, so Brooklyn is making a
lot of SSH connections to it. The failure occurs fairly regularly in
deployments of my app. There are no policies on the entity.
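One way to confirm by hand that the machine and Redis are still reachable
is to run the same redis-cli ping over SSH from the Brooklyn host,
retrying a few times. This is only a minimal sketch, assuming a POSIX
shell: the retry helper, the attempt count and delay, and the USER@HOST
placeholder are illustrative assumptions, not anything Brooklyn provides.

```sh
#!/bin/sh
# Hypothetical helper (not part of Brooklyn): run a command up to
# $1 times, sleeping $2 seconds after each failed attempt.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while :; do
        # Succeed as soon as the wrapped command exits with status 0.
        if "$@"; then
            return 0
        fi
        # Stop once the attempt budget is used up.
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempt(s): $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example invocation (USER@HOST and RUN_DIR are placeholders to fill in):
# retry 3 5 ssh USER@HOST "cd $RUN_DIR && ./bin/redis-cli -p 6384 ping"
```

If the manual check succeeds on a later attempt while Brooklyn has already
marked the entity on-fire, that would point at a transient connection
problem rather than a dead Redis process.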
Can anybody suggest what is going wrong? I've included more output from
an instance of the error below. I can put more logs in a gist if useful.
Thanks,
Sam
2014-04-04 17:17:52,475 WARN Execution failed, invocation error for check-running RedisStoreImpl{id=aelKjh80}:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring Put(path=[/tmp/brooklyn-20140404-171728400-OTf6-check-running_RedisStoreImpl_i.sh 195]) (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected! (throwing)
2014-04-04 17:17:52,476 INFO STDIN of problem in Task[ssh: check-running RedisStoreImpl{id=aelKjh80} [Stream[stdin/179B], TRANSIENT, SUB-TASK, Wrapped[contextEntity:RedisStoreImpl{id=aelKjh80}]]; EhH9XZkf]:
export RUN_DIR="/home/users/brooklyn/brooklyn-managed-processes/apps/NMQDtBQA/entities/RedisStore_aelKjh80"
mkdir -p $RUN_DIR
cd $RUN_DIR
./bin/redis-cli -p 6384 ping > /dev/null
2014-04-04 17:17:52,482 WARN Read of RedisStoreImpl{id=aelKjh80}->Sensor: service.isUp (java.lang.Boolean) gave exception: brooklyn.util.exceptions.PropagatedRuntimeException: Execution failed, invocation error for check-running RedisStoreImpl{id=aelKjh80}:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring Put(path=[/tmp/brooklyn-20140404-171728400-OTf6-check-running_RedisStoreImpl_i.sh 195]) (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
2014-04-04 17:17:52,484 WARN Error executing DstJob:Task[service.isUp @ aelKjh80 <- FunctionPollConfig [TRANSIENT, Wrapped[contextEntity:RedisStoreImpl{id=aelKjh80}]]; NRJi502L] (scheduled job of Task[MfIjSMW6] - ); cancelling scheduled execution
brooklyn.util.exceptions.PropagatedRuntimeException:
    at brooklyn.util.exceptions.Exceptions.propagate(Exceptions.java:70) ~[brooklyn-utils-common-0.7.0-SNAPSHOT.jar:na]
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Execution failed, invocation error for check-running RedisStoreImpl{id=aelKjh80}:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring Put(path=[/tmp/brooklyn-20140404-171728400-OTf6-check-running_RedisStoreImpl_i.sh 195]) (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at java.util.concurrent.FutureTask.report(FutureTask.java:122) [na:1.7.0_51]
Caused by: java.lang.IllegalStateException: Execution failed, invocation error for check-running RedisStoreImpl{id=aelKjh80}:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring Put(path=[/tmp/brooklyn-20140404-171728400-OTf6-check-running_RedisStoreImpl_i.sh 195]) (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at brooklyn.entity.basic.lifecycle.ScriptHelper.logWithDetailsAndThrow(ScriptHelper.java:325) ~[brooklyn-software-base-0.7.0-SNAPSHOT.jar:na]
Caused by: brooklyn.util.internal.ssh.SshException:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring Put(path=[/tmp/brooklyn-20140404-171728400-OTf6-check-running_RedisStoreImpl_i.sh 195]) (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at brooklyn.util.internal.ssh.SshAbstractTool.propagate(SshAbstractTool.java:148) ~[brooklyn-core-0.7.0-SNAPSHOT.jar:na]
Caused by: brooklyn.util.internal.ssh.SshException:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) error acquiring SFTPClient() (attempt 1/1, in time 24.1s/2m); out of retries:
(broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at brooklyn.util.internal.ssh.SshAbstractTool.propagate(SshAbstractTool.java:148) ~[brooklyn-core-0.7.0-SNAPSHOT.jar:na]
Caused by: java.lang.IllegalStateException: (broo...@ec2-54-197-39-158.compute-1.amazonaws.com:22) ssh not connected!
    at brooklyn.util.internal.ssh.sshj.SshjTool.checkConnected(SshjTool.java:329) ~[brooklyn-core-0.7.0-SNAPSHOT.jar:na]