I am seeing lots of errors when trying to GET/PUT/POST connector configurations. About 50% of the time I get connection refused.
e.g., from Ansible:
failed: [10.0.136.98] => {"failed": true}
msg: Socket error: [Errno 111] Connection refused to http://connect-elasticsearch-indexer.service.consul:31099/connectors/elasticsearch-indexer/config
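For anyone trying to reproduce: the calls are plain REST, and since the refusals are intermittent, a blunt retry loop papers over them for now. A minimal Python sketch of the PUT (host/port and connector name are from my setup above; the retry parameters are arbitrary):

    import time
    import requests

    CONNECT_URL = "http://connect-elasticsearch-indexer.service.consul:31099"

    def put_connector_config(name, config, retries=5, backoff=2.0):
        """PUT a connector config, retrying on connection refused."""
        url = "%s/connectors/%s/config" % (CONNECT_URL, name)
        for attempt in range(retries):
            try:
                resp = requests.put(url, json=config, timeout=10)
                resp.raise_for_status()
                return resp.json()
            except requests.ConnectionError:
                # Roughly half the calls are refused outright; back off and retry.
                time.sleep(backoff * (attempt + 1))
        raise RuntimeError("connection refused on every attempt: " + url)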
And when I can successfully GET and then POST a new configuration, errors like the following show up in the logs:
ERROR Unexpected error during connector task reconfiguration:
ERROR Failed to reconfigure connector's tasks, retrying after backoff:
ERROR Request to leader to reconfigure connector tasks failed
ERROR Task reconfiguration for elasticsearch-indexer failed unexpectedly, this connector will not be properly reconfigured unless manually triggered.
ERROR IO error forwarding REST request:
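The "manually triggered" part of that fourth error seems, as far as I can tell, to mean the connector restart endpoint. For reference, triggering it by hand would look something like this (a sketch, assuming a Connect build that exposes POST /connectors/<name>/restart; on builds without it, deleting and re-creating the connector appears to be the only option):

    import requests

    CONNECT_URL = "http://connect-elasticsearch-indexer.service.consul:31099"

    def restart_connector(name):
        """Ask the worker to restart the connector, which should force a
        fresh round of task reconfiguration."""
        resp = requests.post("%s/connectors/%s/restart" % (CONNECT_URL, name))
        resp.raise_for_status()

    restart_connector("elasticsearch-indexer")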
In this scenario I am running two processes (via Marathon/Mesos) on two different slaves.
Other strange behaviors:
If after the above I manually GET the configuration via /connectors/elasticsearch-indexer/config, the result shows all the topics I PUT above -- this is true for both processes. But when I look at the tasks via /connectors/elasticsearch-indexer/tasks, each task is still using the previous set of topics, without reporting any errors (see the sketch below for how I compare the two).
After restarting the process, the tasks have the full set of topics.
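To check whether the tasks actually picked up the new topic list, I compare the two endpoints. A minimal sketch (it assumes each task's config carries a comma-separated "topics" field, which is what the tasks endpoint shows for this sink connector):

    import requests

    CONNECT_URL = "http://connect-elasticsearch-indexer.service.consul:31099"

    def stale_tasks(name):
        """Return the ids of tasks whose topic list differs from the
        connector's current configuration."""
        base = "%s/connectors/%s" % (CONNECT_URL, name)
        config = requests.get(base + "/config").json()
        expected = set(t.strip() for t in config.get("topics", "").split(","))
        stale = []
        for task in requests.get(base + "/tasks").json():
            actual = set(t.strip() for t in task["config"].get("topics", "").split(","))
            if actual != expected:
                stale.append(task["id"])
        return stale

    print(stale_tasks("elasticsearch-indexer"))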
Logging is very flaky. If there is any kind of load, or if any error is ever emitted, logs no longer get written. The tasks continue to do work, and log statements continue to be called, but no more output appears. It's hard to imagine what could cause this, as slf4j/logback is usually rock solid. It seems to only happen with a FileAppender; when I use a ConsoleAppender I don't see it. The kafka-connect processes are running in Docker containers, writing the logs to a mounted volume. But this is how all our processes work, and we've never seen logging just stop.