harness-docker-compos
04:45:19.004 ERROR NetworkClient - Node [localhost:9200] failed (java.net.ConnectException: Connection refused (Connection refused)); no other nodes left - aborting...
04:45:19.011 ERROR URAlgorithm - Spark computation failed for engine test_ur with params {{"engineId":"test_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.driver.memory":"3g","spark.executor.memory":"1g","spark.serializer":"org.apache.spark.serializer.KryoSerializer","spark.kryo.registrator":"org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator","spark.kryo.referenceTracking":"false","spark.kryoserializer.buffer":"300m","spark.es.index.auto.create":"true","spark.es.nodes":"localhost","es.nodes":"localhost","spark.es.nodes.wan.only":"true","es.nodes.wan.only":"true"},"algorithm":{"indicators":[{"name":"purchase"},{"name":"view"},{"name":"category-pref"}],"num":4}}}
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only
"spark.es.nodes": "localhost","es.nodes":"localhost","spark.es.nodes.wan.only": "true","es.nodes.wan.only":"true"
yellow open test_ur_1587957559524 x0FasDlKQoiPzcsH7Y1lbw 1 1 0 0 283b 283b
yellow open test_ur_1587962717198 4GwAdFlrQLOr9IMqQJg-pA 1 1 0 0 283b 283b
yellow open test_ur_1587954743979 HI4adyyyRQm25gVBIf1wRw 1 1 0 0 283b 283b
yellow open test_ur_1587945057452 gWGq94QlRBm948zar3eTEg 1 1 0 0 283b 283b
--
You received this message because you are subscribed to the Google Groups "actionml-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to actionml-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/5118afa5-28cb-4a46-a6c6-dac3993b47f1%40googlegroups.com.
{
  "jobId": "f28cd967-90cd-4e4d-b800-9ac1d6e3fec6",
  "status": {
    "name": "successful"
  },
  "comment": "Spark job",
  "createdAt": "2020-04-27T04:44:56.985Z",
  "completedAt": "2020-04-27T04:45:19.030Z"
},
04:45:15.830 INFO URAlgorithm - RankRDDs[1]
popRank
(4,5.0)
(14,8.0)
(19,8.0)
(7,7.0)
(42,8.0)
(36,5.0)
(6,4.0)
(37,4.0)
(45,4.0)
(2,13.0)
(16,14.0)
(41,7.0)
04:45:17.191 INFO DAGScheduler - Job 19 finished: collect at URModel.scala:81, took 1.103642 s
04:45:17.198 INFO ElasticSearchClient$$anon$1 - Create new index: test_ur_1587962717198, List(popRank, category, purchase, id), Map(id -> (keyword,true), category-pref -> (keyword,true), category -> (keyword,true), purchase -> (keyword,true), popRank -> (float,false), view -> (keyword,true))
04:45:18.465 WARN RestClient - request [PUT http://elasticsearch:9200/test_ur_1587962717198?include_type_name=true] returned 1 warnings: [299 Elasticsearch-7.6.0-7f634e9f44834fbc12724506cc1da681b0c3b1e3 "[types removal] Using include_type_name in create index requests is deprecated. The parameter will be removed in the next major version."]
04:45:18.468 INFO ElasticSearchClient$$anon$1 - Number of ES connections for saveToEs: 1
04:45:18.995 INFO HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused (Connection refused)
04:45:18.996 INFO HttpMethodDirector - Retrying request
04:45:18.997 INFO HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused (Connection refused)
04:45:18.999 INFO HttpMethodDirector - Retrying request
04:45:19.000 INFO HttpMethodDirector - I/O exception (java.net.ConnectException) caught when processing request: Connection refused (Connection refused)
04:45:19.002 INFO HttpMethodDirector - Retrying request
04:45:19.004 ERROR NetworkClient - Node [localhost:9200] failed (java.net.ConnectException: Connection refused (Connection refused)); no other nodes left - aborting...
04:45:19.011 ERROR URAlgorithm - Spark computation failed for engine test_ur with params {{"engineId":"test_ur","engineFactory":"com.actionml.engines.ur.UREngine","sparkConf":{"master":"local","spark.driver.memory":"3g","spark.executor.memory":"1g","spark.serializer":"org.apache.spark.serializer.KryoSerializer","spark.kryo.registrator":"org.apache.mahout.sparkbindings.io.MahoutKryoRegistrator","spark.kryo.referenceTracking":"false","spark.kryoserializer.buffer":"300m","spark.es.index.auto.create":"true","spark.es.nodes":"localhost","es.nodes":"localhost","spark.es.nodes.wan.only":"true","es.nodes.wan.only":"true"},"algorithm":{"indicators":[{"name":"purchase"},{"name":"view"},{"name":"category-pref"}],"num":4}}}
org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: Cannot detect ES version - typically this happens if the network/Elasticsearch cluster is not accessible or when targeting a WAN/Cloud instance without the proper setting 'es.nodes.wan.only'
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:340)
    at org.elasticsearch.spark.rdd.EsSpark$.doSaveToEs(EsSpark.scala:104)
    at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:79)
    at org.elasticsearch.spark.rdd.EsSpark$.saveToEs(EsSpark.scala:76)
    at org.elasticsearch.spark.package$SparkRDDFunctions.saveToEs(package.scala:56)
    at com.actionml.core.search.elasticsearch.ElasticSearchClient.hotSwap(ElasticSearchSupport.scala:379)
    at com.actionml.engines.ur.URModel.save(URModel.scala:83)
    at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:296)
    at com.actionml.engines.ur.URAlgorithm$$anonfun$train$1.apply(URAlgorithm.scala:255)
    at scala.util.Success$$anonfun$map$1.apply(Try.scala:237)
    at scala.util.Try$.apply(Try.scala:192)
    at scala.util.Success.map(Try.scala:237)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
    at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:237)
    at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
    at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[localhost:9200]]
    at org.elasticsearch.hadoop.rest.NetworkClient.execute(NetworkClient.java:160)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:432)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:428)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:388)
    at org.elasticsearch.hadoop.rest.RestClient.execute(RestClient.java:392)
    at org.elasticsearch.hadoop.rest.RestClient.get(RestClient.java:168)
    at org.elasticsearch.hadoop.rest.RestClient.mainInfo(RestClient.java:745)
    at org.elasticsearch.hadoop.rest.InitializationUtils.discoverClusterInfo(InitializationUtils.java:330)
    ... 20 common frames omitted
Look in the harness logs to see why Spark jobs fail. In 99% of cases the cause is not enough memory, though the ERRORs in the logs will be a variety of things. Spark uses in-memory calculations to speed up training, and when it runs out of memory it throws exceptions.

As I have said many times, docker-compose is limited in scaling and so is not recommended for production deployments. Ideally each service should be put in its own deployment so it can be scaled independently. We see most scaling needs in Spark, but performance can also be boosted by scaling Mongo and ES.
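For anyone scanning a long harness log for the real failure, here is a minimal Python sketch that pulls out the ERROR lines and any "Caused by:" lines. The function name `summarize_errors` is hypothetical, and the timestamp pattern is an assumption based only on the log excerpts in this thread; it is not part of Harness.

```python
import re

# Assumed line shape, taken from the excerpts in this thread:
# "04:45:19.004 ERROR NetworkClient - message"
ERROR_LINE = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d{3} ERROR (\S+) - (.*)$")


def summarize_errors(log_text):
    """Return (source, message) pairs for ERROR lines and 'Caused by:' lines."""
    findings = []
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        match = ERROR_LINE.match(line)
        if match:
            # Keep the logger name so the failing component is visible.
            findings.append((match.group(1), match.group(2)))
        elif line.startswith("Caused by:"):
            # The root cause in a Java stack trace is often more telling
            # than the top-level ERROR message.
            findings.append(("cause", line[len("Caused by:"):].strip()))
    return findings
```

Calling `summarize_errors(open("harness.log").read())` (the file name is a placeholder) returns the entries in the order they appear, so the first one is usually the best place to start reading.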
The error:
ERROR NetworkClient - Node [localhost:9200] failed (java.net.ConnectException: Connection refused (Connection refused)); no other nodes left - aborting...
indicates that the CONTAINER cannot connect to ES at its “localhost:9200” location. The container’s “localhost” is not the same as the host’s localhost. Containers work somewhat like VMs. For a container-to-container connection use the container name, in this case “elasticsearch:9200”, not “localhost:9200”. These container names are maintained by the docker-compose network and behave almost like DNS names.

TL;DR: set “spark.es.nodes”: “elasticsearch” when using docker-compose. You may want to brush up on how docker-compose maintains a pseudo-network.

es.nodes is not used anymore; only keys that start with “spark.” are needed.

Also be aware that the docker-compose.yml references a tag id for all images. The published image will change even though the tag remains the same. docker-compose pull tells the system to refresh all images that are out of date. Another docker/docker-compose detail you will want to understand is how containers are instantiated. We include the watchtower container to refresh automatically in some cases.

Also be aware that by default our docker-compose project launches images from our develop branch, which does not contain released code. This is meant for people who need to run pre-release code.

If you run docker-compose in production, you should fork the project and pin versions to known image tags like release numbers. This is left for you to do as you wish.
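As a rough illustration of that config change, the sketch below rewrites an engine config so that the Spark ES setting points at the compose service name and the un-prefixed es.* keys are dropped. The helper name `fix_es_nodes` is hypothetical and not part of Harness; the "elasticsearch" host name is the one from this thread and assumes the stock docker-compose project.

```python
import json


def fix_es_nodes(engine_config, es_host="elasticsearch"):
    """Return a copy of the engine config whose sparkConf targets the ES container."""
    # Deep copy via a JSON round trip so the caller's dict is left untouched.
    fixed = json.loads(json.dumps(engine_config))
    spark_conf = fixed.setdefault("sparkConf", {})
    # Drop un-prefixed es.* keys; per the thread only "spark."-prefixed keys are used.
    for key in [k for k in spark_conf if k.startswith("es.")]:
        del spark_conf[key]
    # Use the compose service name instead of the container's own localhost.
    spark_conf["spark.es.nodes"] = es_host
    return fixed


# Trimmed-down version of the sparkConf posted earlier in the thread.
config = {
    "engineId": "test_ur",
    "sparkConf": {
        "master": "local",
        "spark.es.nodes": "localhost",
        "es.nodes": "localhost",
        "spark.es.nodes.wan.only": "true",
        "es.nodes.wan.only": "true",
    },
}
print(json.dumps(fix_es_nodes(config), indent=2))
```

The other "spark."-prefixed settings are passed through unchanged; only the host and the unused keys are touched.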
Thanks a lot. I changed it to "elasticsearch" and it works now. I had tried a few different combinations of the IP address but none of them had worked. Since it was able to create those byte-size indices, I assumed ES was accessible from the containers but that for some reason it was not able to write to it.
Thanks for all the help.
On Monday, 27 April 2020 19:43:58 UTC-4, pat wrote:
Docker, and its more complex cousin Kubernetes, both deliver many benefits over old-style native host installations, but they come with a new set of concepts and tools.

Docker-compose makes installation super easy compared with native installs. But this ease can be deceptive; there is a cost for it.

For instance:
- you won’t ssh to an IP address, you’ll use docker or k8s to log in to a container or run a command on the container
- you won’t tail -f logs, you’ll use docker-compose and k8s to view them
- you will need to explicitly attach a service to localhost with configuration, and this forwards traffic back and forth between the host and containers
- docker-compose runs on a single machine so all containers share the host resources
- containers must communicate via network interfaces that involve routing even on a single machine; in fact k8s comes with a built-in DNS resolver
- all config for a container is passed in via env variables set when a container is launched; these are specified in different ways for docker-compose and k8s
- in docker-compose the same storage can be attached to more than one container and will be seen in different places in the filesystems of the host and each container. This mapping of filesystems can be hard to visualize until you become familiar with it, and the mapping is controlled by configuration so it changes with config.

There are many other differences, too many to note here.

We encourage anyone using docker-compose not to underestimate these differences in operation. Do some reading if you are unfamiliar with container technology.
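To make the networking point concrete, here is a small Python sketch of a TCP reachability check. The function name `can_connect` is hypothetical. The result depends on where the code runs: inside a container, "localhost" refers to that container itself, so a service living in another container has to be probed by its compose service name.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds from where this runs."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False


if __name__ == "__main__":
    # Host names taken from this thread; run this inside the container that
    # needs to reach Elasticsearch to see which name actually works there.
    for host in ("localhost", "elasticsearch"):
        print(host, can_connect(host, 9200))
```

This only shows that a port accepts connections; it says nothing about whether the service behind it is healthy.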