"Fails to update history" warnings due to "NoSuchElementException"

297 views

Markus Halttunen

Jul 27, 2017, 01:57:59
to ReactiveMongo - http://reactivemongo.org
We are running ReactiveMongo 0.12.3 and MongoDB 3.4.3 on production. Sometimes without any apparent reason, our logs start to get flooded by warnings saying things like "Fails to update history: ConnectAll$IsMaster". We might get about a dozen of these messages every 10 seconds. The system does not recover without a restart. 

The full stacktrace is:
java.util.NoSuchElementException: null
at java.util.ArrayDeque.removeFirst(ArrayDeque.java:280) ~[na:1.8.0_131]
at java.util.ArrayDeque.remove(ArrayDeque.java:447) ~[na:1.8.0_131]
at shaded.google.common.collect.EvictingQueue.add(EvictingQueue.java:107) ~[org.reactivemongo.reactivemongo-shaded-0.12.3.jar:0.12.3]
at shaded.google.common.collect.EvictingQueue.offer(EvictingQueue.java:91) ~[org.reactivemongo.reactivemongo-shaded-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$class.go$1(MongoDBSystem.scala:155) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$class.reactivemongo$core$actors$MongoDBSystem$$updateHistory(MongoDBSystem.scala:163) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$8$$anonfun$apply$9.apply(MongoDBSystem.scala:130) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$8$$anonfun$apply$9.apply(MongoDBSystem.scala:129) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$class.updateNodeSet(MongoDBSystem.scala:969) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.StandardDBSystem.updateNodeSet(MongoDBSystem.scala:1318) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at reactivemongo.core.actors.MongoDBSystem$$anonfun$reactivemongo$core$actors$MongoDBSystem$$onIsMaster$1.apply$mcV$sp(MongoDBSystem.scala:935) [org.reactivemongo.reactivemongo_2.11-0.12.3.jar:0.12.3]
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126) [com.typesafe.akka.akka-actor_2.11-2.4.16.jar:na]
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39) [com.typesafe.akka.akka-actor_2.11-2.4.16.jar:na]
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415) [com.typesafe.akka.akka-actor_2.11-2.4.16.jar:na]
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [org.scala-lang.scala-library-2.11.8.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [org.scala-lang.scala-library-2.11.8.jar:na]
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [org.scala-lang.scala-library-2.11.8.jar:na]
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [org.scala-lang.scala-library-2.11.8.jar:na]

Any ideas what could cause this and how do we fix it? Should I report a bug? It happens seemingly randomly so it is difficult to reproduce.
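For anyone digging into the stack trace: Guava's `EvictingQueue` is documented as not thread-safe, so a check-then-evict race between two threads may leave `ArrayDeque.removeFirst` facing an empty deque, which would match the `NoSuchElementException` above. A minimal sketch of a synchronized bounded history buffer, assuming concurrent writers; the class name `BoundedHistory` is hypothetical and this is not ReactiveMongo's actual code:

```scala
import java.util.ArrayDeque

// Hypothetical sketch of a bounded history buffer, similar in spirit to
// the shaded EvictingQueue in the stack trace. Without synchronization,
// two threads can both pass the "full" check and both call removeFirst(),
// the second of which may find the deque empty and throw
// NoSuchElementException.
final class BoundedHistory[A](maxSize: Int) {
  private val deque = new ArrayDeque[A](maxSize)

  // Synchronizing the check-then-evict sequence makes it atomic,
  // so removeFirst() is never called on an empty deque.
  def add(event: A): Unit = synchronized {
    if (deque.size >= maxSize) deque.removeFirst()
    deque.addLast(event)
  }

  // Consistent copy of the current contents, oldest first.
  def snapshot: Vector[A] = synchronized {
    var out = Vector.empty[A]
    val it = deque.iterator()
    while (it.hasNext) out = out :+ it.next()
    out
  }
}
```

Only the most recent `maxSize` events are retained, so the buffer stays small no matter how many nodeset events arrive.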

Thanks,
Markus

Cédric Chantepie

Jul 27, 2017, 08:31:21
to ReactiveMongo - http://reactivemongo.org
That's a non-fatal issue in the monitoring code which can be safely ignored, and it will be fixed.

Markus Halttunen

Aug 19, 2017, 01:43:42
to ReactiveMongo - http://reactivemongo.org
Thank you for the update. Do you have any estimate when this might be fixed? The problem occurs every few days and completely messes up our alerting. Usually when the log is flooded with this many warnings, it is a sign that something is seriously wrong.

Cédric Chantepie

Aug 19, 2017, 07:51:33
to ReactiveMongo - http://reactivemongo.org
Have you tried recent releases?

Sebastian

Aug 31, 2017, 08:14:34
to ReactiveMongo - http://reactivemongo.org
Hey,
 
> Have you tried recent releases?

We are running 0.12.6 and this is still happening quite extensively (thousands of occurrences over a few hours).

java.util.NoSuchElementException: null
at java.util.ArrayDeque.removeFirst(ArrayDeque.java:280)
at java.util.ArrayDeque.remove(ArrayDeque.java:447)
at shaded.google.common.collect.EvictingQueue.add(EvictingQueue.java:107)
at shaded.google.common.collect.EvictingQueue.offer(EvictingQueue.java:91)
at reactivemongo.core.actors.MongoDBSystem$class.go$1(MongoDBSystem.scala:155)
at reactivemongo.core.actors.MongoDBSystem$class.reactivemongo$core$actors$MongoDBSystem$$updateHistory(MongoDBSystem.scala:163)
[...] 

Cédric Chantepie

Aug 31, 2017, 13:17:06
to ReactiveMongo - http://reactivemongo.org
If using 0.12.6, there could be a warning ("Fails to update history: ..."), but there is no way it can throw an exception: https://github.com/ReactiveMongo/ReactiveMongo/blob/0.12.6/driver/src/main/scala/core/actors/MongoDBSystem.scala#L154
Message has been deleted

Markus Halttunen

Sep 1, 2017, 03:00:36
to ReactiveMongo - http://reactivemongo.org
They are not exceptions. They are warnings that include the stacktrace of the original exception. 

From our perspective, the current behaviour is pretty much unacceptable. A system that is functioning normally is not supposed to spam the logs with thousands of warnings per hour. The more instances you have, the more likely this is to happen every few days - forcing us to do unnecessary restarts.

Cédric Chantepie

Sep 1, 2017, 03:36:47
to ReactiveMongo - http://reactivemongo.org
I can understand that you find the log verbosity too high. Calling that "spam" or "unacceptable" ...

On the other hand, the effective log level is configurable on the application side (which should be done anyway for production-ready apps).
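For example, with SLF4J + Logback the driver's loggers can be capped application-side. The logger name `reactivemongo` below is inferred from the package names in the stack trace, so verify it against your own log output; this is a sketch, not a documented recommendation:

```xml
<!-- logback.xml fragment: raise the threshold for the driver's loggers
     so monitoring warnings no longer reach the appenders.
     Logger name inferred from the reactivemongo.* packages above. -->
<configuration>
  <logger name="reactivemongo" level="ERROR"/>
</configuration>
```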

To clarify, the history is not updated each time there is a read or write query to the DB, but when there are state changes in the ReplicaSet.

So even if there is no failure on the client/driver side when such a warning happens, seeing it frequently can be a sign of instability on the node side, with excessive state changes.

Markus Halttunen

Sep 1, 2017, 04:26:32
to ReactiveMongo - http://reactivemongo.org
The main problem is not that we see these warnings sometimes. The problem is that ReactiveMongo does not recover from the error situation (i.e. keeps logging the warning indefinitely) until the instance is restarted.

Cédric Chantepie

Sep 1, 2017, 05:16:59
to ReactiveMongo - http://reactivemongo.org
"Keeps logging the warning indefinitely" doesn't imply there is an "error situation" to recover from: as said, it's a non-fatal warning for the driver, so if no "error" occurs on operations using the driver, there is nothing to recover from.

Markus Halttunen

Sep 1, 2017, 06:16:49
to ReactiveMongo - http://reactivemongo.org
So from your perspective it is completely OK that the driver keeps reporting this warning a thousand times per hour until the app is restarted? And your suggested solution to the root cause is to change the logging level of this class from "warn" to "error"? 

Cédric Chantepie

Sep 1, 2017, 06:33:39
to ReactiveMongo - http://reactivemongo.org
I'm suggesting that a warning is a warning. If there is no other element indicating something is wrong, it doesn't mean anything more, whatever one's opinion on the warning verbosity.

Also, as previously said, if there are so many ReplicaSet changes that such a non-fatal issue occurs when updating the history, the ReplicaSet's stability should be investigated (even on a very large production RS, such a warning is not frequent).

Sebastian

Sep 1, 2017, 08:26:15
to ReactiveMongo - http://reactivemongo.org

> Also, as previously said, if there are so many ReplicaSet changes that such a non-fatal issue occurs when updating the history, the ReplicaSet's stability should be investigated (even on a very large production RS, such a warning is not frequent).


Not sure if that information helps, but in our case the spam is also happening on a "single-instance" MongoDB without an actual replica set.

Cédric Chantepie

Sep 1, 2017, 09:15:30
to ReactiveMongo - http://reactivemongo.org
Even if there is a single node, there are ReplicaSet (rather, nodeset) events, such as channel events (in case of network events). By the way, a single-node setup would hardly be OK as a production setup.

Yardena Meymann

Jan 8, 2018, 00:59:49
to ReactiveMongo - http://reactivemongo.org
Hi,

The issue happens for us as well, version 0.12.6. Just clogged our monitoring overnight. 

Thanks,
  Yardena

Cédric Chantepie

Jan 8, 2018, 07:37:57
to ReactiveMongo - http://reactivemongo.org
It should

Yardena Meymann

Jan 8, 2018, 07:45:53
to ReactiveMongo - http://reactivemongo.org
Thanks Cédric, that's great news. 
Do you plan a release in the near future that will include the fix?

On Monday, January 8, 2018 at 2:37:57 PM UTC+2, Cédric Chantepie wrote:
> It should

Cédric Chantepie

Jan 10, 2018, 06:53:33
to ReactiveMongo - http://reactivemongo.org
It's included in the latest snapshot for me.

Michael Vilensky

Jan 14, 2018, 05:30:13
to ReactiveMongo - http://reactivemongo.org
Hi Cédric,

We are having the same issue. Once we start getting this warning with the traceback, all subsequent Mongo queries fail.
Is it possible to release a version with this fix?

Thanks!